Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: MongoDB edition
Lecture: Importing the data from SQL to Mongo
0:00 It's really cool that we have this working on MongoDB.
0:02 It's less cool we have no data.
0:04 So the final thing we need to do
0:05 is actually put some data into our new database.
0:09 And this is a really good technique to see.
0:11 It'll show you how to take basically any other data source
0:15 and import it into MongoDB.
0:17 And it would of course would've work
0:18 in exactly the opposite direction.
0:19 If you want to go from, say, MongoDB
0:21 to SQLAlchemy in some relational database
0:24 you'll see there's some serious parallels here.
0:26 Now there's just a lot of little steps
0:28 so I'm going to pace this
0:29 talk through it, and then we'll just run it.
0:31 So I have this program here called migrate_to_mongo.
0:34 We're going to putmigrate_to_mongo in the bin folder.
0:38 Now it's going to say init_dbs is not singular
0:40 'cause we're going to work with both databases.
0:42 So down here at the bottom
0:44 we're going to do the same thing we did
0:46 before for SQLAlchemy
0:47 and the same thing we did before
0:49 except for this one I'm calling nosql
0:52 same thing we did to set up MongoDB.
0:54 We need to do them both this time
0:56 'cause we're going to read from SQLAlchemy, write to MongoDB.
0:59 We could do reverse, like I said.
1:01 Then we're just going to import the various chunks of data.
1:04 We're going to do it more or less by collection in MongoDB.
1:08 So get all these users there
1:09 get the packages and the maintainers
1:11 and then put the releases.
1:12 So let's just look at an example here: migrate_users.
1:16 Well, first of all we're going to say
1:17 if we have some users, this must already have been done
1:20 so let's bail, and for that to work
1:22 let's just drop all of them out of there.
1:25 All right, no records, great.
1:27 We're going to go use SQLAlchemy to do a query
1:30 and get all of our users
1:32 and then just loop over them.
1:34 Create a Mongo User, copy the values over
1:36 the SQLAlchemy date time that it was created
1:40 is now being copied to MongoDB.
1:42 The password is being copied to MongoDB
1:44 the name is being copied to MongoDB, and so on
1:48 and then we call save().
1:49 That's it, you just copy the values over and call save().
1:52 There is a way to do bulk inserts
1:53 if you edit large amounts of data.
1:55 You could do this without calling save()
1:57 you know, millions of times
1:58 and do more of a bulk insert thing that'd be much faster.
2:01 We don't have enough data for that to matter.
2:04 Let's look at packages.
2:05 Same thing, package.objects().count(), don't do it twice.
2:09 Go to SQLAlchemy, get all the packages
2:11 for every package copy the values over.
2:15 We didn't actually put the maintainer relationships
2:17 into SQLAlchemy so there's no real way
2:20 to copy them over here
2:21 but you would just, you know, append them to this list
2:24 and then you call save().
2:25 Same thing for releases.
2:28 Don't do it twice, get all the SQLAlchemy ones
2:31 loop over, copy the values, save 'em.
2:33 This is the entire script.
2:34 The other thing that's interesting is up here
2:36 notice we're importing the SQLAlchemy User as SqlUser
2:41 the MongoDB nosql User as MongoUser
2:45 and that let's us be super explicit in this one script.
2:50 Let's go over here, a few documents none
2:53 a few documents none, a few documents none
2:55 so that's great.
2:57 Let's run our script, it takes a moment.
3:00 Remember there's actually quite a few releases
3:03 like 5400 or something
3:05 so it's not entirely trivially to do this.
3:07 Bulk insert would make it much faster but that's okay.
3:12 Look at that, it's done.
3:13 No errors, took a little time
3:15 that's a good sign it probably did something.
3:18 Let's go look. Go find our users.
3:21 Oh, there they are, a whole bunch of 'em.
3:23 They don't have passwords because these imported users
3:26 well they didn't have passwords.
3:28 Releases, here's a bunch of releases.
3:31 Conversions, the dates, the URLs, the sizes
3:34 all that kind of stuff, looking good.
3:35 And finally, packages, amqp, our abders, our argpars
3:40 and their whole description, with long scroll bar there.
3:44 Let's just go over here not even rerun our app
3:46 let's just, oh, I guess it's not running at all.
3:48 We're going to have to rerun it
3:49 but we wouldn't have to rerun it
3:51 if it were already running for it to work.
3:54 Check this out. Here's all of our data
3:58 awscli, boto3, clickonit.
4:01 Oh, one other thing.
4:03 I thought we had covered them all.
4:04 We had set up a relationship in SQLAlchemy
4:07 to navigate that foreign key relationship
4:08 and you could actually technically do that in MongoEngine
4:11 but I'm not a huge fan
4:12 I like to be a little more explicit.
4:14 Let's go write one more function here
4:17 where we're going to get the releases for the package.
4:20 So let's say find_releases_for_package.
4:28 It's going to return a list of releases
4:30 and we'll just take the name which is, again, the id.
4:34 So this is pretty straight forward. We have release.
4:39 We want to go and say package_id is package name.
4:43 And we don't want first, we want all.
4:46 Now, in our relationship in SQLAlchemy we set the ordering
4:50 that's not happening here.
4:51 So we're going to say there's going to be a .order_by.
4:57 This is pretty easy, we just say '-major_ver'.
5:00 Let's just go remind ourselves what these are called
5:04 major_ver, minor_ver, and build.
5:07 And notice down here we have a composite keys
5:10 so that we can do this order with an index
5:12 which is really important.
5:13 This one also should have an index.
5:16 Let's check package_id, yeah, so this'll be super quick.
5:21 Alright, so now we're going to call this function
5:25 instead of trying to navigate this relationship.
5:32 I'll say this and then we got to do one more level here
5:36 to keep that going. Alright, this looks good
5:40 and I suspect there's a problem in the template
5:42 that was trying to use that as well.
5:45 No, I don't think so. Alright, let's re-run it.
5:47 See if we've got that little bit fixed.
5:50 Here we go, look there's boto with its release history.
5:53 This one is the latest, it's on this date.
5:56 Everything's looking good.
5:58 Try one more, if we do two that proves it, right?
6:01 gevent, here we go, gevent's working.
6:04 Really pretty awesome.
6:05 Okay, so we've now juggled a few little variations
6:09 that you'll have between MongoDB and SQLAlchemy
6:12 or really any two databases
6:14 any two sort of implementations of this idea.
6:17 Hopefully that felt pretty easy.
6:20 It did to me anyway, maybe I've done enough MongoDB.
6:22 But there weren't very many changes.
6:24 Remember, we added our new entities
6:28 we had to change those
6:30 but those changes were really minor
6:32 because the similarities between Mongo engine
6:34 and SQLAlchemy, they're quite similar.
6:36 And then just a couple minor variations there.
6:38 We had to add mongoengine as a dependency
6:44 and we had to change that one query
6:46 because we didn't put the relationship together the same.
6:49 Alright, well that's it running on MongoDB.
6:51 If you want to use it, go for it.
6:53 I do encourage you to check out the other two courses
6:56 at Talk Python Training
6:57 because we really skipped over a lot of details
7:00 especially the security and deployment and things like that.
7:03 Take this as inspiration and proof of the power
7:06 of some of these design patterns, view models
7:10 this little service thing that we did
7:12 modeling with classes, things like that.