Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: MongoDB edition
Lecture: Importing the data from SQL to Mongo

Login or purchase this course to watch this video and the rest of the course contents.
0:00 It's really cool that we have this working on MongoDB. It's less cool we have no data. So the final thing we need to do
0:06 is actually put some data into our new database. And this is a really good technique to see. It'll show you how to take basically any other data source
0:16 and import it into MongoDB. And it would of course would've work in exactly the opposite direction. If you want to go from, say, MongoDB
0:22 to SQLAlchemy in some relational database you'll see there's some serious parallels here. Now there's just a lot of little steps
0:29 so I'm going to pace this talk through it, and then we'll just run it. So I have this program here called migrate_to_mongo.
0:35 We're going to putmigrate_to_mongo in the bin folder. Now it's going to say init_dbs is not singular 'cause we're going to work with both databases.
0:43 So down here at the bottom we're going to do the same thing we did before for SQLAlchemy and the same thing we did before
0:50 except for this one I'm calling nosql same thing we did to set up MongoDB. We need to do them both this time
0:57 'cause we're going to read from SQLAlchemy, write to MongoDB. We could do reverse, like I said.
1:02 Then we're just going to import the various chunks of data. We're going to do it more or less by collection in MongoDB. So get all these users there
1:10 get the packages and the maintainers and then put the releases. So let's just look at an example here: migrate_users.
1:17 Well, first of all we're going to say if we have some users, this must already have been done so let's bail, and for that to work
1:23 let's just drop all of them out of there. All right, no records, great. We're going to go use SQLAlchemy to do a query and get all of our users
1:33 and then just loop over them. Create a Mongo User, copy the values over the SQLAlchemy date time that it was created is now being copied to MongoDB.
1:43 The password is being copied to MongoDB the name is being copied to MongoDB, and so on and then we call save().
1:50 That's it, you just copy the values over and call save(). There is a way to do bulk inserts if you edit large amounts of data.
1:56 You could do this without calling save() you know, millions of times and do more of a bulk insert thing that'd be much faster.
2:02 We don't have enough data for that to matter. Let's look at packages. Same thing, package.objects().count(), don't do it twice.
2:10 Go to SQLAlchemy, get all the packages for every package copy the values over. We didn't actually put the maintainer relationships
2:18 into SQLAlchemy so there's no real way to copy them over here but you would just, you know, append them to this list and then you call save().
2:26 Same thing for releases. Don't do it twice, get all the SQLAlchemy ones loop over, copy the values, save 'em. This is the entire script.
2:35 The other thing that's interesting is up here notice we're importing the SQLAlchemy User as SqlUser the MongoDB nosql User as MongoUser
2:46 and that let's us be super explicit in this one script. Let's go over here, a few documents none a few documents none, a few documents none
2:56 so that's great. Let's run our script, it takes a moment. Remember there's actually quite a few releases like 5400 or something
3:06 so it's not entirely trivially to do this. Bulk insert would make it much faster but that's okay. Look at that, it's done.
3:14 No errors, took a little time that's a good sign it probably did something. Let's go look. Go find our users. Oh, there they are, a whole bunch of 'em.
3:24 They don't have passwords because these imported users well they didn't have passwords. Releases, here's a bunch of releases.
3:32 Conversions, the dates, the URLs, the sizes all that kind of stuff, looking good. And finally, packages, amqp, our abders, our argpars
3:41 and their whole description, with long scroll bar there. Let's just go over here not even rerun our app
3:47 let's just, oh, I guess it's not running at all. We're going to have to rerun it but we wouldn't have to rerun it
3:52 if it were already running for it to work. Check this out. Here's all of our data awscli, boto3, clickonit. Oh, one other thing.
4:04 I thought we had covered them all. We had set up a relationship in SQLAlchemy to navigate that foreign key relationship
4:09 and you could actually technically do that in MongoEngine but I'm not a huge fan I like to be a little more explicit.
4:15 Let's go write one more function here where we're going to get the releases for the package. So let's say find_releases_for_package.
4:29 It's going to return a list of releases and we'll just take the name which is, again, the id. So this is pretty straight forward. We have release.
4:40 We want to go and say package_id is package name. And we don't want first, we want all. Now, in our relationship in SQLAlchemy we set the ordering
4:51 that's not happening here. So we're going to say there's going to be a .order_by. This is pretty easy, we just say '-major_ver'.
5:01 Let's just go remind ourselves what these are called major_ver, minor_ver, and build. And notice down here we have a composite keys
5:11 so that we can do this order with an index which is really important. This one also should have an index.
5:17 Let's check package_id, yeah, so this'll be super quick. Alright, so now we're going to call this function
5:26 instead of trying to navigate this relationship. I'll say this and then we got to do one more level here to keep that going. Alright, this looks good
5:41 and I suspect there's a problem in the template that was trying to use that as well. No, I don't think so. Alright, let's re-run it.
5:48 See if we've got that little bit fixed. Here we go, look there's boto with its release history. This one is the latest, it's on this date.
5:57 Everything's looking good. Try one more, if we do two that proves it, right? gevent, here we go, gevent's working. Really pretty awesome.
6:06 Okay, so we've now juggled a few little variations that you'll have between MongoDB and SQLAlchemy or really any two databases
6:15 any two sort of implementations of this idea. Hopefully that felt pretty easy. It did to me anyway, maybe I've done enough MongoDB.
6:23 But there weren't very many changes. Remember, we added our new entities we had to change those but those changes were really minor
6:33 because the similarities between Mongo engine and SQLAlchemy, they're quite similar. And then just a couple minor variations there.
6:39 We had to add mongoengine as a dependency and we had to change that one query because we didn't put the relationship together the same.
6:50 Alright, well that's it running on MongoDB. If you want to use it, go for it. I do encourage you to check out the other two courses
6:57 at Talk Python Training because we really skipped over a lot of details especially the security and deployment and things like that.
7:04 Take this as inspiration and proof of the power of some of these design patterns, view models this little service thing that we did
7:13 modeling with classes, things like that.


Talk Python's Mastodon Michael Kennedy's Mastodon