Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: MongoDB edition
Lecture: Importing the data from SQL to Mongo
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
It's really cool that we have this working on MongoDB. It's less cool we have no data. So the final thing we need to do
0:06
is actually put some data into our new database. And this is a really good technique to see. It'll show you how to take basically any other data source
0:16
and import it into MongoDB. And it would of course would've work in exactly the opposite direction. If you want to go from, say, MongoDB
0:22
to SQLAlchemy in some relational database you'll see there's some serious parallels here. Now there's just a lot of little steps
0:29
so I'm going to pace this talk through it, and then we'll just run it. So I have this program here called migrate_to_mongo.
0:35
We're going to putmigrate_to_mongo in the bin folder. Now it's going to say init_dbs is not singular 'cause we're going to work with both databases.
0:43
So down here at the bottom we're going to do the same thing we did before for SQLAlchemy and the same thing we did before
0:50
except for this one I'm calling nosql same thing we did to set up MongoDB. We need to do them both this time
0:57
'cause we're going to read from SQLAlchemy, write to MongoDB. We could do reverse, like I said.
1:02
Then we're just going to import the various chunks of data. We're going to do it more or less by collection in MongoDB. So get all these users there
1:10
get the packages and the maintainers and then put the releases. So let's just look at an example here: migrate_users.
1:17
Well, first of all we're going to say if we have some users, this must already have been done so let's bail, and for that to work
1:23
let's just drop all of them out of there. All right, no records, great. We're going to go use SQLAlchemy to do a query and get all of our users
1:33
and then just loop over them. Create a Mongo User, copy the values over the SQLAlchemy date time that it was created is now being copied to MongoDB.
1:43
The password is being copied to MongoDB the name is being copied to MongoDB, and so on and then we call save().
1:50
That's it, you just copy the values over and call save(). There is a way to do bulk inserts if you edit large amounts of data.
1:56
You could do this without calling save() you know, millions of times and do more of a bulk insert thing that'd be much faster.
2:02
We don't have enough data for that to matter. Let's look at packages. Same thing, package.objects().count(), don't do it twice.
2:10
Go to SQLAlchemy, get all the packages for every package copy the values over. We didn't actually put the maintainer relationships
2:18
into SQLAlchemy so there's no real way to copy them over here but you would just, you know, append them to this list and then you call save().
2:26
Same thing for releases. Don't do it twice, get all the SQLAlchemy ones loop over, copy the values, save 'em. This is the entire script.
2:35
The other thing that's interesting is up here notice we're importing the SQLAlchemy User as SqlUser the MongoDB nosql User as MongoUser
2:46
and that let's us be super explicit in this one script. Let's go over here, a few documents none a few documents none, a few documents none
2:56
so that's great. Let's run our script, it takes a moment. Remember there's actually quite a few releases like 5400 or something
3:06
so it's not entirely trivially to do this. Bulk insert would make it much faster but that's okay. Look at that, it's done.
3:14
No errors, took a little time that's a good sign it probably did something. Let's go look. Go find our users. Oh, there they are, a whole bunch of 'em.
3:24
They don't have passwords because these imported users well they didn't have passwords. Releases, here's a bunch of releases.
3:32
Conversions, the dates, the URLs, the sizes all that kind of stuff, looking good. And finally, packages, amqp, our abders, our argpars
3:41
and their whole description, with long scroll bar there. Let's just go over here not even rerun our app
3:47
let's just, oh, I guess it's not running at all. We're going to have to rerun it but we wouldn't have to rerun it
3:52
if it were already running for it to work. Check this out. Here's all of our data awscli, boto3, clickonit. Oh, one other thing.
4:04
I thought we had covered them all. We had set up a relationship in SQLAlchemy to navigate that foreign key relationship
4:09
and you could actually technically do that in MongoEngine but I'm not a huge fan I like to be a little more explicit.
4:15
Let's go write one more function here where we're going to get the releases for the package. So let's say find_releases_for_package.
4:29
It's going to return a list of releases and we'll just take the name which is, again, the id. So this is pretty straight forward. We have release.
4:40
We want to go and say package_id is package name. And we don't want first, we want all. Now, in our relationship in SQLAlchemy we set the ordering
4:51
that's not happening here. So we're going to say there's going to be a .order_by. This is pretty easy, we just say '-major_ver'.
5:01
Let's just go remind ourselves what these are called major_ver, minor_ver, and build. And notice down here we have a composite keys
5:11
so that we can do this order with an index which is really important. This one also should have an index.
5:17
Let's check package_id, yeah, so this'll be super quick. Alright, so now we're going to call this function
5:26
instead of trying to navigate this relationship. I'll say this and then we got to do one more level here to keep that going. Alright, this looks good
5:41
and I suspect there's a problem in the template that was trying to use that as well. No, I don't think so. Alright, let's re-run it.
5:48
See if we've got that little bit fixed. Here we go, look there's boto with its release history. This one is the latest, it's on this date.
5:57
Everything's looking good. Try one more, if we do two that proves it, right? gevent, here we go, gevent's working. Really pretty awesome.
6:06
Okay, so we've now juggled a few little variations that you'll have between MongoDB and SQLAlchemy or really any two databases
6:15
any two sort of implementations of this idea. Hopefully that felt pretty easy. It did to me anyway, maybe I've done enough MongoDB.
6:23
But there weren't very many changes. Remember, we added our new entities we had to change those but those changes were really minor
6:33
because the similarities between Mongo engine and SQLAlchemy, they're quite similar. And then just a couple minor variations there.
6:39
We had to add mongoengine as a dependency and we had to change that one query because we didn't put the relationship together the same.
6:50
Alright, well that's it running on MongoDB. If you want to use it, go for it. I do encourage you to check out the other two courses
6:57
at Talk Python Training because we really skipped over a lot of details especially the security and deployment and things like that.
7:04
Take this as inspiration and proof of the power of some of these design patterns, view models this little service thing that we did
7:13
modeling with classes, things like that.