Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: MongoDB edition
Lecture: Importing the data from SQL to Mongo
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
It's really cool that we have this working on MongoDB.
0:02
It's less cool we have no data.
0:04
So the final thing we need to do
0:05
is actually put some data into our new database.
0:09
And this is a really good technique to see.
0:11
It'll show you how to take basically any other data source
0:15
and import it into MongoDB.
0:17
And it would of course would've work
0:18
in exactly the opposite direction.
0:19
If you want to go from, say, MongoDB
0:21
to SQLAlchemy in some relational database
0:24
you'll see there's some serious parallels here.
0:26
Now there's just a lot of little steps
0:28
so I'm going to pace this
0:29
talk through it, and then we'll just run it.
0:31
So I have this program here called migrate_to_mongo.
0:34
We're going to putmigrate_to_mongo in the bin folder.
0:38
Now it's going to say init_dbs is not singular
0:40
'cause we're going to work with both databases.
0:42
So down here at the bottom
0:44
we're going to do the same thing we did
0:46
before for SQLAlchemy
0:47
and the same thing we did before
0:49
except for this one I'm calling nosql
0:52
same thing we did to set up MongoDB.
0:54
We need to do them both this time
0:56
'cause we're going to read from SQLAlchemy, write to MongoDB.
0:59
We could do reverse, like I said.
1:01
Then we're just going to import the various chunks of data.
1:04
We're going to do it more or less by collection in MongoDB.
1:08
So get all these users there
1:09
get the packages and the maintainers
1:11
and then put the releases.
1:12
So let's just look at an example here: migrate_users.
1:16
Well, first of all we're going to say
1:17
if we have some users, this must already have been done
1:20
so let's bail, and for that to work
1:22
let's just drop all of them out of there.
1:25
All right, no records, great.
1:27
We're going to go use SQLAlchemy to do a query
1:30
and get all of our users
1:32
and then just loop over them.
1:34
Create a Mongo User, copy the values over
1:36
the SQLAlchemy date time that it was created
1:40
is now being copied to MongoDB.
1:42
The password is being copied to MongoDB
1:44
the name is being copied to MongoDB, and so on
1:48
and then we call save().
1:49
That's it, you just copy the values over and call save().
1:52
There is a way to do bulk inserts
1:53
if you edit large amounts of data.
1:55
You could do this without calling save()
1:57
you know, millions of times
1:58
and do more of a bulk insert thing that'd be much faster.
2:01
We don't have enough data for that to matter.
2:04
Let's look at packages.
2:05
Same thing, package.objects().count(), don't do it twice.
2:09
Go to SQLAlchemy, get all the packages
2:11
for every package copy the values over.
2:15
We didn't actually put the maintainer relationships
2:17
into SQLAlchemy so there's no real way
2:20
to copy them over here
2:21
but you would just, you know, append them to this list
2:24
and then you call save().
2:25
Same thing for releases.
2:28
Don't do it twice, get all the SQLAlchemy ones
2:31
loop over, copy the values, save 'em.
2:33
This is the entire script.
2:34
The other thing that's interesting is up here
2:36
notice we're importing the SQLAlchemy User as SqlUser
2:41
the MongoDB nosql User as MongoUser
2:45
and that let's us be super explicit in this one script.
2:50
Let's go over here, a few documents none
2:53
a few documents none, a few documents none
2:55
so that's great.
2:57
Let's run our script, it takes a moment.
3:00
Remember there's actually quite a few releases
3:03
like 5400 or something
3:05
so it's not entirely trivially to do this.
3:07
Bulk insert would make it much faster but that's okay.
3:12
Look at that, it's done.
3:13
No errors, took a little time
3:15
that's a good sign it probably did something.
3:18
Let's go look. Go find our users.
3:21
Oh, there they are, a whole bunch of 'em.
3:23
They don't have passwords because these imported users
3:26
well they didn't have passwords.
3:28
Releases, here's a bunch of releases.
3:31
Conversions, the dates, the URLs, the sizes
3:34
all that kind of stuff, looking good.
3:35
And finally, packages, amqp, our abders, our argpars
3:40
and their whole description, with long scroll bar there.
3:44
Let's just go over here not even rerun our app
3:46
let's just, oh, I guess it's not running at all.
3:48
We're going to have to rerun it
3:49
but we wouldn't have to rerun it
3:51
if it were already running for it to work.
3:54
Check this out. Here's all of our data
3:58
awscli, boto3, clickonit.
4:01
Oh, one other thing.
4:03
I thought we had covered them all.
4:04
We had set up a relationship in SQLAlchemy
4:07
to navigate that foreign key relationship
4:08
and you could actually technically do that in MongoEngine
4:11
but I'm not a huge fan
4:12
I like to be a little more explicit.
4:14
Let's go write one more function here
4:17
where we're going to get the releases for the package.
4:20
So let's say find_releases_for_package.
4:28
It's going to return a list of releases
4:30
and we'll just take the name which is, again, the id.
4:34
So this is pretty straight forward. We have release.
4:39
We want to go and say package_id is package name.
4:43
And we don't want first, we want all.
4:46
Now, in our relationship in SQLAlchemy we set the ordering
4:50
that's not happening here.
4:51
So we're going to say there's going to be a .order_by.
4:57
This is pretty easy, we just say '-major_ver'.
5:00
Let's just go remind ourselves what these are called
5:04
major_ver, minor_ver, and build.
5:07
And notice down here we have a composite keys
5:10
so that we can do this order with an index
5:12
which is really important.
5:13
This one also should have an index.
5:16
Let's check package_id, yeah, so this'll be super quick.
5:21
Alright, so now we're going to call this function
5:25
instead of trying to navigate this relationship.
5:32
I'll say this and then we got to do one more level here
5:36
to keep that going. Alright, this looks good
5:40
and I suspect there's a problem in the template
5:42
that was trying to use that as well.
5:45
No, I don't think so. Alright, let's re-run it.
5:47
See if we've got that little bit fixed.
5:50
Here we go, look there's boto with its release history.
5:53
This one is the latest, it's on this date.
5:56
Everything's looking good.
5:58
Try one more, if we do two that proves it, right?
6:01
gevent, here we go, gevent's working.
6:04
Really pretty awesome.
6:05
Okay, so we've now juggled a few little variations
6:09
that you'll have between MongoDB and SQLAlchemy
6:12
or really any two databases
6:14
any two sort of implementations of this idea.
6:17
Hopefully that felt pretty easy.
6:20
It did to me anyway, maybe I've done enough MongoDB.
6:22
But there weren't very many changes.
6:24
Remember, we added our new entities
6:28
we had to change those
6:30
but those changes were really minor
6:32
because the similarities between Mongo engine
6:34
and SQLAlchemy, they're quite similar.
6:36
And then just a couple minor variations there.
6:38
We had to add mongoengine as a dependency
6:44
and we had to change that one query
6:46
because we didn't put the relationship together the same.
6:49
Alright, well that's it running on MongoDB.
6:51
If you want to use it, go for it.
6:53
I do encourage you to check out the other two courses
6:56
at Talk Python Training
6:57
because we really skipped over a lot of details
7:00
especially the security and deployment and things like that.
7:03
Take this as inspiration and proof of the power
7:06
of some of these design patterns, view models
7:10
this little service thing that we did
7:12
modeling with classes, things like that.