MongoDB for Developers with Python Transcripts
Chapter: High-performance MongoDB
Lecture: Surveying the new code

0:01 Let's explore this slightly updated version of our code. Here we are in the GitHub repository,
0:08 and I am in the source folder and I've added an 08_perf section, and we have the starter_big_dealership and we have the big_dealership;
0:16 it even has instructions here to tell you how to restore that database, which we did in the previous video.
0:21 This one is going to be a snapshot of how this chapter starts, it's what we're starting from now and will remain that way;
0:28 here we're going to take basically a copy of that one and evolve it into the fast high performance version,
0:34 so let's go over here and see what we've got. Now, we have a few things that are slightly different, the car is basically unchanged from before
0:42 although I added a little comment about how do we get to the owners. The one thing that is new here, in terms of the model is this owner idea,
0:51 so cars can now have an owner, and the way we know which cars are owned by this owner
0:58 is we have a list of object ids; those object ids are the object ids of the cars, so we're going to push the ids of the cars that are owned here.
1:07 I guess we could treat it as a many-to-many or one-to-many relationship, just depending on how we treat the owner, but theoretically,
1:14 we can have a single car that has multiple owners and there are owners that own multiple cars, and we can manage it this way;
1:22 you almost never see like a car to owner intermediate table, so you're almost always going to have something like
1:28 those ids are either embedded in the owner or in the car, or under rare circumstances both. So here's how we refer back to the cars,
1:39 then we have a few basic things like the name, when was this owner created, how many times have they visited and things like that.
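To make the relationship concrete, here is a minimal sketch in plain Python, not MongoEngine code. It assumes each owner record carries a car_ids list, as described above. The names, the short string ids, the field names, and the two helper functions are made up for illustration; real documents would hold ObjectId values.

```python
# Plain-Python stand-in for the documents described above.
# Assumption: ids are short strings here; in MongoDB they would be ObjectIds.
cars = {
    "car_a": {"make": "Ferrari"},
    "car_b": {"make": "Tesla"},
}

owners = [
    {"name": "Owner One", "number_of_visits": 3, "car_ids": ["car_a", "car_b"]},
    {"name": "Owner Two", "number_of_visits": 0, "car_ids": ["car_b"]},
]


def cars_owned_by(owner):
    """Follow the embedded ids from one owner to its cars."""
    return [cars[car_id] for car_id in owner["car_ids"]]


def owners_of(car_id):
    """Go the other way: find every owner whose list contains this car id."""
    return [owner["name"] for owner in owners if car_id in owner["car_ids"]]


print(len(cars_owned_by(owners[0])))  # one owner, several cars
print(owners_of("car_b"))             # one car, several owners
```

Because the ids live only on the owner in this sketch, the second lookup has to look through every owner's list. Asking in that reverse direction is one of the questions timed later in this lecture.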
1:46 We want to call it owners in the database and it's just this core collection, so other than that, there's not a whole lot going on here,
1:52 let's look over here, we now have these services, I've taken all the car queries and moved them down here
1:58 do you want to create a car, you call this function, do you want to record a customer visit, here we can go to the owner
2:04 and we can use this increment operator to increment the number of visits in place. Find cars by make, find owner by name and so on.
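As a rough illustration of that in-place increment, the sketch below builds the filter and update documents that MongoDB would receive. The number_of_visits field name and the build_visit_update helper are assumptions, not the course's code. With pymongo, the two dictionaries could be passed to a collection's update_one method.

```python
def build_visit_update(customer_name):
    """Build the filter and update documents for recording one visit."""
    # Which owner to touch (assumes owners are looked up by name).
    query = {"name": customer_name}
    # $inc adds the given amount to the field on the server, so the
    # document does not have to be loaded, changed, and saved again.
    update = {"$inc": {"number_of_visits": 1}}
    return query, update


query, update = build_visit_update("Sample Owner")
print(query)
print(update)
```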
2:17 Number of cars with bad service, a lot of this stuff is what we wrote previously;
2:21 there was the program thing that we ran over here that was interactive and I've replaced that with a few things,
2:26 one is this db stats and you can run this and it will tell you like how many cars are there, how many owners are there,
2:32 what's the average number of histories, this is basically those stats that I presented to you before,
2:37 this takes a while to run on this database, I don't recommend you run it but if you want to just run it and see what you get you can.
2:43 The database was originally created using this script, I am using something interesting you may not have heard about,
2:50 I am using this thing called Faker, so down here Faker lets you create this thing and I'm seeding it so it always generates exactly the same things,
3:02 I'm seeding random and fake and you can see down here it's creating the owners and you can ask it for things like
3:07 give me a fake name, give me a fake date between these two dates, things like that.
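The seeding idea can be sketched with only the standard library. Faker is a third-party package, so this example deliberately leaves it out and uses the random module alone. The make_owners helper, the field names, and the date range are assumptions for illustration, not the course's script.

```python
import datetime
import random


def make_owners(count, seed=42):
    """Generate `count` fake owner records from a seeded generator."""
    rng = random.Random(seed)  # same seed -> same sequence of values
    start = datetime.date(2000, 1, 1)
    owners = []
    for _ in range(count):
        owners.append({
            "name": f"Owner {rng.randint(1000, 9999)}",
            # A date between `start` and 3650 days later.
            "created": start + datetime.timedelta(days=rng.randint(0, 3650)),
            "number_of_visits": rng.randint(0, 20),
        })
    return owners


first_run = make_owners(3)
second_run = make_owners(3)
print(first_run == second_run)  # True: both runs used the same seed
```

Seeding this way means anyone who regenerates the data gets the same records, so timings can be compared across machines.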
3:13 Similarly with cars, we're using random to get a hold of a lot of the numbers, then we can use fake for anything else we might need.
3:19 We ran this with the right amount of data, and it builds it all up for us, so if for some reason you need to recreate it,
3:28 run this load data thing; you can have it create a small one if you uncomment that, or a large one if you run it with those settings.
3:35 Those are all good, this is like the foundation and this is where we are. Next, we're going to ask interesting questions of this database
3:44 and we want to know how long those questions take to answer, so I've written this super simple function called time
3:49 you pass it a message and a function, it will time how long the function takes to run
3:54 and then print out the message along with the time in terms of milliseconds. And then we're going to go through
4:00 and we're going to ask interesting questions here like how many owners, how many cars, who is the 10 thousandth owner,
4:06 notice the slicing here to give us a slice of length one and then we'll just access that one item, and then we can start asking interesting questions like
4:15 how many cars are owned by the 10 thousandth owner, or if we go down here, how many owners own the 10 thousandth car,
4:22 so ask it in the reverse direction. Here we want to find the 50 thousandth owner by name, so yes, technically we already have them, but the idea is
4:31 we want to do a query based on the name field, and initially we won't have done any performance tuning for these types of queries, so it should be slow.
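The timing helper described a moment ago might look roughly like this. It is a sketch under assumptions: the lecture calls its function time, while this one is named timed to avoid shadowing the standard library module, and returning the result and elapsed time is an addition for convenience.

```python
import time


def timed(msg, func):
    """Run func once, print msg with the elapsed milliseconds, return both."""
    start = time.perf_counter()
    result = func()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{msg} - {elapsed_ms:,.3f} ms")
    return result, elapsed_ms


# Stand-in workload; in the lecture the function would be a database query.
total, ms = timed("Sum of the first million integers",
                  lambda: sum(range(1_000_000)))
```

Passing the query as a function (often a lambda) lets the helper start the clock before the query runs rather than after.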
4:39 This one, how many cars are there with expensive service, this was the one with the snail in one of the first videos in this chapter;
4:47 I showed you that it takes 700 milliseconds to run, to ask this question: how many cars have a service history with a price greater than 16800.
4:56 So we're going to be able to ask all of these questions and this program will let us explore that and we'll see how to add indexes
5:05 and I'll show you how to add indexes in the shell and how to add them in MongoEngine, and MongoEngine is really nice
5:10 because as you evolve your indexes, as you add new ones, simply deploying your Python web app will make it go to whatever database it finds
5:19 and automatically upgrade it to those indexes, so it's really really nice. So here you can see we're going to run this code and ask a bunch of questions;
5:26 we could load the data from here, we could generate the data, but you're much better off importing the data from that zip file
5:33 because this takes like half an hour to run, you saw that zip takes like five seconds.

