MongoDB with Async Python Transcripts
Chapter: Performance Tuning
Lecture: Projections for Packages

0:00 A quick note, I just switched this to the match statement that I was using in this example
0:06 here, rather than the if/else, just so you have it exactly the same. Nothing much going on there, just a minor update.
0:13 Okay, so we saw that some things are fast. For example, when we search the database, that was really, really fast.
0:23 But getting the timed package took about 300 milliseconds. What we did is we got FastAPI with all of its details: its description, which is
0:34 the whole README, as well as its 154 releases. And one thing you might say is, well, you designed your documents poorly.
0:43 Here's a scenario, if we go see where it's getting used: we get the package back, and we just show the ID and the last updated date, and we don't
0:53 necessarily have to show the releases. We're just going to do it like this, like that. It's still not going to change the time, about
1:02 300 milliseconds still, because regardless of whether we're using the description, regardless
1:07 of whether we're using the 154 releases, we're still pulling them back over and over and
1:14 over again. Not ideal. So what can we do? We can do a projection. We talked about projections
1:22 when we talked about the MongoDB query syntax in the native shell. But what about Beanie? What do we do here?
1:30 We go back to Pydantic, and we express a smaller class that we would like to project into, which is a pretty neat way to do it.
1:39 So what we're going to do is we're going to go here, and we have our regular Package, which derives from beanie.Document.
1:46 But down here, or in a separate file, we can add a class. PackageTopLevelOnly is what I'm going to call this.
1:53 You call it whatever makes you happy. It's going to be a pydantic.BaseModel. And then you just go up here and you cherry-pick.
2:02 You're like, all right, well, the ID is important. The updated date, not last updated, but summary. I'll just copy those and we can throw them away.
2:11 We don't need the defaults because we're not creating them. They're going to come out of the database, but we also don't need this.
2:18 Let's say those are the three things that we need. It's not quite enough, though. We've got to pass a little bit of extra information to say how
2:26 that projection is actually done from Mongo into these things, because this could be called something else, like created date, if you are a monster.
2:37 So in here we're going to have a settings class as well, an inner class.
2:42 And instead of having things like what collection does it go to, we're going to talk about the projection.
2:47 So we're going to say we're going to want the ID and that's dollar underscore ID. This is the Mongo query syntax there.
2:55 We want the summary, which is summary. And we want, let's just say, the last updated, I guess. Like that.
3:06 Because, recall, that's the one we're using up here. We're not talking about when it was created, but when it was last updated.
3:13 Okay, so we want ID, last updated, and summary. And this has way, way less data. Recall that over in a package,
3:27 the main amount of data is all these releases, right? For FastAPI, there's 154. That's a lot.
3:33 We're not getting any of that, nor the description itself, which is that README, the other huge piece of data. So we're skipping all that.
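
As a rough sketch of the projection class described above (not necessarily the course's exact code): the class name PackageTopLevelOnly and the field types are assumptions inferred from the narration. The inner Settings class maps each model field to a MongoDB field path, with id coming from "$_id".

```python
import datetime

import pydantic


class PackageTopLevelOnly(pydantic.BaseModel):
    # Only the three lightweight fields; no description, no releases.
    # Field types are assumed from the narration.
    id: str
    last_updated: datetime.datetime
    summary: str

    class Settings:
        # Beanie reads this mapping when the model is used as a projection.
        # Keys are model field names, values are MongoDB field paths.
        projection = {
            "id": "$_id",
            "summary": "$summary",
            "last_updated": "$last_updated",
        }
```

Because this is a plain Pydantic model rather than a Beanie document, it is never registered with a collection; it only describes the shape of what comes back.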
3:41 What happens if we now go and change this get timed package, which means package by name, and let's add a keyword argument, summary only.
3:52 And in this case, we're going to set it to be true and we're going to have PyCharm add the summary only on there.
4:02 And if we go to the definition, now you can see summary only is true, but we really want this to be false by default.
4:09 We're just going to use it in that one case. There's a couple of things we can do here.
4:13 We could write the query and expand on it, or we could just do two different things. Like,
4:18 your most natural instinct might be: if not summary only, return this, else... what?
4:37 What goes here? Something. Let's write it that way real quick, and then I'll show you a cool alternative. So onto
4:43 this, we can say dot project. And all we have to give it is that projection model. So what
4:49 was it? It was PackageTopLevelOnly. Let PyCharm do all its magic to import it. So let's run
4:57 this again. So, if we remember, the time to get FastAPI was 309 before. Look
5:10 at that: much faster. Three times faster, if you call that much faster, but it's definitely
5:16 an improvement. Let's look again. 83. Oh, so that's almost four times faster, 3.7 times
5:26 faster. So that's way less stress that we're putting on MongoDB itself. There's a lot
5:33 less data on the network, there's less disk access, potentially, if you have a ton
5:38 of data, all of these things. And all we had to do is say, we're going to project into
5:44 this set here. And it works because we weren't making any changes. Now if I go back and reset
5:51 this real quick. We run it again to have the packages back. No releases. We don't want
6:01 to pull those back and we can't leave the code the same. We had to make a little bit
6:05 of a trade-off there, right? I think it's fair. We're like, "All right, we don't really
6:10 need to see how many things there were. What we're actually interested in is that." Just keep in mind you only have the data that comes back here.
6:18 Maybe one final thing on this. We said we're getting an Optional Package. It should probably say: or a PackageTopLevelOnly. Right?
6:31 It should be one or the other. You might be able to convince me to use None right here instead of Optional.
6:37 But you're going to get this or possibly this. So you want to be careful now that we're talking about what comes back accurately
6:44 in terms of the typing. Not a big deal, but just keep that in mind. Finally, this is the naive way,
6:50 and it's fine if like this is the code you're writing, it's super simple. If you had a complicated query, something like this,
6:58 you probably don't want that. You probably wanna be able to reuse as much of that as possible. So watch this, if we go over here
7:05 and we create a variable called query like that, we're doing the whole query and either we're executing it directly here or we can then
7:20 apply further things like to list, etc. That doesn't actually apply right here, but whatever additional things you would chain
7:31 on including potentially other filter queries, other filter aspects, you can just keep piling those on before you await it.
7:41 If you want to make sure you have a single copy of the query and sometimes you're going
7:46 to not project it and other times you will, this allows you to have one and only one definition to maintain. That may or may not be worth it.
7:55 Like I said, here it's questionable. Down here it's probably a good idea. Okay. Excellent. Let's just make sure it still works.
8:03 Sure enough, we found FastAPI with the same last updated date, still the same performance.
8:10 Let's go switch that back one more time just to see what the effect is. Here we go. Back to 300.
8:21 So roughly three to four times faster by doing that projection.
8:25 It's also worth noting that what we're doing is we're exchanging data with a local loopback MongoDB server.
8:33 If we were talking to a production version, probably MongoDB would be somewhere across
8:38 the network. So having extra data or less data go across the network will matter more.
8:45 And if you're doing some kind of distributed thing, or you're talking far away to some
8:49 cloud service, and that's where MongoDB lives, it's only going to be more true. So in
8:55 this dev scenario, this has the least effect it probably ever would. In production, or some
9:01 other production-like scenario, this would probably have a bigger effect still, because the network would get involved.

