MongoDB with Async Python Transcripts
Chapter: Course Conclusion
Lecture: Final Review

0:00 Now finally, let's just look back on what we've talked about and what we've learned throughout this course and do a quick review.
0:07 First thing we started talking about was document databases. These weird things that store hierarchical data, how do they work?
0:15 Why would you use them and could they possibly ever be fast? So for example, out of our courses, we've got this green bit that looks like standard data
0:23 in a database, but we also have this blue section of embedded lectures. It's like a pre-computed join; we decided
0:30 that pre-computing instead of real-time computing is faster. But the question is, can you still ask questions about stuff in that section?
0:40 Because if it's kind of opaque and hidden in there, that's useless. We've seen multiple times that we can. We were doing queries like,
0:50 give us all the releases that match something. When we had a quarter million releases in there, spread like this through 5,000 documents,
0:58 it was still millisecond-type response times. So really, yes, we can absolutely ask questions
1:07 about the embedded data, not just the top-level data. We saw Pydantic as a core part of this course. Beanie is of course based on it,
1:15 and even the stuff we did with FastAPI. So we create a class based on BaseModel, it parses JSON data and even validates
1:24 or automatically converts from the underlying data types. For example, pages visited: that third element, the three, is a string, and it just says,
1:34 eh, I know it's a string, but it could be a three if we just parsed it. So let's just do that for you. Another super important idea was async and await
1:43 and async programming and the ability to scale our requests using asyncio. So we saw that if we can communicate back to Python,
1:55 right now I'm waiting on something, wake me up when it's done, but you can go do other things. We can get awesome scalability that looks like this
2:02 instead of that third request waiting for one and two to finish. Front and center to this whole course was the Beanie ODM, the object document mapper
2:11 based on Pydantic, programmed with async and await. Awesome work, Roman, Roman Right, for creating this. Really, really nice framework.
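To make the async and await idea above concrete, here is a minimal sketch using only asyncio from the standard library. The function name handle_request and the 0.2 second wait are made up for illustration; asyncio.sleep stands in for waiting on a database call, so this is not the course's actual code.

```python
import asyncio
import time


async def handle_request(request_id: int, wait_sec: float) -> int:
    # Stand-in for awaiting a database call. While this coroutine sleeps,
    # the event loop is free to work on the other requests.
    await asyncio.sleep(wait_sec)
    return request_id


async def main() -> tuple[list[int], float]:
    start = time.perf_counter()
    # Start three requests together; each one waits 0.2 s on its own "I/O".
    results = await asyncio.gather(*(handle_request(i, 0.2) for i in (1, 2, 3)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to a single wait rather than the sum of all three, which is the scalability picture described above.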
2:19 Absolutely love it. When we talked about document design, the primary question was to embed or not to embed.
2:27 And I had a heuristic that you could follow that I think works out pretty well. Is the embedded data wanted most of the time?
2:36 If it is, it probably is a good idea to embed it 'cause it comes along for free other than serialization and so on.
2:42 If it's not wanted very often, it's just dead weight. And then reverse that question, how often do you want the thing that is embedded
2:50 without the containing element and all the other stuff around it? Like if it's a list, all the other items in that list?
2:58 The more you want that stuff separated, again, the less likely you wanna embed it. Is the embedded stuff a bounded set?
3:04 Remember, there's a 16 megabyte limit and a much lower practical limit on how much data goes in a document. So is that bound also small?
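One rough way to sanity-check the "is it bounded and small" question is to estimate a document's size before committing to an embedded design. The sketch below is an assumption-heavy approximation: it measures the JSON encoding rather than the real BSON size, the helper names are hypothetical, and the soft limit is left to the caller because the practical limit is not a fixed number.

```python
import json

# MongoDB's hard cap on a single BSON document: 16 MB.
MAX_BSON_BYTES = 16 * 1024 * 1024


def approx_doc_bytes(doc: dict) -> int:
    """Rough estimate: length of the UTF-8 JSON encoding (not the true BSON size)."""
    return len(json.dumps(doc).encode("utf-8"))


def embedding_looks_bounded(doc: dict, soft_limit_bytes: int) -> bool:
    """True if the estimate is under the caller's soft limit and the hard cap."""
    size = approx_doc_bytes(doc)
    return size <= soft_limit_bytes and size <= MAX_BSON_BYTES


# Hypothetical course document with 100 embedded lectures.
course = {
    "title": "MongoDB with Async Python",
    "lectures": [{"title": f"Lecture {i}", "duration_sec": 300} for i in range(100)],
}

print(approx_doc_bytes(course))
# The 1 MB soft limit here is an arbitrary example value, not a recommendation.
print(embedding_looks_bounded(course, soft_limit_bytes=1024 * 1024))
```

If the embedded list can grow without limit, this estimate keeps climbing as items are added, which is the signal to keep that data in its own collection.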
3:13 And then do you have an integration database or an application database, which sort of controls how many different types of questions
3:20 or how varied your queries are around that data? The more focused it is, the more likely you'll be able
3:26 to design the perfect documents to match those queries. The more diverse, the more likely you're gonna have
3:33 something closer to the traditional tabular type of relational data. Document design is only part of it. If you want your MongoDB to go fast,
3:43 we have a couple of knobs and controls that you can turn to make things really awesome. Indexes, they're like magic database dust.
3:50 You sprinkle them on there, things go a thousand times faster. Indexes, indexes, indexes. Think about indexes a lot. They're incredibly important.
3:59 We just talked about document design. That applies to query style as well. You can ask questions that are either fast or slow
4:07 depending on how you're running them. For example, you could pull all of the data back into memory and then loop through it,
4:15 or you could apply a limit to the database query and then say, I actually only want the first five. Or you could do that as a cursor
4:22 if you're gonna break out. That type of stuff is what I mean by query style. And a special subset of that would be projections, right?
4:31 Where I don't want an entire package with all of its releases, I just want the title, the release date,
4:37 the last updated date, and maybe the email, right? We saw we could do that with Pydantic models
4:42 and create a projection view basically into the queries. And then stuff we didn't cover was MongoDB server topology, replication and sharding.
4:52 Those are all awesome, just outside the scope of a Python course. That's more like a Mongo admin sort of thing.
4:58 But they are knobs you can turn as well. We deployed our database up to the cloud onto some admittedly pretty wimpy little servers,
5:08 but nonetheless, we saw that when we're running and doing development, you're probably running MongoDB just on your machine, just local talking back,
5:16 but there are a lot of considerations when you're running in the cloud: we've got our web server, we've got our database server,
5:22 we've got security and encryption, performance, all that different type of stuff. So that's what we just did recently,
5:30 right here at the end of the course. And for the very last thing, we said, well, how much traffic can this web app handle and how do you know?
5:39 So here's our max requests per second version running out of Locust. And we said, let's just start increasingly
5:47 hitting these endpoints faster and faster until it reaches some limit. Well, looks like
5:51 the limit here is 765 requests per second for this particular app that we were testing.
5:58 And once it goes over that, you can usually see it just falls apart hard. So you know,
6:04 maybe don't push it all the way to the edge before you get that extra server or scale
6:07 up the server, but it gives you a really good sense of what's possible.