MongoDB with Async Python Transcripts
Chapter: Performance and Load Testing
Lecture: Running Locust for Max Users

0:00 Okay, so 500 requests per second, roughly, is what we can do. But that's not really how you wanna think about it most of the time.
0:10 Most of the time, a good mental model is just: how many users can we handle? I would say, think of max requests per second as answering:
0:19 if I make this change to an index, or if I change a projection, did it get faster or slower? Can it handle more load or less? That's really useful.
0:28 But when you wanna think of capacity planning, you wanna think about how many users can concurrently be on the site
0:36 or using the app that consumes the API or whatever it is consuming the API. How many of those things can you handle at a time?
0:45 'Cause that's something you can actually reason about: all right, the other day we had a spike and there were 500 users on the site at the same time.
0:53 Should it be able to handle that? I don't know. So in this section, we're gonna add another thing up here, which is gonna be wait time.
1:01 And wait time tells you how frequently a user moves from one API endpoint to another, interacts with this part of your site and then the other.
1:10 So this will be a locust.between, and what you give it is a range. Let's say, if they're really cruising around fast, it's five seconds,
1:18 but a lot of times it might be up to 30 seconds that they're waiting to interact, right? They're reading a page or they're studying something
1:25 like, oh yeah, I gotta go click the next button to get to that thing. So this tells you how active each user is, right?
1:34 Gives you a realistic view of that instead of click, click, click, click, click, click, right? That's not realistic.
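As a minimal sketch of what that looks like in a locustfile (class name and endpoint path are my assumptions, not necessarily the course's exact code):

    from locust import HttpUser, between, task

    class WebsiteUser(HttpUser):
        # Each simulated user pauses 5 to 30 seconds between tasks,
        # modeling a person reading a page before the next click.
        wait_time = between(5, 30)

        @task
        def get_stats(self):
            self.client.get("/api/stats")  # hypothetical endpoint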
1:39 The other thing is, do they do these things the same? Probably not. Let's say going to the homepage is not common.
1:48 Okay, that's not really testing very much anyway because that's just FastAPI returning static HTML, right?
1:55 Let's say that they get the stats five times as often as the first page, and they get the most recent packages, they do that maybe 12 times,
2:09 maybe even 15 times. So it's rarely, somewhat often, really often. It just happens that the weights increase as you go down the file; that doesn't matter.
2:22 You just pick the ratio in which these scenarios seem to happen: this one happens three times as often as that one.
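Continuing the sketch above, inside the same user class, those weights might look like this (the 1/5/15 ratio is from the lecture; the exact paths are assumptions):

    @task(1)
    def homepage(self):
        # Rare: mostly static HTML straight from FastAPI.
        self.client.get("/")

    @task(5)
    def get_stats(self):
        # Somewhat often: picked five times as often as the homepage.
        self.client.get("/api/stats")

    @task(15)
    def recent_packages(self):
        # Really often: the most common action in this scenario.
        self.client.get("/api/packages/recent")

Locust chooses each user's next task at random, weighted by the number passed to @task.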
2:29 So we've done two things. You could have used these weights in the max requests-per-second test as well;
2:36 they make a little more sense here, but they apply there too. So we're gonna do the same kind of test,
2:43 but we're gonna try to answer the question of how many concurrent users in a realistic scenario, both from how frequently they interact with the site
2:51 and what kinds of things they typically do more often or less often. All right, let's run it again. So, well, also it's a good plan to shut down,
3:01 completely exit all the code, everything, shut down your FastAPI app, and start it back up, because it might've built up some cruft in memory,
3:13 it might have cached something. There's a lot of weirdness that can accumulate, and we're testing by pushing it to the breaking point.
3:20 So it could be left in an overwhelmed state: memory, caches, whatever. So just start it over. Run the FastAPI app, good. Run Locust.
3:33 Again, you could do it from the command line if you prefer.
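For example, something along these lines works (module names, port, and host are assumptions):

    # Start the API fresh, then point Locust at it.
    uvicorn main:app --port 8000
    locust -f locustfile.py --host http://localhost:8000

    # Or skip the web UI entirely: 500 users, spawned 5 per second.
    locust -f locustfile.py --host http://localhost:8000 --headless -u 500 -r 5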
3:40 And then let's go. So in this scenario, we're expecting more than 20 or 50 or 100 users. Let's say we think we could probably handle 500 users.
3:47 Just for the sake of time: normally I would add these slowly, but we don't want to watch it crawl.
3:53 Let's just go and we'll say we're gonna add five at a time. Let's go over to the charts. Notice you see up here, the number of users.
4:02 So as we're adding them, we're not getting that many requests per second. That's not because stuff is not working.
4:08 It's because it's five to 30 seconds per user. So they're just chilling, right? Let's zoom out a little bit here. You can see the response time.
4:19 Excellent, 30 millisecond response time. They should be happy with that. Users are used to terrible websites. It takes three or four seconds.
4:29 Got to check, is the spinner spinning? Yeah, okay, we're still waiting. So 27 milliseconds should make them happy. So we're still adding users.
4:38 And as things get kind of smoothed out, Python is really, really efficient here. Up at the top, requests per second.
4:47 Again, you can see them up here. It's going up. We're gonna need to add more users. We're definitely gonna need to add more users and do it faster.
4:54 We can actually do that while it's going here. We can say, let's go to 1,000 and we'll add them 10 at a time.
5:01 And notice the rate at which they're going up is faster, and so is the request rate. So this is great; we're still keeping up with the requests per second.
5:10 It might've looked like before, when we did our tests: whoa, 500 is all we can handle. Well, that's 500 requests per second with users refreshing as hard as they can.
5:20 But notice, with this scenario that we planned out here, and you can debate how realistic it is, we're getting zero impact
5:33 from having, right now, 700 users. We're gonna go up to a thousand. So let's see if we can push this up to 5,000, adding 50 at a time, and just keep going.
5:43 All right, now we're starting to get more of them in here, but I think this is just a little blip.
5:57 I don't think it'll matter; this doesn't look too bad. The average is still in the 50s of milliseconds. The top, even though it looks high, is still 190.
6:06 The server is just kind of coming to terms with more connections. Up here, more and more requests per second,
6:11 130 now. We had this many users up to here, then we doubled it, and then we made it quite a bit higher.
6:17 So look at this, we're still going great. I'll edit it again: 10,000 users, 100 at a
6:26 time. Normally you would just let it go, but we're recording and you don't want to watch this go super slow.
6:31 But let's keep adding. Now again, this is not the full infrastructure of what we would have in place.
6:41 As we're getting more and more users connecting, they're all connecting directly to the uvicorn server, and that's not how it would actually work in production.
6:48 They would connect to Nginx, and Nginx would buffer all those connections up and then proxy them over to uvicorn.
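A minimal sketch of that Nginx piece, just to make the architecture concrete (port and addresses are assumptions):

    # nginx config fragment: Nginx accepts and buffers client connections,
    # then proxies them through to uvicorn on localhost.
    server {
        listen 80;
        location / {
            proxy_pass http://127.0.0.1:8000;  # uvicorn behind the proxy
            proxy_set_header Host $host;
        }
    }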
6:58 So it's not super realistic in here, but you can see, all right, this is where it's starting to fall apart. And the requests per second
7:03 are also kind of leveling off or going down. We're starting to get some failures, so I hit stop.
7:11 Probably a limit on the number of connections to the server, an aspect uvicorn isn't really meant to handle on its own. Let's look at these numbers.
7:20 Those are not great, right there. So again, if we had Nginx and then uWSGI, we would do better, not sure how much better,
7:30 but certainly better. What we got here is it looks like at 3,795, 3,800 users, it's kind of where it starts to fall apart.
7:39 So it's pretty understandable to see like, okay, everything's just growing fine, and then it just hits a wall and it starts to get errors.
7:47 And these slow down, like this is what it looks like when a website or a web application just is overloaded. It's just once it starts to go down,
7:57 it gets worse, because as it slows down the requests, even more requests queue up behind
8:01 those, right? So this is our number: we can handle around 3,800 users at the peak,
8:11 the way it's currently set up, with the database, the web server,
8:17 Locust, and OBS recording all running concurrently on my machine. So this might not be the real number, but it's the number we get under these constraints.
8:26 Okay, pretty excellent, right? This gives us a ton of insight. And we did this by sketching out what a standard use case looks like.
8:35 A standard use case is this level of franticness in behavior (the wait time), plus this one, five, 15 split of usage across the endpoints.
8:45 And if you really need to, you could have a scenario where they click this page, then they log in,
8:50 and then they go over to this other page, right? These don't have to be just one-liners; they can be complicated bits of code.
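As a hedged sketch of such a multi-step task (paths and payload are invented for illustration, inside the same HttpUser class as before):

    @task(2)
    def browse_login_then_account(self):
        # One task can model a whole flow: several requests in sequence.
        # The wait_time pause applies between tasks, not between these calls.
        self.client.get("/")
        self.client.post("/api/login",
                         json={"username": "test", "password": "test"})
        self.client.get("/api/account/details")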
8:58 I just made them simple for what we're doing. Okay, it's cool, right? Really cool stuff you can learn from Locust.

