Async Techniques and Examples in Python Transcripts
Chapter: asyncio-based web frameworks
Lecture: Performance results

0:00 Are you ready for the grand finale? Let's compare how our app performed when it was based on Flask and then how it performs now
0:08 that we've converted it to Quart. So here's the command I'm going to use: wrk -t20 -c20 -d15s. And then I'm going to go hit the sun API.
0:18 That one's the most complicated: it does the location lookup first and then the sun data based on the latitude and longitude. So it seems like a good candidate.
0:26 Now, the t means threads, I believe, so: 20 threads, 20 connections, for 15 seconds, pounding away on this thing as hard as it can.
0:35 Now let's see what kind of results we get. It says 20 threads, 20 connections right below, and it gives us some stats.
0:41 Average latency is 1.34 seconds, not great, and the max is a little bit higher. We were able to do 71 requests in about 15 seconds,
0:51 which is okay, but notice that red line. That kind of sucks: 65 requests timed out, almost 50% of the requests we made.
1:02 I don't know how you feel about reliability, but for me, if my app can only handle half the requests it's receiving,
1:09 and the rest are timing out or failing, something is badly wrong with that web application. What's one way to deal with this?
1:16 Well, besides rearchitecting, adding a cache, and things like that, we can get more machines, we can get a bigger machine,
1:22 we can scale out, we can get a load balancer, all sorts of crazy infrastructure solutions. That's one possibility; we'll see another one in a minute.
1:29 And notice the overall number here is 4.72 requests per second, but remember, half of them are failing with timeouts;
1:37 that is not super amazing. This was our original app. What do we get with Quart? Ooh, a couple of things are better.
1:44 First, notice the red line is gone: 0 errors, 0 timeouts. Now the latency for each request is about 944 milliseconds.
1:55 Remember that green bar: we can't make it faster. This async stuff does not make it faster; it just means you can do more stuff while it's happening.
2:02 So we have 20, effectively 20 independent users hammering as hard as they can. Now, that doesn't mean there's 20 real users;
2:10 that's 20 aggressive, mad users requesting as fast as they can. A regular user, when they're on your site,
2:18 maybe makes a request every 15 to 30 seconds, something like that. So this is a much, much higher load than, say, 20 actual users.
2:25 But nonetheless, that's sort of what we're hitting it with: these 20 threads. So notice all the timeouts are gone, we did 311 requests, and look at that:
2:33 20 requests per second, with each request taking about one second. That's as fast as it can go with this many threads and connections, right,
2:42 each thread is getting served about as well as it could be. So that's a really, really good thing. You can see we've done much, much better
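The throughput figures above are just completed requests divided by test duration; a quick sketch to check the arithmetic (the 15-second duration is the configured -d15s, so the numbers are approximate):

```python
def throughput(completed: int, duration_s: float) -> float:
    """Requests per second: completed requests divided by test duration."""
    return completed / duration_s

# Flask run: 71 completed requests in ~15 s, with 65 more timing out.
print(round(throughput(71, 15), 1))   # ~4.7 req/s, matching wrk's 4.72

# Quart run: 311 completed requests in ~15 s, no timeouts.
print(round(throughput(311, 15), 1))  # ~20.7 req/s, roughly wrk's 20
```

With 20 connections each waiting about a second per response, ~20 requests per second is the ceiling for this test, which is why the Quart number is as good as it gets without lowering latency itself.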
2:50 on exactly the same hardware; we've ended up scaling much better. Our requests are not that much faster, right,
2:56 it's not ultra fast or anything like that, but we can do so much more because most of what we're doing is waiting, and while we're waiting
3:03 we'll just go and start another request and wait for it to get back from the service. Things like that, right? So, a final note here:
3:10 if you do this yourself and you just type python or run your app directly, you're going to get something disappointing with Quart.
3:19 It's going to look more like this; it's not going to look like what you want. Why? Because if you just run it regularly,
3:25 you're going to get a WSGI-style server: a serial, synchronous server, not an asynchronous one. You need to run it on Hypercorn, so we'll talk about that next.
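The "do more while we're waiting" behavior described above can be sketched with plain asyncio. This is a toy stand-in, not the course's app: `asyncio.sleep` plays the role of the slow location and sun API calls, and the 20 concurrent tasks play the role of wrk's 20 connections.

```python
import asyncio
import time

async def fake_request(delay: float) -> float:
    # Stand-in for an I/O-bound call, like awaiting the location or sun API.
    await asyncio.sleep(delay)
    return delay

async def hammer(clients: int, delay: float) -> float:
    # Like wrk's 20 connections: all "users" fire their requests at once.
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(delay) for _ in range(clients)))
    return time.perf_counter() - start

# 20 concurrent requests, each "waiting" 0.2s, finish in roughly 0.2s total,
# not 20 * 0.2 = 4s: the waits overlap instead of queuing up one after another.
elapsed = asyncio.run(hammer(20, 0.2))
print(f"{elapsed:.2f}s")
```

Each individual request still takes its full wait time, which is exactly the point made above: async doesn't make one request faster, it lets the server make progress on the other 19 while each one is waiting.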
