Async Techniques and Examples in Python Transcripts
Chapter: asyncio-based web frameworks
Lecture: Performance results
0:00 Are you ready for the grand finale?
0:01 Let's compare how our app performs
0:04 when it was based on Flask
0:05 and then how it performs now
0:07 that we've converted it to Quart.
0:09 So here's the command I'm going to use:
0:10 wrk -t20 -c20 -d15s.
0:15 And then I'm going to go hit the sun API.
0:17 That one's the most complicated:
0:18 it does the location lookup first and then the sun lookup
0:21 based on the latitude and longitude.
0:23 So it seems like a good candidate.
0:25 Now the t means threads I believe
0:27 so 20 threads, 20 connections for 15 seconds
0:31 pound away on this thing as hard as it can.
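Spelled out, the benchmark invocation looks like this. The host, port, and endpoint path here are placeholders, not the course's actual URL; substitute your own app's sun endpoint:

```shell
# -t20: 20 threads, -c20: 20 open connections, -d15s: run for 15 seconds.
# The URL below is a made-up example; point it at your running app.
wrk -t20 -c20 -d15s http://localhost:5000/api/sun/40.0/-120.0
```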
0:34 Now let's see what kind of results we get.
0:37 It still says 20 threads, 20 connections right below
0:39 and it gives us some stats.
0:40 Average latency 1.34 seconds, not great
0:45 max is a little bit higher.
0:46 We were able to do 71 requests in about 15 seconds
0:50 which is okay, but notice that red line.
0:52 That kind of sucks: 65 requests timed out
0:56 almost 50% of the requests we made timed out.
1:01 I don't know how you feel about reliability
1:03 but for me, if my app can only handle
1:06 half the requests it's receiving
1:08 and the rest are timing out or failing
1:11 there's something badly wrong with that web application.
1:14 What's one way to deal with this?
1:15 Well, besides rearchitecting, adding a cache, and things like that
1:18 we can get more machines, we can get a bigger machine
1:21 we can scale out, we can get a load balancer
1:22 all sorts of crazy infrastructure solutions.
1:26 That's one possibility
1:27 we'll see another one in a minute.
1:28 And notice the overall one here is 4.72 requests per second
1:32 but remember half of them are failing with timeout
1:36 that is not super amazing. This was our original app.
1:39 What do we get with Quart? Ooh, a couple of things are better.
1:43 First notice the red line is gone 0 errors, 0 timeouts.
1:48 Now the latency for each request is about 944 milliseconds.
1:54 Remember that green bar, we can't make it faster.
1:57 This async stuff does not make it faster
1:58 it just means you can do more stuff while it's happening.
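That idea can be sketched with plain asyncio: each simulated request still takes just as long, but the waits overlap, so twenty of them finish in roughly the time of one. The delay and request count here are made-up stand-ins for the real service calls:

```python
import asyncio
import time

async def fake_request(delay: float) -> float:
    # Stand-in for an I/O-bound call (e.g. hitting the location service).
    # The delay is a hypothetical service latency, not a measured one.
    await asyncio.sleep(delay)
    return delay

async def main() -> float:
    start = time.perf_counter()
    # 20 "users" each waiting ~0.1s; the waits overlap instead of queuing.
    await asyncio.gather(*(fake_request(0.1) for _ in range(20)))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
# Total wall time is on the order of 0.1s, not 20 * 0.1s = 2s:
# no single request got faster, but the waiting happened concurrently.
```

Run serially, those same 20 sleeps would take about 2 seconds; that's the whole difference Quart is buying us here.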
2:01 So we have effectively 20 independent users
2:05 hammering as hard as they can.
2:06 Now that doesn't mean there's 20 real users
2:09 alright, that's 20 aggressive, mad users
2:12 requesting as hard as they can.
2:14 A regular user, when they're on your site
2:17 maybe makes a request every 15 to 30 seconds, something like that.
2:20 So this is a much, much higher load than, say, 20 actual users.
2:24 But nonetheless, that's sort of what we're hitting it with
2:27 these 20 threads. So notice all the timeouts are gone
2:29 we can do 311 requests, and look at that
2:32 20 requests per second, each request takes about one second.
2:37 That's as fast as it can go
2:38 with this many threads and connections, right
2:41 just each thread is getting served as well as it could.
2:44 So that's a really, really good thing.
2:46 So you can see we've done much, much better
2:49 on exactly the same hardware.
2:51 We've ended up scaling much better.
2:53 Our requests are not that much faster, right
2:55 it's not ultra fast or anything like that.
2:58 We can do so much more of it
2:59 because most of what we're doing is waiting
3:01 and while we're waiting
3:02 we'll just go and start another request
3:04 and wait on it to get back from the service.
3:06 Things like that, right? So a final note here
3:09 if you do this yourself
3:11 and you just type python app.py or run your app
3:16 you're going to get something disappointing with Quart.
3:18 It's going to look more like this.
3:20 It's not going to look like what you want.
3:21 Why? Because if you just run it regularly
3:24 you're going to get a WSGI server
3:26 a serial, synchronous server, not an asynchronous one.
3:28 You need to run it on Hypercorn
3:30 so we'll talk about that next.
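As a preview, a minimal sketch of serving the app with Hypercorn instead, assuming the Quart app object is named app inside app.py (your module and bind address may differ):

```shell
# Hypercorn is an ASGI server, so it can actually run the async
# request handlers concurrently instead of serially.
pip install hypercorn
hypercorn app:app --bind 127.0.0.1:5000
```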