Full Web Apps with FastAPI Transcripts
Chapter: async databases with SQLAlchemy
Lecture: Async for scalability - sync version

0:00 Before we dive into writing code around async and await and moving that up
0:04 into the FastAPI layer,
0:06 I wanna just broadly cover what the value is and where async and
0:10 await play a role in scalability. Now scalability can mean a lot of different things to
0:16 different people. But the idea of scalability is not to make one thing faster,
0:20 it is to allow you to do more of the same thing without slowing down,
0:24 right? So the goal here is if we have,
0:27 say, five requests coming to our site,
0:29 and then there's some kind of burst and we get 500 requests coming to the same
0:33 website. We want it to be able to handle those 500 requests at about the same
0:37 speed as it would handle the five.
0:40 That's the goal. That's the scalability that we're looking for.
0:43 We're gonna examine two views of a theoretical web server doing some processing,
0:48 one that is a traditional Python WSGI, Web Server
0:51 Gateway Interface, application that does not support async view methods.
0:56 Traditionally, this has been Django, Flask, Pyramid,
1:00 all those different frameworks. Django is starting to add some async features,
1:04 Flask is thinking about it. Maybe by the time you watch this recording,
1:08 they've actually made some progress there.
1:10 But many of the popular Web frameworks do not support this
1:14 async scalability. So this first view that we're gonna look at is,
1:17 you know, traditional Flask,
1:19 let's say, at least at the current time of the recording,
1:21 and we'll see that as we get more work sent to the server,
1:25 it's just going to take longer and longer because,
1:28 well, it's not as scalable.
1:29 So let's look at a synchronous execution here,
1:31 and we're gonna get three requests
1:33 come in really quickly. Now,
1:35 these green bars are meant to represent how long it actually takes us to process from
1:40 the request coming in to the response coming out for the page
1:43 that Request 1 is pointing at,
1:45 and the page that Request 2 is pointing at. And see Request 3 down there at
1:49 the bottom, even though it came in third,
1:51 it's actually incredibly short, so it should come out really,
1:54 really quickly if there was no other traffic.
1:57 But what happens in this synchronous world?
1:59 Well, Request 1 comes in and boom,
2:01 we're gonna start processing it right away.
2:03 How long does it take, from a user's perspective, from the request actually hitting the server
2:08 until the response goes out? Exactly the same amount of time as it would normally
2:12 take, right? That's just how long it takes.
2:14 But when Request 2 comes in, it has to wait to begin processing.
2:18 It cannot begin processing its response until Request 1 is done, right?
2:24 Because the server is synchronously working,
2:26 it can't do more than one thing at a time,
2:28 so it's going to wait to get started.
2:30 And the big yellow bar is how long
2:32 it seems like the page takes to load to the user who made the request, because
2:37 they've gotta wait for this Request 1 they don't know about,
2:39 plus the time it takes for theirs. Poor old Request 3
2:42 comes in just after Request 2
2:45 and it takes as long as that big yellow bar for it.
2:47 It's gonna take a really long time for it to get the response.
2:51 If it had been processed alone,
2:52 it would be really rapid, just the size of the green Request 3 bar,
2:56 but because it has to wait for 1 to finish,
2:58 and for 2, which is itself waiting a while for 1,
3:01 it doesn't actually get processed for a long time.
3:03 So for the user, from the outside,
3:06 it appears that this Request 3 takes the big long yellow bar
3:10 time to get done, even though it would actually be really quick.
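
To put rough numbers on the green and yellow bars, here is a minimal sketch of a server that handles one request at a time, in arrival order. The `Request` class, the `perceived_latencies` function, and the arrival and work times are all made up for illustration; they are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    arrival: float  # seconds after the first request arrives (assumed value)
    work: float     # seconds needed if processed alone: the "green bar" (assumed value)


def perceived_latencies(requests):
    """Serve requests strictly one at a time and return each user's wait in seconds."""
    server_free_at = 0.0
    latencies = {}
    for req in sorted(requests, key=lambda r: r.arrival):
        # A synchronous server cannot start this request until the previous one is done.
        start = max(req.arrival, server_free_at)
        finish = start + req.work
        # The "yellow bar": time from arrival until the response goes out.
        latencies[req.name] = finish - req.arrival
        server_free_at = finish
    return latencies


requests = [
    Request("Request 1", arrival=0.0, work=4.0),
    Request("Request 2", arrival=0.5, work=5.0),
    Request("Request 3", arrival=1.0, work=1.0),
]

latencies = perceived_latencies(requests)
for name, seconds in latencies.items():
    print(f"{name}: {seconds:.1f}s as seen by the user")
```

With these assumed numbers, Request 3 needs only 1 second of work, but its user waits 9 seconds because it sits behind the other two.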
3:13 What's going on here? Well,
3:14 if you actually dig into one of these requests,
3:16 let's say Request 2, for example or Request 1, doesn't really matter.
3:20 One of the longer ones. It's not just processing,
3:23 there's a bunch of systems working together in web applications,
3:26 right? Maybe we're calling an external API through microservices,
3:29 maybe we're talking to the database and so on.
3:32 So let's expand this out. What does this processing actually mean?
3:35 So here's Request 1 coming in,
3:37 you could think of it as just we're processing the request,
3:39 but if we break it down into its component pieces,
3:42 it's a little more interesting. We have the framework, then we're doing some database work,
3:45 and then a little bit of our code runs,
3:46 which issues another query over to the database, which takes a long time.
3:49 We get a response back, we do a quick test on that,
3:52 and then we send that off as a response through the framework.
3:54 So how much work are we actually doing?
3:56 Well, maybe the gray plus the blue;
3:59 that's all that really has to happen.
4:02 This database work, this is a whole other computer,
4:05 another program that we're just waiting on.
4:07 So there's no way in a synchronous web request to allow us, our website, our web framework,
4:13 our web application to do other things while we're waiting,
4:17 we just call the query and it stops.
4:19 That's not great. This is why there's such a blockade when Request 1, 2 and 3
4:24 are coming into the system because we're just waiting on something else when in fact we're
4:28 only doing a small amount of work.
4:30 Maybe 10 to 20% of this request is actually us doing the work where the
4:35 server would be busy. Most of it is waiting, and async and await
4:39 in Python are almost entirely about finding places where you wait on something else and allowing
4:45 other things to happen while you're waiting.
4:47 So there's a huge opportunity to do something better right here.
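
As a rough illustration of that breakdown, the sketch below adds up the phases of a single synchronous request and works out how much of the time the web server itself is busy. The phase names and millisecond values are hypothetical, chosen only so the result lands in the 10 to 20% range mentioned above.

```python
# Hypothetical timings for one synchronous request, in milliseconds.
# Each tuple is (phase, duration_ms, server_is_busy).
PHASES = [
    ("framework: handle incoming request", 10, True),
    ("database: first query", 80, False),
    ("our code", 15, True),
    ("database: second, slower query", 200, False),
    ("our code: quick check on the result", 10, True),
    ("framework: send the response", 15, True),
]


def summarize(phases):
    """Return (busy_ms, waiting_ms, busy_fraction) for a list of phases."""
    busy_ms = sum(ms for _, ms, busy in phases if busy)
    total_ms = sum(ms for _, ms, _ in phases)
    return busy_ms, total_ms - busy_ms, busy_ms / total_ms


busy_ms, waiting_ms, busy_fraction = summarize(PHASES)
print(f"busy: {busy_ms} ms, waiting on the database: {waiting_ms} ms")
print(f"share of the request where the server is doing work: {busy_fraction:.0%}")
```

The waiting portion is the time a synchronous server spends blocked on the database, unable to start any other request.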