Full Web Apps with FastAPI Transcripts
Chapter: async databases with SQLAlchemy
Lecture: Async for scalability - sync version
0:00
Before we dive into writing code around async and await and moving that up into the FastAPI layer,
0:07
I wanna just broadly cover what the value is and where async and
0:11
await play a role in scalability. Now scalability can mean a lot of different things to
0:17
different people. But the idea of scalability is not to make one thing faster, it is to allow you to do more of the same thing without slowing down,
0:25
right? So the goal here is if we have, say, five requests coming to our site,
0:30
and then there's some kind of burst and we get 500 requests coming to the same
0:34
website, we want it to be able to handle those 500 requests at about the same speed as it would handle the five.
0:41
That's the goal. That's the scalability that we're looking for. We're gonna examine two views of a theoretical web server doing some processing,
0:49
one that is a traditional Python WSGI (Web Server Gateway Interface) application that does not support async view methods.
0:57
Traditionally, this has been Django, Flask, Pyramid, all those different frameworks. Django is starting to add some async features,
1:05
Flask is thinking about it. Maybe by the time you watch this recording, they've actually made some progress there.
1:11
But many of the popular web frameworks do not support this async scalability. So this first view that we're gonna look at
1:18
is, you know, traditional Flask, at least at the current time of the recording,
1:22
and we'll see that, as more work gets sent to the server, it's just going to take longer and longer because,
1:29
well, it's not as scalable. So let's look at a synchronous execution here, and we're gonna get three requests come in really quickly. Now,
1:36
these green bars are meant to represent how long it actually takes us to process from the request coming in to the response coming out for the page
1:44
that Request 1 is pointing at, and the page that Request 2 is pointing at. And see Request 3 down there at the bottom? Even though it came in third,
1:52
it's actually incredibly short, so it should come out really, really quickly if there was no other traffic. But what happens in this synchronous world?
2:00
Well, Request 1 comes in and boom, we're gonna start processing it right away.
2:04
How long does it take, from a user's perspective, from the request actually hitting the server
2:09
until the response goes out? Exactly the same amount of time as it would normally take, right? It's just that's how long it takes.
2:15
But when Request 2 comes in, it has to wait to begin processing. It cannot begin processing its response until Response 1 is done, right?
2:25
Because the server is synchronously working, it can't do more than one thing at a time, so it's going to wait to get started.
2:31
And the big yellow bar is how long it seems like the page takes to load to the user who made the request, because
2:38
they've gotta wait for this Request 1 they don't know about, plus the time it takes for theirs. Poor old Request 3 comes in just after Request 2
2:46
and for it, it takes as long as that big yellow bar. It's gonna take a really long time to get its response. If it had been processed alone,
2:53
it would be really rapid, just the size of the green Request 3 bar, but because it has to wait for 1 to finish
2:59
and for 2, which is waiting a while for 1 and then itself, it doesn't actually get processed for a long time. So for the user, from the outside,
3:07
it appears that this Request 3 takes the big long yellow bar time to get done, even though it would actually be really quick.
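To put rough numbers on that picture, here's a minimal sketch of the same idea. The timings are made up for illustration, not taken from the video's diagram: three requests processed strictly one after another, where each caller's perceived latency is its own work plus everything queued ahead of it.

```python
import time

# Hypothetical processing times in seconds; Request 3 is tiny on its own.
requests = [("Request 1", 1.5), ("Request 2", 1.2), ("Request 3", 0.1)]

start = time.perf_counter()
for name, work in requests:
    time.sleep(work)  # stand-in for synchronously processing the request
    elapsed = time.perf_counter() - start
    print(f"{name}: {work:.1f}s of actual work, ~{elapsed:.1f}s perceived latency")

# Request 3 does only 0.1s of work but waits ~2.8s in total,
# because it sits behind Request 1 and Request 2.
```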
3:14
What's going on here? Well, if you actually dig into one of these requests, let's say Request 2 for example, or Request 1, it doesn't really matter,
3:21
one of the longer ones. It's not just processing; there's a bunch of systems working together in web applications,
3:27
right? Maybe we're calling an external API through microservices, maybe we're talking to the database and so on.
3:33
So let's expand this out. What does this processing actually mean? So here's Request 1 coming in,
3:38
you could think of it as just we're processing the request, but if we break it down into its component pieces,
3:43
it's a little more interesting. We have the framework running, we're doing some database work, and then a little bit of our code runs,
3:47
which issues another query over to the database, which takes a long time. We get a response back, we do a quick test on that,
3:53
and then we send that off as a response through the framework. So how much work are we actually doing? Well, maybe just the gray plus the blue;
4:00
that's all that really has to happen. This database work, this is a whole other computer,
4:06
another program that we're just waiting on. So there's no way in a synchronous web request to allow us, our website, our web framework,
4:14
our web application, to do other things while we're waiting. We just call the query and it stops.
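In code, that dead stop looks something like the sketch below: a synchronous Flask-style view where a fake run_query helper (a stand-in for a real database driver, not any particular library's API) blocks the worker until the database answers.

```python
import time

from flask import Flask, jsonify

app = Flask(__name__)

def run_query(sql: str) -> list[dict]:
    # Stand-in for a real driver call: the worker just sits here,
    # doing no useful work, until the database responds.
    time.sleep(0.5)
    return [{"id": 1, "total": 42.0}]

@app.get("/orders/<int:order_id>")
def order_details(order_id: int):
    order = run_query("SELECT ... FROM orders ...")  # blocked, waiting
    items = run_query("SELECT ... FROM items ...")   # blocked again
    # Only this last bit is "our code" actually using the CPU.
    return jsonify(order=order[0], item_count=len(items))
```

While run_query sleeps, this worker can't touch Request 2 or Request 3; they just queue up behind it.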
4:20
That's not great. This is why there's such a blockade when Request 1, 2 and 3
4:25
are coming into the system because we're just waiting on something else when in fact we're only doing a small amount of work.
4:31
Maybe 10 to 20% of this request is actually us doing the work, where the server would be busy. Most of it is waiting, and async and await
4:40
in Python are almost entirely about finding places where you wait on something else and allowing other things to happen while you're waiting.
4:48
So there's a huge opportunity to do something better right here.
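As a preview of where this is heading, here's a hedged sketch of that same endpoint in FastAPI. The run_query helper is again a fake stand-in for an async driver; the point is that awaiting it hands control back to the event loop so other requests can run during the wait.

```python
import asyncio

from fastapi import FastAPI

app = FastAPI()

async def run_query(sql: str) -> list[dict]:
    # Stand-in for an async database driver: while we await,
    # the event loop is free to serve other requests.
    await asyncio.sleep(0.5)
    return [{"id": 1, "total": 42.0}]

@app.get("/orders/{order_id}")
async def order_details(order_id: int):
    order = await run_query("SELECT ... FROM orders ...")  # waiting, not blocking
    items = await run_query("SELECT ... FROM items ...")
    return {"order": order[0], "item_count": len(items)}
```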