Modern APIs with FastAPI and Python Transcripts
Chapter: Deploying FastAPI on Linux with gunicorn and nginx
Lecture: Server topology with Gunicorn

0:00 Now that we've got our virtual machine running on the Internet somewhere, in this case up on
0:05 Digital Ocean, what we're gonna do is talk quickly about what applications and server services on
0:12 that server are going to be involved and how they fit together. So this little gray box represents Ubuntu, our server.
0:21 What we're gonna first install, or first interact with when we make a request to the server at least, really we'll probably start from the inside out.
0:27 But the first thing that someone coming to the server is gonna interact with is this
0:31 Web server called NGINX. Now NGINX serves HTML and CSS and does the SSL and all those cool things.
0:39 But it's not actually where our Python code runs. In fact, we don't do anything to do with Python there. We just say you talk to all the Web browsers,
0:48 all the applications, everything that's trying to get to the Web infrastructure. This thing is where they believe they're talking to,
0:55 and it is what they're talking to. But it's not where things are happening. That's not where the action is,
0:59 right. Where the action is, is going to be in this thing called Gunicorn. You saw that we used uvicorn,
1:05 which is the asynchronous, event-loop-based counterpart to Gunicorn, to run our FastAPI. But Gunicorn is a more proper server that is going to do things like manage the
1:17 life cycle of the apps running. So, for example, if one of the apps gets stuck and that process freezes
1:22 up, Gunicorn has a way to run in supervisor mode so it could say "actually that thing is stuck or it ran out of memory.
1:30 Let's restart it" so the server doesn't permanently go down, it's just gonna have a little
1:34 glitch for one user, and then it'll carry on. In order to do that, Gunicorn is gonna spin up not one but many copies of our FastAPI
1:42 application over in uvicorn, which we've already worked with. And this is where our Python code that we write,
1:48 our FastAPI, lives. So when you think of where does my code run? what is my Web app doing? It's gonna be this uvicorn process. And in fact,
1:57 not one, but many. For example, over at Talk Python training, I believe we have eight of these in parallel on
2:04 one of our servers. So when a request comes in, it's gonna hit NGINX, it's going to do its SSL exchange and all those
2:10 things that the Web browsers do with Web servers. NGINX is going to realize, Oh, this request is actually coming to our FastAPI application,
2:19 depending on how we've configured it. It's going to send a request either over http or Linux sockets directly. Gunicorn says okay, well,
2:27 we've got this request for our application and there's probably a bunch going in parallel,
2:32 which one of these worker processes is not busy and can handle requests? Well, this one. Next time a request comes in,
2:39 maybe it's this one. Another request comes in, maybe those two are busy and it decides to pick this one.
2:44 So it's going to fan out the requests between these worker processes based on whether or not they're busy and all sorts of stuff.
2:51 So it's gonna kind of even out the load across them, and in particular know whether they're busy so it doesn't overwhelm any one of them.
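The fan-out described here can be pictured with a toy model. This is only an illustrative sketch of "send each request to the least busy worker", not Gunicorn's actual dispatch mechanism; the `Worker` class, `pick_worker`, `dispatch`, and the worker names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical stand-in for one uvicorn worker process."""
    name: str
    active: int = 0  # number of requests currently being handled


def pick_worker(workers):
    """Return the worker with the fewest in-flight requests (ties go to the first)."""
    return min(workers, key=lambda w: w.active)


def dispatch(workers, request_ids):
    """Assign each request to the least busy worker.

    In this toy model requests never finish, so the load simply evens out.
    """
    assignment = {}
    for request_id in request_ids:
        worker = pick_worker(workers)
        worker.active += 1
        assignment[request_id] = worker.name
    return assignment


workers = [Worker("worker-1"), Worker("worker-2"), Worker("worker-3")]
assignment = dispatch(workers, range(6))
```

With three workers and six requests, each worker in this model ends up holding two requests, which is the "even out the load" behavior in miniature.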
2:57 So this is what's going to be happening on our server and we're gonna go in reverse. We're gonna install uvicorn and our Python Web app,
3:03 then we're going to set up Gunicorn to run it. And once we get that tested and working inside the app, inside the server, on
3:10 Ubuntu, then we're gonna set up NGINX and open it out to the Internet and make this whole process that you see here flow through.
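As a rough sketch of the Gunicorn piece of this topology, a Gunicorn configuration file is itself Python. The file name `gunicorn.conf.py`, the address, the worker count, and the timeout below are assumptions for illustration, not the values used in the course.

```python
# gunicorn.conf.py (hypothetical file name and values; adjust for your server)

# Listen only on the local machine; NGINX forwards requests here.
# A Unix socket such as "unix:/path/to/app.sock" is the other common option.
bind = "127.0.0.1:8000"

# Number of worker processes, each running a copy of the FastAPI app.
workers = 4

# Run each worker with uvicorn so the async FastAPI app is served over ASGI.
worker_class = "uvicorn.workers.UvicornWorker"

# A worker that stays silent for longer than this many seconds is killed
# and replaced, which is the "restart the stuck process" behavior.
timeout = 30
```

Assuming the FastAPI object is named `api` in a module `main` (both names hypothetical), this could be started with `gunicorn -c gunicorn.conf.py main:api`.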

