Modern APIs with FastAPI and Python Transcripts
Chapter: Deploying FastAPI on Linux with gunicorn and nginx
Lecture: Server topology with Gunicorn

0:00 Now that we've got our virtual machine running on the Internet somewhere, in this case up on
0:04 DigitalOcean, what we're gonna do is talk quickly about which applications and services on
0:11 that server are going to be involved and how they fit together.
0:14 So this little gray box represents Ubuntu, our server.
0:20 What we're gonna first install, or at least first interact with when we make a request to the server,
0:24 well, really we'll probably start from the inside out.
0:26 But the first thing that someone coming to the server is gonna interact with is this
0:30 Web server called NGINX. Now NGINX serves HTML and CSS and does the
0:36 SSL and all those cool things.
0:38 But it's not actually where our Python code runs. In fact,
0:41 we don't do anything with Python there.
0:43 We just say you talk to all the Web browsers,
0:47 all the applications, everything that's trying to get to the Web infrastructure.
0:51 This thing is what they believe they're talking to,
0:54 and it is what they're talking to.
0:55 But it's not where things are happening.
0:56 That's not where the action is,
0:58 right. Where the action is, is going to be in this thing called Gunicorn.
1:02 You saw that we used uvicorn,
1:04 which is the asynchronous, event-loop-based counterpart to Gunicorn, to run our FastAPI app.
1:11 But Gunicorn is a more proper server that is going to do things like manage the
1:16 life cycle of the apps running.
1:18 So, for example, if one of the apps gets stuck and that process freezes
1:21 up, Gunicorn has a way to run in supervisor mode
1:26 so it could say "actually that thing is stuck or it ran out of memory.
1:29 Let's restart it" so the server doesn't permanently go down, it's just gonna have a little
1:33 glitch for one user, and then it'll carry on. In order to do that,
1:37 Gunicorn is gonna spin up not one but many copies of our FastAPI
1:41 application over in uvicorn,
1:44 which we've already worked with. And this is where our Python code that we write,
1:47 our FastAPI, lives. So when you think of where does my code run?
1:51 What is my Web app doing?
1:53 It's gonna be this uvicorn process. And in fact,
1:56 not one, but many. For example,
1:58 over at Talk Python Training, I believe we have eight of these in parallel on
2:03 one of our servers. So when a request comes in,
2:05 it's gonna hit NGINX, it's going to do its SSL exchange and all those
2:09 things that the Web browsers do with Web servers. NGINX is going to realize,
2:14 Oh, this request is actually coming to our FastAPI application.
2:18 Depending on how we've configured it,
2:20 it's going to send the request to Gunicorn either over HTTP or Unix sockets directly.
2:25 Gunicorn says okay, well,
2:26 we've got this request for our application and there's probably a bunch going in parallel,
2:31 which one of these worker processes is not busy and can handle requests?
2:36 Well, this one. Next time a request comes in,
2:38 maybe it's this one. Another request comes in,
2:40 maybe those two are busy and it decides to pick this one.
2:43 So it's going to fan out the requests between these worker processes based on whether
2:47 or not they're busy and all sorts of stuff.
2:50 So it's gonna kind of even out the load across them,
2:52 and especially know if they're busy, so it doesn't overwhelm any one of them.
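The fan-out described here can be pictured with a small toy model. This is only a sketch of the idea of sending each request to the least busy worker; it is not a description of how Gunicorn is implemented internally, and the names `Worker`, `pick_worker`, and `dispatch` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Stand-in for one worker process (hypothetical, for illustration only)."""
    name: str
    active: int = 0  # number of requests this worker is currently handling


def pick_worker(workers):
    """Return the worker with the fewest in-flight requests.

    On a tie, min() returns the first such worker in the list.
    """
    return min(workers, key=lambda w: w.active)


def dispatch(workers):
    """Assign one incoming request to the least busy worker and return its name."""
    worker = pick_worker(workers)
    worker.active += 1
    return worker.name


# Three workers, five requests arriving before any of them finishes.
workers = [Worker(f"worker-{i}") for i in range(3)]
assignments = [dispatch(workers) for _ in range(5)]
print(assignments)
```

With no request finishing in between, the five requests spread across the three workers instead of piling up on one.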
2:56 So this is what's going to be happening on
2:57 our server and we're gonna go in reverse.
2:59 We're gonna install uvicorn and our Python Web app,
3:02 then we're going to set up Gunicorn to run it.
3:04 And once we get that tested and working inside the app, inside the server, on
3:09 Ubuntu, then we're gonna set up NGINX and open it up to the Internet
3:13 and make this whole process that you see here flow through.
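To make the middle layer of this topology a bit more concrete, here is a minimal sketch of what a Gunicorn configuration file might look like. Gunicorn config files are plain Python. The values below (the address, the worker count of 4, the timeout) are assumptions chosen for illustration and do not come from the lecture; the worker class string assumes uvicorn's Gunicorn worker is installed.

```python
# gunicorn.conf.py (illustrative sketch; all values are assumptions)

# Address Gunicorn listens on. NGINX would forward requests here.
# A Unix socket path such as "unix:/run/app.sock" (hypothetical path)
# is the alternative to a local HTTP port.
bind = "127.0.0.1:8000"

# Number of worker processes, i.e. copies of the FastAPI app.
# 4 is an arbitrary choice for this sketch.
workers = 4

# Run each worker as a uvicorn (ASGI) worker so FastAPI's async code works.
worker_class = "uvicorn.workers.UvicornWorker"

# Workers that stay silent longer than this many seconds
# are killed and restarted by Gunicorn.
timeout = 30
```

Under these assumptions it would be started with something like `gunicorn -c gunicorn.conf.py main:app`, where `main:app` is a hypothetical module and FastAPI instance name.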