Build An Audio AI App Transcripts
Chapter: Feature 1: Transcripts
Lecture: Running Background Jobs like Transcribe

0:00 In this video, let's work on making this background job system work and I'll just introduce it real quick.
0:07 So what we're doing is we're creating one of these things I'm calling a background job and storing it in the database.
0:13 So it has a time it was created, started, and finished. Neither started nor finished is set by default, definitely not finished.
0:21 They start out waiting to be processed and they are not finished, but they do require an action, an episode number, and a podcast ID.
0:29 Guess what, that's what we're passing along, right? This is being used over by this background service. You saw it's called this function.
0:36 And what it's going to do is just literally create one of these and then save it and then give it back to you.
0:41 This save sets the database-generated ID, and that's the ID that we were just playing with.
0:49 And then somewhere down here, we can ask, is the job finished? So it just goes and gets the job by ID, which is literally find one ID equals the ID you
0:59 asked for. And either it's false if it doesn't exist, or it checks to see if it's been set to be finished.
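The job model and the two helpers described above can be sketched roughly like this. This is a minimal in-memory stand-in: the names are assumed from the video's description, and the dictionary here stands in for the MongoDB collection the real app uses, where `await job.save()` is what produces the database-generated ID.

```python
import asyncio
import datetime
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical in-memory stand-in for the MongoDB collection the real app
# stores these documents in.
_jobs: dict = {}


@dataclass
class BackgroundJob:
    action: str                    # e.g. "transcribe" or "summarize"
    episode_number: int
    podcast_id: str
    created: datetime.datetime = field(
        default_factory=lambda: datetime.datetime.now(datetime.timezone.utc))
    started: Optional[datetime.datetime] = None    # not set until a worker picks it up
    finished: Optional[datetime.datetime] = None   # definitely not set at creation
    is_finished: bool = False
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


async def create_background_job(action: str, episode_number: int,
                                podcast_id: str) -> BackgroundJob:
    # Literally create one of these, save it, and give it back; in the real
    # app, awaiting save() is what sets the database-generated id.
    job = BackgroundJob(action=action, episode_number=episode_number,
                        podcast_id=podcast_id)
    _jobs[job.id] = job
    return job


async def is_job_finished(job_id: str) -> bool:
    # "Find one where id == the id you asked for"; False if it doesn't exist.
    job = _jobs.get(job_id)
    return job.is_finished if job is not None else False
```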
1:07 Now that all gets kind of put in the database and chills. And then this function, the worker function is the thing that goes around forever and ever.
1:16 Now, first of all, notice there's something uncommon here. It says the background service is
1:22 starting up, and it's based on async and await running on FastAPI's core asyncio event loop.
1:32 And what it does is it says: while True, do a bunch of work. You'd think that would block up the processing, but no, it's really cool.
1:40 So what happens is, when you go to await something, that frees up the event loop to keep doing other processing.
1:48 So it'll say: get me one of these jobs. If there is nothing, just go to sleep for a second,
1:54 and then go through the loop again, which means ask for a job again in one second.
1:59 But if there is one, get the job and start processing it. If there's an error, well, that's too bad: sorry, job failed, get the next job.
2:10 And finally, it's going to come down here and says, all right, well, what was I supposed to do with this job here?
2:14 Was I supposed to summarize or transcribe? And then it goes to the AI service and says, transcribe episode.
2:21 This is where we're going to write our AssemblyAI code using their API. But this is how it happens kind of automatically.
2:28 And then once it gets through whichever step it takes there, it marks it as complete and successful in the database.
2:36 So that when we're over here, we say, give me this thing. Is it finished? When we did that successfully, it'll say, yeah, it's finished.
2:44 Here's your information. You know, whatever has to happen when that transcript gets generated and so on. So this is how it works.
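The worker loop described above can be sketched like this. The helper names and the queue are assumptions so the sketch runs on its own: in the real app, the pending-job lookup is a MongoDB query for unfinished jobs, `transcribe_episode` lives on the AI service and calls AssemblyAI, and "mark complete" is a database update. The `max_jobs` parameter is a testing escape hatch only; the real worker loops forever.

```python
import asyncio

# Hypothetical stand-ins so the sketch is self-contained.
pending_jobs: asyncio.Queue = asyncio.Queue()
finished_job_ids: set = set()


async def transcribe_episode(podcast_id: str, episode_number: int) -> None:
    await asyncio.sleep(0)  # placeholder for the real AssemblyAI call


async def worker_function(max_jobs: int = 0) -> None:
    print("Background service starting up ...")
    processed = 0
    while True:
        if pending_jobs.empty():
            # Awaiting here frees the event loop to keep serving requests,
            # then we ask for a job again in one second.
            await asyncio.sleep(1)
            continue
        job = pending_jobs.get_nowait()
        try:
            # What was I supposed to do with this job: summarize or transcribe?
            if job["action"] == "transcribe":
                await transcribe_episode(job["podcast_id"], job["episode_number"])
            finished_job_ids.add(job["id"])  # mark complete & successful in the DB
        except Exception as err:
            print(f"Sorry, job {job['id']} failed: {err}")  # move on to the next one
        processed += 1
        if max_jobs and processed >= max_jobs:
            return  # testing escape hatch; the real loop never exits
```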
2:50 Now, where does this run? If you go look in the main, you're going to see nothing to do at all with background services.
2:59 We've got secrets, routing, templating (that's Chameleon), the connections there. We'll get to that very, very soon.
3:08 And the caching: this is just so that if there's a change to a static file, we don't have to do any hard reloads.
3:16 There's basically no stale stuff in our cache as we develop, or even as we run in production. But there's nothing here about background services.
3:24 So where is this happening? Well, FastAPI has a really interesting restriction, but also a really interesting way it works: anything that
3:33 uses asyncio and async and await needs to share that core asyncio event loop, and FastAPI, or rather the server running FastAPI, manages it.
3:44 And that loop doesn't exist until you get down to here, where you pass the app back and something
3:50 grabs it and runs it, like uvicorn or Granian and those kinds of things, Hypercorn. But it's not available to us yet. And this is where it gets funny.
4:01 So over here in infrastructure, we have app setup. So there's this thing called an app lifespan that we can plug into fast API and fast API
4:12 will say, look, I know you got to do some async stuff, like for example, initialize
4:17 MongoDB, and its async work needs to run on this core FastAPI asyncio loop. This is going to be called when everything starts up. Right?
4:29 So FastAPI is going to call this app lifespan for us when everything starts up. How does it know? Well, back over here,
4:37 where we're creating the FastAPI app (I'll wrap the line around so you can read it a little better), it's right here.
4:46 It says, when we create the app: don't show any API documentation on the website. Are you in debug mode? Right? Or development mode?
4:54 Right now we say true. We could switch that to false later. And then here's the code that you run at startup. So it runs all of this stuff.
5:07 And then it yields to run the entire application. When the app shuts down, there's like a shutdown cleanup if you want. There's nothing for us to do.
5:14 But the core thing is to register the background jobs here. So what we can do is we can say asyncio create_task.
5:22 And we're going to give it the background service's worker function; calling create_task will kick it off.
5:31 Now there's a warning; it looks like I did something wrong. Perhaps I have, but this is saying: look, this is not awaited.
5:38 I really, really think async and await in Python are awesome. But there are some big-time mistakes in the design here.
5:48 And one of the big-time mistakes is there's no way to just say: start this work, I don't care about the answer.
5:57 That's why you can't just call this function directly and have it run; you have to do this create_task. But then it also expects you to await it.
6:04 I don't want to wait, I just want to put a bunch of stuff out into the ether and let it run as long as the app is alive.
6:10 So I have to mark this: this is not an error, I mean not to await it, right? We'll come back to this for search when we get there as well.
6:17 And search is going to be another thing that just runs in the background periodically: it'll wake up, index some things, and go back to sleep.
6:23 Here we go. This will have the background job actually kicked off and running. It's going to run this worker function.
6:32 So this is how we get this asyncio function running continuously in the background, doing a while True on the right event loop in FastAPI.
6:46 The next thing we've got to do is to actually transcribe the episode at AssemblyAI. Fun. Right now if we run it, it's not going to take very long.
6:54 I guess we could go ahead and just complete this loop here. Go over here and hit go. And we get this new job and we can check if it's done.
7:12 I imagine it probably is done. Yeah, look at that. So we started a new job and it's finished, because, well, when it ran, it didn't
7:20 have to wait on anything. So it marked it finished in the database. Excellent.
7:25 It looks like everything's running, and maybe one more thing we can check up here is right there:
7:32 the background asyncio service worker is up and running, just cruising along. You see it's getting these new jobs and then finishing them.
7:41 Yes, it is finished.
