Build An Audio AI App Transcripts
Chapter: Feature 1: Transcripts
Lecture: Storing the Transcript in the DB

0:00 If we look over on the database side here, we have this transcript, which is not the same transcript you just saw right here, not the AssemblyAI transcript, but the one we're putting in the database, because transcribing is expensive in time.
0:17 And on a grand scale, if you did tons of transcripts, it would be expensive in money too. Any individual one is quite cheap. But you know, the idea of this app is maybe you give it 1,000 podcasts, and every episode that comes in, you transcribe that. You don't want to do that every time somebody wants to view it. Plus, you want it to be milliseconds, not seconds.
0:39 So what we're going to do is save that to the database. And this thing will have when it was created, when it was last updated, if there's a summary available around it, or if you just have the straight words, and this transcription word kind of captures those little blocks that we were talking about in there as well. And then you have just the full text and so on.
1:00 Okay, so we want to save it there. But first, let's see if it already exists. Let's go over here and we'll say DB transcript equals, and in here there's a full transcript for episode function, and we can give it the podcast ID and the episode number. And await. And we'll say, if it already exists, let's not recompute it. Okay.
1:29 We'll say the transcript for podcast, whatever, episode number, whatever, already exists, skipping, and we'll just return DB transcript. This thing returns one of these episode transcripts, or none if it doesn't exist. But in this case, we've checked it. So this is going to be an episode transcript, like we expect.
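The check-before-transcribing step can be sketched roughly like this. It is a minimal sketch, not the course's actual code: the in-memory dictionary stands in for MongoDB, and the names `EpisodeTranscript`, `full_transcript_for_episode`, and `transcribe_episode` are assumptions based on what is said in the lecture.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class EpisodeTranscript:
    # Hypothetical stand-in for the database model described in the lecture.
    podcast_id: str
    episode_number: int
    full_text: str = ""


# Assumption: an in-memory dict replaces the real MongoDB collection.
_FAKE_DB: dict[tuple[str, int], EpisodeTranscript] = {}


async def full_transcript_for_episode(
    podcast_id: str, episode_number: int
) -> Optional[EpisodeTranscript]:
    # Returns the stored transcript, or None if it does not exist yet.
    return _FAKE_DB.get((podcast_id, episode_number))


async def transcribe_episode(podcast_id: str, episode_number: int) -> EpisodeTranscript:
    db_transcript = await full_transcript_for_episode(podcast_id, episode_number)
    if db_transcript is not None:
        print(f"Transcript for {podcast_id}, episode {episode_number} already exists, skipping.")
        return db_transcript

    # The slow, paid transcription call would happen here; this sketch fakes it.
    db_transcript = EpisodeTranscript(podcast_id, episode_number, full_text="(transcribed text)")
    _FAKE_DB[(podcast_id, episode_number)] = db_transcript
    return db_transcript
```

Calling `transcribe_episode` twice for the same podcast and episode does the expensive work only the first time; the second call returns the stored object.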
1:49 Now it's not doing it. There's not going to be one yet because we have yet to save one. Next down here, instead of dumping this out, that was fun, but let's not do that. Up here, let's go ahead and create one of these DB transcripts, because it was none from above. So we'll go ahead and create one of these episode transcripts. And let's pass in some values. See what we've got to pass.
2:12 ID we don't pass, the revision ID we don't pass, created date and updated date have defaults that get set when they're inserted into the database, and now is good. But we can start with episode number equals, well, episode number, podcast ID, podcast ID, kind of just passing a lot of this information along here. Words, we will set in a minute. Summary, not now. Error message, none.
2:40 So we want to set whether it's successful, and how do we know if it was successful? This thing returns a status, and we can check what that status is equal to. It's a string, but just like before, I like having these enumerations when the set of strings is very limited.
2:59 This comes out of AssemblyAI. They've got queued, processing, completed, or error. And since we know it's not processing and not queued, it's going to be one of the other two. So we're just going to check to see that it is completed here. Okay. So it's successful if it was completed. The status is going to be the same value, but just stored directly.
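As a rough illustration of the enumeration idea, here is a minimal sketch. The four status strings are the ones listed in the lecture; the `TranscriptStatus` class and the `is_successful` helper are hypothetical names for this example, not necessarily what the course or the AssemblyAI SDK uses.

```python
from enum import Enum


class TranscriptStatus(str, Enum):
    # The four status strings named in the lecture.
    queued = "queued"
    processing = "processing"
    completed = "completed"
    error = "error"


def is_successful(status: str) -> bool:
    # Compare against the raw string so an unexpected status value
    # simply counts as "not successful" instead of raising an error.
    return status == TranscriptStatus.completed.value
```

Storing the raw status string next to the boolean keeps any unexpected value available for inspection later.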
3:24 So if for some reason we get a weird value, we can just look and see, okay, that's what that is. We have an AssemblyAI ID. When this comes back, it's going to give us an ID. And later, in other parts of the AssemblyAI API, instead of passing all the data or retranscribing, you can say, you've already transcribed this thing. Here's its ID. Let's do something with it. Let's ask questions about it and so on.
3:51 So we're going to store that in case we need it later. And I'm not sure this is necessary, but again, this takes a lot of time, relatively speaking, and it does cost a little bit each time. So we want to maybe store, overstore the information available here.
4:08 So I'm going to say the JSON result is going to be the transcript.json_response. That way, if for some reason we change the way we process the words and then generate additional information, we don't need to recompute it. We'll have that there and we can kind of start over. Okay.
4:29 And now one thing I did not set, the most important thing maybe, is the words, because what I actually want to put in here is a little bit different. Let's go over here. I want to store things a little bit differently. I want to store the text, the start time, and the confidence, and not the end time, for example.
4:49 So first, let's check and make sure this was successful. If not dbtranscript.success, we'll raise an exception here. We'll say error processing this transcript, the status was that, and here's the error message. Or if there's no error message, instead of saying none, we'll just leave that blank. Okay.
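The failure check might look something like the sketch below. The function name `ensure_successful` and the exact wording of the message are assumptions for illustration; the lecture only says that the status and the error message go into the exception, with a blank in place of none.

```python
from typing import Optional


def ensure_successful(successful: bool, status: str, error_msg: Optional[str]) -> None:
    # Hypothetical helper: raise if the transcription did not complete.
    if successful:
        return
    # Use an empty string rather than the text "None" when no message came back.
    detail = error_msg or ""
    raise Exception(f"Error processing transcript: status was {status}. {detail}".rstrip())
```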
5:12 So we'll raise an exception. And finally, let's store the words into our database transcript as well. So we can just loop over the words that came back through the API. So for word in transcript.words, the AssemblyAI thing, we're going to go and compute the time in seconds.
5:32 So when we're working, say, with audio in the audio players, we want to set the seconds, not milliseconds. So we'll say the start second is just divide the word start by a thousand so we don't have to do math all the time.
5:46 Then we can say tx word equals transcript word. Text equals word.text. Start in seconds is start second. Confidence equals word.confidence. And finally, we want to append that to our dbtranscript's words, which is a list.
6:14 And MongoDB, if you're not familiar with it, you can store lists of data and nested items and all those kind of things. You're not restricted to just tabular data. So we're going to store these as sub objects in the database.
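The word conversion loop can be sketched as follows. Both classes are stand-ins written for this example: `ApiWord` imitates the shape of a word coming back from the API (times in milliseconds), and `TranscriptWord` with its `start_in_sec` field is a hypothetical version of the nested object stored in the database.

```python
from dataclasses import dataclass


@dataclass
class ApiWord:
    # Stand-in for a word returned by the transcription API (times in milliseconds).
    text: str
    start: int
    end: int
    confidence: float


@dataclass
class TranscriptWord:
    # Hypothetical nested object stored in the database: no end time kept.
    text: str
    start_in_sec: float
    confidence: float


def convert_words(api_words: list[ApiWord]) -> list[TranscriptWord]:
    words: list[TranscriptWord] = []
    for word in api_words:
        # Convert once here so the audio player code can work in seconds.
        start_sec = word.start / 1000
        words.append(
            TranscriptWord(text=word.text, start_in_sec=start_sec, confidence=word.confidence)
        )
    return words
```

For instance, a word that starts at 1500 milliseconds ends up with a start of 1.5 seconds.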
6:28 So our database object is all initialized. We just want to await dbtranscript.save and then return dbtranscript. So we get something there. Excellent. So in review, how do we add it to the database?
6:46 First, we check if it's there. Do the transcription just as before. No longer print it out. We're going to create one of these objects with the data we got back. If it's not successful, well, let them know. And then convert these words over into our own objects.
7:06 We'd have to do something like this anyway, because we need a hidden model that's going to go into the database anyway. So we're going to do this and then save it and give it back to the rest of the app to work with. Let's do one more transcription here.
7:24 We'll do the same one. But this is the last time we're ever going to do this episode's transcription, because it's going in the database. Speeding up ahead just so you don't have to wait for the transcription.
7:39 All right. It's all done. Let's see if we can see it in the database. I'm going to open up Studio 3T. You can use whatever thing you want to talk to Mongo. But there's a free version of Studio 3T that's pretty nice.
7:51 I'm just going to connect to the local database and here you can see X-Ray podcasts. What have we got here? Transcripts. And look at that. Here it is. The AssemblyAI ID is apparently that, and there's the time it was created.
8:07 Here you can see there's the JSON result. And let's go way down. Here we go. These are our words. Start in seconds. The text and the confidence again that we got there. Excellent. So if we try to transcribe that again, let's see what happens if I go over here and say transcribe.
8:28 We look and see what the text says. Starting a new job for this episode. Boom. This one with that number already exists. Skipping. Like we had at the top. Perfect. Right there. So we don't need to transcribe that again. We stored it in the database.
8:50 Now we just got to do something interesting with it, right?

