MongoDB with Async Python Transcripts
Chapter: Foundations: async
Lecture: Full Concurrency Weather Client
0:00
We made a call to one URL, let's see about doing it for many. And we gotta be just a tiny bit careful here, because if you look at the response,
0:13
you can see that you only get 46 lookups left. You get 100 lookups an hour with this API, I think, 'cause it's really for the course,
0:24
it's not for you to build apps around, all right? So it's heavily, heavily rate limited. I think if you ask for the same one twice, you might be fine.
0:34
So it could be no big deal. But anyway, well, we do have a limit on how many you can get. All right. That said, let's go and write some code.
0:43
It's going to be awesome. So instead of getting one of these, let's go and rewrite this function to get all of them. So let's go like this.
0:53
We'll have reports equal a list here. And let's make this function here; we'll call it show_report. We go, and it says async.
1:11
Don't know why, PyCharm, you think that needs to be async, 'cause there's nothing async happening in there. So we're gonna go for the report
1:19
and then we'll just say for URL in locations. Well, can we just do this? And we put a URL here. Technically, we don't need this yet.
1:36
Or do we? Well, you'll see that we will. And let's print out, I guess we're kind of doing this already; we're printing that we're calling one
1:45
and then we're printing out our report. So you should be able to see the speed at which this is happening. Let's run it and see what happens.
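For reference, the naive version we just wrote looks roughly like this. It's a sketch: get_report and locations come from earlier in the lecture, and the exact names are my reconstruction.

```python
# A sketch of the naive, one-at-a-time version (names assumed from the lecture).
async def show_report_naive():
    reports = []
    for url in locations:
        # This await means: start one request, wait for it to finish
        # completely, and only then move on to the next URL.
        report = await get_report(url)
        reports.append(report)

    for report in reports:
        print(report)
```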
1:50
See it coming down, click, click, click, click, one at a time. Hmm, let's try it again. One at a time, they're coming back. Okay.
2:03
Why aren't they running concurrently? I told it it's async. I'm awaiting it right here. Well, it is awaiting, and it is allowing other work to happen,
2:16
but let's just talk through how this works. It says go through every location, all 10 of them, and start getting the report and then wait for it.
2:25
And you get the report, show it. After you get the report, go to the next one, start it, wait for it, show it.
2:33
That's exactly the same as if you had done it in a non-async way. Other things in the program could leverage
2:39
that waiting time, but this function, not so much. So what we need to do is we need to actually start all the requests and then wait for all of them
2:49
to finish, so that they're actually all started in parallel. There are a lot of interesting nuances here,
2:56
but let's go and just have it start a bunch of tasks and we'll just wait for them to finish, okay?
3:04
So in order to do that, we need to kind of do a two-step. We need to start all of them, hang on to the running work
3:11
and then wait for that work to finish. So we'll say tasks.append, and you would like to say get report,
3:19
but remember that doesn't actually start it, right? That just creates a coroutine that could be run.
3:25
So we'll say asyncio.create_task; that will start it and return a task that can be awaited.
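The distinction, roughly, is this. A hedged sketch: get_report is the function from this lecture, and these lines have to run inside a coroutine, where the event loop is already going.

```python
coro = get_report(url)            # just creates a coroutine object; nothing runs yet
task = asyncio.create_task(coro)  # schedules it on the running event loop right away
report = await task               # later: wait for (or simply collect) the result
```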
3:35
So now we have a bunch of them started, and we'll say for task in tasks, spelled correctly,
3:44
we're going to wait for the first one and then the next one and the next. So let's talk through this before we run it real quick.
3:51
We're going to start all of the work with this create task of each URL. So then we'll have 10 of them running all at the same time.
4:00
And then we're gonna go and say, when is the first one done? Then when is the second? When is the third? That's probably not the order
4:08
in which we get the responses back, but that doesn't matter. If we're waiting on the first one and the second one's already done,
4:15
then when we get to the second one, it just returns and says, I'm already done. You don't need to await me; I'm done, let's go.
4:22
Here's the report. So for this purpose, it's totally perfect. There are maybe more nuanced cases where you wanna find
4:30
which one completed first, process it first, but we just wanna start them all and then wait for them all. This is all we need.
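Putting the two-step together, the concurrent version looks something like this; again, get_report and locations are assumptions carried over from the lecture.

```python
import asyncio

# A sketch of the concurrent version: start everything, then await everything.
async def show_report():
    tasks = []
    for url in locations:
        # create_task starts the request immediately and hands back a Task.
        tasks.append(asyncio.create_task(get_report(url)))

    for task in tasks:
        # Awaiting in order is fine: a task that has already finished
        # returns its result immediately.
        report = await task
        print(report)
```

For the more nuanced case he mentions, processing results in whatever order they complete, asyncio.as_completed is one built-in option.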
4:38
Let's try again and see if this is any different. Start, done. Oh, all right, that was fun. Let's do it again.
4:46
I never get tired of this, but let's do it again. Start, done. In fact, let's put some timing around that.
4:54
And I'm going to make a main_naive and then the main one, let's say. So this one we just did. Remember what we had: we just awaited the call directly.
5:12
We don't need this. So we'll just grab the timing. Let's print out how many milliseconds to a tenth of a millisecond here.
5:32
And I'll add the same code to this other version here, like that. Now I'll run them both.
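The timing wrapper is something like this. A sketch: time.perf_counter is the standard tool for this, but the exact message format here is my guess.

```python
import time

async def main():
    t0 = time.perf_counter()
    await show_report()  # or show_report_naive() in main_naive
    dt_ms = (time.perf_counter() - t0) * 1000
    # Milliseconds, to a tenth of a millisecond.
    print(f'Done in {dt_ms:,.1f} ms.')
```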
5:49
So we'll run the naive one and then the regular one. Click, click, click, one at a time, they're coming in. What are the results here?
5:58
300 milliseconds to do them one at a time. Pretty good, considering this is going all the way to New York, where the web
6:04
server is, and I'm in Oregon. But what about this one? 38 milliseconds. Oh, that is awesome. What's the speedup? It's about a 10 times speedup. How much
6:18
concurrency are we adding? About 10 times the concurrency, doing 10 at a time instead of one. Surprise: we get an awesome, almost embarrassingly
6:30
parallel scaling by just going, "Hey, while we're waiting on one, let's just start the others, and then we'll process the results as they come back."
6:39
One more time, just to get some stability here: 39 milliseconds versus 339. Yeah, that feels pretty stable to me. So hopefully that is impressive to you.
6:54
Hopefully you appreciate this. So the idea is that we can do a whole bunch more work while we're waiting. And what are we waiting on here?
7:02
The internet. We're waiting on whatever service API we're calling, all the traffic through the internet
7:09
from Oregon to New York, the processing there, and back. And almost all our program does is wait on the internet.
7:16
So we get almost linear scalability by adding all that concurrency. Awesome. The one weird little hitch was we have to start all the work
7:27
and then begin waiting on the results. 'Cause if we just start one, wait for it to finish, start one, wait for it to finish,
7:35
it's exactly the same speed as if we didn't have parallelism. So you gotta think it through a little bit, but it's a really cool example.
7:42
This is async and await, and you can see what a massive difference in terms of performance and responsiveness it makes.