Async Techniques and Examples in Python Transcripts
Chapter: Why async?
Lecture: An upper bound for async speed improvement
Now you may be thinking, "Oh my gosh, Michael, you have 12 CPUs, you can make your code go 12 times faster." Sorry to tell you, that is almost never true. Every now and then you'll run into algorithms that are sometimes referred to as embarrassingly parallel. If you do, say, ray tracing, every single pixel gets its own independent track of computation. Yes, we could probably make that go nearly 12 times faster. But most algorithms, and most code, don't work that way.
0:32
So if we look at maybe the execution
0:34
of a particular algorithm, we have these two sections
0:38
here, these two greens sections
0:39
that are potentially concurrent.
0:41
Right now they're not, but imagine when we said
0:43
"Oh that section and this other section.
0:45
We could do that concurrently."
0:47
And let's say those represent 15% and 5%
0:50
of the overall time.
0:52
If we were able to take this code
0:54
and run it on an algorithm that entirely broke
0:57
that up into parallelism, the green parts.
1:00
Let's say the orange part cannot be broken apart.
1:02
We'll talk about why that is in just a second.
1:04
If we can break up this green part
1:06
and let's imagine we had as many cores
1:08
as we want, a distributed system
1:10
on some cloud system.
1:11
We could add millions of cores if we want.
1:13
Then we could make those go to zero.
1:15
And if we could make the green parts go to zero
1:18
like an extreme, non-realistic experience
1:21
but think of it as a upper bound
1:23
then how much would be left?
1:25
80%, the overall performance boost
1:28
we could get would only be 20%.
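That upper bound is just Amdahl's law. Here's a quick back-of-the-envelope check of it in Python, using the 15% and 5% figures from above:

```python
# A rough check of the upper bound above, using Amdahl's law.
def max_speedup(parallel_fraction: float) -> float:
    """With unlimited cores, the parallel portion's time goes to
    (nearly) zero, so only the serial portion limits the speedup."""
    return 1.0 / (1.0 - parallel_fraction)

# The two green sections: 15% + 5% = 20% of the total run time.
p = 0.20
print(f"Best-case speedup: {max_speedup(p):.2f}x")  # about 1.25x
print(f"Serial work remaining: {1 - p:.0%}")        # 80%
```

Even with infinite cores, a 20%-parallelizable program can only run about 1.25 times faster; the 80% of serial work dominates.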
1:30
So when you're thinking about concurrency
1:32
you need to think about, well how much
1:33
can be made concurrent
1:36
and is that worth the added complexity?
1:38
And the added challenges, as we'll see.
1:41
Maybe it is. It very well may be. But it might not be.
1:45
In this case, maybe a 20% gain but really added complexity.
1:49
Maybe it's not worth it. Remember that 20% is a gain
1:52
of if we could add infinite parallelism
1:55
basically, to make that go to zero
1:56
which won't really happen, right?
1:58
So you want to think about what is the upper bound
2:00
and why might there might be orange sections?
2:03
Why might there be sections
2:04
that we just can't execute in parallel?
2:06
Let's think about how you got in this course.
2:10
You probably went to the website, and you found the course and you clicked a button
2:11
and said, "I'd like to buy this course,"
2:13
put in you credit card, and the system
2:15
went through a series of steps.
2:16
It said, "Well, OK, this person wants to buy the course.
2:18
"Here's their credit card.
2:19
We're going charge their card
2:21
then we're going to record an entry in the database
2:24
that says they're in the course
2:25
and then we're going to send an email
2:26
that says, hey, thanks for buying the course
2:29
here's your receipt, go check it out."
2:31
That can't really be parallelized.
2:34
Maybe the last two.Maybe if you're willing to accept
2:37
that email going out potentially
2:38
if the database failed, it's unlikely
2:40
but, you know, possible.
2:42
But you certainly cannot take charging the credit card
2:45
and sending the welcome email
2:47
and make those run concurrently.
2:49
There's a decent chance that for some reason
2:51
a credit card got typed in wrong
2:53
it's flagged for fraud, possibly not appropriately
2:57
but, right, you got to see what the credit card system
3:00
and the company says.
3:02
There might not be funds for this for whatever reason.
3:04
So we just have to first charge the credit card
3:08
and then send the email.
3:09
There's no way to do that in parallel.
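That ordering can be sketched with asyncio. The function names below are hypothetical stand-ins for the real operations, and the sleeps just simulate I/O:

```python
import asyncio

# Hypothetical stand-ins for the real steps; the sleeps simulate I/O.
async def charge_card(card_number: str) -> str:
    await asyncio.sleep(0.1)        # talk to the payment gateway
    return "txn-001"                # transaction id on success

async def record_purchase(txn_id: str) -> None:
    await asyncio.sleep(0.05)       # write the enrollment to the database

async def send_receipt_email(txn_id: str) -> None:
    await asyncio.sleep(0.05)       # send the receipt email

async def purchase(card_number: str) -> str:
    # The charge must succeed before anything else happens; if it
    # raises (declined, fraud flag, no funds), nothing below runs.
    txn_id = await charge_card(card_number)
    # Only these last two steps could run concurrently, and only if you
    # accept that the email might go out even if the database write fails.
    await asyncio.gather(record_purchase(txn_id), send_receipt_email(txn_id))
    return txn_id

print(asyncio.run(purchase("test-card")))
```

The `await` on `charge_card` is the orange section: no amount of hardware lets the email start before the charge comes back.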
3:11
Maybe we can do it in a way that's more scalable
3:13
that lets other operations unrelated to this run.
3:17
That's a different consideration
3:19
but in terms of making this one request
3:21
this one series of operations faster
3:24
we can't make those parallel.
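That scalability point, where the one request gets no faster but the server is free to do other work while it waits, can be sketched like this (the request handler and timings are illustrative, not from the course):

```python
import asyncio
import time

async def handle_request(n: int) -> None:
    # Within one request, the steps are still strictly sequential.
    await asyncio.sleep(0.2)    # e.g., awaiting the card charge
    await asyncio.sleep(0.1)    # then awaiting the email send

async def serve(count: int) -> float:
    # While one request is awaiting I/O, the event loop runs the others,
    # so 10 requests take roughly 0.3s total instead of roughly 3s,
    # even though no single request finished any faster.
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(n) for n in range(count)))
    return time.perf_counter() - start

elapsed = asyncio.run(serve(10))
print(f"10 requests handled in {elapsed:.2f}s")
```

Each request still pays its full 0.3 seconds of serial latency; the win is throughput across requests, not speed within one.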
3:26
And that's the orange sections here.
3:27
Just, a lot of code has to happen in order
3:29
and that's just how it is.
3:32
Every now and then, though, we have these green sections
3:33
that we can parallelize, and we'll be focused
3:36
on that in this course.
3:38
So keep in mind there's always an upper bound
3:40
for improvement, even if you had infinite cores
3:43
and infinite parallelism, you can't always
3:45
just make it that many times faster, right?
3:48
There's these limitations, these orange sections
3:50
that have to happen in serial.