Async Techniques and Examples in Python Transcripts
Chapter: Why async?
Lecture: An upper bound for async speed improvement
0:00 Now you may be thinking, "Oh my gosh, Michael
0:02 you have 12 CPUs, you can make your code
0:05 go 12 times faster."
0:08 Sorry to tell you that is almost never true.
0:11 Every now and then you'll run into algorithms
0:13 that are sometimes referred to
0:15 as embarrassingly parallelizable.
0:17 If you do, say, ray tracing, and every single pixel
0:21 is going to have its own track of computation.
0:24 Yes, we could probably make that go nearly 12 times faster.
0:27 But most algorithms, and most code, don't work that way.
0:32 So if we look at maybe the execution
0:34 of a particular algorithm, we have these two sections
0:38 here, these two green sections
0:39 that are potentially concurrent.
0:41 Right now they're not, but imagine when we said
0:43 "Oh that section and this other section.
0:45 We could do that concurrently."
0:47 And let's say those represent 15% and 5%
0:50 of the overall time.
0:52 If we were able to take this code
0:54 and run it on an algorithm that entirely broke
0:57 that up into parallelism, the green parts.
1:00 Let's say the orange part cannot be broken apart.
1:02 We'll talk about why that is in just a second.
1:04 If we can break up this green part
1:06 and let's imagine we had as many cores
1:08 as we want, a distributed system
1:10 on some cloud system.
1:11 We could add millions of cores if we want.
1:13 Then we could make those go to zero.
1:15 And if we could make the green parts go to zero
1:18 like an extreme, non-realistic experience
1:21 but think of it as an upper bound
1:23 then how much would be left?
1:25 80%. The overall performance boost
1:28 we could get would only be a 20% cut in run time.
1:30 So when you're thinking about concurrency
1:32 you need to think about, well how much
1:33 can be made concurrent
1:36 and is that worth the added complexity?
1:38 And the added challenges, as we'll see.
1:41 Maybe it is. It very well may be. But it might not be.
1:45 In this case, maybe a 20% gain but really added complexity.
1:49 Maybe it's not worth it. Remember that 20% is the gain
1:52 if we could add infinite parallelism
1:55 basically, to make that go to zero
1:56 which won't really happen, right?
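A quick way to check that upper bound yourself: the arithmetic above is commonly known as Amdahl's law. The sketch below is not from the lecture. The function name `speedup` and the worker counts are illustrative assumptions, and the 20% comes from adding the two green sections (15% and 5%).

```python
import math


def speedup(parallel_fraction: float, workers: float) -> float:
    """Overall speedup when only `parallel_fraction` of the run time
    can be spread across `workers` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction  # the orange part, unchanged
    # The green part shrinks with more workers; with infinite workers it is 0.
    remaining_time = serial_fraction + parallel_fraction / workers
    return 1.0 / remaining_time


# The two green sections from the example: 15% and 5% of total run time.
PARALLEL_FRACTION = 0.15 + 0.05

for workers in (1, 2, 12, math.inf):
    print(f"{workers:>4} workers: {speedup(PARALLEL_FRACTION, workers):.3f}x")
```

With infinite workers, 80% of the original run time is still left, which works out to a 1.25x speedup. With 12 cores the result lands a little below that, nowhere near 12x.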
1:58 So you want to think about what is the upper bound
2:00 and why might there be orange sections?
2:03 Why might there be sections
2:04 that we just can't execute in parallel?
2:06 Let's think about how you got into this course.
2:10 You probably went to the website, and you found the course and you clicked a button
2:11 and said, "I'd like to buy this course,"
2:13 put in your credit card, and the system
2:15 went through a series of steps.
2:16 It said, "Well, OK, this person wants to buy the course.
2:18 Here's their credit card.
2:19 We're going to charge their card
2:21 then we're going to record an entry in the database
2:24 that says they're in the course
2:25 and then we're going to send an email
2:26 that says, hey, thanks for buying the course
2:29 here's your receipt, go check it out."
2:31 That can't really be parallelized.
2:34 Maybe the last two. Maybe if you're willing to accept
2:37 that email going out potentially
2:38 even if the database failed, it's unlikely
2:40 but, you know, possible.
2:42 But you certainly cannot take charging the credit card
2:45 and sending the welcome email
2:47 and make those run concurrently.
2:49 There's a decent chance that for some reason
2:51 a credit card got typed in wrong, or
2:53 it's flagged for fraud, possibly not appropriately
2:57 but, right, you've got to see what the credit card system
3:00 and the company say.
3:02 There might not be funds for this for whatever reason.
3:04 So we just have to first charge the credit card
3:08 and then send the email.
3:09 There's no way to do that in parallel.
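The ordering constraint can be sketched with asyncio. This is only an illustration, not the course site's actual checkout code: `charge_card`, `record_enrollment`, `send_receipt_email`, and `CardDeclined` are hypothetical names, and `asyncio.sleep` stands in for the real network calls.

```python
import asyncio


class CardDeclined(Exception):
    """Hypothetical error for a mistyped, flagged, or unfunded card."""


async def charge_card(card_number: str, log: list) -> None:
    await asyncio.sleep(0.01)  # stand-in for the payment provider call
    if not card_number.isdigit():
        raise CardDeclined(card_number)
    log.append("charged")


async def record_enrollment(log: list) -> None:
    await asyncio.sleep(0.01)  # stand-in for the database write
    log.append("recorded")


async def send_receipt_email(log: list) -> None:
    await asyncio.sleep(0.01)  # stand-in for the email service call
    log.append("emailed")


async def purchase(card_number: str) -> list:
    log = []
    try:
        # Each await finishes before the next step starts, so the email
        # is only sent once the charge has actually succeeded.
        await charge_card(card_number, log)
        await record_enrollment(log)
        await send_receipt_email(log)
    except CardDeclined:
        log.append("declined")
    return log


print(asyncio.run(purchase("4242424242424242")))
print(asyncio.run(purchase("not-a-card")))
```

If the charge and the email were started together, the declined purchase would still have sent a receipt. Awaiting each step in turn is what rules that out.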
3:11 Maybe we can do it in a way that's more scalable
3:13 that lets other operations unrelated to this run.
3:17 That's a different consideration
3:19 but in terms of making this one request
3:21 this one series of operations faster
3:24 we can't make those parallel.
3:26 And those are the orange sections here.
3:27 Just, a lot of code has to happen in order
3:29 and that's just how it is.
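The scalability point mentioned a moment ago can be sketched as well. In this rough example, with the hypothetical names `handle_purchase` and `main` and `asyncio.sleep` standing in for real I/O, each purchase still runs its steps strictly in order, but several unrelated purchases overlap while they wait. No single request gets faster. The server just handles more of them at once.

```python
import asyncio
import time


async def handle_purchase(customer: str) -> list:
    steps = []
    # The steps for one customer are strictly sequential.
    for step in ("charge", "record", "email"):
        await asyncio.sleep(0.05)  # stand-in for a network or database wait
        steps.append(step)
    return steps


async def main(customers: list) -> tuple:
    start = time.perf_counter()
    # Unrelated purchases overlap while each one waits on I/O.
    results = await asyncio.gather(*(handle_purchase(c) for c in customers))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main(["ana", "ben", "chen"]))
print(results)
print(f"{elapsed:.2f}s for {len(results)} purchases")
```

Three purchases handled one after another would take roughly three times as long as one. Run this way, the total should be close to the time of a single purchase.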
3:32 Every now and then, though, we have these green sections
3:33 that we can parallelize, and we'll be focused
3:36 on that in this course.
3:38 So keep in mind there's always an upper bound
3:40 for improvement, even if you had infinite cores
3:43 and infinite parallelism, you can't always
3:45 just make it that many times faster, right?
3:48 There are these limitations, these orange sections
3:50 that have to happen in serial.