Async Techniques and Examples in Python Transcripts
Chapter: Leveraging CPU cores with multiprocessing
Lecture: Demo: Scaling CPU-bound operations with multiprocessing
0:00
Here we are in our multiprocessing demo chapter. It might look familiar, actually. Remember this? compute and compute_threaded?
0:08
Where we did this math, especially in compute_threaded. We figured out how many processors we had and we said we're going to take this math function
0:16
and we were going to run it across all of these operations, but we're in fact going to break it into a bunch of segments and have it work
0:23
on each independent section. We saw that with the threads, that was super underwhelming. Let's just really quickly review that.
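As a rough illustration of the "break it into a bunch of segments" idea, here is a minimal sketch. The helper name split_range and the numbers are hypothetical and are not taken from the course code; it only shows one way to cut a range of work into independent (start, end) sections.

```python
def split_range(total, segments):
    """Return (start, end) pairs that cover range(total) without overlap."""
    size = total // segments
    bounds = []
    for n in range(segments):
        start = n * size
        # The last segment absorbs any remainder from the integer division.
        end = total if n == segments - 1 else start + size
        bounds.append((start, end))
    return bounds


if __name__ == "__main__":
    # Hypothetical workload size, split four ways.
    for start, end in split_range(30_000_000, 4):
        print(start, end)
```

Each pair can then be handed to a separate worker, since no section depends on another.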
0:30
We're going to call this function, do_math(), and it's just going to do a bunch of throwaway computational work for a certain period of time.
0:39
So the details aren't super important but our goal in this section is to stop using threads and start using multiprocessing.
0:48
Okay, so let's just take this threaded example and carry it forward. We're going to call the copy compute_multiprocessing and set that one to run.
0:59
And it'll still do threaded stuff for the moment; that doesn't matter, we'll come back to it. So our goal is to replace this thread stuff.
1:07
Right now, what we're doing with our threads is creating them, then starting them, and then waiting on them.
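For reference, the create, start, wait pattern being replaced might look roughly like this. It is a sketch under assumptions: this do_math is a hypothetical stand-in for the course's function (the transcript only says it burns CPU), and run_threaded and the workload sizes are made up for illustration.

```python
import math
import threading


def do_math(start=0, num=10):
    # Hypothetical stand-in for the course's function: pure CPU busywork.
    pos = start
    k_sq = 1000 * 1000
    while pos < num:
        pos += 1
        math.sqrt((pos - k_sq) * (pos - k_sq))
    return pos - start  # number of iterations performed


def run_threaded(total, thread_count):
    size = total // thread_count
    # 1. Create the threads, one per segment of the work.
    threads = [
        threading.Thread(target=do_math, args=(n * size, (n + 1) * size))
        for n in range(thread_count)
    ]
    # 2. Start them.
    for t in threads:
        t.start()
    # 3. Wait on them.
    for t in threads:
        t.join()
    return len(threads)


if __name__ == "__main__":
    run_threaded(total=300_000, thread_count=4)
```

In a standard CPython build, the global interpreter lock lets only one thread execute Python bytecode at a time, which is why this pattern gives little speedup for CPU-bound work.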
1:12
So what we're going to do is stop using threads and we're going to just focus on multiprocessing. Now we're not actually going to need this
1:21
instead, to make this a little bit simpler, we're going to create this thing called a pool, so I'll call it pool.
1:26
The pool is going to be a multiprocessing.Pool, and we can set a couple of things in here. For example, we could set how many processes
1:34
we'd like to use. Now, if we don't set any, if we say nothing, it's going to use the processor count, actually. So we can just leave it like this;
1:42
that's probably what we want. You may want to constrain it to have fewer processes than the machine has processors, and similarly you might want even more.
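A small sketch of the sizing choice just described. The helper pick_process_count is hypothetical and only mirrors the default behavior for illustration; the real default is applied inside multiprocessing.Pool itself when processes is left as None (it uses the CPU count reported by the os module).

```python
import multiprocessing
import os


def pick_process_count(requested=None):
    """Hypothetical helper: None means one worker per reported CPU."""
    return requested if requested is not None else (os.cpu_count() or 1)


if __name__ == "__main__":
    print("CPUs reported:", pick_process_count())

    # No argument: the pool picks its own size from the CPU count.
    with multiprocessing.Pool() as pool:
        pass

    # Explicit size: constrain the pool to fewer (or more) workers.
    with multiprocessing.Pool(processes=pick_process_count(2)) as pool:
        pass
```

Leaving the count unset is usually a reasonable starting point for CPU-bound work; an explicit number is mostly useful when the machine must stay responsive for other tasks.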
1:50
So what we're going to do is go to the pool and call a function called apply_async. Here we go, and what does it take?
1:56
We have to pass the function, and it's a little bit annoying that the signature's different. I don't know why, but we don't have a target;
2:04
we just have the function, passed as an unnamed argument. So we say do_math, and it'll call it, and then we have the arguments, which are, again, going to be the same ones.
2:13
And that's it. This is actually going to start the work; we don't need to call start. And how do we wait? Well, we're going to say
2:20
pool.close() to tell it that no more work is coming, so when it's processed all its work it can quit. And then we're going to, just like we did on all of
2:28
the individual threads, we're going to join on this. Clean up the formatting there. Okay, I think that actually might do it.
2:38
Not sure we could do anything different. What did we do? We didn't put our threads in a list and then manage the list; we're just calling
2:45
pool.apply_async, we're using the number of processors on this machine by not specifying any particular number here.
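Putting the steps above together, the multiprocessing version might look roughly like this. It is a sketch, not the course's actual file: do_math is a hypothetical stand-in, and the workload size and the main function are assumptions. The Pool, apply_async, close, and join calls are the ones named in the lecture.

```python
import math
import multiprocessing


def do_math(start=0, num=10):
    # Hypothetical stand-in for the course's function: pure CPU busywork.
    pos = start
    k_sq = 1000 * 1000
    while pos < num:
        pos += 1
        math.sqrt((pos - k_sq) * (pos - k_sq))
    return pos - start  # number of iterations performed


def main():
    total = 3_000_000  # assumed workload size, small enough for a quick run
    segments = multiprocessing.cpu_count()
    size = total // segments  # any remainder is ignored for brevity

    # No process count given, so the pool uses the machine's CPU count.
    pool = multiprocessing.Pool()
    for n in range(segments):
        # The function is the first positional argument (no target=...).
        # The task is queued immediately; there is no start() call.
        pool.apply_async(do_math, args=(n * size, (n + 1) * size))

    pool.close()  # no more work is coming
    pool.join()   # wait for the workers to finish what was queued


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters here because worker processes may re-import the main module on some platforms. Also note that apply_async returns an AsyncResult; an exception raised inside a task only surfaces if you keep that object and call get() on it.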
2:52
Let's go. Let me run the threaded one one more time, or actually, let's run the single-threaded one.
2:59
Just to see how it goes. Remember, it was around 7.5 seconds. 7, 7, 7. How is our multiprocessing going to do compared to this? Let's give it a shot.
3:12
It's running, it's done. Yes, that is what we wanted. Is it as fast as if we had divided the time by 12, a 12x increase?
3:22
I don't think so, let's find out. No, it's about five times faster. We are starting separate processes and all of that, which has some overhead.
3:31
Still we've seen a dramatic, dramatic increase in performance, we went from 7.8 seconds to 1.4 seconds, and you saw how much code that actually took.
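The arithmetic behind "about five times faster" can be checked with a few lines. This sketch uses the two timings quoted in the lecture (7.8 and 1.4 seconds) and assumes the 12 mentioned earlier is the number of logical cores; the function names are made up for illustration.

```python
def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run was."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup(serial_seconds, parallel_seconds) / workers


if __name__ == "__main__":
    s = speedup(7.8, 1.4)
    e = efficiency(7.8, 1.4, 12)  # 12 cores is an assumption from the lecture
    print(f"speedup: {s:.1f}x, efficiency: {e:.0%}")
    # prints: speedup: 5.6x, efficiency: 46%
```

So the run reached a bit under half of the ideal 12x, which fits the explanation that starting processes and sharing the machine with other work both cost something.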
3:43
It was a lot of talking and explaining and so on, but actually, in terms of rewriting this loop, it's ridiculously little, right?
3:49
We created a pool and, instead of creating a thread and then calling start, we just called pool.apply_async. Pretty much the same arguments, and off to the races we go.
3:59
Let's just look at this really quick in my little Glances program. Okay, here you can see not a whole lot's going on.
4:05
CPU is around 6%. This is not going to run long enough for interesting things to happen, so I'm going to make it a little bit longer,
4:13
ten times longer, so we can run it for 10 seconds and see what happens. Now if we watch, what's the CPU usage? Woo, it's 96% of the entire machine, 99.
4:25
Remember, I'm doing screen recording, and that probably knocks out one of the CPUs right there. And look at all of these Python processes.
4:31
Here's the multiprocessing happening. So they're all running and they're cranking along and then boom, they're done.
4:38
They're all finished and now we're just down to my one little Python app that's sitting here or something to that effect.
4:44
So there we go, it took 14 seconds because we gave it 10 times as much work, so it's pretty much a linear scale right there. Awesome, awesome, awesome.
4:53
So multiprocessing is really quite easy to use.