Async Techniques and Examples in Python Transcripts
Chapter: Parallelism in C with Cython
Lecture: Demo: Fast threading with cython (app review)

0:00 Now that we know a little bit about Cython
0:02 and how to use it, it's time to take the Cython power
0:05 and apply it to concurrent parallel programming
0:08 apply it to threads, and you're going to see
0:10 how Cython first just speeds up the code, period
0:13 because it's C and not Python and then breaks us free
0:16 from the GIL so we can go even faster.
0:19 There's just a handful of concepts that we got to learn
0:21 to convert the code we've already written
0:23 to extremely high performance C code.
0:26 Now let's go back to this computed useless math thing
0:29 that we're doing.
0:30 Right here is our do_math, and remember, we have
0:34 let me again put the little digit grouping things
0:36 in here so you can tell more easily.
0:38 We're asking this little math function
0:41 to increment an integer
0:42 and to do some square roots and multiplication
0:45 and subtraction 30 million times.
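To make the workload concrete, here is a minimal sketch of what a function like do_math might look like. The transcript only says it increments an integer and does square roots, multiplication, and subtraction 30 million times, so the exact expression, the parameter names, the k_sq constant, and the returned total are all assumptions for illustration, not the course's actual code.

```python
import math

# The digit grouping mentioned in the lecture: 30 million iterations.
ITERATIONS = 30_000_000


def do_math(start: int = 0, num: int = 10) -> float:
    """CPU-bound busywork: count from start up to num, doing a subtraction,
    a multiplication, and a square root on every step.

    The running total is returned only so the result can be checked;
    it is an assumption added for this sketch.
    """
    pos = start
    k_sq = 1000 * 1000  # hypothetical constant, chosen for illustration
    total = 0.0
    while pos < num:
        pos += 1  # increment an integer
        total += math.sqrt((pos - k_sq) * (pos - k_sq))
    return total


if __name__ == "__main__":
    # A small run so the sketch finishes quickly; pass num=ITERATIONS
    # to reproduce the full-size workload described in the lecture.
    print(do_math(num=300_000))
```

Every step of that loop is ordinary interpreted Python, which is what makes it a useful baseline for the comparisons that follow.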
0:48 Now, turns out for 30 million times
0:51 this is actually pretty fast.
0:52 Let's run it. 7.6 seconds.
1:02 Yeah, that's good.
1:03 I mean, we did 30 million things
1:05 actually way more than that, so impressive
1:08 but we can do it way faster.
1:10 So this you've already seen, and we talked about this
1:12 in the threading section
1:13 in the multiprocessing section, and so on.
1:16 So we said, well, if we'd run this in a threaded mode
1:19 on 12 processors, remember how awesome this computer is?
1:21 It has 12 cores.
1:22 Maybe cores is a better word than processors.
1:24 Doesn't really matter. It should get faster.
1:26 Oh, about the same.
1:29 I threw in a little factor for these so you could tell based
1:32 on some standard, running this a few times
1:34 how much faster or slower.
1:35 So the threaded version's about the same.
1:38 This one, the multiprocessing one, does way better.
1:41 Yes, it's 1.79 seconds, 4.7 times faster.
1:46 That is a big improvement. Can it get better still?
1:49 So that's our goal. Our goal is to take this code.
1:54 Actually, we're going to take the threaded version here
1:57 the threaded version, and that's using 12 threads
2:01 'cause there's 12 cores, and it's trying to run it
2:04 and we saw that this one actually was almost exactly
2:06 the same speed as just the serial one.
2:09 We saw with multiprocessing
2:10 that we can get 4.7 times faster
2:13 but why again is this threading one not helping us?
2:16 It's not helping because all of the execution
2:20 all the math work that we're doing is a series
2:22 of interpreter instructions in CPython.
2:24 The GIL tells us that only one instruction at a time can run
2:28 period, regardless of how many threads
2:30 or things like that are running.
2:32 So this is clogged up.
2:33 It can't go any faster with threading
2:35 but that's because of the GIL.
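The comparison described here can be sketched roughly as follows: split the range into one chunk per thread, run the same pure-Python loop in each thread, and time it against a single serial call. The helper names split_range, run_threaded, and timed are hypothetical, and the do_math body is an assumed stand-in for the course's function. On a standard CPython build with the GIL, the two timings would be expected to come out close to each other, since only one thread executes Python bytecode at a time.

```python
import math
import threading
import time


def do_math(start: int = 0, num: int = 10) -> None:
    """Assumed stand-in for the lecture's CPU-bound function."""
    pos = start
    k_sq = 1000 * 1000
    while pos < num:
        pos += 1
        math.sqrt((pos - k_sq) * (pos - k_sq))


def split_range(total: int, parts: int) -> list[tuple[int, int]]:
    """Divide [0, total) into contiguous (start, end) chunks, one per thread.
    The last chunk absorbs any remainder."""
    chunk = total // parts
    bounds = []
    for n in range(parts):
        begin = n * chunk
        end = total if n == parts - 1 else begin + chunk
        bounds.append((begin, end))
    return bounds


def run_threaded(total: int, thread_count: int) -> None:
    """Run do_math over the whole range using thread_count threads."""
    threads = [
        threading.Thread(target=do_math, args=(begin, end))
        for begin, end in split_range(total, thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def timed(fn, *args) -> float:
    """Return the wall-clock seconds taken by fn(*args)."""
    t0 = time.perf_counter()
    fn(*args)
    return time.perf_counter() - t0


if __name__ == "__main__":
    total = 300_000  # kept small here; the lecture uses 30_000_000
    serial = timed(do_math, 0, total)
    threaded = timed(run_threaded, total, 12)  # 12 threads, as in the lecture
    print(f"serial:   {serial:.3f} s")
    print(f"threaded: {threaded:.3f} s")
```

The threads do divide the work correctly; they just cannot execute their Python instructions at the same moment, so the extra cores sit idle.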
2:37 What if we could use Cython to make this happen in C
2:40 and break free from the GIL?
2:42 Well, it'd definitely be on, wouldn't it?
2:43 And it turns out we are going to do that next.