Python Memory Management and Tips Transcripts
Chapter: Recovering memory in Python
Lecture: Ref-counting and the GIL
0:00 If you've thought about multi-threaded programming or parallel or asynchronous programming in Python,
0:05 you've surely heard about this thing,
0:06 the "GIL", or the "Global Interpreter Lock",
0:09 and you often hear this presented as a negative.
0:12 What this thing does is it
0:13 only allows a single Python instruction to be executed at any given moment,
0:19 no matter how many threads you have or how many cores you have,
0:21 none of that matters: one Python instruction at a time.
0:25 So that could be very limiting
0:27 for how parallelism works in Python.
0:30 Now there are a lot of situations where it works
0:32 great; I actually go through them in my asynchronous programming course.
0:35 But things like "I'm waiting on a network connection for a response from a web service
0:41 or some kind of API I'm calling", that doesn't count as executing a Python
0:46 statement; that's down somewhere in the OS.
0:47 So if you're waiting on other things,
0:50 actually the parallelism is great. But for computational stuff where the computational bits are in
0:55 Python, this GIL is a big problem.
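A minimal sketch of that difference, assuming `time.sleep` as a stand-in for waiting on a network response (it releases the GIL while it blocks, as real socket waits do). The helper names `wait_for_io`, `burn_cpu`, and `run_in_threads` are made up for this illustration, and the timings will vary by machine and Python build.

```python
import threading
import time

def wait_for_io(seconds=0.2):
    # Stand-in for a blocking network call: the thread waits in the OS,
    # so other threads are free to run in the meantime.
    time.sleep(seconds)

def burn_cpu(n=200_000):
    # Pure-Python arithmetic: the thread needs the GIL for every step.
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_in_threads(target, count=4):
    # Start `count` threads on the same function and time the whole batch.
    threads = [threading.Thread(target=target) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

io_elapsed = run_in_threads(wait_for_io)
cpu_elapsed = run_in_threads(burn_cpu)
print(f"4 waiting threads:   {io_elapsed:.2f}s")
print(f"4 computing threads: {cpu_elapsed:.2f}s")
```

The four waiting threads should finish in roughly the time of a single wait, because the waits overlap. For the computing threads, comparing against a single call to `burn_cpu` on your own machine shows how much (or how little) the extra threads help.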
0:57 Actually, Eric Snow and some of the other core developers are working on PEP
1:01 554, which will create what are called subinterpreters, or multiple interpreters, that take sort of a
1:08 copy of everything we've talked about and replicate separate,
1:11 isolated versions and each version can run on a thread
1:14 and it'll potentially solve or alleviate some of these problems by not sharing objects. That's
1:19 in the future. I have a lot of hope for it.
1:21 It sounds really cool, but I want to just bring up this GIL because it's
1:25 actually super interesting in light of what we've just talked about: not the garbage collection,
1:30 more the reference counting side of things.
1:32 So let's think about a world without the GIL.
1:35 We have stuff running. We've got these different references
1:39 to the PyObjects, different pointers that are all pointing back from potentially different threads,
1:45 and they're coming and going, as they do, multi-threaded, right? In parallel.
1:50 So in that case, you're gonna have to do some kind of thread lock,
1:53 some kind of critical section or mutex or something around that piece of data that
1:59 is the reference count on the PyObject,
2:01 every single one, even things like numbers and strings. You can imagine that's gonna drag
2:08 things down and be super, super slow,
2:11 right? That's a lot of overhead for every single normal operation.
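To make that overhead concrete, here is a rough sketch of what per-object locking might look like if it were written in Python. `LockedRefCount` and `churn` are hypothetical names for this illustration only; CPython's real reference counts live in C and are not implemented this way.

```python
import threading

class LockedRefCount:
    """Hypothetical stand-in for one object's reference count, guarded by its own lock."""

    def __init__(self):
        self.count = 1                  # one reference held by the creator
        self._lock = threading.Lock()   # every object would need one of these

    def incref(self):
        with self._lock:                # acquire and release on every new reference
            self.count += 1

    def decref(self):
        with self._lock:                # and again every time a reference goes away
            self.count -= 1
            return self.count == 0      # True would mean "safe to free"

def churn(obj, times):
    # Simulate a thread repeatedly taking and dropping a reference.
    for _ in range(times):
        obj.incref()
        obj.decref()

obj = LockedRefCount()
threads = [threading.Thread(target=churn, args=(obj, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(obj.count)  # 1: every incref was matched by a decref
```

The count stays correct, but only because each of the 80,000 updates paid for a lock acquire and release. Multiply that by every number, string, and list a program touches and the cost becomes clear.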
2:14 So the tradeoff was made that said,
2:16 "Well, let's only let one bit of Python run at a time."
2:19 In that case, there's no possibility of
2:21 a race condition around the reference count,
2:24 so we don't have to have thread-locking on it and it'll be much faster.
2:28 So that's really what the GIL is all about.
2:30 People think of it as like a threading protection thing and it kind of, sort of
2:34 is. But really, what it is, is a reference counting protection mechanism that allows
2:38 Python's reference counting to be done without thought or care of parallelism or thread safety or
2:45 all those things that are somewhat hard but certainly expensive relative to not doing them,
2:50 and they can just freely work with these objects and because of the way it runs, you're
2:54 never gonna get a race condition on that reference count. That's the Global Interpreter Lock.
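The reference count being protected here can be observed from Python with `sys.getrefcount`. This is a small sketch assuming CPython; the absolute numbers it prints depend on the interpreter version, so only the changes between calls are meaningful.

```python
import sys

data = []                           # a fresh object with one name bound to it
before = sys.getrefcount(data)      # absolute value varies by CPython version

alias = data                        # a second reference to the same object
with_alias = sys.getrefcount(data)

del alias                           # drop the second reference again
after = sys.getrefcount(data)

print(with_alias - before)  # 1: binding `alias` incremented the count
print(after - before)       # 0: deleting `alias` decremented it back
```

Each of those increments and decrements happens without any per-object lock, which is the part the GIL makes safe.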
2:58 The way I always thought about it,
2:59 when I first heard about it, was "this is a threading thing", and technically
3:03 yes, but really the most relevant part of it is that this has to do
3:07 with reference counting. So it's a Python memory management feature,
3:10 if you will. Actually, you can read more about it over at Real Python.
3:13 Check out the article. They always have a bunch of good articles on this
3:17 type of stuff, so they've done a really good one
3:19 exploring the Python GIL. You can see the URL at the bottom.
3:22 If you want to learn even more about it,
3:23 go check it out there, but keep in mind when you hear about the GIL,
3:26 it's actually there to make reference
3:29 counting much faster and easier. Yes,
3:32 it's a tradeoff, but it's an interesting one to consider,
3:35 and a lot of times, if you're not doing parallelism,
3:37 you're better off because of it.