Python Memory Management and Tips Transcripts
Chapter: Recovering memory in Python
Lecture: Ref-counting and the GIL
0:00
If you've thought about multi-threaded programming or parallel or asynchronous programming in Python,
0:05
you've surely heard about this thing,
0:06
the "GIL", or the "Global Interpreter Lock",
0:09
and you often hear this presented as a negative.
0:12
What this thing does is it
0:13
only allows a single Python instruction to be executed at any given moment,
0:19
no matter how many threads you have or how many cores you have,
0:21
none of that matters: one Python instruction at a time.
0:25
So that could be very limiting
0:27
for how parallelism works in Python.
0:30
Now there are a lot of situations where it works
0:32
awesome; actually, I go through them in my asynchronous programming course.
0:35
But things like "I'm waiting on a network connection for a response from a web service
0:41
or some kind of API I'm calling", that doesn't count as executing a Python
0:46
statement, that's down somewhere in the OS.
0:47
So if you're waiting on other things,
0:50
actually the parallelism is great. But for computational stuff where the computational bits are in
0:55
Python, this GIL is a big problem.
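To make the "waiting on other things" case concrete, here is a minimal sketch. It assumes `time.sleep` is a fair stand-in for waiting on a network response (it releases the GIL while it waits), and the names `fake_io_call` and `timed_threads` are made up for this illustration.

```python
import threading
import time


def fake_io_call(delay):
    # Hypothetical stand-in for a web service call: the thread just waits,
    # and the GIL is released for the duration of the sleep.
    time.sleep(delay)


def timed_threads(num_threads=4, delay=0.2):
    # Start several "I/O-bound" threads and measure the total wall-clock time.
    threads = [
        threading.Thread(target=fake_io_call, args=(delay,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = timed_threads()
# Run one after another, four 0.2-second waits would take about 0.8 seconds;
# because the waits overlap, the total should be much closer to 0.2 seconds.
print(f"4 overlapping waits took {elapsed:.2f}s")
```

If `fake_io_call` did pure-Python computation instead of sleeping, the threads would take turns holding the GIL and you would not see this kind of speedup.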
0:57
Actually, Eric Snow and some of the other core developers are working on PEP
1:01
554, which will create what are called subinterpreters, or multiple interpreters, that take sort of a
1:08
copy of everything we've talked about and replicate separate,
1:11
isolated versions and each version can run on a thread
1:14
and it'll potentially solve or alleviate some of these problems by not sharing objects. That's
1:19
in the future. I have a lot of hope for it.
1:21
It sounds really cool, but I want to just bring up this GIL because it's
1:25
actually super interesting in light of what we've just talked about: not the garbage collection,
1:30
more the reference counting side of things.
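As a quick refresher on the reference counting being referred to here, this is a small sketch using `sys.getrefcount` from the standard library. The function name `refcount_deltas` and the sample list are just for illustration. The absolute numbers are a CPython implementation detail, so the sketch only looks at how the count changes relative to a baseline.

```python
import sys


def refcount_deltas():
    data = ["some", "list"]
    # Baseline count; the absolute value depends on the interpreter version.
    base = sys.getrefcount(data)

    alias = data  # one more name pointing at the same object
    after_alias = sys.getrefcount(data) - base

    holders = [data, data]  # a container holding two more references
    after_holders = sys.getrefcount(data) - base

    del alias, holders  # the extra references go away again
    after_del = sys.getrefcount(data) - base

    return after_alias, after_holders, after_del


print(refcount_deltas())
```

Every one of those assignments and deletions changes the count stored on the object, which is the piece of data the rest of this lecture is about.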
1:32
So let's think about a world without the GIL.
1:35
We have stuff running. We've got these different references
1:39
to the PyObjects, different pointers that are all pointing back from potentially different threads,
1:45
and they're coming and going, as they do, multi-threaded, right? In parallel.
1:50
So in that case, you're gonna have to do some kind of thread lock,
1:53
some kind of critical section or mutex or something around that piece of data that
1:59
is the reference count on the PyObject,
2:01
every single one, even things like numbers and strings. You can imagine that's gonna drag
2:08
things down and be super, super slow,
2:11
right? That's a lot of overhead for every single normal operation.
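Here is a rough sketch of what that world might look like, written in Python purely for illustration. `LockedRefCount`, `churn`, and `run_demo` are hypothetical names, and this is not how CPython implements reference counts; it only shows the idea of paying for a lock on every single increment and decrement.

```python
import threading


class LockedRefCount:
    """Hypothetical stand-in for a per-object reference count with its own lock."""

    def __init__(self):
        self.count = 0
        self._lock = threading.Lock()

    def incref(self):
        # Every new reference has to acquire the lock first.
        with self._lock:
            self.count += 1

    def decref(self):
        # Every dropped reference pays the same cost.
        with self._lock:
            self.count -= 1


def churn(rc, times):
    # Simulate references coming and going from one thread.
    for _ in range(times):
        rc.incref()
        rc.decref()


def run_demo(num_threads=4, times=10_000):
    rc = LockedRefCount()
    rc.incref()  # one long-lived reference that should survive
    threads = [
        threading.Thread(target=churn, args=(rc, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rc.count


final = run_demo()
print(final)
```

The lock keeps the count correct no matter how the threads interleave, but notice that it is acquired and released for every reference change on every object, which is exactly the overhead being described.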
2:14
So the trade-off was made that said,
2:16
"Well, let's only let one bit of Python run at a time."
2:19
In that case, there's no possibility of
2:21
a race condition around the reference count,
2:24
so we don't have to have thread-locking on it and it'll be much faster.
2:28
So that's really what the GIL is all about.
2:30
People think of it as like a threading protection thing and it kind of, sort of
2:34
is. But really, what it is, is a reference counting protection mechanism that allows
2:38
Python's reference counting to be done without thought or care for parallelism or thread safety or
2:45
all those things that are somewhat hard but certainly expensive relative to not doing them,
2:50
and the interpreter can just freely work with these objects, and because of the way it runs, you're
2:54
never gonna get a race condition on that reference count. That's the Global Interpreter Lock.
2:58
The way I always thought about it,
2:59
when I first heard about it, was "this is a threading thing", and technically
3:03
yes, but really the most relevant part of it is this has to do
3:07
with reference counting. So it's a Python memory management feature,
3:10
if you will. Actually, you can read more about it over at Real Python.
3:13
Check out the article. They always have a bunch of good articles on this
3:17
type of stuff, so they've done a really good one
3:19
exploring the Python GIL. You can see the URL at the bottom.
3:22
If you want to learn even more about it,
3:23
go check it out there, but keep in mind when you hear about the GIL,
3:26
it's actually there to make reference
3:29
counting much faster and easier. Yes,
3:32
it's a trade-off, but it's an interesting one to consider,
3:35
and a lot of times, if you're not doing parallelism,
3:37
you're better off because of it.