Python for .NET Developers Transcripts
Chapter: Memory management in Python
Lecture: Memory management in Python
0:00
Let's compare the story we just told about .NET's memory management with Python. Again, just like .NET, there's no explicit memory management.
0:11
Now, we can do the C API extension thing for Python, and down there you can allocate stuff, and obviously that's just plain C
0:19
so, you know, there's a lot of memory management down on the C side, but most folks don't actually write C extensions for Python. So, outside of that
0:28
there's not really any memory management that you do as a Python developer. It's all down in the runtime. In .NET, we saw that there were value types
0:35
and reference types. That distinction is not a thing in Python. Everything is a reference type. Yes, maybe we have a string, which is a pointer
0:43
that points off to some string in memory. We might have a customer variable, which points off to a customer object. But also, the number 7
0:50
that is a pointer off to some object in memory. Okay, so everything is a reference type. Technically, some things, like the small integers,
0:56
are preallocated with a flyweight pattern because they're so common.
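To see that even a humble number is an object, here's a quick sketch you can run in CPython. The small-integer cache (roughly -5 through 256) is a CPython implementation detail, so the exact `is` results aren't a language guarantee:

```python
# Even the number 7 is a full object that a name points at.
x = 7
print(type(x))       # <class 'int'>
print(id(x))         # identity of the object x refers to

# CPython preallocates small integers, so every 7 points at the same object.
a = 7
b = 7
print(a is b)        # True - both names point at the cached int object

# Larger ints built at runtime are separate objects on the heap.
big1 = int("1000")
big2 = int("1000")
print(big1 is big2)  # typically False - two distinct objects
```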
1:04
But in general, the right conception is that everything is a reference type, and that reference types are always allocated on the heap. There's nothing allocated on the stack
1:11
other than potentially, like pointers to your local stuff, right? But the actual memory that you think you're working with
1:17
that is allocated somewhere else. Memory management is simpler in Python. In .NET, we saw generations, finding the live objects
1:27
and all of the roots of the live objects, and asynchronous collection when memory pressure is high, and all of that. Forget that for Python.
1:35
We have reference counting. Super, super simple. It's kind of like smart pointers in C++. I've got an object. There are three variables pointing at it.
1:44
One of the variables is gone because the stack frame is now gone. Now I've got two. One of those is assigned a new value.
1:49
So now I've got one, and then the other one, maybe it's set to None, or that function returns. When that count gets to zero
1:55
immediately that object is erased. It's cleaned up. So memory management is just as simple as: every time something adds a reference to an object,
2:03
that count goes up. If that count hits zero, boom, it's erased. That's faster and more deterministic than garbage collection.
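As a rough illustration, CPython's sys.getrefcount lets you peek at an object's reference count. It reports one extra, because passing the object as an argument creates a temporary reference of its own:

```python
import sys

class Customer:
    pass

c = Customer()                # one reference: the name c
print(sys.getrefcount(c))     # 2 - c, plus the temporary argument reference

alias = c                     # a second name points at the same object
print(sys.getrefcount(c))     # 3

alias = None                  # that reference is dropped
print(sys.getrefcount(c))     # back to 2

del c                         # the count hits zero; the object is freed immediately
```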
2:12
But there's a big problem. And what is the problem with reference counting? Cycles. If I have one object and it originally has a pointer
2:19
then it creates another, but then they refer back to each other. Well, even if I stop referring to
2:26
either of those objects from all of the variables that I have, they're still always going to have at least one reference.
2:31
Item 1 is referring to 2, and 2 is referring to 1. And even if those outside references get lost, they're never going to be deleted
2:38
by this reference counting GC. Most of the time, reference counting works like a charm and cleans up everything fast.
2:44
But there's a fallback generational garbage collector to work on these cycles and catch that extra memory that slips by. So, you can think of this
2:53
as much more like .NET's memory management for the cycles, but the first line of defense to clean up memory is reference counting.
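Here's a small sketch of that idea: two objects that point at each other, so reference counting alone can never free them, and the fallback generational collector (exposed through the gc module) cleans them up:

```python
import gc

class Node:
    def __init__(self):
        self.other = None

# Build a cycle: a points at b, and b points back at a.
a = Node()
b = Node()
a.other = b
b.other = a

# Drop our own references. Each object still holds a reference
# to the other, so their counts never reach zero.
a = None
b = None

print(gc.get_threshold())          # allocation thresholds for the three generations
print("collected:", gc.collect())  # the collector finds and frees the unreachable cycle
```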
3:01
Again, that makes this deterministic. However, memory is not compacted after reference counting happens or this generational garbage collector
3:11
that catches the cycles runs. No, memory is never compacted in Python. That's interesting. It probably makes it a little bit faster most of the time.
3:20
You also get fragmented memory, which can make caches less efficient. Though there are some techniques
3:25
in this reference counting memory management system that lessen the fragmentation, called blocks, pools, and arenas.
3:33
And the quick summary is: each object of a given size, so I've got an object and it takes up, I don't know, 24 bytes,
3:40
there's a bunch of memory segments, and, you know, there's one, let's say, that holds the things of size 24,
3:47
and it'll allocate a certain amount and it'll keep allocating the things of that size into that chunk until it's full
3:53
and then it'll create another one. Those chunks are the pools, and then pools are grouped into arenas, and so on.
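If you're curious, CPython ships an underscore-"private" helper that dumps the small-object allocator's block, pool, and arena statistics. It's an implementation detail and the output format isn't guaranteed, but it's a handy peek:

```python
import sys

# Allocate a pile of small, same-sized objects so the allocator has work to show.
data = [object() for _ in range(10_000)]

# CPython-only: prints pymalloc's arena/pool/block statistics to stderr.
sys._debugmallocstats()
```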
3:58
You probably don't care too much, but the takeaway here is that there are mechanisms in place to help cut down the fragmentation
4:04
that you might run into. If you want to dive deeper into some of these ideas, I recommend you read two articles, both by the same person:
4:12
"Memory Management in Python" by over here at this link here. And then "Garbage collection in Python: things you need to know"
4:18
They talk about a lot of this stuff with pictures and graphs, and even the C data structures that hold it,
4:24
and so on. We're going to play around with this in some demos but if you really want to see, like what is the structure that defines these things?
4:30
Or what are the exact rules by which one gets allocated or cleaned up? Check out those articles. It's too low-level a detail for us
4:36
to really dive into here.