Python for the .NET Developer Transcripts
Chapter: Memory management in Python
Lecture: Memory management in Python

0:00 Let's compare this story we just told about .NET's memory management with Python. Again, just like .NET there's no explicit memory management.
0:11 Now, we can do the C API extension thing for Python and down there you can allocate stuff and obviously that's just plain C
0:19 so, you know, there's a lot of memory management down on the C side but most folks don't actually write C extensions of Python. So, outside of that
0:28 there's not really any memory management that you do as a Python developer. It's all down in the runtime. In .NET, we saw that there were value types
0:35 and reference types. That's not a thing in Python. Everything is a reference type. Yes, we have maybe a string which is a pointer
0:43 that points off to some string in memory. We might have a customer variable which points off to a customer object. But also, the number 7
0:50 that is a pointer off to some object in memory. Okay, so everything is a reference type. Technically, some things, like the numbers,
0:56 the first few numbers, use something like the flyweight pattern and are preallocated because they're so common. But in general, the right conception is
1:04 that everything is a reference type and that reference types are always allocated on the heap. There's nothing allocated on the stack
1:11 other than potentially, like pointers to your local stuff, right? But the actual memory that you think you're working with
1:17 that is allocated somewhere else. Memory management is simpler in Python. In .NET, we have generations and we have finding the live objects
1:27 and all of the roots of all the live objects and asynchronous collection of that when memory pressure is high, and all of that. Forget that for Python.
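To make the earlier point about everything being a reference type concrete, here is a minimal sketch. It assumes CPython; the variable names are just for illustration, and the small-integer sharing shown at the end is an implementation detail rather than a language guarantee.

```python
import sys

n = 7

# Even a plain integer is a full object with a type, not a raw 4-byte value.
type_name = type(n).__name__
is_object = isinstance(n, object)

# On CPython an int object carries a header, so it is bigger than a raw machine int.
int_size = sys.getsizeof(n)

# Assignment copies the reference, never the object itself.
a = [1, 2, 3]
b = a
b.append(4)            # the change is visible through both names
same_object = a is b

# CPython preallocates small integers, so a freshly parsed 7 is the same object.
# This is an implementation detail, not something to rely on.
small_int_shared = int("7") is n
```

The thing to notice is that `a` and `b` are two names for one list on the heap, which is the same mental model you would use for a class instance in .NET.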
1:35 We have reference counting. Super, super simple. It's kind of like smart pointers in C++. I've got an object. There's three variables pointing at it.
1:44 One of the variables is gone because its stack frame is now gone. Now I've got two. One of those is assigned a new value.
1:49 So now I've got one, and then the other one, maybe it's set to None, or that function returns. When that count gets to zero
1:55 immediately that object is erased. It's cleaned up. So memory management is just as simple as: every time a reference to a thing is added,
2:03 that count goes up, and every time a reference goes away, it goes down. If that count hits zero, boom. It's erased. That's faster and more deterministic than garbage collection.
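The counting described here can be watched directly. This is a small sketch assuming CPython, where `sys.getrefcount` reports the count (including one temporary reference held by the call itself, which is why the example only compares relative values). The `Customer` class and its `deleted` list are hypothetical, made up for this illustration.

```python
import sys


class Customer:
    deleted = []  # records the names of instances that have been cleaned up

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Runs when the reference count reaches zero.
        Customer.deleted.append(self.name)


c1 = Customer("Ada")
base = sys.getrefcount(c1)         # count with one variable pointing at the object

c2 = c1
c3 = c1
after_three = sys.getrefcount(c1)  # two more variables, so two more references

del c3                             # one variable goes away
c2 = None                          # another is assigned a new value
after_one = sys.getrefcount(c1)    # back to where we started

del c1                             # last reference is gone, so cleanup happens right here
cleaned_up = Customer.deleted == ["Ada"]
```

On CPython, `cleaned_up` is true immediately after the last `del`, with no collector pass in between. Other Python implementations may clean up later.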
2:12 But there's a big problem. And what is the problem with reference counting? Cycles. If I have one object that I originally have a pointer to,
2:19 and then it creates another, but then they refer back to each other, well, even if I stop referring to
2:26 either of those objects from all of the variables that I have, they're still always going to have at least one reference.
2:31 Item 1 referring to 2, and 2 referring to 1. And even if those get lost, in terms of references, they're never going to be deleted
2:38 by this reference counting mechanism. Most of the time reference counting works like a charm and cleans up everything fast.
2:44 But there's a fallback generational garbage collector to work on these cycles and catch the memory that reference counting misses. So, you can think of this
2:53 as much more like .NET's memory management for the cycles but the first line of defense to clean up memory is reference counting.
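The cycle problem and the fallback collector can be seen in a few lines. This sketch assumes CPython and uses a hypothetical `Node` class. It turns the automatic collector off for the duration of the demo so the timing is predictable, and uses a weak reference to check whether the object still exists without keeping it alive.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


gc.disable()              # keep the cycle collector from running on its own mid-demo

a = Node("one")
b = Node("two")
a.other = b               # 1 refers to 2
b.other = a               # 2 refers back to 1

probe = weakref.ref(a)    # a weak reference does not add to the count

del a, b                  # no variables point at either object any more
alive_after_del = probe() is not None   # the cycle keeps both counts above zero

found = gc.collect()      # run the cycle collector by hand
alive_after_gc = probe() is not None    # now the pair has been reclaimed

gc.enable()               # restore normal behavior
```

Here `alive_after_del` is true and `alive_after_gc` is false: reference counting alone never frees the pair, and the garbage collector does.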
3:01 Again, that makes this deterministic. However, memory is not compacted after reference counting happens or this generational garbage collector
3:11 that catches the cycles runs. No, memory is never compacted in Python. That's interesting. It probably makes it a little bit faster most of the time.
3:20 But you also get fragmented memory, which can make caches less efficient. Though there are some techniques
3:25 around this reference counting memory management system that lessen the fragmentation: blocks, pools and arenas.
3:33 And the quick summary is, for each object of a given size, so say I've got an object and it takes up, I don't know, 24 bytes,
3:40 there's a bunch of memory segments, and, you know, let's say this one holds the things of size 24,
3:47 and it'll allocate a certain amount and it'll keep allocating the things of that size into that chunk until it's full
3:53 and then it'll create another one. Those go into these pools, and then pools are grouped into arenas and so on.
3:58 You probably don't care too much, but the takeaway here is that there are mechanisms in place to help reduce the fragmentation
4:04 that you might run into. If you want to dive deeper into some of these ideas, I recommend you read two articles, both by the same person:
4:12 "Memory Management in Python," over here at this link. And then "Garbage collection in Python: things you need to know."
4:18 So they talk about a lot of this stuff with pictures and graphs and actually the C data structures that hold this stuff
4:24 and so on. We're going to play around with this in some demos but if you really want to see, like what is the structure that defines these things?
4:30 Or what are the exact rules by which one gets allocated or cleaned up? Check out those articles. It's too low level of a detail for us
4:36 to really dive into here.

