Python for the .NET developer Transcripts
Chapter: Memory management in Python
Lecture: Memory management in Python
0:00 Let's compare this story we just told
0:02 about .NET's memory management with Python.
0:06 Again, just like .NET
0:08 there's no explicit memory management.
0:10 Now, we can do the C API extension thing
0:14 for Python and down there you can allocate stuff
0:16 and obviously that's just plain C
0:18 so, you know, there's a lot of memory management
0:20 down on the C side
0:21 but most folks don't actually write C extensions for Python.
0:25 So, outside of that
0:27 there's not really any memory management
0:29 that you do as a Python developer.
0:30 It's all down in the runtime.
0:32 In .NET, we saw that there were value types
0:34 and reference types.
0:35 That's not a thing in Python.
0:37 Everything is a reference type.
0:39 Yes, we have maybe a string which is a pointer
0:42 that points off to some string in memory.
0:44 We might have a customer variable which points off to a customer object.
0:47 But also, the number 7
0:49 that is a pointer off to some object in memory.
0:51 Okay, so everything is a reference type.
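A minimal sketch of that idea, assuming nothing beyond the standard library. The variable names are made up for illustration. It shows that even the number 7 is a full object with a type and methods, and that assigning it to a second name copies the reference, not the value.

```python
import sys

seven = 7

# Even a plain number is an object with a type and methods.
print(type(seven))                  # the int class
print(isinstance(seven, object))    # True
print(seven.bit_length())           # 3, because 7 is 0b111

# Assignment copies the reference, so both names point at one object.
alias = seven
print(alias is seven)               # True

# The object carries per-object overhead; the exact size depends on the build.
print(sys.getsizeof(seven))
```

The size printed on the last line varies by platform and Python version, so no value is shown for it here.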
0:53 Technically, some things, like the numbers
0:55 the first few integers, use something like the
0:57 flyweight pattern and are preallocated
0:59 because they're so common.
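The preallocation of small numbers can be observed with the `is` operator. This is a sketch that assumes CPython, where the behavior is an implementation detail and not a language guarantee. The values are built with `int(...)` at runtime so the compiler cannot merge equal constants.

```python
# Assumes CPython: small integers are preallocated and shared.
small_a = int("7")
small_b = int("7")

# A larger number is created fresh each time it is computed.
big_a = int("1000000")
big_b = int("1000000")

same_small = small_a is small_b   # same preallocated object
same_big = big_a is big_b         # two separate objects
equal_big = big_a == big_b        # but still equal in value

print(same_small, same_big, equal_big)
```

This is why code should compare numbers with `==` and never with `is`.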
1:00 But in general, the right conception is
1:03 that everything is a reference type
1:05 and that reference types are always allocated on the heap.
1:08 There's nothing allocated on the stack
1:10 other than potentially, like
1:11 pointers to your local stuff, right?
1:13 But the actual memory that you think you're working with
1:16 that is allocated somewhere else.
1:19 Memory management is simpler in Python.
1:22 In .NET, we have generations
1:24 and we have finding the live objects
1:26 and all of the roots of all the live objects
1:29 and asynchronous collection of that
1:31 when memory pressure is high, and all of that.
1:33 Forget that for Python.
1:34 We have reference counting.
1:36 Super, super simple.
1:37 It's kind of like smart pointers in C++.
1:40 I've got an object.
1:41 There's three variables pointing at it.
1:43 One of the variables is gone
1:44 because its stack frame is now gone.
1:46 Now I've got two. One of those is assigned a new value.
1:48 So now I've got one, and then the other one
1:50 maybe it's set to None, or that function returns.
1:52 When that count gets to zero
1:54 immediately that object is erased.
1:56 It's cleaned up.
1:57 So memory management is just as simple as
2:00 every time a reference to a thing is added
2:02 that count goes up, and every time one goes away, it goes down.
2:03 If that count hits zero, boom. It's erased.
2:06 That's faster and more deterministic
2:08 than garbage collection.
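The three-variables story can be sketched like this, assuming CPython. `Customer` and the `events` list are hypothetical names used only for the demo. `sys.getrefcount` reports a slightly inflated number because the call itself holds a temporary reference, so the sketch compares counts and does not rely on absolute values.

```python
import sys

events = []  # records when the object is actually erased


class Customer:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Runs at the moment the reference count reaches zero (CPython).
        events.append(f"{self.name} erased")


a = Customer("Sarah")
b = a
c = a
count_with_three = sys.getrefcount(a)

del c       # one variable goes away
b = None    # one variable is assigned a new value
count_with_one = sys.getrefcount(a)

erased_before = list(events)   # nothing erased yet
del a                          # last reference gone
erased_after = list(events)    # erased immediately, no collector run needed

print(count_with_three - count_with_one)  # 2
print(erased_before, erased_after)
```

The object is cleaned up on the `del a` line itself, which is the deterministic behavior described above.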
2:11 But there's a big problem.
2:12 And what is the problem with reference counting?
2:15 If I have one object that a variable originally points to
2:18 and then it creates another
2:19 and then they refer back to each other
2:22 well, even if I stop referring to
2:25 either of those objects from all of the variables
2:26 that I have
2:27 they're still always going to have at least one reference.
2:30 Item 1, referring to 2, and 2 referring to 1.
2:32 And even if those get lost, in terms of references
2:35 they're never going to be deleted
2:37 by this reference counting GC.
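Here is a small sketch of that cycle problem, assuming CPython. `Item` is a made-up class, and the cycle collector is switched off for the duration so that only reference counting is at work. A weak reference is used as a probe because it can observe the object without keeping it alive.

```python
import gc
import weakref


class Item:
    def __init__(self, name):
        self.name = name
        self.other = None


gc.disable()  # only reference counting from here on

item1 = Item("item 1")
item2 = Item("item 2")
item1.other = item2   # 1 refers to 2
item2.other = item1   # 2 refers to 1

probe = weakref.ref(item1)  # does not add to the reference count
del item1, item2            # no variables refer to either object now

survived_cycle = probe() is not None   # still alive: each holds the other

# Breaking the cycle by hand lets reference counting finish the job.
leftover = probe()
leftover.other = None   # item 2 loses its last reference and is erased
del leftover            # now item 1 loses its last reference too
freed_after_break = probe() is None

gc.enable()
print(survived_cycle, freed_after_break)
```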
2:39 Most of the time
2:40 reference counting works like a charm
2:42 and cleans up everything fast.
2:43 But there's a fallback generational garbage collector
2:47 to work on these cycles and catch that extra memory
2:49 that slips by. So, you can think of this
2:52 as much more like .NET's memory management for the cycles
2:56 but the first line of defense
2:57 to clean up memory is reference counting.
3:00 Again, that makes this deterministic.
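The fallback collector is exposed through the standard `gc` module. The sketch below assumes CPython and uses a hypothetical `Item` class. Automatic collection is paused only so the timing of the demo is repeatable, then `gc.collect()` is called by hand to stand in for a normal collector run.

```python
import gc
import weakref


class Item:
    def __init__(self):
        self.other = None


gc.disable()   # pause automatic runs so the demo is repeatable
gc.collect()   # start from a clean slate

item1, item2 = Item(), Item()
item1.other = item2
item2.other = item1          # a reference cycle

probe = weakref.ref(item1)
del item1, item2             # reference counting alone cannot free these

alive_before = probe() is not None
unreachable_found = gc.collect()    # the cycle collector finds and frees them
alive_after = probe() is not None

thresholds = gc.get_threshold()     # one allocation threshold per generation
gc.enable()

print(alive_before, alive_after, unreachable_found, thresholds)
```

The exact number returned by `gc.collect()` and the threshold values depend on the Python version, so they are printed without an expected value.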
3:03 However, memory is not compacted
3:06 after reference counting happens
3:08 or this generational garbage collector
3:10 that catches the cycles runs.
3:12 No, memory is never compacted in Python.
3:15 That's interesting.
3:16 It probably makes it a little bit faster
3:18 most of the time.
3:19 You also get fragmented memory
3:20 which can make caches less efficient.
3:22 Though there are some techniques
3:24 around this reference counting memory management system
3:27 that lessen the fragmentation. They're called
3:29 blocks, pools and arenas.
3:32 And the quick summary is, for each object of a given size
3:35 so I've got an object and it takes up
3:37 I don't know, 24 bytes
3:39 there's a bunch of memory segments meant for that size.
3:42 You know, there's one, let's say
3:43 this one holds the things of size 24
3:46 and it'll allocate a certain amount
3:48 and it'll keep allocating the things of that size
3:50 into that chunk until it's full
3:52 and then it'll create another one. Those chunks are the pools, each slot in one is a block
3:54 and then pools are grouped into arenas
3:55 and so on.
3:57 You probably don't care too much, but the takeaway here is that there are mechanisms
4:00 in place to help break down the fragmentation
4:03 that you might run into.
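To make the size-based grouping concrete, here is a toy model in Python. It is not CPython's allocator: `ToyPool`, `ToyAllocator`, `ALIGNMENT` and `BLOCKS_PER_POOL` are all invented for illustration, and the numbers are deliberately tiny. It only shows the idea that requests are rounded up to a size class and that each pool holds blocks of a single size.

```python
from collections import defaultdict

ALIGNMENT = 8         # toy value: sizes round up to a multiple of this
BLOCKS_PER_POOL = 4   # toy value: real pools hold far more blocks


def size_class(nbytes):
    """Round a request up to the nearest size class."""
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


class ToyPool:
    """A chunk that only ever holds blocks of one size."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.used = 0

    def has_room(self):
        return self.used < BLOCKS_PER_POOL


class ToyAllocator:
    def __init__(self):
        self.pools = defaultdict(list)   # size class -> list of pools

    def allocate(self, nbytes):
        cls = size_class(nbytes)
        pools = self.pools[cls]
        # Keep filling the current pool; start a new one when it is full.
        if not pools or not pools[-1].has_room():
            pools.append(ToyPool(cls))
        pools[-1].used += 1
        return cls


allocator = ToyAllocator()
for _ in range(5):
    allocator.allocate(24)   # five objects of size 24
allocator.allocate(20)       # rounds up, lands with the 24s
allocator.allocate(50)       # a different size class, a different pool

for cls, pools in sorted(allocator.pools.items()):
    print(cls, [pool.used for pool in pools])
```

Because same-sized objects share a pool, a freed block can be reused by the next object of that size, which is how this layout limits fragmentation. On CPython, `sys._debugmallocstats()` prints the real allocator's statistics to stderr if you want to compare.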
4:05 If you want to dive deeper into some of these ideas
4:07 I recommend you read two articles
4:09 both by the same person
4:11 "Memory Management in Python" by
4:13 over here at this link here. And then
4:15 "Garbage collection in Python: things you need to know"
4:17 So they talk about a lot of this stuff
4:18 with pictures and graphs and actually
4:20 the C data structures that hold this stuff
4:23 and so on. We're going to play around with this in some demos
4:26 but if you really want to see, like
4:27 what is the structure that defines these things?
4:29 Or what are the exact rules
4:30 in which one gets allocated or cleaned up?
4:33 Check out those articles.
4:34 It's too low-level a detail for us
4:35 to really dive into here.