Python Memory Management and Tips Transcripts
Chapter: Recovering memory in Python
Lecture: When reference counting breaks
0:00 Well, you heard me go on about the benefits of reference counting.
0:04 It's fast, it's lightweight, it's deterministic,
0:07 and yet we saw there's this other thing called a GC that must be doing something,
0:10 a garbage collector, and it wasn't involved in reference counting.
0:14 So what's the deal? Where does reference counting fall down?
0:18 Where does it break? Reference counting is excellent,
0:22 but one thing it cannot deal with
0:24 is cycles. You create one object,
0:27 and then that object refers to another, and that one refers to another,
0:31 which links back to the first, and that creates a cycle.
0:35 You're gonna end up in a situation where the reference count can never drop below
0:38 1, so the object will never be freed, and it'll be
0:42 leaked. Memory will be leaked and it won't be great.
0:46 There's some interesting things you can do around that for performance,
0:49 but first, let's just look at the problem.
0:52 So let's start with some simple code.
0:53 We have our person that we created,
0:56 and we can add friends to them.
0:58 So we're gonna create one person whose name is Michael.
1:00 There they are. They're out here in memory.
1:02 So the person's name is Michael. They have a friends list, but there are
1:05 no friends in there yet. Create another person.
1:08 Her name is Sarah. She's out here in memory,
1:10 and she has no friends either.
1:12 But notice each one has a reference count of 1. It's 1 because p1 points
1:16 at Michael, so that's 1, p2 points at Sarah,
1:20 so she gets 1. However,
1:22 Michael and Sarah are friends, so we're gonna go over to Michael,
1:26 p1, and say "add to the friends", or append to the friends list, p2, that's Sarah.
1:30 So that means that Sarah is one of Michael's friends.
1:35 Sarah, being a lovely person,
1:37 wants to reciprocate that and says,
1:40 "hey, I'm also a friend. Michael is my friend because I'm his friend",
1:44 right? So in the same way,
1:46 we're gonna put Michael into
1:47 Sarah's list of friends. And now look,
1:50 each of them has a reference count of 2.
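The steps so far can be sketched in a few lines. This is a minimal sketch, not the course's actual code: the Person class here is a hypothetical stand-in with just a name and a friends list. It measures how much each count grows rather than printing absolute numbers, because sys.getrefcount() includes temporary references that vary between CPython versions.

```python
import sys


class Person:
    """Hypothetical stand-in for the Person class on the slide."""

    def __init__(self, name):
        self.name = name
        self.friends = []


p1 = Person("Michael")
p2 = Person("Sarah")

# Baseline counts while only the variables p1 and p2 point at the objects.
before_p1 = sys.getrefcount(p1)
before_p2 = sys.getrefcount(p2)

# Make them friends in both directions, which creates the cycle.
p1.friends.append(p2)
p2.friends.append(p1)

# Each object gained exactly one reference: the other person's friends list.
gained_p1 = sys.getrefcount(p1) - before_p1
gained_p2 = sys.getrefcount(p2) - before_p2
print(gained_p1, gained_p2)
```

Each person picks up one extra reference from the other's friends list, which is the move from 1 to 2 described above.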
1:53 And now we decide "hey,
1:54 we're done with p1", and that's going to take away this link,
1:58 and the reference count for Michael is 1.
2:00 The 1 comes from Sarah and her friends list,
2:03 pointing back, and guess what, we're done with Sarah as
2:05 well, so we're gonna take away that link,
2:07 and her reference count goes down to 1 as well because Michael is one of her
2:11 friends, and she's in his friend list.
2:15 So look at this. We're in this situation where there's no more variables pointing at
2:18 either Michael or Sarah, and yet their reference count is 1.
2:23 What action could possibly happen in this program that will make that go to zero,
2:28 either for Michael, so that he'll get cleaned up,
2:31 which will take away Sarah's last reference and get her cleaned up, or
2:34 in reverse? Well, there's nothing left pointing at them.
2:36 No one can manipulate the friends lists because no one even holds a reference to these objects anymore.
2:41 This is just a fundamental flaw in reference counting garbage collection.
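One way to watch this happen is to switch off the cycle-detecting garbage collector so that only reference counting is at work, and observe the objects through weak references. This is a sketch under those assumptions; the Person class is again a hypothetical stand-in, and the final gc.collect() is only there so the example cleans up after itself.

```python
import gc
import weakref


class Person:
    """Hypothetical stand-in for the Person class on the slide."""

    def __init__(self, name):
        self.name = name
        self.friends = []


gc.disable()  # leave only reference counting in play
try:
    p1 = Person("Michael")
    p2 = Person("Sarah")
    p1.friends.append(p2)
    p2.friends.append(p1)

    # Weak references let us observe the objects without keeping them alive.
    michael_ref = weakref.ref(p1)
    sarah_ref = weakref.ref(p2)

    # "We're done with p1" and "we're done with Sarah as well".
    p1 = None
    p2 = None

    # No variable points at either object, yet both are still in memory.
    still_alive = michael_ref() is not None and sarah_ref() is not None
finally:
    gc.enable()

# Cleanup for this demo: run the cycle collector explicitly.
gc.collect()
collected = michael_ref() is None and sarah_ref() is None
print(still_alive, collected)
```

With the collector off, both objects survive after the variables are set to None, because each one is still held by the other's friends list.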
2:45 If you end up in a situation like this,
2:48 you're done. Those things will never,
2:50 ever be cleaned up by reference counting alone. I guess you could, like in C++,
2:53 remember to always break the cycles.
2:55 But that's not gonna work, right?
2:57 It might not be this simple.
2:59 It could be that Michael has some other friend who has some other object which holds on
3:04 to other people who then hold on to Sarah,
3:07 who happens to be one of Michael's friends or,
3:08 you know, something like some big,
3:10 long, complicated chain. It's not necessarily a two-person cycle.
3:14 It could have many, many links in that cycle,
3:16 and it could be really hard to understand. So these cycles,
3:19 that's what fundamentally breaks reference counting.
3:22 You can see, we've even set both the variables to None,
3:25 and there's no mechanism for cleaning up Michael or Sarah because they're in this,
3:29 like, locked state where you've gotta wait for one to go away to get
3:33 to the other, but that's never gonna happen.