Python Memory Management and Tips Transcripts
Chapter: Allocating memory in Python
Lecture: Allocation in action
0:00 Let's do a little thought experiment. Imagine we have this one line of Python code,
0:03 where we know a whole ton of stuff is happening down in the runtime.
0:07 But on the Python side, it's simple.
0:08 We have a person object. We want to create them, pass some initial data over
0:12 to them, their name is Michael, and so on.
0:15 Now let's imagine that to accomplish this operation we need 78 bytes of RAM from the
0:23 operating system. What happens? How does that get into Python? Like,
0:28 what part of memory do we get?
0:30 How is it organized and so on?
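For reference, the one line in question might look something like the sketch below. The slide's code isn't part of this transcript, so the Person class, its fields, and the way the id gets generated are all assumptions for illustration. The 78 bytes is just the number used for the thought experiment, not what sys.getsizeof will report on your machine.

```python
import itertools
import sys

# Hypothetical class: the transcript only says a person named Michael is
# created and that the id is generated rather than passed in.
_next_id = itertools.count(1)


class Person:
    def __init__(self, name):
        self.id = next(_next_id)  # generated, not explicitly set by the caller
        self.name = name


p = Person("Michael")  # the "one line" from the thought experiment

# Shallow size of the instance in bytes; varies by Python version and platform.
person_size = sys.getsizeof(p)
print(p.id, p.name, person_size)
```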
0:32 So a very simplistic and naive way to think about this would be: all right,
0:37 what we're gonna do is go down to the C layer, and the C layer
0:40 is going to use the underlying operating system mechanism for getting itself some memory.
0:46 It'll malloc it, right? So malloc is the C allocation side, and then we would call
0:52 free on the pointer to make that memory go away.
0:54 Okay? So you might just think that the C runtime just goes to the
0:58 operating system and says "give me 78 bytes", and the operating system says "super,
1:03 we're gonna provide 78 bytes of virtual memory that we've allocated to that process",
1:09 and then, boom, into that we put the thing that we need, some object
1:14 that has an id, and the id wasn't explicitly set,
1:17 but let's say it's generated.
1:18 The id is that, and the name is Michael. Well,
1:23 that seems straightforward enough. I mean,
1:24 you have these layers, right?
1:26 Python is running, and Python is actually implemented in C, and C is running on top
1:31 of the operating system and the operating system is running on real RAM on top of hardware.
1:36 So this seems like a reasonable thought process.
1:39 But no, no, no. This is not what happens.
1:43 There's a whole lot more going on.
1:44 In fact, that's what this whole chapter is about,
1:46 is talking about what happens along these steps.
1:50 And it's not what we've described here.
1:53 Let's try again. So at
1:55 the base we, of course,
1:56 still have RAM. We have hardware.
1:58 That's where memory lives. We still have an operating system.
2:01 Operating systems provide virtual memory to the various processes that are running so that one process
2:07 can't reach in and grab the memory of another.
2:09 For example, we saw that there's ways in the operating system to allocate stuff.
2:15 So in C, there's an API called malloc that's gonna talk to the underlying operating
2:20 system. This is what we had sort of envisioned the world to be before.
2:24 But there's additional layers of really advanced stuff happening on top here.
2:29 Above this, we have what's called Python's allocator, or the PyMem
2:34 API and PyMalloc.
2:36 So the CPython runtime doesn't just call malloc,
2:39 it calls PyMalloc, which runs through a whole bunch of different strategies to do
2:44 things more efficiently. We saw that in Python, and CPython in particular,
2:49 that every tiny little piece of data that you work with, everything, numbers,
2:53 characters, strings, all the way up to lists and dictionaries and whatnot,
2:58 these are all objects, and every one of them requires a special separate allocation,
3:02 often very small bits of data,
3:04 and that's why Python has this
3:06 PyMalloc. But wait, there's more.
3:10 If you're allocating something small, and by small,
3:14 I mean sys.getsizeof,
3:16 not my fancy reversal thing. So if you're allocating something that is in its own
3:21 essence small, then Python is going to use something called the "Small Object Allocator",
3:27 which goes through a whole bunch of patterns and techniques to optimize this further,
3:32 and we're going to dig into that a bunch.
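To make "small" a bit more concrete: in the default CPython builds of recent versions, obmalloc.c sets the small object allocator's cutoff (SMALL_REQUEST_THRESHOLD) at 512 bytes, and larger requests fall through to the general-purpose allocator. The sketch below just compares sys.getsizeof against that number. The threshold is hard-coded here as an assumption about your build, since Python doesn't expose it at runtime, and the looks_small helper is a made-up name for illustration. Also, sys.getsizeof is only a rough proxy for the size of the underlying allocation request.

```python
import sys

# Assumption: cutoff taken from CPython's obmalloc.c; not queryable at runtime.
SMALL_REQUEST_THRESHOLD = 512


def looks_small(obj):
    """Rough check: is the object's own (shallow) size within the cutoff?"""
    return sys.getsizeof(obj) <= SMALL_REQUEST_THRESHOLD


samples = {
    "int": 42,
    "short str": "Michael",
    "empty list": [],
    "big bytes": bytes(10_000),
}

# Map each label to (shallow size in bytes, whether it counts as "small").
report = {
    label: (sys.getsizeof(obj), looks_small(obj))
    for label, obj in samples.items()
}

for label, (size, small) in report.items():
    print(f"{label:>10}: {size:>6} bytes  small={small}")
```

On CPython you can also call sys._debugmallocstats() to dump the allocator's own statistics to stderr, if you want to see what it's actually doing.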
3:34 So if you want to see all this happening,
3:35 you can go to "bit.ly/pyobjectallocators", the link at
3:40 the bottom, and actually the source code is ridiculously well documented.
3:45 There's like paragraphs of stuff talking about all these things in here,
3:49 and in there, there's actually a picture, an ASCII-art-like picture, that looks
3:53 very much like this diagram that I drew for you with some details that I left
3:57 out, but they're in the source code.