Python Memory Management and Tips Transcripts
Chapter: Python variables and memory
Lecture: The size of objects
0:00 Alright, let's answer some questions, and you can see I've already thrown in the questions
0:04 up here just so you don't have to watch me type things like print.
0:07 How big is this? How big is that?
0:09 And the thing I want to explore is how much memory is used by certain things.
0:14 So we get a sense for a
0:16 number versus a string that has one character versus a string with 20 characters or a
0:21 string with one character versus 20.
0:23 Like, how much of it is the character space versus all the underlying runtime infrastructure
0:28 that's gonna contribute to the memory use?
0:30 So we're in luck. Python has a good way to answer this.
0:33 We're gonna import sys. And over here we can print out things like
0:39 sys.getsizeof a thing like the number four.
0:43 So let's do that real quick and then run.
0:47 How big is the number four?
0:49 Well, in a language like C++ or C# or something like that
0:53 where these are just allocated locally,
0:56 you always have to talk about the size and the number is like a short
0:59 or a long or something like that.
1:02 But typically this would be 2, 4, or 8 bytes
1:06 long. In Python, a number,
1:08 a small number, is 28. If we had a little bit bigger number,
1:13 it's still 28. Let me make it a little bigger so it stays on the screen.
1:16 But if it's a lot bigger,
1:22 we use a tiny bit more memory.
1:24 So the size matters, but not so much.
1:28 But there is some overhead. Remember,
1:29 this is the PyObject pointer and all the things to know how many people are
1:34 keeping track of it, where it was allocated,
1:37 what type it is. All of these things are happening behind the scenes,
1:40 and we just see the simple number 4,
1:42 but Python is doing a bunch of work through that infrastructure that we talked about.
1:46 Remember the red pill stuff? That's what's happening,
1:49 that's why this is a little bit bigger.
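The integer measurements described above can be reproduced with a short sketch. It assumes CPython; the exact byte counts depend on the interpreter version and platform, so the code prints them rather than hard-coding the 28 reported in the recording. The variable names are illustrative only.

```python
import sys

# Shallow size of a few integers. On CPython every int carries object
# overhead (reference count, type pointer, digit count) besides its value.
small = sys.getsizeof(4)
medium = sys.getsizeof(1_000_000)
huge = sys.getsizeof(10 ** 100)  # needs many more internal digits

print("4         :", small)
print("1_000_000 :", medium)
print("10 ** 100 :", huge)
```

The first two values typically match, while the last one is only modestly larger, which is the "size matters, but not so much" point.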
1:51 Alright, what about this one?
1:52 Let's print sys.getsizeof the letter
1:56 "a". Well, you feel like these might be similar,
1:59 right? I mean, in most programming languages,
2:03 that's 1, 2 or 4 bytes and this is 2, 4, 8. So maybe it's even smaller.
2:07 Let's see. Nope. 50.
2:10 It's bigger. So it turns out,
2:11 strings have a lot going on,
2:13 so there's a little bit going on there.
2:15 And let's see if we have a big string like this.
2:21 How much larger is it? 25.
2:24 Well, 25 larger, right? 75 now,
2:28 and that's because we have 26 of these rather than 1.
2:32 So you just multiply that by 26.
2:36 We get our 26 with multiplied by 100.
2:39 Look, it's about 100 bigger.
2:40 Okay, that's what's going on here,
2:42 right? So basically, there's this infrastructure to keep track of the string and do
2:46 all the string things and then the extra data.
2:49 And if we had Unicode characters,
2:51 they might take up more than one byte per character.
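A minimal sketch of the string comparison, assuming CPython: it measures a one-character string, the 26-letter lowercase alphabet, and a 26-character non-ASCII string. The absolute numbers vary by Python version, so only the differences are worth reading closely. The names below are illustrative.

```python
import string
import sys

one_char = sys.getsizeof("a")
alphabet = sys.getsizeof(string.ascii_lowercase)  # 26 ASCII characters

# Extra bytes paid for the 25 additional ASCII characters.
extra_bytes = alphabet - one_char

# 26 euro signs: not ASCII, so CPython stores more than one byte per character.
euros = sys.getsizeof("\u20ac" * 26)

print("'a'                :", one_char)
print("26 ASCII letters   :", alphabet, f"(+{extra_bytes})")
print("26 non-ASCII chars :", euros)
```

Most of the cost of a short string is the fixed overhead; each extra ASCII character adds one byte on top of it.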
2:55 How about a list? Simple little empty list.
3:01 How big is that? 56.
3:04 Okay, that's not too super large.
3:08 Now let's do something with like,
3:10 10 items in it. There we go.
3:14 I put 10 in there. Let's see how much bigger
3:16 it is. 136. Well, that's quite a bit bigger.
3:20 Let's think about that for a second.
3:22 Well, what does the list actually contain?
3:24 The list doesn't contain the values; the list contains basically what every variable in Python is.
3:31 It contains a pointer to a number out in memory.
3:34 So there's somewhere out in memory a 1.
3:36 And in here in the list,
3:37 there's the "go find the 1", the number over there, and then
3:40 here's another pointer out to the 2, wherever it is in memory.
3:44 For these smaller numbers, where that actually is turns out to be interesting.
3:47 But we've got a list, and it's got 10 of those in there.
3:50 The lists generally don't allocate one slot at a time.
3:53 They kind of grow in a doubling type of way,
3:55 like, you've got 10, and then you add a few more so we're gonna allocate
3:59 20 and then 40 and so on, that kind of pattern, so that you're not
4:03 constantly allocating and copying every time you're adding something, 'cause that's super slow.
4:08 All right, well, that's how big that is.
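To see the pointer slots and the over-allocation, here is a rough sketch assuming CPython. The exact growth pattern is an implementation detail that differs between versions (and is not necessarily a strict doubling), so the code only records when the reported size changes while appending. All names are illustrative.

```python
import struct
import sys

pointer_size = struct.calcsize("P")  # bytes per pointer on this platform

empty = sys.getsizeof([])
ten = sys.getsizeof([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("empty list :", empty)
print("10 items   :", ten, "(at least", 10 * pointer_size, "bytes of pointers)")

# Append one item at a time and record the reported size after each append.
growing = []
sizes = []
for n in range(50):
    growing.append(n)
    sizes.append(sys.getsizeof(growing))

# The size only jumps occasionally: the list reserves spare slots in advance.
resizes = len(set(sizes))
print("appends:", len(sizes), "distinct sizes:", resizes)
```

Far fewer distinct sizes than appends means most appends reuse already-reserved slots instead of reallocating.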
4:10 Sort of. We're going to see that this is going to get interesting, and let's actually
4:14 do something else. So how much memory did this take? Like,
4:18 how much does that line contribute?
4:21 Well, it contributes, we saw that each number is 28 and then it's gonna
4:28 allocate whatever the list needs to be. The list
4:30 by itself is 56. So we would have 280 bytes,
4:37 probably, for all of those numbers,
4:39 those 10 numbers, because they're 28 each, and then we're gonna have the 56 for the
4:44 list. That's like 330 or so.
4:49 And then there's also the pointers that are gonna be in the list as well that
4:52 have to point out there, maybe the over allocation.
4:55 So something's going on; like, this is not big enough,
4:57 right? Just in the numbers alone,
4:59 they should take 280 bytes. We're going to see what's going on in a minute,
5:02 but this is how much room this thing is taking.
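The back-of-the-envelope arithmetic above can be written out as a small sketch, assuming CPython. It adds the list's own reported size to the size of each number it points to. This naive total is only an estimate: CPython keeps a shared cache of small integers, so those ten objects are not necessarily allocated just for this list.

```python
import sys

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

container = sys.getsizeof(numbers)                 # the list structure and its slots
elements = sum(sys.getsizeof(n) for n in numbers)  # the int objects it points to
naive_total = container + elements

print("list alone  :", container)
print("ints alone  :", elements)
print("naive total :", naive_total)
```

The gap between the first and last line is exactly what `sys.getsizeof` on the list leaves out.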
5:05 But let's try to force the issue by saying
5:08 "What if there's a large piece of data right there and a large piece of data
5:14 right here?" So I'm gonna do a little bit of work here.
5:17 I'm going to come up with some data that we're gonna ask about,
5:20 and I'm going to go from 1 to 11.
5:25 I'm going to add in some stuff 10 times. The number of elements in the list should
5:31 be the same. I'm going to come up with an item and the item is going
5:34 to be a list that starts out with whatever number the loop is on.
5:38 So, first time through this will be 1, second time
5:40 it'll be 2, and so on,
5:42 and Python has this funky little trick that we can do here.
5:45 So if n is 7, we can come over here,
5:50 and there's a list of, let's put,
5:51 like, 3 in there, and we times it by n, what we get is like a multiple,
5:57 a list with that copied that many times.
5:59 So here we get a single list with seven 3's instead of just one 3. We're going to
6:06 do that here: i times i, so first it will be 1, then 4,
6:12 and so on, times 10. And by the time i gets to be 10, that's
6:16 gonna be 1000. So it'll be a list with 10 in it 1000 times.
6:20 The last one that's out here is gonna be bigger than just, you know,
6:23 the number 10. Absolutely. You're gonna put that item in there and let's just
6:30 really quickly print out data just so you see what we got here.
6:32 Notice there's a whole bunch of tens, fewer nines, and so on,
6:36 right? It kind of grows quadratically. All right,
6:39 so it looks like that did what I said it would.
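The loop described above is not shown in the transcript, so the following is a reconstruction from the narration rather than the exact code on screen. The names `data`, `item`, and `sevens` are assumptions.

```python
# List repetition: multiplying a list by n repeats its contents n times.
sevens = [3] * 7
print(sevens)

# Ten inner lists; list number i holds the value i repeated i * i * 10 times,
# so the lengths run 10, 40, 90, ... up to 1000.
data = []
for i in range(1, 11):
    item = [i] * (i * i * 10)
    data.append(item)

print(len(data), [len(item) for item in data])
```

The outer list still has only ten elements, even though the inner lists together hold thousands of slots.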
6:41 And let's just print
6:43 sys.getsizeof of data. 184.
6:48 What do you think? I'm gonna think
6:50 no. No, that's not right.
6:53 But, this does give us a sense
6:55 of what the base size is.
6:57 So what are we answering or what
6:59 information are we getting when we say getsizeof? What we're getting is it
7:04 goes through and it says, "I'm gonna look at the actual data structure, the list".
7:08 So, this one right here, and let's see how much it's internally allocated,
7:13 what are its fields? And if it's got a big buffer to store the items you
7:18 put in it, how long is that buffer?
7:19 But what it doesn't do is it doesn't traverse the object graph.
7:23 It doesn't go "Okay, well,
7:25 there's 10 things in here. Let me follow the reference from each one of those
7:28 10, see how big it is,
7:30 and if it has references, follow their references."
7:32 It doesn't do this traversal,
7:34 which is actually what you need in order to know how much memory is used.
7:38 But this getsizeof, it's a start;
7:40 you'll see that there's a better way that we can get going to actually answer this
7:44 question more accurately.
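As an illustration of what traversing the object graph means, here is a minimal sketch of a recursive size function. It is not necessarily the approach the course goes on to use; `deep_getsizeof` is a hypothetical helper that only follows the built-in containers, and it assumes CPython.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Hypothetical helper: shallow size of obj plus everything it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count each object once, even if shared
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for element in obj:
            size += deep_getsizeof(element, seen)
    return size

# Same shape of data as in the lecture: ten lists of growing length.
data = [[i] * (i * i * 10) for i in range(1, 11)]

shallow = sys.getsizeof(data)
deep = deep_getsizeof(data)
print("shallow:", shallow)
print("deep   :", deep)
```

The shallow figure covers only the outer list and its ten slots; the deep figure also includes the ten inner lists and the integers they point to, each counted once.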