Python Memory Management and Tips Transcripts
Chapter: Memory and functions
Lecture: Tracking memory usage
0:00
Our code works, actually it works really well. I see no problem with it.
0:04
It's gonna go through and process this data in this pipeline that we talked about.
0:08
But let's see if we can understand its memory usage, and then we can explore to see if we can make it better.
0:13
Now I want to change one thing really quick here. Just because this is going to uncover some of the challenges that we might see. Let's say,
0:22
"head" just like this and this will be the "tail" right? Kind of beginning and the end of this data.
0:30
And let's just put five of them in there and let's go from -5 to the end. Why is this gonna be interesting?
0:38
Because some of the advancements that we'll be able to do later will have a harder
0:42
time looking at the end. So if you reasonably want to go through the data and, like, ask questions like "what
0:47
are the last five?" you'll see both the benefits and the drawbacks. So this will give us kind of a realistic constraint as we go.
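A minimal sketch of the "head" and "tail" slices described above. The list `scaled` is a hypothetical stand-in for the pipeline's final data, and the generator is only an assumption about the kind of later technique that has "a harder time looking at the end"; the lecture does not name it at this point.

```python
# Hypothetical stand-in for the pipeline's final list of values.
scaled = list(range(1, 21))

head = scaled[:5]    # the first five items
tail = scaled[-5:]   # the last five items ("from -5 to the end")

print("head:", head)  # head: [1, 2, 3, 4, 5]
print("tail:", tail)  # tail: [16, 17, 18, 19, 20]

# Assumption for illustration: a lazy generator yields one item at a time,
# has no length, and cannot be sliced from the end.
lazy = (n * 2 for n in range(1, 21))
try:
    lazy[-5:]
except TypeError as err:
    print(f"cannot slice a generator: {err}")
```

The head only needs the first few items, but the tail needs the whole sequence to be available, which is what makes it a useful constraint.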
0:54
Alright, so what do we want to do? Well, first, what we want to do is go and bring in our little size utility again.
1:02
So the size utility, this is what we've used so far, but it also has this thing called "report process memory",
1:07
which will return the number of megabytes used, as well as do a little print statement. So we're gonna use that and in order to use it without PyCharm,
1:16
right? Again, we can just "import size_util" which is fine here, but without PyCharm,
1:22
we've got to make sure that it's getting found in the path here. Sort of annoying, but so it is. So this is going to make sure you can always run it,
1:31
even if you just go and run it from the terminal or command prompt right in
1:36
that one folder. Let's go and define a little function here that's called "report", and it's going to take a "step_name",
1:43
which is a string. Alright, so it's just gonna go to the size_util and say "report process
1:50
memory" but we also want to just show a little bit of information about that so that you can see at this step it's doing such and such rather than just
1:58
having the numbers come out three times. So we'll do a little print with an f-string here. We'll say "step is this".
2:11
And then we also want to set the end to just be a little space. Actually got two spaces there. So we're gonna say step name
2:19
like "loading data" or "filtering" or whatever, and then we'll see the print statement from before, okay? So let's go ahead and use it here.
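A minimal sketch of the two pieces described so far: making the helper module findable on the import path, and the "report" function. The course's real `size_util` module is not shown here, so `report_process_memory` below is a stand-in built on the standard library's `tracemalloc` (it counts Python allocations rather than true process memory), and `ensure_on_path` and the folder it is given are hypothetical.

```python
import sys
import tracemalloc
from pathlib import Path


def ensure_on_path(folder: str) -> None:
    """Hypothetical helper: put a folder on sys.path so imports work from a terminal."""
    resolved = str(Path(folder).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)


# Assumption: the helper module lives in the current working directory.
ensure_on_path(".")

tracemalloc.start()


def report_process_memory() -> float:
    """Stand-in for size_util's function: print and return megabytes in use."""
    current_bytes, _peak = tracemalloc.get_traced_memory()
    mb = current_bytes / (1024 * 1024)
    print(f"Total memory used: {mb:,.2f} MB")
    return mb


def report(step_name: str) -> None:
    # end="  " keeps the step name and the memory figure on the same line.
    print(f"{step_name}:", end="  ")
    report_process_memory()


report("starting")
numbers = list(range(100_000))
report("loaded")
```

Each call prints the step name followed by the memory figure on one line, so the output reads as a labeled sequence rather than bare numbers.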
2:27
Now I'm gonna write some funky code. Ah, I'll write it like this. Let's say "report original loaded". I'll just say "loaded".
2:40
Over here, "report starting" right? So we have the baseline of what we got and "filtered" and "scaled". Then we do "done", I guess.
3:00
Alright, let's run it and see how it works. We started with 9 megabytes used. Then we loaded those one million numbers, and that went from 9
3:08
to 48 megabytes. And then when we filtered it, it went up some more; then we scaled it and it went up some more. So at the end,
3:16
we were at 92.48 megabytes. So how much did our little bit of code take to run? It took 83 megabytes. As we go and improve it over this
3:29
what I'm gonna call "naive mode" or just the most straightforward mode, you'll see that maybe we could do better in this.
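The 83 megabytes is just the final reading minus the starting baseline (92.48 - 9 ≈ 83). A rough sketch of taking that measurement yourself follows; the three pipeline functions are hypothetical stand-ins for the ones from the earlier lecture, and `tracemalloc` is used instead of the course's size utility, so the figures printed will not match the ones quoted above.

```python
import tracemalloc


def used_mb() -> float:
    """Megabytes currently allocated by Python, as seen by tracemalloc."""
    current_bytes, _peak = tracemalloc.get_traced_memory()
    return current_bytes / (1024 * 1024)


# Hypothetical stand-ins for the pipeline steps.
def load_data() -> list:
    return list(range(1, 1_000_001))


def filter_data(data: list) -> list:
    return [n for n in data if n % 5 != 0]


def scale_data(data: list, factor: float) -> list:
    return [n * factor for n in data]


tracemalloc.start()

readings = {"starting": used_mb()}  # baseline before any data exists
original = load_data()
readings["loaded"] = used_mb()
filtered = filter_data(original)
readings["filtered"] = used_mb()
scaled = scale_data(filtered, 2.718)
readings["scaled"] = used_mb()

for step, mb in readings.items():
    print(f"{step}: {mb:,.2f} MB")

# The cost of the pipeline is the last reading minus the baseline.
cost = readings["scaled"] - readings["starting"]
print(f"pipeline cost: {cost:,.2f} MB")
```

Because `original`, `filtered`, and `scaled` all stay alive at once, each step adds to the total instead of replacing the previous one.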
3:34
That's the number that we gotta beat if it's going to be better. Now, I did say I was gonna write a little bit of funky code. And what
3:39
I wanted to write, maybe I'll actually put it here, is this, because you don't normally see semicolons in Python,
3:46
but check this out. So I didn't want to change much; I kind of wanted to just keep the same little pattern, but make it report at the end of each step.
3:53
So I'm gonna go and leave it like this, right? The report is just here for us to kind of know what's going on.
4:01
I don't really want to change this pattern. I just want it to look really simple and clear,
4:03
and be just these three lines. These three lines are actually where the problem is and where the solution lies.
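A sketch of the semicolon pattern described above, where each pipeline step and its report share a line. The function names and the sample data are hypothetical stand-ins, and `report` here only records and prints the step name rather than measuring memory.

```python
steps = []  # records the order in which report() was called


def report(step_name: str) -> None:
    # Stand-in: the real version would also print the memory in use.
    steps.append(step_name)
    print(f"{step_name}: done")


def load_data() -> list:
    return list(range(1, 11))


def filter_data(data: list) -> list:
    return [n for n in data if n % 5 != 0]


def scale_data(data: list, factor: float) -> list:
    return [n * factor for n in data]


# A semicolon separates two statements on one line, so the three
# pipeline lines keep their shape and each one reports when it finishes.
original = load_data(); report("loaded")
filtered = filter_data(original); report("filtered")
scaled = scale_data(filtered, 2.718); report("scaled")
```

The semicolons are unusual style for Python, but they keep the three assignments visually aligned so the pattern itself stays easy to read.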