#100DaysOfCode in Python Transcripts
Chapter: Days 49-51: Measuring performance
Lecture: Demo: Even more focused collection
0:00
Now we're turning off the profiler until we get to our code. We run this little bit and then we disable it again. Right up here we have profile enable
0:08
and we have profile disable. But there's still a lot of reporting stuff and do you really care how fast the thing prints out?
0:13
Like, it's printing to the console, you really can't control that. That's not the essence. So let's just reorganize this code,
0:20
refactor it so that we can group the analytics and the data bits of it and then we'll move on. Okay, so let's come over here and we'll say
0:30
we'll get those days and get those days and those days. Now clearly, there's a problem here. We can't just keep calling the days like we were.
0:39
So, we have to call these hot days, cold days, and wet days. And we've got to replace that in here: hot days, cold days, and wet days.
0:50
Okay, so now we can take this profile bit and disable it way sooner, like this. So here we can do a little bit of work
0:59
in this block of code here and only profile that. Let's run this one more time. All right, now let's see how things are looking.
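Here is a minimal sketch of that idea, not the exact code from the demo. It assumes the profiler is a `cProfile.Profile` object; `load_and_query` and its made-up numbers are hypothetical stand-ins for the real data work. The point is just that only the data block sits between `enable()` and `disable()`, and the printing happens afterwards.

```python
import cProfile
import io
import pstats


def load_and_query():
    # Hypothetical stand-in for the data work we actually want to measure.
    data = [(n * 7) % 101 for n in range(10_000)]
    hot_days = sorted(data, reverse=True)[:5]
    cold_days = sorted(data)[:5]
    wet_days = sorted(data, key=lambda v: v % 13, reverse=True)[:5]
    return hot_days, cold_days, wet_days


profiler = cProfile.Profile()

# Only the data work sits between enable() and disable().
profiler.enable()
hot_days, cold_days, wet_days = load_and_query()
profiler.disable()

# Reporting happens after profiling is off, so these print() calls
# never show up in the stats.
print("Hottest:", hot_days)
print("Coldest:", cold_days)
print("Wettest:", wet_days)

# Write the stats into a string so they can be inspected or printed.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Because the three results get distinct names, the reporting code can run after the profiler is disabled without any of the lists overwriting each other.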
1:07
Okay, that looks even a little bit cleaner. That didn't change the numbers for this obviously because that's outside of what we were doing,
1:13
and it wouldn't change this either, right. But it does clean things up just a little bit. Let's look at what is the worst case scenario here.
1:22
Well there's init right here, this is obviously the worst function. It's at the top. But let's go look at it. It's doing this line right here.
1:34
Chances are we can't really do any better than that. It turns out that we can call it less often, that's one thing we could try to do
1:42
is check and see if it's already been initialized, and if so, don't do it. That's actually a massive, massive performance gain, but let's make
1:47
what we already have faster before we add that. Over here, we're basically parsing the row and we're appending the data.
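The "check whether it's already been initialized" idea mentioned a moment ago could be sketched roughly like this. This is an assumption about the shape of the code, not the demo's actual `init`: the in-memory CSV text, the module-level `data` list, and the `load_count` counter are all hypothetical and only there to make the sketch self-contained.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for the real weather file.
CSV_TEXT = "date,actual_max_temp\n2014-7-1,89\n2014-7-2,91\n"

data = []
load_count = 0  # Illustration only: counts how often we really parse.


def init():
    global load_count
    # Guard: if the records are already loaded, skip the expensive work.
    if data:
        return
    load_count += 1
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    for row in reader:
        data.append(row)


init()
init()  # The second call returns immediately without re-reading anything.
print(load_count, len(data))
```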
1:55
Remember parsing row down here actually does all sorts of conversions and then assigns it to this record and so on.
2:03
So, there's this init, but really the thing that is the problem here is this parse row that we're calling 36,135 times.
2:11
That is a ton of times that we're calling this. Can we make it faster? Answer is, probably, yes, yes we can. How can we do that?
2:22
One thing we could realize is, it's really all this conversion. These are taking strings and converting them to integers,
2:31
the dictionary read and write is like crazy fast. So, you could look at that and figure this out,
2:37
but that's not really the problem. The problem is the string conversion to numbers, ints and floats.
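If you want to check that claim on your own machine, a rough micro-benchmark with `timeit` is one way to do it. This is only a sketch: the `row` dictionary and its column names are assumptions for illustration, and the actual timings will vary from run to run.

```python
import timeit

# Hypothetical row, shaped like what a csv.DictReader would hand back.
row = {"actual_max_temp": "89", "actual_precipitation": "0.25"}

# Time a plain dictionary read.
lookup_time = timeit.timeit(lambda: row["actual_max_temp"], number=200_000)

# Time the same reads plus the string-to-number conversions.
convert_time = timeit.timeit(
    lambda: (int(row["actual_max_temp"]), float(row["actual_precipitation"])),
    number=200_000,
)

print(f"dictionary read only:  {lookup_time:.4f} sec")
print(f"read plus conversions: {convert_time:.4f} sec")
```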
2:44
And then also this, we're allocating this record and we're assigning, well, however many elements there are in this named tuple, and we're giving it back.
2:53
What can we do here to make this faster? Well, it turns out, if you look at the way our program works, first, up here, we're working
3:03
with actual max temperature, actual precipitation, and over in the program we're working with actual min temp, and we should have been sorting by min
3:14
temp here as well. Okay, so minor little bug, but really highs on a cold day are pretty close to the lows as well.
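To make that minor bug concrete, here is a small sketch with made-up records. The `Record` fields and the sample values are hypothetical; the only point is that sorting the cold days by max temp and by min temp can pick different days.

```python
import collections

# Hypothetical, trimmed-down record with invented sample values.
Record = collections.namedtuple("Record", "date, actual_min_temp, actual_max_temp")

data = [
    Record("2014-1-1", 10, 30),
    Record("2014-1-2", 5, 33),
    Record("2014-1-3", 12, 28),
]

# The slip described above: "coldest" ranked by the daily high.
coldest_by_max = sorted(data, key=lambda r: r.actual_max_temp)[0]

# What was intended: rank by the daily low.
coldest_by_min = sorted(data, key=lambda r: r.actual_min_temp)[0]

print(coldest_by_max.date, coldest_by_min.date)
```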
3:22
Alright so we're working with these three values, max temp, min temp, and precipitation. If you look at the little reports we're running
3:29
we're also working with date, nothing else. However, just for completeness' sake, we said we're going to convert everything,
3:37
we're going to convert the mean temperature, the record temperature, the average temperature, you name it we're converting that.
3:43
Well if we know our program isn't actually going to touch those pieces of data let's not do that.
3:49
So let's see what we can do about improving performance by reducing some of the data we're working with here.
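As a rough sketch of where this is heading, a slimmed-down parse function might convert only the four values the reports touch. The field names, the `int` and `float` choices, and the sample row here are assumptions for illustration, not the demo's final code.

```python
import collections

# Hypothetical record holding only the fields the reports actually use.
Record = collections.namedtuple(
    "Record", "date, actual_min_temp, actual_max_temp, actual_precipitation"
)


def parse_row(row):
    # Convert only the columns we need; every other column in the row
    # is left as an untouched string and never copied into the record.
    return Record(
        date=row["date"],
        actual_min_temp=int(row["actual_min_temp"]),
        actual_max_temp=int(row["actual_max_temp"]),
        actual_precipitation=float(row["actual_precipitation"]),
    )


# Hypothetical row; the extra column is simply ignored.
row = {
    "date": "2014-7-1",
    "actual_mean_temp": "81",
    "actual_min_temp": "72",
    "actual_max_temp": "89",
    "actual_precipitation": "0.25",
}

record = parse_row(row)
print(record)
```

With fewer conversions and fewer fields to assign on each of the many parse row calls, there is simply less work per row, which is what the next round of profiling would need to confirm.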