#100DaysOfCode in Python Transcripts
Chapter: Days 49-51: Measuring performance
Lecture: Demo: Even more focused collection
0:00 Now we're turning off the profile until we get to our code. Run this little bit and then we disable it again. Right up here we have profile enable
0:08 and we have profile disable. But there's still a lot of reporting stuff and do you really care how fast the thing prints out?
0:13 It's printing to the console, and you really can't control that. That's not the essence. So let's just reorganize this code,
0:20 refactor it so that we can group the analytics and the data bits of it and then we'll move on. Okay, so let's come over here and we'll say
0:30 we'll get those days and get those days and those days. Now clearly, there's a problem here. We can't just keep calling the days like we were.
0:39 So, we have to call these hot days, cold days, and wet days. And we've got to replace that in here: hot days, cold days, and wet days.
0:50 Okay, so now we can take this profile bit and disable it way sooner, like this. So here we can do a little bit of work
0:59 in this block of code here and only profile that. Let's run this one more time. All right, now let's see how things are looking.
1:07 Okay, that looks even a little bit cleaner. It didn't change the numbers for this, obviously, because that's outside of what we were doing,
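A minimal sketch of the idea just described, assuming the profiler is a cProfile.Profile object. The functions load_data, hot_days, cold_days, and wet_days are hypothetical stand-ins, not the course's actual code. The data work sits between enable() and disable(), and the console output stays outside that block.

```python
import cProfile
import io
import pstats


# Hypothetical stand-ins for the course's loading and query functions.
def load_data():
    return [{"max_temp": t, "min_temp": t - 10, "precip": t % 3} for t in range(1000)]


def hot_days(data):
    return sorted(data, key=lambda r: -r["max_temp"])


def cold_days(data):
    return sorted(data, key=lambda r: r["min_temp"])


def wet_days(data):
    return sorted(data, key=lambda r: -r["precip"])


profiler = cProfile.Profile()

# Profile only the data and analytics work.
profiler.enable()
data = load_data()
hot = hot_days(data)
cold = cold_days(data)
wet = wet_days(data)
profiler.disable()

# Reporting happens after disable(), so printing is not measured.
print("Hottest:", hot[0]["max_temp"])
print("Coldest:", cold[0]["min_temp"])
print("Wettest:", wet[0]["precip"])

# Collect the profile report, sorted by cumulative time, into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Because each result now has its own name, the three queries can all run inside the profiled block before any printing starts.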
1:13 and it wouldn't change this either, right. But it does clean things up just a little bit. Let's look at what is the worst case scenario here.
1:22 Well there's init right here, this is obviously the worst function. It's at the top. But let's go look at it. It's doing this line right here.
1:34 Chances are we can't really do any better than that. It turns out that we can call it less often. One thing we could try
1:42 is to check whether it's already been initialized and, if so, skip it. That's actually a massive, massive performance gain, but let's make
1:47 what we already have faster before we add that. Over here, we're basically parsing the row and we're appending the data.
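The "call it less often" idea could be sketched like this. The module-level data list, the CSV_TEXT sample, and the load_count counter are all assumptions made for illustration, not the course's actual code. The guard simply returns early when the data has already been loaded.

```python
import csv
import io

# Hypothetical in-memory stand-in for the course's CSV file.
CSV_TEXT = "date,actual_max_temp\n2014-7-1,81\n2014-7-2,82\n"

data = []
load_count = 0  # only here to show how often the real work runs


def init():
    global load_count
    # Guard: if the data is already loaded, skip the expensive parse.
    if data:
        return
    load_count += 1
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    for row in reader:
        data.append(row)


# Three calls, but the file is only parsed on the first one.
init()
init()
init()
print(load_count, len(data))
```

One caveat of this particular guard: an empty file would be re-read on every call, so a separate boolean flag may be a better fit in that case.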
1:55 Remember parsing row down here actually does all sorts of conversions and then assigns it to this record and so on.
2:03 So, there's this init, but really the thing that is the problem here is this parse row that we're calling 36,135 times.
2:11 That is a ton of times that we're calling this. Can we make it faster? Answer is, probably, yes, yes we can. How can we do that?
2:22 One thing we could realize is that it's really all this conversion. These are taking strings and converting them to integers.
2:31 The dictionary read and write is, like, crazy fast. So, you could look at that and try to figure this out,
2:37 but that's not really the problem. The problem is the string conversion to numbers, ints and floats.
2:44 And then also this: we're allocating this record, we're assigning however many elements there are in this named tuple, and we're giving it back.
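One way to check that claim on your own machine is a quick timeit comparison. This is only a sketch: the row dictionary and its keys are made up for illustration, and the timings will vary from machine to machine.

```python
import timeit

# Hypothetical row, shaped like what csv.DictReader hands back (all strings).
row = {"actual_max_temp": "81", "actual_precipitation": "0.25"}

# Time a plain dictionary read.
lookup_time = timeit.timeit(lambda: row["actual_max_temp"], number=200_000)

# Time the same read plus string-to-number conversion.
convert_time = timeit.timeit(
    lambda: (int(row["actual_max_temp"]), float(row["actual_precipitation"])),
    number=200_000,
)

print(f"dict lookup only:   {lookup_time:.4f} s")
print(f"lookup + int/float: {convert_time:.4f} s")
```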
2:53 What can we do here to make this faster? Well, it turns out, if you look at the way our program works, up here we're working
3:03 with actual max temperature and actual precipitation, and over in the program we're working with actual min temp, and we should have been sorting by min
3:14 temp here as well. Okay, so a minor little bug, but really the highs on a cold day are pretty close to the lows as well.
3:22 Alright so we're working with these three values, max temp, min temp, and precipitation. If you look at the little reports we're running
3:29 we're also working with date, nothing else. However, just for completeness' sake, we said we're going to convert everything,
3:37 we're going to convert the mean temperature, the record temperature, the average temperature, you name it we're converting that.
3:43 Well if we know our program isn't actually going to touch those pieces of data let's not do that.
3:49 So let's see what we can do about improving performance by reducing some of the data we're working with here.
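A rough sketch of what reducing the data could look like, assuming a named tuple called Record and column names such as actual_min_temp. These names are guesses based on the fields mentioned above, not the course's actual code. Only the four values the reports use are converted, and any other columns in the row are ignored.

```python
import collections

# Assumed record type holding only the fields the reports actually use.
Record = collections.namedtuple(
    "Record", "date, actual_min_temp, actual_max_temp, actual_precipitation"
)


def parse_row(row):
    # Convert just the values we need; every other column is left untouched.
    return Record(
        date=row["date"],
        actual_min_temp=int(row["actual_min_temp"]),
        actual_max_temp=int(row["actual_max_temp"]),
        actual_precipitation=float(row["actual_precipitation"]),
    )


# Hypothetical row with an extra column that is never converted.
sample = {
    "date": "2014-7-1",
    "actual_mean_temp": "73",
    "actual_min_temp": "65",
    "actual_max_temp": "81",
    "actual_precipitation": "0.25",
}
record = parse_row(sample)
print(record)
```

Fewer conversions per row matters here because parse row runs once for every row in the file.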