#100DaysOfCode in Python Transcripts
Chapter: Days 49-51: Measuring performance
Lecture: Demo: Even more focused collection
0:00 Now we're turning off the profile
0:01 until we get to our code.
0:03 We run this little bit and then we disable it again.
0:05 Right up here we have profile enable
0:07 and we have profile disable.
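The enable and disable calls mentioned here can be sketched roughly as follows, using the standard library's cProfile module. The `profiler` name and the `work` function are stand-ins for illustration, not the course's actual code.

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for the code we actually care about measuring.
    return sum(i * i for i in range(10_000))


profiler = cProfile.Profile()

profiler.enable()    # start collecting right before the interesting code
result = work()
profiler.disable()   # stop collecting right after it

# Write the stats into a string so the report itself is easy to inspect.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Anything that runs before enable or after disable does not show up in the report.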
0:08 But there's still a lot of reporting stuff
0:10 and do you really care how fast the thing prints out?
0:12 Like, it's printing to the console,
0:14 you really can't control that.
0:16 That's not the essence.
0:17 So let's just reorganize this code,
0:19 refactor it so that we can group the analytics
0:23 and the data bits of it and then we'll move on.
0:27 Okay, so let's come over here and we'll say
0:29 we'll get those days and get those days and those days.
0:34 Now clearly, there's a problem here.
0:36 We can't just keep calling the days like we were.
0:38 So, we have to call these hot days,
0:40 cold days, and wet days.
0:42 And we've got to replace that in here,
0:44 hot days, cold days, and wet days.
0:49 Okay, so now we can take this profile bit
0:52 and disable it way sooner, like this.
0:55 So here we can do a little bit of work
0:58 in this block of code here and only profile that.
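A minimal sketch of that refactoring, assuming three query functions and a small in-memory data set (the function names, field names, and sample rows are all hypothetical). The data work is grouped inside the profiled block and the printing happens after the profiler is disabled.

```python
import cProfile
from collections import namedtuple

# Assumed record shape and sample rows, for illustration only.
Record = namedtuple("Record", "date max_temp min_temp precipitation")
data = [
    Record("2014-07-01", 85, 68, 0.0),
    Record("2014-01-05", 30, 12, 0.4),
    Record("2014-04-10", 60, 45, 1.2),
]


def hot_days_query():
    return sorted(data, key=lambda r: r.max_temp, reverse=True)


def cold_days_query():
    return sorted(data, key=lambda r: r.min_temp)


def wet_days_query():
    return sorted(data, key=lambda r: r.precipitation, reverse=True)


profiler = cProfile.Profile()

# Only the analytics are profiled.
profiler.enable()
hot_days = hot_days_query()
cold_days = cold_days_query()
wet_days = wet_days_query()
profiler.disable()

# Reporting runs outside the profiled block, so print time is not measured.
for label, days in (("hot", hot_days), ("cold", cold_days), ("wet", wet_days)):
    print(label, days[0].date)
```

Giving each result its own name is what makes it possible to move the disable call above the reporting code.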
1:01 Let's run this one more time.
1:04 All right, now let's see how things are looking.
1:06 Okay, that looks even a little bit cleaner.
1:08 It didn't change the numbers for this, obviously,
1:10 because that's outside of what we were doing,
1:12 and it wouldn't change this either, right.
1:15 But it does clean things up just a little bit.
1:17 Let's look at what is the worst case scenario here.
1:21 Well there's init right here,
1:24 this is obviously the worst function.
1:25 It's at the top. But let's go look at it.
1:29 It's doing this line right here.
1:33 Chances are we can't really do any better than that.
1:35 But it turns out that we can call it less often.
1:39 One thing we could try to do
1:41 is check and see if it's already been initialized,
1:43 and if so, not do it again. That's actually a massive,
1:45 massive performance gain, but let's make
1:46 what we already have faster before we add that.
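The "check and see if it's already been initialized" idea could look something like this. It is only a sketch: the module-level `data` list, the `load_count` counter, and the inline sample CSV are assumptions made so the example runs on its own.

```python
import csv
import io

# Assumed module-level storage; the real code reads a CSV file from disk.
data = []
load_count = 0  # only here to show how many times the load really ran

SAMPLE_CSV = "date,actual_max_temp\n2014-07-01,85\n2014-07-02,87\n"


def init():
    global load_count
    if data:
        # Already initialized, so skip the expensive load.
        return
    load_count += 1
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    data.extend(reader)


init()
init()
init()
print(load_count, len(data))
```

Repeated calls become nearly free, since only the first one does any parsing.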
1:50 Over here, we're basically parsing the row
1:53 and we're appending it to the data.
1:54 Remember, parse row down here actually
1:56 does all sorts of conversions and then
1:58 assigns it to this record and so on.
2:02 So, there's this init, but really the thing that is
2:04 the problem here is this parse row
2:07 that we're calling 36,135 times.
2:10 That is a ton of times that we're calling this.
2:13 Can we make it faster? Answer is, probably, yes, yes we can.
2:19 How can we do that?
2:21 One thing we could realize is, it's really
2:24 all this conversion. These are taking strings
2:28 and converting them to integers.
2:30 The dictionary read and write is, like, crazy fast.
2:33 So, you could look at that and figure this out,
2:36 but that's not really the problem. The problem is the
2:40 string conversion to numbers, ints and floats.
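One rough way to check that claim on your own machine is a quick timeit comparison. The row below is made up, and the timings will vary from run to run, so treat this as a sketch rather than a benchmark.

```python
import timeit

# Hypothetical row, shaped like what csv.DictReader hands back (all strings).
row = {"actual_max_temp": "85", "actual_precipitation": "0.25"}

# Dictionary reads only.
lookup_time = timeit.timeit(
    lambda: (row["actual_max_temp"], row["actual_precipitation"]),
    number=200_000,
)

# The same reads plus the string-to-number conversions.
convert_time = timeit.timeit(
    lambda: (int(row["actual_max_temp"]), float(row["actual_precipitation"])),
    number=200_000,
)

print(f"lookups only:          {lookup_time:.4f} sec")
print(f"lookups + conversions: {convert_time:.4f} sec")
```

The difference between the two numbers is roughly what the conversions cost on top of the dictionary reads.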
2:43 And then also this, we're allocating this record
2:46 and we're assigning, well, however many elements
2:49 there are in this named tuple, and we're giving it back.
2:52 What can we do here to make this faster?
2:55 Well it turns out, if you look at the way our program works,
2:59 first, up here, we're working
3:02 with actual max temperature, actual precipitation,
3:05 and over in the program we're working
3:07 with actual min temp, and we should have been sorting by min
3:13 temp here as well.
3:16 Okay, so minor little bug, but really
3:18 highs on a cold day are pretty close to the lows as well.
3:21 Alright so we're working with these three values,
3:23 max temp, min temp, and precipitation.
3:26 If you look at the little reports we're running
3:28 we're also working with date, nothing else.
3:32 However, just for completeness' sake,
3:34 we said we're going to convert everything,
3:36 we're going to convert the mean temperature,
3:38 the record temperature, the average temperature,
3:40 you name it we're converting that.
3:42 Well if we know our program isn't actually
3:45 going to touch those pieces of data let's not do that.
3:48 So let's see what we can do about improving performance
3:50 by reducing some of the data we're working with here.
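As a sketch of where this is heading, a trimmed-down parse row might convert only the fields the program touches. The field names and the sample row are assumptions based on the columns mentioned above, not the course's exact code.

```python
from collections import namedtuple

# Only the four fields the reports use (names assumed for illustration).
Record = namedtuple(
    "Record", "date actual_min_temp actual_max_temp actual_precipitation"
)


def parse_row(row):
    # Convert just what is needed; the other columns in the row are ignored.
    return Record(
        date=row["date"],
        actual_min_temp=int(row["actual_min_temp"]),
        actual_max_temp=int(row["actual_max_temp"]),
        actual_precipitation=float(row["actual_precipitation"]),
    )


# Hypothetical row with extra columns that are no longer converted.
row = {
    "date": "2014-07-01",
    "actual_mean_temp": "77",
    "actual_min_temp": "68",
    "actual_max_temp": "85",
    "record_max_temp": "101",
    "actual_precipitation": "0.00",
}

record = parse_row(row)
print(record)
```

Fewer conversions per row matters here because parse row runs once for every row in the file.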