Effective PyCharm Transcripts
Chapter: Performance and profiling
Lecture: Concepts: Profiling

0:01 You've seen the profiler in action and you've seen our technique for making our fake little set of functions faster

0:09 so let's go back and talk about some of the concepts that we saw. So we can go to the run menu and choose profile our program here

0:19 or we could go up and just press this to profile it, so there's also a way in the project to do it,

0:27 so there's a lot of ways to start profiling our run configuration and like I said, this can be a unit test,

0:32 this can be a proper program like the one we did, it could be a web app, whatever, any run configuration you should be able to just profile it.

0:40 Now, you run it and it runs down here, like this. Notice it's starting cProfile, and then after it runs, it pops open with the stats,

0:51 by default it opens sorted by own time, and I think that's really not the right place to be; you want cumulative time, and then sort of work your way down

1:01 so go and sort by the Time (ms) column, not Own Time (ms), right here. Notice that compute analytics was probably the worst thing

1:11 that we are in control of, it's called 9 times, it took 7.6 seconds, that's really a problem. So we should probably go look at that and analyze it.

1:23 We also have learn, we also have read data, those are the different parts that we've written that look especially bad,

1:30 time.sleep, we didn't write that, so I can't do anything about it directly; maybe we could call it fewer times or with a smaller value. Okay, we also have get records

1:39 and so these are the places we should probably be looking and that's what our analysis here is telling us, probably starting with compute analytics.
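
The same "sort by cumulative time, not own time" idea can be tried outside PyCharm with the standard library's cProfile and pstats modules. This is only a minimal sketch: the function names mirror the ones mentioned in the lecture, but their bodies, and the helper profile_by_cumulative_time, are hypothetical stand-ins rather than the course's actual code.

```python
import cProfile
import io
import pstats
import time


def compute_analytics():
    # Hypothetical stand-in for the slow function discussed in the lecture.
    time.sleep(0.02)
    return sum(i * i for i in range(10_000))


def search():
    # Hypothetical stand-in for a cheaper function.
    return sorted(range(1_000), reverse=True)[:5]


def go():
    search()
    for _ in range(3):
        compute_analytics()


def profile_by_cumulative_time(func, limit=10):
    """Run func under cProfile and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # CUMULATIVE includes time spent in callees, which is roughly what the
    # lecture means by sorting on total time rather than own time.
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(limit)
    return buffer.getvalue()


report = profile_by_cumulative_time(go)
print(report)
```

In the printed table, the cumtime column includes time spent in callees, while tottime is the function's own time, so entries like go and compute_analytics should tend to appear near the top when sorted this way.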

1:49 We're also creating the connection and you might think well there's nothing you can do to make talking to a database faster

1:57 you have to open the connection to talk to it but you could implement connection pooling or at least make sure what you're doing

2:03 is leveraging the built-in connection pooling of your database provider. While the statistics are cool, I think the graphical version is much better

2:12 so here we can dig into the individual functions, we have program, we have main, we have go, and after go it gets interesting,

2:20 those are the three heavyweight things that go does and that's really all the program is doing. So search is the least bad of the three options

2:28 compute analytics is the worst. So the way to read this is we start here in program.py, it calls main, and from main that calls go

2:39 and from go we call this one, and then this one, and then this one so we're calling these functions sort of in this order

2:46 so you can follow the flow until you get to a point like okay, this looks bad and like something we can optimize.
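
Following the flow from main to go to the functions it calls can also be approximated in text form with pstats, which can list the callees of a given function. This is a rough sketch under some assumptions: the call chain copies the names from the lecture (with learn assumed to be the third call made by go), the function bodies are made up, and callees_of is a hypothetical helper.

```python
import cProfile
import io
import pstats


def search():
    # Hypothetical cheap step.
    return [n for n in range(2_000) if n % 7 == 0]


def learn():
    # Hypothetical mid-weight step.
    return sum(n * n for n in range(20_000))


def compute_analytics():
    # Hypothetical heavy step.
    return sorted((n * 31) % 1_000 for n in range(50_000))


def go():
    search()
    learn()
    compute_analytics()


def main():
    go()


def callees_of(profiler, function_name):
    """Return a text listing of the functions called by function_name."""
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # The restriction is a regular expression matched against entries that
    # look like "file:line(name)", so this selects only the named function.
    stats.print_callees(rf"\({function_name}\)")
    return buffer.getvalue()


profiler = cProfile.Profile()
profiler.runcall(main)
callee_report = callees_of(profiler, "go")
print(callee_report)
```

Each line after the arrow in the output is one edge of the call graph leaving go, along with its call count and timings, which is the same information the graphical view draws as boxes and arrows.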

2:51 And remember, color matters, so we've got green for search, it's pretty fast relative to the other things, we've got orange for compute analytics,

3:03 and we've got red for main, so this is a percent of time and you can actually see the percent there,

3:08 like search is 3.4%, compute analytics is 70% and main is 96% so it's kind of a gradient from green to red with a little yellow in the middle,

3:20 relative to everything else, search is probably fast enough. Compute analytics, this could be faster, right,

3:28 but the color is kind of telling you it's not the worst you've seen but it could be better. And this one is red, right,

3:33 this is pretty much as bad as it gets for this particular program. We could also navigate, so we could right click in the tabular version

3:40 and say navigate to the source, or actually jump over to the call graph, so if you click on the show call graph option, it will take you over here

3:49 but if you go to that one and right click, you can only navigate to the source,

3:54 so there's not this bi-directional take me to the graph, take me to the table.

3:58 So here we can navigate down to the source and see what's actually going on. So those are the techniques and tooling that we use,

4:07 I want to leave you with one quick warning though: be aware of the effects of profilers. So profilers and their friends, the debuggers,

4:16 these can have non-obvious effects, so you might have two functions, one which is called one time and one which is called 100 thousand times

4:27 and without the profiler, maybe they take exactly the same amount of time, but because the profiler is in the way and collecting data about every call

4:35 the one that's called 100 thousand times looks way worse in the profiler than the other, which just goes down to the system

4:42 and the profiler is not doing much, so you can think of these as having a little bit of quantum mechanics effects,

4:47 kind of Heisenberg uncertainty principle: the more precisely you measure it, the more you might actually be changing how it's behaving.

4:56 While cProfile is pretty good and the debugger with the Cython speedups is pretty good,

5:03 just keep in mind that this is not exactly the real runtime behavior, this is the runtime behavior while it's being deeply observed.

5:11 Okay, it's still super, super helpful for tracking down these issues, and it's more important to look at the differences across time, I'd say,

5:20 than it is to look at the exact numbers and say, well, now it is a tiny bit faster; it could just be the profiler affecting it.
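
To get a feel for the profiler's own overhead, one option is to time the same work with and without cProfile attached. The sketch below is an illustration only: add_one, many_small_calls, one_big_call, and timed are hypothetical names, and the two workloads are not identical in cost even without the profiler, since a function call is never free. The point is that the gap tends to widen for the function that makes 100 thousand calls, because the profiler records every one of them.

```python
import cProfile
import time


def add_one(value):
    return value + 1


def many_small_calls(n=100_000):
    # Makes n Python-level function calls, each of which the profiler records.
    total = 0
    for _ in range(n):
        total = add_one(total)
    return total


def one_big_call(n=100_000):
    # Does comparable arithmetic inside a single function call.
    total = 0
    for _ in range(n):
        total += 1
    return total


def timed(func, profiled=False):
    """Return (result, elapsed_seconds), optionally running func under cProfile."""
    profiler = cProfile.Profile() if profiled else None
    start = time.perf_counter()
    if profiler is not None:
        result = profiler.runcall(func)
    else:
        result = func()
    elapsed = time.perf_counter() - start
    return result, elapsed


timings = {}
for func in (many_small_calls, one_big_call):
    _, plain = timed(func)
    _, profiled = timed(func, profiled=True)
    timings[func.__name__] = (plain, profiled)
    print(f"{func.__name__}: plain {plain * 1000:.1f} ms, "
          f"profiled {profiled * 1000:.1f} ms")
```

The exact numbers will vary from machine to machine and run to run, which is another reason to compare trends across runs rather than read too much into a single small difference.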
