Effective PyCharm Transcripts
Chapter: Performance and profiling
Lecture: Profiling the slow app

0:02 We saw this program is not especially fast; it takes maybe five seconds or something like that,
0:09 and we would like it to be faster, if that's at all possible. So what we're going to do is use the profiling tools
0:18 to ask where the time is spent. If we want to profile the entire application, that's probably the easiest thing to do:
0:25 we can go over here and push this button. To me it looks a little bit like a CD, but it's meant to be a speedometer.
0:34 Once you have a run configuration (this can be for unit tests, regular apps, or even web apps),
0:41 you can click here to run it with profiling, so let's do it.
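Under the hood this is the standard library's cProfile; a rough sketch of the equivalent from plain Python (the go function here is a stand-in workload, not the course app's actual code):

```python
import cProfile
import pstats

def go():
    # Stand-in workload; the real app's go() is not shown here
    return sum(i * i for i in range(100_000))

# Profile the call and save a snapshot file, roughly what PyCharm
# does when it runs a configuration with profiling enabled
profiler = cProfile.Profile()
profiler.enable()
go()
profiler.disable()
profiler.dump_stats("project_name.pstat")

# Load the snapshot back and print the top entries,
# similar to the statistics page in the IDE
stats = pstats.Stats("project_name.pstat")
stats.sort_stats("cumulative").print_stats(5)
```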
0:49 It started cProfile, which is probably the better of the two profilers. It did its work, saved a snapshot here
0:57 named after the project, and then exited. What came up immediately is this statistics page. So if we look at the statistics,
1:07 we have the number of times each function was called. For example, raising something to a power was called 600 thousand times;
1:15 that's pretty intense, right? This is probably our machine learning code. In terms of time, though, that didn't take very long:
1:22 120 milliseconds, which is some time. There are other things going on here: learn was called one time, but it took half a second.
1:31 Let's see, our socket connect took a little while, but it's only 4% of the time. So you can go through and sort these
1:40 and find what's interesting. "Own time" means time spent only in this function itself,
1:46 not in the things it calls. I find that super helpful sometimes, but usually what I'd like to know is how long this function takes to run in total,
1:55 not just how long it ran at that level of the call stack; that's this column. So program.py took 1.6 seconds,
2:05 main took that long, okay; we called go, and that's really all main was doing: calling go.
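These two columns map onto cProfile's own terms: "own time" is tottime, and the total is cumtime. A small sketch of the difference, with made-up function names:

```python
import cProfile
import pstats

def helper():
    # The real work happens here, so tottime ("own time") accumulates in helper
    return sum(i * i for i in range(50_000))

def go():
    # go() does almost no work itself: its tottime is near zero,
    # but its cumtime includes everything helper does
    return [helper() for _ in range(5)]

profiler = cProfile.Profile()
profiler.enable()
go()
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats("tottime").print_stats(5)     # the "own time" view
stats.sort_stats("cumulative").print_stats(5)  # the total-time view
```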
2:12 So here we have compute analytics. We had three things: we were doing a search, we were getting records, and we were computing analytics.
2:22 It looks like this analytics thing is the slowest, and then we have learn over here,
2:28 and down here we have get records, I think that's what we were calling it. So you can see the relative breakdown,
2:35 and this is really helpful. We can do things like, let's say... if you double-click it (I meant to click once),
2:42 but if you click once, you can choose Navigate to Source and it'll take you right to that function.
2:47 Okay, this one is probably the worst one that we actually control. Well, this one is.
2:53 It's spending almost all of its time in learn, down here, which is where we were. So we can think about how we might make this better;
3:01 we could think about the algorithm here. That's one way to look at it, and this view is okay, but I don't really like it that much;
3:10 I guess it depends on how complex things are. The other thing you can look at is the call graph,
3:14 and the call graph is awesome. This does not look awesome, does it? So let's try to zoom in a little bit
3:20 so we can get something more meaningful. If we come down here, there's a bunch of junk in here that has nothing to do with us.
3:27 Like here, you can see this is all the module-loading startup time from Python. We can't make that any faster; that's CPython,
3:33 that is what it is. If you want that to be faster, you need to use PyPy or Cython or some other runtime.
3:40 Maybe you could somehow pre-compile and get some .pyc files, but that's generally out of our control.
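Pre-compiling to .pyc files can be done with the standard library's compileall module, though, as noted, import time is rarely where an app's real cost is. A quick sketch against a throwaway directory:

```python
import compileall
import pathlib
import tempfile

# Compile every .py file under a directory; the resulting .pyc files
# land in __pycache__, so later imports skip the compile step
tmp = tempfile.mkdtemp()
src = pathlib.Path(tmp) / "example.py"
src.write_text("x = 1\n")

ok = compileall.compile_dir(tmp, quiet=1)
pyc_files = list((pathlib.Path(tmp) / "__pycache__").glob("*.pyc"))
print(ok, [p.name for p in pyc_files])
```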
3:48 This, though, program calling main, calling go, calling these three functions, this is where it gets interesting.
3:55 Notice the colors: this is red, which is about as bad as it gets, and this one is slightly less bad, just slightly,
4:02 because something else is happening in the startup. Then this one, among these three, is not too bad, so it's lighter green;
4:09 this one is yellow because it's maybe three times as bad as the other stuff at that level, and so on. So we can go through and actually see what's happening.
4:17 Like this one is actually going out and calling get on requests, which goes over the socket; this one is calling learn and read data,
4:26 and those are both pretty bad, it turns out. And this one is calling get records, and the really slow part of get records is creating the connection here.
4:35 Actually, I think I remember: I wanted this to be a little bit slower, so let me go over here. We can navigate to the source
4:41 and make this part, which we'll talk about in a second, a little slower. So, one more profiling run here with the call graph;
4:52 it always starts out looking like that, so we'll just zoom in and give it time to rearrange itself a little bit. Here we go.
5:05 So now we're spending some time in read query and get row, as well as connecting, so these all kind of have their own issues.
5:12 So how are we going to fix it? Well, we can go through and optimize. Let's say I'd start with the worst thing:
5:20 is there a way to optimize this, for example? Is there a way to optimize that?
5:25 So I think we'll just start here; this is the slowest thing that we have control over.
5:28 This one is slow, but it's only slow because it's literally just calling these, right?
5:32 And you can see right here its own time is zero milliseconds, but its total is that. Similarly, these are slow because the things they call are slow.
5:40 Alright, so the goal is, armed with this information and having this snapshot loaded up here, let's say this is going to be the starting point;
5:49 we'll keep this one, make some changes, and another one of these will show up,
5:54 and then we can do a quick comparison and see how things are working.
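The same before-and-after comparison can be done by hand with pstats snapshots; a sketch with two invented workloads that compute the same result (pstats has no built-in diff, so we compare the totals ourselves):

```python
import cProfile
import pstats

def before():
    # Original, slower approach: sum of squares via a Python-level loop
    return sum(i * i for i in range(200_000))

def after():
    # Optimized approach: closed-form formula for the same sum
    n = 200_000 - 1
    return n * (n + 1) * (2 * n + 1) // 6

def profile_to_file(fn, path):
    # Run fn under cProfile and dump one snapshot per experiment
    p = cProfile.Profile()
    p.enable()
    result = fn()
    p.disable()
    p.dump_stats(path)
    return result

# Both versions must produce the same answer for the comparison to be fair
assert profile_to_file(before, "before.pstat") == profile_to_file(after, "after.pstat")

t_before = pstats.Stats("before.pstat").total_tt
t_after = pstats.Stats("after.pstat").total_tt
print(f"before: {t_before:.4f}s  after: {t_after:.4f}s")
```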

