Effective PyCharm Transcripts
Chapter: Performance and profiling
Lecture: Profiling the slow app
0:02
We saw this program is not especially fast; it takes maybe five seconds or something like that,
0:09
and we would like it to be faster, if that's at all possible. So what we're going to do is use the profiling tools
0:18
to ask where the time is spent, and if we want to just profile the entire application, that's probably the easiest thing to do:
0:25
we can go over here and push this button. To me it looks a little bit like a CD, but it's meant to be a speedometer type thing.
0:34
So once you have a run configuration, and this can be for unit tests, regular apps, or even web apps,
0:41
what you can do is click here to run that with profiling, so let's do it.
0:49
So it started cProfile, which is probably the better of the two profilers; it did what it does and then it saved a snapshot here
0:57
based on the project name, and then it exited, and immediately this statistics page came up. So if we look at the statistics,
1:07
what we have is the number of times each function was called. So, for example, this raising something to a power was called 600 thousand times,
1:15
that's pretty intense, right? This is probably our machine learning thing. In terms of time, though, that didn't take that long,
1:22
120 milliseconds; that is some time. There are some other things going on here: learn was called one time, but it took half a second.
1:31
Let's see, our socket connect took a little while, but it's only 4% of the time. So you can go on here and, I would say, sort these columns
1:40
and then go and find what's interesting. So, own time: this means the time spent only in this function,
1:46
not in the things it calls. I find this to be super helpful sometimes, but usually what I'd like to know is how long this function takes to run overall,
1:55
not just how long inside that level of the call stack; that is this column. So program.py took 1.6 seconds,
2:05
main took that long, okay; we called go, and that's really all that main was doing, it was calling go.
2:12
So here we have compute_analytics. We had three things: we were doing a search, we were getting records, and we were computing analytics,
2:22
so it looks like this analytics thing is the slowest and then, we have learn over here
2:28
and down here we have get_records, I think that's what we were calling it, so you can sort of see the relative breakdown,
2:35
and this is really helpful. We can do things like, let's say... I just double-clicked it, I guess; I meant to click once,
2:42
but if you click once, you can choose Navigate to Source and it'll take you right to that function.
2:47
Okay, this one is probably the worst one that we actually control. Well, this one is.
2:53
It's spending almost all of its time in learn, down here, which is where we were. So we can think about how we might be able to make this better;
3:01
we could think about the algorithm here. Now, that's one way to look at it, and this view is okay, but I don't really like it that much;
3:10
I guess it depends on how complex things are. The other thing you can look at is the call graph
3:14
and the call graph is awesome, this does not look awesome, does it? So let's try to zoom in a little bit
3:20
so we can get something more meaningful. So if we come down here, there's a bunch of junk in here that has nothing to do with us,
3:27
like here you can see this is all the load-module startup time from Python. We can't make that any faster; that's CPython,
3:33
that is what it is. If you want that to be faster, you need to use PyPy or Cython or some other runtime,
3:40
or maybe you could somehow pre-compile and get some .pyc files, but that's generally out of our control.
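If you did want to try that pre-compiling route, the standard library can write those .pyc files up front. This is just a sketch, with a hypothetical directory name, and note it only trims bytecode compilation, not the rest of import time:

```python
import compileall

# compile every .py file under the directory into __pycache__/*.pyc,
# so the interpreter can skip bytecode compilation on first import
compileall.compile_dir("my_project", quiet=1)
```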
3:48
this though, program calling main, calling go, calling these three functions, this is where it's interesting.
3:55
Notice the colors: this is red, which is like as bad as it gets, and this is slightly less bad, just slightly,
4:02
because something else is happening in the startup. And then this one, among these three, is not too bad, so it's a lighter green;
4:09
this is yellow because it's maybe three times as bad as the other stuff on that level, and so on. So we can go through and actually see what's happening,
4:17
like this one is actually going out and calling get on requests, which is going over the socket; this one is calling learn and read_data,
4:26
and those are both pretty bad, it turns out; and this one's calling get_records, and the really slow part of get_records is creating the connection here.
4:35
Actually, I think I remember I wanted this to be a little bit slower, so let me go over here; we can navigate to the source
4:41
and let's actually make this part, which we'll talk about in a second, a little slower. So, one more profiling run here with the call graph.
4:52
It never looks easy like that at first, so we'll just zoom in and give it time to rearrange itself a little bit. Here we go:
5:05
now we're spending some time in read_query and get_row as well as connecting, so these all kind of have their own issues.
5:12
So how are we going to fix it? Well, we can go through and optimize. Let's say I would start with the worst thing:
5:20
is there a way to optimize this, for example? Is there a way to optimize that?
5:25
so I think we'll just start here, this is the slowest thing that we have control of,
5:28
This is slow, but it's only slow because it's literally just calling these, right,
5:32
and you can see right here its own time is zero milliseconds, but its total is that. Similarly, these here are calling things that are slow.
5:40
Alright, so the goal is, armed with this information and having this snapshot loaded up here, let's say this is going to be the starting point:
5:49
we'll have this one, and we can make some changes and have another one of these show up,
5:54
and we can do a quick comparison and see how things are working.