Python Data Visualization Transcripts Chapter: Matplotlib Lecture: Histograms

0:00 Now that we've talked a little bit about how to use the object-oriented interface,
0:03 I want to take a step back and talk about how we can also customize the plots. With the histogram that we've been working with,
0:12 you might have noticed that the data is skewed quite a bit, and maybe we want to dive a little deeper into a specific range. We can do that.
0:22 So let me show you an example of customizing the range. By passing a range of 10-50, we can tell the histogram to start at
0:40 10 and go all the way up to 50. This gives us a little more ability to focus in on the data.
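A minimal sketch of the range customization described above. The lecture's car dataset is not shown in the transcript, so `mpg` here is a hypothetical stand-in built from synthetic, right-skewed values; only the `range` argument to `hist` is the point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the lecture's car data: right-skewed synthetic MPG values.
rng = np.random.default_rng(0)
mpg = rng.gamma(shape=6.0, scale=3.5, size=2000)

fig, ax = plt.subplots()
# range=(10, 50) ignores values outside the interval and spreads the bin edges across it.
n, bins, patches = ax.hist(mpg, range=(10, 50))
ax.set_xlabel("MPG (synthetic)")
ax.set_ylabel("Number of cars")
```

Values below 10 or above 50 are simply left out of the counts, which is what lets the plot zoom in on the bulk of the data.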
0:47 It's a pretty common operation you're going to want to do with histograms. The other thing you can do,
0:54 as I continue to copy and paste, is try something called a cumulative histogram. You can see there's a very different view here.
1:05 We're still in the 10-50 range, but what it's telling us is that by the time we get up to this 25-30 range, that's where the vast majority of the cars are.
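The cumulative view can be sketched the same way, again assuming synthetic `mpg` values in place of the lecture's dataset. With `cumulative=True`, each bar holds the count for its own bin plus all the bins before it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the lecture's car data: right-skewed synthetic MPG values.
rng = np.random.default_rng(0)
mpg = rng.gamma(shape=6.0, scale=3.5, size=2000)

fig, ax = plt.subplots()
# cumulative=True turns the per-bin counts into a running total from left to right.
n, bins, patches = ax.hist(mpg, range=(10, 50), cumulative=True)
ax.set_xlabel("MPG (synthetic)")
ax.set_ylabel("Cars at or below this MPG")
```

The last bar equals the total number of values inside the range, so the height where the curve flattens shows where most of the data has already been counted.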
1:16 So it's kind of a different way to interpret the histogram data that we've been looking at. Another option is to just keep
1:26 changing the parameters. We have a lot of ability to analyze the data in different ways very quickly. So now,
1:39 instead of having that filled-in histogram, we have the step function, and we've made it a horizontal histogram.
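One way to get the step outline and horizontal layout mentioned above is with the `histtype` and `orientation` parameters. The transcript does not show the exact call, so the combination of arguments below is an assumption, and `mpg` is synthetic stand-in data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the lecture's car data: right-skewed synthetic MPG values.
rng = np.random.default_rng(0)
mpg = rng.gamma(shape=6.0, scale=3.5, size=2000)

fig, ax = plt.subplots()
n, bins, patches = ax.hist(
    mpg,
    range=(10, 50),
    histtype="step",           # draw only the outline instead of filled bars
    orientation="horizontal",  # counts run along the x-axis, MPG along the y-axis
)
ax.set_xlabel("Number of cars")
ax.set_ylabel("MPG (synthetic)")
```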
1:48 What I think is really interesting about this, and the reason I wanted to go through it, is to show you that there
1:54 are many parameters for changing the way that you look at the data in matplotlib. So it's important to look at the documentation,
2:02 understand what those options are, and figure out what works best for your own visualization. Outside
2:08 of controlling the range, probably one of the most common things that I do with a histogram is change the number of bins.
2:19 So here we told it that there should be five bins between 10 and 50. Instead of letting matplotlib figure it out automatically for you,
2:28 you can specify it like I did there. To see the difference, let's really bump it up to maybe 100 bins, and you can see much more fidelity in
2:38 your data. Another thing I want to talk about is why we're using the semicolon and what is actually returned from a histogram.
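The bin-count comparison described above can be sketched side by side. As before, `mpg` is a hypothetical synthetic stand-in for the lecture's data, and the two-panel layout is just for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the lecture's car data: right-skewed synthetic MPG values.
rng = np.random.default_rng(0)
mpg = rng.gamma(shape=6.0, scale=3.5, size=2000)

fig, (ax_coarse, ax_fine) = plt.subplots(1, 2, figsize=(10, 4))

# Five equal-width bins between 10 and 50: each bin is 8 units wide.
n5, bins5, _ = ax_coarse.hist(mpg, range=(10, 50), bins=5)
ax_coarse.set_title("bins=5")

# One hundred bins over the same range: each bin is 0.4 units wide.
n100, bins100, _ = ax_fine.hist(mpg, range=(10, 50), bins=100)
ax_fine.set_title("bins=100")
```

With an integer `bins`, matplotlib splits the range into that many equal-width bins, so the edges always number one more than the counts.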
2:47 Let's just leave it at the default number of bins, and let's say we actually want to know what the bins are.
2:56 The way we would do that is to run this command. We get our same histogram, but if we look at the variable n, it's an array of the number of values
3:09 in each of the buckets, or bins. If you want to see the bins, you can look at the bins variable and you get that array. Then the final
3:18 one, which I'm not going to talk about much, is patches, which are the actual bars. In more advanced uses of matplotlib,
3:26 this is where you could do some additional customization if you wanted to, but I'm not going to go into that.
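A short sketch of capturing the three return values discussed above. The variable names `n`, `bins`, and `patches` follow the matplotlib documentation, and `mpg` is hypothetical synthetic data. In a Jupyter notebook, a trailing semicolon on the last line of a cell suppresses the echo of the returned value, which is presumably why the lecture's cells end with one; in a plain script it has no effect.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the lecture's car data: right-skewed synthetic MPG values.
rng = np.random.default_rng(0)
mpg = rng.gamma(shape=6.0, scale=3.5, size=2000)

fig, ax = plt.subplots()
# Default number of bins; the semicolon only matters as the last line of a notebook cell.
n, bins, patches = ax.hist(mpg);

print(n)             # count of values in each bin
print(bins)          # bin edges, one more entry than there are counts
print(len(patches))  # one bar (a Rectangle) per bin

# patches gives access to the bars themselves, for example to recolor the tallest one.
patches[int(np.argmax(n))].set_facecolor("tab:orange")
```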
