Python Data Visualization Transcripts
0:00 Since our last notebook was getting kind of long, I thought I'd start another notebook to go through an example of how to do additional
0:07 customization of your plots and also add a linear regression line to your plot. For the new notebook, I've set it up just like our other ones.
0:18 I have all of my imports. I established my file path to the EPA fuel economy file and read it in, so you can see the top five rows. I also enabled
0:30 matplotlib so it will plot inline. The one other thing that I wanted to call out:
0:35 I added a new import here for statsmodels. For those of you not familiar
0:40 with statsmodels, it's a really useful Python module that does a lot of statistical analysis of your data in a very straightforward,
0:49 easy-to-understand way, and you can look at the documentation to learn more about it.
0:55 I'll go through one quick example, but I encourage you to explore it more on your own. Similar to what we did in the past,
1:05 I created a very simple average of the fuel cost by year. So I have this nice, simple data frame that we will plot in a second.
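The yearly average described here can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the small data frame is made up, and the column names "year" and "fuelCost08" are assumptions about the EPA file that should be checked against the real data.

```python
import pandas as pd

# Tiny made-up stand-in for the EPA fuel economy file. The column names
# "year" and "fuelCost08" are assumptions, not confirmed by the transcript.
df = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019, 2020],
    "fuelCost08": [2000, 2200, 1900, 2100, 1800],
})

# Average fuel cost per year, kept as a flat two-column data frame
avg_by_year = df.groupby("year", as_index=False)["fuelCost08"].mean()
print(avg_by_year)
```

Passing as_index=False keeps "year" as an ordinary column, which makes it easy to hand to a plotting or modeling call later.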
1:14 So let's say we want to build a model to predict or show what a trend line would look like for the fuel economy as it changes over the years.
1:23 So we'll call this the MPG model. I've developed this model that says: predict the fuel
1:29 cost based on the year, and create a fitted line for that. If you want to see the values, you can see for each year what it
1:40 predicts the value would be. And if you want to see how good your model is, this prints out a nice table that describes the model as well as some
1:50 other measures of the effectiveness of the fit of that model. I'll leave that to you if you decide you want to dive into this in
1:58 a little more detail. So now that we have this model, let's plot it. What I've done is create a scatter plot showing the fuel
2:09 cost by year and then plotted the fitted values as a line, so you can see
2:15 that this line represents what that trend looks like. This isn't really a very good fit;
2:24 I'm doing this just for illustration purposes. If we want to clean this up, let's trim the number of years we're showing, and it looks a little bit cleaner.
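The fit-and-plot steps walked through above might look something like the sketch below. It assumes the statsmodels formula interface (the transcript does not say which statsmodels interface the notebook uses), the name mpg_model and the column name "fuelCost08" are illustrative, and the yearly averages are made-up numbers rather than real EPA values.

```python
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.formula.api as smf

# Made-up yearly averages standing in for the real data frame
avg_by_year = pd.DataFrame({
    "year": list(range(2010, 2021)),
    "fuelCost08": [2150, 2120, 2080, 2050, 2010, 1990,
                   1960, 1940, 1900, 1880, 1850],
})

# Ordinary least squares: fuel cost as a linear function of year
mpg_model = smf.ols("fuelCost08 ~ year", data=avg_by_year).fit()

print(mpg_model.fittedvalues)  # predicted fuel cost for each year
print(mpg_model.summary())     # coefficient table and fit statistics

# Scatter plot of the averages with the fitted values drawn as a line
fig, ax = plt.subplots()
ax.scatter(avg_by_year["year"], avg_by_year["fuelCost08"], label="Average fuel cost")
ax.plot(avg_by_year["year"], mpg_model.fittedvalues, color="red", label="Fitted line")
ax.set_xlabel("Year")
ax.set_ylabel("Fuel cost")
ax.legend()
```

In a notebook with inline plotting enabled, the figure renders when the cell finishes; in a plain script you would call plt.show() at the end.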
2:42 So in this example I just changed the range: instead of going from 2000 to 2020,
2:47 I'm just doing 2010 to 2020. I also compacted the y range to go from 1800 to 2200, just to make it a little easier to visualize.
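One way to trim the axes like this is with the matplotlib Axes methods set_xlim and set_ylim, sketched below with a few made-up points. The notebook may set the limits a different way (for example through plotting arguments), so treat this as an assumption rather than the exact code shown on screen.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([2010, 2015, 2020], [2150, 1990, 1850])  # made-up points

ax.set_xlim(2010, 2020)  # show only 2010 through 2020
ax.set_ylim(1800, 2200)  # tighten the y axis around the data
```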
2:59 You can see that it's not too bad a fit for this range. Once again, I'm not going to go into how you'd want to evaluate this statistically,
3:08 but this does show you how to use matplotlib to plot a linear regression.