Python Data Visualization Transcripts
Lecture: Line plots
0:00 We're going to continue working with the same data, but I've started a new Jupyter notebook just to keep the notebooks a little bit smaller.
0:06 So let's go through just a quick refresher of what the top of our notebook looks like. We have the imports for matplotlib,
0:15 numpy and pandas. We've established the directories and the files for the EPA fuel economy file that we've been looking at.
0:24 We're reading in the data and then just taking a look at what the top five
0:29 rows look like. Now we're going to go into plotting something other than histograms and box plots. And as I mentioned,
0:36 I like histograms and box plots because they look at one variable. But in real life you're going to want to plot two variables against each other.
0:44 And the most common way to do this, or one of the most common ways, is a line chart. So let's give an example of one.
0:49 Let's say we wanted to plot the average combined mileage per year. With matplotlib, we actually need to create that data first.
0:58 So I'm going to create a new data frame. So let me walk through this real quick.
1:12 So I've taken our data frame and averaged the comb08 column by year, and I used as_index=False to give me a nice clean data frame here.
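The grouping step described here might look something like the following. This is a minimal sketch, not the notebook's actual code: the small `df` is a hypothetical stand-in for the EPA fuel economy data, the name `avg_by_year` is made up for illustration, and only the `year` and `comb08` columns and `as_index=False` come from the lecture.

```python
import pandas as pd

# Hypothetical stand-in for the EPA fuel economy data frame;
# the real file has many more rows and columns.
df = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001, 2002],
    "comb08": [20, 23, 21, 24, 26],
})

# Average comb08 for each year. as_index=False keeps "year" as a regular
# column instead of moving it into the index; round(1) keeps one decimal.
avg_by_year = df.groupby("year", as_index=False)["comb08"].mean().round(1)

print(avg_by_year)
```

Keeping `year` as an ordinary column means it can be passed straight to the plotting call as the x values.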
1:22 I also rounded the data just for convenience's sake, and it makes some of the plots look a little bit better. So now that I have, for each year,
1:30 what that average is, let's plot it using a line plot. So you'll notice that I created the plot like we have in the past, where I
1:47 create my figure and my axes, and I don't tell it to plot a line plot. I just say plot, and I give it an X and a Y.
1:56 So it puts the year across the bottom and the average by year on the y axis. But I don't specify that it's a line plot, and that's because matplotlib
2:06 assumes, just by using the plot command, that it is a line plot. But as you look at this,
2:13 you'll see that there are some opportunities to clean this up and make it a little bit
2:17 nicer. So let's talk about what we need to do to make this a little more presentable. One of the first things I noticed about this is
2:24 I really don't like the decimals here. It's a year, and it's showing .0, .5. This doesn't really make sense for years.
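The basic line plot described above can be sketched as follows. The `avg_by_year` values are made up for illustration and the variable names are assumptions; the part taken from the lecture is creating the figure and axes and then calling `plot` with an x and a y.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; not needed in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly averages standing in for the grouped EPA data.
avg_by_year = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003, 2004],
    "comb08": [19.5, 19.6, 19.4, 19.7, 19.8],
})

# Create the figure and axes, then call plot with x and y.
# No chart type is specified: Axes.plot draws a line by default.
fig, ax = plt.subplots()
lines = ax.plot(avg_by_year["year"], avg_by_year["comb08"])
```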
2:33 So the way we want to do this is we're going to recreate the plot and set what we call the X ticks. So these are called ticks on the X axis,
2:46 hence X ticks. So let's manually set those to two-year increments. So now we have 2000, 2002, through 2020, in two-year increments.
3:04 And what we did to accomplish this is use the numpy function arange, which says generate an array starting at 2000 and stopping before 2022,
3:17 with incremental steps of two. So this is one way to specifically do it. There is another way we can do this, using a major formatter.
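Here is a minimal sketch of setting the X ticks with `arange`. The plotted values are made up and the variable names are assumptions; the `np.arange(2000, 2022, 2)` call follows the lecture's description. Note that `arange` excludes the stop value, which is why the last tick is 2020.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; not needed in Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical yearly averages for 2000 through 2020.
years = np.arange(2000, 2021)
avg_mpg = np.linspace(19.0, 25.0, len(years))

fig, ax = plt.subplots()
ax.plot(years, avg_mpg)

# Ticks at 2000, 2002, ..., 2020. The stop value 2022 is not included.
ticks = np.arange(2000, 2022, 2)
ax.set_xticks(ticks)
```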
3:26 So let me walk through how we would do that. I'm going to copy the same plot. So we did the same plot, set up our figure and our axes, but then we
3:41 access the X axis and use the function set_major_formatter. And we use StrMethodFormatter, which uses the Python string formatting syntax, to
3:52 tell it to show zero decimal places for this floating point value.
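The formatter approach might be sketched like this. The data is made up and the variable names are assumptions; `set_major_formatter` and `StrMethodFormatter` are the matplotlib pieces named in the lecture, and `"{x:.0f}"` is one format string that shows zero decimal places.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; not needed in Jupyter
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Hypothetical yearly averages standing in for the grouped EPA data.
years = [2000, 2001, 2002, 2003, 2004]
avg_mpg = [19.5, 19.6, 19.4, 19.7, 19.8]

fig, ax = plt.subplots()
ax.plot(years, avg_mpg)

# Format each x tick value with Python's str.format syntax.
# "x" is the tick value; ".0f" means a float shown with zero decimals.
ax.xaxis.set_major_formatter(StrMethodFormatter("{x:.0f}"))
```

Unlike setting the ticks by hand, this leaves the tick positions up to matplotlib and only changes how each label is written.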
3:58 So this is just another example where there are multiple ways within matplotlib to format and work with your plots.
4:08 This formatter option is very useful when you have dates, currency, or other values where you want to clean this up a little bit more.
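As one illustration of the currency case, the same formatter can be applied to the y axis. This is only a sketch: the price data is entirely made up and is not part of the fuel economy notebook, and the format string `"${x:,.0f}"` is just one possible choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; not needed in Jupyter
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Made-up dollar amounts by year, purely for illustration.
years = [2000, 2001, 2002, 2003, 2004]
prices = [21000, 22500, 23000, 24750, 25000]

fig, ax = plt.subplots()
ax.plot(years, prices)

# Dollar sign, thousands separator, zero decimals on the y axis.
ax.yaxis.set_major_formatter(StrMethodFormatter("${x:,.0f}"))
# Whole-number years on the x axis.
ax.xaxis.set_major_formatter(StrMethodFormatter("{x:.0f}"))
```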