Python Data Visualization Transcripts
Lecture: Pair and jointplot
0:00 The next plot I want to talk about is the pair plot. The easiest way to see what it does is to run it on the full
0:10 DataFrame, which will actually take some time. So you can see it generates this really large graph, which compares
0:26 every numeric column in the DataFrame to every other numeric column. So you can quickly look at this and see relationships.
0:35 For example, if you look at the highway 08 fuel economy compared to the combined 08 fuel economy, you see that they're very closely related,
0:44 which makes sense because they're basically similar measures of fuel economy.
0:51 But this is a really useful tool to quickly look at all of your data and see where you want to dive in and do some more investigation.
1:00 Now, this specific plot actually is maybe a little busy with all the columns, but you can specify which columns you want to plot.
1:11 So let's go through an example of that. So now in this example I specify what the X variables and the Y variables are.
1:19 So I just want to look at cylinders, displacement and barrels. And I can also pass in hue to show colors for the
1:28 different date ranges. Now we have a much easier to digest visualization because it's just three by three and I've added the color.
1:39 So you can start to see some trends for the different date ranges. So let me do another example just to hammer home what that looks like and some
1:50 of the other configurations that you can do with this type of plot. So here I can tell it to do a 'kde' for the kind.
1:59 And it will give us a different view. And now here is the finished plot, and I will warn you that this plot is very computationally expensive to calculate.
2:15 It took a couple of minutes on my machine to actually create it. So I have obviously sped up the video.
2:21 It is a really interesting visualization to show you how you can quickly vary the types
2:27 of analysis and you can use it to zoom in on additional aspects of your data
2:33 that you want to investigate further. The final plot I want to show is the joint plot.
2:38 And this plots two variables against each other and in this example I'm gonna plot the
2:44 barrels 08 versus displacement and tell it to add a regression line. So this is a central scatter plot with a regression line, but then you also have
2:55 on the X and the Y axes these histograms that show the distribution of the data.
3:02 So this shows how displacement is distributed and this shows how barrels 08 is distributed.
3:10 I'll give another example where we can further customize this. Let's say we want to add the date range into this.
3:20 Now we have a similar grid but we also now have color coding for the date range. So you can see the blue for the earlier date range and the orange
3:30 for the later date range. And similar to what we've done with some of the other plots, I find it interesting to do
3:39 'kde' plots because these are not the types of plots that you see as much in other tools like Excel.
3:47 So I want to highlight this so you can get some exposure to it.
3:53 So here's the output where we see the relationship between the barrels and displacement and then
4:00 how it varies between the two-wheel drive and four-wheel drive vehicles.
4:05 So once again, this one is a little bit difficult to interpret in this specific
4:10 situation. But I wanted to call this out so that you are aware it is an option and can play with it on your own data.
4:17 Just like all of the Seaborn plots that we've talked through, this will really make more sense on your own data and give you the tools to
4:25 quickly move back and forth and evaluate what works best for you.