Python Data Visualization Transcripts
Chapter: Altair
Lecture: Amazon data set
0:00
For this final Altair exercise, I'm going to bring in a new data set that we haven't looked at yet and use
0:06
Altair to explore it and create some really interesting visualizations. So for Chapter 6, Exercise 3,
0:14
I have a new notebook and I'm importing a data set that is an Excel file, so you'll need to make sure that you have openpyxl installed.
0:22
You can use pip to install it. Then I'm going to read in the file and show you its contents here in the data frame.
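As a rough sketch of that step, assuming pandas with openpyxl as the Excel engine: the lecture does not name the file or its columns, so the column names and rows below are made up for illustration, and the workbook is written to an in-memory buffer so the example runs on its own. In the real notebook you would pass the path of the Excel file to pd.read_excel instead.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real column names in the course file may differ.
sample = pd.DataFrame(
    {
        "Name": ["Book A", "Book B", "Book C", "Book D"],
        "User Rating": [4.7, 4.5, 4.8, 4.6],
        "Reviews": [1200, 800, 3000, 1500],
        "Price": [12.0, 9.5, 15.0, 11.0],
        "Year": [2019, 2019, 2020, 2020],
        "Genre": ["Fiction", "Non Fiction", "Fiction", "Non Fiction"],
    }
)

# Write an .xlsx workbook into memory so no file on disk is needed.
buffer = io.BytesIO()
sample.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# Reading .xlsx files is the step that requires openpyxl (pip install openpyxl).
books = pd.read_excel(buffer, engine="openpyxl")
print(books.head())
```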
0:29
It's a fairly simple data set of books on Amazon: their user rating, the number of
0:34
reviews, and what the average price is for that year, as well as whether it's a fiction or nonfiction book. So the first thing we may want to do is look at
0:42
how many reviews there are by year and by genre. So now we have a nice chart that shows the number of reviews by published year.
0:52
The orange is nonfiction and the blue is fiction. So you can clearly see from this that in 2020 there was a huge increase in the number of
1:01
reviews. I'll walk through exactly what we're doing here. So we're passing in our data frame, and we're telling Altair to draw the marks as bars.
1:10
And then we tell it that the Y axis is year, and what I wanted to do here is give it a new title. Instead of saying year here,
1:18
I wanted to say published year, to clean that up a little bit, and I want the title on the X axis
1:23
to be number of reviews. And then we tell it that the color is genre, and Altair takes care of everything for us.
1:31
I've used this opportunity to introduce a few new concepts here, but this shows why the Altair API of using alt.X or alt.Y
1:42
gives you a lot more flexibility in the types of visualizations, even though you have to type a little bit more.
1:49
So let's go through another example showing how we can do some pretty interesting things with the Altair API. In this example I want to look at, by year
1:59
and by fiction and nonfiction, what the average price is, and I want to use a tooltip
2:05
so that I can see how many records there are each year and what that average
2:09
price is. One cool thing about this is that I've created this simple mark rectangle, so it's
2:15
almost like a heat map where I have the year and the genre, and then the color is based on the average price, so it creates this gradient of color.
2:25
And then for the tooltip I use this alternative Altair API to say the tooltip should be the mean of the price.
2:33
And I can also set the format so that we have a nicely formatted string for currency, U.S. currency, and also make sure that the count of the records
2:43
doesn't have any decimals. So this just shows you how much flexibility you have with that Altair API.