Python Data Visualization Transcripts Chapter: Altair Lecture: Amazon data set

0:00 For this final exercise of Altair, I'm gonna bring in a new data set that we haven't looked at yet and use
0:06 Altair to explore it and create some really interesting visualizations. So for Chapter six exercise three,
0:14 I have a new notebook and I'm importing a data set that is an Excel file, so you'll need to make sure that you have openpyxl installed.
0:22 You can use pip to install it, and then I'm going to read in the file and I'll show you the file here in the data frame.
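As a rough sketch of the loading step described above: the transcript does not name the Excel file, so `amazon_books.xlsx` and the helper `load_books` are hypothetical placeholders. `pandas.read_excel` uses openpyxl to read `.xlsx` files, which is why the package has to be installed first (`pip install openpyxl`).

```python
import pandas as pd

# Hypothetical file name; the transcript does not say what the file is called.
EXCEL_PATH = "amazon_books.xlsx"


def load_books(path=EXCEL_PATH):
    """Read the Excel workbook into a DataFrame (hypothetical helper).

    engine="openpyxl" makes the dependency explicit; pandas raises an
    ImportError if openpyxl is not installed.
    """
    return pd.read_excel(path, engine="openpyxl")
```

In a notebook, something like `df = load_books()` followed by `df.head()` would show the first few rows of the data frame.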
0:29 It's a fairly simple data set of books on Amazon: their user rating, the number of
0:34 reviews and what the average price is for that year, as well as whether it's a fiction or nonfiction book. So the first thing we may want to do is look at
how many reviews there are by year by genre. So now we have a nice chart that shows the number of reviews by published year.
0:52 The orange is nonfiction and the blue is fiction. So you can clearly see from this that in 2020 there's a huge increase in the number of
1:01 reviews. I'll walk through exactly what we're doing here. So we're creating our chart from the data frame, we're telling it that we're going to mark it as a bar.
1:10 And then we tell it that the Y axis is year, and what I wanted to do here is give it a new title. Instead of saying year here,
1:18 I wanted to say published year, to clean that up a little bit, and I want the title on the X axis
1:23 to be number of reviews. And then we tell it that the color is genre, and Altair takes care of everything for us.
1:31 I've used this opportunity to introduce a few new concepts here, but this shows why the Altair API of using alt.X or alt.Y
1:42 gives you a lot more flexibility in the types of visualization, even though you have to type a little bit more.
1:49 So let's go through another example showing how we can do some pretty interesting things with the Altair API. In this example I want to look at, by year
1:59 and fiction and nonfiction, what the average price is, and I want to use a tooltip
2:05 so that I can see how many records there are each year and what that average
2:09 price is. Some cool things about this: I've created this simple mark rectangle, so it's
2:15 almost like a heat map where I have the year and the genre, and then the color is based on the average price, so it creates this gradient of color.
2:25 And then for the tooltip I use this alternative Altair API to say the tooltip should be the mean of the price.
2:33 And I can also tell it the format so that we have a nicely formatted string for currency, U.S. currency, and also make sure that the count of the records
2:43 doesn't have any decimals. So this just shows you how much flexibility you have with that Altair API.
