Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 10: Making a Snow Report Dashboard with Dash and Plotly
Lecture: Clean Pandas data with a function for plotly
0:00
In this video, we're going to clean up the data. So here's the data that we just loaded.
0:05
This is a data frame that has meteorological information about a ski resort. It's got data from 1990 to 2018.
0:16
And we're mostly concerned with some of the attributes that you can't see here. And this is not showing all of the columns that I want to see.
0:26
Let me just bring those out over here. We've got date, and then we've got a lot of attributes that were hidden in here.
0:34
The first ones we'll be concerned with are the snow columns. Snow is how much snow fell on a given day. Snow depth is how much snow is on the ground.
0:42
We've got these temperature columns: Tmax, Tmin, and Tobs.
0:49
Tmax is the maximum temperature during the day, and Tmin is the minimum temperature during the day.
0:51
And then there's Tobs, which is the temperature of observation, taken when they went out and measured the snow depth.
0:56
And then another one is precipitation, PRCP. That's how much water fell, so that's different from snow. Snow is how many inches of snow fell.
1:04
PRCP is how many inches of water fell. Okay, so here's my code to clean it up. I've got a chain here. Let me just talk about what this chain is doing.
1:14
We are converting the date column to a datetime. I'm pulling out certain columns, and then I'm making a month column, a year column, and a season column.
1:23
The season column is a little bit more complex, but basically a ski season runs from the end of one year into the next year.
1:31
And that's the logic for doing that. Okay, at this point, I'll store that in a variable called ALTA. Let's just make sure that ALTA exists.
1:43
And there we go.
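
The chain described above might look roughly like the following. This is a minimal sketch and not the code shown on screen: the uppercase column names (DATE, SNOW, SNWD, TMAX, TMIN, TOBS, PRCP), the function names tweak_alta and season_label, the October cutoff for the start of a season, and the "2017-2018" label format are all assumptions made for illustration. The sample rows are made up so the example runs on its own.

```python
import pandas as pd


def season_label(dates: pd.Series, start_month: int = 10) -> pd.Series:
    """Label each date with a ski season such as '2017-2018'.

    Assumption: a season starts in `start_month` (October here), so
    January 2018 belongs to the season that began in late 2017.
    """
    # Months on or after the cutoff start a season in their own year;
    # earlier months belong to the season that started the previous year.
    start_year = dates.dt.year.where(dates.dt.month >= start_month,
                                     dates.dt.year - 1)
    return start_year.astype(str) + "-" + (start_year + 1).astype(str)


def tweak_alta(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleanup chain: parse dates, keep columns, add features."""
    # Assumed column names; adjust to match the real data.
    cols = ["DATE", "PRCP", "SNOW", "SNWD", "TMAX", "TMIN", "TOBS"]
    return (
        df
        .assign(DATE=pd.to_datetime(df["DATE"]))   # date strings -> datetime
        .loc[:, cols]                              # pull out certain columns
        .assign(
            MONTH=lambda d: d["DATE"].dt.month,
            YEAR=lambda d: d["DATE"].dt.year,
            SEASON=lambda d: season_label(d["DATE"]),
        )
    )


# Made-up sample rows, only to show the shape of the result.
raw = pd.DataFrame({
    "STATION": ["X", "X", "X"],
    "DATE": ["2017-12-31", "2018-01-01", "2018-11-15"],
    "PRCP": [0.5, 0.0, 0.2],
    "SNOW": [6.0, 0.0, 2.0],
    "SNWD": [40.0, 39.0, 10.0],
    "TMAX": [30, 28, 35],
    "TMIN": [15, 10, 20],
    "TOBS": [20, 12, 25],
})

alta = tweak_alta(raw)
print(alta[["DATE", "MONTH", "YEAR", "SEASON"]])
```

Wrapping the chain in a function means the same cleanup can be rerun on a fresh load of the data, and the result can be stored in a variable and checked, as is done in the video.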