Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 5: Cleaning Heart Disease Data in Pandas
Lecture: Fixing the Num Column
0:00
Let's look at the num column. This should have a value of 0 or 1, but it looks like it has the values 0, 1, 2, 3, and 4.
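A quick way to see which values a column holds is to list its unique values and their counts. This is only a sketch: the small DataFrame below is a made-up stand-in for the course data, and only the column name `num` comes from the lecture.

```python
import pandas as pd

# Hypothetical stand-in for the heart disease data; these values are invented
# for illustration and are not the real dataset.
df = pd.DataFrame({"num": [0, 2, 1, 0, 4, 3, 0, 1]})

# Distinct values that appear in the column, in sorted order
values = sorted(df["num"].unique().tolist())
print(values)

# How often each value appears, ordered by value rather than by frequency
counts = df["num"].value_counts().sort_index()
print(counts)
```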
0:09
Let's go back up to our documentation and look at this. So this is column 58, the predicted attribute.
0:27
The description here says value 0 or 1. But if we go further up, it says it can be 0, 1, 2, 3, 4.
0:35
That's this goal column, valued from 0 (no presence) to 4. So it looks like we are seeing those values, 0 through 4.
0:43
The next thing I'm going to do is look at the data type of that. That actually looks like it's an int64, so that's pretty good.
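Checking the dtype, and confirming that a smaller integer type can hold every value, might look like the sketch below. The sample Series is hypothetical, with its dtype set explicitly to mirror the int64 seen in the lecture.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; dtype is set explicitly to match the lecture's int64.
num = pd.Series([0, 2, 1, 0, 4, 3], dtype="int64", name="num")

# The column's current data type
print(num.dtype)

# Smallest and largest values that an int8 can represent
limits = np.iinfo(np.int8)
print(limits.min, limits.max)

# True when every value in the column falls inside the int8 range
fits_int8 = bool(num.between(limits.min, limits.max).all())
print(fits_int8)
```

Since the values only run from 0 through 4, they sit comfortably inside the int8 range.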
0:51
I'm just going to go ahead and run this. I'm just going to convert it to an int8. In this case, I don't need to use my special function there.
1:01
I can just pass int8 straight to astype. So here's our chain, which looks like it does a relatively good job of cleaning up our data.
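The conversion step on its own could be written as one link in a method chain, roughly as follows. The full chain from the lecture is not reproduced here; the DataFrame, the names `raw` and `cleaned`, and the use of `assign` with a lambda are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the raw data; only the column name comes from the lecture.
raw = pd.DataFrame({"num": pd.Series([0, 2, 1, 0, 4, 3], dtype="int64")})

# One step of a cleaning chain: replace num with an int8 version of itself.
# assign returns a new DataFrame, so raw is left unchanged.
cleaned = raw.assign(num=lambda df_: df_["num"].astype("int8"))

print(cleaned.dtypes)

# Bytes used by the column's values before and after the conversion
print(raw["num"].nbytes, cleaned["num"].nbytes)
```

Because the column is already an integer type with no missing values in this sample, a plain `astype("int8")` is enough and the values come through unchanged.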