Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 5: Cleaning Heart Disease Data in Pandas
Lecture: Using the Cleanup Function for the Fbs Column
0:00
Okay, we're going to keep going through. I'm going to look at fasting blood sugar. I've got my chain here.
0:06
I'm just going to pull off that column and do a describe on it. It looks again like we have that same problem.
0:12
Describe isn't giving us super useful values. I'm going to look at the value counts here.
0:17
I see 0 and 0.01 and 1.0 and a question mark, so we need to convert these types here. This looks like a boolean value.
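As a rough sketch of what that inspection looks like, here is describe next to value counts on a small string column. The sample values are made up for illustration and are not the real dataset; the only assumption is that the raw column holds strings, including a question mark for missing entries.

```python
import pandas as pd

# Hypothetical sample standing in for the raw fbs column (not the real data)
fbs = pd.Series(['0', '1', '0.0', '1.0', '?', '0', '?'], name='fbs')

# On a string (object) column, describe only reports count, unique, top, freq
summary = fbs.describe()

# value_counts tallies each distinct raw value, which exposes the '?' entries
counts = fbs.value_counts()

print(summary)
print(counts)
```

With strings like these, describe cannot report a mean or quartiles, which is why the value counts are the more useful view here.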
0:26
I'm going to replace this with a boolean instead of the types that I was using previously for the other columns.
0:33
Because I have the function there, it makes it really easy. I just specify dtype.
0:37
Let's do that and look at the value counts. That looks like it did work. We could also come in here and say dropna=False
0:46
to see the missing values, and you can see that 90 of those are missing again.
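The cleanup function itself is not shown in this lecture, so the following is only a minimal sketch of the idea. The name `clean_column`, its signature, and the sample values are all hypothetical. It assumes question marks mark missing entries and uses the pandas nullable `'boolean'` dtype so that missing values survive the conversion.

```python
import pandas as pd

def clean_column(ser, dtype):
    """Hypothetical stand-in for the course's cleanup function."""
    no_marks = ser.where(ser != '?')   # '?' entries become NaN
    numeric = pd.to_numeric(no_marks)  # '0', '0.0' -> 0.0 and '1', '1.0' -> 1.0
    return numeric.astype(dtype)       # e.g. 'boolean' keeps missing as <NA>

# Made-up sample values, not the real dataset
raw = pd.Series(['0', '1', '0.0', '1.0', '?', '0', '?'], name='fbs')
fbs = clean_column(raw, 'boolean')

# dropna=False adds a row for the missing entries to the tally
counts = fbs.value_counts(dropna=False)
print(counts)
```

Passing the dtype in as a parameter is what lets the same helper serve the other columns with a different type.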
0:52
This would be something where I would want to go back to a subject matter expert and ask them why these values are missing.
0:57
If there is not an entry in here, is that the same as a false? That would be a question that I want to make sure I have an answer to.
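To see why the answer matters, here is a small sketch comparing the two readings of a missing entry. The values are invented for illustration, and treating missing as False is shown only as one possible answer, not a recommendation.

```python
import pandas as pd

# Made-up cleaned column using the nullable boolean dtype
fbs = pd.Series([True, False, pd.NA, False, pd.NA], dtype='boolean')

# Reading 1: missing stays missing, so it is skipped in the calculation
share_true_skip = fbs.mean()

# Reading 2 (an assumption to confirm with an expert): missing means False
share_true_filled = fbs.fillna(False).mean()

print(share_true_skip, share_true_filled)
```

The share of True values changes depending on which reading is used, so the choice should come from someone who knows how the data was recorded.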