Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 5: Cleaning Heart Disease Data in Pandas
Lecture: Using the Cleanup Function for the Fbs Column

0:00 Okay, we're going to keep going through. I'm going to look at fasting blood sugar. I've got my chain here.
0:06 I'm just going to pull off that column and do a describe on it. It looks again like we have that same problem:
0:12 describe isn't giving us super useful values. I'm going to look at the value counts here.
0:17 I see 0 and 0.01 and 1.0 and question mark, so we need to convert these types here. This looks like a boolean value.
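The inspection step described here can be sketched roughly as follows. The sample values are made up for illustration (they are not the course data), and the series stands in for the fbs column pulled out of the chain.

```python
import pandas as pd

# Hypothetical stand-in for the raw fbs column: strings, with "?" marking
# entries that were never recorded. These values are invented for the sketch.
fbs = pd.Series(["0", "1.0", "?", "0.0", "1", "?", "0"], name="fbs")

# On a column of strings, describe() reports count/unique/top/freq
# instead of numeric statistics such as mean or quartiles.
summary = fbs.describe()
print(summary)

# value_counts() lists each distinct raw string and how often it appears,
# which exposes the mixed spellings and the "?" placeholder.
counts = fbs.value_counts()
print(counts)
```

Seeing "0", "0.0", "1", and "1.0" side by side with "?" is the hint that the column is really a yes/no flag stored as text.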
0:26 I'm going to replace this with a boolean instead of the types that I was using previously for the other columns.
0:33 Because I have the function there, it makes it really easy. I just specify dtype.
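The cleanup function itself is not shown in this lecture, so the following is only a minimal sketch of what such a helper might look like. The name `cleanup`, its signature, and the sample data are assumptions; the one detail taken from the lecture is that the target dtype is passed in as an argument.

```python
import pandas as pd

def cleanup(ser: pd.Series, dtype: str) -> pd.Series:
    """Hypothetical helper: parse strings as numbers, then cast to dtype.

    errors="coerce" turns anything unparseable (such as "?") into a
    missing value. Note that it would also hide other unexpected strings.
    """
    numeric = pd.to_numeric(ser, errors="coerce")
    return numeric.astype(dtype)

# Invented sample values standing in for the raw fbs column.
fbs = pd.Series(["0", "1.0", "?", "0.0", "1", "?", "0"], name="fbs")

# "boolean" is pandas' nullable boolean dtype, so missing entries
# stay missing (<NA>) rather than being forced to True or False.
cleaned = cleanup(fbs, dtype="boolean")
print(cleaned.value_counts())
```

Using the nullable "boolean" dtype rather than plain bool is a deliberate choice in this sketch: plain bool has no way to represent a missing entry.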
0:37 Let's do that and look at the value counts. That looks like it did work. We also could come in here and say dropna=False
0:46 to see the missing values, and you can see that 90 of those are missing. Again,
0:52 this would be something where I would want to go back to a subject matter expert and ask them why these values are missing.
0:57 If there is not an entry in here, is that the same as a false? That would be a question that I want to make sure I have an answer to.
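As a rough illustration of the dropna=False check and of the open question about missing entries, here is a small sketch. The data is invented, and the fill step is shown only to make the choice explicit, not as a recommendation.

```python
import pandas as pd

# Hypothetical cleaned column using pandas' nullable boolean dtype.
fbs = pd.Series(
    [False, True, pd.NA, False, True, pd.NA, False],
    dtype="boolean",
    name="fbs",
)

# dropna=False keeps missing entries as their own row in the counts.
with_missing = fbs.value_counts(dropna=False)
print(with_missing)

# The number of missing entries can also be counted directly.
n_missing = int(fbs.isna().sum())
print(n_missing)

# Only if a subject matter expert confirms that "no entry" means False
# would filling be justified; otherwise the values should stay <NA>.
filled = fbs.fillna(False)
```

Keeping the column as nullable boolean until that question is answered avoids silently treating "not recorded" as "false".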

