Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 7: Working with Movie Review Text Data in Pandas
Lecture: Load movie review text data from a directory
0:00
We're going to load our libraries, and I've got some data that I've downloaded here. These are reviews of movies. Here's the README from that.
0:21
So this is the Large Movie Review Dataset. It's organized as directories on the file system.
0:28
It's got training data with positive and negative samples, and the reviews look something like this. So here's our code to load the data.
0:39
This is a little bit more involved, but I've got a directory here, and inside of that there's a positive directory and there's a negative directory.
0:48
I have this function up here that will traverse those directories, build a data frame for each one, and then concatenate them.
0:59
Once I've got those, I'm going to drop the index and I'm going to change some types on those. Let's run that and look at a sample of that.
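The loading code itself isn't shown in the transcript, so here is a minimal sketch of what such a loader could look like. The names `load_reviews` and `load_train`, the `pos` and `neg` directory names, the `ID_RATING.txt` file naming, and the specific dtypes are all assumptions for illustration, not the code from the video.

```python
from pathlib import Path

import pandas as pd

COLUMNS = ["id", "rating", "sentiment", "text"]


def load_reviews(directory, sentiment):
    """Read every .txt file in `directory` into one data frame.

    Assumes file names of the form ID_RATING.txt (for example 0_9.txt).
    """
    rows = []
    for path in sorted(Path(directory).glob("*.txt")):
        review_id, rating = path.stem.split("_")
        rows.append(
            {
                "id": review_id,
                "rating": rating,
                "sentiment": sentiment,
                "text": path.read_text(encoding="utf-8"),
            }
        )
    return pd.DataFrame(rows, columns=COLUMNS)


def load_train(root):
    """Load the positive and negative directories and combine them."""
    root = Path(root)
    df = pd.concat(
        [load_reviews(root / "pos", "pos"), load_reviews(root / "neg", "neg")]
    )
    # Replace the repeated per-directory index with a fresh 0..n-1 index,
    # then convert the string columns to more specific types.
    return df.reset_index(drop=True).astype(
        {"id": "int64", "rating": "int8", "sentiment": "category"}
    )
```

Calling `load_train` on the training directory would then return one data frame with a row per review file.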
1:10
So we can see that we have this review text here. Here's what our data frame looks like. We have 600, two rows, and four columns.
1:29
So we've got an ID, a rating, a sentiment, and the text.
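As a rough sketch of that inspection step, the snippet below builds a tiny stand-in data frame with the same four columns and looks at a sample and at the shape. The rows are made-up placeholders, not reviews from the dataset, and the variable name `reviews` is an assumption.

```python
import pandas as pd

# Placeholder rows with the same four columns described in the lecture.
reviews = pd.DataFrame(
    {
        "id": [0, 1, 2],
        "rating": [9, 2, 8],
        "sentiment": ["pos", "neg", "pos"],
        "text": ["placeholder text a", "placeholder text b", "placeholder text c"],
    }
)

# Pull a couple of random rows to eyeball the review text.
sample = reviews.sample(2, random_state=42)
print(sample)

# shape is a (rows, columns) tuple.
n_rows, n_cols = reviews.shape
print(n_rows, n_cols)
```

Passing `random_state` makes the sample repeatable between runs, which is handy when comparing notes with someone else.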