Data Science Jumpstart with 10 Projects Transcripts
Chapter: Project 7: Working with Movie Review Text Data in Pandas
Lecture: Predicting Values with XGBoost and Pandas
0:00
In this section I'm going to show you how to make a prediction for a new review.
0:05
Okay, so let's make a review here. I'm going to say xnew. This is a data frame. It has some review text in it.
0:12
What am I going to do? I'm going to pull out the text. I'm going to call my removeStop. I'm going to stick that into my vectorizer.
0:19
And I'm just going to call transform. And then I'm going to stick that into a data frame. And that should give me my xnew.
0:27
Let's run that. There is my xnew. So I've got three reviews: "I hated this movie." "This was the best movie."
0:35
"I think I know how I felt about this movie. Both good, but weird parts."
0:40
Okay, so let's make a prediction. And to make a prediction, all we have to do is pass our data frame to predict on that XGB model.
0:47
And so this says that the first one was zero. "I hated this movie," so that's a negative review. The next one was "best movie." It got a one, a positive review.
0:56
This one says "I think I know how I felt. Both good, but weird parts." And the model said that that is a positive review.
1:04
One of the cool things about this is we can call predict_proba. This gives the probability that each review is negative or positive.
1:10
So for the first one: this is the negative column right here, and this is the positive column. So it is 90% negative.
1:17
This next one is 71% positive. And this one here is 54% positive. How do I know that the left side is negative and the right side is positive?
1:27
Because, remember, our y records whether a review is positive.
1:31
And so in this case, here are our classes. False corresponds to the left column, and True corresponds to the right column.
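The column-to-class mapping can be made explicit by labelling the probability columns with the classes. The sketch below does not call a model; it hard-codes the percentages quoted in the lecture (90% negative, 71% positive, 54% positive), with the other column filled in so each row sums to one. The names `proba` and `classes` are illustrative stand-ins for what predict_proba returns and what the model reports as its classes.

```python
import numpy as np
import pandas as pd

# Probabilities as narrated in the lecture, rounded to two decimals.
# Column 0 is the first class, column 1 is the second class.
proba = np.array([
    [0.90, 0.10],  # "I hated this movie"
    [0.29, 0.71],  # "This was the best movie"
    [0.46, 0.54],  # "Both good, but weird parts"
])

# Stand-in for the model's classes: False (negative) first, True (positive) second.
classes = np.array([False, True])

# Labelling the columns with the classes removes the guesswork
# about which side is negative and which is positive.
proba_df = pd.DataFrame(proba, columns=classes)

# Picking the most probable class per row reproduces what predict returned.
predicted = classes[proba.argmax(axis=1)]
print(proba_df)
print(predicted)
```

The column order follows the order of the classes, which is why the left column is the negative probability and the right column is the positive one.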