#100DaysOfCode in Python Transcripts
Chapter: Days 58-60: Twitter data analysis with Python
Lecture: Build a Twitter wordcloud
0:00
Alright, our last part: building a Wordcloud. I left the install step in the notebook, although we already prepped and have all the requirements installed.
0:11
One way you could also do it is to run pip install inside the notebook using an exclamation mark,
0:18
but make sure that your virtual environment is activated so you don't install it into your global Python environment. But we already have Wordcloud,
0:26
so let's move on and get all the tweets, but this time we filter out the retweets and mentions. And as the code is pretty similar to the last video,
0:34
I'm just going to copy-paste it here. It loops over the tweets, and it excludes the ones that start with "RT" or with the at sign.
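A minimal sketch of that filter, assuming the tweets are available as plain strings (in the notebook they are tweet objects from the previous video, so the real code would read the text off each object). The function name and the sample tweets are made up for illustration.

```python
def exclude_retweets_and_mentions(texts):
    """Keep only tweets that are neither retweets ("RT ...") nor mentions ("@...")."""
    # str.startswith accepts a tuple, so one call checks both prefixes.
    return [text for text in texts if not text.startswith(("RT", "@"))]


# Made-up sample data standing in for the real timeline.
sample_tweets = [
    "RT @someone: a retweet we want to skip",
    "@someone thanks, this is a mention",
    "New Python code challenge is up",
    "Learning Flask today",
]

clean_tweets = exclude_retweets_and_mentions(sample_tweets)

# The word cloud wants one big string, so join the remaining tweets.
all_text = " ".join(text.lower() for text in clean_tweets)
print(all_text)
```

Matching on the bare prefix "RT" follows the description in the video; a stricter prefix such as "RT @" would avoid dropping ordinary tweets that merely begin with those two letters.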
0:43
So, retweet and mention. And that should give us a clean list for the Wordcloud. Now, here's the wordcloud module.
0:53
It's a little Wordcloud generator in Python and you can just feed it a bunch of text and it comes up with this nice output.
1:01
You can put a mask on it to get the words in the shape you want. And I'm going to use that to put the words in the shape of our PyBites logo.
1:11
So let's make the Wordcloud. I'm going to type it out because it's a bit of code, and then I'll come back and explain it line by line.
1:32
And this takes a while. It's doing a lot of processing in the background. So let's wait for it to come back. Cool. We got a Wordcloud object.
1:41
Let me quickly highlight what happened. First, we made a PyBites mask by calling Image.open on a PyBites logo I have in my directory.
1:50
Image is from the Pillow library. Then we make a set of stop words from STOPWORDS, which we imported at the beginning
1:59
and which is part of the wordcloud module. Through some trial and error, I found I had to add "co" and "https"
2:08
because those were common words. They're false positives because they come from Twitter links, and, yeah. We don't want to have these misrepresent
2:17
our Twitter word frequencies, so we add them to the stop words. Then we make the WordCloud object. We give it a white background, max words 2000,
2:26
you would have to experiment with your own data set to find the best value here. We pass in the mask and the stop words.
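To see why "co" and "https" turn up at all, here is a small standard-library sketch. It assumes a simple word-splitting regex that keeps runs of two or more word characters; the library's own tokenizer may differ in detail, and the sample text and function name are made up.

```python
import re
from collections import Counter

# Assumption: a simple tokenizer that keeps words of two or more characters.
TOKEN_PATTERN = r"\w[\w']+"


def count_words(text, stopwords=frozenset()):
    """Count lowercase tokens in text, skipping anything in stopwords."""
    tokens = re.findall(TOKEN_PATTERN, text.lower())
    return Counter(token for token in tokens if token not in stopwords)


# Made-up tweets; every shortened Twitter link has the form https://t.co/...
sample = (
    "new python article https://t.co/abc123 "
    "and a flask tip https://t.co/xyz789"
)

raw_counts = count_words(sample)
clean_counts = count_words(sample, stopwords={"co", "https"})

print(raw_counts["https"], raw_counts["co"])  # link fragments dominate
print("https" in clean_counts, "co" in clean_counts)
```

Each link is split at the punctuation, so every tweet with a link contributes one "https" and one "co", which is why they crowd out the real vocabulary unless they are added to the stop words.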
2:33
Then we generate the Wordcloud, passing in the string of all the Tweets we defined earlier. Next up, I want to show the Wordcloud in the browser.
2:43
And we're going to use a little bit of matplotlib to do that. This might take a bit as well. Alright. That looks better.
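A rough sketch of the matplotlib part, assuming the cloud is handed over as an image array. The function name and the default figure size are illustrative choices, not taken from the notebook.

```python
import matplotlib.pyplot as plt


def show_cloud(image, size=(15, 15)):
    """Draw an image array without axes and return the figure."""
    fig, ax = plt.subplots(figsize=size)
    # Bilinear interpolation smooths the text when the image is rescaled.
    ax.imshow(image, interpolation="bilinear")
    # Hide ticks and frame so only the cloud itself is visible.
    ax.axis("off")
    return fig
```

With a generated cloud named `wc` (a placeholder), `show_cloud(wc.to_array())` would draw it in the notebook; in a plain script, follow it with `plt.show()`.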
3:01
And look at that! We got the Wordcloud in the form of our PyBites logo. By the way, this is our logo and mask, so you see the similar shape.
3:13
And look at that. I mean, what's cool about this is that you really see what we're all about: 100 Days Of Code, Python, Code Challenge,
3:21
API, Django, PacktPub, Twitter. So those are really the things that stand out, Flask too, of course, so very cool. And it's nice that you can just import the module.
3:33
Three lines of code to create the object and four lines of code to make the image, basically, and you're set. I mean, it's pretty impressive.
3:42
That's a wrap for this lesson. I hope you liked it and got a taste of how to get data from the Twitter API and do a bit of analysis on that data.