#100DaysOfWeb in Python Transcripts
Chapter: Days 77-80: Twitter and Slack bots
Lecture: Get a tweet with requests and BeautifulSoup
0:00
In this video we're going to scrape the code challenge grid: request the URL and extract the data from it. I'm going to use this URL for my pybot user
0:11
and this grid ID. And today is the first of October, so we should end up extracting this tweet. Back to the script.
0:23
I have my stub function, which takes the URL. And later we're going to use argparse to create that URL based on the username and project
0:33
that are passed into the script. So for now I'm going to just hard code the URL to get this piece of the script working.
0:46
First we use requests to get the content of that URL. And I already imported all the dependencies here.
1:03
And if that fails for some reason, I'm going to call raise_for_status. Let's check if this works. Awesome.
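As a minimal sketch of this step (not the exact code from the video): the function name `fetch_page` and the timeout value are assumptions made for illustration, while `requests.get` and `raise_for_status` are the calls mentioned above.

```python
import requests


def fetch_page(url):
    """Download a page and return its raw bytes (hypothetical helper)."""
    # The timeout is an assumption; the video does not mention one.
    resp = requests.get(url, timeout=10)
    # Raises requests.HTTPError for 4xx/5xx responses, does nothing otherwise.
    resp.raise_for_status()
    return resp.content
```

Calling `raise_for_status` right after the request means a bad URL fails loudly here instead of producing confusing parse errors later.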
1:28
No need for the extra call because it was called already here. Next step is to use BeautifulSoup to get the table rows of the page.
1:39
If you look at the HTML, the 100 days are wrapped in table rows. So let's extract those next.
1:58
The requests response object has a content attribute, and I have to specify the HTML parser.
2:18
First I go to the tbody. I get the first element and do a find_all of tr. It should give me a list of all the rows.
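Here is a rough sketch of that row extraction. The HTML fragment is invented for illustration (the real grid's markup, dates, and text are not shown in this transcript), and `get_table_rows` is a hypothetical name.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the grid page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Day</th><th>Activity</th></tr></thead>
  <tbody>
    <tr><td>3 <small>(01 October 2018)</small></td><td>third entry</td></tr>
    <tr><td>2 <small>(30 September 2018)</small></td><td>second entry</td></tr>
  </tbody>
</table>
"""


def get_table_rows(html):
    """Return the <tr> elements inside the first <tbody> of the page."""
    soup = BeautifulSoup(html, "html.parser")
    # First tbody on the page, then every row inside it.
    return soup.find_all("tbody")[0].find_all("tr")


rows = get_table_rows(SAMPLE_HTML)
print(len(rows))  # the header row lives in <thead>, so it is not counted
```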
2:41
Let's also print it just to see what it gives me. And I have to type this correctly. Awesome. And it didn't fail with an AssertionError
2:59
so this was also true. Now we need to match the tweet that is for today. So let's loop through the table rows. And let's extract the date string.
3:24
Let me just quickly look at the HTML again. In every table row there's a small tag which includes the date in parentheses.
3:37
So I'm going to find that small tag, get the text, and strip off the parentheses. And the nice thing about strip is that it can
3:50
take multiple characters. And I'm going to convert this date string into a datetime object. And remember this was the exercise we kicked
4:07
off the last 100 days with. It's a useful utility, strptime: it extracts a datetime from a string. And for that I need to specify the format
4:20
which is day, month name, year. And month name is %B. And a four-digit year is %Y, with an uppercase Y.
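A small sketch of the strip and strptime steps follows. The sample date text and the exact format string `"%d %B %Y"` are assumptions based on the "day, month name, year" description; `%B` matches a full month name in an English locale.

```python
from datetime import datetime

# Assumed shape of the text inside the <small> tag.
raw = "(01 October 2018)"

# strip() accepts a set of characters and removes any of them from both ends.
date_str = raw.strip("()")

# Parse day, full month name, and four-digit year into a datetime object.
parsed = datetime.strptime(date_str, "%d %B %Y")

print(date_str)       # 01 October 2018
print(parsed.date())  # 2018-10-01
```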
4:40
Then we can check whether the date is today. If it's not today, we can just continue the loop. And again, today's entry is the one we'll find at the top.
4:59
If it is today we can find the 100 days activity field, which is this guy. And if I get the text of this table cell
5:14
or td, I should get the activity, this piece. And of course we're going to get this one because today's the first of October.
5:46
And I want the text extracted from that. And I return the tweet. So either this matches today and finds the text, or it never gets here because none
5:59
of the datetimes is today. Then I just return None. Let's see what that gives me. It takes a little bit because I'm doing
6:16
a request of course. Awesome, day three. And this is exactly what we wanted to achieve in this video. In the next video we're going to authenticate
6:31
with the Twitter API and tweet this out to my timeline. Exciting stuff.
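Putting the steps from this video together, a sketch of the whole lookup might look like the following. It is not the code from the video: the sample HTML, the `activity` class name, the date format, and the `today` parameter (added so the function can be tried without waiting for a matching date) are all assumptions.

```python
from datetime import date, datetime

from bs4 import BeautifulSoup

# Made-up stand-in for the grid page; class names and text are hypothetical.
SAMPLE_HTML = """
<table><tbody>
  <tr><td>3 <small>(01 October 2018)</small></td>
      <td class="activity">Day 3: placeholder activity text</td></tr>
  <tr><td>2 <small>(30 September 2018)</small></td>
      <td class="activity">Day 2: placeholder activity text</td></tr>
</tbody></table>
"""


def get_tweet(html, today=None):
    """Return the activity text for today's row, or None if no row matches."""
    if today is None:
        today = date.today()
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tbody")[0].find_all("tr"):
        small = row.find("small")
        if small is None:
            continue  # row without a date, skip it
        date_str = small.text.strip().strip("()")
        row_date = datetime.strptime(date_str, "%d %B %Y").date()
        if row_date != today:
            continue  # not today's entry, keep looking
        # "activity" is an assumed class name for the activity cell.
        return row.find("td", class_="activity").text.strip()
    return None  # no row carried today's date


print(get_tweet(SAMPLE_HTML, today=date(2018, 10, 1)))
```

With a date that appears in no row, the loop finishes without returning and the function falls through to `None`.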