#100DaysOfCode in Python Transcripts
Chapter: Days 46-48: Web Scraping with BeautifulSoup4
Lecture: A quick BS4 overview
0:00 I just thought I'd give a quick overview
0:02 for anyone who hasn't dealt with Beautiful Soup 4 before.
0:06 So if you haven't, feel free to keep watching,
0:09 but if you have,
0:10 skip on over because I'm just going to be repeating myself.
0:13 Now, Beautiful Soup 4 allows you to parse web pages.
0:18 Okay, we've all dealt with requests by now
0:21 and we know that by using requests,
0:23 we can pull down the code behind a web page, right?
0:27 And we can then use Beautiful Soup 4
0:30 to parse that data.
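
As a rough sketch of those two steps, requests downloads the HTML and Beautiful Soup 4 turns it into something searchable. The helper names fetch_html and parse_html and the sample markup are assumptions made for illustration, not code from the course. The demo parses an inline string so it runs without a network connection.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download the raw HTML behind a page (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_html(html: str) -> BeautifulSoup:
    """Parse an HTML string with Python's built-in parser."""
    return BeautifulSoup(html, "html.parser")


# Made-up markup standing in for what fetch_html(url) would return.
sample_html = (
    "<html><head><title>Sample page</title></head>"
    "<body><p>Hello</p></body></html>"
)
soup = parse_html(sample_html)
print(soup.title.string)
```

For a live page, the string returned by fetch_html(url) would be passed to parse_html in place of sample_html.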
0:32 All right, I'll tell you what I mean.
0:34 So, let's view the page source
0:36 for our PyBites Code Challenges page.
0:40 And here you'll see all of your HTML.
0:44 Now, say I wanted specifically
0:46 to get all of our code challenge names
0:50 and just put them in a list to use
0:52 in some sort of an email
0:54 or whatever application I can think of, right?
0:57 Well, how am I going to do that?
0:59 If you go into the page source,
1:01 the first thing
1:02 you need to do
1:03 is find that data in the code
1:07 and here it is.
1:08 It's an unordered list
1:09 with an ID of article list and a class of article list.
1:13 Okay, and then all of our different challenge headers
1:18 are stored in list item elements, okay?
1:22 Now, with that information in hand,
1:25 we can then use Beautiful Soup 4 to search this page,
1:29 remembering that requests will pretty much pull down
1:32 this page looking like this,
1:34 and Beautiful Soup 4 will parse that,
1:37 and we can then tell it what to look for.
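
Telling Beautiful Soup 4 what to look for might go roughly like this: find the unordered list by its ID, then collect the text of each list item inside it. The markup below is invented for illustration, and the exact spelling of the ID ("article-list" here) is an assumption, since the transcript only gives it as spoken. The function name challenge_names is also hypothetical.

```python
from bs4 import BeautifulSoup

# Invented stand-in for the page source; the real markup may differ.
SAMPLE_HTML = """
<html><body>
<nav><ul><li>Home</li></ul></nav>
<ul id="article-list" class="article-list">
  <li><a href="#">Challenge 01</a></li>
  <li><a href="#">Challenge 02</a></li>
  <li><a href="#">Challenge 03</a></li>
</ul>
</body></html>
"""


def challenge_names(html: str, list_id: str = "article-list") -> list:
    """Return the text of every <li> inside the <ul> with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    article_list = soup.find("ul", id=list_id)
    if article_list is None:
        # The list is not on this page, so there is nothing to collect.
        return []
    return [li.get_text(strip=True) for li in article_list.find_all("li")]


names = challenge_names(SAMPLE_HTML)
print(names)
```

Searching from the matched list rather than the whole page is what skips unrelated items, such as the navigation entry in the sample.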
1:40 And now you can start imagining
1:42 the cool things you can do with this.
1:44 So, we can skip all of this junk up here,
1:48 all of this code that we don't care about,
1:50 and drill straight down to the list that we do care about.
1:54 And you can use this on any site that you can think of.
1:57 You can search by all sorts of different criteria.
2:00 And we're going to show that in the next video.
2:04 So, get excited, because this is really, really fun stuff.
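
A few of the search criteria mentioned above, sketched against invented markup: by tag name, by class, by ID, by CSS selector, and by the presence of an attribute. The element names, classes, and links are assumptions chosen for the example and are not taken from the PyBites page.

```python
from bs4 import BeautifulSoup

# Invented markup used only to demonstrate the different lookups.
SAMPLE_HTML = """
<div id="main">
  <h1 class="title">Challenges</h1>
  <ul class="article-list">
    <li><a href="/challenge-1">First</a></li>
    <li><a href="/challenge-2">Second</a></li>
  </ul>
  <p class="note">Footer note</p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# By tag name: every <a> element on the page.
by_tag = [a.get_text() for a in soup.find_all("a")]

# By class: class_ has a trailing underscore because class is a Python keyword.
by_class = soup.find("p", class_="note").get_text()

# By ID: the name of the element that carries id="main".
by_id = soup.find(id="main").name

# By CSS selector: list items nested inside the article list.
by_css = [li.get_text(strip=True) for li in soup.select("ul.article-list li")]

# By attribute: only <a> elements that actually have an href.
by_attr = [a["href"] for a in soup.find_all("a", href=True)]

print(by_tag, by_class, by_id, by_css, by_attr)
```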