Consuming HTTP Services in Python Transcripts
Chapter: Screen scraping: Adding APIs where there are none
Lecture: Controlling your user agent in requests

0:00 Here is a website we can use, called whatsmyuseragent.com, to check what it, and pretty much any other web server in the world,
0:09 thinks we are when we talk to it. So if we pull that up in Firefox and make a request to it,
0:16 you can see it says what your user agent is. Your user agent is something like this: Mozilla 5.0,
0:22 running on Macintosh, Intel, running OS X Sierra, and this version, Firefox 51.
0:27 And here, isn't this cool, I even have a public IPv6 address these days. I'm feeling so much like I'm in the future. Okay, so if we do a request
0:35 against that URL, it's going to redirect here, and it turns out that there is a special tag or id on that little section where it says
0:45 what our user agent is, so they can style it. Hm, I wonder if we could get that back.
0:49 So over here, what we're going to do is make a request, a get, and we're going to use Beautiful Soup.
0:55 This time, instead of using lxml, just to show you that you could use the other one, we'll use the built-in parser. Then we want to go
1:03 grab that user agent element, get its text, and print it out. So we could do that, and our reported user agent is, no surprise,
1:11 python-requests 2.13. In fact, I think we could do this as a select_one and drop this zero, and get the same effect.
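A minimal sketch of the step just described, assuming the user agent text sits in an element matched by the CSS selector `.user-agent`. The transcript does not name the actual tag or id, so that selector, the helper names, and the sample HTML are hypothetical; the URL is the one mentioned in the lecture and may not behave today as it does in the video.

```python
import requests
from bs4 import BeautifulSoup

URL = "http://whatsmyuseragent.com/"  # site named in the lecture
UA_SELECTOR = ".user-agent"           # assumption: the real selector is not given


def extract_user_agent(html, selector=UA_SELECTOR):
    """Return the text of the first element matching selector, or None."""
    # "html.parser" is the parser built into Python, used instead of lxml.
    soup = BeautifulSoup(html, "html.parser")
    # select_one returns a single element (or None), so there is no [0] to drop.
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


def fetch_reported_user_agent(url=URL):
    """Download the page and extract what the server thinks our user agent is."""
    response = requests.get(url)
    response.raise_for_status()
    return extract_user_agent(response.text)


# Offline check against a hypothetical snippet of the page.
SAMPLE_HTML = '<div class="user-agent"><p>python-requests/2.13.0</p></div>'
print(extract_user_agent(SAMPLE_HTML))
```

Calling `fetch_reported_user_agent()` performs the real request; the sample HTML only stands in for the page so the parsing can be tried without a network connection.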
1:21 Perfect, that's more like it. Okay, so how do we control this? Well, it's this step where we control
1:27 what gets sent to the server. We can do this with headers, because the user agent is just a header. So this is going to be a dictionary,
1:34 and the key is going to be User-Agent, and then what we put in here as the value is
1:39 whatever we want. You want it to think we're exactly what we were with Firefox? Fine. Oh, there might be a small problem if you don't pass it along.
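A rough sketch of passing the header along, assuming a Firefox 51 style string written out by hand (illustrative, not copied from the video). The helper names are hypothetical. To keep the example runnable offline, it inspects a prepared request instead of sending one; `send_with_user_agent` shows the plain `requests.get` call with the `headers` dictionary.

```python
import requests

URL = "http://whatsmyuseragent.com/"  # site named in the lecture

# Illustrative Firefox 51 on macOS Sierra string; an assumption, not from the video.
FIREFOX_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) "
    "Gecko/20100101 Firefox/51.0"
)


def prepare_get(url, user_agent=None):
    """Build, but do not send, a GET request so its headers can be inspected."""
    headers = {"User-Agent": user_agent} if user_agent else None
    session = requests.Session()
    # prepare_request merges the session's default headers with ours.
    return session.prepare_request(requests.Request("GET", url, headers=headers))


def send_with_user_agent(url, user_agent):
    """Actually send the request; the headers dict must be passed along."""
    return requests.get(url, headers={"User-Agent": user_agent})


default_request = prepare_get(URL)
firefox_request = prepare_get(URL, FIREFOX_UA)
print(default_request.headers["User-Agent"])  # default python-requests/<version>
print(firefox_request.headers["User-Agent"])  # the Firefox string above
```

If the `headers` argument is left out of the call, requests falls back to its default `python-requests/<version>` value, which is the small problem mentioned above.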
1:47 Want to be like Firefox? Boom, we are. We could even have fun with them, so you know, we're Mozilla 7, we're OS X 10.32, and we're even Firefox 54,
1:58 just to show you that we can put whatever we want here. Alright, maybe they think some super secret version of macOS is being prototyped
2:07 on their site, who knows. But see, we're running Firefox 54, never mind that the newest one is 51.
2:12 We can tell it this and it gets sent right along. So there are a couple of uses for this.
2:16 Like I said, you might want to specifically get the desktop site or the mobile site, and you can control whether you look mobile
2:24 or desktop by setting your user agent. You might also be getting blocked if you look like some kind of robot,
2:31 so you can look less robot-like by doing this. You also might want to set it to be your own custom thing.
2:38 So we don't want to do this, I'll save this for you, but maybe we want something else for the user agent.
2:43 Maybe we want to say "I am super user agent 007 version 0.1", right? Maybe you want to pass information that says this is actually this application
2:54 and it's this version that we're working with, so there might be a reason you want to pass that as well.
3:00 So we could be super user agent 007 version 0.1, whatever you want, right?
3:03 There are a couple of reasons, and the value you choose depends on what you're trying to do. But it can be important to control your user agent
3:11 because it can determine the HTML that you get.
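One way to identify your own application and version, sketched under the assumption that a `name/version` token is an acceptable format for the server you are talking to. The application name `SuperUserAgent007` echoes the joke in the lecture, and the helper functions are hypothetical. Setting the header on a `requests.Session` means every request made through that session carries it.

```python
import requests


def make_user_agent(app_name, version):
    """Format an application name and version as a product/version token."""
    return f"{app_name}/{version}"


def make_session(app_name, version):
    """Create a session whose requests all identify as this application."""
    session = requests.Session()
    # Overrides the default python-requests value for this session only.
    session.headers.update({"User-Agent": make_user_agent(app_name, version)})
    return session


session = make_session("SuperUserAgent007", "0.1")
print(session.headers["User-Agent"])  # SuperUserAgent007/0.1
```

A later call such as `session.get(url)` would send this value without needing a `headers` argument each time.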

