#100DaysOfCode in Python Transcripts
Chapter: Days 73-75: Automate tasks with Selenium
Lecture: Demo 1: access my Packt ebook collection
0:00 All right, let the fun start. Let's look at a more practical example. You're probably familiar with packtpub.com.
0:08 They have a daily free eBook, which I've been collecting for the last few months, so I have a bunch of eBooks in my account,
0:16 but my account is obviously behind a login. So let's write a script to log in to Packt, go to my account details,
0:24 follow the My eBooks link, and make a list of all the books it finds there. Then we retrieve the book titles and URLs. So let's get coding.
0:34 First of all, I don't want to store my login or password in my script, so we need to load them from the environment.
0:43 One way you can do that in Python is with import os and os.environ.get, and let's say we call the variables packt_user and packt_password.
0:57 We store them in user and password. And you see, I already set them in the environment. I will show you how to do that next,
1:08 so let's go back to the terminal and make sure you have your virtual environment deactivated. Then open venv/bin/activate,
1:17 go to the end, and add an export of packt_user and an export of packt_password. If you want to follow along, set those to the values
1:35 of your login and save the file. Activate the virtual environment again (I'm using an alias for this), and now you should have them
1:44 in your environment variables. And it means they will be accessible to your script. All right with the user and password set, let's log in to the site.
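A minimal sketch of the credential loading described above. The helper name load_credentials and the upper-case variable names PACKT_USER and PACKT_PASSWORD are assumptions for illustration; use whatever names you exported at the end of venv/bin/activate.

```python
import os


def load_credentials(environ=os.environ):
    """Return (user, password) read from environment variables.

    PACKT_USER and PACKT_PASSWORD are assumed names; they must match the
    export lines you added to venv/bin/activate.
    """
    user = environ.get("PACKT_USER")
    password = environ.get("PACKT_PASSWORD")
    if not user or not password:
        # Fail early instead of sending empty values to the login form.
        raise RuntimeError("Set PACKT_USER and PACKT_PASSWORD in the environment")
    return user, password
```

Because os.environ.get returns None for a missing variable, checking the values up front gives a clearer error than a failed login later on.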
2:00 So this is the login site and let's initialize a driver. And let's get the page, then on the page, let's find
2:18 the actual login form which we can do with find_element_by_id. And first I looked at the page source to see how the user and password fields are named.
2:28 The user field is named edit-name, and you want to send keys to it, which basically means sending data into that form input field, here user.
2:41 Here we do the same for password, and the password field is named pass, and here we want to send it our password,
2:52 and importantly we want to make sure we hit Enter after that last value. Running this, Selenium opens the browser, goes to the login page,
3:03 fills in my email and my password, and hits Enter. Look at that, it logged into my account. How cool is that?
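The login step might look roughly like the sketch below. The function name login, the login_url parameter, and the default field ids ("edit-name" and "pass", as heard in the lecture) are assumptions; check the page source for the real ids. The find_element_by_id helper used in the lecture has been removed in recent Selenium 4 releases, so this sketch uses find_element with By.ID instead.

```python
def login(driver, login_url, user, password,
          user_field_id="edit-name", password_field_id="pass"):
    """Open the login page, fill in both fields, and submit with Enter.

    The field ids are assumed defaults taken from the lecture; the site's
    markup may differ.
    """
    # Imported here so the helper can be defined without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver.get(login_url)
    driver.find_element(By.ID, user_field_id).send_keys(user)
    # Appending Keys.RETURN presses Enter after the password, submitting the form.
    driver.find_element(By.ID, password_field_id).send_keys(password + Keys.RETURN)


# Usage (needs Selenium and a browser driver installed):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   login(driver, "<URL of the login page>", user, password)
```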
3:16 Now we're logged into the page and move on to find my eBooks. As we saw there is a link on the page, My eBooks,
3:25 so we just need to find that link and click it. Before running that cell, let me show you where we are now and what the page looks like after clicking.
3:38 Now we are in account details. Run the cell. Now we're in My eBooks. How cool is that? I'm navigating this site through Selenium.
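Finding the link and clicking it can be sketched as follows, assuming the visible link text is exactly "My eBooks" as mentioned above; open_ebooks is a hypothetical helper name.

```python
def open_ebooks(driver, link_text="My eBooks"):
    """Click the link whose visible text matches link_text."""
    # Imported here so the helper can be defined without Selenium installed.
    from selenium.webdriver.common.by import By

    # By.LINK_TEXT matches the full visible text of an <a> element.
    driver.find_element(By.LINK_TEXT, link_text).click()
```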
3:53 Let's move on and extract the books. I'm going to use find_elements again, but now by class names because I saw
4:07 that the books are in elements with the class product-line, and I store the result in elements. Right, a couple of Selenium web elements, cool.
4:25 I can write a dictionary comprehension to extract the nid (N-I-D), which is the identifier Packt is using, and the title.
4:36 I'm going to store that in books. I'm using get_attribute, with nid as the key and
4:54 title as the value, for e in elements. Look at that, all the books in my account. Good, I think we're done now, so let's close the driver,
5:11 which actually closes the browser. All right, and boom. You cannot see it, but that closed the Chrome browser I had open.
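The dictionary comprehension described above can be sketched like this. The helper name extract_books is an assumption; the attribute names nid and title and the class name product-line come from the lecture. It only relies on each element having a get_attribute method, which Selenium web elements provide.

```python
def extract_books(elements):
    """Map each book's nid attribute to its title attribute."""
    return {e.get_attribute("nid"): e.get_attribute("title") for e in elements}


# Usage with a live, logged-in driver (needs Selenium installed):
#   from selenium.webdriver.common.by import By
#   elements = driver.find_elements(By.CLASS_NAME, "product-line")
#   books = extract_books(elements)
#   driver.close()  # closes the browser window once the data is collected
```

Building the dictionary before closing the driver matters, because the web elements can no longer be queried once the browser is gone.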
5:19 Now that we have the data in a structure, I can just write a little bit of code to get the book. And to keep the focus on Selenium,
5:27 I'm just going to copy that code in. We have the download URL, which I extracted from the HTML. We have the id and the format of the book we want.
5:37 Possible formats are PDF, EPUB, and MOBI. We write a function called get_books, which greps my books for a string, checks
5:46 if the book format is valid, and then just loops through the titles. It does a regular expression match on the title
5:53 and it gives me the title and URL. The next step would then be to actually download the book to my desktop, but that's out of the scope of this lesson.
6:02 Let's try it out. As it's just a regular expression, I can do regular-expression-like searches. I want all the MOBI files for Python data books. Nice.
6:19 I want the books for machine learning and I want the format of PDF. It should also work in uppercase. There you go. A little useful script.
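The lecture's get_books code is not shown in the transcript, so the following is only a sketch of the behavior described: validate the format, loop through the titles, match a regular expression, and yield the title with a download URL. The URL template is a placeholder (the real one was extracted from the site's HTML), and the sample titles in the usage comment are made up.

```python
import re

# Placeholder template; the real download URL came from the page's HTML.
DOWNLOAD_URL = "https://example.com/ebook_download/{nid}/{ebook_format}"
BOOK_FORMATS = ("pdf", "epub", "mobi")


def get_books(books, grep, ebook_format):
    """Yield (title, url) for every title in books matching the regex grep.

    books maps nid to title, as built from the Selenium elements.
    """
    ebook_format = ebook_format.lower()  # accept "PDF" as well as "pdf"
    if ebook_format not in BOOK_FORMATS:
        raise ValueError(f"Format must be one of {', '.join(BOOK_FORMATS)}")
    pattern = re.compile(grep, re.IGNORECASE)
    for nid, title in books.items():
        if pattern.search(title):
            yield title, DOWNLOAD_URL.format(nid=nid, ebook_format=ebook_format)


# Usage with made-up data:
#   books = {"101": "Python Data Analysis", "102": "Machine Learning in Java"}
#   for title, url in get_books(books, "python.*data", "MOBI"):
#       print(title, url)
```

Since get_books is a generator, the format check runs when iteration starts rather than at the call itself.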
6:33 I won't spend too much time on it here because I really want to focus on Selenium, but the point is that once Selenium has loaded your data
6:40 into a structure (or you could dump it into a database table or whatever), it's easy to write a function to work with that data.