Python for Decision Makers and Business Leaders Transcripts
Chapter: Data science in Python
Lecture: Counting the domains

0:00 We're so close to having the answer. We've gone and downloaded the data, we parsed it apart as XML, we started working through it,
0:07 and then we said, well, each one of these has embedded HTML, which is all sorts of yucky, but we can use Beautiful Soup to pull the pieces out.
0:14 And now we've found there's 799 unique domains and 2,824 total. What do we do now? Well, the last thing to do is figure out
0:22 how many times each one appears. That may sound complicated, and in some languages it is, but watch this. We talked about Python's "batteries included."
0:34 Well, one of those batteries, one of those things in the standard library, is something called the collections module.
0:41 Like this, and it has this thing called a Counter, and to the Counter we can give the things we want it, well, to count. What are we getting?
0:51 Well, it has, oh, looks like it's already got some stuff here, like GitHub this many and so on, but what I'd like is to sort it.
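As a rough sketch of the step just described, assuming the domains are held in a plain list of strings (the variable names and the sample data here are made up for illustration, not the lecture's actual code or data):

```python
from collections import Counter

# Hypothetical sample data; the real list in the lecture has 2,824 entries.
domains = [
    "github.com", "twitter.com", "github.com",
    "youtube.com", "github.com", "twitter.com",
]

# Counter tallies how many times each distinct item appears.
counter = Counter(domains)

# Look up a single tally, and count the unique domains.
print(counter["github.com"])
print(len(counter))
```

A Counter behaves like a dictionary that maps each item to its count, so the number of keys is the number of unique domains.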
0:56 So we can say the most common is going to be counter dot, you guessed it, most_common. And if we just print that out, there you go.
1:07 GitHub, 447 references; Twitter, 202; Python Bytes, which maybe we'd exclude ourselves, maybe not; YouTube; and so on. Maybe we only want the top 25
1:19 'cause you don't want to graph all of them, you just want to see the most important ones. So we can come down here and do one,
1:23 literally one more line: say give me all the items from 0 to 25, and we just show the top 25, and there they are. Those are the top 25.
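The sorting and slicing steps might look something like the following. This is a minimal sketch with made-up sample data, and it keeps the top 2 rather than the top 25 only because the sample is tiny:

```python
from collections import Counter

# Hypothetical sample data standing in for the lecture's list of domains.
domains = [
    "github.com", "twitter.com", "github.com",
    "youtube.com", "github.com", "twitter.com",
]

counter = Counter(domains)

# most_common() returns (item, count) pairs, highest count first.
most_common = counter.most_common()

# Slice to keep only the first few entries (the lecture uses 0 to 25).
top = most_common[0:2]

for domain, count in top:
    print(domain, count)
```

Counter.most_common also accepts a number directly, so `counter.most_common(25)` would give the same top 25 without a separate slice.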
1:33 This is the kind of stuff that makes Python so useful: it's just a couple of steps, a couple of lines. You don't...
1:40 There's no algorithmic thinking here. I don't have to come up with an algorithm where I could make mistakes or have to spend time working on it.
1:47 No, I just grab the right thing, ask the right question, and boom, out comes the answer.
