Python for decision makers and business leaders Transcripts
Chapter: Data science in Python
Lecture: Graphing the top 25 domains
0:00 Finally, now that we have these top 25,
0:02 let's just do a quick graph.
0:04 Now I'm just going to copy some stuff over
0:06 because the graph is kind of nitpicky
0:07 and the details are not important.
0:09 So what we're going to do is we're going to put
0:10 in an import to use matplotlib and numpy
0:14 and then we'll just do the graph like this
0:16 and I just set the figure size.
0:18 Come up with the values, and I'm going to create
0:19 a histogram bar chart here.
0:22 We hit it, and look at that. We are done.
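
The graphing step described above might look roughly like the sketch below. It is not the code copied over in the lecture: the function names, figure size, and labels are assumptions, the counts are invented placeholders, and the built-in range stands in for numpy when positioning the bars.

```python
def chart_data(domain_counts):
    """Sort (domain, count) pairs by count, highest first, and split them."""
    ordered = sorted(domain_counts, key=lambda pair: pair[1], reverse=True)
    labels = [domain for domain, _ in ordered]
    values = [count for _, count in ordered]
    return labels, values


def plot_top_domains(domain_counts):
    """Draw a bar chart of the counts (hypothetical helper, needs matplotlib)."""
    # Imported here so chart_data works even where matplotlib is not installed.
    import matplotlib.pyplot as plt

    labels, values = chart_data(domain_counts)
    positions = list(range(len(labels)))

    fig, ax = plt.subplots(figsize=(12, 6))  # assumed figure size
    ax.bar(positions, values)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=90)  # domain names read better vertically
    ax.set_ylabel("Number of links")
    ax.set_title("Top linked domains")
    fig.tight_layout()
    return fig


# Invented placeholder counts, not the real numbers from the feed.
sample_counts = [("twitter.com", 3), ("github.com", 5), ("pythonbytes.fm", 2)]
labels, values = chart_data(sample_counts)
print(labels, values)
```

In a Jupyter notebook, calling plot_top_domains(sample_counts) in a cell would display the figure.
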
0:26 We have GitHub and Twitter and Python Bytes
0:28 and now you can really tell what is important
0:30 and what isn't.
0:31 And with that long tail, it tails off quickly, doesn't it?
0:34 So we've done it. We've gone over to Python Bytes,
0:38 downloaded the RSS feed, 2.5 megs of it.
0:41 We've pulled it apart to just get the description
0:44 and then we kind of made a note to ourselves
0:47 and then for each one of those, we said let's parse that HTML and just get the links.
0:50 From the links, we're going to get the domains
0:53 and once you have the domains
0:54 you just do a quick count and boom, here you go.
0:56 Throw that at the graph in matplotlib.
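
The middle of that recap (parse each description's HTML, pull out the links, reduce them to domains, and count) can be sketched with the standard library alone. This is an illustration, not the notebook code from the lecture: LinkCollector and count_domains are hypothetical names, the sample descriptions are invented, and downloading the RSS feed is left out.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def count_domains(descriptions, top_n=25):
    """Return the top_n most linked domains across all HTML descriptions."""
    counts = Counter()
    for html in descriptions:
        parser = LinkCollector()
        parser.feed(html)
        parser.close()
        for link in parser.links:
            domain = urlparse(link).netloc.lower()
            if domain:  # skip relative links, which have no domain
                counts[domain] += 1
    return counts.most_common(top_n)


# Invented stand-ins for the episode descriptions in the feed.
sample_descriptions = [
    '<p>See <a href="https://github.com/a/b">this repo</a> and '
    '<a href="https://twitter.com/x">this tweet</a>.</p>',
    '<p>More at <a href="https://github.com/c/d">another repo</a>.</p>',
]

top = count_domains(sample_descriptions)
print(top)
```

The resulting list of (domain, count) pairs is the kind of input a bar chart can be drawn from.
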
0:59 Done. Besides the graph part,
1:02 I don't think there's anything that's super complicated.
1:04 Maybe not everything you're familiar with
1:06 but I hope you were able to follow along
1:07 and I hope this experience really showed you the power
1:10 of Jupyter and Python for understanding data
1:14 and just going out to, not just some local data source
1:17 but going to the internet and grabbing that data
1:19 and then turning it into insight
1:21 within just a couple of minutes, beautiful.