#100DaysOfCode in Python Transcripts
Chapter: Days 82-84: Data visualization with Plotly
Lecture: Prep 2: useful data structures for plotting
0:00
So that was some prework, but what I have not spoken about yet is what we are going to plot. There are three graphs I want to make.
0:09
First, I want to make a bar chart of our posting activity: in which months did we put more content out, and in which months less?
0:16
Secondly, I want a pie chart with a breakdown of our categories. Each blog post has one category associated with it,
0:23
and that will show us what we blog about most. Similarly, that is also true for tags: tags give an indication of what we blog about.
0:32
But we use more tags than categories. One blog post can have ten tags, so it's a bit more granular. So it will still be another angle,
0:42
or another insight into our data. Next up, there are three exercises to get the data into a format from which I can easily make those three graphs.
0:53
So first of all, I want to have the published entries. So I'm going to use Counter to count all the entries by year and month.
1:02
And here's where the helper comes in, because we're going to use a list comprehension, which we dealt with in, I believe, Day 16, and we can say pub dates,
1:15
and make a list comprehension. And, I can just say for entry in entries, and entries is our complete RSS feed broken down
1:27
into nice entries by feedparser. And we saw an entry here, laid out. So I'm going to loop over these entries and for every entry, here's the helper,
1:37
I'm going to call convert to datetime on the entry, and I'll take the published field, and what's funny, I'm actually going to
1:44
refer to that field using the dictionary way. But I should actually be able to use dot notation, which is much nicer.
1:53
I pass that into convert to datetime, and that name, convert to datetime, is actually not 100 percent accurate.
2:00
It's more like, I mean that was the initial intent, but let's actually call it date, year, month. Because that's actually what it's returning, right?
2:13
So we should make our function names descriptive. And, yeah, let's print the first five to see if I'm going in the right direction. And I am.
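As a rough sketch of the list comprehension described above: the helper name `get_year_month`, its 'YYYY-MM' return format, and the sample entries are all assumptions for illustration. The real helper comes from the earlier prep lecture and is not shown here, and plain dictionaries stand in for the feedparser entries.

```python
from email.utils import parsedate_to_datetime


def get_year_month(date_str):
    """Hypothetical helper: turn an RSS date string into 'YYYY-MM'."""
    dt = parsedate_to_datetime(date_str)
    return f"{dt.year}-{dt.month:02d}"


# Made-up stand-ins for the entries that feedparser would return
entries = [
    {"published": "Sun, 07 Jan 2018 12:00:00 +0100"},
    {"published": "Fri, 19 Jan 2018 09:30:00 +0100"},
    {"published": "Thu, 01 Feb 2018 08:00:00 +0100"},
]

# One year-month string per entry, using dictionary-style access
pub_dates = [get_year_month(entry["published"]) for entry in entries]
print(pub_dates[:5])
```

With real feedparser entries, dot notation such as `entry.published` should work in place of `entry["published"]`.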
2:24
And the nice thing about Counter, as we've seen on Day 4 in the collections module lesson, is that I can give it a list of items,
2:34
and it just does a count. So if I want to have posts by month, Counter can just get this pub dates list, and look what happens. Wow. Boom.
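Here is a small sketch of what Counter does with such a list. The sample year-month strings and the variable names are made up for illustration; the hand-written loop is only there to show the bookkeeping that Counter takes care of.

```python
from collections import Counter

# Made-up year-month strings standing in for the real pub dates list
pub_dates = ["2018-01", "2018-01", "2018-02", "2018-01", "2018-02", "2018-03"]

# Counter tallies how often each item occurs
posts_by_month = Counter(pub_dates)
print(posts_by_month)

# The same tally written by hand, for comparison
manual_count = {}
for year_month in pub_dates:
    manual_count[year_month] = manual_count.get(year_month, 0) + 1
```

Because a Counter is a dictionary subclass, its keys and values can be handed to a plotting library as the x and y values of a bar chart.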
2:46
I mean I didn't have to keep track of, well, we saw that in the previous lesson, right, that you can write a manual loop
2:52
for all the items, keeping a count, etc. But this is all done in the standard library. Secondly, we need to break down the categories.
3:01
So, with a similar list comprehension, we're going to loop over the entries. But instead of getting the year and month,
3:09
I'm going to use the other helper we defined and just get category. And I'm going to do that on the link.
3:14
And those are not pub dates, those are categories. Again, Counter is your best friend. Tags are almost the same, so I'm going to just copy it over.
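The category step might look roughly like the sketch below. The real category helper was defined in an earlier lecture and is not shown here, so `get_category` is a made-up stand-in that assumes the category is the first path segment of the post's link. The example URLs are invented as well.

```python
from collections import Counter
from urllib.parse import urlparse


def get_category(link):
    """Stand-in helper (assumption): category = first path segment of the URL."""
    path = urlparse(link).path  # e.g. '/articles/decorators.html'
    return path.strip("/").split("/")[0]


# Made-up stand-ins for feedparser entries
entries = [
    {"link": "https://example.com/articles/decorators.html"},
    {"link": "https://example.com/challenges/challenge-01.html"},
    {"link": "https://example.com/articles/generators.html"},
]

# Same pattern as the pub dates, but with the category helper on the link
categories = [get_category(entry["link"]) for entry in entries]
categories_count = Counter(categories)
print(categories_count.most_common())
```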
3:31
Tags, that is actually a bit more complex. Let me go from the start, so for entry in entries, and here I have an exceptional case
3:39
for a nested list comprehension. For each entry, loop through the tags. And each tag has a term, let's lowercase
3:50
that to not have to deal with upper and lower case. So for each entry, because one entry has a list of tags, I'm looping through this list of tags,
3:59
and I'm taking out the term. That's what I'm basically doing. I lower case that tag, so we have all the tags, for all the entries.
4:06
And again, I can use a Counter to get that all counted up. Let's give most_common a limit of 20, and let's print the first five.
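A minimal sketch of the nested list comprehension just described. The sample entries are invented, with plain dictionaries standing in for feedparser entries, each holding a list of tags that carry a `term`. The variable names are assumptions too.

```python
from collections import Counter

# Made-up stand-ins for feedparser entries with their tags
entries = [
    {"tags": [{"term": "Python"}, {"term": "Plotly"}]},
    {"tags": [{"term": "python"}, {"term": "Counter"}]},
    {"tags": [{"term": "PYTHON"}]},
]

# Outer loop over the entries, inner loop over each entry's tags;
# lowercasing merges 'Python', 'python' and 'PYTHON' into one tag
tags = [tag["term"].lower() for entry in entries for tag in entry["tags"]]

# Keep at most the 20 most frequent tags, then look at the first five
top_tags = Counter(tags).most_common(20)
print(top_tags[:5])
```

The two `for` clauses read in the same order as the equivalent nested loops would: entries first, then the tags of each entry.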
4:18
And obviously Python is at the top. Right, that was a lot of preparation, but the good news is that the data is now in a structure that
4:26
we can easily make plots from.