Python for the .NET Developer Transcripts
Chapter: Computational notebooks
Lecture: Graphing the popular domains
0:00 Well we're almost ready. I guess I'll leave this bit here. I'm not super fond of it, but it's okay. We want to add a new cell.
0:08 I'll convert that to Markdown, "Summary of popular domains," then we hit B to add another. Now what we're going to do here
0:16 is we're going to do some graphing with a library called Matplotlib. There's a bunch of different graphing libraries in Python.
0:22 We're going to use Matplotlib for this one. Now we'll probably have to install it but let's go ahead and drop this import statement here.
0:28 We're going to import the plot and we're going to also use this library called NumPy. NumPy is a really cool library for us to work with.
0:35 It does high-performance multidimensional arrays, and Matplotlib uses that. Normally if we work with these plots it'll pop up in a separate window
0:44 but if we say %matplotlib inline (the percent sign is a special way to issue commands to Jupyter itself, not to Python), that says when you run Matplotlib
0:53 grab the graphs and display them inline. So this'll probably crash. Let's see if it does. Yes. Do we have Matplotlib and NumPy?
1:00 No, so let's go over here. So we want to go over to our GitHub repo and we're going to activate our virtual environment
1:07 and then we're going to pip install Matplotlib and NumPy. Alright, super! That seemed to work. Let's see if we can run this now.
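As a rough sketch of the setup cell described above (the transcript does not show the exact import lines, so the `plt` and `np` aliases here are the common convention and an assumption on my part), it might look something like this. The magic command is written as a comment because it is Jupyter syntax, not Python, and only works inside a notebook cell:

```python
# Inside a Jupyter cell, the first line would be the magic command:
#   %matplotlib inline
# It asks Jupyter to render Matplotlib figures in the notebook itself
# rather than in a separate window.
#
# If the imports below fail, install the packages into the active
# virtual environment from a terminal first:
#   python -m pip install matplotlib numpy

import matplotlib.pyplot as plt  # conventional alias (assumed, not shown in the lecture)
import numpy as np               # conventional alias (assumed, not shown in the lecture)

print("Matplotlib and NumPy imported")
```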
1:21 Success, number 71, execution 71. Maybe we could add a cell above it that's Markdown. Like that, so we could say import and set up Matplotlib.
1:35 Then we could define our actual code. Maybe this will be Markdown. All right and now what we're going to do
1:42 is I'm going to put some Matplotlib code here. I'm not going to type it out 'cause it's finicky and it's not particularly revealing.
1:48 So I'm just going to drop that there. So what we're going to do, going down here, is we're going to define the values
1:53 which, you remember, looking up here, is the second element of each entry. These are the numbers, the 23, the 21
2:01 the 20, and so on. And then I want the bins. These are the categories that these go into and this is going to be domain names,
2:09 Twitter and Google, and so on. And then I need the indices that we're going to put them in. These are like popular number 1, 2, 3, and so on.
2:20 And then we're going to say we want to do a bar graph with those values and then this is the number of referrals
2:25 the title is sites referred to by Python Bytes. We're going to put the bins as the labels but we want to put them vertically
2:32 'cause if you put 'em horizontally, 25 domain names on a little tiny screen, they just overlap each other.
2:37 So we've got to rotate them, so they go up and down, and then we're going to just pick the right y-axis based on a range over these values
2:46 in groups of 50. NumPy is sweet! Run that, so cool! There it is. Well, what is the most popular site, as far as we're concerned, on our podcast?
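The lecture pastes this plotting code in without showing it, so here is a minimal sketch of the steps just described: values, bins, indices, a bar chart, vertical labels, and y-axis ticks in steps of 50. The `popular` list, its counts, and the variable names are placeholders invented for illustration, not the course's actual data or code. The `Agg` backend line is only there so the sketch runs outside Jupyter; in a notebook you would rely on `%matplotlib inline` instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs outside a notebook (assumption)
import matplotlib.pyplot as plt
import numpy as np

# Placeholder (domain, count) pairs. The counts are invented for illustration.
popular = [
    ("github.com", 240),
    ("twitter.com", 180),
    ("youtube.com", 130),
    ("python.org", 120),
    ("medium.com", 90),
]

values = [count for _, count in popular]   # second element of each pair
bins = [domain for domain, _ in popular]   # category labels (domain names)
indices = np.arange(len(popular))          # bar positions 0, 1, 2, ...

fig, ax = plt.subplots()
ax.bar(indices, values)
ax.set_ylabel("Number of referrals")
ax.set_title("Sites referred to by Python Bytes")

# Put the domain names under the bars, rotated so they do not overlap.
ax.set_xticks(indices)
ax.set_xticklabels(bins, rotation="vertical")

# Y-axis ticks in steps of 50, reaching just past the largest value.
y_ticks = np.arange(0, max(values) + 50, 50)
ax.set_yticks(y_ticks)

fig.tight_layout()
```

Changing the bar width is then a one-argument tweak, for example `ax.bar(indices, values, width=0.5)`, which is the kind of quick adjustment a notebook cell makes cheap to try.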
2:57 The one that I do with Brian Okken. It is GitHub, and then Twitter, and then YouTube, and up to us, barely in fourth place, we have Python.org
3:06 and then Medium, Reddit, Real Python, and so on. Pretty awesome, huh? Hopefully as we went through this I took my time and I tried not to rush it
3:15 'cause I wanted you to get the feel of experiencing this type of environment. Working through code like this and so on
3:23 and we could even put more pictures and more results, and more summaries, right? It's all Markdown and images plus code.
3:29 I think this is pretty sweet right here. Let's do one thing. Let's just verify everything works because something funky about these notebooks
3:36 is they almost have like a goto style. Look at this, so this was the 10th thing run. This is the 15th. This down here is the 26th
3:45 and then back here was 27 and then down here who knows what the heck is down here. Here's 67, 68, 69.
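Those execution numbers are saved in the notebook file itself: an .ipynb file is JSON, and each code cell carries an `execution_count` field. As a small sketch, using a hypothetical, trimmed-down notebook with made-up counts and two helper functions that are my own illustration rather than part of Jupyter, you could check whether the cells were last run top to bottom like this:

```python
import json

# Hypothetical, trimmed-down notebook content. The counts are illustrative only.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Summary"]},
        {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": []},
        {"cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": []},
        {"cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": []},
        {"cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": []},
        {"cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": []},
    ],
})


def execution_counts(notebook):
    """Return the execution_count of each code cell, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook["cells"]
        if cell["cell_type"] == "code"
    ]


def ran_top_to_bottom(counts):
    """True if every code cell has run and the counts strictly increase."""
    if any(count is None for count in counts):
        return False
    return all(a < b for a, b in zip(counts, counts[1:]))


notebook = json.loads(notebook_text)
counts = execution_counts(notebook)
in_order = ran_top_to_bottom(counts)
print(counts, in_order)
```

After restarting the kernel and running all cells, the counts start again from 1 and follow document order, so a check like this would pass.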
3:53 These are in order, but I could like run this again. There's not a real order, is there? So let's go over here. Let's just say Run All Cells
4:02 so it starts at the top and works its way down. Either way, I've still got these numbers. You could even go close this. It says, do we want to save it?
4:10 Of course we really want to save that. We could create a new notebook from here which would be pretty awesome
4:16 but we're going to go and launch this one again. We could say clear all outputs so it resets it, and then we say run all cells.
4:27 A little bit better, right? At least we're running them in order here. We've got to restart our kernel, I think
4:32 to get it to completely rerun from scratch. Maybe there's a better way to do it but for me I probably would just restart the kernel
4:38 and rerun it, but this is awesome, right? Look how cool this is! This is so different than a regular application.
4:44 And remember if we want to go play with this we don't have to go re-download the stuff. We don't have to reparse 2.5 megs of XML and HTML.
4:54 We don't have to do all that distinct stuff. If I just want to make these a little skinnier I can make 'em skinnier. If I want to make 'em wider,
5:02 I can make them wider and I can just focus on this little bit of exploring my data. I'm going to put it back the way I liked it.
5:09 Here we go, that's pretty slick. We can go through here and explore the data in this really unique and different way in these computational notebooks.
5:17 So this is JupyterLab and Jupyter Notebooks, and Python.