Python for the .NET developer Transcripts
Chapter: Computational notebooks
Lecture: Graphing the popular domains
0:00 Well we're almost ready.
0:01 I guess I'll leave this bit here.
0:03 I'm not super fond of it, but it's okay.
0:05 We want to add a new cell.
0:07 I'll convert that to Markdown,
0:09 "Summary of popular domains,"
0:13 then we hit B to add another.
0:14 Now what we're going to do here
0:15 is we're going to do some graphing
0:17 with a library called Matplotlib.
0:18 There's a bunch of different
0:20 graphing libraries in Python.
0:21 We're going to use Matplotlib for this one.
0:24 Now we'll probably have to install it
0:25 but let's go ahead and drop this import statement here.
0:27 We're going to import the plot
0:28 and we're going to also use this library called NumPy.
0:32 NumPy is a really cool library for us to work with.
0:34 It does high-performance
0:36 multi-dimensional arrays,
0:37 and Matplotlib uses that.
0:40 Normally if we work with these plots
0:42 it'll pop up in a separate window
0:43 but if we say %matplotlib inline (the percent sign is a special
0:46 way to issue commands to Jupyter itself,
0:48 not to Python), then when you run Matplotlib
0:52 Jupyter grabs the graphs and displays them inline.
0:54 So this'll probably crash.
0:55 Let's see if it does, yes.
0:57 Do we have Matplotlib and NumPy?
0:59 No, so let's go over here.
1:02 So we want to go over to our GitHub repo
1:04 and we're going to activate our virtual environment
1:06 And then we're going to pip install Matplotlib and NumPy.
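A quick way to answer "do we have Matplotlib and NumPy?" before the import cell crashes is to ask Python whether it can find them. This is only a sketch using the standard library; the helper name missing_packages is made up for illustration and is not something from the lecture.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["matplotlib", "numpy"])
if missing:
    # Run this in the terminal with the virtual environment activated.
    print("Install first: python -m pip install " + " ".join(missing))
else:
    print("Matplotlib and NumPy are available.")
```

If the list comes back empty, the import cell should run without the pip install step.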
1:17 Alright, super! That seemed to work.
1:19 Let's see if we can run this now.
1:20 Success, number 71, execution 71.
1:24 Maybe we could add a cell above it that's Markdown.
1:31 Like that, so we could say
1:32 import and set up Matplotlib.
1:34 Then we could define our actual code
1:36 maybe this will be Markdown.
1:40 All right and now what we're going to do
1:41 is I'm going to put some Matplotlib code here.
1:43 I'm not going to type it out
1:44 'cause it's finicky and it's not particularly revealing.
1:47 So I'm just going to drop that there.
1:49 So what we're going to do
1:50 it's going to go down here
1:51 we're going to define the values
1:52 which, you remember looking up here,
1:56 is the second element of each entry.
1:58 These are the numbers, the 23, the 21
2:00 the 20, and so on. And then I want the bins.
2:04 These are the categories that these go into
2:06 and this is going to be domain names
2:08 Twitter and Google, and so on.
2:10 And then I need the indices
2:13 that we're going to put them in. These are like popular number 1, 2, 3, and so on.
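Splitting the data into values, bins, and indices could look roughly like the sketch below. It assumes the popular domains are (domain, count) pairs, for example from a Counter; the domain names and counts here are placeholders and not the numbers from the lecture.

```python
from collections import Counter

# Placeholder referral counts, NOT the real data from the lecture.
domain_counts = Counter({"a.example": 30, "b.example": 20, "c.example": 10})

# most_common gives (domain, count) pairs, highest count first.
popular = domain_counts.most_common(3)

values = [count for _, count in popular]    # second element of each pair
bins = [domain for domain, _ in popular]    # the category labels
indices = list(range(len(popular)))         # bar positions 0, 1, 2, ...

print(values, bins, indices)
```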
2:19 And then we're going to say
2:20 we want to do a bar graph with those values
2:22 and then this is the number of referrals
2:24 the title is "Sites referred to by Python Bytes."
2:28 We're going to put the bins as the labels
2:30 but we want to put them vertically
2:31 'cause if you put 'em horizontally
2:32 25 domain names on a little tiny screen
2:35 they just overlap each other.
2:36 So we got to rotate them, so they go up and down
2:39 and then we're going to just pick the right y-axis ticks
2:43 based on calling NumPy's arange over these values
2:45 in groups of 50. NumPy is sweet!
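The plotting cell described above might look something like this sketch. It is not the exact code dropped into the notebook: the counts are placeholders, the title wording is a guess at what was said, and the Agg backend is used only so the sketch runs outside Jupyter (in a notebook you would rely on %matplotlib inline instead).

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption so this runs outside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Placeholder counts, NOT the real numbers; the domain order follows the lecture.
bins = ["github.com", "twitter.com", "youtube.com", "python.org"]
values = [240, 180, 120, 110]
idx = np.arange(len(bins))  # bar positions 0, 1, 2, ...

bars = plt.bar(idx, values, width=0.8)  # shrink or grow width for skinnier or wider bars
plt.ylabel("Number of referrals")
plt.title("Sites referred to by Python Bytes")
plt.xticks(idx, bins, rotation="vertical")       # vertical labels so they do not collide
plt.yticks(np.arange(0, max(values) + 50, 50))   # y-axis ticks in steps of 50
plt.tight_layout()
```

Changing only the width argument and rerunning this one cell is enough to try different bar sizes without redoing any of the earlier work.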
2:47 Run that, so cool!
2:51 There it is. Well, what is the most popular site
2:54 as far as we're concerned on our podcast,
2:56 the one that I do with Brian Okken?
2:58 It is GitHub, and then Twitter
3:00 and then YouTube, and then, barely
3:03 in fourth place, we have Python.org
3:05 and then Medium, Reddit, Real Python, and so on.
3:08 Pretty awesome, huh?
3:10 Hopefully as we went through this
3:11 I took my time and I tried not to rush it
3:14 'cause I wanted you to get the feel of experiencing
3:17 this type of environment.
3:19 Working through code like this and so on
3:22 and we could even put more pictures
3:23 and more results, and more summaries, right?
3:25 It's all Markdown and images plus code.
3:28 I think this is pretty sweet right here.
3:30 Let's do one thing.
3:31 Let's just verify everything works
3:33 because something funky about these notebooks
3:35 is they almost have like a goto style.
3:38 Look at this, so this was the 10th thing run.
3:40 This is the 15th. This down here is 26th
3:44 and then back here was 27
3:46 and then down here who knows what the heck is down here.
3:49 Here's 67, 68, 69.
3:52 These are in order, but I could like run this again.
3:55 There's not a real order is there?
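Those execution numbers are stored in the notebook file itself, which is JSON, so one way to see whether the cells last ran top to bottom is to read them back. This is a rough sketch: the helper names are made up for illustration, and the sample notebook is a tiny stand-in whose numbers only echo the ones mentioned here.

```python
import json

# A tiny stand-in for an .ipynb file; a real one would be read with json.load(open(path)).
sample_text = """
{"cells": [
  {"cell_type": "markdown", "source": []},
  {"cell_type": "code", "execution_count": 10, "source": []},
  {"cell_type": "code", "execution_count": 15, "source": []},
  {"cell_type": "code", "execution_count": 27, "source": []},
  {"cell_type": "code", "execution_count": 26, "source": []},
  {"cell_type": "code", "execution_count": null, "source": []}
]}
"""


def execution_counts(notebook):
    """Collect execution_count for each code cell, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]


def ran_top_to_bottom(counts):
    """True if the cells that have run carry increasing numbers down the page."""
    ran = [count for count in counts if count is not None]
    return ran == sorted(ran)


counts = execution_counts(json.loads(sample_text))
print(counts, ran_top_to_bottom(counts))
```

A notebook that was just rerun from a fresh kernel should report counts starting at 1 and increasing down the page.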
3:57 So let's go over here.
3:59 Let's just say run all cells
4:01 so it starts at the top, works its way down,
4:04 but even so I've still got these numbers.
4:07 You could even go close this
4:08 say do we want to save it?
4:09 Of course we really want to save that.
4:11 We could create a new notebook from here
4:14 which would be pretty awesome
4:15 but we're going to go and launch this one again.
4:19 We could say clear all outputs
4:21 so it resets it, and then we say run all cells.
4:26 A little bit better, right?
4:27 At least we're running them in order here.
4:29 We've got to restart our kernel, I think
4:31 to get it to completely rerun from scratch.
4:33 Maybe there's a better way to do it
4:34 but for me I probably would just restart the kernel
4:37 and rerun it, but this is awesome, right?
4:39 Look how cool this is!
4:40 This is so different than a regular application.
4:43 And remember if we want to go play with this
4:45 we don't have to go re-download the stuff.
4:48 We don't have to reparse 2.5 megs of XML and HTML.
4:53 We don't have to do all that distinct stuff.
4:55 If I just want to make these a little skinnier
4:58 I can make 'em skinnier. If I want to make 'em wider,
5:01 I can make them wider,
5:02 and I can just focus on this little bit
5:04 of exploring my data.
5:06 I'm going to put it back the way I liked it.
5:08 Here we go, that's pretty slick.
5:10 We can go through here and explore the data
5:12 in this really unique and different way
5:15 in these computational notebooks.
5:16 So this is JupyterLab
5:18 and Jupyter Notebooks, and Python.