Move from Excel to Python with Pandas Transcripts
Chapter: Intro to Pandas
Lecture: Diving into the data
0:00 So the next thing we could do is select a column. We'll use this
0:03 df["extended amount"] column again, and if we want to take the
0:09 sum of it, it's gonna operate on all the values in that column and add them
0:13 up. We can also do things like,
0:16 let's say, I don't even want to do a new cell.
0:20 I can go back, edit it,
0:21 press Shift+Enter again, and it gives me the average,
0:24 which is pretty helpful. It can also do some other things that you may not
0:27 think about as much that can be really useful in your data analysis.
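The sum and average steps described above might look like the sketch below. The numbers are made-up sample data; only the "extended amount" column name comes from the lecture, and using mean() for the average is an assumption about what was typed in the cell.

```python
import pandas as pd

# Made-up sample data; only the column name comes from the lecture.
df = pd.DataFrame({"extended amount": [100.0, 250.0, 50.0, 400.0]})

# sum() adds up every value in the selected column.
total = df["extended amount"].sum()

# mean() returns the average of the same column.
average = df["extended amount"].mean()

print(total)
print(average)
```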
0:32 So if we have a question of wanting to know how many invoices we have,
0:36 we can use df["invoice"].nunique(). So now we can say,
0:41 Oh, well, we have a unique invoice in each row. That's helpful.
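One way to check the "unique invoice in each row" conclusion is to compare the number of distinct invoices with the number of rows. This is a sketch on made-up invoice numbers; nunique() returns the count of distinct values, while unique() returns the values themselves.

```python
import pandas as pd

# Made-up invoice numbers; only the "invoice" column name comes from the lecture.
df = pd.DataFrame({"invoice": ["A-1001", "A-1002", "A-1003", "A-1004"]})

# nunique() counts the distinct values in the column.
n_invoices = df["invoice"].nunique()

# len(df) is the number of rows in the data frame.
n_rows = len(df)

# If the two numbers match, every row has its own invoice.
one_invoice_per_row = n_invoices == n_rows

print(n_invoices, n_rows, one_invoice_per_row)
```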
0:45 The other thing you can do is take all of the options that we've talked about for the
0:49 full data frame and run those on a single column. If we just want
0:52 to see the product column and just want to look at the top five rows,
0:59 you can do head to see that.
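Calling head on a single column might look like this. The product values are made-up sample data, and the "product" column name is assumed from the narration.

```python
import pandas as pd

# Made-up sample data; the column name is assumed from the narration.
df = pd.DataFrame(
    {"product": ["shirt", "book", "poster", "shirt", "book", "shirt", "poster"]}
)

# head() on a column returns the first five rows by default.
top = df["product"].head()

# Passing a number changes how many rows come back.
top3 = df["product"].head(3)

print(top)
print(top3)
```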
1:01 This one might be one where it helps to be able to see the number of unique products.
1:05 So if you were just looking at this data for the first time and want to
1:09 know, well, I see I've got shirts and books and posters,
1:12 how many unique values do I have?
1:14 You can do that as well.
1:17 So there's four different types of products,
1:19 so that could be a pretty helpful function for you.
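A sketch of the unique-products question on made-up data. The lecture names shirts, books and posters; the fourth type used here ("pen") is hypothetical, added only so the count comes out to four.

```python
import pandas as pd

# Made-up sample data. "pen" is a hypothetical fourth product type.
df = pd.DataFrame(
    {"product": ["shirt", "book", "poster", "shirt", "pen", "book", "shirt"]}
)

# unique() lists the distinct values in order of first appearance.
products = df["product"].unique()

# nunique() returns how many distinct values there are.
n_products = df["product"].nunique()

print(products)
print(n_products)
```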
1:23 There's a couple of others that I wanna talk about.
1:25 Another one that is useful is the count function that you'll use in other settings.
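A short sketch of count on a column, using made-up data with one missing value. count() only counts the non-missing entries, so it can differ from the number of rows.

```python
import pandas as pd

# Made-up sample data with one missing product.
df = pd.DataFrame({"product": ["shirt", "book", None, "poster"]})

# count() returns the number of non-missing values in the column.
non_missing = df["product"].count()

# len(df) counts every row, missing or not.
total_rows = len(df)

print(non_missing, total_rows)
```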
1:34 So you get the idea that you select a column and then you can perform an operation
1:40 on that column. There are also some handy ones if you want to take
1:45 a shortcut. So let's say I want to know...
1:48 Well, I have four different product types.
1:50 How many shirts? How many books?
1:52 How many posters do I have?
1:54 You can use value_counts for that, which is really useful. A really common thing
1:59 that I do on almost every data set is take a look at the
2:02 value counts, see how many you have,
2:05 and then one of the other things you can do is you can chain operations.
2:09 So if we want to take the value counts and maybe see what percentage,
2:15 or maybe, let's say, divide those by 100,
2:19 you can use the div function to do that.
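The value counts and the chained division described above could be sketched as follows, on made-up product data.

```python
import pandas as pd

# Made-up sample data; the column name is assumed from the narration.
df = pd.DataFrame(
    {"product": ["shirt", "book", "shirt", "poster", "book", "shirt"]}
)

# value_counts() returns how many rows hold each distinct value,
# sorted from most to least frequent.
counts = df["product"].value_counts()

# Operations can be chained: div(100) divides every count by 100.
scaled = df["product"].value_counts().div(100)

print(counts)
print(scaled)
```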
2:25 The other thing to keep in mind is when you have these functions,
2:29 you can pass arguments to them as well.
2:32 So for this one, if I pass normalize=True,
2:36 it gives me the share of the total that each value represents, as a fraction between 0 and 1.
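Passing normalize=True to value_counts might look like this on made-up data. The result is a fraction per value that sums to 1; multiplying by 100 (an extra step not shown in the lecture) turns it into a percentage.

```python
import pandas as pd

# Made-up sample data; the column name is assumed from the narration.
df = pd.DataFrame({"product": ["shirt", "book", "shirt", "poster"]})

# normalize=True returns each value's share of the rows instead of a raw count.
shares = df["product"].value_counts(normalize=True)

# Optional extra step: scale the fractions to percentages.
percent = shares.mul(100)

print(shares)
print(percent)
```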