Move from Excel to Python with Pandas Transcripts
Chapter: Intro to Pandas
Lecture: Diving into the data

0:00 So the next thing we could do, if we select a column (we'll use this df["extended amount"] column again), is take the
0:10 sum of it. It's gonna operate on all the values in that column and add them up. We can also do things like, let's say, I don't even want to make a new cell.
0:21 I can go back and edit it, press Shift+Enter again, and it gives me the average, which is pretty helpful. It can also do some other things that you may not
0:28 think about as much that can be really useful in your data analysis. So if we have a question of wanting to know how many invoices we have,
0:37 we can use df["invoice"].nunique(). So now we can say, oh, well, we have a unique invoice in each row. That's helpful.
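As a rough sketch of the column operations described so far, the snippet below uses a small made-up DataFrame; the rows and values are assumptions for illustration, not the course's data set. Only the column names come from the lecture. Note that nunique() returns the number of distinct values, while unique() returns the distinct values themselves.

```python
import pandas as pd

# Hypothetical sample data; the course's real file is not reproduced here.
df = pd.DataFrame({
    "invoice": ["A001", "A002", "A003", "A004"],
    "product": ["shirt", "book", "poster", "shirt"],
    "extended amount": [10.0, 20.0, 30.0, 40.0],
})

total = df["extended amount"].sum()      # adds up every value in the column
average = df["extended amount"].mean()   # arithmetic mean of the column
invoice_count = df["invoice"].nunique()  # number of distinct invoice values

print(total, average, invoice_count)

# If every row has its own invoice, the distinct count equals the row count.
print(invoice_count == len(df))
```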
0:46 The other thing you can do is, all of the options that we've talked about for the full DataFrame, you can run those on a single column. If we just want
0:53 to see the product column and just want to look at the top five rows, you can use head() to see that.
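A minimal sketch of calling head() on a single column, again with made-up rows (the product values here are assumptions). With no argument, head() returns the first five rows; passing a number changes how many you get.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "product": ["shirt", "book", "poster", "shirt", "book", "poster", "shirt"],
})

top_five = df["product"].head()    # first five rows by default
first_two = df["product"].head(2)  # or ask for a specific number of rows

print(top_five)
print(first_two)
```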
1:02 This might be one where it's useful to be able to see the number of unique products. So if you were just looking at this data for the first time and want to
1:10 know, well, I see I've got shirts and books and posters, how many unique values do I have? You can do that as well.
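One way to answer that question, sketched on made-up rows: nunique() gives the number of distinct products and unique() lists them. The fourth product name ("pen") is an assumption added so the example has four types; the lecture only names shirts, books, and posters.

```python
import pandas as pd

# Hypothetical sample data; "pen" is an assumed fourth product type.
df = pd.DataFrame({
    "product": ["shirt", "book", "poster", "pen", "shirt", "book"],
})

product_types = df["product"].nunique()  # how many distinct products
product_names = df["product"].unique()   # which products they are

print(product_types)
print(product_names)
```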
1:18 So there are four different types of products, so that could be a pretty helpful function for you. There are a couple of others that I wanna talk about.
1:26 Another one that is useful is the count function that you'll use in other settings.
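A short sketch of count() on a column, using made-up rows. One detail worth knowing is that count() only counts non-missing values, so it can differ from the number of rows; the missing value below is an assumption added to show that.

```python
import pandas as pd

# Hypothetical sample data with one missing product on purpose.
df = pd.DataFrame({
    "product": ["shirt", "book", None, "poster", "shirt"],
})

filled = df["product"].count()  # number of non-missing values
rows = len(df["product"])       # number of rows, missing or not

print(filled, rows)
```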
1:35 So you get the idea: you select a column, and then you can perform an operation
1:41 on that column. There are also some handy ones if you want a shortcut. So let's say I want to know...
1:49 Well, I have four different product types. How many shirts? How many books? How many posters do I have?
1:55 You can use value_counts() for that, which is really useful. A really common thing that I do on almost every data set is take a look at the
2:03 value counts to see how many you have. And then one of the other things you can do is chain operations.
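Here is a small sketch of value_counts() on the product column. The rows and the "pen" product are made up for illustration. The result is a Series indexed by product, sorted from most to least frequent.

```python
import pandas as pd

# Hypothetical sample data: 4 shirts, 3 books, 2 posters, 1 pen.
df = pd.DataFrame({
    "product": ["shirt"] * 4 + ["book"] * 3 + ["poster"] * 2 + ["pen"],
})

counts = df["product"].value_counts()  # rows per product, most common first

print(counts)
print(counts["shirt"])  # look up a single product's count
```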
2:10 So if we want to take the value counts and maybe see what percentage each one is, or, let's say, divide those by 100,
2:20 you can use the div() function to do that. The other thing to keep in mind is that when you have these functions,
2:30 you can pass arguments to them as well. So for this one, if I pass normalize=True, it gives me the percentage that each value represents.
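The chaining and argument passing described above could look roughly like this, on made-up rows (the data and the "pen" product are assumptions). Note that normalize=True returns fractions between 0 and 1 rather than values out of 100, so the last line multiplies by 100 as one possible way to show percentages.

```python
import pandas as pd

# Hypothetical sample data: 4 shirts, 3 books, 2 posters, 1 pen.
df = pd.DataFrame({
    "product": ["shirt"] * 4 + ["book"] * 3 + ["poster"] * 2 + ["pen"],
})

# Chain a second operation onto the result of value_counts().
divided = df["product"].value_counts().div(100)

# Pass an argument: normalize=True returns each value's share of the rows.
shares = df["product"].value_counts(normalize=True)

# Optional extra step (not from the lecture): scale the shares to percentages.
percentages = shares.mul(100)

print(divided)
print(shares)
print(percentages)
```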

