Move from Excel to Python with Pandas Transcripts
Chapter: Intro to Pandas
Lecture: Working with multiple columns
0:00 So now the next thing might be: well, how do you work with multiple columns? Let's get some summary information on the price and quantity columns.
0:12 So one way to do this, if we want to work with multiple columns, is to define a variable, and we need to use a list.
0:19 And then, if we want both of those columns, and if we want to get, let's
0:25 say, the average value for the price and quantity, we can do df[summary_columns].mean() and this is going to calculate the mean. If we wanted to,
0:36 we could do .sum(). Sum may not really be useful in this context, but let's go and stick with mean.
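As a minimal sketch of the step just described: the data frame below is a small made-up stand-in (the lecture's actual sales file is not shown in this transcript), and only the column names price, quantity, and extended amount are taken from the lecture.

```python
import pandas as pd

# Hypothetical sample data standing in for the lecture's sales file.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
    "extended amount": [10.0, 40.0, 90.0],
})

# Put the column names in a list, then pass that list to df[...]
# (note: the variable itself, not the string "summary_columns").
summary_columns = ["price", "quantity"]

# Selecting with a list returns a smaller data frame;
# .mean() then computes one average per selected column.
column_means = df[summary_columns].mean()
print(column_means)
```

Swapping .mean() for .sum() on the same selection would give column totals instead.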
0:42 So that's one way that you can combine multiple columns together. And let me show you: you can define a variable like that,
0:50 and I think that's a good practice. But you can also choose not to define a variable,
0:55 and I'm cutting and pasting on purpose so you can kind of see how this works. I just define that list of columns that I want to work on inline,
1:02 and pandas can do that as well, so you'll see both methods when you are looking at pandas code and looking at
1:10 examples online. I want to go over one other example. If you want to use the describe function, you can do that as well.
1:23 And remember, we ran that on the full data frame; once again, I have chosen just a small subset of the columns. And then the final thing:
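A rough sketch of the two ideas above, the inline list and describe on a subset of columns. The sample values are invented for illustration; only the column names come from the lecture.

```python
import pandas as pd

# Hypothetical sample data standing in for the lecture's sales file.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
    "extended amount": [10.0, 40.0, 90.0],
})

# Same selection as before, but with the list written inline
# instead of stored in a variable (hence the double brackets).
inline_means = df[["price", "quantity"]].mean()

# describe() works on the subset too: count, mean, std, min,
# quartiles, and max for just the chosen columns.
summary_stats = df[["price", "quantity"]].describe()
print(summary_stats)
```

The outer brackets are the data frame selection and the inner brackets are the Python list, which is why examples online often show two pairs.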
1:31 Let's go through a real quick example of how we would add a new column. So let's say we want to put a country column on here,
1:41 and we know that everyone is in the US. So think about how you'd do that. If you had an Excel file,
1:47 you would probably put USA in a column and drag it down. Here, you just assign the string USA to that country column. And now if you say df.head(),
1:57 it's gonna show country for all the values. And if you want to check, you can look at df.tail() and see that country is everywhere.
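Here is a small sketch of that assignment, assuming a made-up data frame in place of the lecture's sales data:

```python
import pandas as pd

# Hypothetical sample data standing in for the lecture's sales file.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
})

# Assigning a single string creates the column and fills every row,
# the equivalent of typing USA once in Excel and dragging it down.
df["country"] = "USA"

# head() and tail() show the first and last rows, so together they
# give a quick check that the value landed everywhere.
print(df.head())
print(df.tail())
```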
2:04 So we can also do something similar if we want to do math. So let's say we have a 15%, or rather a 1.5%, fee we want to add. We can
2:13 say, okay, let's add df['fee'] = df['extended amount'] * .015. So what this does is add a new column called fee.
2:21 It is taking the "extended amount" times 0.015 and adding it as the entry in the fee column. So we press enter. And now if you look at the column,
2:33 you can see the fee. So compare this to how you would create a formula in Excel, where you would have to create that formula and drag it
2:42 down for each row. Here you just enter it once, and pandas takes care of making sure that everybody gets that value.
2:49 So it's a really compact and simple way to analyze things.
2:53 It also makes it easy to troubleshoot because you're only putting that formula in one location.
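The fee calculation can be sketched like this. The amounts are invented for illustration; the column name extended amount and the 1.5% rate are the ones used in the lecture.

```python
import pandas as pd

# Hypothetical sample data standing in for the lecture's sales file.
df = pd.DataFrame({
    "extended amount": [100.0, 200.0, 400.0],
})

# One line replaces the Excel formula plus drag-down: pandas multiplies
# every row of "extended amount" by 0.015 (1.5%) and stores the result
# in a new column called "fee".
df["fee"] = df["extended amount"] * 0.015

print(df)
```

Because the formula lives in a single line, changing the rate later means editing one number rather than re-dragging a column of cells.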