Move from Excel to Python with Pandas Transcripts
Chapter: Intro to Pandas
Lecture: Concept: Working with columns
So now the next question might be: well, how do you work with multiple columns? Let's get some summary information on the price and quantity columns.
So one way to do this, if we want to work with multiple columns, is to define a variable, and we need to use a list.
And then if we want to get, let's say, the average value for the price and quantity, we can do df[summary_columns].mean(), and this is going to calculate the mean. If we wanted to,
we could do .sum(). Sum may not really be useful in this context, so let's stick with mean.
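As a rough sketch of what this step might look like in a notebook: the column names price and quantity and the variable name summary_columns come from the lecture, but the sample values below are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the lecture.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
})

# Store the column names we care about in a list variable.
summary_columns = ["price", "quantity"]

# Indexing with a list returns a smaller DataFrame;
# .mean() then returns one average per selected column.
column_means = df[summary_columns].mean()

# .sum() works the same way, even if it is less useful here.
column_sums = df[summary_columns].sum()

print(column_means)
print(column_sums)
```

Note that the variable goes inside the brackets without quotes, since it is a list of names rather than a column name itself.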
So that's one way that you can work with multiple columns together. I showed you that you can define a variable like that,
and I think that's a good practice. But if you choose not to define a variable,
and I'm cutting and pasting on purpose so you can kind of see how this works, you can just put that list of columns that you want to work on directly inside the brackets,
and it can do that as well. So you'll see both methods when you are looking at your pandas data and looking at
examples online. I want to go over one other example. If you want to use the describe function, you can do that as well.
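A minimal sketch of the inline form and of describe on a subset of columns might look like the following. The sample values are invented for illustration; only the column names are taken from the lecture.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the lecture.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
    "extended amount": [10.0, 40.0, 90.0],
})

# Same selection as before, but with the list written inline.
# The outer brackets index the DataFrame; the inner brackets are the list.
inline_means = df[["price", "quantity"]].mean()

# describe() on the selection summarizes only those columns.
subset_summary = df[["price", "quantity"]].describe()

print(inline_means)
print(subset_summary)
```

The double brackets are the part that tends to look odd at first: a single pair with one name selects one column, while a list inside the brackets selects several.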
And remember, we ran that on the full data frame earlier; this time I have chosen just a small set of columns. And then the final thing.
Let's go through a real quick example of how we would add a new column. So let's say we want to put a country column on here,
and we know that everyone is in the US. So think about how you would do that. If you had an Excel file,
you would probably put USA in a column and drag it down. Here, you just assign the string USA to that country column. And now if you say df.head(),
it's going to show the country for all the values. And if you want to check, you can look at df.tail() and see that country is everywhere.
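Here is a small sketch of that assignment, assuming a made-up data frame; the column name country and the value USA are from the lecture, everything else is illustrative.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the lecture.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
})

# Assigning a single string fills the new column for every row,
# much like typing a value in Excel and dragging it down.
df["country"] = "USA"

# Check the first and last rows to confirm the value is everywhere.
print(df.head())
print(df.tail())
```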
So we can also do something similar if we want to do math. So let's say we have a 1.5% fee that we want to add. We can
say, okay, let's add df['fee'] = df['extended amount'] * .015. So what this does is add a new column called fee.
It is taking the "extended amount" times 0.015 and adding it as the entry in the fee column. So we press enter. And now if you look at the column,
you can see the fee. So compare this to how you would create a formula in Excel, where you would have to create that formula and drag it
down for each row. Here you just enter it once, and pandas takes care of making sure that every row gets that value.
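The fee calculation could be sketched like this. The column names fee and extended amount and the 1.5% rate follow the lecture; the sample amounts are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the lecture.
df = pd.DataFrame({
    "extended amount": [100.0, 200.0, 300.0],
})

# A 1.5% fee is the extended amount times 0.015.
# The multiplication is applied to every row at once.
df["fee"] = df["extended amount"] * 0.015

print(df)
```

Because the formula lives in one line rather than in every cell, changing the rate later means editing a single number.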
So it's a really compact and simple way to analyze things.
It also makes it easy to troubleshoot because you're only putting that formula in one location.