Move from Excel to Python with Pandas Transcripts
Chapter: Intro to Pandas
Lecture: Working with column names
0:00
Okay, let's go through some more examples of how to work with Jupyter notebooks and
0:05
pandas. I've opened up my notebook, and I want to walk through something that could be a little confusing to new users.
0:11
So if you look at this notebook, I've just opened it up and you can see that in this cell I'm showing the
0:18
DataFrame by typing df. So there may be a temptation to go in here and just take a look at the head. Remember,
0:27
Shift+Enter. I press that and I get a NameError: name 'df' is not defined. The reason is that I haven't actually run everything in the notebook,
0:35
so it's really useful to hit this menu option: Kernel, Restart and Run All. You'll get this prompt to restart
0:45
and run all cells. When you do that, it runs through all of the code from top to bottom and makes everything live in
0:52
the current kernel. So now if I make a change, everything works. You can also see that the cell number has incremented.
0:59
So it went 1, 2, 4, 5, 6, 7, 8, 9, and then up to 10, and 3 is gone because I reran that cell.
1:09
So this points to some of the power of Jupyter notebooks, but also how it can be confusing sometimes if you get out of order.
1:16
So the thing I would recommend is that you frequently use Kernel, Restart and Run All. And if you don't want to use the menu,
1:25
this command here, restart the kernel and rerun everything, will do the same thing. So once we've done that, we've taken a look at our DataFrame.
1:34
Now we want to actually look at some columns. Let's start with the simplest way to do this. If you ever forget what columns you have,
1:44
type df.head(), and we have these columns called Invoice / Company / purchased_date. So let's just say df.invoice, and I see all of the values in the invoice
1:56
column. You can see that it truncates the output in the middle because it doesn't want to
2:02
show 1000 rows, which makes sense. That should be pretty intuitive to someone who has worked with Python before.
2:11
But what happens if we want to look at this extended amount column, where there's a space in the column name? You get a syntax error,
2:20
and that's because Python doesn't understand what this space means. So the syntax you need to use is to put the column name in quotes inside square brackets,
2:32
and then you can reference the column. Here you go, so you can see the values: 323, 420, 161, 203, 684. If I scroll up here, they match: 323, 420, 161, 203, 684.
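
As a rough sketch of the two access styles described above: the small DataFrame here is made up for illustration. Only the five "extended amount" values come from the lecture, and the invoice values and the lowercase column name are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the course data; the invoice IDs are invented.
df = pd.DataFrame(
    {
        "invoice": ["A-001", "A-002", "A-003", "A-004", "A-005"],
        "extended amount": [323, 420, 161, 203, 684],
    }
)

# Period (attribute) notation works when the column name is a valid Python identifier.
invoice_dot = df.invoice

# Bracket notation returns the same column.
invoice_bracket = df["invoice"]

# A name containing a space cannot be written with a period.
# Compiling the expression shows that Python itself rejects it.
try:
    compile("df.extended amount", "<example>", "eval")
    dot_access_is_valid = True
except SyntaxError:
    dot_access_is_valid = False

# Bracket notation handles the space without any trouble.
extended = df["extended amount"]
print(extended.tolist())
```

The period form fails before pandas is even involved, which is why the error is a syntax error rather than something about a missing column.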
2:43
So the reason I point this out is that you have two options to access the
2:49
columns, and sometimes you'll see code that uses the period notation versus the bracket notation. I encourage you to always use the bracket notation.
2:59
It will make your life easier when you have these types of situations, and it's consistent with the other operations you're going to want to do in pandas.
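
One example of that consistency, which is an assumption about what the speaker may have in mind rather than something shown in the lecture, is creating a new column. The column names and values below are hypothetical.

```python
import warnings

import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"quantity": [1, 2, 3], "price": [10.0, 20.0, 30.0]})

# Bracket notation creates a real column.
df["total"] = df["quantity"] * df["price"]

# Period notation does not create a column. pandas stores the value as a
# plain attribute on the object and emits a UserWarning, silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.subtotal = df["quantity"] * df["price"]

print(list(df.columns))
```

Because reading and creating columns both work with brackets, sticking to that one style avoids having to remember where the period form does and does not apply.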
3:06
So the main reason I bring it up is so that you're aware of it,
3:08
and you can keep that in mind when you are doing your analysis and doing your problem solving.