# Move from Excel to Python with Pandas Transcripts Chapter: Case study walk through: Sales commissions Lecture: Commission rate calculations

0:00 Before I continue with the rest of the analysis, I'm going to delete some of the boilerplate cells that were put in here just to
0:09 keep this clean. All right, you can see that our data has been read in and let's do the quick check to make sure everything looks good.
0:17 So the one thing that were seeing here from a data type perspective is that zip code is coming in as an n 64. Total sales is and then 64 as well,
0:27 which is OK, but zip code is not so We need to go back and fix it when we read in our file. So let's go ahead and specify the data type,
0:37 and now we look at it and zip code is an object. So I specified string here. But behind the scenes, pandas turns that into an object,
0:46 and then we can take a look at her data frame. Just double check again, and zip codes look good. And all of our data seems to be okay.
0:56 Let's do some additional quick checks of the data toe, understand what we have here, so we can see how many entries we have in
1:05 total sales, we can see the average, the standard deviation and what the max and men are.
1:12 One of the things I like to use with describe and highlight how you need to understand what the various options are.
1:20 We can just run this on objects as well, and this is a little bit easier to see that we have 50 different companies.
1:27 We have three unique channels. We have 50 unique zip codes, states and account numbers. It was just ah,
1:37 quick way to start to get a feel for what the data looks like. And here we've done the math to figure out OK, how much did we sell last year?
1:50 And this is a little bit hard to read, So maybe we want to use our f-strings to clean this up using f-strings
1:59 I told it, format it as a float with commas and no decimal points so this makes it easier to see that we have \$126 million in sales.
2:08 Now what if we want to figure out what the effective commission rate would be? So we know our commission target \$1 million.
2:17 It knows how used the underscores here in more recent versions of Python, This is inappropriate syntax. And instead of using commas,
2:25 it's a lot easier to view your data and understand that that's a million dollars. And I did a little formula here to see what is our effective rate.
2:38 So if we know that we have a million dollars and commissions and are yet last year sales were \$126 million. Therefore,
2:46 the effective rate would be 1260.79% times last year sales to get us to a million
2:52 dollars in commissions. Let's build upper model to see what the actual commission would be on each of these customers. So at a commission column now,
3:06 what this tells us is this is the individual commission that would be calculated for the
3:11 sales for this company. So if we wanted to maybe clean this up a little bit because we've got the commission's out the meaning decimal points,
3:20 we can round it. It's a little cleaner. Let's just check our math here So the total commissions is a little bit over a
3:31 million dollars, and that's just a function of floating point math and rounding than the
3:37 average commission her customers \$20,000 if we want to do maybe some more detailed.
3:46 This tells us what the meaning is with standard deviation is the minimum the max and
3:52 the different percentiles. Really nice summary of the commission data. So let's see if we take this back to our CFO and see what they think

Talk Python's Mastodon Michael Kennedy's Mastodon