Move from Excel to Python with Pandas Transcripts
Chapter: Case study walk through: Sales commissions
Lecture: Cleaned, joined file
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
Now that we have the company associated with the sales rap, we could start to model. The commission's in a little bit more detail,
0:09
so we're now adding a column that will show how much commission is calculated for each
0:15
customer. So let's take a look Scroll over here now and we can see a commission, and what I want to do here is clean this up.
0:24
I'm going around it a little bit. This is going to introduce some rounding errors,
0:28
but I think it's Ah, it makes sense in this case that the two decimal points now that makes it a little more straightforward.
0:37
And we can always check to make sure that we're still in a million dollars in commissions. They were right at a million dollars.
0:46
We contained to have some around the heirs. But that's not an issue at this point. Now that we have all this data,
0:52
we can start to aggregate it and get some insights into how the commissions are distributed
0:57
For instance, what if we want to know how much we're paying by region starts to give us a little more insight into what our plan might be telling us
1:09
so We've got a pretty broad distribution of commissions just based on region, and we can do some other quick analysis on this if we want to.
1:19
Maybe not just looking at the some, but we also want to see the average distribution.
1:27
So once again there's there's a little bit of spread here between the sales and the different regions that we may want to investigate some more.
1:35
One of the things that may be worth investigating is how we're commissioning by channel.
1:42
Let's add that into our group by I'm also going around it just to make it a little easier toe interpret. So looking at this data,
1:51
there continues to be, you know, some interesting differences with the reseller versus retail that we probably want to dive into a
1:59
little bit more. So let's try another one this example, and continue to copy and paste. Just take a look at the channel.
2:09
It's interesting that there's a lot more commissions for reseller in retail,
2:15
and you know what would probably be interesting is to see how Maney companies actually fall in these various buckets. So let's have that as well,
2:29
and I used the count function to count how many company names this ties back into what we observed earlier, that we have a lot of retailers,
2:38
number of resellers and only two partners, and it just looks like the commission rates may not be equally distributed across those various
2:46
groups. And I think the way toe best see this data is to use a pivot table. So let's build one.
3:17
Now, as we look at the pivot table broken out by region and the different
3:21
customer channels, we start to see that there is some unequal distribution in the way
3:26
partners air distributed across geography ease as well as the total amount in the different geography
3:34
ease. For instance, the South and the Midwest have 794,000 versus a little over 100,000 in the Northeast in the West.
3:44
So this distribution might not be what we would want to see with our commission's plan We're probably going to need to do some more analysis on it,
3:54
but this customer rep data frame that we put together is really gonna be useful for the next step in analysis. So I'm going to save this.
4:03
It looks like I forgot to define output file. So let's go ahead and do that up at the top so that we keep everything
4:09
in the same place. So I'm going to store this file in the process directory because I'm going to need to use this for some additional analysis.
4:24
So I copied that. You have it down here and now we save that file and let me open it up and show you what it looks like.