Move from Excel to Python with Pandas Transcripts
Chapter: Case study walk through: Sales commissions
Lecture: Cleaned, joined file
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
Now that we have the company associated with the sales rap,
0:03
we could start to model. The commission's in a little bit more detail,
0:08
so we're now adding a column that will show how much commission is calculated for each
0:14
customer. So let's take a look Scroll over here now and we can see a
0:18
commission, and what I want to do here is clean this up.
0:23
I'm going around it a little bit.
0:24
This is going to introduce some rounding errors,
0:27
but I think it's Ah, it makes sense in this case that the two decimal
0:31
points now that makes it a little more straightforward.
0:36
And we can always check to make sure that we're still in a million dollars in
0:40
commissions. They were right at a million dollars.
0:45
We contained to have some around the heirs.
0:47
But that's not an issue at this point.
0:49
Now that we have all this data,
0:51
we can start to aggregate it and get some insights into how the commissions are distributed
0:56
For instance, what if we want to know how much we're paying by region
1:04
starts to give us a little more insight into what our plan might be telling us
1:08
so We've got a pretty broad distribution of commissions just based on region,
1:14
and we can do some other quick analysis on this if we want to.
1:18
Maybe not just looking at the some,
1:23
but we also want to see the average distribution.
1:26
So once again there's there's a little bit of spread here between the sales and the
1:32
different regions that we may want to investigate some more.
1:34
One of the things that may be worth investigating is how we're commissioning by channel.
1:41
Let's add that into our group by I'm also going around it just to make it
1:46
a little easier toe interpret. So looking at this data,
1:50
there continues to be, you know,
1:51
some interesting differences with the reseller versus retail that we probably want to dive into a
1:58
little bit more. So let's try another one this example,
2:03
and continue to copy and paste.
2:06
Just take a look at the channel.
2:08
It's interesting that there's a lot more commissions for reseller in retail,
2:14
and you know what would probably be interesting is to see how Maney companies actually fall
2:20
in these various buckets. So let's have that as well,
2:28
and I used the count function to count how many company names this ties back into
2:33
what we observed earlier, that we have a lot of retailers,
2:37
number of resellers and only two partners,
2:39
and it just looks like the commission rates may not be equally distributed across those various
2:45
groups. And I think the way toe best see this data is to use a
2:49
pivot table. So let's build one.
3:16
Now, as we look at the pivot table broken out by region and the different
3:20
customer channels, we start to see that there is some unequal distribution in the way
3:25
partners air distributed across geography ease as well as the total amount in the different geography
3:33
ease. For instance, the South and the Midwest have 794,000 versus a little over
3:41
100,000 in the Northeast in the West.
3:43
So this distribution might not be what we would want to see with our commission's plan
3:48
We're probably going to need to do some more analysis on it,
3:53
but this customer rep data frame that we put together is really gonna be useful for
3:57
the next step in analysis. So I'm going to save this.
4:02
It looks like I forgot to define output file.
4:05
So let's go ahead and do that up at the top so that we keep everything
4:08
in the same place. So I'm going to store this file in the process directory
4:19
because I'm going to need to use this for some additional analysis.
4:23
So I copied that. You have it down here and now we save that file
4:28
and let me open it up and show you what it looks like.