Move from Excel to Python with Pandas Transcripts
Chapter: Case study walk through: Sales commissions
Lecture: Cleaned, joined file
0:00 Now that we have the company associated with the sales rap, we could start to model. The commission's in a little bit more detail,
0:09 so we're now adding a column that will show how much commission is calculated for each
0:15 customer. So let's take a look Scroll over here now and we can see a commission, and what I want to do here is clean this up.
0:24 I'm going around it a little bit. This is going to introduce some rounding errors,
0:28 but I think it's Ah, it makes sense in this case that the two decimal points now that makes it a little more straightforward.
0:37 And we can always check to make sure that we're still in a million dollars in commissions. They were right at a million dollars.
0:46 We contained to have some around the heirs. But that's not an issue at this point. Now that we have all this data,
0:52 we can start to aggregate it and get some insights into how the commissions are distributed
0:57 For instance, what if we want to know how much we're paying by region starts to give us a little more insight into what our plan might be telling us
1:09 so We've got a pretty broad distribution of commissions just based on region, and we can do some other quick analysis on this if we want to.
1:19 Maybe not just looking at the some, but we also want to see the average distribution.
1:27 So once again there's there's a little bit of spread here between the sales and the different regions that we may want to investigate some more.
1:35 One of the things that may be worth investigating is how we're commissioning by channel.
1:42 Let's add that into our group by I'm also going around it just to make it a little easier toe interpret. So looking at this data,
1:51 there continues to be, you know, some interesting differences with the reseller versus retail that we probably want to dive into a
1:59 little bit more. So let's try another one this example, and continue to copy and paste. Just take a look at the channel.
2:09 It's interesting that there's a lot more commissions for reseller in retail,
2:15 and you know what would probably be interesting is to see how Maney companies actually fall in these various buckets. So let's have that as well,
2:29 and I used the count function to count how many company names this ties back into what we observed earlier, that we have a lot of retailers,
2:38 number of resellers and only two partners, and it just looks like the commission rates may not be equally distributed across those various
2:46 groups. And I think the way toe best see this data is to use a pivot table. So let's build one.
3:17 Now, as we look at the pivot table broken out by region and the different
3:21 customer channels, we start to see that there is some unequal distribution in the way
3:26 partners air distributed across geography ease as well as the total amount in the different geography
3:34 ease. For instance, the South and the Midwest have 794,000 versus a little over 100,000 in the Northeast in the West.
3:44 So this distribution might not be what we would want to see with our commission's plan We're probably going to need to do some more analysis on it,
3:54 but this customer rep data frame that we put together is really gonna be useful for the next step in analysis. So I'm going to save this.
4:03 It looks like I forgot to define output file. So let's go ahead and do that up at the top so that we keep everything
4:09 in the same place. So I'm going to store this file in the process directory because I'm going to need to use this for some additional analysis.
4:24 So I copied that. You have it down here and now we save that file and let me open it up and show you what it looks like.