Move from Excel to Python with Pandas Transcripts
Chapter: Case study walk through: Sales commissions
Lecture: Cleaned, joined file

Login or purchase this course to watch this video and the rest of the course contents.
0:00 Now that we have the company associated with the sales rap, we could start to model. The commission's in a little bit more detail,
0:09 so we're now adding a column that will show how much commission is calculated for each
0:15 customer. So let's take a look Scroll over here now and we can see a commission, and what I want to do here is clean this up.
0:24 I'm going around it a little bit. This is going to introduce some rounding errors,
0:28 but I think it's Ah, it makes sense in this case that the two decimal points now that makes it a little more straightforward.
0:37 And we can always check to make sure that we're still in a million dollars in commissions. They were right at a million dollars.
0:46 We contained to have some around the heirs. But that's not an issue at this point. Now that we have all this data,
0:52 we can start to aggregate it and get some insights into how the commissions are distributed
0:57 For instance, what if we want to know how much we're paying by region starts to give us a little more insight into what our plan might be telling us
1:09 so We've got a pretty broad distribution of commissions just based on region, and we can do some other quick analysis on this if we want to.
1:19 Maybe not just looking at the some, but we also want to see the average distribution.
1:27 So once again there's there's a little bit of spread here between the sales and the different regions that we may want to investigate some more.
1:35 One of the things that may be worth investigating is how we're commissioning by channel.
1:42 Let's add that into our group by I'm also going around it just to make it a little easier toe interpret. So looking at this data,
1:51 there continues to be, you know, some interesting differences with the reseller versus retail that we probably want to dive into a
1:59 little bit more. So let's try another one this example, and continue to copy and paste. Just take a look at the channel.
2:09 It's interesting that there's a lot more commissions for reseller in retail,
2:15 and you know what would probably be interesting is to see how Maney companies actually fall in these various buckets. So let's have that as well,
2:29 and I used the count function to count how many company names this ties back into what we observed earlier, that we have a lot of retailers,
2:38 number of resellers and only two partners, and it just looks like the commission rates may not be equally distributed across those various
2:46 groups. And I think the way toe best see this data is to use a pivot table. So let's build one.
3:17 Now, as we look at the pivot table broken out by region and the different
3:21 customer channels, we start to see that there is some unequal distribution in the way
3:26 partners air distributed across geography ease as well as the total amount in the different geography
3:34 ease. For instance, the South and the Midwest have 794,000 versus a little over 100,000 in the Northeast in the West.
3:44 So this distribution might not be what we would want to see with our commission's plan We're probably going to need to do some more analysis on it,
3:54 but this customer rep data frame that we put together is really gonna be useful for the next step in analysis. So I'm going to save this.
4:03 It looks like I forgot to define output file. So let's go ahead and do that up at the top so that we keep everything
4:09 in the same place. So I'm going to store this file in the process directory because I'm going to need to use this for some additional analysis.
4:24 So I copied that. You have it down here and now we save that file and let me open it up and show you what it looks like.


Talk Python's Mastodon Michael Kennedy's Mastodon