One of our clients at SCS recently needed to compare two sets of financial results – specifically, to see if they were the same. The client calculates its sales commissions on a monthly basis, and they suspected the final week of a previous month actually had consistently higher sales. This line of thinking makes sense, understanding sales motivations closing in on a deadline. Despite inclinations toward a different set of conclusions; however, the sales units all swore there was no difference between the final week of a sales month and the other weeks. The only way to decide this issue in an objective way was to apply a statistical test to the data. That’s where SCS came in.
Taking a closer look at the data.
Our first step was to create a set of test data for an imaginary company. The data consisted of sales totals recorded on a weekly basis. We created six, four-week periods of data with the final week always having the highest totals. We set the average for the final weeks at $60 million and the average for non-final weeks at $50 million. In other words, we deliberately created an underlying difference, with a plan to create a statistical test to find it. The following data illustrates how we set up and ran tests to perform this function:
We called the statistical test that we used to test the difference between final and non-final weeks the “t” test. It calculates the probability that a pattern in the data is due to a coincidence rather than an underlying relationship. A complete explanation of this test is beyond the scope of this article, but there are plenty of online resources to learn about it. Excel provides a simple way to perform the t test with the T.TEST function.
We split our array of test data into two tables with final weeks in one table and non-final weeks in another table. The tables below duplicate this test to facilitate the use of this function.
The first two arguments to the function are simply the arrays of sales totals from the two tables.
The next argument to the function requires that we specify if we want a one- or two-tailed distribution. The tail is the part of the t distribution that asymptotically goes to zero. It represents the probability that we can reject the null hypothesis. If we use a one-tailed test then we are only testing for final weeks being greater. If we want to test that they are different – whether greater or less than – then we’ll use the two-tailed test. So in our example, we’ll be using the value “2” for this parameter.
The final parameter specifies the type of t test to perform. We can’t use a paired t test since we have unequal number of records in the two samples. We’ll next check if the variance for the two samples are similar. The standard deviation for non-final weeks is 3.72 and the standard deviation for final weeks is 2.94. It’s reasonable to assume that the variances for the two samples are similar and we can use the formula for two samples of equal variance.
The results of the t test are clear. There is much less than 1% chance that the two samples are from the same population.
Therefore we can assume that sales figures for this imaginary company are different for the final week of a sales period. The same test at the client showed a similar result, providing a valuable business insight into the organization.