Difference between revisions of "Teppei Syokudo - Improving Store Performance: ESK Data Analysis Methodology"

From Analytics Practicum
Line 51: Line 51:
 
<div style="margin:20px; padding: 10px; background: #ffffff; font-family: Trebuchet MS, sans-serif; font-size: 95%;-webkit-border-radius: 15px;-webkit-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96); -moz-box-shadow:    7px 4px 14px rgba(176, 155, 121, 0.96);box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);">
 
<div style="margin:20px; padding: 10px; background: #ffffff; font-family: Trebuchet MS, sans-serif; font-size: 95%;-webkit-border-radius: 15px;-webkit-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96); -moz-box-shadow:    7px 4px 14px rgba(176, 155, 121, 0.96);box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);">
 
<font size =3 face=Georgia >
 
<font size =3 face=Georgia >
<i><b>Set Role ID.</b></i>
+
<i><b>Time of Day Effect.</b></i> We were provided with hourly and daily data on both sales and labour over a three-month period. In exploring the data, we found a recurring pattern in the sales transactions across the hours of the day.
<p>In exploring the sales-labour data, we first found the mean sales per hour from both MW and RP. It can be seen that the mean sales for MW peak from 11:00 to 13:00 during lunchtime and 18:00 to 20:00 during dinnertime. Likewise, RP’s mean sales peak from 11:00 to 13:00, and from 17:00 to 19:00. </p>
+
[[File:Reg-figure2.png|500px]]
<p>We hypothesized that staff who are more productive will perform better on average as compared to the shop sales on an hourly basis. We explored staff performance by looking at a common measure of labour performance, which is staff productivity. </p>
+
<p>Mean sales per hour peaked from 11:00 to 13:00 and from 18:00 to 20:00. We attribute these peaks to the lunchtime and dinnertime crowds respectively, and treat the other time periods as idle time. This time of day effect has to be taken into account when evaluating labour productivity in our hypotheses, so that sales are not over- or under-attributed to a particular factor. We label 11:00 to 13:00 as Lunch Peak, 14:00 to 17:00 as Idle, and 18:00 to 20:00 as Dinner Peak.</p>
<p>We attributed hourly sales to each of the staff present in the shop at that hour. We then averaged the attributed sales for each staff member by dividing their total attributed sales by the total number of hours they worked. </p>
+
<br>
<p>However, we realized that there was an hourly effect on retail sales, which affects the labour productivity of the staff. On an absolute basis, if Staff A and B both work the same number of hours, but A works during peak hours and B works during non-peak hours, A’s labour productivity (Store Sales / Number of hours worked) will be higher than B’s. This can misrepresent performance, because Staff A might be poorer at customer service or upselling than B. The data therefore needs to be standardized on an hourly basis. For more information on the methodology used, please refer to the Data preparation section.</p>
+
<i><b>Day of Week Effect.</b></i> Accounting for the time of day effect, we also found the mean sales for each day of the week. The day of the week variable also accounts for public holidays, as crowds may be higher on those days. The chart below shows that, similar to the time of day effect, there is also a day of week effect.
<p>After standardizing the data, we ranked the staff by their standardized labour productivity, took the top 5 and bottom 5 performers for each store, and plotted their hourly sales against the shop’s average sales.</p>
+
[[File:Reg-figure3.jpg|500px]]
<p>Our hypothesis is partially true: the top 5 performers generally perform above the mean shop sales, but mostly during peak hours. Likewise, the bottom 5 performers perform below the mean shop sales, but mostly during peak hours.</p>
+
<p>During Lunch Peak and Dinner Peak, the store tends to achieve higher mean sales on Fridays and lower mean sales on Saturdays. During Idle periods, however, Saturdays and Public Holidays tend to see better mean sales than Idle periods on other days of the week.</p>
<p>This implies that there is value in identifying high performers that perform on a consistent basis. Firstly, we can benchmark staff performance using the top performers. Secondly, we can qualitatively assess the behavior of top performers that affect sales and develop means to train the rest of the staff to be like them.</p>
+
<br>
 +
<i><b>Removing Autocorrelation.</b></i> We test for autocorrelation using the Durbin-Watson test. Taking the effect of manager presence on sales per customer as an example, we ran a Fit Model with Sales/Customer Number as Y and Total number of Manager Labour Hours as the Model Effect.
 +
<p>[[File:Reg-figure4.jpg|400px]]</p>
 +
<p>The results of the Durbin-Watson test are shown below:</p>
 +
[[File:Reg-figure5.jpg|400px]]
 +
<p>With a p-value less than 0.05, we conclude that autocorrelation is present in our data. The Durbin-Watson value of 1.51, being below 2, indicates some positive autocorrelation in the residuals of the regression of Sales/Customer Number on Total number of Manager Labour Hours.</p>
 +
<p>To account for the time of day and day of week effects, we add Day and Lunch Peak/Idle/Dinner Peak to By in the Fit Model.</p>
 +
[[File:Reg-figure6.jpg|400px]]
 +
<p>The results of the Durbin-Watson test this time are:</p>
 +
[[File:Reg-figure7.jpg|400px]]
 +
<p>The p-value is greater than 0.05, so there is no longer significant evidence of autocorrelation in the data. For the analyses we run next, we therefore include Day and Lunch Peak/Idle/Dinner Peak in By to account for the time of day and day of week effects.</p>
  
 
</font>
 
</font>
 
</div>
 
</div>

Revision as of 13:23, 17 April 2016



Data Exploration

Time of Day Effect. We were provided with hourly and daily data on both sales and labour over a three-month period. In exploring the data, we found a recurring pattern in the sales transactions across the hours of the day.

Reg-figure2.png

Mean sales per hour peaked from 11:00 to 13:00 and from 18:00 to 20:00. We attribute these peaks to the lunchtime and dinnertime crowds respectively, and treat the other time periods as idle time. This time of day effect has to be taken into account when evaluating labour productivity in our hypotheses, so that sales are not over- or under-attributed to a particular factor. We label 11:00 to 13:00 as Lunch Peak, 14:00 to 17:00 as Idle, and 18:00 to 20:00 as Dinner Peak.
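The labelling rule above can be sketched in a few lines of Python. This is only an illustration, not the team's actual procedure (the analysis itself appears to have been done in a Fit Model dialog). The function name `label_period` is hypothetical, and the sketch assumes each hourly record is identified by an integer hour, that the stated ranges are inclusive, and that any hour outside the two peaks counts as Idle.

```python
def label_period(hour: int) -> str:
    """Map the hour of an hourly sales record to a time-of-day label.

    Assumption: `hour` is an integer on the 24-hour clock and the
    ranges quoted in the text are inclusive at both ends.
    """
    if 11 <= hour <= 13:
        return "Lunch Peak"
    if 18 <= hour <= 20:
        return "Dinner Peak"
    # Assumption: every hour outside the two peaks is treated as Idle,
    # following "other time periods are idle time".
    return "Idle"


if __name__ == "__main__":
    for h in (11, 15, 19):
        print(h, label_period(h))
```

Keeping the rule in one small function makes it easy to adjust the boundaries if the peak windows differ between stores.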


Day of Week Effect. Accounting for the time of day effect, we also found the mean sales for each day of the week. The day of the week variable also accounts for public holidays, as crowds may be higher on those days. The chart below shows that, similar to the time of day effect, there is also a day of week effect.

Reg-figure3.jpg

During Lunch Peak and Dinner Peak, the store tends to achieve higher mean sales on Fridays and lower mean sales on Saturdays. During Idle periods, however, Saturdays and Public Holidays tend to see better mean sales than Idle periods on other days of the week.
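A comparison of this kind amounts to taking mean sales within each combination of day and time-of-day label. The sketch below shows one possible way to compute it in plain Python. The function name, the record layout `(day, period, sales)`, and the sample figures are all hypothetical and are not taken from the project data.

```python
from collections import defaultdict


def mean_sales_by_day_and_period(records):
    """Return {(day, period): mean sales} for (day, period, sales) records."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per group
    for day, period, sales in records:
        acc = totals[(day, period)]
        acc[0] += sales
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        ("Fri", "Lunch Peak", 300.0),
        ("Fri", "Lunch Peak", 340.0),
        ("Sat", "Lunch Peak", 250.0),
        ("Sat", "Idle", 120.0),
    ]
    for key, mean in sorted(mean_sales_by_day_and_period(sample).items()):
        print(key, mean)
```

Treating Public Holiday as its own value of `day` would reproduce the grouping described in the text, assuming holidays are coded that way in the data.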


Removing Autocorrelation. We test for autocorrelation using the Durbin-Watson test. Taking the effect of manager presence on sales per customer as an example, we ran a Fit Model with Sales/Customer Number as Y and Total number of Manager Labour Hours as the Model Effect.

Reg-figure4.jpg

The results of the Durbin-Watson test are shown below:

Reg-figure5.jpg

With a p-value less than 0.05, we conclude that autocorrelation is present in our data. The Durbin-Watson value of 1.51, being below 2, indicates some positive autocorrelation in the residuals of the regression of Sales/Customer Number on Total number of Manager Labour Hours.
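For readers unfamiliar with the statistic, the Durbin-Watson value is computed from the regression residuals in time order: the sum of squared differences between consecutive residuals, divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values below 2 suggest positive autocorrelation, and values above 2 suggest negative autocorrelation. The sketch below is a minimal illustration with a one-predictor least-squares fit. The function names are hypothetical, and it does not compute the p-value that the statistical software reports.

```python
def ols_residuals(x, y):
    """Residuals of a simple least-squares fit of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


if __name__ == "__main__":
    # Residuals that keep their sign for runs give a value below 2.
    print(durbin_watson([1.0, 1.0, -1.0, -1.0]))
    # Residuals that flip sign every step give a value above 2.
    print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

In this setting, `x` would be Total number of Manager Labour Hours and `y` would be Sales/Customer Number, with observations sorted by time before the residuals are passed to `durbin_watson`.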

To account for the time of day and day of week effects, we add Day and Lunch Peak/Idle/Dinner Peak to By in the Fit Model.

Reg-figure6.jpg

The results of the Durbin-Watson test this time are:

Reg-figure7.jpg

The p-value is greater than 0.05, so there is no longer significant evidence of autocorrelation in the data. For the analyses we run next, we therefore include Day and Lunch Peak/Idle/Dinner Peak in By to account for the time of day and day of week effects.
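Placing variables in By is understood here to mean that a separate model is fitted for each combination of their levels, for example one fit for Friday Lunch Peak and another for Saturday Idle. Under that assumption, the splitting step can be sketched as follows. The function name `split_by` and the column names `day`, `period`, `manager_hours`, and `sales_per_customer` are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict


def split_by(rows, by):
    """Group dict-like rows by the values of the columns named in `by`.

    Rows keep their original (time) order within each group, so each
    group can be regressed and tested for autocorrelation on its own.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in by)
        groups[key].append(row)
    return dict(groups)


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    rows = [
        {"day": "Fri", "period": "Lunch Peak", "manager_hours": 2, "sales_per_customer": 9.5},
        {"day": "Fri", "period": "Idle", "manager_hours": 1, "sales_per_customer": 8.0},
        {"day": "Fri", "period": "Lunch Peak", "manager_hours": 3, "sales_per_customer": 10.1},
        {"day": "Sat", "period": "Idle", "manager_hours": 1, "sales_per_customer": 8.7},
    ]
    for key, group in split_by(rows, by=("day", "period")).items():
        print(key, len(group))
```

Each group would then be fitted and tested separately, so that differences between days and periods are not mistaken for serial dependence within a single pooled model.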