Difference between revisions of "ANLY482 AY2017-18 T1 Group03/Final Progress"

From Analytics Practicum
Jump to navigation Jump to search
Line 141: Line 141:
  
 
We will then have information about sales and interaction to each customer by a particular product the company made sales to. The file was saved as temporarily as Invoice Interaction.  
 
We will then have information about sales and interaction to each customer by a particular product the company made sales to. The file was saved as temporarily as Invoice Interaction.  
<br>
+
 
 +
 
  
 
''1.3.5 Cleaning up Invoice Interaction''
 
''1.3.5 Cleaning up Invoice Interaction''
 +
 
With the new file on hand, we have renamed both N rows for face-to-face interactions and phone web interactions to F2F Interaction Count and Call Count respectively. We then recoded missing value in F2F Interaction Count and Call Count columns with “0”.  
 
With the new file on hand, we have renamed both N rows for face-to-face interactions and phone web interactions to F2F Interaction Count and Call Count respectively. We then recoded missing value in F2F Interaction Count and Call Count columns with “0”.  
 
There were sales transactions done that did not contain both face-to-face and phone web interactions. As our objective was to analyze sales based on interactions, we have decided to remove those transactions that do not contain any interactions.
 
There were sales transactions done that did not contain both face-to-face and phone web interactions. As our objective was to analyze sales based on interactions, we have decided to remove those transactions that do not contain any interactions.
<br>
+
 
 +
 
 +
 
 +
''1.3.6 Standardizing Invoice Interaction file data''
 +
 
 +
Before we came up with statistical model to analyze sales and interaction, we first checked if the data in the total sales amount (i.e Sum(Amount$)) contained a normal distribution. Initial analysis by running a distribution check indicated that the total sales amount was right-skewed as shown below.
 +
 
 +
 
 +
{| class="wikitable" width="100%"
 +
|| Mean || 2172.2607 || Std Dev || 8785.3315 || Std Err Mean || 78.534427 || Upper 95% Mean || 2326.2002
 +
|-
 +
|| Lower 95% Mean || 2018.3211 || N || 12514 || Median || 515 || Mode || 173.55
 +
|}

Revision as of 16:05, 3 December 2017

Logo3.PNG

Home About Us Project Overview Midterm Progress Final Progress Project Management Documentation


IP.png
1.0 Methodology


1.1 Data Collection
There were 6 main files sent to us by client out of which 5 of them have been extensively used. For our research, we obtained a full year data of 2016 and 2017 data till August.

File Name (Edited for confidentiality) Description Number of Rows
Customer Interaction Information on interaction details ~213,000(FTF) & ~421,000(PW)
Employee File Information on employee teams ~2000
Customer (Organisations) Information on healthcare organisations ~117,000
Customer (non-Organisation) Information on healthcare professionals ~205,000
Invoice Information on transactions ~9,000,000



1.2 Tools Used
Our research has extensively used SAS JMP Pro 13 for data cleaning and transformation and for confirmatory analysis because of its intuitive system and simplistic approach. During the initial period, there has been considerable usage of Tableau Software for exploratory data analysis because of its easy visualisations for understanding the data.




1.3 Data Cleaning & Transformation

1.3.1 Files Required for Cleaning & Transformation

File Name (Edited for confidentiality) Description
Customer Interaction (Type A) This file contains information about face-to-face interactions that the client has done with the customers. We have included information such as customer code and allocated duration (i.e to remove double counting of duration issue in the original file) in the file.
Customer Interaction (Type B) This file contains information about phone web interactions that the client has done with the customers. We have included customer code in the file.
Clean Invoice (2015-2017) This file contains information about sales invoices that occurred between 2015 to the latest month in 2017. We have previously cleaned and mapped in Therapy Area to the file.


1.3.2 Customer Interaction

We have performed a summary function for both Customer 360 (F2F) and Customer 360 (Phone Web) based on each product code sold and total interaction duration to each customer on a monthly basis.

Essentially, the summarized files will show us that per month, how many interactions were made and how much time were spent to sell a product to each customer. The N counts that appear in each file represents interaction counts for face-to-face (F2F Interaction Count) and phone web (Call Count) interactions. We have saved both files as Summarized Customer 360 (F2F) and Summarized Customer 360 (PW).


1.3.3 Invoice File

We performed a summary function for the invoice file to extract out relevant information for us to analyse sales and to eventually link the summarized file to interaction files. Essentially, we summarized the invoice based on products sold to individual customer and how much revenue was earned per month. We also classified the products sold by their therapy areas, which distribution channel was used to sell the products, and the sales representative who was involved in making the sales. We have saved the summarized file as Summarized Invoice.


1.3.4 Linking both files

We used the update function to pull in interaction details from both summarized Customer 360 files into the summarized invoice file, using date of invoice and interaction as a linking point.

The purpose of doing this was that we could not find a direct linkage between customer interactions and sales invoice. To close a sales invoice, there could be multiple interactions done before the customer decides to buy a product. Therefore, we have chosen this method to account for the number of interactions done per month and tie that to sales invoices per month based on customer code and product code.

We will then have information about sales and interaction to each customer by a particular product the company made sales to. The file was saved as temporarily as Invoice Interaction.


1.3.5 Cleaning up Invoice Interaction

With the new file on hand, we have renamed both N rows for face-to-face interactions and phone web interactions to F2F Interaction Count and Call Count respectively. We then recoded missing value in F2F Interaction Count and Call Count columns with “0”. There were sales transactions done that did not contain both face-to-face and phone web interactions. As our objective was to analyze sales based on interactions, we have decided to remove those transactions that do not contain any interactions.


1.3.6 Standardizing Invoice Interaction file data

Before we came up with statistical model to analyze sales and interaction, we first checked if the data in the total sales amount (i.e Sum(Amount$)) contained a normal distribution. Initial analysis by running a distribution check indicated that the total sales amount was right-skewed as shown below.


Mean 2172.2607 Std Dev 8785.3315 Std Err Mean 78.534427 Upper 95% Mean 2326.2002
Lower 95% Mean 2018.3211 N 12514 Median 515 Mode 173.55