Difference between revisions of "ANLY482 AY2017-18 T1 Group03/Final Progress"

From Analytics Practicum
Jump to navigation Jump to search
Line 69: Line 69:
 
<br>
 
<br>
 
-->
 
-->
<div id="Methodology" style="border-style: solid; border-width:0; background: #ff6600; padding: 7px; font-weight: bold; text-align: justify; line-height: wrap_content; text-indent: 20px; font-size:20px; font-family:Century Gothic;border-bottom:5px solid white; border-top:5px solid black"><font color= #ffffff>3.0 Methodology</font></div>
+
<div id="Methodology" style="border-style: solid; border-width:0; background: #ff6600; padding: 7px; font-weight: bold; text-align: justify; line-height: wrap_content; text-indent: 20px; font-size:20px; font-family:Century Gothic;border-bottom:5px solid white; border-top:5px solid black"><font color= #ffffff>1.0 Methodology</font></div>
  
 
<br>
 
<br>
'''3.1 Data Collection'''
+
'''1.1 Data Collection'''
 
<br>
 
<br>
 
There were 6 main files sent to us by client out of which 5 of them have been extensively used. For our research, we obtained a full year data of 2016 and 2017 data till August.
 
There were 6 main files sent to us by client out of which 5 of them have been extensively used. For our research, we obtained a full year data of 2016 and 2017 data till August.
Line 96: Line 96:
 
<br><hr><br>
 
<br><hr><br>
  
'''3.2 Tools Used'''
+
'''1.2 Tools Used'''
 
<br>
 
<br>
 
Our research has extensively used SAS JMP Pro 13 for data cleaning and transformation and for confirmatory analysis because of its intuitive system and simplistic approach. During the initial period, there has been considerable usage of Tableau Software for exploratory data analysis because of its  easy visualisations for understanding the data.
 
Our research has extensively used SAS JMP Pro 13 for data cleaning and transformation and for confirmatory analysis because of its intuitive system and simplistic approach. During the initial period, there has been considerable usage of Tableau Software for exploratory data analysis because of its  easy visualisations for understanding the data.
Line 102: Line 102:
 
<br><hr><br>
 
<br><hr><br>
  
'''3.3 Data Cleaning & Transformation'''
+
'''1.3 Data Cleaning & Transformation'''
<br>
+
 
''3.3.1 Files Required for Cleaning & Transformation''
+
''1.3.1 Files Required for Cleaning & Transformation''
 +
 
 
{| class="wikitable" width="100%"
 
{| class="wikitable" width="100%"
 
|-
 
|-
Line 117: Line 118:
 
|}
 
|}
  
 +
 +
 +
''1.3.2 Customer Interaction''
 +
 +
We have performed a summary function for both Customer 360 (F2F) and Customer 360 (Phone Web) based on each product code sold and total interaction duration to each customer on a monthly basis.
 +
 +
Essentially, the summarized files will show us that per month, how many interactions were made and how much time were spent to sell a product to each customer. The N counts that appear in each file represents interaction counts for face-to-face (F2F Interaction Count) and phone web (Call Count) interactions. We have saved both files as Summarized Customer 360 (F2F) and Summarized Customer 360 (PW).
 +
 +
 +
 +
''1.3.3 Invoice File''
 +
 +
We performed a summary function for the invoice file to extract out relevant information for us to analyse sales and to eventually link the summarized file to interaction files. Essentially, we summarized the invoice based on products sold to individual customer and how much revenue was earned per month. We also classified the products sold by their therapy areas, which distribution channel was used to sell the products, and the sales representative who was involved in making the sales. We have saved the summarized file as Summarized Invoice.
 +
 +
 +
 +
''1.3.4 Linking both files''
 +
 +
We used the update function to pull in interaction details from both summarized Customer 360 files into the summarized invoice file, using date of invoice and interaction as a linking point.
 +
 +
The purpose of doing this was that we could not find a direct linkage between customer interactions and sales invoice. To close a sales invoice, there could be multiple interactions done before the customer decides to buy a product. Therefore, we have chosen this method to account for the number of interactions done per month and tie that to sales invoices per month based on customer code and product code.
 +
 +
We will then have information about sales and interaction to each customer by a particular product the company made sales to. The file was saved as temporarily as Invoice Interaction.
 +
<br>
 +
 +
''1.3.5 Cleaning up Invoice Interaction''
 +
With the new file on hand, we have renamed both N rows for face-to-face interactions and phone web interactions to F2F Interaction Count and Call Count respectively. We then recoded missing value in F2F Interaction Count and Call Count columns with “0”.
 +
There were sales transactions done that did not contain both face-to-face and phone web interactions. As our objective was to analyze sales based on interactions, we have decided to remove those transactions that do not contain any interactions.
 
<br>
 
<br>

Revision as of 15:52, 3 December 2017

Logo3.PNG

Home About Us Project Overview Midterm Progress Final Progress Project Management Documentation


IP.png
1.0 Methodology


1.1 Data Collection
There were 6 main files sent to us by client out of which 5 of them have been extensively used. For our research, we obtained a full year data of 2016 and 2017 data till August.

File Name (Edited for confidentiality) Description Number of Rows
Customer Interaction Information on interaction details ~213,000(FTF) & ~421,000(PW)
Employee File Information on employee teams ~2000
Customer (Organisations) Information on healthcare organisations ~117,000
Customer (non-Organisation) Information on healthcare professionals ~205,000
Invoice Information on transactions ~9,000,000



1.2 Tools Used
Our research has extensively used SAS JMP Pro 13 for data cleaning and transformation and for confirmatory analysis because of its intuitive system and simplistic approach. During the initial period, there has been considerable usage of Tableau Software for exploratory data analysis because of its easy visualisations for understanding the data.




1.3 Data Cleaning & Transformation

1.3.1 Files Required for Cleaning & Transformation

File Name (Edited for confidentiality) Description
Customer Interaction (Type A) This file contains information about face-to-face interactions that the client has done with the customers. We have included information such as customer code and allocated duration (i.e to remove double counting of duration issue in the original file) in the file.
Customer Interaction (Type B) This file contains information about phone web interactions that the client has done with the customers. We have included customer code in the file.
Clean Invoice (2015-2017) This file contains information about sales invoices that occurred between 2015 to the latest month in 2017. We have previously cleaned and mapped in Therapy Area to the file.


1.3.2 Customer Interaction

We have performed a summary function for both Customer 360 (F2F) and Customer 360 (Phone Web) based on each product code sold and total interaction duration to each customer on a monthly basis.

Essentially, the summarized files will show us that per month, how many interactions were made and how much time were spent to sell a product to each customer. The N counts that appear in each file represents interaction counts for face-to-face (F2F Interaction Count) and phone web (Call Count) interactions. We have saved both files as Summarized Customer 360 (F2F) and Summarized Customer 360 (PW).


1.3.3 Invoice File

We performed a summary function for the invoice file to extract out relevant information for us to analyse sales and to eventually link the summarized file to interaction files. Essentially, we summarized the invoice based on products sold to individual customer and how much revenue was earned per month. We also classified the products sold by their therapy areas, which distribution channel was used to sell the products, and the sales representative who was involved in making the sales. We have saved the summarized file as Summarized Invoice.


1.3.4 Linking both files

We used the update function to pull in interaction details from both summarized Customer 360 files into the summarized invoice file, using date of invoice and interaction as a linking point.

The purpose of doing this was that we could not find a direct linkage between customer interactions and sales invoice. To close a sales invoice, there could be multiple interactions done before the customer decides to buy a product. Therefore, we have chosen this method to account for the number of interactions done per month and tie that to sales invoices per month based on customer code and product code.

We will then have information about sales and interaction to each customer by a particular product the company made sales to. The file was saved as temporarily as Invoice Interaction.

1.3.5 Cleaning up Invoice Interaction With the new file on hand, we have renamed both N rows for face-to-face interactions and phone web interactions to F2F Interaction Count and Call Count respectively. We then recoded missing value in F2F Interaction Count and Call Count columns with “0”. There were sales transactions done that did not contain both face-to-face and phone web interactions. As our objective was to analyze sales based on interactions, we have decided to remove those transactions that do not contain any interactions.