Difference between revisions of "ANLY482 AY2016-17 T2 Group3: PROJECT OVERVIEW/ Methodology"

From Analytics Practicum
 
<div><font face="Open Sans">
 
<div><font face="Open Sans">
 
Next, cluster analysis will be carried out to determine the existence of clusters amongst Vanitee’s customers and beauty professionals. We will attempt to identify the profiles of each cluster according to their booking history and examine the reasons affecting the performance of each cluster. Thereafter, we hope to translate the identified clusters into a form of customer segmentation to help Vanitee better understand its customer base.  
 
</font></div>
 
 
<div style="height: 2em"></div>
 
 
<div style="background: #EAEAEA; line-height: 0.3em; border-left: #000000 solid 8px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Open Sans" color= "black" size="2"><b>SURVIVAL ANALYSIS</b></font></div></div>
 
<div style="height: 1em"></div>
 
<div><font face="Open Sans">
 
We will also attempt survival analysis to predict Customer Lifetime Value (CLV) by campaign. Survival analysis is a statistical technique that analyzes the time until a certain event occurs (e.g. a booking on Vanitee). The analysis will aim to identify which campaign drives the highest-value customers and, in the event of a new campaign, which customer profiles will respond early. The effectiveness of campaign codes in ensuring repeat bookings can also be investigated through such an analysis.
 
</font></div>
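As a rough illustration of the technique, the sketch below computes a Kaplan-Meier survival curve in plain Python. It is only a minimal example under assumptions of ours: the durations (days until a repeat booking) and the observed flags (1 if the repeat booking happened, 0 if the customer was still waiting when the data ends) are made-up values, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with an observed event."""
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still "at risk": no event and not censored before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events (e.g. repeat bookings) occurring exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: days until a repeat booking, and whether it was observed.
durations = [5, 10, 10, 20, 30]
observed = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, observed)
```

Running the same function separately on the customers of each campaign would give one curve per campaign, which could then be compared.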
 
 
<div style="height: 2em"></div>
 
 
<div style="background: #EAEAEA; line-height: 0.3em; border-left: #000000 solid 8px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Open Sans" color= "black" size="2"><b>EXTRAPOLATION</b></font></div></div>
 
<div style="height: 1em"></div>
 
<div><font face="Open Sans">
 
Lastly, we will attempt to extrapolate the data to forecast the projected number of Unicorn, Explorer and Rockstar tier customers under the current VIP loyalty program. The extrapolation will be based on the customer booking patterns observed in the current dataset.
 
 
</font></div>
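One simple way such an extrapolation could work is sketched below: each customer's observed booking rate is projected forward linearly and the projected total is mapped to a tier. The tier thresholds, their ordering, the linear projection and all function names are placeholders of ours for illustration, not the actual rules of the VIP loyalty program.

```python
from collections import Counter

# Placeholder thresholds (minimum projected bookings), highest first.
# The real tier rules and tier ordering are NOT given in this document.
TIER_THRESHOLDS = [("Rockstar", 12), ("Explorer", 6), ("Unicorn", 1)]


def project_bookings(bookings_so_far, months_observed, months_ahead):
    """Linearly extend a customer's observed monthly booking rate."""
    rate = bookings_so_far / months_observed
    return bookings_so_far + rate * months_ahead


def assign_tier(projected, thresholds=TIER_THRESHOLDS):
    """Return the first tier whose minimum the projected total reaches."""
    for tier, minimum in thresholds:
        if projected >= minimum:
            return tier
    return None


def forecast_tier_counts(customers, months_ahead):
    """customers: list of (bookings_so_far, months_observed) pairs."""
    return Counter(
        assign_tier(project_bookings(bookings, months, months_ahead))
        for bookings, months in customers
    )


# Made-up customers: (bookings so far, months observed).
counts = forecast_tier_counts([(2, 4), (6, 6), (3, 6)], months_ahead=6)
```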
 
</font></div>
  

Latest revision as of 03:55, 22 April 2017




DATA COLLECTION

We will use the data provided to us by Vanitee through our access to their MongoDB database on the cloud. In particular, we will target data tables that pertain to customers, beauty professionals, bookings and loyalty programmes.

DATA PREPARATION

As mentioned above, data rows within each data table may differ slightly in the number of columns (attributes) they contain. As such, we will attempt to consolidate the data into suitable and consistent formats to be used for analysis.

Additionally, data tables that have relationships with other data tables can be combined into one dataset. Hence, we will attempt to prepare different datasets according to the project objectives.
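The two preparation steps described above can be sketched in plain Python as follows. This is only an illustration: the records and field names (`booking_id`, `customer_id`, `campaign_code`, `tier`) are hypothetical and do not reflect Vanitee's actual schema, and `consolidate` and `left_join` are helper names of our own.

```python
# Hypothetical records; field names are illustrative only.
bookings = [
    {"booking_id": 1, "customer_id": "c1", "price": 50.0, "campaign_code": "NEW10"},
    {"booking_id": 2, "customer_id": "c2", "price": 80.0},  # no campaign_code
]
customers = [
    {"customer_id": "c1", "tier": "Explorer"},
    {"customer_id": "c2"},  # no tier
]


def consolidate(records):
    """Give every record the same set of keys, filling absent ones with None."""
    columns = sorted({key for record in records for key in record})
    return [{col: record.get(col) for col in columns} for record in records]


def left_join(left, right, key):
    """Attach the matching right-hand row (if any) to each left-hand row."""
    index = {row[key]: row for row in right}
    right_columns = [col for col in (right[0] if right else {}) if col != key]
    joined = []
    for row in left:
        match = index.get(row[key], {})
        extra = {col: match.get(col) for col in right_columns}
        joined.append({**row, **extra})
    return joined


dataset = left_join(consolidate(bookings), consolidate(customers), "customer_id")
```

Each row of `dataset` then has the same columns, with `None` marking attributes that were absent in the source records.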

EXPLORATORY DATA ANALYSIS

We will look into the bookings customers make and their use of credits and campaign codes when making those bookings. From this, we will be able to understand the buying behaviour of customers and analyze the trends in their bookings. We will also identify any trends in their usage of gems. For beauty professionals, we will observe the frequency of their bookings, the services they put up on the platform and their chat responsiveness.

DATA CLEANING

Missing values and outliers observed during data exploration may introduce inaccuracy and skew into our analysis. To handle missing values, we will look at the number of missing values identified and determine whether each value should be estimated or the entire data row removed. For outliers, we will attempt to analyze why they exist and decide if they are relevant enough to be included in our analysis.
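A minimal sketch of how these decisions might be automated is shown below. The rule used here (drop rows when only a small share of values is missing, otherwise fill with the median) and the 1.5 x IQR outlier rule are assumptions chosen for illustration, as are the 5% cut-off and the function names; flagged outliers would still need to be reviewed by hand.

```python
from statistics import median, quantiles


def handle_missing(rows, column, max_missing_share=0.05):
    """Drop rows if few values are missing, otherwise impute the median."""
    present = [r[column] for r in rows if r[column] is not None]
    share_missing = 1 - len(present) / len(rows)
    if share_missing <= max_missing_share:
        return [r for r in rows if r[column] is not None]
    fill = median(present)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


# Made-up booking prices with one missing value.
rows = [{"price": 10.0}, {"price": None}, {"price": 30.0}, {"price": 20.0}]
imputed = handle_missing(rows, "price")
dropped = handle_missing(rows, "price", max_missing_share=0.5)
outliers = flag_outliers([10, 12, 11, 13, 12, 11, 200])
```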

DATA NORMALISATION & TRANSFORMATION

As the distribution of values differs amongst attributes, we will attempt to normalize such attributes before commencing our analysis to prevent attributes with larger ranges from dominating the others. Also, data transformation techniques such as discretization and binarization will be performed to convert the necessary data to categorical and binary form respectively.
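The three operations can be sketched as below. Min-max scaling is just one possible normalization, chosen here as an assumption; the bin edges, labels and function names are likewise hypothetical.

```python
def min_max_normalise(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


def discretise(value, edges, labels):
    """Map a number to the label of the first bin whose upper edge it does not exceed.

    `edges` holds the upper edges of all bins except the last, so
    len(labels) == len(edges) + 1.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]


def binarise(value):
    """Convert a present/absent attribute (e.g. a campaign code) to 1 or 0."""
    return 1 if value else 0


scaled = min_max_normalise([10, 20, 30])
frequency = [discretise(n, [2, 5], ["low", "medium", "high"]) for n in (1, 3, 9)]
used_code = [binarise(code) for code in ("NEW10", None)]
```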

CLUSTER ANALYSIS

Next, cluster analysis will be carried out to determine the existence of clusters amongst Vanitee’s customers and beauty professionals. We will attempt to identify the profiles of each cluster according to their booking history and examine the reasons affecting the performance of each cluster. Thereafter, we hope to translate the identified clusters into a form of customer segmentation to help Vanitee better understand its customer base.
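The clustering algorithm is not fixed at this stage. As one possible example, the sketch below implements k-means (Lloyd's algorithm) in plain Python. The two features per customer (normalized number of bookings and normalized average spend) and all values are made up, and the initial centroids are passed in by the caller only to keep the example deterministic.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iterations=100):
    """Return (cluster index per point, final centroids)."""
    centroids = list(initial_centroids)
    k = len(centroids)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Made-up customers: (normalized bookings, normalized average spend).
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]
assignment, centroids = k_means(points, [(0.1, 0.2), (0.8, 0.9)])
```

The profile of each cluster can then be read from its centroid, for example a low-frequency, low-spend group versus a high-frequency, high-spend group.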