Difference between revisions of "ANLY482 AY2016-17 T2 Group3: HOME/Interim"
Line 380: | Line 380: | ||
=== Project Timeline === | === Project Timeline === | ||
− | [[File:V_Project_Timeline_4.JPG| | + | [[File:V_Project_Timeline_4.JPG|1000px|center]] |
=== Revised Work Plan === | === Revised Work Plan === | ||
− | [[File:V_Project_Schedule_2.JPG| | + | [[File:V_Project_Schedule_2.JPG|1000px|center]] |
Revision as of 21:49, 22 February 2017
HOME | ABOUT US | PROJECT OVERVIEW | PROJECT FINDINGS | PROJECT MANAGEMENT | DOCUMENTATION | ALL PROJECTS |
Contents
Overview
Project Background & Motivation
Vanitee was officially launched in May 2015, in an attempt to bridge the gap between customers and independent beauty professionals. Typically, beauty professionals that are listed on the platform are emerging and independent beauty artists. To put it simply, they are professionals who want to grow their brand and customer base. By providing such a platform, Vanitee is able to help them showcase what they do best.
However, this does not mean that there are no competitors. Competitors include brick and mortar shops in local neighbourhoods and even bigger beauty brands with chain stores such as Jean Yip Group. Even though these are physical stores, they still pose as a threat as customers can still choose to go to these stores instead of using Vanitee to engage a beauty professional. Hence, Vanitee does not want to stop at just providing a platform for these beauty professionals and for customers to engage them. Furthermore, with an increasing number of professionals and customers coming on board, evaluating their performance so far becomes much more imperative.
Firstly, to further the success of their application, Vanitee has to place emphasis on attracting new customers as well as retaining their existing customers. Many customers might have become dormant after just one booking. Hence, analysis can be done to find out why they have turned dormant and identify possible solutions to attract them to make the next booking.
Secondly, in response to these dormant customers, Vanitee currently has an extensive loyalty program (as shown in Figure 1) in place that offers customers credits, gems as well as campaign codes with every booking made. However, one issue they face is the lack of understanding of how consumers utilize these in-app resources. Also, they wish to understand the effectiveness of such a loyalty program in encouraging customers to make repeated bookings in the future.
Project Objectives
Hence, by utilizing the data from their current application’s database, we would wish to discover meaningful and informative insights which will allow Vanitee to better retain their customers and beauty professionals and understand the effectiveness of their current loyalty program. To achieve the above mentioned, we have set the following objectives:
Customers
- To determine the customer segmentation (different groups of customers) from the current booking patterns. Which customers are stagnant? Which customers are actively using the app?
- To understand customers’ behaviour. When was the last time a customer used the app? How frequent does a customer use the app? How much does a customer spend on average?
- To evaluate the effectiveness of using campaign codes to ensure customers repeat their bookings
- To understand how customers are using credits and gems (refer to Figure 2), whether they are accumulating before use or using them in their next booking
- To determine the Customer Lifetime Value (CLV) by campaign (which promotional campaign drives the highest value customer?) To which campaign do customers react to more? Do customers respond more to campaigns giving discounts in dollar amounts (e.g. $20 off) or to percentage amounts (e.g. 20% off)? Which customers react and respond more to campaigns, credits and gems?
- Which services generate the most profits?
Beauty Professionals
- To determine if there is any correlation on what makes beauty professionals more attractive to customers (based on the following hypothesis).
- Are beauty professionals more attractive if they have a higher chat response rate?
- Are beauty professionals more attractive if they have a greater variety of services?
- Are beauty professionals more attractive if they offer less expensive services compared to other professionals?
- Are beauty professionals more attractive if they offer services on non-working days or hours?
Data Integration and Filtering
Data Collection
To facilitate our data analysis, Vanitee has provided us with access to their current MongoDB database on the cloud. The database contains numerous tables such as customers, beauty professionals, bookings, campaigns etc. Our team has decided to use two full years worth of data which ranges from Jan 2015 to Dec 2016.
Extracted Tables
Different types of data are currently represented by different tables in the database. In this case, there were a total of 59 tables for us to utilize. After exploring each table and its suitability for analysis, we eventually narrowed down to the following 7 tables:
Bookings
A row in this table represents a specific booking of a Customer with a Beauty Professional. The detailed description of the main columns in this table is as follows:
Campaigns
A row in this table represents a specific campaign (marketing initiative). The detailed description of the main columns in this table is as follows:
Categories
A row in this table represents a specific category that can be used to classify services. The detailed description of the main columns in this table is as follows:
Users
A row in this table represents a specific user that has an account on the application (either as a customer or professional). The detailed description of the main columns in this table is as follows:
Customers
A row in this table represents a specific customer in relation to a specific beauty professional. The detailed description of the main columns in this table is as follows:
Professionals
A row in this table represents a specific beauty professional. The detailed description of the main columns in this table is as follows:
Services
A row in this table represents a specific service offered by a beauty professional. The detailed description of the main columns in this table is as follows:
Challenges
As a relatively new startup that is still growing, Vanitee has been doing quite a bit of testing to minimize the number of bugs in their application. However, it has been observed that their testing is not done in a consistent manner which makes it difficult for us to filter out test data accurately. Moreover, some of the tables do not have indicative columns to indicate whether a record is a test data or not. Hence, to overcome this, we have implemented the following measures:
- Narrowed the initial data range from Aug 2014 - Dec 2016 to Jan 2015 - Dec 2016 as Vanitee had mentioned that larger amounts of testing happened in 2014 when they first officially launched
- Remove records that have columns that indicate that the record is a test data (e.g. columns such as is_test, test_at, deleted_at)
- Manually searched for the keyword “test” that appeared in several columns that involve name and description
Another challenge we faced was due to the way Vanitee had structured a few of their tables in the database. For example, the Bookings & Categories table had way more records than expected. The main reason for this was that most of the records were actually related to each other where certain records had a master_id column that referenced another record within the same table. Vanitee’s rationale behind the master_id column was that they wanted to keep track of the changes made to each record but as new records instead of updating the current record. One specific example is that a single booking can have a parent booking (also known as the master booking). The multiple child bookings represent the different states that the booking has undergone. Hence, to overcome this, we have implemented the following measures:
- For those tables with the column master_id, we created a new column called is_master to indicate whether that record is a master record or not
- Non-master records are then excluded from analysis and the formulation of graphs
Data Cleaning and Exploration
Issues
Before exploring the data, we faced several issues such as duplication of data, missing values as well as changes in Vanitee’s business model. As these issues may potentially affect the accuracy of our analysis, we have implemented the following measures to overcome these issues prior to performing our analysis.
Duplicate Values
Out of the 7 tables identified above, only the Campaigns table had records with duplicate values. As seen in Figure 3 below, we observed that there were several records that had similar campaign names and almost identical creation date times. However, we realized that only 1 campaign record had campaign codes, while the others did not have. Hence, we made the assumption that only campaigns that had campaign codes were true campaigns that were carried out. Also, to help reduce such duplicate values, we made use of the column is_published to sieve out those campaigns that were actually published over the past 2 years.
Missing Values
After examining the tables, we realized that different tables had a varying amount of columns with missing values. Firstly, when we looked at the Bookings table, columns that contained monetary values (e.g. final_price, total_price, discount_amount etc.) mostly had missing values as well. Upon further inspection, we deduced that these missing values actually represented the monetary value of 0. Hence we replaced those missing values with the value of 0. Another column that had a huge amount of missing values was price_vanitee_transaction_fee which is the fee that Vanitee profits when an online booking has been successfully checked out. Subsequently, we found out that these missing values were attributed to changes in Vanitee’s business model which we will elaborate further in the next issue.
Another major table that had columns with missing values was the Campaigns table. Columns like start_at and end_at had several missing values that confused us initially. After analyzing Vanitee’s online dashboard, we learnt that campaigns need not have start and end dates to be created. Campaigns that had missing start dates would use the creation date as replacement as clarified with Vanitee. On the other hand, campaigns that had missing end dates would mean that the campaign codes could be redeemed as long as the start date had passed.
Changes in Business Model
After meetings with Vanitee and email correspondence with their developers, we learnt that Vanitee’s current business model was only recently implemented (around Nov to Dec 2016). As seen in Figure 4, there are several changes to their business model. To make matters worse, the Bookings table did not have a column that indicates whether a booking was based on the current or previous business model. In addition, this also partially explains why the column price_vanitee_transaction_fee had missing values (as mentioned above).
Since some of our analysis we were about to do involved calculating the profit or Vanitee fee made per booking, we came up with the formula (as shown in Figure 5) below as an alternative way of calculating that value. To put it simply, the profit per booking is the final price that the customer pays (after any discount) minus away the payout that the professional receives, minus away the transaction fee incurred from payment by credit card and finally minus away the cashback that the customer receives as credits. We felt that the benefit of using this formula was that it ignores the Vanitee fee in its calculation and uses the other columns that do not have much missing values instead. Also, it works for any booking regardless of past or current business model.
Findings
Users, Customers & Professionals
Figure 6 - Users breakdown by type |
Description |
Figure 7 - Customers breakdown by age |
Description |
Figure 8 - Customers breakdown by gender |
Description |
Figure 9 - Professionals breakdown by age |
Description |
Figure 10 - Professionals breakdown by gender |
Description |
Bookings
Figure 11 - Bookings breakdown by type |
Description |
Figure 12 - Bookings breakdown by status |
Description |
Figure 13 - Bookings breakdown by frequency |
Description |
Figure 14 - Bookings breakdown by year |
Description |
Figure 15 - Bookings breakdown by month |
Description |
Figure 16 - Bookings breakdown by month & year |
Description |
Figure 17 - Bookings breakdown by day |
Description |
Figure 18 - Bookings breakdown by recency (initial) |
Description |
Figure 19 - Bookings breakdown by recency (final) |
Description |
Figure 20 - Bookings breakdown by monetary value (final) |
Description |
Figure 21 - Bookings breakdown by duration from signup to 1st booking (initial) |
Description |
Figure 22 - Bookings breakdown by duration from signup to 1st booking (final) |
Description |
Figure 23 - Bookings breakdown by service count |
Description |
Figure 24 - Bookings breakdown by category |
Description |
Figure 25 - Bookings breakdown by campaign usage |
Description |
Figure 26 - Bookings breakdown by credit usage |
Description |
Figure 27 - Profit per booking formula comparison |
Description |
Services
Figure 28 - Services breakdown by price |
Description |
Figure 29 - Services breakdown by professional |
Description |
Figure 30 - Services breakdown by category |
Description |
Campaigns
Figure 31 - Campaigns breakdown by duration |
Description |
Figure 32 - Campaigns breakdown by type |
Description |
Figure 33 - Campaigns breakdown by discount type |
Description |
Figure 34 - Campaigns breakdown by discount amount |
Description |
Figure 35 - Campaigns breakdown by usage |
Description |
Figure 36 - Campaigns breakdown by duration from start to 1st booking (initial) |
Description |
Figure 37 - Campaigns breakdown by duration from start to 1st booking (final) |
Description |
Revised Methodology
Cluster Analysis
Next, cluster analysis will be carried out to determine the existence of clusters amongst Vanitee’s customers and beauty professionals. We will attempt to identify the profiles of each cluster according to their booking history and examine the reasons affecting the performance of each cluster. Thereafter, we hope to translate the identified clusters into a form of customer segmentation to help Vanitee better understand its customer base.
Survival Analysis
We will also be attempting to conduct survival analysis to predict the Customer Lifetime Value (CLV) by campaign. Survival analysis is a statistical technique that analyzes the duration to a certain event (e.g. a booking on Vanitee). Hence, such analysis will aim to which campaign drives the highest value customer and in the event of a new campaign, which customer profile will be respond early during a campaign (over a fixed period of time). The effectiveness of campaign codes in ensuring repeat bookings can also be investigated through such an analysis.
Revised Scope of Work
Our scope of work mainly remains the same as before as stated in our project proposal. However, based on the comments given by our sponsor and project supervisor, our team has decided to revise some of the exploratory data analysis. For certain charts, we will break down further to derive more specific information that will help Vanitee understand the situation on hand. The charts we will supplement are as follows:
- A chart to display the breakdown of bookings based on the day it is being checked out, meaning the day the service is being booked for and carried out. This chart will complement the already existing chart that displays the breakdown of bookings based on the day the booking was created, meaning the day the customer made the booking.
- A chart that will display the breakdown of individual services by price. For now, the current bar chart shows the breakdown of of the services by price.
- A chart that will display the breakdown of campaigns that have the most redemptions. This chart will facilitate our analysis to find out the ideal duration and discount a campaign should have.
From the feedback we got from Prof. Kam, we have decided to focus our analysis more on customers and bookings (which includes the usage of campaigns and credits) as compared to the beauty professionals. Also, our sponsor feels that it is more beneficial to them to understand more about their customers. For this reason, we will not be looking at the activeness of a beauty professional but we will continue the analysis on the attractiveness of a beauty professional as the attractiveness of a beauty professional affects the way customers make their booking whereas the activeness of the beauty professional is less likely to do so.
Moving forward, our analysis will allow us discover insights based on Vanitee’s revenue and bookings. For instance, for the service(s) that have the most bookings made, Vanitee would like to know what the ideal price of the service should be. Also, we will look into successful bookings that used campaign codes to find out the ideal type of discount and for what duration.
The table below will illustrate better our updated work scope:
Project Timeline
Revised Work Plan