ANLY482 AY2016-17 T2 Group10 Project Overview: Methodology
Correlations
Two questions we hope to answer are what the business should invest in to achieve higher efficiency and growth, and which sales method is the most efficient. To address them, we could examine correlations between sales revenue and sales inputs. Correlation does not imply causation, but a strong correlation can point to relationships worth investigating further.
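As a rough illustration of this step, the sketch below computes the Pearson correlation coefficient between revenue and each sales input. The input names (sales_calls, samples_given) and all figures are hypothetical placeholders, not project data, and Pearson is only one possible correlation measure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical figures for illustration only (assumed, not from GSK data).
revenue = [12.0, 19.0, 33.0, 38.0, 52.0]
inputs = {
    "sales_calls": [10, 20, 30, 40, 50],
    "samples_given": [5, 3, 4, 2, 1],
}

# Correlation of each input with revenue, strongest relationship first.
correlations = {name: pearson(values, revenue) for name, values in inputs.items()}
for name, r in sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: r = {r:.3f}")
```

A coefficient near +1 or -1 would flag an input for closer study; on its own it says nothing about which way any causal effect runs.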
Cluster Analysis + Machine Learning
Depending on the quality of the data and on future conversations, we also hope to build a machine learning model for predictive analytics, for example to predict how performance would vary if an input resource were changed.
We could first cluster the client data and then, for each client cluster, train an artificial neural network (ANN) on the sales inputs, client characteristics and resulting revenue, so that revenue can be predicted from sales input. The aim is one predictive model per type of client. Neural networks have been applied to sales forecasting in fashion retail (Sun et al.), but to the best of our knowledge they have not been applied to pharmaceutical sales.
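The cluster-then-model pipeline could be sketched as follows. This is a minimal illustration under stated assumptions: clients are grouped with plain k-means, and a one-variable least-squares line stands in for the ANN so that the example stays short and dependency-free. The client features, visit counts, revenues and starting centroids are all hypothetical.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means from given starting centroids; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def fit_line(xs, ys):
    """Least-squares slope and intercept (a simple stand-in for the per-cluster ANN)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical clients: (characteristics, sales visits, revenue). Not project data.
clients = [
    ((1.0, 1.2), 2, 5.0), ((0.8, 1.0), 4, 9.0), ((1.1, 0.9), 6, 13.0),
    ((8.0, 8.5), 2, 22.0), ((8.2, 7.9), 4, 42.0), ((7.9, 8.1), 6, 62.0),
]
features = [c[0] for c in clients]
labels, centroids = kmeans(features, centroids=[features[0], features[3]])

# One model per client cluster, trained on that cluster's visits and revenue.
models = {}
for cluster in sorted(set(labels)):
    rows = [c for c, label in zip(clients, labels) if label == cluster]
    models[cluster] = fit_line([r[1] for r in rows], [r[2] for r in rows])

def predict(client_features, visits):
    """Route a client to its nearest cluster, then apply that cluster's model."""
    cluster = min(range(len(centroids)), key=lambda j: math.dist(client_features, centroids[j]))
    slope, intercept = models[cluster]
    return slope * visits + intercept

print(predict((8.1, 8.0), 5))
```

In a real implementation the per-cluster line would be replaced by a trained network taking all sales inputs and client characteristics, and the number of clusters would be chosen from the data rather than fixed at two.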
After clustering, we could also compare revenue with sales input to identify the more efficient teams or methods. We would then recommend that GSK analyze them further to uncover the reasons behind their efficiency and spread those practices across the organization.
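One simple way to make this comparison is revenue per unit of sales input, ranked within each client cluster. The sketch below assumes sales input is measured in hours and that each record already carries a cluster label; the record layout, team names and numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (client_cluster, team, sales_input_hours, revenue).
records = [
    (0, "Team A", 10, 50.0),
    (0, "Team A", 10, 70.0),
    (0, "Team B", 5, 40.0),
    (1, "Team A", 8, 80.0),
    (1, "Team B", 8, 40.0),
]

def efficiency_by_team(records):
    """Rank teams within each cluster by total revenue per hour of sales input."""
    totals = defaultdict(lambda: [0.0, 0.0])  # (cluster, team) -> [hours, revenue]
    for cluster, team, hours, revenue in records:
        totals[(cluster, team)][0] += hours
        totals[(cluster, team)][1] += revenue

    ranking = defaultdict(list)
    for (cluster, team), (hours, revenue) in totals.items():
        if hours > 0:  # skip teams with no recorded input to avoid division by zero
            ranking[cluster].append((team, revenue / hours))
    for cluster in ranking:
        ranking[cluster].sort(key=lambda item: item[1], reverse=True)
    return dict(ranking)

ranking = efficiency_by_team(records)
for cluster, teams in sorted(ranking.items()):
    print(cluster, teams)
```

Ranking within a cluster rather than across all clients keeps the comparison between teams that serve similar client types.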
Survival Analysis
Survival analysis is a statistical technique for analyzing the expected time until an event occurs, and it is one of the cornerstones of customer analytics. In our project, an event can be customer attrition (an existing customer switching to another company) or inventory depletion (a pharmaceutical product running out of stock). Understanding when customers are likely to churn or when inventory will need replenishing enables GSK to time its churn-prevention efforts, for example through more frequent contact via its sales channels.
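To make the idea concrete, the sketch below implements the Kaplan-Meier estimator, a standard survival-analysis method chosen here for illustration and not necessarily the one the project will use. It estimates the probability that a customer is still active after a given number of months. Customers who had not churned by the end of observation are treated as censored. The durations and churn flags are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: time until churn or until observation ended, per customer.
    observed:  True if churn was seen, False if the customer was censored.
    Returns [(time, survival_probability)] at each time with at least one event.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Process every customer whose duration equals t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set after time t.
        at_risk -= removed
    return curve

# Hypothetical customer tenures in months (assumed, not project data).
durations = [3, 5, 5, 8, 12, 12]
observed = [True, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
for months, probability in curve:
    print(f"month {months}: estimated retention {probability:.3f}")
```

The months where the curve drops most steeply would be natural points at which to schedule churn-prevention contact.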