Group04 Proposal
Yllee.2017 (talk | contribs) |
Revision as of 12:06, 21 November 2018
Background
testing
Project Motivation
For our project, we intend to build a business application for understanding customers. It will perform business analytics (namely exploratory, explanatory and predictive analysis) on a business's customer demographic and transaction data. The application will be built to achieve the following objectives:
- provide clear visualisation of raw data, variables and results through faceting and/or 3D views;
- support interactive selection of variables when formulating scenarios or business objectives; and
- be user-friendly for non-statisticians and laypeople.
Customer Analytics
(A) Use case: Segmentation by clustering and/or classification
Rapid advances in technology, such as digital transformation, the Internet of Things (IoT) and cloud computing, have made available huge amounts of data about consumer behaviour, transactions, event activities and influencing factors. These data provide visibility into performance and behavioural decisions across a variety of industries and consumer channels.
Customers have different needs and wants, and so have different reasons or drivers for buying a company's products. Customer segmentation is therefore a very useful data mining technique for finding groups of customers that differ in important ways associated with product interest, market participation, or response to marketing efforts. By understanding the differences among groups, a marketer can make better strategic choices about opportunities, product definition, and positioning, and can engage in more effective promotion.
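As a rough illustration of segmentation by clustering, the sketch below runs a basic k-means (Lloyd's algorithm) on a handful of made-up customer records. The two features (annual spend and visits per month), the data values and the function name `k_means` are hypothetical and are not taken from the project's datasets. The application itself is planned in R, where `kmeans()` from the stats package plays this role; Python is used here only to keep the sketch self-contained.

```python
import math
import random


def k_means(points, k, iterations=100, seed=0):
    """Basic Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical customers: (annual spend in $1000s, visits per month).
customers = [
    (1.0, 2.0), (1.2, 1.8), (0.8, 2.2),   # low-spend, infrequent shoppers
    (8.0, 9.0), (8.5, 9.5), (7.8, 8.7),   # high-spend, frequent shoppers
]

centroids, labels = k_means(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

In practice the features would usually be standardised first, since k-means relies on Euclidean distance and a variable with a larger scale would otherwise dominate the segmentation.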
(B) Use case: Regression to predict response or potential high value customers
Regression analysis is a broad term for a set of statistical methodologies used to predict a response variable (also called a dependent, criterion, or outcome variable) from one or more predictor variables (also called independent or explanatory variables). In general, regression analysis can be used to identify the explanatory variables that are related to a response variable, to describe the form of the relationships involved, and to provide an equation for predicting the response variable from the explanatory variables.
As a statistical business analytics tool, regression can be a powerful data mining technique for gaining a better understanding of patterns and hidden relationships in the data, which businesses can use to inform their customer strategies.
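To make the idea of "an equation for predicting the response variable" concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The variable names, the data values and the helper names `fit_simple_ols` and `predict` are hypothetical and chosen only for illustration. In the R stack planned for the application, `lm()` would be the usual way to fit such a model.

```python
def fit_simple_ols(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = sum of cross-deviations / sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Apply the fitted equation to a new predictor value."""
    return intercept + slope * x_new


# Hypothetical data: store visits (predictor) and spend (response).
# The spend values are constructed as 5 + 10 * visits so the fit is easy to check.
visits = [2, 4, 6, 8, 10]
spend = [25, 45, 65, 85, 105]

intercept, slope = fit_simple_ols(visits, spend)
print("intercept:", intercept, "slope:", slope)
print("predicted spend at 12 visits:", predict(intercept, slope, 12))
```

The fitted slope describes how the response changes per unit of the predictor, which is what the explanatory use of regression reads off. The same equation applied to new customers supports the predictive use.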
Overview of Dataset
(1) Customer Campaigning dataset: This dataset includes customers' demographics, coverage and product-related information. Marketing managers use datasets with such information to understand their customer base, what customers want and what drives them, so that they can market to them effectively.
(2) Dunnhumby dataset, The Complete Journey (https://www.dunnhumby.com/sourcefiles): Dunnhumby is a data science company that specializes in customer data analytics. The Complete Journey dataset is a collection of household-level transaction data covering two years, from a group of 2,500 households who are frequent shoppers at a retail chain. The level of detail extends to individual purchases, specific items and item categories. The dataset also includes household demographics and direct marketing campaign details, including coupons and the redemptions made on purchases.
Application Libraries & Packages
Package Name | Description |
---|---|
shiny & shinydashboard | Interactive web applications and dashboards for data visualization |
ggplot2 | High-quality graphs |
tidyverse: tidyr, dplyr, ggplot2 | Tidying and manipulating data for visualization in ggplot2 |
shinythemes | Apply themes to Shiny applications |
ggthemr | Apply themes to ggplot2 plots |
lubridate | Parse and transform dates easily |
plotly | Provide interactive graphics |
threejs | Provide 3-dimensional visualization |
ggraph | Provide graphics for clustering, regression |
ggiraph | Provide interactive ggplot2 graphics |
k-means algorithms in R | Provide various k-means algorithms in R |
ISLR | Provide example datasets for statistical learning; logistic regression itself uses glm() from base R (stats) |
References
Image credit to: Christopher Dombres (under a Creative Commons license)