
Difference between revisions of "IS480 Team wiki: 2018T1 analyteaka projectscope"


Latest revision as of 16:24, 15 August 2018


Analyteaka Scope.png
Analyteaka customer profiling.png
Customer categorization: Based on the historical sales data provided by Scanteak, we are going to work out the customers’ race (from name), gender (from name), age (from NRIC), income level (based on their housing district), and whether they are return customers (based on past transaction records).
Customer profiling: Moving forward from customer categorization, which isolates various identifiable traits (age, race, gender, etc.), we are going to generate several profiles/personas based on combinations of identifiable traits.

Examples of descriptive analytics would include:

  • Age/race/gender/housing district composition of customers
  • Customer return rate
  • Sales and quantity sold for each identifiable trait or profile
  • Main payment method

Examples of business questions that will be answered:

  • What is the average amount spent by customers between the ages of 40 and 45?
  • Are they mostly new customers or return customers?
  • What is the main target profile (e.g. 40-year-old Chinese male)?
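
As a rough illustration of two of the figures above (customer return rate and average spend for an age band), the following is a minimal sketch rather than the project's actual implementation. The record layout (customer id, age, amount) and the sample values are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_id, age, amount_spent).
transactions = [
    ("c1", 42, 1200.0),
    ("c2", 35, 800.0),
    ("c1", 42, 300.0),
    ("c3", 44, 500.0),
]


def return_rate(rows):
    """Share of distinct customers who appear in more than one transaction."""
    counts = defaultdict(int)
    for customer_id, _age, _amount in rows:
        counts[customer_id] += 1
    if not counts:
        return 0.0
    returning = sum(1 for n in counts.values() if n > 1)
    return returning / len(counts)


def average_spend(rows, min_age, max_age):
    """Mean amount per transaction for customers aged min_age..max_age inclusive."""
    amounts = [amount for _cid, age, amount in rows if min_age <= age <= max_age]
    return sum(amounts) / len(amounts) if amounts else 0.0
```

Here the average is taken per transaction; averaging per customer instead would be an equally reasonable reading of the business question.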
Analyteaka store profiling.png
Part 1: This module is responsible for providing descriptive analytics for different products and their respective categories. It will provide the foundation for predictive analytics (e.g. recommended product and quantity allocation for each store).

Examples of descriptive analytics would include:

  • Sales figures and quantity sold for each product and category
  • Sales composition of each product and category
  • Grouping of products which are commonly purchased together

Examples of business questions that will be answered:

  • What is the best performing product/category for this outlet?
  • How many products do customers usually buy in a single purchase?
  • What other products do customers usually buy when they purchase a coffee table?
  • What is the best performing store/location?
  • What is the best performing month/day/period?
  • What is the race/age/gender demographic of each store?
  • What kinds of items sell better in combination?
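
The "commonly purchased together" grouping can be approximated by counting how often pairs of products share a basket. This is only a sketch under assumptions: each purchase is taken to be a list of product names, and the sample baskets are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) gives every pair a canonical order and ignores repeats.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


def bought_with(product, baskets):
    """Products bought alongside `product`, most frequent first."""
    related = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == product:
            related[b] += n
        elif b == product:
            related[a] += n
    return related.most_common()


# Hypothetical baskets, one list per purchase.
baskets = [
    ["coffee table", "sofa"],
    ["coffee table", "sofa", "tv console"],
    ["dining table", "dining chair"],
]
```

Calling `bought_with("coffee table", baskets)` ranks the other products by how often they appear in the same purchase, which is one simple way to approach the coffee table question above.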
Part 2: This module is responsible for providing predictive analytics for different stores and their respective locations, based on Part 1.

Examples of descriptive analytics would include:

  • Number of customers for each store
  • Number of sales and sales figures for different months/days/periods
  • Customer (age, race, gender, etc.) composition of each store

Examples of business questions that will be answered:

  • What will be the best performing item category next quarter?
  • Which store will sell the most items next quarter?
  • What kind of product can we push as an add-on/bundle per season?
Analyteaka customization module.png

Firstly, this module allows the user to define the categories and sub-categories of the items. As Scanteak's item range is constantly evolving, users need to be able to modify how item categories are defined. For example, an item code that starts with C would fall under leather, while one that starts with C69 would be a living room leather item.
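
One way to model such user-defined rules is a prefix table where the longest matching prefix wins. The sketch below assumes this longest-prefix behaviour; the rule table and function name are hypothetical, and only the C and C69 examples come from the description above.

```python
# Hypothetical user-editable rules: item-code prefix -> category label.
CATEGORY_RULES = {
    "C": "leather",
    "C69": "living room leather",
}


def categorize(item_code, rules=CATEGORY_RULES):
    """Return the label of the longest prefix that matches item_code, or None."""
    matches = [prefix for prefix in rules if item_code.startswith(prefix)]
    if not matches:
        return None
    return rules[max(matches, key=len)]
```

With these rules, a code such as `C6901` resolves to the more specific sub-category, while `C12` falls back to the general leather category.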

Secondly, the customer clustering produced by the machine learning module yields highly generic labels with no inherent meaning, e.g. Profile A (age 20-40, stays in Bedok, Changi, or Punggol, house type tends to be condo). The user can rename each label based on their own definition, e.g. White Collar East Sider.

Analyteaka Bootstrap.png

A new sales system, meant to replace Scanteak’s legacy system, is currently being developed by Scanteak’s in-house developers. As the new sales system is still under development, the data upload module will allow the user to upload the customer data exported from the new sales system as CSV files. Once the new sales system has been completed, the manual uploading of CSV files will be phased out and the data upload module will be modified to interact with the new sales system directly through API calls.

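Furthermore, the new system will allow for a higher level of data quality, which should lead to better predictive analytics results (see https://hbr.org/2018/04/if-your-data-is-bad-your-machine-learning-tools-are-useless).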

Steps for data uploading

1. Clean the data by removing duplicate rows (double entries) and invalid rows
2. Infer the required columns:

  • Gender and ethnicity based on first name and last name
  • Residential district/housing type/house value based on postal code
  • Age and citizenship based on NRIC

3. Convert the rows into datastore request objects
4. Upload the data to the datastore
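
Steps 1 and 2 above could look roughly like the sketch below, which uses only the Python standard library and stops before the datastore steps. Everything here is an assumption for illustration: the column names (`nric`, `postal_code`), the sector lookup table, and the inference rules. In particular, it assumes that S/T-prefixed NRICs carry a two-digit birth year only for holders born from 1968 onwards, and that the first two digits of a postal code identify the postal sector.

```python
import csv
import io
from datetime import date

# Illustrative placeholder only: a real table would cover every postal sector.
SECTOR_TO_DISTRICT = {"46": "example district for sector 46"}


def infer_from_nric(nric, today=None):
    """Guess approximate age and residency status from an NRIC/FIN string."""
    today = today or date.today()
    nric = nric.strip().upper()
    prefix, digits = nric[:1], nric[1:3]
    if prefix in ("F", "G", "M"):
        # Assumed: FIN holders; the digits do not encode a birth year.
        return {"age": None, "status": "foreigner"}
    if prefix not in ("S", "T") or not digits.isdigit():
        return {"age": None, "status": None}
    if prefix == "S" and int(digits) < 68:
        # Assumed: numbers issued to holders born before 1968 carry no birth year.
        return {"age": None, "status": "citizen_or_pr"}
    birth_year = (1900 if prefix == "S" else 2000) + int(digits)
    return {"age": today.year - birth_year, "status": "citizen_or_pr"}


def clean_rows(rows):
    """Drop exact duplicates and rows missing an NRIC or postal code."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # assumes well-formed rows
        if key in seen or not row.get("nric") or not row.get("postal_code"):
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def prepare(csv_text, today=None):
    """Step 1 (cleaning) followed by step 2 (inferring extra columns)."""
    rows = clean_rows(list(csv.DictReader(io.StringIO(csv_text))))
    for row in rows:
        row.update(infer_from_nric(row["nric"], today))
        row["district"] = SECTOR_TO_DISTRICT.get(row["postal_code"][:2])
    return rows


# Synthetic sample data, not real customer records.
csv_text = """name,nric,postal_code
Customer A,S8512345A,460123
Customer A,S8512345A,460123
Customer B,,460123
"""
```

Passing a fixed `today` to `prepare` keeps the inferred age reproducible. Because the NRIC encodes only a birth year, the age is approximate.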

Analyteaka data visualisation.png

This module will make use of data upload, Flask, and Dash by Plotly to generate charts based on the output generated by the customer and store profiling modules.

Analyteaka Marketing planning.png

This module helps management plan upcoming marketing campaigns and flash sales.

Based on the user's input of stores, item categories, date range, payment methods, target segmentation, and days of the week, the module will recommend item pairings for cross-selling and the customer clusters or stores to focus on. This will reduce the time needed to decide on the demographics for targeted Facebook ads or to plan the items or locations for an upcoming sales campaign.


Analyteaka machine learning.png

This module will contain the machine learning system. As we are using Python as our main programming language, we will be utilizing libraries such as SciPy, NumPy, matplotlib, pandas, and scikit-learn to help us complete this module. Using the training dataset we have prepared (6 months’ worth of offline retail data), we will train the system to provide predictive analytics for both customers and stores.

The entire process can be automated: the system retrieves raw sales data from the in-house sales system, uploads it to the server, processes the raw sales and payment data in the machine learning module, and returns the results to the database.

Examples of predictive analytics:

  • Recommended products for different customer profiles
  • Recommended price range for different customer profiles
  • Recommended products to be displayed for different stores

Examples of business questions that will be answered:

  • What kind of furniture (based on price range) should you recommend to a certain customer profile (e.g. 30-year-old Chinese male at the Suntec branch)?
  • What kind of customers (e.g. Chinese) are you expecting at a certain branch (e.g. Suntec) and what kind of furniture (e.g. Oriental-style) should you display?