ANLY482 AY2016-17 T2 Group4: Project Overview

From Analytics Practicum




Motivation & Business Problem

INTENT
Improve the transparency of information useful in identifying a seller’s performance to customers and sellers.

PROBLEM
- Customers aren’t able to identify which are the best sellers to purchase products from.
- The characteristics of sellers that matter to a customer aren’t clearly defined to sellers who want to manage and improve their performance.


Project Objectives

We will identify the critical features that allow sellers to measure and manage their performance on Lazada’s platform. These features will then be exposed to customers to help them identify the best sellers to purchase from.

Constraints

Production ready: the data pipeline must run within 3 hours on 16 GB of RAM.

Project Details

System Architecture

LazadaSA.png


Predictive Variables (Seller Attributes)

- Shipping Time
- Pricing
- Return Rate
- Seller Initiated Cancellation Rate
- Seller Category (e.g. home & living, fashion, multi-category sellers)
- Size of Seller
- Seller’s Years of Experience on Lazada

Response Variables (Seller Performance Metrics)

- Total purchases made per sales item
- Product Popularity Ratio (PPR) = Total Purchases / Distinct Count of Products
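As an illustration of the PPR formula, the following is a minimal sketch in Python and not the project's actual pipeline. It assumes that "total purchases" means units purchased and that PPR is computed per seller. The function name `product_popularity_ratio`, the `(seller_id, product_id, quantity)` record layout and the sample rows are all hypothetical.

```python
from collections import defaultdict


def product_popularity_ratio(order_lines):
    """Return {seller_id: PPR} for an iterable of order lines.

    Each order line is a (seller_id, product_id, quantity) tuple.
    This layout is an assumption made for the example only.
    """
    total_purchases = defaultdict(int)    # units purchased per seller
    distinct_products = defaultdict(set)  # distinct product ids per seller

    for seller_id, product_id, quantity in order_lines:
        total_purchases[seller_id] += quantity
        distinct_products[seller_id].add(product_id)

    # PPR = Total Purchases / Distinct Count of Products
    return {
        seller_id: total_purchases[seller_id] / len(distinct_products[seller_id])
        for seller_id in total_purchases
    }


# Made-up sample data, for illustration only.
sample_order_lines = [
    ("seller_a", "p1", 3),
    ("seller_a", "p2", 1),
    ("seller_a", "p1", 2),
    ("seller_b", "p9", 4),
]

ppr = product_popularity_ratio(sample_order_lines)
```

Here `seller_a` has 6 units across 2 distinct products and `seller_b` has 4 units across 1 product, so a seller with a small but well-selling catalogue can score higher than one with more total sales spread over more products.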

Data Source


Sensitive Data (Not to be revealed)


Methodology

LazadaMethodology.png


Data Collection

This step builds the data extraction pipeline from the Lazada database and Google Analytics. The challenge is to pull quality data from relevant, up-to-date sources.

Data Exploration and Cleaning

We will carry out exploratory analysis of the data. The analysis will be used to refine the business questions, which in turn guide further exploratory analysis. This process will be repeated, with the necessary data cleaning and munging, until we arrive at business questions that accurately express business needs given the data and the exploratory analysis.

Data Modelling

After a proper exploratory phase of the analysis, we will train and test machine learning models to answer predictive and prescriptive business questions. This will include processes such as clustering to segment user behaviours and regression to estimate the impact of various seller attributes on CX metrics. Statistical learning techniques such as Random Forest and regularization might also be used to reduce the risk of overfitting and improve the test accuracy of the models.
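To make the regularization idea concrete, here is a minimal sketch of ridge (L2-regularized) regression with a single predictor, written in plain Python. It is an illustration under stated assumptions, not the model the team built: the choice of shipping time as predictor and PPR as response, the function name `ridge_fit`, the penalty values and the data points are all hypothetical.

```python
def ridge_fit(xs, ys, penalty):
    """Fit y = intercept + slope * x with an L2 penalty on the slope.

    Minimises sum((y - intercept - slope * x) ** 2) + penalty * slope ** 2.
    With one predictor and an unpenalised intercept the solution is
    slope = Sxy / (Sxx + penalty). penalty = 0 gives ordinary least squares.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: shipping time in days vs. a seller performance metric.
shipping_days = [1, 2, 3, 4, 5]
performance = [10, 8, 7, 4, 3]

ols_slope, ols_intercept = ridge_fit(shipping_days, performance, penalty=0.0)
ridge_slope, ridge_intercept = ridge_fit(shipping_days, performance, penalty=10.0)
```

The penalised slope is pulled towards zero relative to the unpenalised one, which is how regularization trades a little bias for lower variance. In practice the penalty would be chosen by cross-validation on held-out data.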

Data Visualization

The analysis will be documented visually in Jupyter Notebook or interactive dashboard tools, and later demonstrated and presented to business users such as Lazada suppliers and internal teams. Presentation techniques such as storyboarding and the Pyramid Principle (Barbara Minto) might also be used to ensure the presentation matches the findings to business needs.