Difference between revisions of "ANLY482 AY2017-18T2 Group 25 : Project Overview / Methodology"

Revision as of 15:58, 10 April 2018

Data

Data Used (only dataset titles are listed, to maintain confidentiality):

Reservation Information

User Information

EDM Campaigns

User Email Activity

Tools Used

Methodology

Discovery

Data Preparation and Cleaning

Four raw datasets (>10GB), consisting of customer reservations and Electronic Direct Mailer (EDM) interaction data for 2017, were used for analysis. Data cleaning was done in Jupyter Notebook because of the size of the raw data files received, while Exploratory Data Analysis (EDA) was done using both Jupyter Notebook and JMP.
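As a rough illustration of how files of this size can be cleaned in a notebook without loading them whole, the sketch below streams a CSV row by row using only the Python standard library. The column names (reservation_id, user_id, party_size) and the cleaning rules (trim whitespace, drop rows with missing keys, drop duplicate reservations) are assumptions made for the example; the actual fields are confidential and the actual cleaning steps are not described here.

```python
import csv
import io


def clean_rows(lines, required=("reservation_id", "user_id")):
    """Yield cleaned rows one at a time so the full file never sits in memory."""
    seen = set()  # IDs already emitted; grows with the number of unique IDs
    for raw in csv.DictReader(lines):
        # Trim whitespace; treat missing fields (None) as empty strings.
        row = {k: (v or "").strip() for k, v in raw.items() if k is not None}
        # Drop rows that lack any required (hypothetical) key column.
        if any(not row.get(col) for col in required):
            continue
        # Drop repeated reservations, keeping the first occurrence.
        if row["reservation_id"] in seen:
            continue
        seen.add(row["reservation_id"])
        yield row


# Tiny in-memory stand-in for a large raw file (made-up values).
sample = io.StringIO(
    "reservation_id,user_id,party_size\n"
    "r1,u1,2\n"
    "r1,u1,2\n"
    "r2,,4\n"
    " r3 ,u2,3\n"
)
cleaned = list(clean_rows(sample))
```

With a real file, the same generator could be fed an open file handle instead of the in-memory sample, and the cleaned rows written out incrementally rather than collected into a list.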

Exploratory Data Analysis (EDA)

Initial analysis was carried out separately on the customer reservations and the EDM data to understand each on its own, before the two datasets were joined to track the conversion rate of EDMs to reservations.
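The join described above can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual definition: a send is counted as converted if the same user makes a reservation within a fixed window after the email date. The window length, the function name conversion_rates and the record layouts are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def conversion_rates(sends, reservations, window_days=7):
    """Return {campaign_id: share of sends followed by a reservation in the window}."""
    # Index reservation dates by user so each send needs only one lookup.
    by_user = defaultdict(list)
    for user_id, reserved_on in reservations:
        by_user[user_id].append(reserved_on)

    window = timedelta(days=window_days)
    sent = defaultdict(int)
    converted = defaultdict(int)
    for campaign_id, user_id, sent_on in sends:
        sent[campaign_id] += 1
        # Assumed rule: any reservation from the send date up to window_days later.
        if any(sent_on <= d <= sent_on + window for d in by_user.get(user_id, [])):
            converted[campaign_id] += 1
    return {c: converted[c] / sent[c] for c in sent}


# Made-up records: (campaign_id, user_id, send date) and (user_id, reservation date).
sends = [
    ("c1", "u1", date(2017, 3, 1)),
    ("c1", "u2", date(2017, 3, 1)),
    ("c2", "u1", date(2017, 6, 1)),
    ("c2", "u3", date(2017, 6, 1)),
]
reservations = [
    ("u1", date(2017, 3, 4)),
    ("u2", date(2017, 4, 20)),
]
rates = conversion_rates(sends, reservations)
print(rates)  # {'c1': 0.5, 'c2': 0.0}
```

Only u1's reservation on 4 March falls within seven days of a send, so campaign c1 converts one of its two sends and c2 converts none.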

Text Mining and Logistic Regression Analysis

Text cleaning and analysis were done using JMP’s built-in Text Explorer. A stepwise logistic regression model was used to explain the relationship between the words used and the conversion rate of each campaign.
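The modelling itself was done in JMP, so the following is only a sketch of the general idea of forward stepwise selection, written in plain Python. It assumes each campaign is reduced to binary word indicators and a binary outcome (1 for a campaign labelled as high conversion), and it adds a word only when the log-likelihood improves by more than 1.92, which corresponds to a likelihood-ratio test at the 5% level with one degree of freedom. The data, the outcome coding and the stopping rule are all assumptions and may differ from what JMP applied.

```python
import math


def fit_logistic(rows, y, steps=2000, lr=0.5):
    """Fit a logistic model (intercept first) by gradient ascent; return (weights, log-likelihood)."""
    rows = [[1.0] + list(r) for r in rows]
    w = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for r, t in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, r))))
            for j, a in enumerate(r):
                grad[j] += (t - p) * a
        w = [wj + lr * g / len(rows) for wj, g in zip(w, grad)]
    ll = 0.0
    for r, t in zip(rows, y):
        p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, r))))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return w, ll


def forward_stepwise(X, y, names, min_gain=1.92):
    """Add one word at a time while the best candidate improves the log-likelihood enough."""
    selected = []
    _, current_ll = fit_logistic([[] for _ in X], y)  # intercept-only model
    while True:
        best = None
        for j in range(len(names)):
            if j in selected:
                continue
            cols = selected + [j]
            _, ll = fit_logistic([[row[c] for c in cols] for row in X], y)
            if best is None or ll > best[1]:
                best = (j, ll)
        if best is None or best[1] - current_ll < min_gain:
            break
        selected.append(best[0])
        current_ll = best[1]
    return [names[j] for j in selected]


# Made-up data: one row per campaign, columns flag whether a word appears.
words = ["free", "dinner", "hello"]
X = [
    [1, 1, 1], [1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1],
    [0, 1, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 0], [1, 0, 1],
]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = high-conversion campaign (assumed coding)
selected = forward_stepwise(X, y, words)
print(selected)  # ['free']
```

In this toy data only "free" is associated with the outcome, so it is the single word retained; the other two words add essentially nothing once it is in the model.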


@@ Line 38: / Line 38: @@
 ==<div style="background: #404040; padding: 15px; font-weight: bold; line-height: 0.3em; text-indent: 15px; font-size: 16px"><font color=#ffffff >Tools Used</font></div>==
-In_progress
+[[File:Group25ToolsUsed.png|240px]]
+<br>
 ==<div style="background: #404040; padding: 15px; font-weight: bold; line-height: 0.3em; text-indent: 15px; font-size: 16px"><font color=#ffffff >Methodology</font></div>==
 <b>Discovery</b><br/>
-In_progress
+Data Preparation and Cleaning
+4 raw datasets (>10GB), consisting of customer reservations and Electronic Direct Mailer (EDM) interactions data for the year of 2017 were used for analysis
+Data cleaning was done in Jupyter Notebook due to the size of raw data files received, while Exploratory Data Analysis (EDA) was done using both Jupyter Notebook and JMP
+Exploratory Data Analysis (EDA)
+Initial data analysis was carried out separately to analyse the individual situation for both customer reservations and EDM, before joining both datasets to track the conversion rates of EDM to reservations
+Text Mining and Logistic Regression Analysis
+Text cleaning and analysis were done using JMP’s built-in text explorer
+A stepwise Logistic Regression model was used to explain the relationship of the words used with the conversion rate of each campaign