ANLY482 AY2016-17 T2 Group10 Analysis & Findings: Analysis

From Analytics Practicum
Revision as of 13:27, 15 April 2017 by Jxsim.2013


ACTUAL METHOD: Analysis of Variance (ANOVA) using Fit Y by X

Analysis of Variance (ANOVA) is a statistical method for analyzing differences among group means by comparing the variation between groups with the variation within groups. It is a form of statistical hypothesis testing that tests whether the differences among group means are significant.
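To make the comparison of between-group and within-group variation concrete, the following is a minimal sketch of the one-way ANOVA F statistic in plain Python. The function name `one_way_anova_f` and the sample values are hypothetical and used for illustration only; they are not project data, and the project itself ran ANOVA through Fit Y by X rather than through this code.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)          # total number of observations
    k = len(groups)              # number of groups
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each observation around its own group mean
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups (illustration only)
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(example_groups)
print(f_stat, df_between, df_within)
```

A large F value means the group means differ by more than the spread within the groups would suggest. The p-value is then read from the F distribution with the two degrees of freedom returned here.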

Prior to using ANOVA, we attempted to use linear regression to generalize the relationship between the number of interactions and sales revenue. However, the low R-squared values obtained suggested a weak correlation and a model that did not fit the data well. This prompted us to carry out a similar analysis using ANOVA, which compares mean sales revenue across groups of interaction counts instead of fitting a straight line.
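For reference, R-squared for a simple linear regression can be computed as sketched below. This is an illustrative implementation in plain Python, not the tool used in the project; the function name `r_squared` and the sample numbers are hypothetical.

```python
def r_squared(x, y):
    """R-squared of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares slope and intercept
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Share of the variation in y explained by the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical interaction counts and revenue figures (illustration only)
interactions = [1, 2, 3, 4]
revenue = [1, -1, -1, 1]
print(r_squared(interactions, revenue))
```

A value near 0, as in this made-up example, means the fitted line explains almost none of the variation in the response.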

The first step in carrying out ANOVA is to discretize our explanatory variable, “interaction count”, into bins, converting it from a numerical to a categorical variable. We discretize because we wish to understand whether sales revenue (the response) differs significantly between these interaction bins. To define the range of interaction counts for the “Low”, “Medium” and “High” bins, we consulted our sponsor, who proposed that “Low” covers interaction counts less than or equal to 1, “Medium” covers interaction counts from 2 to 4, and “High” covers interaction counts of 5 and above.
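The sponsor's bin definitions can be expressed as a small mapping function. This is a minimal sketch in Python; the function name `interaction_bin` and the sample counts are hypothetical, and the check for negative counts is our own assumption rather than something the sponsor specified.

```python
def interaction_bin(count):
    """Map an interaction count to the sponsor-defined bin label."""
    if count < 0:
        # Assumption: counts are non-negative, so a negative value is an error
        raise ValueError("interaction count cannot be negative")
    if count <= 1:
        return "Low"      # interaction count of 1 or less
    if count <= 4:
        return "Medium"   # interaction count from 2 to 4
    return "High"         # interaction count of 5 and above


# Hypothetical interaction counts (illustration only)
sample_counts = [0, 1, 2, 4, 5, 12]
sample_bins = [interaction_bin(c) for c in sample_counts]
print(sample_bins)
```

The resulting labels form the categorical explanatory variable whose groups are then compared on sales revenue.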