Difference between revisions of "AY1516 T2 Team13 Natasha Studio Findings RuleMining"

From Analytics Practicum

Revision as of 13:31, 17 April 2016


Process Flow

Process Flow of ARM Analysis

Using SAS® Enterprise Miner 12.1, we performed association rule mining (ARM) with the Association node. “Member” served as the ID variable, and we ran the analysis against three different Target variables in turn: 1) “Price Package”, 2) “Package (Genre)” and 3) “Course/Open & Level”. This allowed us to identify key associations between the different packages that customers purchase.
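To make the measures behind such rules concrete, the following is a minimal Python sketch of how support, confidence and lift can be computed for single-item rules. It is an illustration only and does not reproduce the internals of the Association node; the `purchases` records and the `pairwise_rules` helper are hypothetical and are not the project's data or code.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical (member, package) purchase records; not the project's data.
purchases = [
    ("M1", "A"), ("M1", "B"),
    ("M2", "A"), ("M2", "B"),
    ("M3", "A"),
    ("M4", "C"),
]

def pairwise_rules(records):
    """Return {(lhs, rhs): (support, confidence, lift)} for rules lhs -> rhs."""
    # Group items into one basket per member (the ID variable).
    baskets = defaultdict(set)
    for member, item in records:
        baskets[member].add(item)
    n = len(baskets)

    item_count = defaultdict(int)   # members who bought each item
    pair_count = defaultdict(int)   # members who bought both items
    for items in baskets.values():
        for item in items:
            item_count[item] += 1
        for lhs, rhs in permutations(sorted(items), 2):
            pair_count[(lhs, rhs)] += 1

    rules = {}
    for (lhs, rhs), both in pair_count.items():
        support = both / n                        # share of members with both
        confidence = both / item_count[lhs]       # P(rhs | lhs)
        lift = confidence / (item_count[rhs] / n) # confidence vs. baseline P(rhs)
        rules[(lhs, rhs)] = (support, confidence, lift)
    return rules

rules = pairwise_rules(purchases)
```

In this toy data, every member who bought B also bought A, so the rule B -> A has a confidence of 1, while A -> B is weaker because one member bought A alone.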

Because of a gap in our data for 2013, we first split our preliminary analysis into two periods: 2010-2012 and 2014-2015. This allowed us to see whether the associations differ between the two periods and whether our subsequent analysis needed to be split as well. Preliminary results showed that the associations discovered do differ between the two periods. Consequently, we proceeded to analyze them separately.
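The period comparison can be sketched in Python as follows. This is a simplified illustration under assumptions: the `records` rows, the `PERIODS` mapping, the `frequent_pairs` helper and the `min_count` threshold are all hypothetical, and co-purchased pairs stand in for the full rule sets produced by SAS Enterprise Miner.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (member, package, year) records; not the project's data.
records = [
    ("M1", "A", 2010), ("M1", "B", 2011),
    ("M2", "A", 2012), ("M2", "B", 2012),
    ("M3", "A", 2014), ("M3", "C", 2015),
    ("M4", "A", 2015), ("M4", "C", 2015),
]

# The two analysis periods on either side of the 2013 data gap.
PERIODS = {"2010-2012": range(2010, 2013), "2014-2015": range(2014, 2016)}

def frequent_pairs(rows, min_count=2):
    """Return the item pairs bought together by at least min_count members."""
    baskets = defaultdict(set)
    for member, item, _year in rows:
        baskets[member].add(item)
    counts = defaultdict(int)
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair for pair, c in counts.items() if c >= min_count}

# Mine each period separately, then compare the resulting pair sets.
by_period = {
    name: frequent_pairs([r for r in records if r[2] in years])
    for name, years in PERIODS.items()
}
only_early = by_period["2010-2012"] - by_period["2014-2015"]
only_late = by_period["2014-2015"] - by_period["2010-2012"]
```

Non-empty `only_early` or `only_late` sets indicate associations that appear in one period but not the other, which is the kind of difference that would justify analyzing the periods separately.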

We also applied sequence discovery to enhance our ARM model. Adding “Date Purchased” as a Sequence variable takes the time of purchase into account. This enhancement is necessary for our model because customers typically do not buy more than one package at a time; instead, they buy one package, use it and then buy another. Hence, we believe that the findings from sequence discovery are more applicable.
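The effect of adding a sequence variable can be illustrated with a short Python sketch. It counts, per member, which package was bought strictly before which other package. The `purchases` records and the `sequence_support` helper are hypothetical, and the sketch assumes a simple definition of sequence support (the share of members showing the ordered pair), which may differ from the definition SAS Enterprise Miner uses.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (member, package, date purchased) records; not the project's data.
purchases = [
    ("M1", "A", date(2014, 1, 5)), ("M1", "B", date(2014, 3, 1)),
    ("M2", "A", date(2014, 2, 10)), ("M2", "B", date(2014, 6, 20)),
    ("M3", "B", date(2014, 1, 15)), ("M3", "A", date(2014, 4, 2)),
]

def sequence_support(records):
    """Return {(first, second): share of members who bought first before second}."""
    history = defaultdict(list)
    for member, item, when in records:
        history[member].append((when, item))

    counts = defaultdict(int)
    for events in history.values():
        events.sort()  # chronological order per member
        seen = set()   # count each ordered pair once per member
        for i, (d1, first) in enumerate(events):
            for d2, second in events[i + 1:]:
                if d2 > d1 and first != second:
                    seen.add((first, second))
        for pair in seen:
            counts[pair] += 1

    n = len(history)
    return {pair: c / n for pair, c in counts.items()}

seq = sequence_support(purchases)
```

A plain association analysis would treat all three members here as having bought A and B together. The sequence view separates the two members who moved from A to B from the one who moved from B to A.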

Calibration of ARM Analysis

The above shows the final calibration of our model. It was designed to give us the most useful set of rules.

Comparing our association results with our sequence results, we find that focusing on the sequence results is sufficient, as the same rules are generally flagged under both analyses. The key difference, as mentioned, is that the sequence analysis takes the time factor into account. As such, we proceeded to focus on our sequence analysis.


Next Steps

The next step would be to further calibrate the model by adjusting maximum items and other parameters. At this point, time has not been taken into account, so our team is also looking to perform sequence discovery in SAS EM using “Date Purchased”. We would also analyse other target variables, such as “Package (Genre)” and “Course/Open & Level”, against the MemberIDs.