AY1516 T2 Team13 Natasha Studio Findings RuleMining
Process Flow
Using SAS® Enterprise Miner 12.1, we performed association rule mining (ARM) with the Association node, setting “Member” as the ID variable and running it against three different Target variables: 1) “Price Package”, 2) “Package (Genre)” and 3) “Course/Open & Level”. This allowed us to identify key associations between the different packages that customers purchase.
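To make the rule measures concrete, here is a minimal Python sketch of pairwise association rules grouped by an ID variable. It is not the SAS Enterprise Miner Association node and does not reproduce its settings; the function name `mine_pair_rules`, the thresholds and the sample package labels are all hypothetical, chosen only to show how support, confidence and lift are derived once purchases are grouped per member.

```python
from collections import defaultdict
from itertools import permutations


def mine_pair_rules(records, min_support=0.0, min_confidence=0.0):
    """Mine one-to-one rules (lhs => rhs) from (member_id, item) records.

    Each member's purchases form one basket, mirroring the use of
    "Member" as the ID variable and a package field as the target.
    """
    baskets = defaultdict(set)
    for member_id, item in records:
        baskets[member_id].add(item)

    n = len(baskets)
    item_count = defaultdict(int)   # members whose basket contains the item
    pair_count = defaultdict(int)   # members whose basket contains both items

    for items in baskets.values():
        for item in items:
            item_count[item] += 1
        for lhs, rhs in permutations(sorted(items), 2):
            pair_count[(lhs, rhs)] += 1

    rules = []
    for (lhs, rhs), count in pair_count.items():
        support = count / n                         # P(lhs and rhs)
        confidence = count / item_count[lhs]        # P(rhs | lhs)
        lift = confidence / (item_count[rhs] / n)   # confidence / P(rhs)
        if support >= min_support and confidence >= min_confidence:
            rules.append({"lhs": lhs, "rhs": rhs, "support": support,
                          "confidence": confidence, "lift": lift})
    return rules


# Hypothetical sample data (not taken from the project's dataset).
sample_records = [
    ("M1", "Package A"), ("M1", "Package B"),
    ("M2", "Package A"), ("M2", "Package B"),
    ("M3", "Package A"),
    ("M4", "Package C"),
]
rules = mine_pair_rules(sample_records, min_support=0.25, min_confidence=0.5)
for rule in rules:
    print(rule)
```

A lift above 1 suggests that the two packages are bought by the same member more often than their individual popularity would predict, which is the kind of rule worth reviewing.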
Because our data has a gap in 2013, we first split our preliminary analysis into two periods: 2010-2012 and 2014-2015. This let us check whether the associations differ between the two periods and whether our subsequent analysis needs to be split as well. Preliminary results showed that the associations discovered do differ between the two periods. Consequently, we proceeded to analyze them separately.
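The period split can be sketched in Python as follows. This is only an illustration under assumed names: the `date_purchased` key, the `PERIODS` mapping and the sample rows are hypothetical and do not reflect the actual column names or records in our database.

```python
from datetime import date

# Inclusive year ranges for the two analysis periods; 2013 is deliberately
# left out because of the data gap.
PERIODS = {"2010-2012": (2010, 2012), "2014-2015": (2014, 2015)}


def split_by_period(purchases, periods=PERIODS):
    """Group purchase records by the period their purchase year falls in.

    Records whose year matches no period (for example 2013) are dropped.
    """
    buckets = {name: [] for name in periods}
    for purchase in purchases:
        year = purchase["date_purchased"].year
        for name, (start, end) in periods.items():
            if start <= year <= end:
                buckets[name].append(purchase)
                break
    return buckets


# Hypothetical sample rows (not taken from the project's dataset).
sample_purchases = [
    {"member": "M1", "item": "Package A", "date_purchased": date(2011, 3, 1)},
    {"member": "M2", "item": "Package B", "date_purchased": date(2013, 6, 1)},
    {"member": "M3", "item": "Package A", "date_purchased": date(2014, 9, 1)},
    {"member": "M3", "item": "Package B", "date_purchased": date(2015, 1, 1)},
]
buckets = split_by_period(sample_purchases)
for name, rows in buckets.items():
    print(name, len(rows))
```

Each bucket can then be mined on its own, so that the rule sets for the two periods can be compared side by side.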
We also applied sequence discovery to enhance our ARM model. By adding “Date Purchased” as a Sequence variable, the time of purchase is taken into account. We find this enhancement necessary because customers typically do not buy more than one package at the same time. Instead, they buy one package, use it and then buy another. Hence, we believe the findings from sequence discovery are more applicable.
The above shows the final calibration of our model, which was designed to give us the most useful set of rules.
Comparing our association results with our sequence results, we find that focusing on the sequence results is sufficient because, in general, the same rules are flagged under both analyses. The key difference, as mentioned, is that sequence discovery takes the time factor into account. As such, we proceeded to focus on our sequence analysis.
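The difference that the Sequence variable makes can be illustrated with a small Python sketch. It is a simplified stand-in for sequence discovery, not the SAS Enterprise Miner implementation: the function `mine_sequence_rules`, the threshold and the sample data are hypothetical. A rule "first => then" is counted for a member only when "then" was purchased on a strictly later date than "first".

```python
from collections import defaultdict
from datetime import date


def mine_sequence_rules(purchases, min_support=0.0):
    """Mine ordered rules from (member_id, date_purchased, item) records."""
    history = defaultdict(list)
    for member_id, purchased_on, item in purchases:
        history[member_id].append((purchased_on, item))

    n = len(history)
    first_count = defaultdict(int)  # members who bought the item at all
    seq_count = defaultdict(int)    # members who bought `first`, later `then`

    for events in history.values():
        events.sort()  # chronological order within each member
        ordered_pairs = set()
        for i, (earlier_date, earlier_item) in enumerate(events):
            for later_date, later_item in events[i + 1:]:
                # Same-day purchases carry no ordering information, so skip.
                if later_date > earlier_date:
                    ordered_pairs.add((earlier_item, later_item))
        for item in {item for _, item in events}:
            first_count[item] += 1
        for pair in ordered_pairs:
            seq_count[pair] += 1

    rules = {}
    for (first, then), count in seq_count.items():
        support = count / n
        if support >= min_support:
            rules[(first, then)] = {
                "support": support,
                "confidence": count / first_count[first],
            }
    return rules


# Hypothetical sample data (not taken from the project's dataset).
sample_purchases = [
    ("M1", date(2014, 1, 10), "Package A"), ("M1", date(2014, 6, 1), "Package B"),
    ("M2", date(2014, 2, 1), "Package A"), ("M2", date(2014, 9, 15), "Package B"),
    ("M3", date(2014, 3, 1), "Package B"), ("M3", date(2014, 8, 1), "Package A"),
    ("M4", date(2015, 1, 5), "Package A"),
]
sequence_rules = mine_sequence_rules(sample_purchases, min_support=0.5)
for (first, then), stats in sequence_rules.items():
    print(first, "=>", then, stats)
```

In this sample, plain association would treat Package A and Package B as a single unordered pair, whereas the sequence view separates "A then B" from "B then A" and keeps only the direction that meets the support threshold.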
Next Steps
The next step would be to further calibrate the model by adjusting maximum items and other parameters. Also, at this point, time has not been taken into account. Thus, our team is looking to perform sequence discovery in SAS EM using “Date Purchased” as well. We will also analyse other target variables, such as “Package (Genre)” and “Course/Open & Level”, against the Member IDs.