Difference between revisions of "ISSS608 2016-17 T3 Group8 Arules Project Proposal"

From Visual Analytics and Applications

Revision as of 00:02, 7 August 2017

A Visual Application for Better Business Decision Making


 


Current Packages

There are two core packages used in our application, both of which are under the “arules” family.

arules:

“arules” is the foundation on which we built this application. It lets users apply association rule mining algorithms to transaction data, or to any other data that meets certain requirements. It is strong at manipulating and transforming data, pruning redundant rules, and filtering the generated association rules. Users can filter rules by setting thresholds for support, confidence, and lift, or by specifying the antecedent and consequent, and can sort the rules by support, confidence, or lift.
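To make the three measures concrete, here is a minimal sketch of how support, confidence, and lift can be computed for a single rule. It is written in plain Python rather than with “arules” itself, and the toy transactions and the function names `support` and `rule_metrics` are assumptions made for illustration only, not part of either package.

```python
from fractions import Fraction

# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both sides occur at least once, so no division by zero.
    """
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return both, confidence, lift


# Rule: {bread} -> {milk}
sup, conf, lift = rule_metrics({"bread"}, {"milk"}, transactions)
print(sup, conf, lift)  # 3/5 3/4 15/16
```

A lift below 1, as in this toy case, means the antecedent makes the consequent slightly less likely than its baseline frequency. Filtering by thresholds then amounts to keeping only the rules whose three values exceed the chosen minimums.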

arulesViz:

“arulesViz” is an R package that offers a variety of visualizations for association rules, including scatter plots, matrix-based and grouped matrix-based visualizations, graph-based visualizations, parallel coordinates plots, and double-decker plots. This diversity makes it the most popular R package for visualizing association rules. One drawback is that these visualizations are mostly static graphs that offer little interactivity.

Motivation

Association Rule Mining is Powerful
Although association rule mining is most often applied in market basket analysis to mine relationships between products, it is a powerful technique that can be applied to many kinds of dataset to discover associations between variables (co-occurrence patterns, which do not by themselves establish causation). Given a target variable of interest, we can find the association rules relevant to it within the dataset, even when the data is not transaction data.


However, not every dataset is ready for this kind of mining in R. Continuous variables, for example, are not easy to handle, which creates a barrier for users stepping into data analytics. We intend to develop an association rule mining application that non-statisticians can easily understand, explore, and draw insights from, and that introduces them to data analytics through a fundamental yet powerful concept: probability.
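One common way to handle a continuous variable is to bin it into labelled ranges, so that each row of a table becomes a set of "column=value" items. The sketch below illustrates the idea in plain Python. The cut points, labels, column names, and the helpers `discretize` and `row_to_items` are hypothetical choices made for this example, not the method used in our application.

```python
import bisect


def discretize(value, breaks, labels):
    """Map a number to a bin label.

    `breaks` are ascending cut points; a value equal to a cut point
    falls into the bin above it. There must be one more label than breaks.
    """
    if len(labels) != len(breaks) + 1:
        raise ValueError("need exactly len(breaks) + 1 labels")
    return labels[bisect.bisect_right(breaks, value)]


def row_to_items(row, bins):
    """Turn one table row (a dict) into a set of 'column=value' items."""
    items = set()
    for column, value in row.items():
        if column in bins:  # continuous column: replace the number by its bin
            breaks, labels = bins[column]
            value = discretize(value, breaks, labels)
        items.add(f"{column}={value}")
    return items


# Hypothetical rows and cut points, for illustration only.
rows = [
    {"age": 23, "segment": "student"},
    {"age": 41, "segment": "professional"},
    {"age": 67, "segment": "retired"},
]
bins = {"age": ([30, 50], ["young", "middle", "senior"])}

item_sets = [row_to_items(row, bins) for row in rows]
print(item_sets)
```

Each resulting item set can then be treated like a transaction. The choice of cut points affects which rules appear, so it is worth letting users adjust them.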

Room for Improvement of Current Packages

Current R packages for association rule mining are helpful for data analysts, but their limitations also make the analysis results difficult to interpret:

1) Static visualizations
Visualizations provided in current ARM packages are mostly static,

2) Lack of interactivity

3)