Group01 proposal
Abstract
Market Basket Analysis (MBA) is one of the key techniques used by large retailers to uncover associations between products bought by consumers. In our case, we apply a sequential version of MBA, called “sequential itemset mining” or “sequential pattern mining”, to analyze whether buying one item in the past indicates a higher likelihood of buying other items in the future. For instance, we ask whether a purchase of peanut butter makes a purchase of bread more likely in the near future.
Market Basket Analysis
What is Market Basket Analysis?
Market basket analysis is a data mining technique for finding association rules between objects in a set and for finding frequent patterns in a transaction database, relational database or any other information repository.
What are Association Rules?
Association rule mining identifies which items customers frequently buy together by generating a set of rules, called association rules, of the form "if this, then that".
How do we apply Market Basket Analysis?
Association rule mining is applied in marketing, basket data analysis (market basket analysis) in retailing, clustering and classification. In our project, we apply it specifically to market basket analysis.
Principal indicators of Association Rules:
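As an illustration, assuming the indicators meant here are the three commonly reported for a rule A => B (support, confidence and lift), the sketch below computes them on a handful of made-up baskets. The function name `rule_metrics` and the toy data are hypothetical; this is plain Python for exposition and does not use the arules API.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule antecedent => consequent.

    transactions: list of sets of item names (toy data, assumed for illustration).
    """
    n = len(transactions)
    a = frozenset(antecedent)
    b = frozenset(consequent)

    # Count baskets containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if a <= t)
    count_b = sum(1 for t in transactions if b <= t)
    count_ab = sum(1 for t in transactions if (a | b) <= t)

    support = count_ab / n                                # P(A and B)
    confidence = count_ab / count_a if count_a else 0.0   # P(B | A)
    lift = confidence / (count_b / n) if count_b else 0.0 # P(B | A) / P(B)
    return {"support": support, "confidence": confidence, "lift": lift}


# Made-up baskets, not taken from the Instacart data.
baskets = [
    {"peanut butter", "bread", "milk"},
    {"peanut butter", "bread"},
    {"bread", "milk"},
    {"milk"},
    {"peanut butter", "jam"},
]

metrics = rule_metrics(baskets, {"peanut butter"}, {"bread"})
print(metrics)
```

A lift above 1 suggests the two items appear together more often than they would if purchased independently; a lift below 1 suggests the opposite.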
Network Visualization Technique
Network visualization is a technique often used to show the relationships between different items. As the name suggests, it presents these relationships as a network, which makes it easier for users to understand how items are related to one another.
This visualization technique would complement the association rules mined from market basket analysis.
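To make the link between mined rules and a network diagram concrete, here is a minimal sketch that turns a list of rules into a node table and an edge table. The rule values are made up, the function name `rules_to_network` is hypothetical, and the `id`/`label` and `from`/`to`/`value` field names are an assumption chosen to resemble the node and edge layout that network libraries commonly accept.

```python
def rules_to_network(rules):
    """Convert single-item rules into node and edge lists.

    rules: list of dicts with keys 'lhs', 'rhs' and 'lift' (assumed layout).
    Returns (nodes, edges); each item becomes one node, each rule one directed edge.
    """
    nodes = {}
    edges = []
    for rule in rules:
        # Register each item once, assigning ids in order of first appearance.
        for item in (rule["lhs"], rule["rhs"]):
            if item not in nodes:
                nodes[item] = {"id": len(nodes) + 1, "label": item}
        # One directed edge per rule, weighted by lift.
        edges.append({
            "from": nodes[rule["lhs"]]["id"],
            "to": nodes[rule["rhs"]]["id"],
            "value": rule["lift"],
        })
    return list(nodes.values()), edges


# Made-up rules for illustration only.
sample_rules = [
    {"lhs": "peanut butter", "rhs": "bread", "lift": 1.11},
    {"lhs": "jam", "rhs": "bread", "lift": 1.25},
]

nodes, edges = rules_to_network(sample_rules)
print(nodes)
print(edges)
```

Items that appear in many rules end up with many edges, which is what makes hub products easy to spot in the resulting diagram.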
Use Case
To illustrate how the network visualization technique can complement the association rules derived from market basket analysis, we will use the Instacart dataset.
The data was downloaded from The Instacart Online Grocery Shopping Dataset 2017 in February 2020. The data dictionary can be found at this link.
About Data Set
The dataset contains 3,421,083 orders from 206,209 customers, covering 49,688 products in 21 major categories.
Application UI
Data loading and preprocessing
Generic market basket analysis by selecting a threshold
Rule network on categories or items of interest
Time series analysis of selected category or item (line chart or bar chart)
Show the top-ranked rules over time and display the rule between two items or categories over time
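As a rough sketch of what tracking a rule over time could involve, the code below computes the confidence of one rule separately for each time period. The period labels, the toy orders and the function name `confidence_by_period` are all assumptions made for illustration; how periods would actually be derived from the dataset is not specified here.

```python
from collections import defaultdict


def confidence_by_period(orders, antecedent, consequent):
    """Confidence of antecedent => consequent, computed per period.

    orders: iterable of (period, basket) pairs, where basket is a set of items.
    Periods in which the antecedent never occurs are omitted from the result.
    """
    # period -> [orders containing antecedent, orders containing both items]
    counts = defaultdict(lambda: [0, 0])
    for period, basket in orders:
        if antecedent in basket:
            counts[period][0] += 1
            if consequent in basket:
                counts[period][1] += 1
    return {period: both / with_a
            for period, (with_a, both) in sorted(counts.items())}


# Made-up orders labelled with a hypothetical period number.
toy_orders = [
    (1, {"peanut butter", "bread"}),
    (1, {"peanut butter"}),
    (1, {"milk"}),
    (2, {"peanut butter", "bread"}),
    (2, {"peanut butter", "bread", "milk"}),
    (3, {"bread"}),
]

series = confidence_by_period(toy_orders, "peanut butter", "bread")
print(series)
```

The resulting period-to-confidence mapping is the kind of series a line or bar chart of a single rule would plot.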
Objectives
- Display an overview of the relationships between categories using a network diagram
- Show product bundles that are expected to be consumed together, along with the key indicators of each
- Interactively visualize how these bundles evolve over time
Packages Used
Package | Description |
---|---|
arules | Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). |
arulesViz | Extends package 'arules' with various visualization techniques for association rules and item-sets. The package also includes several interactive visualizations for rule exploration. |
tidyverse | The tidyverse is an opinionated collection of R packages designed for data science. |
readr | Read rectangular text data such as CSV and TSV files. |
dplyr | A grammar of data manipulation. |
ggplot2 | Create data visualisations using the grammar of graphics. |
visNetwork | Create interactive network visualizations using the vis.js library. |
Storyboard
Team Members
References
- Market Basket Analysis using R.
- A Visual Application for Better Business Decision Making.
- A Gentle Introduction on Market Basket Analysis — Association Rules
- Tutorial: Sequential Pattern Mining in R for Business Recommendations (https://blog.revolutionanalytics.com/2019/02/sequential-pattern-mining-in-r.html)