Red Dot Payment - Mid-Term

From Analytics Practicum
Revision as of 20:14, 1 March 2018 by Ruizhi.ong.2014 (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

HOME

 

PROJECT OVERVIEW

 

PROJECT FINDINGS

 

PROJECT DOCUMENTATION

 

PROJECT MANAGEMENT

 

ANLY482 HOMEPAGE

Mid-Term Finals


Data Exploration

Due to the sensitivity of the data, please refer to the interim report for more details.


General Descriptive Statistics

  • We have identified the top merchants in terms of total number of approved transactions, and approved transaction monetary value.
  • Moving on, we also find out the top merchants contributing to the other types of transaction in each bin.



Line of Fit
As mentioned under ‘Objectives’, we want to visualise and compare the performance among merchants in the same bin. Initially, we wanted to use funnel plot to perform this visualisation. Funnel plots are a form of scatter plot in which number of approved transactions per merchant are plotted against number of transactions per merchant. Control limits, similar to confidence limits are then overlaid on the scatter plot. The control limits represent the expected variation in number of approved transactions assuming that the only source of variation is stochastic. Thus, those who lie outside of the control limits would allow us to identify the star and laggard merchants in the same bin.

However, we were unable to obtain proper model fit for the funnel plots for each bin, as seen in the example under Figure 6, where we plot the funnel plot for Bin/Group 5.

Figure7RDP.jpg
Figure 6: Funnel plot for Bin/Group 5s



Thus, we decided to use Line of Fit graphs instead. We plot the Line of Fit graphs for each bin, by comparing the number of approved transactions per merchant against total transaction per merchant in that bin. As shown in Line of Fit graphs in Figure 7a to 7e, we were able to get more proper model fit, and realised that merchants that lie outside the confidence interval can be grouped into over-performing (star) and under-performing (laggard) merchants. These merchants can be identified using the JMP’s brush tool. Merchants that fall above the confidence interval and regression line are identified as star merchants. Our client should focus on identifying factors that have led to the higher than average approved transaction rates of the star merchants. Merchants that fall below the confidence interval and regression line are identified as laggard merchants.

Merchants who lie within the confidence interval (blue shaded area) follow a stochastic trend and will not be analysed. In addition, we realised that the Line of Fit graph for Bin 1 does not provide a good model fit for analysis. Thus, the Line of Fit graph for Bin 1 will also not be analysed.

Figure8aRDP.jpg
Figure 7a: Group 5 Line of Fit Graph




Figure8bRDP.jpg
Figure 7b: Group 4 Line of Fit Graph




Figure8cRDP.jpg
Figure 7c: Group 3 Line of Fit Graph




Figure8dRDP.jpg
Figure 7d: Group 2 Line of Fit Graph




Figure8eRDP.jpg
Figure 7e: Group 1 Line of Fit Graph



While we have identified laggard merchants from Line of Fit graphs, we should also set individual benchmarks for each merchant. Based on each merchant’s volume of transactions, we will identify the respective average number of approved transactions they should have in order to be within the confidence level. Thus, our client can highlight how each individual laggard merchant compares to other merchants within the same bin and set benchmarks based on the regression line. This will motivate merchants to improve their own performance.

Time Series Analysis

Figure9aRDP.jpg
Figure 8a: Time Series Pattern for each type of transaction across 24 months



Based on the time-series graph in Figure 8a, we realised that the number of approved transactions increased drastically in the year of 2016, and generally decreased in the year of 2017. Nevertheless, we notice that the number of approved transactions are subjected to great fluctuation in 2017.
In addition, we notice that there is an increasing number of rejected transactions towards the end of 2017. This could demonstrate a potential problem for RDP, as it would mean a decrease in RDP commission/increase in opportunity cost with an increase in rejected transactions.

Figure9bRDP.jpg
Figure 8b: Time Series Pattern for each type of transaction across 7 days in a week



Based on the time-series graph in Figure 8b, we also realised that approved transaction tend to peak on the Tuesday and Saturday of a week. On the other hand, rejected transaction tend to peak on Wednesday. There is no distinct relationship between authentication failed/duplicate merchant_train ID transactions and the day of the week.

Figure9cRDP.jpg
Figure 8c: Time Series Pattern for each type of transaction across 24 hours in a day



Based on the time-series graph in Figure 8c, we realised that approved transaction tend to rise from 6am onwards. In contrast to our initial prediction that approved transaction will peak after working hours from 5-7pm, approved transaction actually drop during these hours. Insead, it rises from 7pm onwards. This could mean that merchants’ customers would usually transact at night/using home PC.

Moving Forward

After the interim presentation and this report submission, are looking to complete the following:

Merchant Categorisation Codes (MCC): To add a new column called “Merchant Categorisation Code (MCC)” into our data, that will categorise each merchant into a specific industry. This allows us to derive better analysis in terms of the common industries that star or laggard merchants belong to.

Remove Duplicated Merchants: There are certain merchants whose names are duplicated in various forms - e.g. City Harvest Church v.s. City Harvest Church (Virtual Terminal) actually belong to the same merchant. Our client hope to remove the duplicated names, so that each merchant name in our data is unique and belong to only one merchant.

Visual Dashboard: Aid in understanding some of our client’s current business issues, as well as the benchmark metrics and attributes that the company may not be currently analysing.

Finding Revenue per Merchant: (See below description)
As shown in Figure 9a, there are significant difference between the mean number of transactions per merchant for each bin - e.g. Bin 1: average 3 transactions per merchant, Bin 2: average 25 transactions per merchant, Bin 3: average 135 transactions per merchant, Bin 4: average 714 transactions per merchant and as expected, Bin 5: average 30818 transactions per merchant.

RDPAnova1.jpg
Figure 9a: Oneway Anova Analysis of Number of Transactions per merchant by bins

However, in Figure 9b, while there is a significant difference between the mean transaction monetary value among the 5 different bins, interestingly, we note that Bin 5 does not has the highest mean transaction monetary value (only $96.34). In fact, Bin 1 has the highest mean transaction monetary value of $3124.43. This means that while Bin 5 consist of merchants with the highest volume of transactions, they are not necessarily the most valuable group of merchants.

Figure6bRDP.jpg
Figure 9b: Oneway Anova Analysis of ‘converted value’ by bins

Moving forward, RDP will provide the total revenue it earns from each merchant. This will allow us to find the most valuable bin of merchants.