Red Dot Payment Methodology

From Analytics Practicum
Revision as of 20:33, 25 February 2018 by Ruizhi.ong.2014 (talk | contribs)


Tools Used

Software: The sponsor has given us the flexibility to use any software tools suitable for the project. As such, all analysis and modelling will be done using tools that we are experienced with, namely JMP Pro and SAS Enterprise Miner. Both tools allow us to streamline the data mining process when developing models.

Model Building (In Progress)

Previously, we planned to use the following methodology:

Cluster Analysis: Identifying different segments of merchants and customers that were not previously defined by RDP. Data may include, but is not limited to, the following: merchant name, customer IP address, IP country, reason code and description.

  • However, we realised that performing cluster analysis for customers is meaningless, as RDP is not able to control the profiles of customers transacting with its merchants.
  • In terms of identifying different segments of merchants, we realised that 93% of the total transactions do not have an ‘IP Country’ value. This leaves us with only two variables, ‘amount_of_money’ and ‘reason_code_description’, with which to segment our merchants. Through interactive binning, we hope to group the continuous values into a smaller number of "bins".
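The project itself uses JMP Pro and SAS Enterprise Miner. The Python sketch below is only an illustration of the binning idea, assuming simple equal-frequency (quantile) bins rather than the interactive tool. Only the column name `amount_of_money` comes from the text above; the values, the number of bins and the bin labels are made up.

```python
import pandas as pd

# Hypothetical transaction amounts; only the column name `amount_of_money`
# comes from the project data, the values are invented for illustration.
transactions = pd.DataFrame(
    {"amount_of_money": [5, 12, 18, 25, 40, 55, 80, 120, 300, 950, 1500, 4000]}
)

# Split the continuous amounts into 4 equal-frequency bins (quartiles).
# The choice of 4 bins and the labels are assumptions.
transactions["amount_bin"] = pd.qcut(
    transactions["amount_of_money"],
    q=4,
    labels=["low", "mid", "high", "top"],
)

# Count how many transactions fall into each bin, in bin order.
bin_counts = transactions["amount_bin"].value_counts().sort_index()
print(bin_counts)
```

Quantile bins keep the bin sizes balanced even when the amounts are heavily skewed, which is one reason they may be a reasonable starting point before adjusting the cut points by hand.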


Logistic Regression: Analyzing the dataset to identify whether one or more independent variables determine an outcome measured with a dichotomous variable (e.g. which factors lead to a rejected or an approved transaction).

  • Based on our client meetings, we realised that RDP is not able to control the factors leading to rejected/approved transactions, as all transactions are processed by Maybank. RDP merely provides the payment gateway services that facilitate the transfer of information between its merchants’ payment portals (i.e. websites) and the payment processor used by the merchants’ acquiring bank (i.e. Maybank).


Model Validation and Refinement: Verifying our analysis against a different fiscal year to check that our predicted results do not differ significantly. If possible, we may use an independent-samples t-test to check whether the differences are statistically significant.
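As a rough illustration of this validation step, the Python sketch below compares a metric across two fiscal years with an independent-samples t-test from SciPy. The monthly approval rates, the variable names and the 0.05 significance level are all assumptions, and the sketch uses Welch's variant of the test, which does not assume equal variances.

```python
from scipy import stats

# Hypothetical monthly approval rates for two fiscal years (made-up numbers).
approval_rate_fy_a = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78]
approval_rate_fy_b = [0.80, 0.82, 0.79, 0.81, 0.83, 0.77]

# Independent-samples t-test; equal_var=False selects Welch's variant.
t_stat, p_value = stats.ttest_ind(
    approval_rate_fy_a, approval_rate_fy_b, equal_var=False
)

ALPHA = 0.05  # conventional significance level, an assumption
differs = p_value < ALPHA
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, significant difference: {differs}")
```

A p-value above the chosen level only means that no significant difference was detected; it does not prove that the two years behave identically.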

While performing exploratory data analysis, we revised our methodology to the following:

Interactive Binning: We seek to group merchants into smaller bins based on the number of transactions per merchant.

  • One-Way ANOVA: We will find out whether there is a significant difference in the mean approved transaction value of the different bins, to identify the existence of a multiplier effect - i.e. whether the bin with the highest volume of transactions also has the highest mean approved transaction value. This helps us to identify the bin of merchants most valuable to RDP.


  • Line of Fit: Using a Line of Fit graph, we will identify the over-performing (star) and under-performing (laggard) merchants in each bin, by comparing the number of approved transactions per merchant against the total number of transactions per merchant. This sets the benchmark for each merchant’s performance.
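The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the JMP Pro workflow used in the project: the bin names, merchant names and all numbers are made up, and labelling a merchant a "star" when it sits above the fitted line (positive residual) is one possible reading of the Line of Fit comparison.

```python
import numpy as np
from scipy import stats

# --- One-way ANOVA across bins (hypothetical data) ---
# Approved transaction values for merchants in three made-up bins.
values_by_bin = {
    "low_volume": [20.0, 22.0, 19.0, 21.0, 23.0],
    "mid_volume": [30.0, 29.0, 31.0, 32.0, 28.0],
    "high_volume": [45.0, 47.0, 44.0, 46.0, 48.0],
}

# Test whether the mean approved value differs between the bins.
f_stat, p_value = stats.f_oneway(*values_by_bin.values())

# Find the bin with the highest mean approved transaction value.
bin_means = {name: float(np.mean(vals)) for name, vals in values_by_bin.items()}
best_bin = max(bin_means, key=bin_means.get)
print(f"F = {f_stat:.1f}, p = {p_value:.2g}, highest mean: {best_bin}")

# --- Line of fit within one bin (hypothetical merchants) ---
merchants = ["A", "B", "C", "D", "E"]
total_txn = np.array([100, 200, 300, 400, 500])
approved_txn = np.array([80, 150, 260, 300, 420])

# Fit a straight line: approved = slope * total + intercept.
slope, intercept = np.polyfit(total_txn, approved_txn, deg=1)
expected = slope * total_txn + intercept

# A positive residual means the merchant sits above the fitted line.
residuals = approved_txn - expected
labels = {
    name: ("star" if resid > 0 else "laggard")
    for name, resid in zip(merchants, residuals)
}
print(labels)
```

In this sketch the fitted line acts as the benchmark within a bin, so a merchant is judged against peers of similar volume rather than against all merchants.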


Time Series Analysis: Used to determine whether there is any relationship between approved/rejected transactions and (i) the month of the year, (ii) the day of the week and (iii) the time of the day.
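A first look at these three time dimensions could be sketched in Python as below, by computing the approval rate per month, per day of the week and per hour. The column names `timestamp` and `approved` and all the rows are hypothetical, and a simple grouped mean stands in for a fuller time series model.

```python
import pandas as pd

# Hypothetical transaction log; column names and values are assumptions.
log = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2017-01-02 09:15", "2017-01-02 21:40", "2017-01-07 10:05",
                "2017-02-06 09:30", "2017-02-11 22:10", "2017-02-11 23:55",
            ]
        ),
        "approved": [1, 0, 1, 1, 0, 0],  # 1 = approved, 0 = rejected
    }
)

# Derive the three time dimensions from the timestamp.
log["month"] = log["timestamp"].dt.month
log["day_of_week"] = log["timestamp"].dt.day_name()
log["hour"] = log["timestamp"].dt.hour

# The mean of a 0/1 column is the approval rate within each group.
rate_by_month = log.groupby("month")["approved"].mean()
rate_by_day = log.groupby("day_of_week")["approved"].mean()
rate_by_hour = log.groupby("hour")["approved"].mean()

print(rate_by_month, rate_by_day, rate_by_hour, sep="\n\n")
```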

Model Validation and Refinement: Same as above, with no change in description.