Red Dot Payment Methodology

From Analytics Practicum

Tools Used

Software: The sponsor has given us the flexibility to use any software tools suitable for the project. As such, all analyses and modelling will be done using tools we are experienced in: JMP Pro and SAS Enterprise Miner. Both tools streamline the data mining process of developing models.

Model Building (In Progress)

Previously, we planned to use the following methodology:

Cluster Analysis: Identifying different segments of merchants and customers that were not previously defined by RDP. The data may include, but is not limited to, the following: merchant name, customer IP address, IP country, reason code and description.

  • However, we realised that performing cluster analysis on customers is not meaningful, as RDP cannot control the profiles of customers transacting with its merchants.
  • Instead, we will focus on clustering merchants based on the characteristics/variables that affect approved/rejected transaction rates.
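Although the clustering itself will be run in JMP Pro, the logic can be sketched as a minimal k-means in Python. The two merchant features used here (transactions per merchant and approval rate) and all values are illustrative assumptions, not RDP data:

```python
import numpy as np

# Hypothetical merchant-level features (illustrative, not RDP data):
# transactions per merchant and approval rate
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(50, 5000, size=20).astype(float),  # transactions per merchant
    rng.uniform(0.5, 1.0, size=20),                 # approval rate
])

# Standardise features so both contribute equally to the distance metric
X = (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k=3, iters=50, seed=0):
    """Plain k-means: assign each merchant to its nearest centroid,
    then recompute centroids, for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(axis=2), axis=1)
        centroids = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                              else centroids[j] for j in range(k)])
    return labels

labels = kmeans(X, k=3)  # one cluster label per merchant
```

In practice the number of clusters and the feature set would be chosen from the actual RDP variables rather than fixed up front.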


Logistic Regression: Analysing the dataset to identify whether one or more independent variables determine an outcome measured with a dichotomous variable (e.g. what factors lead to a rejected or approved transaction).

  • Based on our client meetings, we realised that RDP is not able to control the factors leading to rejected/approved transactions, as all transactions are processed by Maybank. RDP merely provides the payment gateway services that facilitate the transfer of information between its merchants' payment portals (i.e. websites) and the payment processor used by the merchants' acquiring bank (i.e. Maybank).
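Although this step was ultimately dropped from the methodology, a minimal sketch of the planned analysis in Python (using scikit-learn) may clarify what it would have involved. The features and the synthetic outcome below are made-up placeholders for the real transaction records:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for transaction records; the two features
# (transaction amount, a binary "foreign IP" flag) are illustrative
rng = np.random.default_rng(1)
amount = rng.uniform(10, 500, size=200)
foreign_ip = rng.integers(0, 2, size=200)

# Synthetic outcome: larger foreign-IP transactions rejected more often
logit = -2.0 + 0.004 * amount + 1.5 * foreign_ip
rejected = (rng.random(200) < 1 / (1 + np.exp(-logit))).astype(int)

# Fit the dichotomous outcome (rejected vs approved) on the candidate factors
X = np.column_stack([amount, foreign_ip])
model = LogisticRegression().fit(X, rejected)

# Exponentiated coefficients give each factor's effect on the odds of rejection
odds_ratios = np.exp(model.coef_[0])
```

An odds ratio above 1 would indicate a factor associated with higher rejection odds; in JMP Pro the equivalent output is the parameter-estimates report of a nominal logistic fit.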


Model Validation and Refinement: Verifying our analysis against a different fiscal year to ensure that our predicted results do not differ significantly. If possible, we may use an independent-samples t-test to confirm that the differences are not statistically significant.
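Should we carry out this check, the independent-samples t-test could be run as follows. The monthly approval rates below are made-up placeholder values, not RDP figures:

```python
import numpy as np
from scipy import stats

# Hypothetical monthly approval rates for two fiscal years (illustrative values)
fy_a = np.array([0.91, 0.89, 0.93, 0.90, 0.92, 0.88,
                 0.91, 0.90, 0.92, 0.89, 0.91, 0.90])
fy_b = np.array([0.90, 0.92, 0.91, 0.89, 0.93, 0.90,
                 0.91, 0.92, 0.88, 0.91, 0.90, 0.92])

# Welch's t-test (does not assume equal variances); a large p-value means we
# cannot reject the hypothesis that the two years have the same mean rate
t_stat, p_value = stats.ttest_ind(fy_a, fy_b, equal_var=False)
consistent = p_value > 0.05  # True when the difference is not significant
```

A non-significant result would support using the model built on one fiscal year to describe the other.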

After performing exploratory data analysis, we have revised our methodology as follows:

Interactive Binning: We seek to group merchants into smaller bins based on the number of transactions per merchant.
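A rough pandas equivalent of this binning step (which we perform interactively in JMP Pro) is shown below; the per-merchant transaction counts and the cut points are illustrative assumptions:

```python
import pandas as pd

# Hypothetical transaction counts per merchant (illustrative values)
txns = pd.Series([12, 450, 87, 3200, 640, 25, 1500, 98, 210, 5400],
                 name="transactions")

# Group merchants into volume bins; the cut points stand in for the
# bins chosen interactively in JMP Pro
bins = pd.cut(txns, bins=[0, 100, 1000, 10000],
              labels=["low", "medium", "high"])

# Number of merchants falling into each bin
volume_bins = bins.value_counts()
```

Each bin can then be analysed separately, so that high-volume merchants are not compared directly against low-volume ones.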

Line of Fit:

  • Identify the relationship between approved/rejected transaction rates and the number of transactions per merchant
  • In addition, for each bin, we will use Line of Fit graphs to compare the number of approved transactions per merchant against the total transactions per merchant. From there, we will identify the over-performing (star) and under-performing (laggard) merchants as those lying outside the 95% confidence interval. This also sets the benchmark for each merchant's performance
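A simplified sketch of this star/laggard identification follows, using an ordinary least-squares line of fit and a flat 1.96-standard-deviation residual band as a stand-in for JMP's 95% confidence band (which also widens away from the mean). All merchant counts are illustrative:

```python
import numpy as np

# Hypothetical per-merchant counts within one bin (illustrative values)
total = np.array([100, 250, 400, 600, 800, 1000, 1200, 1500, 1800, 2000], float)
approved = np.array([90, 230, 350, 545, 700, 920, 1020, 1390, 1600, 1850], float)

# Line of fit: approved ~ slope * total + intercept
slope, intercept = np.polyfit(total, approved, 1)
residuals = approved - (slope * total + intercept)

# Flag merchants whose residual falls outside the 95% band
band = 1.96 * residuals.std(ddof=2)
stars = np.where(residuals > band)[0]      # over-performing merchants
laggards = np.where(residuals < -band)[0]  # under-performing merchants
```

Merchants inside the band are performing as expected for their volume; those outside it become the candidates for follow-up with RDP.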


Time Series Analysis: Used to determine whether there is any relationship between approved/rejected transactions and (i) the month of the year, (ii) the day of the week and (iii) the time of the day.
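This temporal breakdown can be sketched in pandas by grouping a transaction log on the month, weekday, and hour components of the timestamp. The log below is a made-up placeholder for the real transaction data:

```python
import pandas as pd

# Hypothetical transaction log: timestamp plus approved flag (illustrative)
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2016-01-04 09:15", "2016-01-04 21:40", "2016-02-10 14:05",
        "2016-02-14 03:30", "2016-03-07 11:20", "2016-03-07 23:55",
    ]),
    "approved": [1, 0, 1, 0, 1, 1],
})

# Approval rate by month of year, day of week, and hour of day
by_month = df.groupby(df["timestamp"].dt.month)["approved"].mean()
by_weekday = df.groupby(df["timestamp"].dt.day_name())["approved"].mean()
by_hour = df.groupby(df["timestamp"].dt.hour)["approved"].mean()
```

Any month, weekday, or hour with a markedly lower approval rate would warrant a closer look at the transactions occurring in that window.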

Model Validation and Refinement: Same approach as described above; no change in description.