AY1516 T2 Sport Betting at Singapore Pools Project Overview Final

From Analytics Practicum




Abstract

In today’s interconnected world, the gambling environment has transformed into a multifaceted playing field without boundaries, exposing more people, and younger people, to the games while creating loopholes for illegal gambling operators to enter the market. The result is greater public concern about the social ills of irresponsible gambling.

Our sponsor, Singapore Pools, takes a strong stand on responsible gaming and wants to offer the public a safer outlet to play. This paper explores gambling transaction data (n=930,000) to identify and better understand betting patterns that would eventually allow us to flag players who engage in, or are susceptible to, irresponsible gambling, and in turn to suggest ways to promote responsible gambling. We also draw on past literature to guide our methodological approach and to cross-compare hypotheses and findings.

The methodological flow of this project begins with exploratory data analysis, in which the dataset is cleaned and transformed for further analysis. The large set of transactions is then aggregated into a list of user-level data. We then proceed to a relationship analysis of the parameters and bet preferences of players of different demographics. Using clustering analysis, we then profile players into four main segments: (1) Masses, (2) High-rollers, (3) Players-at-risk and (4) Habituals.

This segmentation allows our sponsor to identify players who are at risk of irresponsible gambling, and to develop strategies to reach out to these segments, alert them to their betting behaviour and educate them about responsible betting. To ensure project continuity and support future analyses, our team has created a dynamic dashboard that visualises monthly transaction trends, highlights popular events and players who are at risk, and allows exploration of each individual player’s profile and betting patterns (i.e. their betting intensity and transaction history).

Introduction

Gambling is often seen as a problem in society, and gambling addiction undoubtedly poses a grave societal problem. However, banning gambling is not a viable solution, for it would simply drive these activities underground. Our sponsor, Singapore Pools, was set up by the Singapore government in 1968 to place gambling on legal grounds and to deal with the social ills tied to gambling. Ever since, Singapore Pools (SG Pools) has been the sole legalized operator of lotteries and sports betting in Singapore.

Unlike in most countries, where gambling houses are privately owned organizations, Singapore Pools is a state-owned organization registered under Singapore’s Ministry of Finance. Singapore Pools offers four main products to the public (TOTO, Singapore Sweep, 4D and Sports Betting), whose operations and product configurations are all regulated by Singapore’s Ministry of Home Affairs, Ministry of Finance and Ministry of Social & Family Development.

Our sponsor takes a strong stand on responsible gaming and wants to offer a safer outlet where players can bet responsibly within their financial means. Attrition rates have been rising over the years, which could mean that Singapore Pools’ customers are seeking other avenues for gambling, such as illegal online gambling sites, which may lead to irresponsible betting. Therefore, within the next few years, our sponsor seeks to take a data-driven approach to promoting responsible gambling by monitoring players’ betting behaviour and performance, in the hope of highlighting alarming patterns that could indicate irresponsible gambling. The sponsor also intends to use this data to help usher in its online betting platform, which is scheduled to launch in the upcoming year.

Our sponsor has been collecting user and transaction data for the past several years, but has yet to put it to good use. Singapore Pools set up a customer insights division about a year ago to better understand its customers through the analysis of this user data. This is its first step towards a data-driven approach to promoting responsible gambling and understanding the gambling behavioural patterns of its customers, and this is where our team comes in.

Project Objectives

The aim of this project is to provide Singapore Pools with a better understanding of the gambling behaviours of its customers through the identification of betting preferences and patterns. Clusters of players may be identified based on their betting behaviour: how they split their bets, their preference for a league, their decision-making process, and how they choose their bet selections. Such behavioural patterns could possibly be linked back to the demographics of each cluster, allowing us to infer the reasons behind players' gambling habits and, we hope, to identify irresponsible gamblers as well.

The scope of our project is limited to the Sports Betting segment of customers who have opened betting accounts with Singapore Pools. The data provided is confined to line betting transactions made by ‘Gold’ and ‘Platinum’ members.

The overall objectives of our project are as follows:

(1) Provide insights with regards to gambling behavioural patterns

(2) Profile their existing pool of customers into meaningful segments

(3) Build a dashboard to visualize betting patterns and trends on a macro and individual level

Based on the characteristics of each cluster, the sponsor’s end objective is to (1) flag players who display alarming patterns that could lead to irresponsible betting, and (2) tailor business actions that target the derived clusters of players, enhancing their gambling experience while ensuring that they bet responsibly.

Literature Review

Gambling is a topic that is widely researched across the world, from survey polls of gambling participation and perception, to studies of gambling risk and pathology, to thorough statistical analyses of gambling behaviours.

According to a survey by Singapore’s Ministry of Community Development, Youth and Sports (MCYS), 58% of Singaporeans over 18 years of age had participated in at least one gambling activity within a one-year period. A further MCYS study on pathological gamblers found that players at risk of developing a gambling addiction would gamble at least once a week, and that this pool of susceptible players made up 70% of the sample population involved in the study (2005).

Behavioural or betting patterns are another popular area of study across most papers, for they provide cues to possible pathological gambling behaviours. A common finding in most studies is an evident difference between the betting behaviour of regular players and that of players at risk (problem gamblers). One study revealed that gamblers at risk are more likely to bet more frequently and with increasing bet amounts, regardless of their bet outcome (Mizerski, 2011). It also found that less frequent players are more likely than regular or frequent players to put more effort into decision-making when making bets, to allow for future betting possibilities. It further found that certain betting games and game arrangements may actually prompt reckless betting that could lead to irresponsible gambling.

Several other papers provided insights into a more analytical approach to segmenting gamblers and identifying those at risk. A study by Faregh and Leth-Steensen (2011) discovered clusters of players that varied in their bet activity level (frequency), bet variability (spread of stakes and odds), time spent on making bets, and the games played. Relationship and predictive analysis between selected parameters may reveal the variables that best predict returns, and reflect bet strategies that are less sophisticated (Gainsbury & Russell, 2013). Suggestions in these papers on data collection procedures and on the selection of metrics and parameters for clustering players are just some of the secondary insights that have aided our choice of methodology and analysis, including ways of profiling our resulting clusters, which will be elaborated on later in this report.
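As a rough illustration of the activity and variability metrics named above, the sketch below aggregates bet-level rows into one summary per player. It is a minimal sketch only: the `(player_id, bet_date, stake, odds)` tuple layout, the function name `player_metrics` and the choice of population standard deviation as the "spread" measure are all assumptions made for this example, not the sponsor's schema or the metrics used in the cited studies.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def player_metrics(transactions):
    """Aggregate bet-level rows into one summary per player.

    Each transaction is a (player_id, bet_date, stake, odds) tuple.
    This layout is hypothetical, not the sponsor's actual schema.
    """
    by_player = defaultdict(list)
    for player_id, bet_date, stake, odds in transactions:
        by_player[player_id].append((bet_date, stake, odds))

    metrics = {}
    for player_id, bets in by_player.items():
        stakes = [stake for _, stake, _ in bets]
        odds = [o for _, _, o in bets]
        # Distinct (ISO year, ISO week) pairs in which the player placed a bet.
        active_weeks = {tuple(d.isocalendar())[:2] for d, _, _ in bets}
        metrics[player_id] = {
            "n_bets": len(bets),                                    # activity level
            "active_weeks": len(active_weeks),
            "bets_per_active_week": len(bets) / len(active_weeks),
            "mean_stake": mean(stakes),
            "stake_spread": pstdev(stakes),                         # bet variability
            "odds_spread": pstdev(odds),
        }
    return metrics


# Made-up rows, for illustration only.
sample = [
    ("A", date(2016, 1, 4), 10.0, 1.8),
    ("A", date(2016, 1, 6), 30.0, 2.2),
    ("A", date(2016, 1, 12), 20.0, 2.0),
    ("B", date(2016, 1, 5), 50.0, 3.0),
]
summary = player_metrics(sample)
```

A player-level table of this kind is the sort of input a clustering step would consume; which metrics to keep would depend on the data actually available.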

Besides researching the field of gambling and the analytical methodologies, we examined past data visualization papers to learn about the pitfalls and best practices of data visualization. “Different types of graphs are designed to communicate different types of messages,” notes data visualization expert Stephen Few, whose papers range from the effective use of points and lines to shape data trends to the principles of colour selection for data visualization, such as the use of contrasting or analogous colours for varying purposes (Few, 2004; 2006; 2007). Some graphs are best avoided, such as alluring 3-D graphs or pie charts that can be rendered better in a two-dimensional plane, for the added depth and angle make interpretation more difficult (Few, 2005). For the dashboard layout, the two guiding principles that we took from Few’s recommendations were (1) to find a balance between being information-rich and not oversimplifying, and (2) to remove clutter or any distractions that do not add value (Few, 2005).

Leveraging this prior knowledge, our team hopes to deliver actionable insights into the betting behaviour of our sponsor’s pool of customers, and to present the findings on a dashboard that is visually appealing, intuitive, practical and accurately data-driven.

Data Cleaning & Transformation

Exploratory Data Analysis

Methodology for Clustering

Results & Findings

Recommendations

Project Dashboard

Limitations & Challenges

Testing

Further Developments

Moving forward, to gather a deeper understanding of consumers, our team proposes a more extensive data collection procedure. Singapore Pools has collected basic demographic data on its Platinum users, but this was not extended to Gold users. With richer demographic data, such as a customer’s salary and occupation, address and family background, we may discover new betting patterns and new relationships between existing betting behaviour parameters and player demographic data. This would offer a better understanding of what responsible gambling should mean at an individual level, rather than a general view across the entire population.

Besides demographic data, other transaction data such as account top-ups could also provide greater insight into a player's betting behaviour. Historical trends in the frequency and amount of a player's top-ups, coupled with his or her past winnings record, could reveal patterns of irresponsible gambling.
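The top-up idea above can be sketched as follows. This is an illustrative sketch under stated assumptions: the `(player_id, topup_date, amount)` layout, the function names and the 1.5x month-on-month growth threshold are hypothetical, and winnings records are left out for brevity. A real rule would need to be calibrated against the sponsor's data.

```python
from collections import defaultdict
from datetime import date


def monthly_topups(topups):
    """Sum top-up count and amount per (player_id, year, month).

    Each top-up is a (player_id, topup_date, amount) tuple (hypothetical layout).
    """
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for player_id, topup_date, amount in topups:
        key = (player_id, topup_date.year, topup_date.month)
        totals[key]["count"] += 1
        totals[key]["amount"] += amount
    return dict(totals)


def escalating_players(topups, growth_threshold=1.5):
    """Return players whose monthly top-up amount grew by at least
    `growth_threshold` times between every pair of consecutive months
    in which they topped up. The threshold is an arbitrary assumption."""
    per_player = defaultdict(list)
    # Sorting the keys puts each player's months in chronological order.
    for (player_id, _year, _month), agg in sorted(monthly_topups(topups).items()):
        per_player[player_id].append(agg["amount"])

    flagged = []
    for player_id, amounts in per_player.items():
        if len(amounts) >= 2 and all(
            later >= growth_threshold * earlier
            for earlier, later in zip(amounts, amounts[1:])
        ):
            flagged.append(player_id)
    return sorted(flagged)
```

A flag of this kind would only be a prompt for closer review; combining it with each player's winnings history, as suggested above, would give a fuller picture.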


Many papers and research efforts have examined the use of rule-based classifiers and pre-learning data clustering, such as association rule clustering and automated genetic clustering, but this process is not expected to be viable in the near future. Such machine learning approaches remain imperfect given the vast number of rules that need to be considered and the inability of the machine to learn beyond its training data. Another obvious limitation is that the interpretation of the clusters’ profiles would be left to the dashboard user; subjective as it is, the user would require some degree of statistical knowledge to make meaningful inferences.

Our team would therefore suggest that a workable methodology or guide on how to perform the clustering analysis would be more feasible. The results from the current clustering analysis provide only a one-off understanding of the current market context; the insights cannot be replicated for future reference. The clustering analysis needs to be revised from time to time with new datasets representative of the latest trends and context. A more sustainable solution would be to have a trained analyst conduct the clustering analysis each round.
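One way to make such a guide repeatable is to wrap the clustering step in a function that an analyst can re-run on each new batch of player-level features. The sketch below is an assumption-laden illustration: this overview does not state which clustering algorithm the project used, so plain k-means with a deterministic farthest-first initialisation is used here purely as a stand-in, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Constant columns have zero spread; divide by 1.0 instead.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, spreads))
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(rows, k):
    """Pick k spread-out starting centroids deterministically."""
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(
            max(rows, key=lambda r: min(squared_distance(r, c) for c in centroids))
        )
    return centroids


def kmeans(rows, k, max_iter=100):
    """Plain k-means; returns one cluster label per input row."""
    centroids = farthest_first(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels


def segment_players(features, k=4):
    """Map each player id to a cluster label for one batch of data.

    `features` maps player id -> tuple of numeric features (hypothetical).
    """
    player_ids = sorted(features)
    scaled = standardise([features[p] for p in player_ids])
    return dict(zip(player_ids, kmeans(scaled, k)))
```

The cluster labels are arbitrary integers, so each run still needs an analyst to profile the clusters and decide which, if any, correspond to segments such as players at risk.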

Lastly, as mentioned above, the dataset that our team has been working on covers merely three months. As such, we will continue to work with our sponsor to carry out load testing with a larger set of data. Client user testing of the dashboard is also ongoing, and we will continue to provide support to update the dashboard charts and interactive tools upon further feedback.