Difference between revisions of "Gambling Behaviour Pattern Analysis Project Overview"

From Analytics Practicum
Line 36: Line 36:
 
<!--END OF Sub-Navigation-->
 
<!--END OF Sub-Navigation-->
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Background of Singapore Pools</font></div></div>==
+
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Aim & Objectives</font></div></div>==
  
Gambling is often seen as a problem in society, no doubt gambling addiction poses a grave societal problem, however banning gambling is not a viable solution, for it would simply drive these activities underground. As such, our sponsor, Singapore Pools was set up by the Singapore government in 1968 to place gambling on legal grounds and to deal with the social ills tied to gambling. Ever since, Singapore Pools has been the sole legalized operator to run lotteries and sports betting in Singapore.
+
The aim of our project is to allow Singapore Pools to better understand the gambling behaviours of their customers through the identification of gambling patterns, which can be unique across different clusters of individuals. Each cluster might have their own specific ways of splitting their bets, different churn rates, preference for a league, different decision making process, and ways of selecting their bet selections. Such behavioural patterns could possibly be linked back to certain demographics pertaining to the cluster, allowing us to further infer reasons behind their gambling habits, and hopefully could help us identify those irresponsible gamblers too. For the purpose of the project, the scope of our project is limited to the Sports Betting segment of customers who have opened betting accounts with Singapore Pools.
  
This establishment offers a regulated environment, one where the government is able to regulate gambling behaviours to some extent, by educate citizens about playing responsibly, within their meanings, informing them of the possible adverse consequence of gambling, therefore managing the risk of gambling problems. As a not-for-profit-organization, Singapore Pools donates a share of their revenue to benefit good causes on a regular basis, many of which have being made for the betterment of charity, education, health, community development, and sports sectors.
+
The overall objectives of our project are to:
  
Singapore Pools offers 4 products to the public – TOTO, The Singapore Sweep, 4D, Sports Betting. These products and services are all regulated by Singapore’s Ministry of Home Affairs, Ministry of Finance, and the Ministry of Social and Family Development.
+
<b>(1) Profile their existing pool of customers through clustering analysis
  
<br>
+
(2) Create a data visualization of the consumer betting activity
[[File:SP50 edited.jpg | 800px]]
+
 
 +
(3) Build a dashboard to visualize the profiling and data points </b>
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Business Problems & Motivations</font></div></div>==
+
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Motivations</font></div></div>==
  
 
In today’s globalized world, the Internet has transformed the gambling environment into a multifaceted, non-physical, multi-platform, environment without boundaries. This presents loopholes for illegal gambling operators to enter the market and draw our customers away, into their unregulated arena that is susceptible to the creation of gambling addiction issues.   
 
In today’s globalized world, the Internet has transformed the gambling environment into a multifaceted, non-physical, multi-platform, environment without boundaries. This presents loopholes for illegal gambling operators to enter the market and draw our customers away, into their unregulated arena that is susceptible to the creation of gambling addiction issues.   
Line 53: Line 54:
 
Singapore Pools offers a safer outlet, one where players can bet responsibly, within their means. Attrition rates have be raising over the years, and this could meant that Singapore Pools’ customers are seeking other avenues to participate in gambling activities such as illegal online-gambling sites, which may lead to irresponsible betting. Therefore, within the next few years, our sponsor seeks to undertake a data-driven approach to promote responsible gambling by monitoring the player's’ betting behaviour and performance, in hopes of highlighting alarming patterns that could indicate signs of irresponsible gambling.     
 
Singapore Pools offers a safer outlet, one where players can bet responsibly, within their means. Attrition rates have be raising over the years, and this could meant that Singapore Pools’ customers are seeking other avenues to participate in gambling activities such as illegal online-gambling sites, which may lead to irresponsible betting. Therefore, within the next few years, our sponsor seeks to undertake a data-driven approach to promote responsible gambling by monitoring the player's’ betting behaviour and performance, in hopes of highlighting alarming patterns that could indicate signs of irresponsible gambling.     
  
Our sponsor has actually been collecting user data for the past several years, but have yet put it to good use. Just a year ago, Singapore Pools had set up a customer insights division to better understand their customers through the analysis of these user data and their first step towards a data-driven approach to promote responsible gambling was to understand the gambling behavioural patterns of their customers.
+
Our sponsor has actually been collecting user data for the past several years, but has yet put it to good use. Just a year ago, Singapore Pools had set up a customer insights division to better understand their customers through the analysis of these user data and their first step towards a data-driven approach to promote responsible gambling was to understand the gambling behavioural patterns of their customers.
  
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Literature Review</font></div></div>==
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Literature Review</font></div></div>==
  
Singapore has the highest spending on gambling per capita as reported by Global Betting and Gaming Consultants. In the 2014 NCPG (National Council on Problem Gambling) survey report, an estimated 44% of Singapore’s population have participated in gambling related activities in a one year period, compared to a 47% in their last survey in 2011. The amount of bet has also fallen, with about 90% of the surveyed spending less than $200 per month; and only a minute fraction of them (0.3%) gambled with large amounts of over $1000 each month. An alarming finding in this survey, was that probable pathological gamblers are on the rise, with greater frequency of gambling (83% gambled once a week, as compared to 68% in 2011). Furthermore, this regular gambling habit was picked up at a younger age, with 17% gambling regularly before the age of 18 in comparison to a 5% in 2011.The rise of such phenomenon was the result of early exposure to online gambling, as such the need to regulate gambling content.  
+
Singapore has the highest spending on gambling per capita as reported by Global Betting and Gaming Consultants. In the 2014 NCPG (National Council on Problem Gambling) survey report, an estimated 44% of Singapore’s population have participated in gambling related activities in a one year period, compared to a 47% in their last survey in 2011. The amount of bet has also fallen, with about 90% of the surveyed spending less than $200 per month; and only a minute fraction of them (0.3%) gambled with large amounts of over $1000 each month. An alarming finding in this survey was that probable pathological gamblers are on the rise, with greater frequency of gambling (83% gambled once a week, as compared to 68% in 2011). Furthermore, this regular gambling habit was picked up at a younger age, with 17% gambling regularly before the age of 18 in comparison to a 5% in 2011.The rise of such phenomenon was the result of early exposure to online gambling, as such the need to regulate gambling content.
 +
 
 +
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">General Rules of Soccer Betting</font></div></div>==
 +
 +
In soccer betting, the customer places a wager on any of the selections under a specific bet type (e.g. home team to win, total number of goals in the match equals to 3). If the selection corresponds to the winning selection as declared by Singapore Pools, the customer will qualify for the winnings, which are based on the prevailing odds at the time the customer’s bets are placed.
 +
 +
<br/>
 +
[[File:New-soccer-season-xl.jpg | 1000px]]
 +
 
 +
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Data</font></div></div>==
 +
 
 +
The dataset will be presented to us in the form of an Excel spreadsheet and each worksheet within the spreadsheet contains the entire purchase history of one player along with other recorded parameters. We will be given a dataset consisting of approximately 4,000 unique customer accounts with account activity details over a time period of three months. The time period spans from January to March, which coincides with the peak period of the different international soccer leagues, so as to ensure the high volume of betting activity for analysis.     
 +
 
 +
The parameters in the dataset can be split into three distinct categories and include the following:
 +
 
 +
<b>1. Demographics of Players</b>
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Overview</font></div></div>==
+
Customer Account No.
  
The aim of our project is to allow Singapore Pools to better understand the gambling behaviours of their customers through the identification of gambling patterns, which can be unique across different clusters of individuals. Each cluster might have their own specific ways of splitting their bets, different churn rates, preference for a league, different decision making process, and ways of selecting their winning bets per match. Such behavioural patterns could possibly be linked back to certain demographics pertaining to the cluster, allowing us to further infer reasons behind their gambling habits, and hopefully could help us identify those irresponsible gamblers too. For the purpose of the project, the scope of our project is limited to the Sports Betting segment of their Account Customers.
+
Gender
  
The objectives of our project is to:
+
Date of Birth
  
(1) Profile their existing pool of customers through clustering analysis
+
Income Range
  
(2) Create a data visualization of the consumer betting activity
+
Type of Membership
  
(3) Build a dashboard to visualize the profiling and data points
+
Nationality
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">General Rules of Soccer Betting</font></div></div>==
+
Account Opening Date
 
In soccer betting, the customer places a wager on any of the selections under a specific bet type (e.g. home team to win, total number of goals in the match equals to 3). If the selection corresponds to the winning selection as declared by Singapore Pools, the customer will qualify for the winnings, which are based on the prevailing odds at the time the customer’s bets are placed.
 
 
<br/>
 
[[File:New-soccer-season-xl.jpg | 1000px]]
 
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Data</font></div></div>==
 
  
As for the dataset presented to us, within the spreadsheet, each worksheet contains the entire purchase history of one player along with other recorded parameters, hence one worksheet itself holds the data of just one person. We would therefore need to aggregate all this individualized observations of betting patterns into one worksheet, and represent the relevant attributes of each player as a single row.     
+
<b>2. Betting Activity</b>
  
The parameters collected by our sponsor can be split into two distinct categories and include the following:
+
Customer Account No.
  
Demographics
+
Bet Date & Time
  
ID of players (NRIC)
+
Bet Selection
  
Gender of players
+
Bet Type
  
Type of membership of players
+
Event Name
  
Home address of players
+
Event Code
  
Account Activity
+
Bet Amount
  
Date when bet was made
+
Bet Odds
  
Time when bet was made
+
Bet Start Time of Bet Event
  
Type of bet made
 
  
Soccer match info
+
<b>3. Top Up & Withdrawal Activity</b>
  
Team that player betted on
+
Customer Account No.
  
Odds of the bet
+
Transaction Date & Time
  
Amount of money betted
+
Transaction Type
  
Amount of top-up
+
Transaction Mode
  
Time when top-up was made
+
Transaction Amount
  
Returns (Winnings/losses)
 
 
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Methodology & Work Scope</font></div></div>==
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Methodology & Work Scope</font></div></div>==
  
Line 177: Line 184:
 
<strong>Seventh phase</strong> - Optimization of dashboard
 
<strong>Seventh phase</strong> - Optimization of dashboard
  
Test software performance whether it meets the minimum requirements of the clients and perform any optimizations to meet these.   
+
Based on the feedback given during the mid-term reporting and by the client, we will implement modifications to the dashboard to ensure that the client’s requirements are met in this final phase of the project.   
  
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Deliverables</font></div></div>==
+
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Deliverables</font></div></div>==
  
 
(1) Mid-term report & presentation  
 
(1) Mid-term report & presentation  
Line 190: Line 197:
 
<br>
 
<br>
  
Data visualization [Parameters to be displayed have yet been decided] - (1) Player’s demographics (2) Winnings / losses trendline (3) Active player list (4) Player profile type (5) Prize winnings summary (6) Leaderboard (7) Winning ratio (8) Risk factor 
+
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Project Limitations & Assumptions</font></div></div>==
 
 
To identify the consumer’s betting pattern along his betting journey, the dashboard will display a sequence of profiles of certain betting behaviour relative to a phase in one’s betting journey.
 
 
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">Limitations & Assumptions</font></div></div>==
 
  
 
The data collected are only from players who use Singapore Pool’s phone betting service, and this only accounts for 5% of their entire pool of customers. The remaining 95% are anonymous players who make bets at Singapore Pools’ physical stall outlets. Given that this project outcome was meant as a precursor to help with the launch of Singapore Pools’ online betting system, their target audience would be more similar to those that are using the current betting lines service, hence this limited behaviour data we have are perfect to model those future online players.     
 
The data collected are only from players who use Singapore Pool’s phone betting service, and this only accounts for 5% of their entire pool of customers. The remaining 95% are anonymous players who make bets at Singapore Pools’ physical stall outlets. Given that this project outcome was meant as a precursor to help with the launch of Singapore Pools’ online betting system, their target audience would be more similar to those that are using the current betting lines service, hence this limited behaviour data we have are perfect to model those future online players.     
Line 200: Line 203:
 
There is a lack of secondary data with regards to this niche field of research on gambling, in Singapore specifically. Without existing literature and data on Singaporean’s betting behaviour, we will not be able to compare the results from our profiling and verify the underlying reasons behind their betting patterns. The only secondary data that we can benchmark our data with is one report that studied a sample of Australian gamblers in the state of Victoria, hence we had to make the assumption that cross cultural differences would have little influence on how players of the different demographics in each country made betting decisions.  
 
There is a lack of secondary data with regards to this niche field of research on gambling, in Singapore specifically. Without existing literature and data on Singaporean’s betting behaviour, we will not be able to compare the results from our profiling and verify the underlying reasons behind their betting patterns. The only secondary data that we can benchmark our data with is one report that studied a sample of Australian gamblers in the state of Victoria, hence we had to make the assumption that cross cultural differences would have little influence on how players of the different demographics in each country made betting decisions.  
  
However, our chief priority would be to highlight gaps in the data that may possibly predict patterns of irresponsible gambling, and not to infer the underlying reason behind any irregular betting behaviour.
+
However, our chief priority would be to highlight gaps in the data that may possibly predict patterns of irresponsible gambling, and not to infer the underlying reason behind any irregular betting behaviour.
 
 
 
 
==<div style="background: #007BBD; line-height: 0.3em; border-left: #007BBD solid 13px;"><div style="border-left: #45E98F solid 5px; padding:15px;"><font face ="Century Gothic" color= "white" size="5">References</font></div></div>==
 
 
 
[1] REPORT OF SURVEY ON PARTICIPATION IN GAMBLING ACTIVITIES AMONG SINGAPORE RESIDENTS, 2014. (2015).
 
 
 
[2] Singapore Pools Official Website. (n.d.). Retrieved January 14, 2016, from http://www.singaporepools.com.sg
 
</font>
 

Revision as of 03:41, 18 January 2016




Project Aim & Objectives

The aim of our project is to help Singapore Pools better understand the gambling behaviours of its customers by identifying gambling patterns, which may be unique to different clusters of individuals. Each cluster might have its own way of splitting bets, its own churn rate, a preferred league, a distinct decision-making process, and its own way of choosing bet selections. Such behavioural patterns could be linked back to the demographics of the cluster, allowing us to infer reasons behind its members’ gambling habits and, we hope, to identify irresponsible gamblers. The scope of the project is limited to the Sports Betting segment of customers who have opened betting accounts with Singapore Pools.

The overall objectives of our project are to:

(1) Profile their existing pool of customers through clustering analysis

(2) Create a data visualization of the consumer betting activity

(3) Build a dashboard to visualize the profiling and data points

Project Motivations

In today’s globalized world, the Internet has transformed gambling into a multifaceted, non-physical, multi-platform environment without boundaries. This creates loopholes for illegal gambling operators to enter the market and draw Singapore Pools’ customers away into an unregulated arena where gambling addiction can more easily take hold.

Singapore Pools offers a safer outlet, one where players can bet responsibly, within their means. Attrition rates have been rising over the years, which could mean that Singapore Pools’ customers are seeking other avenues for gambling, such as illegal online gambling sites, which may lead to irresponsible betting. Therefore, within the next few years, our sponsor seeks to take a data-driven approach to promoting responsible gambling by monitoring players’ betting behaviour and performance, in the hope of highlighting alarming patterns that could indicate irresponsible gambling.

Our sponsor has been collecting user data for the past several years but has yet to put it to good use. A year ago, Singapore Pools set up a customer insights division to better understand its customers through the analysis of this data. Its first step towards a data-driven approach to promoting responsible gambling is to understand the gambling behavioural patterns of its customers.

Literature Review

Singapore has the highest spending on gambling per capita, as reported by Global Betting and Gaming Consultants. According to the 2014 NCPG (National Council on Problem Gambling) survey report, an estimated 44% of Singapore’s population participated in gambling-related activities in a one-year period, compared to 47% in the previous survey in 2011. The amounts bet have also fallen, with about 90% of those surveyed spending less than $200 per month and only a minute fraction (0.3%) gambling large amounts of over $1000 each month. An alarming finding of this survey was that probable pathological gamblers are on the rise, with a greater frequency of gambling (83% gambled once a week, as compared to 68% in 2011). Furthermore, this regular gambling habit was picked up at a younger age, with 17% gambling regularly before the age of 18, in comparison to 5% in 2011. This rise was the result of early exposure to online gambling, hence the need to regulate gambling content.

General Rules of Soccer Betting

In soccer betting, the customer places a wager on any of the selections under a specific bet type (e.g. home team to win, or total number of goals in the match equal to 3). If the selection matches the winning selection as declared by Singapore Pools, the customer qualifies for the winnings, which are based on the prevailing odds at the time the bet was placed.
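The settlement rule above can be illustrated with a small sketch. It assumes decimal odds, so that a winning payout is the stake multiplied by the odds recorded when the bet was placed. That odds format, the function name `settle_bet` and the selection labels are assumptions made for illustration; they are not taken from Singapore Pools’ rules.

```python
def settle_bet(stake: float, odds: float, selection: str, winning_selection: str) -> float:
    """Return the payout for one bet (0.0 if the selection lost).

    Assumption: decimal odds, so a winning payout is stake * odds
    and already includes the returned stake.
    """
    if stake <= 0 or odds <= 0:
        raise ValueError("stake and odds must be positive")
    if selection == winning_selection:
        return round(stake * odds, 2)
    return 0.0


# The odds passed in are the ones recorded at bet time, so later odds
# movements for the same event do not change the payout.
winning_payout = settle_bet(stake=10.0, odds=2.5, selection="HOME", winning_selection="HOME")
losing_payout = settle_bet(stake=10.0, odds=2.5, selection="AWAY", winning_selection="HOME")
```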



Project Data

The dataset will be provided to us as an Excel spreadsheet in which each worksheet contains the entire purchase history of one player along with other recorded parameters. It consists of approximately 4,000 unique customer accounts with account activity details over a period of three months. The period spans January to March, which coincides with the peak period of the various international soccer leagues, ensuring a high volume of betting activity for analysis.

The parameters in the dataset can be split into three distinct categories and include the following:

1. Demographics of Players

Customer Account No.

Gender

Date of Birth

Income Range

Type of Membership

Nationality

Account Opening Date


2. Betting Activity

Customer Account No.

Bet Date & Time

Bet Selection

Bet Type

Event Name

Event Code

Bet Amount

Bet Odds

Bet Start Time of Bet Event


3. Top Up & Withdrawal Activity

Customer Account No.

Transaction Date & Time

Transaction Type

Transaction Mode

Transaction Amount
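
Because each player’s history spans many records, profiling needs one summary row per account. The sketch below shows one possible way to collapse Betting Activity records into such rows. The `Bet` class, its attribute names and the chosen summary metrics are hypothetical; only the field names in the comments come from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Bet:
    account_no: str      # Customer Account No.
    placed_at: datetime  # Bet Date & Time
    bet_type: str        # Bet Type
    amount: float        # Bet Amount
    odds: float          # Bet Odds


def summarise_bets(bets: list[Bet]) -> dict[str, dict[str, float]]:
    """Collapse many bet records into one summary row per account."""
    grouped: dict[str, list[Bet]] = defaultdict(list)
    for bet in bets:
        grouped[bet.account_no].append(bet)

    summary = {}
    for account_no, rows in grouped.items():
        total = sum(b.amount for b in rows)
        summary[account_no] = {
            "bet_count": len(rows),
            "total_staked": total,
            "mean_stake": total / len(rows),
            "mean_odds": sum(b.odds for b in rows) / len(rows),
        }
    return summary


# Made-up records purely for illustration.
sample_bets = [
    Bet("A001", datetime(2016, 1, 5, 20, 0), "1X2", 10.0, 2.0),
    Bet("A001", datetime(2016, 1, 9, 21, 30), "Total Goals", 30.0, 4.0),
    Bet("A002", datetime(2016, 2, 1, 19, 0), "1X2", 5.0, 1.5),
]
player_rows = summarise_bets(sample_bets)
```

Top Up & Withdrawal records could be summarised the same way and joined to these rows on the Customer Account No.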

Methodology & Work Scope

Our methodology and proposed work scope for this project are as follows:

(1) Data cleaning

(2) Data exploration

(3) Data transformation

(4) Cluster analysis & profiling

(5) Relationship cross analysis

(6) Creation of new player metrics

(7) Wireframe dashboard & select display parameters

(8) UX internal audit

(9) Client consultation on dashboard

(10) Application calculations & filtering

(11) Dashboard prototype I

(12) Dashboard testing & calibration

(13) Dashboard prototype II

(14) Dashboard final optimization

(15) Client user testing


First phase - Data cleaning and Data preparation

The first phase is data cleaning. The steps include: (1) filtering out non-Singaporean players, as they are not Singapore Pools’ primary concern and could skew the betting patterns; (2) filtering out one-off players who were only active for a brief period; (3) filtering out records with incomplete or empty fields, if any.
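
The three cleaning rules can be sketched as a single filter. This is a minimal illustration under stated assumptions: the dictionary keys, the nationality value "Singaporean" and the 7-day activity threshold for a "one-off" player are all hypothetical and would need to be agreed with the sponsor.

```python
from datetime import date

# Hypothetical field names for a player record.
REQUIRED_FIELDS = ("account_no", "nationality", "gender", "date_of_birth")


def is_complete(player: dict) -> bool:
    """Rule 3: every required field is present and non-empty."""
    return all(player.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def active_span_days(bet_dates: list[date]) -> int:
    """Days between a player's first and last bet (0 if no bets)."""
    return (max(bet_dates) - min(bet_dates)).days if bet_dates else 0


def clean_players(players: list[dict], bet_dates_by_account: dict, min_span_days: int = 7) -> list[dict]:
    kept = []
    for player in players:
        if not is_complete(player):
            continue  # rule 3: incomplete or empty fields
        if player["nationality"] != "Singaporean":
            continue  # rule 1: non-Singaporean players
        dates = bet_dates_by_account.get(player["account_no"], [])
        if active_span_days(dates) < min_span_days:
            continue  # rule 2: one-off players (assumed threshold)
        kept.append(player)
    return kept


# Made-up records purely for illustration.
players = [
    {"account_no": "A001", "nationality": "Singaporean", "gender": "M", "date_of_birth": date(1980, 6, 15)},
    {"account_no": "A002", "nationality": "Malaysian", "gender": "F", "date_of_birth": date(1975, 1, 2)},
    {"account_no": "A003", "nationality": "Singaporean", "gender": "", "date_of_birth": date(1990, 3, 3)},
    {"account_no": "A004", "nationality": "Singaporean", "gender": "F", "date_of_birth": date(1988, 8, 8)},
]
bet_dates = {
    "A001": [date(2016, 1, 5), date(2016, 3, 1)],
    "A002": [date(2016, 1, 5), date(2016, 3, 1)],
    "A003": [date(2016, 1, 5), date(2016, 3, 1)],
    "A004": [date(2016, 2, 10)],
}
cleaned = clean_players(players, bet_dates)
```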

Second phase - Data transformation

Data transformation follows. With the given parameters, we will create new metrics (categorical data) for each player, such as age and socioeconomic standing (SES). We can identify a player’s age from their NRIC number, and use the real estate property they own as a rough gauge of their assets and hence their SES. The transformed data will then be fed into higher-level analytical models and analysis.
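
As an illustration of turning a raw parameter into a categorical metric, the sketch below derives an age band. It assumes the Date of Birth field listed under Project Data is used rather than the NRIC number, and the band boundaries are hypothetical.

```python
from datetime import date


def age_on(date_of_birth: date, reference: date) -> int:
    """Age in completed years on the reference date."""
    years = reference.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_band(age: int) -> str:
    """Map an age to a categorical band (boundaries are assumed)."""
    if age < 30:
        return "under 30"
    if age < 45:
        return "30-44"
    if age < 60:
        return "45-59"
    return "60 and above"


# Reference date taken as the end of the three-month data period.
example_age = age_on(date(1980, 6, 15), date(2016, 3, 31))
example_band = age_band(example_age)
```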

Third phase - Data analytics

We will then use analytical tools (e.g. SAS, SPSS) to create profiles and segments of the various betting behaviours.
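
The clustering technique is not specified here, and the actual analysis would be done in the tools named above. Purely to illustrate the idea of segmenting players by behaviour, the sketch below runs a plain k-means on two hypothetical per-player features (bets per week and mean stake). The choice of k-means, the features and the seeding from the first k points are all assumptions.

```python
import math


def kmeans(points: list[tuple], k: int, iterations: int = 20):
    """Plain k-means; the first k points seed the centroids, so runs are deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Made-up (bets per week, mean stake) pairs purely for illustration.
features = [(2, 10), (20, 150), (3, 12), (22, 160), (2.5, 11)]
labels, centroids = kmeans(features, k=2)
```

In practice the features would be standardised first, since otherwise the variable with the largest scale (here, stake) dominates the distance.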

Fourth phase - Discovery of relationship

Based on the segmentation of the players, we will draw links between the demographics of each segment and the betting behavioural patterns.
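
One simple way to look for such links is a cross-tabulation of segment against a demographic attribute. The sketch below computes, for each segment, the share of players in each category; the function name and sample values are hypothetical.

```python
from collections import Counter, defaultdict


def share_within_segment(segments: list, values: list) -> dict:
    """For each segment, return the proportion of players in each category."""
    pair_counts = Counter(zip(segments, values))
    segment_sizes = Counter(segments)
    shares: dict = defaultdict(dict)
    for (segment, value), count in pair_counts.items():
        shares[segment][value] = count / segment_sizes[segment]
    return dict(shares)


# Made-up segment labels and age bands purely for illustration.
segment_labels = [0, 0, 0, 1]
age_bands = ["under 30", "under 30", "30-44", "30-44"]
shares = share_within_segment(segment_labels, age_bands)
```

A segment whose shares differ sharply from the overall population is a candidate for a demographic link, which would then need a proper statistical test.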

Fifth phase - Wireframing and selecting display parameters

Next comes the design of the dashboard’s user interface and the visualization of data graphs for simple and pleasant viewing. The various viewing pages (for summaries or visualizations of specific data) will be implemented and the dashboard navigation paths will be planned.

Sixth phase - Building of dashboard for data visualization

Before loading the parameters into the dashboard, we must once again transform the data, this time into a readable CSV format that can be loaded into the dashboard prototype. We will use either Tableau or D3.js to build the data visualization.
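
The CSV hand-off can be sketched with Python’s standard `csv` module. The column names are hypothetical; both Tableau and D3.js can read a CSV file with a header row.

```python
import csv
import io


def rows_to_csv(rows: list[dict], fieldnames: list[str]) -> str:
    """Serialise per-player rows to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows purely for illustration.
csv_text = rows_to_csv(
    [{"account_no": "A001", "segment": 0, "total_staked": 40.0}],
    fieldnames=["account_no", "segment", "total_staked"],
)
```

To write a file instead, the same writer can be pointed at `open(path, "w", newline="")`.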

Seventh phase - Optimization of dashboard

Based on the feedback given during the mid-term reporting and by the client, we will implement modifications to the dashboard to ensure that the client’s requirements are met in this final phase of the project.

Project Deliverables

(1) Mid-term report & presentation

(2) Final report & presentation

(3) Project poster

(4) Customized dashboard

Project Limitations & Assumptions

The data collected are only from players who use Singapore Pools’ phone betting service, who account for only 5% of its entire pool of customers. The remaining 95% are anonymous players who place bets at Singapore Pools’ physical outlets. Given that the project outcome is meant as a precursor to the launch of Singapore Pools’ online betting system, the target audience of that system would be more similar to users of the current betting lines service; hence the limited behavioural data we have is well suited to modelling those future online players.

There is a lack of secondary data with regard to this niche field of research on gambling, in Singapore specifically. Without existing literature and data on Singaporeans’ betting behaviour, we will not be able to compare the results of our profiling or verify the underlying reasons behind the betting patterns. The only secondary data we can benchmark against is one report that studied a sample of Australian gamblers in the state of Victoria; hence we have to assume that cross-cultural differences have little influence on how players of different demographics in each country make betting decisions.

However, our chief priority is to highlight gaps in the data that may predict patterns of irresponsible gambling, not to infer the underlying reasons behind any irregular betting behaviour.