Difference between revisions of "ANLY482 AY2016-17 T1 Group2: Project Findings"

From Analytics Practicum
(Created page with "<!-- Start Logo --> <br /> <!-- End Logo --> <!-- Start Navigation Bar --> {|style="background-color:#2196F3; font-family:sans-serif; font-size:140%;" width="100%" cellspacin...")
 
 
(22 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<!-- Start Logo -->
+
<!-- Start Main Navigation Bar -->
<br />
+
{|style="background-color:#0096da; font-family:sans-serif; font-size:140%; text-align:center;" width="100%" cellspacing="0" |
<!-- End Logo -->
+
| style="border-bottom:7px solid #005192;" width="10%" |
 +
[[ANLY482_AY2016-17_T1_Group2 | <font color="#bbdefb">Home</font>]]
 +
 
 +
| style="border-bottom:7px solid #005192;" width="10%" |
 +
[[ANLY482_AY2016-17_T1_Group2: Team | <font color="#bbdefb">Team</font>]]
  
<!-- Start Navigation Bar -->
+
| style="border-bottom:7px solid #005192;" width="20%" |
{|style="background-color:#2196F3; font-family:sans-serif; font-size:140%;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
+
[[ANLY482_AY2016-17_T1_Group2: Project Overview | <font color="#bbdefb">Project Overview</font>]]
| style="border-bottom:7px solid #1976D2; text-align:center;" width="10%" |
 
[[ANLY482_AY2016-17_T1_Group2 | <font color="#FFFFFF">Home</font>]]
 
  
| style="border-bottom:7px solid #1976D2; text-align:center;" width="10%" |
+
| style="border-bottom:7px solid #febd3d;" width="20%" |
[[ANLY482_AY2016-17_T1_Group2: Team | <font color="#FFFFFF">Team</font>]]
+
[[ANLY482_AY2016-17_T1_Group2: Project Findings | <font color="#fff">Project Findings</font>]]
  
| style="border-bottom:7px solid #1976D2; text-align:center;" width="20%" |
+
| style="border-bottom:7px solid #005192;" width="20%" |
[[ANLY482_AY2016-17_T1_Group2: Project Overview | <font color="#FFFFFF">Project Overview</font>]]
+
[[ANLY482_AY2016-17_T1_Group2: Project_Management | <font color="#bbdefb">Project Management</font>]]
  
| style="border-bottom:7px solid #FFC107; text-align:center;" width="20%" |
+
| style="border-bottom:7px solid #005192;" width="20%" |
[[ANLY482_AY2016-17_T1_Group2: Project Findings | <font color="#BBDEFB">Project Findings</font>]]
+
[[ANLY482_AY2016-17_T1_Group2: Documentation | <font color="#bbdefb">Documentation</font>]]
 +
|}
 +
<!-- End Main Navigation Bar -->
  
| style="border-bottom:7px solid #1976D2; text-align:center;" width="20%" |
+
<!-- Start Sub Navigation Bar -->
[[ANLY482_AY2016-17_T1_Group2: Project_Management | <font color="#FFFFFF">Project Management</font>]]
+
{| style="background-color:#fff; font-family:sans-serif; font-size:130%; text-align:center; margin:15px auto 0 auto;" width="60%" cellspacing="0"
 +
|-
 +
! style="vertical-align:top; padding:0px;" width="50%" | [[ Analysis and Findings as of Mid-Terms| <font color="#212121">Mid-Term</font>]]
 +
<div style="border-bottom:4px solid #005192;"></div>
  
| style="border-bottom:7px solid #1976D2; text-align:center;" width="20%" |
+
! style="vertical-align:top; padding:0px;" width="50%" | [[ Analysis and Findings as of Finals | <font color="#757575">Finals</font>]]
[[ANLY482_AY2016-17_T1_Group2: Documentation | <font color="#FFFFFF">Documentation</font>]]
+
<div style="border-bottom:2px solid #005192;"></div>
 
|}
 
|}
<!-- End Navigation Bar -->
+
<!-- End Sub Navigation Bar -->
  
 
<br />
 
<br />
  
 
<!-- Start Information -->
 
<!-- Start Information -->
<div style="background:#2196F3; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#BBDEFB solid 15px;"><div style="border-left:#FFFFFF solid 5px; padding:15px;"><font color="#FFFFFF"><strong>Datasets</strong></font></div></div>
+
<div style="background:#0096da; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#bbdefb solid 15px;"><div style="border-left:#fff solid 5px; padding:15px;"><font color="#fff"><strong>Data Exploration</strong></font></div></div>
  
 
<div style="color:#212121;">
 
<div style="color:#212121;">
The datasets provided by Singapore Pools are:
+
During the Data Exploration phase, our team analyzed variations in demand, defined as the number of tickets sold. This section walks through the findings most relevant to our project.
# Profit & Loss (P&L) by Book
+
 
# Timetables of the matches
+
 
# Leagues & Tournaments
+
* '''Overall Demand Analysis'''
They are presented as Excel sheets and contain 52 unique tournaments over a time span of 7 years (2010 to May 2016). The parameters of the datasets are as follows:
+
Initially, our team plotted the demand for all events together. Although the plot showed variations, they were not insightful on their own.
 +
 
 +
(Picture HERE)
 +
 
 +
We therefore drilled down into the different classifications of events. Our team classified the events as shown in the image below, because differences in event frequency can affect the demand being analyzed.
 +
 
 +
[[File:Event_Classification.JPG|600px|center|Event Classification]]
 +
 
 +
The graph shows that demand peaks at different times for different types of events. For example, demand for regular events, which are held on an annual basis, peaks in April of every year, driven by Coachella at California. Similarly, demand for seasonal events peaked in June of 2010, 2012, 2014 and 2015, driven by the International Music Festival and the Hardwell World Tour concerts.
  
'''<big>Profit & Loss (P&L) by Book</big>'''
+
(Picture HERE)
{| class="wikitable" style="margin-top:0px"
+
 
 +
 
 +
* '''Control Chart Analysis'''
 +
Following on from the previous sub-section, our team wanted to analyze demand in more detail for specific events, so we selected a few events to analyze using Control Charts.
 +
 
 +
The Control Chart is used to analyze demand as a time series, and it allows us to visualize how demand moves over a specified time frame. The red lines mark the upper and lower control limits, set at 3 standard deviations from the average; data points beyond them fall outside the normal range. In this case we are mainly interested in points above the upper control limit. The green line marks the average of the data points.
 +
 
 +
In this analysis, we look at the following specific events of different event types:
 +
 
 +
# Seasonal Event - International Music Festival 2010 & 2014
 +
# Regular Event - Coachella at California 2010 to 2012
 +
# Regular Event - Bonnaroo at Tennessee 2010 to 2012
 +
 
 +
 
 +
The main objective of the analysis is to find a pattern in how demand changes over time within a specific event. In addition, we want to find out at which point in time demand peaks.
 +
 
 +
 
 +
* Seasonal Event - International Music Festival
 +
 
 +
{| class="wikitable"
 
|-
 
|-
! Parameters !! Description
+
! colspan="2" | Seasonal Event - International Music Festival Control Chart Analysis
 
|-
 
|-
| Book Date (settlement time)
+
| [[File:Seasonal Event - Intl Music Festival.JPG|500px|center|Seasonal Event - International Music Festival 2010]] || [[File:Seasonal Event - Intl Music Festival-2014.JPG|500px|center|Seasonal Event - International Music Festival 2014]]
|| The official time and date (after the end of the match) for the payout of the bet placed. <sub>For example, 9/1/2010 4:01:00 PM</sub>
 
 
|-
 
|-
| League Name
+
| colspan="2" | '''Analysis:''' Each data point in the Control Chart represents a single performance within the International Music Festival, and the points are arranged in time order. From the Control Charts above, our team found that for most seasonal events, such as the International Music Festival, demand spikes suddenly as the date approaches the closing stages of the event. This extreme spike in demand towards the end of the International Music Festival may have caused a sales bottleneck for TixCo.
|| The name of the league. <sub>For example, A LEAGUE and ASIAN CHAMP</sub>
 
|-
 
| Book Title
 
|| The title of the match. <sub>For example, 4501 Wellington v Brisbane Roar and 4502 Central Coast v Queensland Fury</sub>
 
|-
 
| Bet Type
 
|| The type of the bet. <sub>For example, 1/2 GOAL and HALF/FULL TIME DOUBLE</sub>
 
|-
 
| Winning Selection
 
|| The winning bet type. <sub>For example, Wellington(-1.5) and Queensland Fury(1.5)</sub>
 
|-
 
| Total Stake
 
|| The amount of payment collected.
 
|-
 
| Winning Payout
 
|| The total amount of payout for the winning selection of the match.
 
|-
 
| Winning Liab
 
|| The total winning liability (Total Stake - Winning Payout) of the match.
 
|-
 
| Total Bet Cnt
 
|| The total count of the winning bet type.
 
 
|}
 
|}
  
'''<big>Timetables of the matches</big>'''
+
 
{| class="wikitable" style="margin-top:0px"
+
* Regular Event - Coachella at California 2010 - 2012
|-
+
 
! Parameters !! Description
+
{| class="wikitable"
 
|-
 
|-
| League
+
! colspan="3" | Regular Event - Coachella at California Control Chart Analysis
|| League name. <sub>For example, A league and ASEAN 8</sub>
 
 
|-
 
|-
| Match No
+
| [[File:Regular_Event_-Coachella-2010.JPG|500px|center|Regular Event - Coachella at California 2010]] || [[File:Regular_Event_-Coachella-2011.JPG|500px|center|Regular Event - Coachella at California 2011]] || [[File:Regular_Event_-Coachella-2012.JPG|500px|center|Regular Event - Coachella at California 2012]]
|| Unique identifier for the match on that day. <sub>For example, 4506 means A league on 1st december 2010</sub>
 
 
|-
 
|-
| Match KickOff (Singapore date/time)
+
| colspan="3" | '''Analysis:''' Each data point in the Control Chart represents a single performance within Coachella at California, and the points are arranged in time order. First, there is no clear pattern of increasing demand; demand fluctuates as time progresses. Second, there are many spikes above the Upper Control Limit (UCL), which is very different from what we saw in the Seasonal Events. Brushing the data points above the UCL reveals that the more popular bands commanded higher demand.
|| Match kickoff time in Singapore (Singapore time). <sub>For example, 1/12/2010  4:00:00 PM (Singapore time) and 1/12/2010  7:00:00 PM (Local time)</sub>
 
|-
 
| Match KickOff (Local date/time)
 
|| Match kickoff time at the host country (Local time). <sub>For example, 1/12/2010  4:00:00 PM (Singapore time) and 1/12/2010  7:00:00 PM (Local time)</sub>
 
|-
 
| Home Team
 
|| Home team name.
 
|-
 
| Away Team
 
|| Away team name.
 
|-
 
| Remarks
 
|| Remarks for the timetable (if applicable).
 
|-
 
| Start of sales
 
|| Start time of sales - normally 2 days before the actual match. <sub>For example, 1/12/2010  8:00:00 AM</sub>
 
|-
 
| Close of sales
 
|| End time of sales. <sub>For example, 3/12/2010  1:55:00 AM</sub>
 
 
|}
 
|}
  
'''<big>Leagues & Tournaments</big>'''
+
 
{| class="wikitable" style="margin-top:0px"
+
* Regular Event - Bonnaroo at Tennessee 2010 - 2012
|-
+
 
! Parameters !! Description
+
{| class="wikitable"
|-
 
| No.
 
|| Unique identifier for the leagues. <sub>For example, 01 means S.League and 02 means Singapore Cup</sub>
 
|-
 
| League Name
 
|| The name of the league. <sub>For example, S.League and Singapore Cup</sub>
 
|-
 
| Mnemonic
 
|| Short form of the league. <sub>For example, SL means S.League and SC means Singapore Cup</sub>
 
|-
 
| Season #
 
|| Season number.
 
 
|-
 
|-
| Tournament
+
! colspan="3" | Regular Event - Bonnaroo at Tennessee Control Chart Analysis
|| Tournament country.
 
 
|-
 
|-
| Other League Name
+
| [[File:Regular_Event_-Bonnaroo-2010.JPG|500px|center|Regular Event - Bonnaroo at Tennessee 2010]] || [[File:Regular_Event_-Bonnaroo-2011.JPG|500px|center| Regular Event - Bonnaroo at Tennessee 2011]] || [[File:Regular_Event_-Bonnaroo-2012.JPG|500px|center| Regular Event - Bonnaroo at Tennessee 2012]]
|| Other name of the league. <sub>For example, English Premier is the same as E Premier</sub>
 
 
|-
 
|-
| Remarks
+
| colspan="3" | '''Analysis:''' Each data point in the Control Chart represents a single performance within Bonnaroo at Tennessee, and the points are arranged in time order. As with the Regular Event Coachella at California, there is no clear pattern of increasing demand; demand fluctuates considerably throughout the event. However, our team noted that overall demand is much more stable, with fewer spikes above the Upper Control Limit. Unlike Coachella at California, the spikes in demand seem to involve bands at random.
|| Remarks for the leagues and tournaments (if applicable).
 
 
|}
 
|}
  
 
</div>
 
</div>
 +
 +
<br />
 +
 +
<div style="background:#0096da; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#bbdefb solid 15px;"><div style="border-left:#fff solid 5px; padding:15px;"><font color="#fff"><strong>Summary of Data Exploration</strong></font></div></div>
 +
 +
<div style="color:#212121;">
 +
 +
Based on our team's analysis, a multitude of factors can potentially affect the demand for each event. For example, Coachella at California shows a strong relationship between the popularity of the bands involved and the number of tickets sold, whereas Bonnaroo at Tennessee shows a different set of trends for which our team was not able to determine the correlated factors.
 +
 +
</div>
 +
 +
 
<!-- End Information -->
 
<!-- End Information -->

Latest revision as of 17:01, 1 December 2016



Data Exploration

During the Data Exploration phase, our team analyzed variations in demand, defined as the number of tickets sold. This section walks through the findings most relevant to our project.


  • Overall Demand Analysis

Initially, our team plotted the demand for all events together. Although the plot showed variations, they were not insightful on their own.

(Picture HERE)

We therefore drilled down into the different classifications of events. Our team classified the events as shown in the image below, because differences in event frequency can affect the demand being analyzed.

Event Classification

The graph shows that demand peaks at different times for different types of events. For example, demand for regular events, which are held on an annual basis, peaks in April of every year, driven by Coachella at California. Similarly, demand for seasonal events peaked in June of 2010, 2012, 2014 and 2015, driven by the International Music Festival and the Hardwell World Tour concerts.

(Picture HERE)
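
The peak months described above could be found with a short aggregation like the sketch below. It is only an illustration, not the team's actual method: the record layout `(event_type, sale_date, tickets)` and the function name `peak_month_by_type` are assumptions, since the source does not describe how the TixCo data is stored.

```python
from collections import defaultdict
from datetime import date


def peak_month_by_type(sales):
    """Return {(event_type, year): month with the most tickets sold}.

    `sales` is an iterable of (event_type, sale_date, tickets) tuples.
    This record layout is hypothetical; adapt it to the real dataset.
    """
    # Sum tickets sold per event type, year and month.
    totals = defaultdict(int)
    for event_type, sale_date, tickets in sales:
        totals[(event_type, sale_date.year, sale_date.month)] += tickets

    # Keep the month with the highest total for each (event type, year).
    # On a tie, the month seen first is kept.
    peaks = {}
    for (event_type, year, month), tickets in totals.items():
        key = (event_type, year)
        if key not in peaks or tickets > peaks[key][1]:
            peaks[key] = (month, tickets)
    return {key: month for key, (month, _) in peaks.items()}
```

Grouping by year as well as event type makes it possible to check whether a peak recurs every year, as reported for regular events, or only in some years, as reported for seasonal events.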


  • Control Chart Analysis

Following on from the previous sub-section, our team wanted to analyze demand in more detail for specific events, so we selected a few events to analyze using Control Charts.

The Control Chart is used to analyze demand as a time series, and it allows us to visualize how demand moves over a specified time frame. The red lines mark the upper and lower control limits, set at 3 standard deviations from the average; data points beyond them fall outside the normal range. In this case we are mainly interested in points above the upper control limit. The green line marks the average of the data points.
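
As a rough illustration of the control limits described above, the sketch below computes the centre line and the limits at 3 standard deviations, then lists the points above the upper limit. It makes several assumptions: the function names are hypothetical, demand is taken to be a plain list of tickets sold per performance in time order, and the population standard deviation is used. The source does not state which estimate of the standard deviation the charting tool applied.

```python
from statistics import mean, pstdev


def control_limits(demand, sigmas=3.0):
    """Return (lcl, centre, ucl) for a list of per-performance ticket counts.

    Assumption: limits are centre +/- `sigmas` population standard deviations.
    """
    centre = mean(demand)
    spread = pstdev(demand)
    # Ticket counts cannot be negative, so the lower limit is floored at 0.
    lcl = max(0.0, centre - sigmas * spread)
    ucl = centre + sigmas * spread
    return lcl, centre, ucl


def points_above_ucl(demand, sigmas=3.0):
    """Return (index, value) pairs for points above the upper control limit."""
    _, _, ucl = control_limits(demand, sigmas)
    return [(i, d) for i, d in enumerate(demand) if d > ucl]
```

The list returned by `points_above_ucl` corresponds to the points one would brush on the chart to see which performances drove the spikes.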

In this analysis, we look at the following specific events of different event types:

  1. Seasonal Event - International Music Festival 2010 & 2014
  2. Regular Event - Coachella at California 2010 to 2012
  3. Regular Event - Bonnaroo at Tennessee 2010 to 2012


The main objective of the analysis is to find a pattern in how demand changes over time within a specific event. In addition, we want to find out at which point in time demand peaks.


  • Seasonal Event - International Music Festival
Seasonal Event - International Music Festival Control Chart Analysis
Seasonal Event - International Music Festival 2010
Seasonal Event - International Music Festival 2014
Analysis: Each data point in the Control Chart represents a single performance within the International Music Festival, and the points are arranged in time order. From the Control Charts above, our team found that for most seasonal events, such as the International Music Festival, demand spikes suddenly as the date approaches the closing stages of the event. This extreme spike in demand towards the end of the International Music Festival may have caused a sales bottleneck for TixCo.


  • Regular Event - Coachella at California 2010 - 2012
Regular Event - Coachella at California Control Chart Analysis
Regular Event - Coachella at California 2010
Regular Event - Coachella at California 2011
Regular Event - Coachella at California 2012
Analysis: Each data point in the Control Chart represents a single performance within Coachella at California, and the points are arranged in time order. First, there is no clear pattern of increasing demand; demand fluctuates as time progresses. Second, there are many spikes above the Upper Control Limit (UCL), which is very different from what we saw in the Seasonal Events. Brushing the data points above the UCL reveals that the more popular bands commanded higher demand.


  • Regular Event - Bonnaroo at Tennessee 2010 - 2012
Regular Event - Bonnaroo at Tennessee Control Chart Analysis
Regular Event - Bonnaroo at Tennessee 2010
Regular Event - Bonnaroo at Tennessee 2011
Regular Event - Bonnaroo at Tennessee 2012
Analysis: Each data point in the Control Chart represents a single performance within Bonnaroo at Tennessee, and the points are arranged in time order. As with the Regular Event Coachella at California, there is no clear pattern of increasing demand; demand fluctuates considerably throughout the event. However, our team noted that overall demand is much more stable, with fewer spikes above the Upper Control Limit. Unlike Coachella at California, the spikes in demand seem to involve bands at random.


Summary of Data Exploration

Based on our team's analysis, a multitude of factors can potentially affect the demand for each event. For example, Coachella at California shows a strong relationship between the popularity of the bands involved and the number of tickets sold, whereas Bonnaroo at Tennessee shows a different set of trends for which our team was not able to determine the correlated factors.