Group03Proposal
Background
Crown jewel of the Formula One race circuit, backdrop of the successful Hollywood film “Crazy Rich Asians” and host of the memorable North Korea-United States Summit, Singapore has positioned herself as a neutral yet vibrant destination, drawing hordes of visitors to her sunny shores. It is no surprise that the tourism sector has been developing into a growth engine for Singapore’s economy. In 2017, Singapore’s tourism sector attained record highs in both tourist arrivals and spending. According to data released by the Singapore Tourism Board, arrivals increased by 6.2 per cent to 17.4 million, while tourism receipts increased by 3.9 per cent to S$26.8 billion. The increasing affordability of travel, driven by the prevalence of low-cost carriers globally, has also contributed to this upward trend.
Beyond tourism, Singapore is also an ideal venue for conducting business. Singapore has consistently been ranked among the top Asian cities, if not the top, for hosting Meetings, Incentive Travel, Conventions & Exhibitions (MICE) events. Its prime geographical location and stable political climate are the two main reasons it is a leading destination for international MICE events. In 2017, a total of 935 international meetings took place in Singapore.
 
  
  
Past Work Reviews
Motivation
During our exploratory analysis of the data on tourist arrivals into Singapore, we noticed that the arrival patterns of tourists and business travellers from different countries are heterogeneous. A keen understanding of these distinct travel behaviours can reveal travel preferences, which local businesses need in order to devise plans that attract more tourism receipts and boost their revenue. Businesses whose analysts can work with the data and turn the insights into actionable decisions are more likely to flourish.
With the recent completion of Marina Cruise Centre and the ongoing construction of Jewel Changi Airport, tourism receipts are expected to continue to grow steadily for the next decade, barring any black swan events.
Objectives
We aim to build an interactive platform that illustrates the trends and seasonality in time-series data on the Singapore tourism sector, giving users a better understanding of how tourism in Singapore has developed over the last ten years.
Through this project, we hope that businesses in the tourism industry, especially small and medium enterprises, can make better marketing and business decisions. We aim to create a platform that helps business owners and analysts draw useful insights from the relationship between tourism revenue and expenditure, in support of economic growth. The platform will offer the following:
- An overview of visitor arrival patterns by country, age and mode of transport.
- A geographic map illustrating visitor density across countries of origin.
- Tourism demand forecasting.
Data Source
The Singapore tourism sector data is extracted from the CEIC database, which is available at:
 
https://insights-ceicdata-com.libproxy.smu.edu.sg
 
We have selected the following five datasets:
- Arrival by country
- Arrival by age
- Arrival by transport
- Length of stay
- Tourism revenue and expenditure
The datasets are available at monthly or yearly frequency, or both. For our analysis, we plan to use data from 2007 onwards.
Methodology
Exploratory Analysis
We will explore the trends, cyclicity and seasonality in the time-series data provided by the various tourism datasets. Interactions between the identified attributes might also yield insights that we can use in our analysis. We plan to visualize the time series in the following ways:
- Geographic heat map: Displays the density of visitor arrivals by country for a selected calendar month.
- Slopegraphs: This visualization technique can provide maximum information with “minimum ink”. It could help us to see how the number of visitors changed over the years.
- Waterfall: Rather than the values themselves, a waterfall plot brings out the changes in the values. It could provide an overview of the time series together with the size of the difference between consecutive data points.
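The data preparation behind the waterfall and slopegraph views can be sketched as follows. This is a minimal illustration in plain Python rather than the R packages the proposal lists; the function names `waterfall_steps` and `slopegraph_endpoints` are hypothetical, and the yearly figures are made-up values, not CEIC data.

```python
# Hypothetical yearly visitor arrivals (in millions); illustrative values only,
# not figures taken from the CEIC datasets.
arrivals = {2014: 15.0, 2015: 15.5, 2016: 16.5, 2017: 17.5}


def waterfall_steps(series):
    """Return (period, change, running_total) rows for a waterfall chart.

    The first row carries the starting value as its "change"; every later
    row carries the difference from the previous period.
    """
    rows = []
    previous = None
    for period in sorted(series):
        value = series[period]
        change = value if previous is None else value - previous
        rows.append((period, change, value))
        previous = value
    return rows


def slopegraph_endpoints(series):
    """Return the first and last (period, value) pairs of one slopegraph line."""
    periods = sorted(series)
    first, last = periods[0], periods[-1]
    return (first, series[first]), (last, series[last])


steps = waterfall_steps(arrivals)
endpoints = slopegraph_endpoints(arrivals)

for period, change, total in steps:
    print(f"{period}: change {change:+.1f}, running total {total:.1f}")
```

The changes in the waterfall rows add up to the final value, which is the property the chart relies on; a slopegraph, by contrast, only needs the two end points per country.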
Explanatory Analysis
- Decompose each time series into its constituent parts: the observed series is split into trend, seasonal and random (noise) components. From the separate components, users can understand the different time-series patterns and derive insights.
- Our datasets contain many variables (columns), so the dimensionality is too high for effective analysis and the curse of dimensionality becomes a risk. It is therefore important to reduce dimensionality. One suitable approach is to use time series representations, which reduce dimensionality, reduce noise and emphasize the main characteristics of each series. In this stage, we would like to cluster the time series to group countries with similar patterns.
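The two steps above can be sketched together. The example below assumes a classical additive decomposition (observed = trend + seasonal + random) with a centred moving average as the trend estimate, and then uses the 12-value seasonal profile as a reduced representation for comparing countries. Both choices are assumptions made for illustration and are not necessarily the representation or method the project will adopt; the series are synthetic, the country labels and function names are hypothetical, and only a distance is computed, not a full clustering.

```python
import math


def centred_moving_average(values, period):
    """Trend estimate via a centred moving average; None where the window does not fit."""
    n = len(values)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on the two end points (a 2 x period average).
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def decompose_additive(values, period):
    """Split values so that values = trend + seasonal + remainder where trend is defined."""
    trend = centred_moving_average(values, period)
    # Average the detrended values for each position in the seasonal cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, m) in enumerate(zip(values, trend)):
        if m is not None:
            buckets[t % period].append(y - m)
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period
    profile = [m - centre for m in means]  # seasonal indices, centred to sum to zero
    seasonal = [profile[t % period] for t in range(len(values))]
    remainder = [None if m is None else y - m - s
                 for y, m, s in zip(values, trend, seasonal)]
    return trend, seasonal, remainder, profile


def euclidean(a, b):
    """Euclidean distance between two equal-length representations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic monthly series for three hypothetical source countries (36 months).
months = range(36)
country_a = [100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in months]
country_b = [50 + 1 * t + 10 * math.sin(2 * math.pi * t / 12) for t in months]
country_c = [80 + 10 * math.cos(2 * math.pi * t / 12) for t in months]

trend_a, seasonal_a, remainder_a, profile_a = decompose_additive(country_a, 12)
_, _, _, profile_b = decompose_additive(country_b, 12)
_, _, _, profile_c = decompose_additive(country_c, 12)

dist_ab = euclidean(profile_a, profile_b)  # same seasonal shape, different level
dist_ac = euclidean(profile_a, profile_c)  # seasonal peak shifted by three months
print(f"A vs B: {dist_ab:.3f}, A vs C: {dist_ac:.3f}")
```

Each 36-point series is reduced to 12 seasonal indices, so countries that differ in level and growth but share a seasonal shape (A and B here) end up close together, while a country with a shifted peak (C) ends up far away. A clustering algorithm would then operate on these distances.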
Predictive Analysis
Time series forecasting is the use of a model to predict future values based on previously observed values. In this case, we would like to use forecasting techniques such as seasonal exponential smoothing and ARIMA. After forecasting, we will compare the predicted values with the actual tourism figures to assess the accuracy of our forecasts. The standard error and other accuracy statistics will also be estimated to validate the forecasting models and help choose the best one.
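The comparison of predicted and actual values can be sketched with a hold-out evaluation. In the example below, two deliberately simple baselines (a seasonal naive forecast and a mean forecast) stand in for the seasonal exponential smoothing and ARIMA models, which are not implemented here; the series is synthetic, and the function names and the choice of RMSE as the selection criterion are assumptions for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(train, horizon, period=12):
    """Forecast each future month with the value from the same month of the last cycle."""
    start = len(train) - period
    return [train[start + (h % period)] for h in range(horizon)]


def mean_forecast(train, horizon):
    """Forecast every future month with the training mean."""
    average = sum(train) / len(train)
    return [average] * horizon


# Synthetic monthly arrivals: level + linear growth + yearly seasonality (36 months).
series = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(36)]
train, test = series[:24], series[24:]  # hold out the final 12 months

forecasts = {
    "seasonal_naive": seasonal_naive(train, len(test)),
    "mean": mean_forecast(train, len(test)),
}
scores = {
    name: {"mae": mae(test, f), "rmse": rmse(test, f), "mape": mape(test, f)}
    for name, f in forecasts.items()
}
best = min(scores, key=lambda name: scores[name]["rmse"])

for name, s in scores.items():
    print(f"{name}: MAE {s['mae']:.2f}, RMSE {s['rmse']:.2f}, MAPE {s['mape']:.2f}%")
print("lowest RMSE:", best)
```

The same hold-out split and error measures would apply unchanged to forecasts produced by the actual candidate models, so the model with the lowest error on unseen months can be selected on a like-for-like basis.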
Application Libraries & Packages
| Package Name | Description | 
|---|---|
| TSrepr | Methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to support more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented, along with min-max and z-score normalisations and forecasting accuracy measures. | 
| ggplot2 | ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details. | 
| cluster | Methods for cluster analysis. | 
| clusterCrit | Computes clustering validation indices; its bestCriterion function returns the best index value according to a specified criterion. | 
| ggmap | ggmap is a package for spatial data visualization. It can retrieve map tiles from various online sources (e.g. Google Maps) for use as layers within the ggplot2 plotting system. | 
| slopegraph | Convert a data frame (containing a panel dataset, where rows are observations and columns are time periods) into an Edward Tufte-inspired "slopegraph" using either base or ggplot2 graphics. | 
| forecast | Methods and tools for displaying and analysing univariate time series forecasts, including exponential smoothing via state space models and automatic ARIMA modelling. | 
| tseries | Time series analysis tools; used here for the Augmented Dickey-Fuller test (adf.test) of the null hypothesis that a series has a unit root. | 
References
1. Tay, F. (2018, February 12). Tourist arrivals, spending in Singapore hit record high for 2nd straight year; China top source of visitors.
https://www.straitstimes.com/singapore/tourist-spending-in-singapore-hit-record-268b-in-2017-china-top-source-of-visitors
2. (2018, June 09). Newsletters Singapore Excels as MICE Destination.
https://www.stb.gov.sg/news-and-publications/newsletters/Pages/June 2015/Singapore-Excels-as-MICE-.aspx
3. Laurinec, P. (2018, March 13). TSrepr use case - Clustering time series representations in R.
https://petolau.github.io/TSrepr-clustering-time-series-representations/
4. Turner, P. (2012, November). The Comparative Economic Impact of Travel & Tourism. WTTC.
https://www.wttc.org//media/files/reports/benchmark%20reports/the_comparative_economic_impact_of_travel tourism.pdf
5. Dalinina, R. (2017, January 10). Introduction to Forecasting with ARIMA in R.
https://www.datascience.com/blog/introduction-to-forecasting-with-arima-in-r-learn-data-science-tutorials
6. Powell, C. (2018, June 22). Creating Slopegraphs with R.
https://datascienceplus.com/creating-slopegraphs-with-r/
7. Tan, A. (2017, October 24). Singapore tourism doubled in 10 years, supports 164,000 jobs.
https://www.businesstimes.com.sg/government-economy/singapore-tourism-doubled-in-10-years-supports-164000-jobs-wttc

