File:Group3ProjectBanner.PNG

From Visual Analytics and Applications
Revision as of 22:14, 14 October 2018 by Anna.zuo.2017 (talk | contribs)




Background

Crown jewel of the Formula One race circuit, backdrop of the successful Hollywood film “Crazy Rich Asians” and host of the memorable North Korea-United States Summit, Singapore has positioned herself as a neutral yet vibrant destination, drawing large numbers of visitors to her sunny shores. It is no surprise that the tourism sector has been developing into a growth engine for Singapore’s economy. In 2017, Singapore’s tourism sector attained record highs in both visitor arrivals and spending. According to data released by the Singapore Tourism Board, arrivals increased by 6.2 per cent to 17.4 million, while tourism receipts increased by 3.9 per cent to S$26.8 billion. The increasing affordability of travel, driven by the prevalence of low-cost carriers globally, has also contributed to this trend.

Motivation

During our exploratory analysis of the data on tourist arrivals into Singapore, we noticed that the arrival patterns of tourists from different countries are heterogeneous. A keen understanding of these distinct arrival patterns can reveal travel preferences, which is essential for businesses seeking to attract more tourism receipts and boost their revenue. Analysts who can work with the data and turn insights into actionable business decisions will help their businesses flourish.

With the recent completion of Marina Cruise Centre and the ongoing construction of Jewel Changi Airport, tourism receipts are expected to continue to grow steadily over the next decade, barring any black swan events.

Objectives

We aim to build an interactive platform that illustrates the trends and seasonality in time-series data on the Singapore tourism sector, giving users a better understanding of how Singapore tourism has developed over the last ten years.

Through this project, we hope that businesses in the tourism industry, especially small and medium enterprises, can make better marketing and business decisions. We aim to create a platform that helps business owners and analysts draw useful insights from the relationship between tourism revenue and expenditure, and so promote economic growth.

  • The platform gives an overview of visitor arrival patterns by country, age and mode of transport.
  • It provides a geographic map that illustrates visitor density across countries.
  • It supports tourism demand forecasting.


Data Source

The Singapore tourism sector data is extracted from the CEIC database, which is available at:
https://insights-ceicdata-com.libproxy.smu.edu.sg

We have selected five datasets, covering tourism arrivals by country, by age, by mode of transport and by length of stay, as well as tourism revenue and expenditure. The datasets are monthly. For our analysis, we plan to use data from 2007 onwards.

Methodology

Exploratory Analysis

We will explore the trends in the time-series data provided by the various tourism datasets (periodic cycles and seasonality). Interactions between the identified attributes may yield insights that we can use in our analysis. We will visualize the time series in the following ways:

  • Geographic heat map: Displays the density of visitor arrivals by country for a selected calendar month.
  • Slopegraphs: This visualization technique provides maximum information with “minimum ink”. It can help us detect how the number of visitors changed over the years.
  • Waterfall: Rather than the values themselves, a waterfall plot brings out the changes in the values. It gives an overview of the time series, as a line chart would, while also showing how large the difference is between two consecutive data points.
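The quantity a waterfall plot draws is the change between consecutive data points. The sketch below illustrates that calculation in plain Python rather than in the R packages the project lists. The arrival figures are hypothetical placeholders, not CEIC data, and the function name `waterfall_steps` is our own.

```python
# Hypothetical monthly visitor arrivals (in thousands); NOT real CEIC figures.
arrivals = {
    "2017-01": 1450,
    "2017-02": 1380,
    "2017-03": 1420,
    "2017-04": 1400,
}


def waterfall_steps(series):
    """Return (label, start, change, end) for each consecutive pair of points."""
    labels = list(series)  # dicts keep insertion order, so months stay sorted
    steps = []
    for prev, curr in zip(labels, labels[1:]):
        start, end = series[prev], series[curr]
        steps.append((curr, start, end - start, end))
    return steps


steps = waterfall_steps(arrivals)
for label, start, change, end in steps:
    # Each bar of the waterfall starts at `start` and has height `change`.
    print(f"{label}: {start} -> {end} ({change:+d})")
```

The changes always sum to the difference between the last and first values, which is why the bars of a waterfall plot join the start of the series to its end.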

Explanatory Analysis

  • Decompose the time series into its constituent parts: Observed, Seasonal, Trend and Random (Noise). From the separate parts, users can understand the different time-series patterns and derive insights.
  • Our datasets contain many variables (columns), so the dimensionality is too high for effective analysis and the curse of dimensionality becomes a risk. It is therefore important to reduce dimensionality. One suitable approach is to use time-series representations, which reduce dimensionality, reduce noise and emphasize the main characteristics of each series. At this stage, we would like to cluster the time series to group countries with similar patterns.
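The decomposition in the first bullet can be illustrated with a classical additive decomposition: estimate the trend with a centred moving average, average the detrended values at each position in the cycle to get the seasonal component, and treat what is left as the random part. The sketch below is a minimal Python version under stated assumptions. It assumes an additive model and an even period, the synthetic series and the function names are ours, and it is not the routine the project itself will use.

```python
def centered_moving_average(values, period):
    """2 x period centred moving average (even period); None where undefined."""
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        # The two end points get half weight so the window is centred on i.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend


def decompose_additive(values, period):
    """Split a series into trend, seasonal and remainder (random) parts."""
    trend = centered_moving_average(values, period)
    # Collect detrended values by their position within the seasonal cycle.
    buckets = [[] for _ in range(period)]
    for i, (v, t) in enumerate(zip(values, trend)):
        if t is not None:
            buckets[i % period].append(v - t)
    raw = [sum(b) / len(b) for b in buckets]
    mean_raw = sum(raw) / period
    cycle = [r - mean_raw for r in raw]  # centre so seasonal effects sum to zero
    seasonal = [cycle[i % period] for i in range(len(values))]
    remainder = [
        None if t is None else v - t - s
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Synthetic series: linear trend plus a repeating pattern of period 4.
# Monthly tourism data would use period=12 and at least two full years.
pattern = [10, -5, -10, 5]
values = [100 + 2 * i + pattern[i % 4] for i in range(12)]
trend, seasonal, remainder = decompose_additive(values, period=4)
```

Because the synthetic series contains no noise, the remainder is zero wherever the trend is defined. Real arrival data would leave a non-zero remainder that users can inspect for unusual months.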

Predictive Analysis

Time series forecasting is the use of a model to predict future values based on previously observed values. In this case, we would like to use forecasting techniques such as seasonal exponential smoothing and ARIMA. After forecasting, we will compare predicted tourism figures with actual figures to assess the accuracy of our forecasts. The standard error and other accuracy statistics can also be estimated to validate the forecasting models and help choose the best one.
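Comparing predicted and actual values is usually done on a hold-out period. The sketch below shows the idea in Python with three common accuracy measures (MAE, RMSE and MAPE). As a stand-in for seasonal exponential smoothing or ARIMA it uses a seasonal naive forecast, which simply repeats the last observed cycle. The figures are hypothetical and the function names are ours.

```python
import math


def seasonal_naive_forecast(train, period, horizon):
    """Baseline forecast: repeat the last observed seasonal cycle."""
    last_cycle = train[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


def accuracy(actual, predicted):
    """Mean absolute error, root mean squared error and mean absolute % error."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical quarterly arrivals (in thousands); NOT real CEIC figures.
history = [100, 120, 90, 110, 104, 126, 93, 115]
period = 4
train, test = history[:-period], history[-period:]  # hold out the last cycle

forecast = seasonal_naive_forecast(train, period, horizon=len(test))
scores = accuracy(test, forecast)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

The same `accuracy` function can score each candidate model on the same hold-out period, and the model with the lowest error is the natural choice. A baseline such as the seasonal naive forecast is a useful reference point: a model that cannot beat it adds little.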

Application Libraries & Packages

| Package Name | Description |
| --- | --- |
| Shiny[1] | Interactive web applications for data visualization |
| Tidyverse: tidyr, dplyr, ggplot2[2] | Tidying and manipulating data for visualizing in ggplot2 |
| Shinythemes[3] | Provides consistent UI elements for aesthetics |
| forecast[4], broom, sweep[5] | Packages used to "tidy" data models for easy forecasting. The forecast package uses ts objects that are difficult to manipulate. sw_sweep from the sweep package uses broom-style tidiers to extract model information into "tidy" data frames. The sweep package also uses timetk at the back end to maintain the original time series index throughout the whole process. |
| tibbletime[6] | Time-based data subsetting |
| lubridate[7] | Easy manipulation of datetime data |
| timetk[5] | Extracting and checking the datetime index from ts objects |
| stringr[8] | String manipulation |
| DT[9] | Sortable data table UI element for model accuracy measures |
| cowplot[10] | Arrangement of multiple ggplots in a single renderPlot function |
| shinycssloaders[11] | Loading animation for large data loading and model training |

References

1. TSrepr use case - Clustering time series representations in R
https://petolau.github.io/TSrepr-clustering-time-series-representations/

2. The Comparative Economic Impact of Travel & Tourism - WTTC
https://www.wttc.org//media/files/reports/benchmark%20reports/the_comparative_economic_impact_of_travel__tourism.pdf

3. Creating Slopegraphs with R
https://datascienceplus.com/creating-slopegraphs-with-r/

4. Introduction to Forecasting with ARIMA in R
https://www.datascience.com/blog/introduction-to-forecasting-with-arima-in-r-learn-data-science-tutorials

  1. RStudio.org. "Interact. Analyze. Communicate." Retrieved on 30 November 2017.
  2. tidyverse.org. "R packages for data science". Retrieved on 30 November 2017.
  3. Chang, W., RStudio, et al. "shinythemes: Themes for Shiny". Retrieved on 30 November 2017.
  4. Hyndman, R., et al. "forecast: Forecasting Functions for Time Series and Linear Models". Retrieved on 30 November 2017.
  5. www.business-science.io. "Open Source Software For Business & Financial Analysis". Retrieved on 30 November 2017.
  6. Vaughan, D., et al. "tibbletime: Time Aware Tibbles". Retrieved on 30 November 2017.
  7. Spinu, V., et al. "lubridate: Make Dealing with Dates a Little Easier". Retrieved on 30 November 2017.
  8. Wickham, H., and RStudio. "stringr: Simple, Consistent Wrappers for Common String Operations". Retrieved on 30 November 2017.
  9. Xie, Y., et al. "DT: A Wrapper of the JavaScript Library 'DataTables'". Retrieved on 30 November 2017.
  10. Wilke, C., et al. "cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'". Retrieved on 30 November 2017.
  11. Sali, A., et al. "shinycssloaders: Add CSS Loading Animations to 'shiny' Outputs". Retrieved on 30 November 2017.
