ISSS608 2016-17T3 Group15 Report
Characterizing Pandemic Spread Using R
By Chua Gim Hong, Huang LiWei and Ngo Siew Hui
Abstract
A pandemic is an epidemic or outbreak of infectious disease that spreads rapidly, not only to many people but also across countries. The unprecedented mobility of people and food over the last 30 years has been accompanied by a steady increase in the frequency and diversity of disease outbreaks. No country is immune to this growing global threat. Scientists predict that it is a matter not of if, but of when, the next pandemic will happen. Singapore, a small city-state with one of the highest population densities in the world and one of the highest volumes of air passenger traffic, is particularly vulnerable.
There are reasons to remain optimistic: Singapore’s Smart Nation initiatives and the electronic records of modern healthcare systems have opened up new possibilities in the fight against potential infectious disease outbreaks in the country. Data will become increasingly ubiquitous as the world, including Singapore, continues to make significant advances in the digital age. Insights from the data have the potential to offer a critical line of preparedness through early identification, rapid and effective response, and containment of disease outbreaks. Despite the increasing availability of data, appropriate and affordable data exploration and analysis tools are still needed.
In view of this, our project aims to develop a visualisation tool using R Shiny and R data visualisation packages for calendar heatmaps and trellis plots. Deployed as an interactive dashboard prototype, this tool could be used by health officials to analyse hospitalisation data and characterise the spread of a pandemic across countries should an actual disease outbreak happen. R will be used to analyse a synthetic dataset (i.e. computer- and human-generated data) relating to a major disease outbreak that spanned several cities across the world in 2009.
This presentation consists of four main sections. First, the motivation and objectives of the project will be discussed. This is followed by a detailed discussion of the principles and concepts behind the key visualisation methods used. Next, the R packages used to develop the application and the user interface designed will be discussed. Using the synthetic dataset, we will then demonstrate how the functions of our tool can be used to detect the patterns and attribute distributions that characterise a pandemic spread, and the efficacy of each of these visual analytics techniques will be discussed in detail. The presentation will conclude by sharing the insights gained through working on the project and the potential application areas of our visualisation tool. We will also suggest possibilities for future work that combines hospital records with other data sources.
[VAST Challenge 2010 - Characterisation of Pandemic Spread]
Motivation of the application
Review and critique of past work
Design framework
A detailed description of the design principles used and the data visualisation elements built (refer to Section 3: Interface of this paper [1]).
Demonstration
Sample test cases
Discussion
What has the audience learned from your work? What new insights or practices has your system enabled? A full-blown user study is not expected, but informal observations of use that help evaluate your system are encouraged.
Future Work
A description of how your system could be extended or refined.
Installation guide
Including hardware configuration and software integration. Sample Installation Guide
User Guide
Step-by-step guide on how to use the data visualisation functions designed.
References