Group02 proposal v2

Rain & Shine(new).png


PROBLEM & MOTIVATION

Problem
Reporting on Singapore's climate has so far been basic, which makes it hard for viewers to obtain in-depth insights. In 2019, multiple news outlets reported that Singapore is heating up twice as fast as the rest of the world. Combined with the island’s constant high humidity, this heat could become life-threatening. Professor Matthias Roth of the Department of Geography at the National University of Singapore (NUS) attributed the rising temperatures to global warming and the Urban Heat Island (UHI) effect. However, the reports did not present data to back up these claims about Singapore's changing climate.

Motivation
Our team aims to present Singapore's climate data in a more user-friendly and meaningful way. Through Rain&Shine, an interactive visualization dashboard that shows the distribution of the climate by Subzone, Region, and Overall, we hope to provide Singaporeans with knowledge of and in-depth insights into Singapore's climate. Additionally, we want to identify the trends inherent in the available weather data and answer questions about the changes in Singapore's climate using historical data.

OBJECTIVES

Provide meaningful graphs for viewers to identify weather patterns in Singapore.

  • Changes in Singapore’s climate patterns from 1982 to 2019.
    1. Temperature
    2. Rainfall distribution across Singapore

We aim to provide an interactive visualization dashboard to help the general public and people living in Singapore understand our country's weather, with visualizations such as:

  1. Insights on the rainfall of the whole of Singapore and of each subzone with a rainfall station from 1982 to 2019.
  2. Insights on the temperature of the whole of Singapore and of each subzone with a temperature station from 1982 to 2019.
  3. Insights on the relationship between rainfall and temperature in the different months of each year.
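One simple way to quantify the relationship between rainfall and temperature in the third item is a correlation coefficient. The sketch below computes a Pearson correlation with the standard library only. The function name `pearson` and the sample values are illustrative assumptions, not part of our dataset or a decided method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values (NOT real readings): monthly rainfall in mm
# and monthly mean temperature in degrees Celsius for one station.
rain_mm = [250.0, 110.0, 170.0, 150.0, 160.0, 130.0]
temp_c = [26.5, 27.1, 27.5, 28.0, 28.3, 28.3]
r = pearson(rain_mm, temp_c)
print(f"r = {r:.2f}")
```

A value near -1 or +1 would suggest a strong linear relationship, while a value near 0 would suggest little linear relationship. Correlation alone does not show causation.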

Target Group:

  • General public, people living in Singapore, weather enthusiasts

DATASET

The data sets we will be using for our analysis and our application are listed below:

Data/Source Variables/Description Rationale Of Usage

Temperature and Rainfall Data
(Jan 1982 - Dec 2019)

(http://www.weather.gov.sg/climate-historical-daily/)

  • Stations
  • Year
  • Month
  • Daily Rainfall
  • Highest 30-min/60-min/120-min Rainfall (mm)
  • Mean/Minimum/Maximum Temperature (°C)
  • Mean/Max Wind Speed (km/h)

This dataset provides a long time series of Singapore's weather from 1982 to 2019 across different weather categories. Our team wishes to spot trends or patterns in Singapore's climate in every town for which we can obtain historical data.

Weather Station Location Data

(https://api.data.gov.sg/v1/environment/rainfall) (https://api.data.gov.sg/v1/environment/air-temperature)

  • Station ID
  • Station Name
  • Latitude
  • Longitude

This data set will be used to identify the location of each weather station and the weather data that it tracked.

Note: We will query the API and use the JSON responses to extract the geocoordinates of the weather stations. We use both links to ensure we do not miss any station location.
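The note above can be sketched as a small merge step over two already-parsed JSON responses. This sketch assumes each response carries a `metadata.stations` list whose entries have `id`, `name` and `location.latitude` / `location.longitude`; that layout should be checked against a live response. The function name `merge_stations` and the sample payloads (station IDs, names and coordinates) are made up for illustration.

```python
def merge_stations(*payloads):
    """Union the station lists of several parsed API responses, keyed by id.

    ASSUMPTION: each payload looks like
    {"metadata": {"stations": [{"id", "name", "location": {...}}]}}.
    """
    stations = {}
    for payload in payloads:
        for s in payload.get("metadata", {}).get("stations", []):
            # Keep the first occurrence of each station id.
            stations.setdefault(s["id"], {
                "id": s["id"],
                "name": s["name"],
                "latitude": s["location"]["latitude"],
                "longitude": s["location"]["longitude"],
            })
    return sorted(stations.values(), key=lambda s: s["id"])


# Hypothetical payloads standing in for the rainfall and
# air-temperature responses (values are invented).
rainfall_payload = {"metadata": {"stations": [
    {"id": "X01", "name": "Station A", "location": {"latitude": 1.30, "longitude": 103.80}},
    {"id": "X02", "name": "Station B", "location": {"latitude": 1.35, "longitude": 103.90}},
]}}
temperature_payload = {"metadata": {"stations": [
    {"id": "X02", "name": "Station B", "location": {"latitude": 1.35, "longitude": 103.90}},
    {"id": "X03", "name": "Station C", "location": {"latitude": 1.40, "longitude": 103.75}},
]}}

all_stations = merge_stations(rainfall_payload, temperature_payload)
print([s["id"] for s in all_stations])
```

Keying on the station ID means a station that reports both rainfall and temperature appears only once in the merged list.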

BACKGROUND SURVEY OF RELATED WORK

Below are a few visualizations and charts we considered making for our projects.

Visual Considerations Insights / Comments

Title: Monthly mean temperature compared to long term average
Monthly mean temperature.png

Source: http://www.weather.gov.sg/wp-content/uploads/2019/03/Annual-Climate-Assessment-Report-2018.pdf

This graph is taken from the NEA report. It shows that the temperature in 2018 exceeded the mean temperature of the past 30 years. However, the graph has a limitation: although we can see that temperatures have increased, we cannot tell whether this is a systemic increase or whether 2018 was an anomalous year for temperature.
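To address this limitation, the deviation from the long-term average can be computed for every year rather than for a single year, so that a run of positive deviations can be told apart from a one-off spike. The sketch below is a minimal illustration. The function name `monthly_anomalies`, the record layout `(year, month, mean_temp_c)` and the sample values are assumptions for illustration only.

```python
from collections import defaultdict


def monthly_anomalies(records, baseline_years):
    """Return {(year, month): temperature minus the baseline mean of that month}.

    records: iterable of (year, month, mean_temp_c) tuples.
    baseline_years: collection of years that define the long-term average.
    """
    records = list(records)
    by_month = defaultdict(list)
    for year, month, temp in records:
        if year in baseline_years:
            by_month[month].append(temp)
    baseline = {m: sum(v) / len(v) for m, v in by_month.items()}
    return {(y, m): t - baseline[m] for y, m, t in records if m in baseline}


# Made-up January values (NOT real readings).
sample = [(1982, 1, 26.0), (1983, 1, 27.0), (1984, 1, 28.0), (2018, 1, 28.5)]
anomalies = monthly_anomalies(sample, baseline_years=range(1982, 1985))
print(anomalies[(2018, 1)])
```

Plotting these values for all years would show whether positive anomalies cluster in recent years or appear only occasionally.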


Title: Isopleth map
Isopleth rainfall.png

Source: http://www.weather.gov.sg/wp-content/uploads/2019/03/Annual-Climate-Assessment-Report-2018.pdf

This is the isopleth map taken from the NEA report. From the map, we can see that January, June, October, and November are the months with more rainfall. Our group can implement such a design to give users a feel for what the rainfall distribution across Singapore looks like in a specific month.


Title: Whisker plot of temperature
Temperature whisker.png

Source: https://www.ck12.org/statistics/box-and-whisker-plots/rwa/The-Ways-of-Weather/

We are able to see the temperature for the selected area over the course of a year. The whisker plots are able to show the upper and lower boundaries of temperature, and we can observe that the temperature gradually rises to a peak from Jan to Aug, before decreasing until December.

We hope to apply this chart to display the rainfall for a selected area over the course of a year, so that viewers can better understand the rainfall pattern. It can also be applied to temperature to give a better understanding of temperature patterns during the year.
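The values behind each box in a whisker plot are the five-number summary of that month's readings. A minimal sketch using the standard library is shown below. The function names, the `(month, value)` record layout and the sample numbers are illustrative assumptions, and the "inclusive" quartile method is only one of several conventions.

```python
from collections import defaultdict
from statistics import quantiles


def five_number_summary(values):
    """Minimum, quartiles and maximum of a list with at least two values."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


def monthly_box_stats(records):
    """records: iterable of (month, value). Returns {month: five-number summary}."""
    by_month = defaultdict(list)
    for month, value in records:
        by_month[month].append(value)
    return {m: five_number_summary(v) for m, v in sorted(by_month.items())}


# Made-up daily rainfall values in mm (NOT real readings).
sample = [(1, 5.0), (1, 1.0), (1, 4.0), (1, 2.0), (1, 3.0), (2, 0.0), (2, 10.0)]
stats = monthly_box_stats(sample)
print(stats[1])
```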



Title: Heatmap of rainfall
Heatmap of rainfall.png

Source: https://www.shanelynn.ie/analysis-of-weather-data-using-pandas-python-and-seaborn/

This is a heatmap of daily rainfall. Darker colours of red represent heavier rainfall.

This is another way to visualize rainfall patterns. Through this, we can quickly see how many days in a year had rain for a selected area. This can also be applied to temperature to get a quick view of how hot Singapore has been across the year.
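The data behind such a heatmap is a month-by-day grid of daily values. The sketch below builds that grid and counts rainy days. The function names, the `date -> mm` input mapping and the sample values are illustrative assumptions.

```python
from datetime import date


def rainfall_grid(daily, year):
    """Build a 12 x 31 grid (month rows, day-of-month columns) of rainfall in mm.

    daily maps datetime.date -> rainfall in mm. Cells without a reading
    (including days that do not exist, such as 31 February) stay None.
    """
    grid = [[None] * 31 for _ in range(12)]
    for day, mm in daily.items():
        if day.year == year:
            grid[day.month - 1][day.day - 1] = mm
    return grid


def rainy_day_count(grid, threshold_mm=0.0):
    """Number of cells with rainfall strictly above the threshold."""
    return sum(1 for row in grid for mm in row if mm is not None and mm > threshold_mm)


# Made-up readings (NOT real data).
daily = {
    date(2019, 1, 1): 12.0,
    date(2019, 1, 2): 0.0,
    date(2019, 12, 31): 3.4,
    date(2018, 5, 5): 9.9,  # different year, ignored for 2019
}
grid_2019 = rainfall_grid(daily, 2019)
print(rainy_day_count(grid_2019))
```

Each cell of the grid maps directly to one coloured tile, and the `None` cells can be drawn as blanks.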


Title: Violin plot of temperature with rainfall overlaid
Violinplot rainfall.png

Source: https://www.r-bloggers.com/part-3a-plotting-with-ggplot2/

One of the plots that we came across was a violin plot with the rainfall points overlaid on top. For May, for example, we can see the distribution of average temperature in that month, along with the fact that there are few days with 20mm of rainfall and many days with no rainfall.

We are considering this plot because it would let us merge the temperature and rainfall data into one visualization.


Title: Ridgeline plot of temperature
Ridgeline temperature.png

Source: https://cran.r-project.org/web/packages/ggridges/vignettes/gallery.html/

This graph is a ridgeline plot of temperature over the course of a year. From this graph, we can see that the days in May - July are hotter than the days at the start and end of the year (January and December).

Our group hopes to apply this to our temperature and rainfall data so that we can see whether the distribution changes over the years for which we have collected data.

STORYBOARD

Dashboards Description

Dashboard 1: Isopleth Map for Weather
Spatial Interpolation

Our group plans to build an Isopleth Map that reflects the rainfall distribution by year, month and location. This chart will show the data at a high level so users can identify which areas have more rainfall than average and which have less throughout the filtered Month/Year period.

Similarly, our team plans to build another Isopleth Map to show the distribution of temperature across the whole of Singapore. This chart will show the data at a high level so users can identify which areas are hotter than average and which are cooler throughout the filtered Month/Year period.

The purpose of these charts is to understand and identify the rain and temperature patterns of every area in Singapore from 1982 to 2019, so as to find out whether the climate is changing and whether global warming is affecting the weather in Singapore.

Filters used include:

  • Sliders
  1. Year
  2. Months
  • Single Dropdown List
  1. Map level of detail

From this chart, users will be able to select the locations they are interested in and explore them in the more specific charts.
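An isopleth map needs a value at every grid point, not only at the stations, so the station readings have to be spatially interpolated. The proposal does not fix a method; the sketch below uses inverse distance weighting (IDW) purely as one common, simple choice. The function names and sample stations are illustrative assumptions, and coordinates are treated as planar, which is only an approximation for latitude and longitude.

```python
from math import hypot


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (x, y, value) tuples. ASSUMPTION: planar coordinates.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a station: use its reading
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(stations, xs, ys, power=2.0):
    """Interpolated values for every (x, y) combination, one row per y."""
    return [[idw(x, y, stations, power) for x in xs] for y in ys]


# Made-up stations (NOT real locations or readings): (x, y, rainfall in mm).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = idw_grid(stations, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0])
print(grid[0])
```

Contour lines drawn over such a grid give the isopleths. Other methods, such as kriging, would produce different surfaces from the same stations.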


Dashboard 2: Weather Distribution with Violin Plot
Proj4.jpg

Our group aims to use Violin Plots to visualize the distribution and density of the historical weather data. The violin plot chart is a combination of a Box Plot and a Density Plot that is rotated and placed on each side, to show the distribution shape of the data.
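The density part of a violin plot is typically a kernel density estimate of the readings. The sketch below shows a Gaussian kernel density estimate with a normal-reference rule-of-thumb bandwidth. The function name `gaussian_kde`, the bandwidth rule and the sample temperatures are illustrative assumptions, and plotting libraries may use different defaults.

```python
from math import exp, pi, sqrt
from statistics import stdev


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of samples at a point x."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        # Rule-of-thumb bandwidth: 1.06 * standard deviation * n^(-1/5).
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * sqrt(2 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Made-up daily mean temperatures in degrees Celsius (NOT real readings).
sample_temps = [26.0, 27.0, 27.5, 28.0, 29.0]
density = gaussian_kde(sample_temps)
print(round(density(27.5), 3))
```

Mirroring the resulting curve on both sides of an axis gives the violin shape, and the box plot statistics are drawn inside it.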

Our charts will show the distribution of the historical rainfall and temperature data so users can see how temperature differs by month within each year. The rainfall amount, a continuous variable, will be overlaid on the violin plot to visualize the relationship between rain and temperature in each month. Additionally, we aim to discover any other existing trends and patterns in the weather data. These charts will use Year and Area filters. The area can be selected from the chart in Dashboard 1 to carry out a more in-depth analysis of that selection.

The purpose of this chart is to understand and identify the overall rain and temperature distribution patterns of Singapore from 1982 to 2019, so as to find out whether the climate is changing and whether global warming is affecting the weather in Singapore.

Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Area

Dashboard 3: Calendar Chart for Rainfall and Temperature over 38 years
Proj3.jpg


This calendar chart shows the rainfall amount and temperature over a long span of time, such as months or years. We aim to illustrate how these quantities vary by day of the week, and to identify any trends or patterns that depend purely on the time of year.

Filters used include:

  • Sliders
  1. Year
  2. Month


X-Axis: Level of Detail(Sub-zone/ Postal Area/ Zone)
Y-Axis: Rain Precipitation Amount/ Temperature


This chart will be shown together with the chart on Dashboard 2 to help the user draw better-informed conclusions.


Dashboard 4: Comparing Rainfall precipitation distribution over the months
Proj5.jpg

The purpose of this chart is to identify the trend of the Rain Precipitation Amount in Singapore for each month, as well as to identify any anomalies.

In this chart, the user will be able to compare:

  • The rainfall distribution in each zone/postal area of the user’s choice.


Filters used include:

  • Sliders
  1. Month
  2. Year
  • Single Dropdown List
  1. Level of Detail
  2. Rain Precipitation Amount


X-Axis: Level of Detail(Sub-zone/ Postal Area/ Zone)
Y-Axis: Rain Precipitation Amount



Dashboard 5: Distribution of Rain Precipitation Amount to identify the changes in rainfall for the past 38 years
Proj.jpg

The purpose of this chart is to show a detailed breakdown of the Rain Precipitation Amount in Singapore for users. The ridge plot helps users identify the changes in rainfall over the past 38 years.

This chart can show the user's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.


Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Level of Detail


The user can adjust the filters to identify patterns or trends in the weather for the shortlisted areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Rain Precipitation Amount



Dashboard 6: Distribution of Temperature to identify the changes in temperature for the past 38 years
Proj.jpg

The purpose of this chart is to show a detailed breakdown of the Temperature in Singapore for users. The ridge plot helps users identify the changes in temperature over the past 38 years.

This chart can show the user's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.


Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Level of Detail


The user can adjust the filters to identify patterns or trends in the weather for the shortlisted areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Temperature In Degree Celsius

Dashboard 7: Distribution of Temperature to identify the changes in temperature for the past 38 years
Photo8.jpg

The purpose of this chart is to show a detailed breakdown of the Temperature in Singapore for users. The ridge plot helps users identify the changes in temperature over the past 38 years.

This chart can show the user's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.


Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Level of Detail


The user can adjust the filters to identify patterns or trends in the weather for the shortlisted areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Temperature In Degree Celsius

Dashboard 8: Distribution of Temperature to identify the changes in temperature for the past 38 years
Photo7.jpg

The purpose of this chart is to show a detailed breakdown of the Temperature in Singapore for users. The ridge plot helps users identify the changes in temperature over the past 38 years.

This chart can show the user's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.


Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Level of Detail


The user can adjust the filters to identify patterns or trends in the weather for the shortlisted areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Temperature In Degree Celsius

Overall Dashboard Design: Example of how the overall dashboard will look like
OverallDB.jpg


TOOLS & TECHNOLOGIES


The technologies and tools our group used to develop our application are:

Technology & Tools Used G2.png


MILESTONES

MilestonesG2.jpg

KEY TECHNICAL CHALLENGES & MITIGATION

No. Challenge Description Mitigation Plan
1.
Software Challenge Unfamiliarity with visualisation tools such as R, R Shiny and Tableau.
  • GitHub learning
  • Stack Overflow research
  • Self-directed and peer learning
  • Watch video tutorials on YouTube
  • Hands-on practice on training platforms such as DataCamp
2.
Programming Challenge Inexperience with data cleaning and transformation using R
  • Trial and error
  • Read online articles and forums for guidance
  • Watch video tutorials on how to fully utilise functions and packages such as lapply, tidyr and dplyr
3.
Workload Constraint Time and workload constraints
  • Design reasonable project timeline based on everyone's ability and capacity.
  • Set milestones and adjust the timeline accordingly based on the team's progress.
4.
Dataset Complexity

We have data from multiple sources in multiple formats, so we foresee a major challenge in standardizing it.

  • Note: Our current dataset covers 55 areas over 37 years, with 12 monthly files per year. In total there are 13103 CSV files to consolidate and clean for the weather data alone.
  • Make use of data preparation tools such as Tableau Prep
  • Make use of our database management skills to normalize all data tables into third normal form
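As an illustration of the consolidation step, the sketch below reads every CSV file under a folder into one list of rows and converts readings to numbers, treating anything non-numeric as missing. It is written in Python for brevity rather than in the R tooling listed above. The function names, file names, encoding and missing-value marker are assumptions for illustration; the column names follow the variables listed in the DATASET section but should be checked against the real files.

```python
import csv
import tempfile
from pathlib import Path


def consolidate_csvs(folder, pattern="*.csv"):
    """Read every matching CSV under folder into one list of dict rows."""
    rows = []
    for path in sorted(Path(folder).rglob(pattern)):
        # ASSUMPTION: files are UTF-8 (utf-8-sig also drops a leading BOM).
        with path.open(newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                clean = {k.strip(): (v or "").strip()
                         for k, v in row.items() if k is not None}
                clean["source_file"] = path.name  # keep provenance
                rows.append(clean)
    return rows


def to_float(text):
    """Convert a reading to float; return None for blanks or placeholders."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


# Demo with two tiny made-up files (NOT real data).
with tempfile.TemporaryDirectory() as tmp:
    header = "Station,Year,Month,Daily Rainfall\n"
    Path(tmp, "a_198201.csv").write_text(header + "A,1982,1,12.4\n", encoding="utf-8")
    Path(tmp, "b_198201.csv").write_text(header + "B,1982,1,-\n", encoding="utf-8")
    demo_rows = consolidate_csvs(tmp)

demo_values = [to_float(r["Daily Rainfall"]) for r in demo_rows]
print(len(demo_rows), demo_values)
```

Keeping the source file name on each row makes it easier to trace a suspicious value back to the file it came from during cleaning.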
