Group02 proposal v2

From Visual Analytics for Business Intelligence
Rain & Shine(new).png


PROBLEM & MOTIVATION

Problem
Reporting on Singapore's climate has tended to be basic, which makes it hard for viewers to derive in-depth insights. In 2019, multiple news outlets reported that Singapore is heating up twice as fast as the rest of the world and that, combined with the island's constant high humidity, this could become life-threatening. Professor Matthias Roth of the Department of Geography at the National University of Singapore (NUS) attributed the rising temperatures to global warming and the Urban Heat Island (UHI) effect. However, the reports did not provide data or charts to back up their claims about Singapore's climate change.

Motivation
Our team aims to present Singapore's climate data in a more user-friendly and meaningful way. Through Rain&Shine, an interactive visualization dashboard that shows the distribution of the climate by Subzone, by Region, and for the whole of Singapore, we hope to give Singaporeans knowledge of and in-depth insights into Singapore's climate. Additionally, we want to identify the trends inherent in the available weather data and use the historical record to answer questions about how Singapore's climate has changed.

OBJECTIVES

UPDATE: Due to the shinyapps.io memory limitation, our team had to reduce our data size to ensure the application runs smoothly. Hence, instead of starting from 1982, we will be using data from 1990 onwards.


We aim to provide an interactive visualization dashboard that helps the general public and people living in Singapore understand the weather of our country, with visualizations such as:

  1. Insights on the rainfall distribution of the whole of Singapore and of each subzone that has a rainfall station, from 1982 to 2019.
  2. Insights on the temperature patterns of the whole of Singapore and of each subzone that has a temperature station, from 1982 to 2019.
  3. Insights on the relationship between rainfall and temperature across the months of each year.

Target Group:

  • General public, people living in Singapore, weather enthusiasts

DATASET

The data sets we will be using for our analysis and for our application are listed below:

Data/Source Variables/Description Rationale Of Usage

Temperature and Rainfall Data
(Jan 1982 - Dec 2019)

(http://www.weather.gov.sg/climate-historical-daily/)

  • Stations
  • Year
  • Month
  • Daily Rainfall
  • Highest 30-min/60-min/120-min Rainfall (mm)
  • Mean/Minimum/Maximum Temperature (°C)
  • Mean/Max Wind Speed (km/h)

This dataset covers a long time series of Singapore's weather from 1982 to 2019 across different weather categories. Our team wishes to spot trends or patterns in Singapore's climate for every town for which we can obtain historical data.

Weather Station Location Data

(https://api.data.gov.sg/v1/environment/rainfall) (https://api.data.gov.sg/v1/environment/air-temperature)

  • Station ID
  • Station Name
  • Latitude
  • Longitude

The data set will be used to identify the location of each weather station and the weather data that it tracked.

Note: We will be looking into the API and using its JSON output to extract the geocoordinates of the weather stations. Both links are used to ensure we do not miss any possible station location.
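
The note above can be illustrated with a short sketch. It is written in Python purely for illustration (the application itself is built with R Shiny) and it assumes each API response has already been downloaded and parsed into a dictionary whose "metadata" → "stations" list holds records with "id", "name" and a "location" of "latitude"/"longitude". That layout is an assumption to verify against the live API, and the function names and sample values are hypothetical.

```python
def extract_stations(payload):
    """Map station ID -> (name, latitude, longitude) for one parsed API response."""
    stations = {}
    # Assumed layout: payload["metadata"]["stations"] is a list of station records.
    for station in payload.get("metadata", {}).get("stations", []):
        location = station.get("location", {})
        stations[station["id"]] = (
            station.get("name"),
            location.get("latitude"),
            location.get("longitude"),
        )
    return stations


def merge_station_lists(*payloads):
    """Combine stations from several responses, keeping the first record per ID."""
    merged = {}
    for payload in payloads:
        for station_id, details in extract_stations(payload).items():
            merged.setdefault(station_id, details)
    return merged


# Hypothetical sample data standing in for the rainfall and air-temperature responses.
rainfall_sample = {"metadata": {"stations": [
    {"id": "S01", "name": "Station A", "location": {"latitude": 1.30, "longitude": 103.80}},
    {"id": "S02", "name": "Station B", "location": {"latitude": 1.35, "longitude": 103.85}},
]}}
temperature_sample = {"metadata": {"stations": [
    {"id": "S02", "name": "Station B", "location": {"latitude": 1.35, "longitude": 103.85}},
    {"id": "S03", "name": "Station C", "location": {"latitude": 1.40, "longitude": 103.90}},
]}}

all_stations = merge_station_lists(rainfall_sample, temperature_sample)
print(sorted(all_stations))
```

Merging by station ID means a station that appears in both endpoints is listed once, while stations that report only rainfall or only temperature are still captured.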

BACKGROUND SURVEY OF RELATED WORK

Below are a few visualizations and charts we considered for our project.

Visual Considerations Insights / Comments

Title: Monthly mean temperature compared to long term average
Monthly mean temperature.png

Source: http://www.weather.gov.sg/wp-content/uploads/2019/03/Annual-Climate-Assessment-Report-2018.pdf

This graph is taken from the report by NEA. It shows that the temperature in 2018 exceeded the mean temperature of the past 30 years. However, the graph has a limitation: although we can see that temperatures have increased, we cannot tell whether this is a sustained increase or whether 2018 was simply an anomalous year.


Title: Isopleth map
Isopleth rainfall.png

Source: http://www.weather.gov.sg/wp-content/uploads/2019/03/Annual-Climate-Assessment-Report-2018.pdf

This is the isopleth map taken from the report by NEA. From the map, we can see that January, June, October, and November are the months with more rainfall. Our group can implement such a design to give users a feel for how rainfall is distributed across Singapore in a specific month.


Title: Whisker plot of temperature
Temperature whisker.png

Source: https://www.ck12.org/statistics/box-and-whisker-plots/rwa/The-Ways-of-Weather/

We are able to see the temperature for the selected area over the course of a year. The whisker plots are able to show the upper and lower boundaries of temperature, and we can observe that the temperature gradually rises to a peak from Jan to Aug, before decreasing until December.

We hope to apply this chart to display the rainfall for a selected area over the course of a year. This allows viewers to be able to better understand the rainfall pattern. This can also be applied to temperature to get a better understanding of temperature patterns in the year.



Title: Heatmap of rainfall
Heatmap of rainfall.png

Source: https://www.shanelynn.ie/analysis-of-weather-data-using-pandas-python-and-seaborn/

This is a heatmap of daily rainfall. Darker colours of red represent heavier rainfall.

This is another way to visualize and understand rainfall patterns. With it, we can quickly see how many days in a year had rain in a selected area. The same approach can be applied to temperature to show at a glance how hot Singapore has been across the year.


Title: Violin plot of temperature with rainfall overlaid
Violinplot rainfall.png

Source: https://www.r-bloggers.com/part-3a-plotting-with-ggplot2/

One of the plots that we chanced upon was a violin plot that overlaid the rainfall points on top. So for May, we are able to see the distribution of average temperature in that month along with how there are few days where there is 20mm of rainfall, and many days where there is no rainfall.

One reason why we can consider using this plot for our visualization is that it will allow us to merge the temperature and rainfall data together into one visualization.


Title: Ridgeline plot of temperature
Ridgeline temperature.png

Source: https://cran.r-project.org/web/packages/ggridges/vignettes/gallery.html/

This graph is a ridgeline plot of temperature over the course of a year. From this graph, we can see that the days in May to July are hotter than the days in the colder months such as January and December.

Our group hopes to apply this to our temperature and rainfall so that we can see if there is any change to the distribution over the course of the years that we have collected the data for.

STORYBOARD

Dashboards Description

Dashboard 1: Isopleth Map for Weather
Spatial Interpolation

Our group plans to build an isopleth map that reflects the weather distribution by year, month and location. This chart will show the data at a high level so that users can identify which areas have more rainfall than average and which have less during the filtered month/year period.

Similarly, our team plans to build another isopleth map to show the distribution of temperature across the whole of Singapore. This chart will show the data at a high level so that users can identify which areas are hotter than average and which are cooler during the filtered month/year period.

The purpose of these charts is to understand and identify the rain and temperature patterns of every area in Singapore over the period covered by our data, so as to find out whether the climate is changing and whether global warming is affecting the weather in Singapore.
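
An isopleth map needs a value at every map location, not only at the stations, which is where spatial interpolation comes in. The proposal does not name an interpolation method, so the sketch below assumes inverse distance weighting (IDW), one common choice. It is written in Python for illustration only (the application itself is built with R Shiny), the function name and station values are hypothetical, and distances are taken directly in degrees, which is a simplification that is only reasonable over a small area.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Estimate a value at `target` (lat, lon) from (lat, lon, value) station readings.

    Each station is weighted by 1 / distance**power, so nearby stations
    influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in stations:
        # Planar distance in degrees (simplifying assumption for a small region).
        distance = math.hypot(lat - target[0], lon - target[1])
        if distance == 0.0:
            # The target sits exactly on a station: use its reading directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical monthly rainfall totals (mm) at three stations.
sample_stations = [
    (1.30, 103.80, 100.0),
    (1.40, 103.80, 200.0),
    (1.35, 103.95, 160.0),
]
print(round(idw_estimate(sample_stations, (1.33, 103.82)), 1))
```

Evaluating such a function on a regular grid of points produces the surface from which contour bands can be drawn.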

Filters used include:

  • Sliders
- Year
  • Single Dropdown List
- Months

From this chart, users will be able to select the location of their interest to gather data from more specific charts.

Update: Our group wanted to build an isopleth map of Singapore's climate. However, we could not find enough online guides on building isopleth maps that are compatible with R Shiny, so we were unable to produce a point-based isopleth map. Instead, we built a choropleth map with Leaflet for our users.


Dashboard 2: Weather Distribution with Violin Plot
Proj4.jpg

Our group aims to use violin plots to visualize the distribution and density of the historical weather data. A violin plot combines a box plot with a density plot that is rotated and mirrored on each side to show the shape of the data's distribution.

Our charts will show the distribution of the historical rainfall and temperature data so that users can see how each measurement differs by month within each year. The violin plot will be mapped to the continuous variable representing the amount of rain, to visualize the relationship between rain and temperature in each month. Additionally, we aim to discover any other existing trends and patterns in the weather data. Altogether, these charts will use Year and Area filters. The area can be selected from the chart in Dashboard 1 to carry that selection into a more in-depth analysis.

The purpose of this chart is to understand and identify the rain and temperature distribution patterns of Singapore overall throughout the past 38 years, to find out whether the climate is changing and whether global warming is affecting the weather in Singapore.

Filters used include:

  • Sliders
- Year
  • Single Dropdown List
- Measurements (Rain Precipitation/Temperature)


Hovering over the graph shows the following value detail:

  • Density

Dashboard 3: Calendar Chart for Rainfall and Temperature over 38 years
Proj3.jpg


The purpose of this calendar chart is to show the rainfall amount and temperature over a long span of time, such as months or years. We aim to illustrate how a quantity varies depending on the day of the week, and to identify any trends or patterns that depend purely on the time of year.
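
A calendar chart places each day in a grid with one column per week and one row per day of the week. As a minimal sketch of that mapping, written in Python for illustration only (the application itself is built with R Shiny) and using a hypothetical helper name, the grid position of a date can be computed as follows:

```python
from datetime import date


def calendar_cell(day):
    """Return (week_column, weekday_row) for a date within its calendar year.

    weekday_row follows Python's convention: Monday is 0 and Sunday is 6.
    week_column counts from 0, starting with the week that contains 1 January.
    """
    jan_first = date(day.year, 1, 1)
    # Shift by the weekday of 1 January so that columns change on Mondays.
    day_offset = (day - jan_first).days + jan_first.weekday()
    return day_offset // 7, day.weekday()


print(calendar_cell(date(2019, 1, 1)))
print(calendar_cell(date(2019, 1, 7)))
```

Each cell is then colored by the chosen measurement (rainfall or temperature) for that day, which is what makes weekday and seasonal patterns visible.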

Filters used include:

  • Sliders
- Year
  • Single Dropdown List
- Measurements (Rain Precipitation/Temperature)

Dashboard 4: Comparing Rainfall precipitation distribution over the months
Proj5.jpg

The purpose of this chart is to identify the trend in the rainfall amount in Singapore for each month, and to identify any anomalies.

In this chart, the user will be able to compare:

  • The rainfall distribution of the user’s choice in each zone/postal area.

Filters used include:

  • Sliders
- Year
  • Single Dropdown List
- Measurements (Rain Precipitation/Temperature)
Y-Axis: Measurement Type
X-Axis: Months


Hovering over the graph shows the following value details:

  1. Max
  2. 75th percentile
  3. Mean
  4. 25th percentile
  5. Min
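
The hover values listed above (maximum, upper quartile, mean, lower quartile, minimum) can be computed per month from the daily readings. The sketch below is in Python for illustration only (the application itself is built with R Shiny); the function name and sample readings are hypothetical, and the "inclusive" quantile method is an assumption, since charting libraries differ in how they interpolate quartiles.

```python
import statistics


def hover_summary(values):
    """Return the five hover values for one month's readings."""
    # quantiles(n=4) returns the three cut points: 25th, 50th and 75th percentiles.
    lower_quartile, _median, upper_quartile = statistics.quantiles(
        values, n=4, method="inclusive"
    )
    return {
        "max": max(values),
        "p75": upper_quartile,
        "mean": statistics.mean(values),
        "p25": lower_quartile,
        "min": min(values),
    }


# Hypothetical daily rainfall readings (mm) for one station and month.
sample_rainfall = [10.0, 20.0, 30.0, 40.0, 50.0]
print(hover_summary(sample_rainfall))
```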

Dashboard 5: Distribution of Rain Precipitation Amount to identify the changes in rainfall for the past 38 years
Proj.jpg

The purpose of this chart is to show users a detailed breakdown of the rainfall amount in Singapore. The ridge plot helps users identify the changes in rainfall over the past 38 years.

This chart can show the User's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.


Filters used include:

  • Sliders
- Year


The user can adjust the filters to identify patterns or trends in the weather for the short-listed areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Rain Precipitation Amount

Dashboard 6: Distribution of Temperature to identify the changes in temperature for the past 38 years
Proj.jpg

The purpose of this chart is to show users a detailed breakdown of the temperature in Singapore. The ridge plot helps users identify the changes in temperature over the past 38 years.

This chart can show the User's shortlisted area(s) from the chart in Dashboard 1 or any other selected postal area.

Filters used include:

  • Sliders
- Year

The user can adjust the filters to identify patterns or trends in the weather for the short-listed areas in Singapore. This chart can help users identify which area best suits their preferences and needs.


X-Axis: Year
Y-Axis: Temperature In Degree Celsius

Dashboard 7: Weather Radial Charts to compare the temperature of each year
Photo7.jpg

The purpose of this chart is to let users visualize and identify the trend of Singapore's temperature by month for each year. The weather radial chart uses color to represent temperature: the coolest values are shown in a blue tone and the hottest in a yellow tone. With this chart, users can distinguish which months are usually cooler and which are hotter in each year, and compare them with other years to see whether the temperatures follow a pattern.

Furthermore, users are able to hover over the lines in the chart to see the minimum, median and maximum temperature of any day and any region in the selected year.

Filters used include:

  • Sliders
- Year
  • Single Dropdown List
- Region


X-Axis: Month
Y-Axis: Temperature In Degree Celsius


Hovering over the graph shows the following value details:

  1. Date
  2. Min
  3. Mean
  4. Max

Dashboard 8: Identifying Singapore's temperature changes from Year 1982 to 2019
Photo8.jpg

This overview chart shows the minimum, median and maximum temperature across the whole of Singapore for each day from 1982 to 2019. Its purpose is to help users visualize and understand Singapore's temperature changes over that period, which, as mentioned in the Problem section, have been attributed to global warming. The chart also uses color to represent temperature: the coolest values are shown in a blue tone and the hottest in a yellow tone. The trend line represents the overall best-fit temperature for each day, to help users spot changes in the temperature.

Users will be able to hover over the lines in the chart to see the minimum, median and maximum temperature of any day from 1982 to 2019.


X-Axis: Year
Y-Axis: Temperature In Degree Celsius

Hovering over the graph shows the following value details:

  1. Date
  2. Max
  3. Median
  4. Min
  5. Predict
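
The "Predict" hover value comes from the trend line. The proposal does not say how that line is fitted, so the sketch below assumes an ordinary least-squares straight line; the application may well use a different smoother. It is written in Python for illustration only (the application itself is built with R Shiny), and the function names and temperature values are hypothetical.

```python
def fit_trend_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Value of the fitted line at x (the 'Predict' value shown on hover)."""
    return slope * x + intercept


# Hypothetical yearly mean temperatures (°C), indexed by years since the start.
years_since_start = [0, 1, 2, 3]
mean_temperatures = [27.0, 27.5, 28.0, 28.5]
slope, intercept = fit_trend_line(years_since_start, mean_temperatures)
print(slope, intercept, predict(slope, intercept, 4))
```

A positive slope indicates warming over the period, and the size of the slope gives the average change per time step.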

Overall Dashboard Design: Example of how the overall dashboard will look
OverallDB.jpg
  • This is an example of how the overall layout used to contain and organize the dashboards shown above will appear in our application. Users will be able to select buttons from the navigation menu to switch between our different dashboards and learn about different insights.
  • The charts are organized by chart type. Each chart type shows different insights and there is no direct interaction between two chart types, hence we placed them on different tabs.
  • Filter panels can be found at the top of the page above the charts or at the left side of the charts.
  • Legends can be found in the chart or at the bottom of each dashboard page if there are colors used.
  • The application allows users to navigate and filter the dashboards to display the groups or information they are interested in, along their desired dimensions. Overall, our app provides better flexibility and efficiency of use. With our charts, users will be able to visualize and identify insights and trends in Singapore's climate and its changes.

TOOLS & TECHNOLOGIES


The technologies and tools our group used to develop our application are:

Technology & Tools Used G2.png


MILESTONES

MilestonesG2.jpg

KEY TECHNICAL CHALLENGES & MITIGATION

No. Challenge Description Mitigation Plan
1.
Software Challenge: Unfamiliarity with visualisation tools such as R, R Shiny and Tableau.
  • GitHub learning
  • Stack Overflow research
  • Self-directed and peer learning
  • Watch video tutorials from YouTube
  • Hands-on practice using training platforms such as DataCamp
2.
Programming Challenge: Inexperience with data cleaning and transformation using R
  • Trial and error
  • Read online articles and forums for guidance
  • Watch video tutorials on how to fully utilise functions and packages such as lapply, tidyr and dplyr
3.
Workload Constraint: Time and workload constraints
  • Design reasonable project timeline based on everyone's ability and capacity.
  • Set milestones and adjust the timeline accordingly based on the team's progress.
4.
Dataset Complexity

We have different data from multiple sources in multiple formats, hence we foresee a huge challenge in standardizing the data.

  • Note: Our current dataset covers 55 areas across 37 years of data, with 12 months of data for every year. This amounts to 13103 CSV files to consolidate and clean for the weather data alone.
  • Make use of data preparation tools such as Tableau Prep
  • Make use of our database management skills to normalize all data tables into third normal form
