Difference between revisions of "IS428 AY2018-19T1 Kim Do Yeon"

From Visual Analytics for Business Intelligence
Line 8: Line 8:
  
 
==Visualisation 1( Spatio-temporal Analysis of Official Air Quality)==
 
==Visualisation 1( Spatio-temporal Analysis of Official Air Quality)==
*'''Data Preparation'''
 
For the task, 4 major data sets were in zipped file format and were provided and for visualisation 1, EEA Data.zip was used.
 
First zipped file (EEA Data.zip) had official air quality measurements from 2013 to 2018 from 5 different regions. Before using all the data, some of the files were not included in the visualisation.
 
  
 +
==='''Data Preparation'''===
 +
<p>For the task, 4 major data sets were in zipped file format and were provided and for visualisation 1, EEA Data.zip was used.
 +
First zipped file (EEA Data.zip) had official air quality measurements from 2013 to 2018 from 5 different regions. Before using all the data, some of the files were not included in the visualisation.</p>
  
 +
There are only 3 data sets(2013-2015) for the region code 9484.It excludes this data set as we cannot characterise past and most recent situation. The data observed in region 9484 is not big enough to deliver the pattern from 2013 to 2018 when the visualisation is done. It will not show the right patterns of the air quality as data from 2016 to 2018 is missing.
 +
 +
For this data “BG_5_60881_2018_timeseries.csv” has only one set of data which is only on 2018. However, this dataset can be used as the measurement of the air quality from 2013-2017 is different from 2018. This dataset will be used to compare the daily pattern of the air quality in 2018 over the 5 regions.
 +
 +
In the same zip file (EEA DATA.zip), there is also an additional data “metadata.csv”. It is as shown above. This meta data shows the Longitude, Latitude and the Altitude of the Measurement place and this data can be also concatenated to the timeseries data set. Therefore, all the timeseries data frame is merged and concatenated with the metadata. Each sampling region now have its respective longitude, latitude, altitude and common name. With concatenating the data, geographical features can be also shown.
 +
 +
==='''Observations'''===
 +
<p>From the merged data set, Datetimebegin data from 2013-2016 is daily basis while from 2017-2018 is hourly basis data. To show what typical day looks like for Sofia city, both daily and hourly based visualisation are used.</p>
 +
{| class="wikitable"
 +
|-
 +
! style="font-weight: bold;background: #536a87;color:#fbfcfd;width: 20%;" | Interactive Technique
 +
! style="font-weight: bold;background: #536a87;color:#fbfcfd;width: 40%" | Rationale
 +
! style="font-weight: bold;background: #536a87;color:#fbfcfd;" | Brief Implementation Steps
 +
|-
 +
| <center>'''Highlight the Region to observe the daily data of the air quality''' <br/></center>
 +
|| <center>It is to show the air quality in different regions and its trend over the days. <br/></center>
 +
||
 +
# Filter by the region
 +
# Make Drop down list into regions
 +
|-
 +
| <center>'''Filter the data by the years of interest''' <br></center>
 +
|| <center>To analyse the data in different year ranges</center>
 +
||
 +
# Filter by Date and select year
 +
# Make drop down list
 +
|}
 
==Visualisation 2 ( Spatio-temporal Analysis of Citizen Science Air Quality Measurements)==
 
==Visualisation 2 ( Spatio-temporal Analysis of Citizen Science Air Quality Measurements)==
*'''Data preparation'''
+
 
 +
'''Data preparation'''
 
In  Air tube zip file, there are two csv files. Sensors recorded temperature, humidity and pressure from 2017 to 2018. The Geohash is converted to longitude and latitude using R code(geohash library and “gh_decode()” function was used).After converting the geohash to longitude and latitude, the data frame of 2017,2018 were merged and its respective longitude and latitude were concatenated.  
 
In  Air tube zip file, there are two csv files. Sensors recorded temperature, humidity and pressure from 2017 to 2018. The Geohash is converted to longitude and latitude using R code(geohash library and “gh_decode()” function was used).After converting the geohash to longitude and latitude, the data frame of 2017,2018 were merged and its respective longitude and latitude were concatenated.  
  
 
==Visualisation 3 (Factors affecting the air quality of the city)==
 
==Visualisation 3 (Factors affecting the air quality of the city)==
*'''Data preparation'''
+
'''Data preparation'''
 
From Meteorological Data readings, we can get the latitude and the longitude of the Sofia Airport
 
From Meteorological Data readings, we can get the latitude and the longitude of the Sofia Airport
 
*Latitude: 42.6537
 
*Latitude: 42.6537

Revision as of 21:54, 11 November 2018

Problem & Motivation

Air quality in Bulgaria is a major concern. It is measured every day to track whether the air is harmful to people’s health. According to the World Health Organisation, Bulgaria had the highest PM2.5 concentrations in urban areas of all EU-28 member states, based on a three-year average. For PM10, Bulgaria is also among the worst-affected countries, at 77 micrograms per cubic metre (the EU limit is 50 micrograms per cubic metre). With the large data sets collected from sensors, PM10 levels and the factors behind them can be analysed. Data visualisation helps to find patterns of air quality in Sofia city and to identify the issues and factors behind air pollution. The interactive visualisation has three functions:

  • 1) Show patterns of air quality in Sofia city from 2013 to 2018 on a daily and hourly basis.
  • 2) Show patterns in the data (temperature, humidity, pressure) collected from the sensors.
  • 3) Compare the air pressure and meteorology of Sofia city, and reveal the relationships between the factors affecting its air quality.

Visualisation 1 (Spatio-temporal Analysis of Official Air Quality)

Data Preparation

Four major data sets were provided for the task as zipped files; Visualisation 1 uses EEA Data.zip. This file contains official air quality measurements from 2013 to 2018 for 5 different regions. Some of its files were excluded from the visualisation.

Region code 9484 has only 3 data sets (2013-2015). These are excluded: with the data from 2016 to 2018 missing, they cannot be used to compare the past with the most recent situation, and they do not cover enough of the 2013-2018 period to show the correct air quality pattern.
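The coverage check behind this kind of exclusion can be sketched in a few lines of Python. This is only an illustration, not the method used for the page: the file-name pattern is an assumption generalised from “BG_5_60881_2018_timeseries.csv”, and the file listing below is hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern, generalised from "BG_5_60881_2018_timeseries.csv":
# BG_5_<region code>_<year>_timeseries.csv
FILE_PATTERN = re.compile(r"^BG_5_(\d+)_(\d{4})_timeseries\.csv$")


def years_by_region(file_names):
    """Map each region code to the sorted list of years it has files for."""
    coverage = defaultdict(set)
    for name in file_names:
        match = FILE_PATTERN.match(name)
        if match:  # silently skip files such as metadata.csv
            region, year = match.groups()
            coverage[region].add(int(year))
    return {region: sorted(years) for region, years in coverage.items()}


def missing_years(coverage, first=2013, last=2018):
    """List, per region, the years in [first, last] that have no file."""
    wanted = set(range(first, last + 1))
    return {region: sorted(wanted - set(years))
            for region, years in coverage.items()}


# Hypothetical listing of part of the unzipped folder.
files = [
    "BG_5_9484_2013_timeseries.csv",
    "BG_5_9484_2014_timeseries.csv",
    "BG_5_9484_2015_timeseries.csv",
    "BG_5_60881_2018_timeseries.csv",
    "metadata.csv",
]

coverage = years_by_region(files)
gaps = missing_years(coverage)
for region, years in gaps.items():
    print(region, "is missing", years)
```

The function only reports the gaps; whether a region with gaps is dropped or kept remains a manual decision.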

The file “BG_5_60881_2018_timeseries.csv” is the only data set for its region and covers 2018 only. It is still kept, because the air quality measurements for 2018 differ from those for 2013-2017. It is used to compare the daily pattern of air quality in 2018 across the 5 regions.

The same zip file (EEA Data.zip) also contains an additional file, “metadata.csv”, shown above. This metadata gives the longitude, latitude and altitude of each measurement location and can be joined to the time series data. All the time series data frames were therefore combined and merged with the metadata, so that each sampling region now has its respective longitude, latitude, altitude and common name. With the data merged, geographical features can also be shown.
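A minimal sketch of this kind of merge, using only the Python standard library, is given below. The column names (`station_code`, `common_name` and so on), the station codes and all values are placeholders invented for illustration; the real EEA files use their own headers.

```python
import csv
import io

# Placeholder data: column names, station codes and values are assumptions.
TIMESERIES_CSV = """station_code,datetime_begin,concentration
9421,2018-01-01 00:00:00,41.2
9572,2018-01-01 00:00:00,35.7
"""

METADATA_CSV = """station_code,common_name,longitude,latitude,altitude
9421,Station A,23.30,42.70,540
9572,Station B,23.35,42.65,580
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_metadata(readings, metadata, key="station_code"):
    """Add each station's metadata columns to its readings (inner join)."""
    lookup = {row[key]: row for row in metadata}
    merged = []
    for reading in readings:
        station = lookup.get(reading[key])
        if station is None:
            continue  # no metadata for this station: leave the reading out
        merged.append({**station, **reading})
    return merged


merged = attach_metadata(read_rows(TIMESERIES_CSV), read_rows(METADATA_CSV))
for row in merged:
    print(row["common_name"], row["latitude"], row["longitude"],
          row["concentration"])
```

Building the lookup once keeps the join linear in the number of readings, which matters when several years of hourly data are combined.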

Observations

In the merged data set, the Datetimebegin values are on a daily basis for 2013-2016 and on an hourly basis for 2017-2018. To show what a typical day looks like in Sofia city, both daily and hourly visualisations are used.
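To put the two granularities side by side, hourly readings can be averaged per calendar day, or per hour of the day to get a "typical day" profile. The sketch below shows both aggregations; the timestamp format and the sample values are assumptions, not taken from the data set.

```python
from collections import defaultdict
from datetime import datetime

# Assumed timestamp layout; the real Datetimebegin column may differ.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def _mean_by(readings, key_func):
    """Average the values of (timestamp, value) pairs grouped by key_func."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in readings:
        key = key_func(datetime.strptime(timestamp, TIME_FORMAT))
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in sorted(totals)}


def hourly_profile(readings):
    """Mean concentration for each hour of the day (a 'typical day')."""
    return _mean_by(readings, lambda moment: moment.hour)


def daily_means(readings):
    """Mean concentration for each calendar day."""
    return _mean_by(readings, lambda moment: moment.date().isoformat())


# Made-up hourly readings as (timestamp, PM10 concentration) pairs.
sample = [
    ("2018-01-01 08:00:00", 40.0),
    ("2018-01-01 20:00:00", 30.0),
    ("2018-01-02 08:00:00", 60.0),
]

profile = hourly_profile(sample)
per_day = daily_means(sample)
print(profile)
print(per_day)
```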

{| class="wikitable"
|-
! Interactive Technique
! Rationale
! Brief Implementation Steps
|-
| Highlight the region to observe the daily air quality data
| Shows the air quality in different regions and its trend over the days.
|
# Filter by region
# Make a drop-down list of the regions
|-
| Filter the data by the years of interest
| Allows the data to be analysed over different year ranges.
|
# Filter by date and select the year
# Make a drop-down list of the years
|}

Visualisation 2 (Spatio-temporal Analysis of Citizen Science Air Quality Measurements)

Data preparation

The Air tube zip file contains two CSV files. The sensors recorded temperature, humidity and pressure from 2017 to 2018. The geohash values were converted to longitude and latitude in R, using the geohash library and its “gh_decode()” function. The 2017 and 2018 data frames were then merged, with each record’s longitude and latitude attached.
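For readers who want to see what the decoding step does, here is a sketch of the standard geohash decoding algorithm in plain Python. It is not the R code used for the page and not the internals of “gh_decode()”; the function name `geohash_decode` is hypothetical, and the sample hash is the well-known “ezs42” example rather than a value from the Air tube data.

```python
# Geohash base-32 alphabet (digits and lowercase letters without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_decode(geohash):
    """Return the (latitude, longitude) at the centre of a geohash cell."""
    lat_low, lat_high = -90.0, 90.0
    lon_low, lon_high = -180.0, 180.0
    is_longitude = True  # bits alternate, starting with longitude

    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError on invalid characters
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            if is_longitude:
                middle = (lon_low + lon_high) / 2
                if bit:
                    lon_low = middle   # keep the upper half
                else:
                    lon_high = middle  # keep the lower half
            else:
                middle = (lat_low + lat_high) / 2
                if bit:
                    lat_low = middle
                else:
                    lat_high = middle
            is_longitude = not is_longitude

    return (lat_low + lat_high) / 2, (lon_low + lon_high) / 2


latitude, longitude = geohash_decode("ezs42")
print(round(latitude, 3), round(longitude, 3))
```

Each extra character narrows the cell, so the decoded point is only the centre of the cell and its precision depends on the length of the geohash.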

Visualisation 3 (Factors affecting the air quality of the city)

Data preparation

The meteorological data readings give the location of Sofia Airport:

  • Latitude: 42.6537
  • Longitude: 23.3829
  • Elevation: 595 m

