ISSS608 2018-19 T1 Assign Lim Si Ling Evelyn Task 2


Sofia City.jpg

Task 2 – Spatio-temporal Analysis of Citizen Science Air Quality Measurements
1 – Coverage of Data points collected by Citizens

Figure 1 – Coverage of data points collected by Citizens

Data Anomaly
1 – Some data points lie outside Sofia City but within Bulgaria. These are excluded from the analysis by creating a group that separates readings taken within Sofia City from those taken outside it.
2 – Even within Sofia City, the data points are not evenly distributed: most are found in the central region, and they are very sparse in the north-eastern region. The number of data points also increases over time.
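
The grouping in point 1 was done in Tableau. As a rough illustration only, the same split could be sketched in Python with a bounding box. The box limits and the field names (`lat`, `lon`, `P1`) below are assumptions for the example, not the boundary or schema used in the workbook.

```python
# Hypothetical bounding box around Sofia City. The limits are illustrative
# assumptions, not the boundary used in the Tableau workbook.
SOFIA_BBOX = {"lat_min": 42.60, "lat_max": 42.80, "lon_min": 23.20, "lon_max": 23.50}


def in_sofia(lat, lon, bbox=SOFIA_BBOX):
    """Return True when the point falls inside the assumed bounding box."""
    return (bbox["lat_min"] <= lat <= bbox["lat_max"]
            and bbox["lon_min"] <= lon <= bbox["lon_max"])


def split_readings(readings):
    """Split readings into (inside, outside) lists based on location."""
    inside, outside = [], []
    for reading in readings:
        target = inside if in_sofia(reading["lat"], reading["lon"]) else outside
        target.append(reading)
    return inside, outside


# Toy readings, not taken from the dataset.
readings = [
    {"lat": 42.70, "lon": 23.32, "P1": 30.0},  # central Sofia
    {"lat": 43.21, "lon": 27.91, "P1": 12.0},  # near Varna, outside Sofia
]
inside, outside = split_readings(readings)
print(len(inside), len(outside))
```

A rectangular box is only an approximation of a city boundary; a polygon test would be needed for an exact split.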


2 – Are all the measurements correctly taken?
Outliers in Citizen Data

Figure 2 – Outliers in data collected by Citizens for Pressure, Temperature, Humidity, P1 and P2.

The charts show the median of each measurement, aggregated by each citizen's data point (i.e. geohash) for each week. Boxplots are included to make outliers easier to detect, especially for Temperature and Humidity, whose measurements vary with season (i.e. date).
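
The aggregation described above can be sketched as follows. This is a minimal illustration, assuming rows of `(geohash, date, value)` and the common 1.5 × IQR boxplot rule; the exact whisker rule used by the Tableau boxplots is not stated here, and the sample values are toy data.

```python
from collections import defaultdict
from datetime import date
from statistics import median, quantiles


def weekly_medians(rows):
    """Median value per (geohash, ISO year, ISO week)."""
    groups = defaultdict(list)
    for geohash, day, value in rows:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        groups[(geohash, iso[0], iso[1])].append(value)
    return {key: median(values) for key, values in groups.items()}


def tukey_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], the usual boxplot whiskers."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Toy temperature readings for one geohash, not taken from the dataset.
rows = [
    ("sx3x9tpv3kv", date(2018, 1, 8), 2.0),
    ("sx3x9tpv3kv", date(2018, 1, 9), 4.0),
    ("sx3x9tpv3kv", date(2018, 1, 15), -140.0),
]
weekly = weekly_medians(rows)
print(weekly)
print(tukey_outliers([2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 1.5, -140.0]))
```

Computing the fences per week rather than over the whole period keeps seasonal shifts in Temperature and Humidity from being flagged as outliers.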
Observations

1 - There are obvious outliers, such as Pressure = 0, Temperature below -100, Humidity = 0, and extremely high P1 and P2 concentration values.
2 - Not all measurements from the same citizen were wrongly captured. For example, for geohash sx3x9tpv3kv, Pressure, Temperature and Humidity had some outliers, but P1 and P2 concentration did not.
Approach

Groups were created to remove the outliers. The median of each day is used in the later analysis, and geographical locations are grouped by hexagonal binning, to limit the influence of any remaining outliers.
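
The hexagonal binning itself was done in Tableau. To show the idea, here is a minimal Python sketch that assigns planar points to pointy-top hexagons using axial coordinates and cube rounding, then takes the median per cell. It assumes the coordinates have already been projected to a flat plane; the function names and sample points are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so that
    # q + r + s stays equal to zero.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_cell(x, y, size):
    """Axial (q, r) index of the pointy-top hexagon containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _cube_round(q, r)


def hex_medians(points, size):
    """Median value per hexagonal cell for (x, y, value) points."""
    cells = defaultdict(list)
    for x, y, value in points:
        cells[hex_cell(x, y, size)].append(value)
    return {cell: median(values) for cell, values in cells.items()}


# Toy planar points (x, y, P1), not taken from the dataset.
points = [(0.1, 0.1, 30.0), (-0.2, 0.05, 50.0), (1.8, 0.0, 20.0)]
binned = hex_medians(points, size=1.0)
print(binned)
```

Taking the median within each cell means a single faulty sensor has little effect on the value shown for that hexagon.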

3 – Scatterplot of P1 and P2 against various measures

Figure 3 – Scatterplot of P1 and P2 against various measures

Observations
- P1 and P2 correlate with Pressure, Temperature and Humidity in the same way.
- Both pollutants are positively correlated with Pressure and Humidity.
- Interestingly, pollutant concentrations peaked between -5 and 10 degrees Celsius and are generally negatively correlated with temperature.
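
The observations above were read off the scatterplots. For readers who want a number alongside them, a Pearson correlation coefficient could be computed as in this sketch. The input values are toy data chosen for illustration, not readings from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy daily medians, not taken from the dataset.
p1 = [40.0, 30.0, 20.0, 10.0]
humidity = [85.0, 70.0, 55.0, 40.0]
temperature = [-5.0, 0.0, 10.0, 20.0]

print(pearson(humidity, p1))     # positive for this toy data
print(pearson(temperature, p1))  # negative for this toy data
```

Pearson's coefficient only measures linear association, so it would not capture a peak within a temperature band such as the -5 to 10 degree range noted above.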


4 – Relationship and Time Series between P1 & P2

Figure 4 – Relationship and Time Series between P1 and P2

Observations
1 - P1 and P2 concentrations are highly correlated.
2 - P1 concentration is generally higher than P2, and both pollutants were generally higher in the winter months from Dec 2017 to Jan 2018.
3 - The 95th percentile of P1 and P2 concentration was used to limit the effect of extreme readings, in addition to the outlier removal described earlier. It peaked on 8-9 Jan 2018, followed by 27 Jan 2018.
4 - Median P1 and P2 concentration peaked on 27 Jan 2018.
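
The two daily summaries compared in points 3 and 4 can be sketched as follows. This is an illustration under assumptions: it uses the nearest-rank percentile definition, which may differ from Tableau's interpolation, and the rows are toy values rather than readings from the dataset.

```python
import math
from collections import defaultdict
from statistics import median


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list (p between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def daily_summary(rows, p=95):
    """Median and p-th percentile of the readings for each day."""
    by_day = defaultdict(list)
    for day, value in rows:
        by_day[day].append(value)
    return {
        day: {"median": median(values), "p95": percentile(values, p)}
        for day, values in by_day.items()
    }


# Toy P1 readings for one day: 1..19 plus one extreme value.
rows = [("2018-01-27", v) for v in list(range(1, 20)) + [500]]
summary = daily_summary(rows)
print(summary)
```

In this toy example the single extreme reading of 500 affects neither the median nor the 95th percentile, which is why both are more robust than the daily maximum.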

5 – Time Series of the measurements and Interaction

Figure 5 – Time Series of the measurements and Interaction

This chart is intended to uncover any relationship between each measurement and time. However, no obvious pattern emerges from the 6 charts above, other than Temperature changing with the season.


6 – Spatio-Temporal Investigation for P1 and P2

Figure 6 – The 2 maps show how median concentration of P1 and P2 changed with respect to day and location.

Observations
1 - Median P1 and P2 concentrations were generally more sensitive to time than to location, i.e. high values were detected during the winter months from Dec 2017 to Jan 2018.
2 - Median P1 concentration was generally higher than median P2 concentration across time and location.
3 - The north-eastern and north-western regions were more prone to unhealthy levels of P1 and P2.

7 – Spatio-Temporal Investigation for Pressure, Temperature and Humidity

Figure 7 – The 3 maps show how median pressure, temperature and humidity changed with respect to day and location.

Observations
1 - The north-eastern region experienced slightly higher pressure across time.
2 - Temperature did not fluctuate much across locations and varied mainly with season.
3 - Humidity is usually higher in the inner city.

8 – Case Study – 16 Jan 2018

Figure 8 – P1 and P2 concentration on 16 Jan 2018


Figure 9 – Pressure, Temperature and Humidity on 16 Jan 2018

As observed earlier, P1 and P2 concentrations are positively correlated with Pressure and Humidity.

Tableau Public link for Air Quality in Sofia City - Citizen Data

Note that because the dataset is large and data cleansing was done in Tableau, the dashboard may be slow to load.

Banner Photo from Pexels