ISSS608 2018-19 T1 Assign Wu Jinglong Task2



Banner WuJinglong.jpeg

Sofia Air Pollution - Be a Visual Detective



Data preparation

Geohash decoding is done separately for the 2017 and 2018 data files. Running the following R code extracts the geolocation data. In the end, two data files, "NBG2017.csv" and "NBG2018.csv", are generated. After data exploration, I decided not to join the two files because some of the station names cannot be mapped.

Task2 1.png
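
For readers without access to the R code in the screenshot, the decoding step can be sketched as follows. This is a minimal Python illustration of the standard geohash algorithm, not the author's R code; the function name `decode_geohash` is hypothetical.

```python
# Standard geohash base-32 alphabet (no "a", "i", "l", "o").
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return the (latitude, longitude) of the centre of a geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for ch in geohash.lower():
        value = _BASE32.index(ch)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


# "ezs42" is a commonly used reference geohash, not a value from the Sofia data.
lat, lon = decode_geohash("ezs42")
```

Each additional character narrows the cell, so longer geohashes give more precise sensor coordinates.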

Performance

There are outliers in the temperature, humidity and pressure data, which means that not all sensors work properly at all times. A Tableau filter is used to remove unrealistic values.

Task2 2.png

Filtering is based on the following rules:
- Humidity shall be between 0 and 100
- Temperature shall not be lower than -50
- Pressure shall be between 50,000 and 140,000
- The P1/P2 value shall not be more than 500
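
The rules above were applied with a Tableau filter. As a rough equivalent outside Tableau, they can be written as one predicate. This is only a sketch: the field names (`humidity`, `temperature`, `pressure`, `P1`, `P2`) and the sample readings are assumptions for illustration, not taken from the data files.

```python
def is_plausible(reading):
    """Return True if a reading passes all four filtering rules."""
    return (
        0 <= reading["humidity"] <= 100
        and reading["temperature"] >= -50
        and 50_000 <= reading["pressure"] <= 140_000
        and reading["P1"] <= 500
        and reading["P2"] <= 500
    )


# Hypothetical sample readings, for illustration only.
readings = [
    {"humidity": 65, "temperature": 12, "pressure": 95_000, "P1": 40, "P2": 22},
    {"humidity": 140, "temperature": 12, "pressure": 95_000, "P1": 40, "P2": 22},
    {"humidity": 65, "temperature": -120, "pressure": 95_000, "P1": 40, "P2": 22},
    {"humidity": 65, "temperature": 12, "pressure": 0, "P1": 40, "P2": 22},
    {"humidity": 65, "temperature": 12, "pressure": 95_000, "P1": 1999, "P2": 22},
]

clean = [r for r in readings if is_plausible(r)]
```

Only the first sample reading satisfies every rule; each of the others violates exactly one of them.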

After filtering, the remaining readings fall within realistic ranges:
Task2 3.png

Sensor coverage

Location coverage
By visualizing the Sofia topography dataset, we can identify whether a sensor is located in Sofia city:

Task2 4.png

For both 2017 and 2018, the original sensor coverage map extends beyond Sofia city. We therefore need to keep only the Sofia city readings, based on the Sofia topography dataset.

Task2 5.png
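
One simple way to approximate this location filter in code is a bounding box built from the topography points. This is a sketch under assumptions: it treats the topography dataset as a list of (latitude, longitude) pairs, the helper names are hypothetical, and the coordinates are illustrative values rather than values from the dataset. The filtering in this report was done visually in Tableau.

```python
def bounding_box(points):
    """Return (min_lat, max_lat, min_lon, max_lon) for a sequence of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return min(lats), max(lats), min(lons), max(lons)


def inside(box, lat, lon):
    """Return True if (lat, lon) lies within the bounding box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


# Hypothetical sample values, for illustration only.
topo_points = [(42.62, 23.22), (42.62, 23.42), (42.76, 23.22), (42.76, 23.42)]
sensors = {"sensor_a": (42.70, 23.32), "sensor_b": (43.21, 27.91)}

box = bounding_box(topo_points)
in_city = {name for name, (lat, lon) in sensors.items() if inside(box, lat, lon)}
```

A bounding box is coarser than the real city outline, so it may keep a few sensors just outside the boundary.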

Data points after filtering:

Task2 6.png

After filtering, we plot a time heat map to see the data coverage in 2017 and 2018. Data is missing in October 2017 and in April and July 2018:

Task2 7.png
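
The same coverage check can be done numerically by counting readings per calendar day and listing the days with none. This is a minimal sketch, assuming the reading timestamps are available as Python `datetime` objects; the function names and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def daily_counts(timestamps):
    """Count readings per calendar day."""
    return Counter(ts.date() for ts in timestamps)


def missing_days(counts, start, end):
    """Return every day in [start, end] that has no readings."""
    total = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(total))
    return [day for day in all_days if counts.get(day, 0) == 0]


# Hypothetical timestamps, for illustration only.
timestamps = [
    datetime(2018, 4, 1, 8, 0),
    datetime(2018, 4, 1, 9, 0),
    datetime(2018, 4, 4, 8, 0),
]

counts = daily_counts(timestamps)
gaps = missing_days(counts, date(2018, 4, 1), date(2018, 4, 4))
```

The per-day counts are also the values a time heat map would encode as colour.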

Air Pollution Measurements

The density map below, which represents PM10 concentration, shows that air pollution is concentrated in the city area and the north area. More sensors were added in 2018.

Task2 8.png

Monthly view
The monthly view below, from September 2017 to August 2018 (PM10 concentration), shows that the concentration starts to rise in October and peaks in January, when readings are high across the whole city area. During summer, the readings return to a low level:

Task2 9.png

Hourly view
We use the most recent month of data (July to August 2018) to generate an hourly density map view of PM2.5 and PM10 concentration. Readings around noon are relatively low, start to increase after 17:00, and reach the daily peak in the early morning (4:00-5:00).

Task2 10.png
2mv0fq.gif
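
The hourly pattern described above can also be checked numerically by averaging readings per hour of day. This is a minimal sketch, assuming each reading is a (timestamp, concentration) pair; the function name and sample values are hypothetical and only illustrate the aggregation.

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean(readings):
    """Average the concentration for each hour of day (0-23) that has readings."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in readings:
        sums[ts.hour] += value
        counts[ts.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Hypothetical sample values, for illustration only.
sample = [
    (datetime(2018, 8, 1, 4, 10), 30.0),
    (datetime(2018, 8, 2, 4, 40), 34.0),
    (datetime(2018, 8, 1, 12, 5), 10.0),
    (datetime(2018, 8, 2, 12, 20), 14.0),
]

means = hourly_mean(sample)
peak_hour = max(means, key=means.get)
```

Grouping by `ts.month` instead of `ts.hour` would give the corresponding monthly view.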