Sofia City - Data Preparation
Sofia City - Air Quality Analysis
Contents
Data Preparation
Task 1: Spatio-temporal Analysis of Official Air Quality
Air Quality Data Distributed
The air quality data available from the EEA comes from 6 air quality stations and is distributed over 28 CSV files. The files were checked and found to share the same columns, so they are concatenated directly.
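The check-then-concatenate step could be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the analysis; the column names (`station`, `timestamp`, `value`), the file names and the sample values are hypothetical stand-ins for the real EEA files.

```python
import csv
import io


def concat_csv(sources):
    """Concatenate CSV sources after verifying they all share one header.

    `sources` is an iterable of (name, text file object) pairs.
    Raises ValueError as soon as a file's columns differ from the first file's.
    """
    header = None
    rows = []
    for name, handle in sources:
        reader = csv.DictReader(handle)
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError(f"{name}: columns {reader.fieldnames} != {header}")
        rows.extend(reader)
    return header, rows


# Two tiny in-memory stand-ins for the 28 files (hypothetical columns and values).
file_a = io.StringIO("station,timestamp,value\nBG0054A,2013-01-01,41.0\n")
file_b = io.StringIO("station,timestamp,value\nBG0079A,2018-01-01,35.5\n")
header, rows = concat_csv([("a.csv", file_a), ("b.csv", file_b)])
```

With real files, each `io.StringIO` would be replaced by `open(path, newline="")`. Failing loudly on a column mismatch keeps the "same columns" assumption explicit rather than silently producing misaligned rows.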
Uneven Spread of Dataset over Time
Of the 6 air quality stations, 4 provide data across the full date range from 2013 to 2018. Of the remaining 2, BG0054A only contains data from 2013 to 2015, possibly because the station was closed and produced no further data from 2016 onwards. The other, BG0079A, only contains data from 1 January 2018 onwards; it may be a new station that only started operating in 2018.
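One way to surface this kind of uneven coverage is to compute the first and last reading date per station. The sketch below assumes the readings have already been reduced to (station code, date) pairs; the function name and the sample dates are illustrative only.

```python
from datetime import date


def coverage_by_station(readings):
    """Return {station: (first_date, last_date)} for (station, date) pairs."""
    coverage = {}
    for station, day in readings:
        first, last = coverage.get(station, (day, day))
        coverage[station] = (min(first, day), max(last, day))
    return coverage


# Illustrative sample only, not the real dataset.
sample_readings = [
    ("BG0054A", date(2013, 1, 1)),
    ("BG0054A", date(2015, 12, 31)),
    ("BG0079A", date(2018, 1, 1)),
    ("BG0079A", date(2018, 3, 1)),
]
coverage = coverage_by_station(sample_readings)
```

Stations whose range is much shorter than the overall date range stand out immediately in the resulting dictionary.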
Difference in Interval of Measurements
Air quality measurements up to the end of 2016 are taken daily. From 2017 onwards, measurements are taken hourly. This could be due to process improvements intended to provide more granular data.
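A change in measurement interval like this can be detected by looking at the most common gap between consecutive timestamps in each year. The following is a rough sketch for the timestamps of a single station; the function name and the sample timestamps are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def typical_interval_by_year(timestamps):
    """Map each year to the most common gap between consecutive timestamps.

    Gaps that straddle a year boundary are ignored so they do not
    distort either year's result.
    """
    gaps = defaultdict(Counter)
    ordered = sorted(timestamps)
    for earlier, later in zip(ordered, ordered[1:]):
        if earlier.year == later.year:
            gaps[later.year][later - earlier] += 1
    return {year: counter.most_common(1)[0][0] for year, counter in gaps.items()}


# Illustrative timestamps: daily readings in 2016, hourly readings in 2017.
stamps = [datetime(2016, 12, d) for d in (1, 2, 3)]
stamps += [datetime(2017, 1, 1, h) for h in (0, 1, 2)]
intervals = typical_interval_by_year(stamps)
```

Using the most common gap rather than the minimum or mean makes the result less sensitive to occasional missing readings.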
Joining EEA table with Station Info
The air quality stations are identified by a unique string identifier, the EOL code. For readability, the table of monitoring station data is joined to the EEA table, so that every reading carries the corresponding station name, latitude and longitude.
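The join can be sketched as a left join on the station code, so that readings are kept even when no matching station record exists. The field names (`code`, `name`, `latitude`, `longitude`), the station name and the coordinates below are placeholders, not values from the actual station table.

```python
def join_station_info(readings, stations):
    """Left-join readings to station metadata on the station code.

    Readings with no matching station keep None for the added fields.
    """
    by_code = {station["code"]: station for station in stations}
    joined = []
    for reading in readings:
        info = by_code.get(reading["code"], {})
        joined.append({
            **reading,
            "name": info.get("name"),
            "latitude": info.get("latitude"),
            "longitude": info.get("longitude"),
        })
    return joined


# Placeholder metadata and readings (made-up name, coordinates and values).
station_table = [
    {"code": "BG0054A", "name": "Station A", "latitude": 42.7, "longitude": 23.3},
]
eea_readings = [
    {"code": "BG0054A", "value": 41.0},
    {"code": "BG0079A", "value": 35.5},
]
joined = join_station_info(eea_readings, station_table)
```

Checking the joined rows for missing names afterwards is a quick way to confirm that every station code in the readings has a matching entry in the station table.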