ISSS608 2018-19 T1 Assign Wang Yixuan Data Preparation
Official Air Quality Measurements (EEA Data)
The EEA data set consists of air quality data from 6 air quality stations in Sofia city, together with metadata giving the longitude, latitude and code information of each station. The data was collected from Jan 2013 to Sep 2018. To analyse both the past and the most recent air quality situation in Sofia city, we need to combine all the separate worksheets and join them with the metadata using Tableau.
Drag and drop all the air quality data into Tableau.
Because the worksheets share the same columns, they can be concatenated (unioned) into a single table. We then join the concatenated table with the metadata table using a left join. Note that we need to set a join key to make sure the data is joined properly.
Here we use "Air Quality Station" as the key to left join the two tables.
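For readers who want to reproduce the union and left join outside Tableau, the same logic can be sketched in plain Python. This is only an illustration under stated assumptions: the station codes (STA-A, STA-B, STA-C), the "Date", "Concentration", "Latitude" and "Longitude" column names, and all values are hypothetical placeholders, not taken from the EEA files. Only the join key "Air Quality Station" comes from the steps above.

```python
from itertools import chain

KEY = "Air Quality Station"  # join key named in the text

# Hypothetical worksheets that share the same columns.
sheet_2017 = [
    {KEY: "STA-A", "Date": "2017-01-01", "Concentration": 41.0},
    {KEY: "STA-B", "Date": "2017-01-01", "Concentration": 38.5},
]
sheet_2018 = [
    {KEY: "STA-A", "Date": "2018-01-01", "Concentration": 35.2},
    {KEY: "STA-C", "Date": "2018-01-01", "Concentration": 29.9},
]

# Hypothetical metadata, one row per station.
# STA-C is left out on purpose to show what a left join does with a missing key.
metadata = [
    {KEY: "STA-A", "Latitude": 42.70, "Longitude": 23.30},
    {KEY: "STA-B", "Latitude": 42.65, "Longitude": 23.38},
]


def union_rows(*sheets):
    """Stack worksheets with identical columns into one list of rows."""
    return list(chain.from_iterable(sheets))


def left_join(left, right, key):
    """Keep every left row; attach right-hand columns where the key matches."""
    lookup = {row[key]: row for row in right}
    right_cols = [c for c in right[0] if c != key] if right else []
    joined = []
    for row in left:
        match = lookup.get(row[key])
        extra = {c: (match[c] if match else None) for c in right_cols}
        joined.append({**row, **extra})
    return joined


measurements = union_rows(sheet_2017, sheet_2018)
joined = left_join(measurements, metadata, KEY)

# Stations with no metadata match show up with empty coordinates.
unmatched = sorted({r[KEY] for r in joined if r["Latitude"] is None})
print(len(joined), unmatched)
```

A left join keeps every measurement row even when a station has no metadata entry, so listing the unmatched keys is a quick way to confirm that the key was set correctly.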
Next, we check the completeness of the Official Air Quality Measurements data.
The chart above shows the number of measurements from the 6 stations on each day from 2013 to 2018. From the chart we can see that the data is incomplete. Mladost station only has measurements for 2018, while Orlov Most station only has observations between 2013 and 2015. In addition, measurements were recorded once a day from 2013 to 2016 and once an hour from 2017 onwards.
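The completeness check behind the chart amounts to counting measurements per station per day and looking at which years each station covers. The sketch below shows that counting in plain Python. The records are a handful of hypothetical rows made up to mimic the pattern described above (they are not the real EEA values), and the timestamp format is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical records: (station, timestamp as "YYYY-MM-DD HH:MM").
records = [
    ("Orlov Most", "2013-01-01 00:00"),
    ("Orlov Most", "2015-06-01 00:00"),
    ("Mladost", "2018-03-01 00:00"),
    ("Mladost", "2018-03-01 01:00"),
    ("Mladost", "2018-03-01 02:00"),
]

# Number of measurements per station per day (the quantity plotted in the chart).
per_day = Counter((station, ts[:10]) for station, ts in records)

# Years in which each station reported at least one measurement.
years = defaultdict(set)
for station, ts in records:
    years[station].add(int(ts[:4]))

# First and last year with data for each station.
coverage = {station: (min(ys), max(ys)) for station, ys in years.items()}

for station, (first, last) in sorted(coverage.items()):
    print(station, first, last)
```

A daily count of 1 points to daily data and a count near 24 points to hourly data, which is how the change in measurement frequency shows up in the chart.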