ISSS608 2018-19 T1 Assign Clara Chua Kiah Hwii Task 1
We have data from 6 air quality stations covering 2013 – 2018. Each station captures PM10 concentrations from either traffic or background air pollution; the stations are mapped below.
Plotting the average PM10 concentration on a daily time series chart, we can see that 4 stations (Druzhba, Hipodruma, IAOS/Pavlovo, Nadezhda) have no data for most of 2017 (Jan 1 2017 – Nov 27 2017). Orlov Most station data was discontinued from Oct 2 2015; Mladost station data only starts from Jan 1 2018.
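A gap like this can also be confirmed programmatically rather than read off the chart. The sketch below is illustrative only: `find_gaps` and `min_gap_days` are hypothetical names, and the dates are synthetic, chosen to mimic the 2017 gap described above rather than taken from the EEA files.

```python
import pandas as pd

def find_gaps(dates, min_gap_days=30):
    """Return (last_seen, next_seen, missing_days) for each gap longer than min_gap_days."""
    days = (
        pd.Series(pd.to_datetime(dates))
        .dt.normalize()          # drop the time of day
        .drop_duplicates()
        .sort_values()
        .reset_index(drop=True)
    )
    delta = days.diff().dt.days  # days between consecutive observed dates
    gaps = []
    for i in delta[delta > min_gap_days].index:
        gaps.append((days[i - 1], days[i], int(delta[i]) - 1))
    return gaps

# Synthetic example: readings stop on 2016-12-31 and resume on 2017-11-28
dates = list(pd.date_range("2016-12-25", "2016-12-31")) + list(
    pd.date_range("2017-11-28", "2017-12-05")
)
gaps = find_gaps(dates)
for last_seen, next_seen, missing in gaps:
    print(last_seen.date(), "->", next_seen.date(), missing, "days missing")
```

Running this per station would give one list of gaps per station, which can then be compared across stations.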
At a glance, we can hypothesise that there is some seasonal variation in the air quality, and that the background and traffic data are fairly congruent – days with high spikes in PM10 concentrations are similar across the different stations.
In addition to the lack of data for 2017, the EEA data has a mixture of daily and hourly data (i.e. readings averaged over a day vs. over an hour). There are also variable readings where, perhaps due to device or reading issues, the reading is averaged over less than an hour. Looking at the distribution of the AveragingTime variable, we can see that prior to 2016, the data was averaged on a daily basis; 2016 introduced a mixture of daily and hourly data, and the stations only captured hourly readings from 2017 onwards.
This means that we can only do an apples-to-apples comparison across time when we take the average daily readings for Sofia. For analysis of hourly variation, we can only take data from Nov 2017 onwards, when there were consistent hourly readings from the various stations.
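To put daily and hourly readings on the same footing, each reading can be collapsed to one average per station per calendar day. The sketch below assumes hypothetical column names (`Station`, `DatetimeBegin`, `Concentration`) and uses invented values; a day with only a few hourly readings would still produce an average, so coverage should be checked separately.

```python
import pandas as pd

def to_daily_mean(df, time_col="DatetimeBegin", station_col="Station",
                  value_col="Concentration"):
    """Collapse readings of any averaging period to one mean per station per day."""
    out = df.copy()
    out["date"] = pd.to_datetime(out[time_col]).dt.normalize()
    return out.groupby([station_col, "date"], as_index=False)[value_col].mean()

# Invented values: one daily reading in 2015, two hourly readings in 2018
readings = pd.DataFrame({
    "Station": ["Druzhba", "Druzhba", "Druzhba"],
    "DatetimeBegin": ["2015-01-01 00:00", "2018-01-01 05:00", "2018-01-01 06:00"],
    "Concentration": [40.0, 10.0, 30.0],
})
daily = to_daily_mean(readings)
print(daily)
```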
Finally, we should also look at the different types of measurement when we look at the data. The following visualisation shows the time series of the daily average and maximum PM10 concentration. Whilst the average gives a good indication of the air quality over time, the maximum concentrations show the severity of the air pollution in Sofia over the same period.
The peak average PM10 reading is on Dec 24 2013 (355.9), whereas the maximum PM10 for that day was 396.2. The peak maximum PM10 reading is on Jan 28 2018; however, the average reading for that day is only 170.4. This implies a huge variation in PM10 concentrations within a 24-hour period across the air quality stations. We can explore these points further to see whether the variation is location specific or time specific.
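The comparison of average and maximum can be sketched as a per-day aggregation, with the difference between the two flagging days of high variation. The column names and all values below are invented for illustration; they are not the readings quoted above.

```python
import pandas as pd

# Invented readings pooled across stations and hours
readings = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-28"] * 4 + ["2018-01-29"] * 2),
    "Concentration": [50.0, 100.0, 150.0, 500.0, 40.0, 60.0],
})

# One row per day with the mean and the maximum reading
daily = readings.groupby("date")["Concentration"].agg(["mean", "max"])
# A large spread means the peak reading is far above the typical reading that day
daily["spread"] = daily["max"] - daily["mean"]

print(daily)
print("Day with largest spread:", daily["spread"].idxmax().date())
```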
A typical day in Sofia varies from season to season. We define seasons as:
- Winter: Dec – Feb
- Spring: Mar – May
- Summer: Jun – Aug
- Autumn: Sep – Nov
As expected, the PM10 concentration is higher in winter and autumn, and lower in spring and summer. It also varies by time of day – PM10 concentrations are lower in the daytime (between 11am and 4pm) and there seem to be two peaks, from 6am – 10am and from 6pm – midnight. This is not surprising, as reports from the European Environment Agency (EEA) show that PM10 / PM2.5 emissions come mainly from three sources:
- (i) residents burning coal, wood and other materials for heating in winter
- (ii) road transport
- (iii) industrial use and thermal power plants
The data is consistent with residents burning fuel to keep warm during the winter months, as well as at night, when we expect lower temperatures.
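The season-by-hour view behind this observation can be sketched as follows. The season assignment follows the definition given earlier (Dec – Feb as winter, and so on); `hourly_profile`, the column names and the sample values are hypothetical.

```python
import pandas as pd

SEASON_NAMES = ["Winter", "Spring", "Summer", "Autumn"]

def hourly_profile(df, time_col="DatetimeBegin", value_col="Concentration"):
    """Mean concentration per season (rows) and hour of day (columns)."""
    ts = pd.to_datetime(df[time_col])
    # Dec, Jan, Feb -> 0 (Winter); Mar-May -> 1; Jun-Aug -> 2; Sep-Nov -> 3
    season = ((ts.dt.month % 12) // 3).map(lambda i: SEASON_NAMES[i])
    keyed = df.assign(season=season, hour=ts.dt.hour)
    return keyed.groupby(["season", "hour"])[value_col].mean().unstack("hour")

# Invented hourly readings
readings = pd.DataFrame({
    "DatetimeBegin": [
        "2018-01-10 08:00", "2018-01-11 08:00", "2018-01-10 13:00",
        "2018-07-10 08:00", "2018-07-10 13:00", "2018-12-01 08:00",
    ],
    "Concentration": [120.0, 80.0, 40.0, 30.0, 10.0, 100.0],
})
profile = hourly_profile(readings)
print(profile)
```

Each row of `profile` is one season's typical day, so the morning and evening peaks can be compared across seasons directly.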
Sofia shows the highest PM10 concentrations from Nov – Feb (late autumn and winter), when almost all average values of PM10 are above the acceptable threshold of 50 μg/m³. January seems to be the worst-hit month, with average PM10 concentrations well above those of all other months across the years. While we can see the average PM10 concentration decreasing in some months over the years, we are unable to extrapolate a general trend for all of Sofia.
However, when we look at the maximum reading of PM10 captured each month, we see that the severity of PM10 concentrations seems to be on an upward trend, after a dip in 2015.
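The monthly view can be sketched by grouping daily values by year and month, then comparing the monthly average against the 50 μg/m³ threshold. `monthly_summary`, the column names and the values are assumptions for illustration only.

```python
import pandas as pd

LIMIT = 50.0  # threshold in μg/m³, as quoted in the text

def monthly_summary(daily, date_col="date", value_col="Concentration"):
    """Mean and maximum per (year, month), plus a flag for means above LIMIT."""
    d = pd.to_datetime(daily[date_col])
    keys = [d.dt.year.rename("year"), d.dt.month.rename("month")]
    out = daily.groupby(keys)[value_col].agg(["mean", "max"])
    out["above_limit"] = out["mean"] > LIMIT
    return out

# Invented daily city-wide values
daily_values = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-02", "2018-06-01", "2018-06-02"],
    "Concentration": [100.0, 60.0, 20.0, 30.0],
})
summary = monthly_summary(daily_values)
print(summary)
```

Reading the `mean` column across years for a fixed month shows the change in average levels, while the `max` column tracks severity.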
The following plot is another way to visualise the daily variation of PM10 concentration across the entire city.






