ISSS608 2018-19 T1 Assign Stanley Alexander Dion Task 1
Insight on Station Location
The data cover six stations located in the regions that, according to Sofia Globe (2017), recorded the worst air quality. The visualisation above shows the different altitudes of the stations across the city of Sofia. The highest station is IAOS/Pavlovo, at 615 m ASL. One more station, Mladost, is not shown on our map because its altitude was not recorded.
As a first step in the analysis, the distance between the sensors and the street kerb, and between the sensors and the station building, matters for judging the quality of the observations the sensors record. The further the building is from the station's inlet (sensors), the more signal loss can occur and add noise to our observations, assuming that data storage and servers are located in the station building. On the other hand, the closer an inlet is to the kerb, the more the pollution sensors are affected by temporary human activity in the street, e.g. passing vehicles.
Our plot on the right points to a flawed sensor placement: Mladost station, which is designed to measure traffic pollution, is located the furthest from the kerb, so it is less likely to detect pollution from passing vehicles. Orlov Most, in turn, is located the furthest from the station building, so we can expect more sensing disruption there.
Insight on Station Report Forms
Observation Period. Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_InitialLookup/DataCollectionMethod?publish=yes
Since we are going to perform spatio-temporal analysis on this dataset, we need to check whether the sensor readings have a consistent inter-observation period. The plot on the left shows that the stations do not use a consistent averaging method over time. Before 2016, all observations were standardised and used a day as the averaging unit. Hourly measurement was piloted in 2016, although it was not yet applied consistently to all stations. The trial was followed by a series of observation blackouts during 2017, after which the stations began switching to hourly measurement, with varying averaging units at the start of 2018. To keep the analysis consistent and characterise both past and recent observations, we divide the data into two analysis periods: before and after the 2017 blackout.
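The consistency check and the split into two periods can be sketched in a few lines of Python. This is only an illustration: the cutoff date, the function names and the sample timestamps are assumptions, and the actual work was done in Tableau.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed boundary between the two analysis periods. The exact blackout dates
# are not stated in the text, so the start of 2017 is used for illustration.
CUTOFF = datetime(2017, 1, 1)

def split_periods(timestamps, cutoff=CUTOFF):
    """Split timestamps into those before the cutoff and those from it onward."""
    past = sorted(t for t in timestamps if t < cutoff)
    recent = sorted(t for t in timestamps if t >= cutoff)
    return past, recent

def gap_counts(ordered):
    """Count how often each gap between consecutive timestamps occurs."""
    return Counter(b - a for a, b in zip(ordered, ordered[1:]))

# Hypothetical timestamps: three daily readings, then three hourly readings.
sample = [datetime(2016, 1, d) for d in (1, 2, 3)]
sample += [datetime(2018, 1, 1, h) for h in (0, 1, 2)]

past, recent = split_periods(sample)
past_gaps = gap_counts(past)
recent_gaps = gap_counts(recent)
print(past_gaps)    # gaps of one day in the earlier period
print(recent_gaps)  # gaps of one hour in the later period
```

A period with a single dominant gap has a consistent averaging unit; several competing gaps would point to the mixed units described above.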
Temporal Factor of the Pollution
Past Observations (2013-2016). Dashboard link: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_DailyData/Spatio-TemporalDashboard?publish=yes
Recent Observations (2017-2018). Dashboard link: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_HourlyData/InteractiveSpatio-TemporalDashboard
Starting with a high-level overview, we can view the average concentration at each station over the entire observation period. We use the average because the number of observations differs between stations. Orlov Most stood out as the most polluted location in 2015, whereas from 2017 onward the concentration has slowly shifted to the western area of the city, leaving Druzhba with a significantly lower average PM10 concentration than the other locations. Pollution in Nadezhda has also been heavier in the last two years than over the last five years.
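As a minimal sketch of why the average is used, the Python snippet below compares two stations with different numbers of readings. The values are made up for illustration, and `station_means` is a hypothetical helper, not part of the Tableau workbook.

```python
from collections import defaultdict
from statistics import mean

def station_means(readings):
    """Average PM10 per station from (station, value) pairs."""
    groups = defaultdict(list)
    for station, value in readings:
        groups[station].append(value)
    return {station: mean(values) for station, values in groups.items()}

# Hypothetical readings: Druzhba reports more often than Orlov Most.
sample = [
    ("Orlov Most", 60.0), ("Orlov Most", 80.0),
    ("Druzhba", 30.0), ("Druzhba", 40.0), ("Druzhba", 50.0),
]
means = station_means(sample)
print(means)
```

A sum would favour the station that reports more often (here both sums are close, at 140 and 120), while the mean compares the stations on the same scale.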
Regarding Nadezhda, the peak time there seems to occur earlier than, and differ from, that of the other regions. Some hypotheses are as follows:
- Nadezhda may contain a potential source of pollution, and the later peaks in the other regions are caused by particles being blown towards the southern area.
- A secondary source of pollution affects the region, apart from the source that always affects the southern region.
*Please click on the image to view the GIF at reduced speed.
Calendar View of the Month
Observations from 2013-2016. Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_DailyData/CalendarPlot2013-2016
Observations from 2017-2018. Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_HourlyData/CalendarPlot2017-2018
To see patterns on a daily basis, we plot a calendar chart of city-wide observations. From the colours of the calendar chart above, pollution concentration across all stations is higher at the beginning and end of each year. Very heavy pollution is seen at the beginning of 2013; however, it decreases towards the more recent observations in 2016.
The recent hourly observations show the same pattern of pollution during the winter season. Additionally, we now see higher pollution towards the weekend than on weekdays. The third week of January 2018 further shows that high pollution occurred in the evening hours, especially on Saturdays. This is an interesting point to investigate, since pollution rises in the evening, when human activity is generally lower.
Cycle of the Months
Daily Cycle Plot (2013-2016). Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_DailyData/CyclePlot2013-2016
A cycle plot is useful for detecting patterns in seasonally dependent time-series data. The plot above displays the cycle of average concentration per month across the years available in our daily dataset. Abnormal weather is defined by readings outside one standard deviation. We can spot high peaks during January and December. Nadezhda is clearly lower than the other stations on two of the winter peaks; in December 2016, when Druzhba, Hipodruma, and Pavlovo increase, the Nadezhda average remains constant at above 50 µg/m³.
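The one-standard-deviation rule can be illustrated with a short Python sketch. It assumes the sample standard deviation and uses made-up monthly averages; the text does not say which variant the Tableau view uses, and `outside_one_std` is a hypothetical name.

```python
from statistics import mean, stdev

def outside_one_std(values):
    """Return the values lying more than one standard deviation from the mean."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (an assumption)
    return [v for v in values if abs(v - mu) > sigma]

# Hypothetical monthly averages for one station, with one winter spike.
monthly = [40.0, 42.0, 38.0, 41.0, 39.0, 90.0]
flagged = outside_one_std(monthly)
print(flagged)
```

With these numbers only the spike of 90 falls outside the band, since the mean is about 48.3 and the standard deviation about 20.5.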
Hourly Cycle Plot (2017-2018). Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_HourlyData/CyclePlot2017-2018
Since hourly observations are available for the following period (2017-2018), we changed the cycle plot to show the average hourly observation for each month. At 2 AM in January, the IAOS station detected a high average PM10 concentration, in the very unhealthy range. The other locations share this peak, at a far lower magnitude. This hour-long peak is suspicious because it spikes for only a short period. Another anomaly appears in the evening hours in December, at Mladost station only. We know that Mladost is close to the CBD and that the station is designed to detect pollution from passing vehicles. However, since the station's inlet is far from the kerb, we infer that the pollution is less likely to come from background traffic. Further investigation is needed to understand the cause.
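The aggregation behind an hourly cycle plot can be sketched as a mean per (month, hour of day). The function name and the sample readings below are hypothetical and only illustrate the grouping, not the Tableau calculation itself.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def hourly_cycle(readings):
    """Average PM10 per (month, hour of day) from (timestamp, value) pairs."""
    groups = defaultdict(list)
    for ts, value in readings:
        groups[(ts.month, ts.hour)].append(value)
    return {key: mean(values) for key, values in groups.items()}

# Hypothetical January readings: one reading at 2 AM, two readings at 3 AM.
sample = [
    (datetime(2018, 1, 5, 2), 300.0),
    (datetime(2018, 1, 5, 3), 60.0),
    (datetime(2018, 1, 6, 3), 80.0),
]
cycle = hourly_cycle(sample)
print(cycle[(1, 2)], cycle[(1, 3)])
```

Note that a cell backed by a single reading, such as the 2 AM one here, simply reproduces that reading, so an isolated spike can dominate the plotted average.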
Detailed View of the City
Daily Spiral Plot (2013-2016). Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_DailyData/SpiralPlot2013-2016
Hourly Spiral Plot (2017-2018). Link to the graph: https://public.tableau.com/profile/stanley.alexander.dion#!/vizhome/Assignment_EEAData_HourlyData/SpiralPlot2017-2018
From 2013 till 2016
A spiral plot is useful for seeing repeating trends at a very granular level. The visualisation above shows one ring of observations per year; each point stacked on another is the observation exactly one year later. The dots are colour-coded by hazard level according to the EU standard. The visualisation again confirms high winter pollution around the turn of the year. Apart from the usual winter peaks, Druzhba and Orlov Most show the most additional above-normal pollution, around the middle of the year. The graph also shows that Orlov Most station stopped reporting from the last quarter of 2016. Hipodruma had intermittent sensing disruptions during 2014, while Nadezhda had the most reporting disruptions in mid-2016.
2017
From December 2017 onwards, the spiral plot changes so that each spike represents an hour of the day; 12 PM is the topmost spike and 12 AM lies at the bottom, read clockwise, with each layer representing one day of observation. In the early part of December 2017, Nadezhda is worse than in the previous four years. The peak started in the morning and lasted until the evening (10 AM to 7 PM). The peak period in Nadezhda then changed abruptly just a few days later (12 PM to 4 PM, followed by another period from 11 PM to 2 AM). One day towards the end of the month shows an unhealthy pollution index shared across all the stations.
We also notice a suspicious pattern: all stations except Mladost have missing observations at 2 AM every day. Mladost station itself only reported for a few days before the end of the year.
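The layout described here (12 PM on top, 12 AM at the bottom, clockwise, one layer per day) can be expressed as a small coordinate mapping. The Python sketch below is an assumed geometry for illustration; the function name, the base radius and the one-unit spacing between layers are not taken from the Tableau workbook.

```python
import math

def spiral_point(hour, day_index, base_radius=1.0):
    """Map an hour of day (0-23) and a day layer to x, y plot coordinates.

    The angle is measured clockwise from the top, so 12 PM maps to 0 degrees
    and 12 AM to 180 degrees. Each day adds one unit to the radius.
    """
    angle_deg = ((hour - 12) % 24) * 15  # 24 hours share 360 degrees
    theta = math.radians(angle_deg)
    radius = base_radius + day_index
    return radius * math.sin(theta), radius * math.cos(theta)

noon = spiral_point(12, 0)      # top of the innermost layer
midnight = spiral_point(0, 0)   # bottom of the innermost layer
evening = spiral_point(18, 2)   # right-hand side, third layer
print(noon, midnight, evening)
```

Using sine for x and cosine for y is what makes the angle run clockwise from the top rather than counter-clockwise from the right.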
2018
Across the different regions, we see high pollution at the start of January 2018. Strangely, pollution is lower in mid-January and returns at the end of the month with a higher magnitude. The missing 2 AM observations continue through January 2018. This explains why we previously saw a high average concentration at 2 AM for January 2018 in some stations' cycle plots: the 2 AM observation is always missing except for a single hour in the month, when the sensors peculiarly detected high pollution that night across the stations.
Among the five stations, Mladost has less severe pollution at the start of January, whereas Druzhba has the least severe pollution over the whole of 2018. Three locations, Nadezhda, Hipodruma, and Pavlovo, share the same peak period, in the early morning around 1 AM to 3 AM. Nadezhda has a distinct pattern of its own: above-normal air pollution from 10 PM to 3 AM during the spring season (March and April).
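One way to surface such gaps before trusting an hourly average is to count how many days each hour was actually observed. The Python sketch below uses made-up timestamps; `sparse_hours` and the 50% coverage threshold are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime

def sparse_hours(timestamps, min_fraction=0.5):
    """Return observed hours of day that appear on fewer than
    min_fraction of the days covered by the timestamps."""
    days = {ts.date() for ts in timestamps}
    per_hour = Counter(ts.hour for ts in timestamps)
    threshold = min_fraction * len(days)
    return sorted(h for h, n in per_hour.items() if n < threshold)

# Hypothetical data: four days with readings at 1 AM and 3 AM,
# but only a single reading at 2 AM.
sample = [datetime(2018, 1, d, h) for d in (1, 2, 3, 4) for h in (1, 3)]
sample.append(datetime(2018, 1, 3, 2))
rare = sparse_hours(sample)
print(rare)
```

Hours flagged this way are the ones where a monthly average rests on very few readings and should be read with caution.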