ISSS608 Assign Pu Yiran-Task 2

From Visual Analytics and Applications
Revision as of 00:06, 17 November 2018 by Yiran.pu.2017 (talk | contribs)

Pollution-1.jpg    Task 1: Spatio-temporal Analysis of Official Air Quality


Getting to Know the Sensors

Where the sensors are located

Task2 001.png

After decoding each geohash into its longitude and latitude, we can plot every geohash on a map, where each point represents a sensor's location. As the first graph shows, there is one remote point that is probably an error; it is excluded from further analysis. In total, 538 sensors are located across Bulgaria, producing 1,048,574 sensor records for 2017 and 2018.
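The decoding step can be sketched in a few lines of Python. This is a minimal illustration of the standard geohash algorithm, not necessarily the tool used for this analysis; the function name `decode_geohash` is hypothetical, and the example hash `ezs42` is a commonly cited test value rather than a sensor from the dataset.

```python
# Standard geohash base-32 alphabet (it omits the letters a, i, l and o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return (latitude, longitude) of the centre of a geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between axes, starting with longitude
    for char in geohash.lower():
        value = _BASE32.index(char)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid  # keep the eastern half
                else:
                    lon_hi = mid  # keep the western half
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid  # keep the northern half
                else:
                    lat_hi = mid  # keep the southern half
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


lat, lon = decode_geohash("ezs42")  # roughly (42.605, -5.603)
```

The function returns the centre of the cell, so a longer geohash gives a more precise position.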

Nationwide, sensors reported from 382 locations in 2017 and 528 locations in 2018. In Sofia city, sensors reported from 240 locations in 2017 and 308 locations in 2018.

Although the sensors cover the whole of Bulgaria, most of them are concentrated in the capital, Sofia, and especially in the central part of the city. A smaller cluster of sensors is found in Plovdiv, a province near Sofia. In the remaining provinces and cities, the sensors are spread fairly evenly.

For further analysis of Sofia, a subset containing all the sensors located in Sofia city is created.
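One simple way to build such a subset is a bounding-box filter on the decoded coordinates. The sketch below is only an illustration: the box limits in `SOFIA_BOX` are rough assumed values, not the city boundary used in this analysis, and the sensor tuples are invented examples.

```python
# Rough, illustrative bounding box around Sofia city. These limits are an
# assumption for the sketch, not the boundary used in the original analysis.
SOFIA_BOX = {"lat_min": 42.60, "lat_max": 42.80, "lon_min": 23.20, "lon_max": 23.50}


def in_box(lat, lon, box=SOFIA_BOX):
    """Return True if the point lies inside the bounding box."""
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


def sofia_subset(sensors):
    """Keep the (sensor_id, lat, lon) tuples that fall inside the box."""
    return [s for s in sensors if in_box(s[1], s[2])]


# Invented example points: one in central Sofia, one near Plovdiv, one near Varna.
sensors = [("a", 42.70, 23.32), ("b", 42.14, 24.75), ("c", 43.21, 27.91)]
subset = sofia_subset(sensors)  # only sensor "a" is kept
```

A polygon of the actual city boundary would be more precise than a box, at the cost of needing a boundary file.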

Task2 002.png




How the sensors performed

Task2 003.png


Although the measurement interval in the data is one hour, not all sensors were working properly or reporting correct data all the time.

Not all sensors were running from the start: the number of sensors working per day generally increased over the period covered by the data.
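The daily count of working sensors can be computed by collecting the distinct sensors that reported on each date. The following is a minimal sketch that assumes each record is a `(sensor_id, timestamp)` pair; the field layout and the sample records are hypothetical, not taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime


def active_sensors_per_day(records):
    """Map each calendar day to the number of distinct sensors that reported."""
    active = defaultdict(set)
    for sensor_id, timestamp in records:
        active[timestamp.date()].add(sensor_id)
    return {day: len(ids) for day, ids in sorted(active.items())}


# Invented records for illustration only.
records = [
    ("s1", datetime(2017, 9, 1, 0)),
    ("s1", datetime(2017, 9, 1, 1)),   # same sensor, same day: counted once
    ("s1", datetime(2017, 9, 2, 0)),
    ("s2", datetime(2017, 9, 2, 5)),
]
counts = active_sensors_per_day(records)
```

Plotting these counts against the date gives the kind of rising curve described above.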

Most sensors did not work continuously. As the example below shows, even when a sensor was working on a particular day, it may not have run for the full 24 hours, which produces the blanks and gaps in the graph.
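Such gaps can be located by comparing consecutive timestamps of a single sensor against the expected one-hour interval. This is a sketch under that assumption; `find_gaps` is a hypothetical helper name and the sample timestamps are invented.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(hours=1)):
    """Return (last_seen, next_seen) pairs that are more than `step` apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


# Invented readings for one sensor: hours 3 and 4 are missing.
day = datetime(2018, 1, 15)
readings = [day + timedelta(hours=h) for h in (0, 1, 2, 5, 6)]
gaps = find_gaps(readings)  # one gap, between 02:00 and 05:00
```

Each returned pair marks the last reading before a gap and the first reading after it, so the gap length is simply the difference between the two.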

Worse, sensors sometimes behaved abnormally and reported implausible values. As shown below, some abnormal states lasted for hours or even several consecutive days, which can badly distort analyses based on daily averages.

Looking at all the abnormal records, we find that most of the error values are identical or fall within the same range, which suggests a systematic sensor malfunction. To ensure the accuracy of further analysis, all the error records are excluded.
Task2 006.png
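A range check is one way to separate error records from valid ones and to measure how long an abnormal state lasted. The sketch below assumes readings are `(hour, value)` pairs in time order; the limits `low` and `high` and the sample values are hypothetical, since the actual error values are not listed here.

```python
from itertools import groupby


def is_error(value, low, high):
    """A reading counts as an error when it falls outside [low, high]."""
    return not (low <= value <= high)


def split_valid(readings, low, high):
    """Split time-ordered (hour, value) readings into valid and error lists."""
    valid = [r for r in readings if not is_error(r[1], low, high)]
    errors = [r for r in readings if is_error(r[1], low, high)]
    return valid, errors


def longest_error_run(readings, low, high):
    """Length of the longest streak of consecutive error readings."""
    longest = 0
    for flagged, group in groupby(readings, key=lambda r: is_error(r[1], low, high)):
        if flagged:
            longest = max(longest, len(list(group)))
    return longest


# Invented readings: the value -145.0 stands in for a repeated error code.
readings = [(0, 21.5), (1, -145.0), (2, -145.0), (3, -145.0), (4, 22.0)]
valid, errors = split_valid(readings, low=-40.0, high=60.0)
run = longest_error_run(readings, low=-40.0, high=60.0)
```

Only `valid` would be carried forward, so that long error streaks do not distort the daily averages.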

Analysis of Air Pollutants--PM10 and PM2.5