Difference between revisions of "ISSS608 2018-19 T1 Assign Hou Xuelin"
Revision as of 10:43, 13 November 2018
Task1 Exploration of Official Data
Situation of Air Quality
The annual average PM10 concentration stayed around 45 from 2013 to 2018.
Air quality at Druzhba has improved over the last two years, whereas Nadezhda showed an increase in 2017.
PM10 concentrations in the remaining areas are gradually declining.
The PM10 trend in Sofia is strongly seasonal, and the peaks always fall in winter (Dec/Jan).
This may be due to domestic heating in winter.
Within a typical day, PM10 averages around 30 and declines to around 20 between 10am and 5pm, when most people are out at work.
Anomalies of Official Data
- Only Nov/Dec data is recorded for 2017; all other months are missing, so the 2017 annual figure may not be representative.
- The sampling interval `AveragingTime` is inconsistent throughout the data, taking the values day, hour and var. This may introduce bias when the data is aggregated.
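The aggregation bias mentioned in the last point can be illustrated with a minimal Python sketch. The records, the function names and the choice to collapse each date to a single daily value are all assumptions made for illustration, not the method used in this analysis; rows marked var would need separate handling because their interval is not specified.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (date, averaging time, PM10 concentration).
records = [
    ("2018-01-01", "day", 40.0),
    ("2018-01-02", "hour", 10.0),
    ("2018-01-02", "hour", 20.0),
    ("2018-01-02", "hour", 30.0),
]

def naive_mean(rows):
    """Average every row equally, regardless of its averaging time."""
    return mean(value for _, _, value in rows)

def daily_then_mean(rows):
    """Collapse each date to one daily value first, then average the days."""
    by_date = defaultdict(list)
    for date, _, value in rows:
        by_date[date].append(value)
    return mean(mean(values) for values in by_date.values())

naive = naive_mean(records)          # the three hourly rows outweigh the daily row
balanced = daily_then_mean(records)  # each date contributes exactly once
```

In this toy data the two results differ because the date with hourly rows contributes three values to the naive average but only one to the balanced average.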
Task2 Exploration of Sensor Data
Sensor Data Quality
Coverage
The highest density of sensors is in the urban area around the city centre.
Sensors densely cover the south of Sofia, while coverage in the north of the city is low.
Operation
In total, 1265 sensors were deployed around Sofia from Sep 2017 to Sep 2018.
The mean number of working sensors is 453.2, and the median is 513.
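As a rough illustration of how such figures could be derived, the Python sketch below counts working sensors per day and summarises the counts. It assumes that a sensor is "working" on a date if it reports at least one reading that day; that definition, the sample data and the function name are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (date, sensor_id) pairs, one per reading.
readings = [
    ("2018-01-01", "a"), ("2018-01-01", "a"), ("2018-01-01", "b"),
    ("2018-01-02", "a"), ("2018-01-02", "b"), ("2018-01-02", "c"),
    ("2018-01-03", "c"),
]

def working_sensors_per_day(rows):
    """Count the distinct sensors that reported at least once on each date."""
    sensors = defaultdict(set)
    for date, sensor_id in rows:
        sensors[date].add(sensor_id)
    return {date: len(ids) for date, ids in sensors.items()}

counts = working_sensors_per_day(readings)
average = mean(list(counts.values()))
middle = median(list(counts.values()))
```

A mean well below the median, as reported above, would be consistent with a number of days on which far fewer sensors than usual were reporting.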
Performance
The sensor measurements are not consistently reliable, because some abnormal readings are observed:
- P1 and P2 values are capped at 2000 and 1000 respectively. This may reflect either the maximum the sensor can measure or a measurement error; the data alone cannot tell which.
- Pressure should range from 90000 to 100000 Pa, yet negative values are observed in the data.
- Temperature should range from -10 to 50 degrees Celsius. Extreme values (e.g. -5573, 435) are unreasonable.
- Humidity is a percentage, which should range from 0 to 100. Anomalies such as -999 and 898 are also observed.
quantile | P1 | P2 | pressure | temperature | humidity |
---|---|---|---|---|---|
0% | 0 | 0 | -20148 | -5573 | -999 |
10% | 4 | 2 | 0 | 0 | 28 |
20% | 7 | 4 | 93178 | 4 | 40 |
30% | 9 | 6 | 94075 | 8 | 49 |
40% | 11 | 7 | 94552 | 12 | 57 |
50% | 14 | 9 | 94936 | 15 | 63 |
60% | 18 | 11 | 95360 | 18 | 69 |
70% | 23 | 15 | 96242 | 21 | 74 |
80% | 34 | 20 | 99027 | 24 | 80 |
90% | 62 | 33 | 100140 | 27 | 88 |
100% | 2000 | 1000 | 254165 | 435 | 898 |
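The plausibility ranges listed above can be turned into a simple validity check. The following Python sketch is only an illustration: the dictionary layout, the function name and the sample readings are assumptions, and only the three ranges stated in the text are encoded (no limit is assumed for P1 or P2).

```python
# Plausible ranges as stated in the text; units follow the data as recorded.
VALID_RANGES = {
    "pressure": (90000, 100000),
    "temperature": (-10, 50),
    "humidity": (0, 100),
}

def out_of_range_fields(reading):
    """Return the names of the fields whose value falls outside its plausible range."""
    flagged = []
    for field, (low, high) in VALID_RANGES.items():
        value = reading.get(field)
        if value is not None and not (low <= value <= high):
            flagged.append(field)
    return flagged

# Hypothetical readings built from values quoted in the quantile table.
ok = {"pressure": 94936, "temperature": 15, "humidity": 63}
bad = {"pressure": -20148, "temperature": 435, "humidity": -999}
```

Calling `out_of_range_fields(bad)` flags all three fields, while `out_of_range_fields(ok)` returns an empty list; missing fields are skipped rather than flagged.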
I also noticed that not all types of measurement are available for all sensors. Therefore, I divided the sensors into 3 types:
- measure only particle concentrations (P1, P2) => particle-measurement only
- measure only Temperature, Pressure & Humidity => TPH-measurement only
- measure all 5 indicators => All-measurement
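The grouping above could be expressed as a small classification rule. This Python sketch assumes each sensor is described by the set of measures it reports; the function name and the extra "unclassified" fallback for sensors with an incomplete set are hypothetical additions, not part of the original grouping.

```python
PARTICLE = {"P1", "P2"}
TPH = {"temperature", "pressure", "humidity"}

def sensor_type(measured):
    """Classify a sensor from the set of measures it reports."""
    measured = set(measured)
    has_particle = PARTICLE <= measured  # both particle measures present
    has_tph = TPH <= measured            # all three TPH measures present
    if has_particle and has_tph:
        return "All-measurement"
    if has_particle:
        return "particle-measurement only"
    if has_tph:
        return "TPH-measurement only"
    return "unclassified"  # assumption: fallback for incomplete sets
```

For example, `sensor_type(["P1", "P2"])` returns "particle-measurement only", and a sensor reporting all five measures returns "All-measurement".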
Anomalies of Sensor Data
Can you detect any unexpected behaviors of the sensors through analyzing the readings they capture? Limit your response to no more than 4 images and 600 words.
Air Pollution Correlation
Now turn your attention to the air pollution measurements themselves. Which part of the city shows relatively higher readings than others? Are these differences time dependent? Limit your response to no more than 6 images and 800 words.
Task3 Factors Affect Sofia Air Pollution
Urban air pollution is a complex issue. There are many factors affecting the air quality of a city. Some of the possible causes are:
- Local energy sources. For example, according to Unmask My City, a global initiative by doctors, nurses, public health practitioners, and allied health professionals dedicated to improving air quality and reducing emissions in our cities, Bulgaria’s main sources of PM10 and fine particle pollution PM2.5 (particles 2.5 microns or smaller) are household burning of fossil fuels or biomass, and transport.
- Local meteorology such as temperature, pressure, rainfall, humidity, wind, etc.
- Local topography.
- Complex interactions between local topography and meteorological characteristics.
- Transboundary pollution, for example the haze that intruded into Singapore from our neighbours.
In this third task, you are required to reveal the relationships between the factors mentioned above and the air quality measure detected in Task 1 and Task 2. Limit your response to no more than 5 images and 600 words.
Data Source
Methodology
Application Libraries & Packages
Package Name | Descriptions |
---|---|
xlsx | R package for Excel file manipulation. |