IS428 AY2018-19T1 Gokarn Malika Nitin
Revision as of 05:34, 11 November 2018
Problem and Motivation
Air pollution is the single largest environmental health risk in Europe and an important risk factor across the rest of the world. Many metrics point to air pollution as a primary cause of disease (the deadliest of which include cancer) and death. For example, it is estimated that 7 million people across the world died prematurely due to air pollution. In the European Union, 400,000 people suffered a premature death.
The level of air pollution across the world is only increasing. Within the European Union, Bulgaria is one of the countries with the highest PM2.5 concentration in urban areas, averaged over three years. Bulgaria also leads the list of most polluted countries on the PM10 measure, with a daily mean concentration of 77 μg/m3, which is much higher than the WHO limit as well as the EU limit (50 μg/m3).
How clean the air is has become a major concern in Bulgaria. Measurements show that citizens all over the country breathe air that is considered harmful to health. The Organization for Economic Cooperation and Development (OECD) predicts that in 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide.
Therefore, the aim of this assignment is to reveal the spatiotemporal patterns of air quality and measurement techniques in Sofia City, Bulgaria, thereby identifying issues of concern.
Dataset Analysis and Transformation Process
Dataset Download
Four major data sets in zipped file format are used and are available below:
- Official air quality measurements (5 stations in the city) (EEA Data.zip) – as per EU guidelines on air quality monitoring; see the data description HERE…
- Citizen science air quality measurements (Air Tube.zip), incl. temperature, humidity and pressure (many stations) and topography (gridded data).
- Meteorological measurements (1 station)(METEO-data.zip): Temperature; Humidity; Wind speed; Pressure; Rainfall; Visibility
- Topography data (TOPO-DATA)
They can be downloaded by clicking on this link.
Dataset Cleaning and Transformation
Problem #1 | EEA Data Building Issues |
---|---|
Issue | The official air quality measurement readings (EEA data) do not include the longitude and latitude of the place of measurement. Instead, these are contained in a separate metadata file. Additionally, each station's recordings for a specific year are stored in a separate .csv file. |
Solution | Append all the files together through a Tableau Union. Exclude the data for station 9484 ("Orlov Most"), because its data from 2016 onwards is missing. I choose not to exclude the data for station 60881 ("Mladost"), solely because its data is more recent and the station can be considered a new addition to the station list. |
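The author performs this step with a Tableau Union. For readers who prefer a scripted equivalent, the same append-and-filter logic can be sketched in plain Python. This is an illustration only, not the method used in this assignment: the column names `station_id`, `latitude` and `longitude` are hypothetical placeholders for whatever the EEA files and the metadata file actually contain, the sample values are made up, and attaching the coordinates is an assumed follow-up step that the solution above does not spell out.

```python
import csv
import io

# Station 9484 ("Orlov Most") is dropped because its data from 2016 onwards is missing.
EXCLUDED_STATIONS = {"9484"}


def union_station_files(handles, station_column="station_id"):
    """Append the rows of several CSV files, skipping excluded stations.

    `handles` is any iterable of open text files. `station_column` is a
    hypothetical column name, not taken from the real EEA files.
    """
    combined = []
    for handle in handles:
        for row in csv.DictReader(handle):
            if row[station_column] not in EXCLUDED_STATIONS:
                combined.append(row)
    return combined


def attach_coordinates(rows, metadata_rows, station_column="station_id"):
    """Copy latitude/longitude from the metadata rows onto each reading."""
    coords = {
        meta[station_column]: (meta["latitude"], meta["longitude"])
        for meta in metadata_rows
    }
    enriched = []
    for row in rows:
        lat, lon = coords[row[station_column]]
        enriched.append({**row, "latitude": lat, "longitude": lon})
    return enriched


if __name__ == "__main__":
    # Tiny in-memory stand-ins for two yearly station files and the metadata
    # file. All values, including the coordinates, are made up.
    file_a = io.StringIO("station_id,pm10\n9484,41\n60881,35\n")
    file_b = io.StringIO("station_id,pm10\n60881,52\n")
    metadata = [{"station_id": "60881", "latitude": "42.0", "longitude": "23.0"}]
    for reading in attach_coordinates(union_station_files([file_a, file_b]), metadata):
        print(reading)
```

Keeping the exclusion list in one constant makes it easy to revisit the decision about which stations to drop without touching the append logic.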
Problem #2 | AirTube Data Building Issues |
---|---|
Issue | The citizen science air quality measurement readings (AirTube data) do not include the longitude and latitude of the place of measurement. Instead, the location is encoded as a geohash code. Unfortunately, Tableau is not built to handle geohash codes. |
Solution | Using the Python geohash2 library from GitHub [1], I am able to write a Python script that does the decoding for me, taking the error of the transformation into consideration as well. |
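The author's script relies on the geohash2 library and is not reproduced here. To show what the decoding involves, the following is a minimal standalone sketch of the public geohash algorithm in plain Python. The function name `decode_with_error` is made up for this illustration and is not part of geohash2. It returns the centre of the geohash cell together with the half-width of the cell in each direction, which is presumably the kind of "error of transformation" referred to above.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def decode_with_error(geohash):
    """Return (lat, lon, lat_err, lon_err) for a geohash string.

    Each character carries 5 bits. The bits alternately halve the longitude
    and latitude intervals, starting with longitude.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    refine_lon = True
    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError on an invalid character
        for mask in (16, 8, 4, 2, 1):
            target = lon_range if refine_lon else lat_range
            mid = (target[0] + target[1]) / 2
            if value & mask:
                target[0] = mid  # bit 1: keep the upper half of the interval
            else:
                target[1] = mid  # bit 0: keep the lower half of the interval
            refine_lon = not refine_lon
    lat = (lat_range[0] + lat_range[1]) / 2
    lon = (lon_range[0] + lon_range[1]) / 2
    lat_err = (lat_range[1] - lat_range[0]) / 2
    lon_err = (lon_range[1] - lon_range[0]) / 2
    return lat, lon, lat_err, lon_err


if __name__ == "__main__":
    # "ezs42" is a commonly used example geohash, not a value from the AirTube data.
    print(decode_with_error("ezs42"))
```

The decoded latitude and longitude can then be written out as two extra columns so that Tableau can plot the sensors directly.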
Problem #3 | AirTube Data Outliers and Noise Removal |
---|---|
Issue | The citizen science air quality measurement readings (AirTube data) contain multiple "wrong" readings, some of which are noise while others indicate broken sensors. A simple internet search shows that the lowest temperature ever recorded in Bulgaria is -38.3 degrees Celsius, while the highest is 45.2 degrees Celsius. |
Solution | To remove the noise and outliers, readings with a recorded temperature above 50 degrees Celsius or below -40 degrees Celsius are removed. |
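The temperature filter described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the author's actual workflow (the text does not say which tool performed the filtering), and the column name `temperature` and the sample readings are hypothetical.

```python
# Bounds from the text, set slightly wider than Bulgaria's recorded extremes
# (-38.3 and 45.2 degrees Celsius).
MIN_TEMP_C = -40.0
MAX_TEMP_C = 50.0


def remove_temperature_outliers(rows, column="temperature"):
    """Keep only rows whose temperature lies within [MIN_TEMP_C, MAX_TEMP_C].

    `column` is a hypothetical field name. Values are converted with float()
    so that readings loaded from a CSV file as strings also work.
    """
    return [row for row in rows if MIN_TEMP_C <= float(row[column]) <= MAX_TEMP_C]


if __name__ == "__main__":
    # Made-up readings: one plausible value and two obvious sensor errors.
    sample = [{"temperature": 21.5}, {"temperature": 63.0}, {"temperature": -145.0}]
    print(remove_temperature_outliers(sample))
```

Readings exactly at -40 or 50 degrees Celsius are kept, since the rule removes only values above or below those bounds.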
Task 1: Spatio-temporal Analysis of Official Air Quality
Characterize the past and most recent situation with respect to air quality measures in Sofia City. What does a typical day look like for Sofia city? Do you see any trends of possible interest in this investigation? What anomalies do you find in the official air quality dataset? How do these affect your analysis of potential problems in the environment?
Your submission for this question should contain no more than 10 images and 1000 words.
Task 2: Spatio-temporal Analysis of Citizen Science Air Quality Measurements
Using appropriate data visualisation, you are required to answer the following types of questions:
- Characterize the sensors’ coverage, performance and operation. Are they well distributed over the entire city? Are they all working properly at all times? Can you detect any unexpected behaviours of the sensors by analyzing the readings they capture? Limit your response to no more than 4 images and 600 words.
- Now turn your attention to the air pollution measurements themselves. Which part of the city shows relatively higher readings than others? Are these differences time-dependent? Limit your response to no more than 6 images and 800 words.
Task 3
Urban air pollution is a complex issue. There are many factors affecting the air quality of a city. Some of the possible causes are:
- Local energy sources. For example, according to Unmask My City, a global initiative by doctors, nurses, public health practitioners, and allied health professionals dedicated to improving air quality and reducing emissions in our cities, Bulgaria’s main sources of PM10, and fine particle pollution PM2.5 (particles 2.5 microns or smaller) are household burning of fossil fuels or biomass, and transport.
- Local meteorology such as temperature, pressure, rainfall, humidity, wind, etc.
- Local topography
- Complex interactions between local topography and meteorological characteristics.
- Transboundary pollution, for example, the haze that intruded into Singapore from our neighbours.
In this third task, you are required to reveal the relationships between the factors mentioned above and the air quality measures detected in Task 1 and Task 2. Limit your response to no more than 5 images and 600 words.
Software
- Tableau - for visualization of the various tasks
- Python - for decoding the geohash codes into latitude and longitude