IS428 AY2018-19T1 Le Thanh An
► Problem & Motivation
Air pollution is an important risk factor for health in Europe and worldwide. A recent review of the global burden of disease showed that it is one of the top ten risk factors for health globally. An estimated 7 million people worldwide die prematurely because of air pollution; in the European Union (EU), the figure is 400,000. The Organisation for Economic Cooperation and Development (OECD) predicts that by 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide. Air pollution has also been classified as the leading environmental cause of cancer.
Air quality in Bulgaria is a big concern: measurements show that citizens all over the country breathe in air that is considered harmful to health. For example, concentrations of PM2.5 and PM10 are much higher than what the EU and the World Health Organization (WHO) have set to protect health.
Bulgaria had the highest PM2.5 concentrations in urban areas of all EU-28 member states, based on a three-year average. For PM10, Bulgaria also tops the list of most polluted countries, with a daily mean concentration of 77 μg/m³ (the EU limit value is 50 μg/m³).
According to the WHO, 60 percent of the urban population in Bulgaria is exposed to dangerous (unhealthy) levels of particulate matter (PM10).
► Exploratory Data Analysis & Data Transformation
EEA Data
We are given 27 CSV files, along with one metadata xlsx file. Below are the columns present in the 27 CSV files and their descriptions.
Since there are 27 files, they have to be combined. I used Tableau's built-in Union function to do this.
And below are the columns present in the metadata and their descriptions.
The data in the CSV files also needs to be combined with the metadata. We do this in Tableau with an inner join on AirQualityStationEolCode.
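For readers who want to reproduce the union and the inner join outside Tableau, the two steps can be sketched in plain Python. This is only an illustration, not the workflow used in this project: the station codes, the extra column names and all values below are made up, and only the join key is taken from the text above.

```python
import csv
import io

KEY = "AirQualityStationEolCode"  # join key, spelled as in the text above

# Hypothetical stand-ins for two of the 27 CSV files (all values are made up).
station_files = [
    io.StringIO(f"{KEY},DatetimeBegin,Concentration\nSTATION_A,2018-01-01 00:00:00,41.2\n"),
    io.StringIO(f"{KEY},DatetimeBegin,Concentration\nSTATION_B,2018-01-01 00:00:00,77.0\n"),
]

# Union: append the rows of every file into one list.
readings = [row for f in station_files for row in csv.DictReader(f)]

# Hypothetical metadata keyed by station code (the real rows come from the xlsx file).
metadata = {"STATION_A": {"Latitude": "42.70", "Longitude": "23.30"}}

# Inner join: keep a reading only if its station code also exists in the metadata.
joined = [{**row, **metadata[row[KEY]]} for row in readings if row[KEY] in metadata]

for row in joined:
    print(row)
```

Because an inner join silently drops readings whose station code is absent from the metadata (STATION_B in this sketch), it is worth comparing the row counts before and after the join.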
Problem #1 | |
---|---|
Issue | Data are missing from Jan 2017 to Nov 2017. In addition, Orlov station has no data for 2016-2018, and within 2012-2015 it has gaps from July 2013 to September 2013 and from Feb 12 2015 to Feb 24 2015. Mladost station only has data for 2018. |
Solution | Example |
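
Gaps like the ones listed above can also be located programmatically. The sketch below is one possible approach, not necessarily how the gaps in this project were found: it assumes the readings of a single station are available as a list of timestamps, and `find_gaps`, its `max_step` parameter and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_step=timedelta(days=1)):
    """Return (last_seen, next_seen) pairs that are more than max_step apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_step]


# Hypothetical daily readings for one station, with a hole in the middle.
readings = [
    datetime(2015, 2, 10),
    datetime(2015, 2, 11),
    datetime(2015, 2, 25),
    datetime(2015, 2, 26),
]

gaps = find_gaps(readings)
for last_seen, next_seen in gaps:
    print(f"no data between {last_seen:%Y-%m-%d} and {next_seen:%Y-%m-%d}")
```

The reported pair marks the last reading before the hole and the first reading after it, so the missing days lie strictly between the two dates.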
Air Tube Data
Problem #2 | |
---|---|
Issue | Air Tube data uses geohashes to locate its sensors, but Tableau does not recognise a geohash as geospatial data. The geohashes need to be transformed into latitude and longitude for further visualisation. |
Solution | Example |
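
The geohash-to-coordinate conversion can be sketched in a few lines of Python. This is a minimal implementation of the standard geohash decoding scheme, not necessarily the tool used for this project; the function name `decode_geohash` is hypothetical, and the sample geohash `ezs42` is a commonly used textbook example rather than a value from the Air Tube data.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def decode_geohash(geohash):
    """Return the (latitude, longitude) at the centre of the geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


lat, lon = decode_geohash("ezs42")
print(round(lat, 3), round(lon, 3))  # roughly 42.605 -5.603
```

Each extra character narrows the cell by five more bits, so longer geohashes give more precise coordinates. The resulting latitude and longitude columns can then be given a geographic role in Tableau.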