IS428 2018-19 T1 Assign Fu Yu
Problem & Motivation
Air pollution is an important risk factor for health in Europe and worldwide. A recent review of the global burden of disease showed that it is one of the top ten risk factors for health globally. Worldwide, an estimated 7 million people die prematurely because of pollution; in the European Union (EU), 400,000 people suffer a premature death. The Organisation for Economic Cooperation and Development (OECD) predicts that by 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide. Air pollution has also been classified as the leading environmental cause of cancer.
Air quality in Bulgaria is a big concern: measurements show that citizens all over the country breathe air that is considered harmful to health. For example, concentrations of PM2.5 and PM10 are much higher than the limits the EU and the World Health Organization (WHO) have set to protect health.
Bulgaria had the highest PM2.5 concentrations in urban areas of all EU-28 member states over a three-year average. For PM10, Bulgaria also tops the list of most polluted countries, with a daily mean concentration of 77 μg/m3 (the EU limit value is 50 μg/m3).
According to the WHO, 60 percent of the urban population in Bulgaria is exposed to dangerous (unhealthy) levels of particulate matter (PM10).
Dataset Analysis & Transformation Process
Decode the geohash column in Air Tube data files
The geohash column encodes the station locations. However, Tableau cannot interpret a geohash as geographic data, so before the Air Tube data is imported into Tableau for analysis, each geohash needs to be decoded into geographical coordinates. Because the two Air Tube data files, data_bg_2017.xlsx and data_bg_2018.xlsx, are large and contain many duplicate geohash values, an Excel file containing a list of the unique geohashes was created first.
Step 1: Use the "pygeohash" package to decode the geohash list and output the coordinates to an Excel file.
Step 2: Combine the geohash list and the coordinates list into one Excel file, then fill in the coordinates, latitude and longitude columns in data_bg_2017.xlsx and data_bg_2018.xlsx using the VLOOKUP, LEFT and RIGHT functions in Excel.
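The two steps above can be sketched in Python as follows. This is only an illustration, not the exact script used for the assignment: it decodes geohashes with a small standard-library function instead of pygeohash so that it runs without installing anything (with the package installed, `pygeohash.decode` should play the same role as `decode_geohash` here). The function names, the `rows` sample values and the dictionary lookup standing in for VLOOKUP are all assumptions made for the example.

```python
# Standard geohash base-32 alphabet (the letters a, i, l and o are not used).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return (latitude, longitude) of the centre of the geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # 5 bits per character, most significant first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def build_lookup(geohashes):
    """Step 1: decode each unique geohash once (hypothetical helper)."""
    return {g: decode_geohash(g) for g in set(geohashes)}


# Made-up sample of a geohash column containing a duplicate value.
rows = ["ezs42", "sx8dfs", "ezs42"]

# Step 2: the dictionary lookup plays the role of VLOOKUP in Excel.
lookup = build_lookup(rows)
coords = [lookup[g] for g in rows]

for geohash, (lat, lon) in zip(rows, coords):
    print(geohash, round(lat, 4), round(lon, 4))
```

Decoding only the unique geohashes and then looking each row up mirrors the reason given above for building a unique geohash list: the expensive step runs once per station location rather than once per measurement.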