IS428 AY2018-19T1 Zheng Bingbing

From Visual Analytics for Business Intelligence

Background & Motivation

Air pollution is an important risk factor for health in Europe and worldwide. A recent review of the global burden of disease showed that it is one of the top ten risk factors for health globally. Worldwide, an estimated 7 million premature deaths are attributed to air pollution, 400,000 of them in the European Union (EU). The Organisation for Economic Co-operation and Development (OECD) predicts that by 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide. In addition, air pollution has been classified as the leading environmental cause of cancer.

"In Sofia, air pollution norms were exceeded 70 times in the heating period from October 2017 to March 2018, citizens’ initiative AirBG.info says. The day with the worst air pollution in Sofia was January 27, when the norm was exceeded six times over. Things got so out of control that even the European Court of Justice ruled against Bulgaria in a case brought by the European Commission against the country over its failure to implement measures to reduce air pollution. The two main reasons for the air pollution are believed to be solid fuel heating and motor vehicle traffic." -- Datathon

This project aims to create an interactive dashboard for examining the air quality in Sofia from 2013 to 14 September 2018.

PM Rate Classification Table (Europe)

Air index.png


Dataset Analysis & Transformation

Official air quality measurements

Problem #1 The data files must be merged before they can be processed easily in Tableau.
Issue Manual processing is not convenient for a large number of records. An alternative method of merging the data is required.
Solution
Dataclean1.png.png

Tableau Prep is used to merge the data. It provides dynamic views of the output records, which makes it possible to detect unusual behaviour before building the visualizations.
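For readers without Tableau Prep, the same kind of merge can be approximated in a few lines of Python. This is only a minimal sketch, not the workflow used in this project: the column names (`timestamp`, `pm10`) and the two in-memory "files" are hypothetical stand-ins for the yearly data files.

```python
import csv
import io


def merge_csv_sources(sources):
    """Concatenate the rows of several CSV sources that share one header."""
    header = None
    merged = []
    for source in sources:
        reader = csv.reader(source)
        current_header = next(reader, None)
        if current_header is None:
            continue  # skip empty sources
        if header is None:
            header = current_header
        elif current_header != header:
            # Refuse to merge files whose columns differ.
            raise ValueError("CSV sources do not share the same header")
        merged.extend(row for row in reader if row)
    return header, merged


# Hypothetical stand-ins for two yearly files (column names are assumptions).
file_2017 = io.StringIO("timestamp,pm10\n2017-01-01 01:00:01,38.2\n")
file_2018 = io.StringIO("timestamp,pm10\n2018-01-01 01:00:00,41.7\n")

header, rows = merge_csv_sources([file_2017, file_2018])
```

In practice the `io.StringIO` objects would be replaced by opened files; the header check is a cheap way to catch a file with a different layout before it reaches the dashboard.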

Problem #2 Inconsistent record timestamps
Issue The 2017 records use start times such as "2017-01-01 01:00:01", ending with ":01", whereas the records of the other years end with ":00".
Solution
Dataclean2.png

Since only the hour is used as the measure, the extra second has no effect on the result. Furthermore, all DatatimeEnd values end with ":00". No action is required.
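The reasoning can be checked with a short sketch: once a timestamp is truncated to the hour, the ":01" and ":00" variants become identical. The helper name `to_hour` is hypothetical, and the timestamp format is assumed from the example quoted in the issue.

```python
from datetime import datetime


def to_hour(timestamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and truncate it to the hour."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(minute=0, second=0)


# The 2017 style (":01") and the style of the other years (":00")
# fall into the same hourly bucket.
same_hour = to_hour("2017-01-01 01:00:01") == to_hour("2017-01-01 01:00:00")
```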

Problem #3 Missing data points
Issue The year 2017 has many months without data points, and the 2018 records only run until 14 September.
Solution
Dataclean4.png

To preserve an overview of air pollution in Sofia City, the data points from 2017 and 2018 are kept for the analysis despite these gaps.

Citizen science air quality measurements (Airtube)

Problem #1 Geohash codes must be converted into latitude and longitude
Issue Tableau does not recognize Geohash codes, so the Geohash values must be transformed into latitude and longitude.
Solution
Dataclean3.png

The R file provided by Dr. Kam is used to convert the Geohash codes into latitude and longitude.
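To illustrate what such a conversion does, the following is a minimal Python sketch of the standard Geohash decoding algorithm. It is not the R file mentioned above; the function name `decode_geohash` is hypothetical, and the result is the centre of the cell that the code describes.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def decode_geohash(geohash):
    """Return the (latitude, longitude) of the centre of a Geohash cell."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError for invalid characters
        for shift in range(4, -1, -1):  # five bits per character, high bit first
            bit = (value >> shift) & 1
            rng = lon_range if is_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half of the interval
            else:
                rng[1] = mid  # keep the lower half of the interval
            is_lon = not is_lon
    lat = (lat_range[0] + lat_range[1]) / 2
    lon = (lon_range[0] + lon_range[1]) / 2
    return lat, lon


lat, lon = decode_geohash("ezs42")
```

Each additional character halves the cell several more times, so longer codes give more precise coordinates; a decoded point should be read as the middle of a small rectangle rather than an exact sensor position.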

Problem #2 Records with sensor data from outside Sofia City
Issue Points outside the Sofia City region must be removed.
Solution
Dataclean5.png

These points are excluded during processing.
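One simple way to express such an exclusion is a bounding-box test on the decoded coordinates. The sketch below is only an illustration: the box limits in `SOFIA_BOUNDS` are rough, assumed values and not the boundary used in this project, and the function names are hypothetical.

```python
# Illustrative bounding box only (assumed values, not the project's boundary).
SOFIA_BOUNDS = {"lat_min": 42.60, "lat_max": 42.80, "lon_min": 23.20, "lon_max": 23.50}


def in_bounds(lat, lon, bounds=SOFIA_BOUNDS):
    """Return True when the point lies inside the bounding box."""
    return (
        bounds["lat_min"] <= lat <= bounds["lat_max"]
        and bounds["lon_min"] <= lon <= bounds["lon_max"]
    )


def keep_points_in_bounds(points, bounds=SOFIA_BOUNDS):
    """Keep only the (lat, lon) pairs that fall inside the bounding box."""
    return [point for point in points if in_bounds(point[0], point[1], bounds)]


# One point inside the assumed box and one far outside it.
sample_points = [(42.70, 23.32), (43.21, 27.91)]
kept = keep_points_in_bounds(sample_points)
```

A rectangle is a coarse approximation of a city boundary; a polygon test against the actual administrative boundary would be more precise.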

Interactive Dashboard Design

Detail Analysis

Task 1: Spatio-temporal Analysis of Official Air Quality

Characterise the past and most recent situation with respect to air quality measures in Sofia City.
B1.1.png

A calendar chart is used to classify both the past and the present air quality of Sofia City. The colour classification follows the Common Air Quality Index (CAQI) used in Europe, with the following settings:
PM10 concentration 0-25: Good (Light Green)
PM10 concentration 26-50: Fair (Green)
PM10 concentration 51-75: Moderate (Yellow)
PM10 concentration 76-100: Poor (Orange)
PM10 concentration >100: Very Poor (Red)
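The colour bands above can be written as a small lookup function. This is a sketch under one stated assumption: the listed ranges are integers, so a fractional value such as 25.5 is assigned here to the next band up ("Fair"). The names `PM10_BANDS` and `classify_pm10` are hypothetical.

```python
# Upper limit of each band (inclusive) and its label, taken from the list above.
PM10_BANDS = [(25, "Good"), (50, "Fair"), (75, "Moderate"), (100, "Poor")]


def classify_pm10(concentration):
    """Map a PM10 concentration to its colour band label."""
    if concentration < 0:
        raise ValueError("concentration must not be negative")
    for upper_limit, label in PM10_BANDS:
        if concentration <= upper_limit:
            return label
    return "Very Poor"  # anything above 100


labels = [classify_pm10(value) for value in (12, 25, 25.5, 80, 140)]
```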

B1.2.png

The majority of weeks in past years were classified as "Fair". In recent years, more of the weeks previously classified as "Fair" have moved to "Good".

B1.3.png

In 2018, the most recent year, the average weekly PM10 concentration has improved. More weeks are classified as "Good", meaning a concentration of no more than 25 µg/m³.
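A weekly average of this kind can be sketched as follows. The dashboard itself computes its aggregates in Tableau, so this is only an illustration: grouping by ISO week is an assumption, and the sample readings are invented for the example.

```python
from collections import defaultdict
from datetime import date


def weekly_means(daily_values):
    """Average (date, value) pairs per (ISO year, ISO week)."""
    buckets = defaultdict(list)
    for day, value in daily_values:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(value)
    return {week: sum(values) / len(values) for week, values in buckets.items()}


# Invented sample readings, for illustration only.
sample_readings = [
    (date(2018, 1, 1), 30.0),
    (date(2018, 1, 2), 20.0),
    (date(2018, 1, 8), 60.0),
]
means = weekly_means(sample_readings)
```

Each weekly mean can then be compared with the band limits listed earlier to decide the colour of the corresponding cell in the calendar chart.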

B1.4.png

The graph above shows that Sofia usually has its poorest air quality from December to February. Following recent efforts to reduce PM10 pollution, the PM10 concentration in December 2017 is visibly lower. No weeks in February 2018 are classified as "Very Poor", whereas such weeks did occur in previous years. The falling number of weeks classified as "Very Poor" indicates that Sofia City is improving its air quality.

A typical day in Sofia
B2.1.png


Anomalies found in the official air quality dataset
B3.1.png



Task 2: Spatio-temporal Analysis of Citizen Science Air Quality Measurements

Task 3: Analysis of Factors Affecting Sofia Air Quality

References

[1] Datathon Air Sofia Case : https://www.datasciencesociety.net/the-telelink-case-one-step-closer-to-a-better-air-quality-and-city/
[2] Tableau Training Library : https://www.tableau.com/learn/training
[3] Air quality index : https://en.wikipedia.org/wiki/Air_quality_index