Difference between revisions of "IS428 AY2018-19T1 Zheng Bingbing"

From Visual Analytics for Business Intelligence

Revision as of 00:38, 12 November 2018

Background & Motivation

Air pollution is an important risk factor for health in Europe and worldwide. A recent review of the global burden of disease showed that it is one of the top ten risk factors for health globally. Worldwide an estimated 7 million people died prematurely because of pollution; in the European Union (EU) 400,000 people suffer a premature death. The Organisation for Economic Cooperation and Development (OECD) predicts that in 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide. In addition, air pollution has also been classified as the leading environmental cause of cancer.

"In Sofia, air pollution norms were exceeded 70 times in the heating period from October 2017 to March 2018, citizens’ initiative AirBG.info says. The day with the worst air pollution in Sofia was January 27, when the norm was exceeded six times over. Things got so out of control that even the European Court of Justice ruled against Bulgaria in a case brought by the European Commission against the country over its failure to implement measures to reduce air pollution. The two main reasons for the air pollution are believed to be solid fuel heating and motor vehicle traffic." -- Datathon

This project aims to create an interactive dashboard to examine the air quality in Sofia from the year 2013 to 14 Sept 2018.

PM Rate Classification Table (Europe)

Air index.png


Dataset Analysis & Transformation

Official air quality measurements

Problem #1 Data merging is required for easier processing in Tableau.
Issue Manual processing is not convenient for handling a large number of records. An alternative method of merging the data is required.
Solution
Dataclean1.png.png

Tableau Prep is used to merge the data. It provides dynamic views of the output records and helps detect any unusual behavior before the visualizations are built.
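For readers without Tableau Prep, the same union step can be approximated in a few lines of Python. This is only a minimal sketch and not the workflow used in this project: the column names (DatetimeBegin, Concentration), the sample values, and the function name merge_csv_sources are all assumptions made for illustration.

```python
import csv
import io


def merge_csv_sources(sources):
    """Concatenate rows from several CSV file-like objects that share one header.

    `sources` is a list of (label, file_object) pairs. Each output row gets an
    extra "source" field so unusual records can be traced back to their file.
    """
    merged = []
    header = None
    for label, handle in sources:
        reader = csv.DictReader(handle)
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError(f"{label}: header differs from the first file")
        for row in reader:
            row["source"] = label
            merged.append(row)
    return merged


# In-memory stand-ins for two yearly files (hypothetical columns and values).
file_2016 = io.StringIO("DatetimeBegin,Concentration\n2016-01-01 00:00:00,41.2\n")
file_2017 = io.StringIO("DatetimeBegin,Concentration\n2017-01-01 00:00:01,37.5\n")

rows = merge_csv_sources([("2016", file_2016), ("2017", file_2017)])
for row in rows:
    print(row)
```

Raising an error on a header mismatch is a deliberate choice here: it surfaces structural differences between files before any visualization is built.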

Problem #2 Inconsistency of data record time
Issue Records from 2017 use timestamps such as "2017-01-01 01:00:01", ending with ":01", whereas the other records end with ":00".
Solution
Dataclean2.png

Since only the hour will be used as the measure, the extra second has no effect on the result. Furthermore, all the DatatimeEnd values end with ":00". No action is required.
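The reasoning above can be checked with a small sketch: once a timestamp is truncated to the hour, a record ending in ":01" and one ending in ":00" fall into the same bucket. The function name hour_of_record is hypothetical, and the timestamp format is assumed from the example shown in the issue.

```python
from datetime import datetime


def hour_of_record(timestamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and truncate it to the hour."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(minute=0, second=0)


with_extra_second = hour_of_record("2017-01-01 01:00:01")
without_extra_second = hour_of_record("2017-01-01 01:00:00")

# Both records land in the same hourly bucket.
print(with_extra_second == without_extra_second)
```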

Problem #3 Missing data points
Issue The year 2017 has many months without data points. Furthermore, the year 2018 only has records up to September 14.
Solution
Dataclean4.png

In order to keep an overall view of the data, the data points from 2017 and 2018 will be kept for the analysis.
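A gap check like the one described above could be sketched as follows. This is an illustrative assumption, not the method used in the project (which relied on Tableau Prep views): the function name months_without_data is hypothetical and the sample timestamps are made up, not taken from the dataset.

```python
from collections import Counter
from datetime import datetime


def months_without_data(timestamps, year):
    """Return the month numbers (1-12) of `year` that have no records."""
    records_per_month = Counter()
    for ts in timestamps:
        parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        if parsed.year == year:
            records_per_month[parsed.month] += 1
    return [month for month in range(1, 13) if records_per_month[month] == 0]


# Made-up sample timestamps, only to exercise the function.
sample = [
    "2017-01-01 01:00:01",
    "2017-11-28 13:00:01",
    "2018-09-14 00:00:00",
]

missing_2017 = months_without_data(sample, 2017)
print(missing_2017)
```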

Citizen science air quality measurements (Airtube)

Problem #1 Geohash needs to be converted into latlong
Issue Tableau does not recognize Geohash codes, so the Geohash must be converted into latlong.
Solution
Dataclean3.png

The R file provided by Dr. KAM is used to convert the geohash into latlong.
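The R file itself is not reproduced here. As a rough illustration of what such a conversion does, the standard geohash decoding can be sketched in Python: each character contributes five bits, which alternately halve the longitude and latitude intervals, and the centre of the final cell is returned. The function name decode_geohash is an assumption, and this sketch is not the provided R code.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def decode_geohash(code):
    """Decode a geohash string to the (latitude, longitude) of its cell centre."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for char in code.lower():
        value = BASE32.index(char)  # raises ValueError on invalid characters
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


# "ezs42" is a commonly used test geohash (it is not from the Sofia data).
lat, lon = decode_geohash("ezs42")
print(round(lat, 3), round(lon, 3))
```

The resulting latitude and longitude columns can then be assigned geographic roles in Tableau.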

Problem #2 Records with sensor data outside of Sofia City
Issue Points outside of the Sofia City region need to be removed.
Solution
Dataclean3.png
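One simple way to drop such points is a bounding-box filter on the decoded coordinates. The sketch below is an assumption about how this could be done, not the method shown in the screenshot. The box limits are rough, hypothetical values around Sofia and would need to be replaced with the actual city boundary. The sample points are also made up.

```python
# Hypothetical bounding box roughly around Sofia; NOT an official boundary.
SOFIA_BOX = {"lat_min": 42.60, "lat_max": 42.80, "lon_min": 23.20, "lon_max": 23.50}


def inside_box(lat, lon, box=SOFIA_BOX):
    """Return True if the point lies within the rectangular box (inclusive)."""
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


def keep_points_in_box(points, box=SOFIA_BOX):
    """Keep only (lat, lon) pairs that fall inside the box."""
    return [(lat, lon) for lat, lon in points if inside_box(lat, lon, box)]


# Made-up sensor positions: one near central Sofia, one far to the east.
points = [(42.70, 23.32), (43.20, 27.90)]
kept = keep_points_in_box(points)
print(kept)
```

A rectangle is only a coarse approximation of a city boundary. A polygon test against the real administrative outline would be more accurate.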


Meteorological measurements

Topography data

Interactive Dashboard Design

Detail Analysis

Task 1: Spatio-temporal Analysis of Official Air Quality

Task 2: Spatio-temporal Analysis of Citizen Science Air Quality Measurements

Task 3:

References

[1] Datathon Air Sofia Case : https://www.datasciencesociety.net/the-telelink-case-one-step-closer-to-a-better-air-quality-and-city/
[2] Tableau Training Library : https://www.tableau.com/learn/training
[3] Air quality index : https://en.wikipedia.org/wiki/Air_quality_index