IS428 AY2018-19T1 Rainean Young Calubad

From Visual Analytics for Business Intelligence
Revision as of 23:29, 11 November 2018 by Rycalubad.2016 (talk | contribs)

To be a Visual Detective


Overview

Air pollution is an important risk factor for health in Europe and worldwide. A recent review of the global burden of disease showed that it is one of the top ten risk factors for health globally. Worldwide, an estimated 7 million people died prematurely because of pollution; in the European Union (EU), 400,000 people suffered a premature death. The Organisation for Economic Co-operation and Development (OECD) predicts that in 2050 outdoor air pollution will be the top cause of environmentally related deaths worldwide. Air pollution has also been classified as the leading environmental cause of cancer.

Air quality in Bulgaria is a big concern: measurements show that citizens all over the country breathe air that is considered harmful to health. For example, concentrations of PM2.5 and PM10 are much higher than the limits that the EU and the World Health Organization (WHO) have set to protect health.

Bulgaria had the highest PM2.5 concentrations in urban areas of all EU-28 member states, measured as a three-year average. For PM10, Bulgaria also tops the list of most polluted countries, with a daily mean concentration of 77 μg/m3 (the EU limit value is 50 μg/m3).

According to the WHO, 60 percent of the urban population in Bulgaria is exposed to dangerous (unhealthy) levels of particulate matter (PM10).

Task 1: Spatio-temporal Analysis of Official Air Quality

Questions:


Characterize the past and most recent situation with respect to air quality measures in Sofia City. What does a typical day look like for Sofia city? Do you see any trends of possible interest in this investigation? What anomalies do you find in the official air quality dataset? How do these affect your analysis of potential problems to the environment?


Data Cleaning & Transformation:


To investigate the above problems, we need to clean the data and then visualise the dataset. The cleaning process involves:

  • Combining the datasets for all years into one Excel file
  • Linking the dataset with the metadata to retrieve the longitude and latitude of each air quality station
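The two cleaning steps above were done outside of code, but the same idea can be sketched in Python. This is only an illustration: the field names (`station`, `date`, `pm10`, `latitude`, `longitude`), the station codes and the sample values are assumptions, not taken from the actual files.

```python
def combine_years(yearly_tables):
    """Concatenate per-year lists of reading rows into one list, tagging each row with its year."""
    combined = []
    for year, rows in sorted(yearly_tables.items()):
        for row in rows:
            combined.append({**row, "year": year})
    return combined


def attach_coordinates(readings, stations):
    """Left-join the readings to the station metadata on the station code."""
    by_code = {s["station"]: s for s in stations}
    linked = []
    for row in readings:
        meta = by_code.get(row["station"], {})  # unknown stations keep None coordinates
        linked.append({**row,
                       "latitude": meta.get("latitude"),
                       "longitude": meta.get("longitude")})
    return linked


# Made-up sample rows (hypothetical station codes and values).
yearly = {
    2013: [{"station": "ST_A", "date": "2013-01-01", "pm10": 61.0}],
    2014: [{"station": "ST_B", "date": "2014-01-01", "pm10": 48.5}],
}
stations = [{"station": "ST_A", "latitude": 42.70, "longitude": 23.32}]

linked = attach_coordinates(combine_years(yearly), stations)
```

Rows whose station is missing from the metadata keep `None` coordinates instead of being dropped, which makes it easy to spot stations that cannot be placed on the map.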


Dashboard:


To investigate the above problem, I have created two dashboards as follows:


Dashboard 1:

This dashboard uses data from 2013 to 2017, which contains the daily air quality readings at the different stations.

This dashboard allows the users to:

  1. See the daily fluctuations in the air pollutants, and
  2. Drill down and filter the data by Year and by Air Quality Station
Screenshot 2018-11-11 at 8.59.02 PM.png


Dashboard 2:

This dashboard uses data from 2018, which contains the hourly air quality readings.

This dashboard allows the users to:

  1. See the hourly and daily fluctuations in the air pollutants, and
  2. Compare the air pollutant readings across different air quality stations


Dashboard 2.2 - long.png
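The hourly and daily views in this dashboard correspond to two simple aggregations of the hourly readings. The sketch below shows them in Python under stated assumptions: readings are given as (ISO timestamp, value) pairs, and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(readings):
    """Average the readings by hour of day (0-23) to sketch a 'typical day'."""
    by_hour = defaultdict(list)
    for timestamp, value in readings:
        by_hour[datetime.fromisoformat(timestamp).hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def daily_means(readings):
    """Roll the hourly readings up to one mean value per calendar day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[datetime.fromisoformat(timestamp).date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Made-up hourly readings (timestamp, value), for illustration only.
sample = [
    ("2018-01-01T08:00:00", 60.0),
    ("2018-01-01T20:00:00", 90.0),
    ("2018-01-02T08:00:00", 40.0),
    ("2018-01-02T20:00:00", 70.0),
]

profile = hourly_profile(sample)  # one mean per hour of day
per_day = daily_means(sample)     # one mean per date
```

The hour-of-day profile is one way to describe what a typical day looks like, while the per-day means can be compared with the daily values from the 2013 to 2017 data.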

Visualisation and Insights:




Task 2: Spatio-temporal Analysis of Citizen Science Air Quality Measurements


Questions:


Characterize the sensors’ coverage, performance and operation. Are they well distributed over the entire city? Are they all working properly at all times? Can you detect any unexpected behaviors of the sensors through analyzing the readings they capture?
Now turn your attention to the air pollution measurements themselves. Which part of the city shows relatively higher readings than others? Are these differences time dependent?


Data Cleaning & Transformation:


To investigate the above problems, we need to clean the data and then visualise the dataset. The cleaning process involves:

  • Combining the datasets for all years into one Excel file
  • Linking the dataset with the metadata to retrieve the longitude and latitude of each air quality station
  • Filtering the data to keep only records from Sofia City


GroupingSofiaCity.png

To show only data from Sofia City, I used the Lasso tool in Tableau to select the geohashes within Sofia City and put them in a group. I then used this group as a filter to keep only the values for Sofia City and remove the records for neighboring cities.
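The grouping above was done by hand with the Lasso tool. As a rough programmatic alternative (not the method used for this dashboard), each geohash can be decoded to its cell center and kept only if it falls inside a bounding box. The box limits in `SOFIA_BOX` and the `geohash` field name are assumptions chosen for illustration; a rectangle is also cruder than a lasso selection.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

# Hypothetical bounding box around Sofia City: (min_lat, max_lat, min_lon, max_lon).
SOFIA_BOX = (42.60, 42.80, 23.20, 23.45)


def decode_geohash(geohash):
    """Return the (latitude, longitude) of the center of a geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for char in geohash.lower():
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
            else:
                mid = (lat_lo + lat_hi) / 2
                lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def in_box(geohash, box=SOFIA_BOX):
    """True if the center of the geohash cell lies inside the bounding box."""
    lat, lon = decode_geohash(geohash)
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


# Made-up rows: one geohash inside the box, one far outside it.
rows = [{"geohash": "sx8df", "P1": 40.0}, {"geohash": "ezs42", "P1": 12.0}]
sofia_rows = [row for row in rows if in_box(row["geohash"])]
```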



Dashboard 1:


Task2Dashboard1.png

This dashboard uses data from 2017 to 2018 to show the P1 and P2 measurements for the various parts of Sofia City in Bulgaria.

This dashboard allows the users to:

  1. See in which parts of Sofia City the sensors are located
  2. See which sensors record the most and the least amount of data
  3. See how the sensors are performing by looking at their average pressure, temperature and humidity readings


Visualisation 1:
Image:

Task2Dashboard1Viz1.png


Description:

This visualization is a symbol map that shows the distribution of sensors across Sofia City and the number of records measured by each sensor. Each circle marks a sensor in that part of the city, and the size of the circle shows how many measurements the sensor recorded.

The user would be able to:

  • See in which part(s) of Sofia City most sensors are located
  • Compare the number of measurements recorded by each of the sensors in Sofia City

This would allow us to answer how well the sensors cover Sofia City and how well each one is operating.


Insights:

From the visualisation, we have the following insights:

  1. Most sensors are located in the center of Sofia City.
  2. A few sensors are located at the edges of Sofia City.
  3. Zooming in further, we can see that more than half of the sensors have recorded a large and roughly equal amount of data. However, many sensors recorded far less data.
  4. It is interesting to note that the sensors with only a few records are in the same places as sensors that recorded a lot of data. One possible explanation is that the sensors with few records malfunctioned or broke down and were replaced.
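The comparison of record counts behind insights 3 and 4 can also be sketched in code: count the records per sensor and flag sensors whose count is far below the typical count. The `sensor` field name, the sensor IDs and the 25% cut-off are assumptions for illustration, not values derived from the dataset.

```python
from collections import Counter
from statistics import median


def records_per_sensor(rows):
    """Count how many readings each sensor contributed."""
    return Counter(row["sensor"] for row in rows)


def flag_sparse_sensors(counts, fraction=0.25):
    """Sensors whose record count is below `fraction` of the median count."""
    typical = median(counts.values())
    return sorted(s for s, n in counts.items() if n < fraction * typical)


# Made-up sample: two sensors with many records and one with very few.
rows = (
    [{"sensor": "S1"}] * 100
    + [{"sensor": "S2"}] * 90
    + [{"sensor": "S3"}] * 5
)
counts = records_per_sensor(rows)
sparse = flag_sparse_sensors(counts)
```

Sensors flagged this way are only candidates for inspection; a low count may mean a sensor was installed late or replaced, rather than broken.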


Visualisation 2:
Image:

Task2Dashboard1Viz2.png


Description:

This visualization shows three line graphs of the average pressure, temperature and humidity in Sofia City over time, from 2017 to 2018.

The user would be able to:

  • Compare the average measurements for pressure, temperature and humidity through time.

This would allow us to answer how well the sensors in Sofia City are performing and operating.


Insights:

From the visualisation, we have the following insights:

  1. In February, the sensors failed to take measurements for temperature, pressure and humidity for one day.
  2. The sensors failed again to take measurements for pressure from March 30 to April 1. This is unexpected, because they were able to take measurements for temperature and humidity on the same days.
  3. Filtering the visualization by month, we can see a trend where the average temperature and average humidity always move in opposite directions: when the average temperature rises, the average humidity drops, and vice versa.
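Both kinds of insight above, days without measurements and the opposite movement of temperature and humidity, can be checked numerically. The sketch below is a minimal illustration that assumes daily averages are already available; all dates and values are made up.

```python
from datetime import date, timedelta
from math import sqrt


def missing_days(daily_means, start, end):
    """Dates in [start, end] that have no daily mean (absent or None)."""
    gaps = []
    current = start
    while current <= end:
        if daily_means.get(current) is None:
            gaps.append(current)
        current += timedelta(days=1)
    return gaps


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up daily pressure averages with a two-day gap.
pressure = {
    date(2018, 1, 1): 94010.0,
    date(2018, 1, 2): None,
    date(2018, 1, 4): 94120.0,
}
gaps = missing_days(pressure, date(2018, 1, 1), date(2018, 1, 4))

# Made-up daily averages: temperature rises while humidity falls.
temperature = [1.0, 3.0, 5.0, 8.0]
humidity = [85.0, 78.0, 70.0, 55.0]
r = pearson(temperature, humidity)  # negative when the series move in opposite directions
```

Running `missing_days` separately for pressure, temperature and humidity would show cases where only one of the three measurements is missing on a given day.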


Dashboard 2:

Task2Dashboard2.png

This dashboard uses data from 2017 to 2018 to show the P1 and P2 measurements for the various parts of Sofia City in Bulgaria.

This dashboard allows the users to:

  1. See the pollutant concentrations in the different parts of Sofia City
  2. See how the pollutant concentrations in the different parts of Sofia City change with respect to the day of the month


Visualisation 1:
Image:

Task2Dashboard2Viz1.png


Description:

This visualization is a symbol map that shows the concentration of P1 in the various parts of Sofia City. The average concentration of P1 is encoded by the color and size of each symbol (circle).

The user would be able to:

  • Determine which parts of Sofia city have high P1 concentration
  • Filter by year and hour to see how time affects the P1 concentration

This would allow us to answer the question on which parts of Sofia city have high concentrations of P1.


Insights:

From the visualisation, we have the following insights:

  1. Most average P1 concentrations are around 50, which is about half of the highest average value.
  2. High P1 concentrations (>100) appear only in the center of Sofia City and at its upper right edge.


Visualisation 2:
Image:

Task2Dashboard2Viz2.png


Description:

This visualization is a symbol map that shows the concentration of P2 in the various parts of Sofia City. The average concentration of P2 is encoded by the color and size of each symbol (circle).

The user would be able to:

  • Determine which parts of Sofia city have high P2 concentration
  • Filter by year and hour to see how time affects the P2 concentration

This would allow us to answer the question on which parts of Sofia city have high concentrations of P2.


Insights:

From the visualisation, we have the following insights:

  1. Most average P2 concentrations are around 30, which is about half of the highest average value.
  2. The center of Sofia City is the only area with high P2 concentrations (>70).
  3. The parts of Sofia City with high P2 concentrations are the same parts of Sofia City with high P1 concentrations.

Visualisation 3:
Image:

Task2Dashboard2Viz3.png


Description:

This visualization is a heatmap of each sensor in Sofia City, showing how its P1 concentration changes by the day of the month.

The user would be able to:

  • Determine which day(s) of the month have the highest P1 concentration and which days have the lowest
  • See if there is any relationship between the differences in P1 concentration in the parts of Sofia city and time.
  • See if time changes the concentration of P1 for the different parts of Sofia City.

This would allow us to answer the question on whether the differences in P1 concentrations in Sofia City are time dependent.


Insights:

From the visualisation, we have the following insights:

  1. The differences in P1 concentrations appear to be time dependent: not all days have high P1 concentrations, and on most days the column shows a single color, which means that the P1 concentration hardly differs between the parts of Sofia City on those days.
  2. Selecting some of the sensors with the highest average P1 concentration shows that they have high P1 concentrations on the same days of the month, the 8th and the 26th. The image below shows this.
Task2Dashboard2Viz3-1.png
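The heatmap is essentially a grid of mean readings per sensor and day of the month. A minimal sketch of that aggregation is given below; the field names (`sensor`, `time`, `P1`) and all sample values are assumptions for illustration, and `peak_days` is a hypothetical helper for listing the days on which one sensor peaks.

```python
from collections import defaultdict
from datetime import datetime


def mean_grid(rows, value_key="P1"):
    """Mean reading for every (sensor, day-of-month) cell of the heatmap."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum, count]
    for row in rows:
        day = datetime.fromisoformat(row["time"]).day
        cell = totals[(row["sensor"], day)]
        cell[0] += row[value_key]
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def peak_days(grid, sensor, top=2):
    """Days of the month with the highest mean reading for one sensor."""
    days = {day: value for (name, day), value in grid.items() if name == sensor}
    return sorted(sorted(days, key=days.get, reverse=True)[:top])


# Made-up readings for two sensors.
rows = [
    {"sensor": "A", "time": "2018-01-08T10:00:00", "P1": 100.0},
    {"sensor": "A", "time": "2018-01-08T11:00:00", "P1": 120.0},
    {"sensor": "A", "time": "2018-01-09T10:00:00", "P1": 20.0},
    {"sensor": "A", "time": "2018-01-26T10:00:00", "P1": 90.0},
    {"sensor": "B", "time": "2018-01-08T10:00:00", "P1": 95.0},
]
grid = mean_grid(rows)
top_days_a = peak_days(grid, "A")
```

Passing `value_key="P2"` gives the same grid for P2, provided the rows carry a `P2` field.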


Visualisation 4:
Image:

Task2Dashboard2Viz4.png


Description:

This visualization is a heatmap of each sensor in Sofia City, showing how its P2 concentration changes by the day of the month.

The user would be able to:

  • Determine which day(s) of the month have the highest P2 concentration and which days have the lowest
  • See if there is any relationship between the differences in P2 concentration in the parts of Sofia city and time.
  • See if time changes the concentration of P2 for the different parts of Sofia City.

This would allow us to answer the question on whether the differences in P2 concentrations in Sofia City are time dependent.


Insights:

From the visualisation, we have the following insights:

  1. The differences in P2 concentrations appear to be time dependent: not all days have high P2 concentrations, and on most days the column shows a single color, which means that the P2 concentration hardly differs between the parts of Sofia City on those days.
  2. There are days when the concentration of P2 is high in all parts of Sofia City, as shown by the dark color of the whole column on day 26.



Task 3: Air Quality Measure Analysis

Urban air pollution is a complex issue. There are many factors affecting the air quality of a city. Some of the possible causes are:

  1. Local energy sources. For example, according to Unmask My City, a global initiative by doctors, nurses, public health practitioners, and allied health professionals dedicated to improving air quality and reducing emissions in our cities, Bulgaria’s main sources of PM10, and fine particle pollution PM2.5 (particles 2.5 microns or smaller) are household burning of fossil fuels or biomass, and transport.
  2. Local meteorology such as temperature, pressure, rainfall, humidity, wind, etc.
  3. Local topography
  4. Complex interactions between local topography and meteorological characteristics.
  5. Transboundary pollution, for example the haze that intruded into Singapore from our neighbours.

In this third task, you are required to reveal the relationships between the factors mentioned above and the air quality measure detected in Task 1 and Task 2. Limit your response to no more than 5 images and 600 words.


Conclusion


Reference


Feedback

Please feel free to provide your feedback. Thank you.