ISSS608 2018-19 T1 Assign Choo Mei Xuan Overview


Air Quality in Sofia






It was reported in September 2018 that Bulgaria tops the list for years of life lost due to poor air quality, and that Sofia has no plan for solving air pollution. [1]


Air pollution harms human health and the environment. It is now the biggest environmental risk to public health in Europe. It causes more than 1,000 premature deaths every day across the EU, which is more than 1% of the daily total of deaths in the EU and 10 times higher than the number of deaths from car accidents. [2] Typical air pollutants such as particulate matter (PM10 and PM2.5) are of major concern, as they are small enough to penetrate deep into the lungs and so potentially pose significant health risks. [3]

Data Preparation

Four data sets were provided, as described below.

Data Set Data Preparation
Official Air Quality Measurements
  • It consists of air quality measurements from five stations in Sofia City.
  • A total of 28 data files for the years 2013 to 2018 were provided and combined into a single dataset using JMP Pro.
  • The station metadata was then joined to the combined dataset using SAS EG, adding information such as the common name of each air station and its latitude and longitude.
  • Additional data processing was performed using SAS EG and Tableau to manipulate the date field.
Citizen Science Air Quality Measurement
  • It consists of air quality measurements collected from numerous air quality sensors in Sofia City, each identified by its geohash. The geohash package in R was then used to convert these geohashes into latitude and longitude.
  • A total of 2 data files for the years 2017 to 2018 were provided and combined into a single dataset using JMP Pro.
Meteorological Measurement
  • It consists of meteorological measurements such as temperature, wind speed, pressure and visibility.
  • A single data file for the years 2012 to 2018 was provided, and Microsoft Excel was used to concatenate the "Year", "Month" and "Day" columns into a "Date" column.
Topography Data
  • It consists of longitude, latitude and elevation information.
  • No additional data processing was performed.
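
Two of the preparation steps above, converting a geohash into latitude and longitude and combining "Year", "Month" and "Day" into a single "Date", can be illustrated with a short Python sketch. This is not the workflow used for the assignment, which relied on the geohash package in R and on Microsoft Excel; it is a minimal stand-in using only the Python standard library, and the function names decode_geohash and make_date are hypothetical.

```python
from datetime import date

# Standard geohash base-32 alphabet (the letters a, i, l and o are not used).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return the (latitude, longitude) at the centre of a geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for ch in geohash.lower():
        value = _BASE32.index(ch)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def make_date(year, month, day):
    """Combine separate year, month and day values into one date."""
    return date(int(year), int(month), int(day))


if __name__ == "__main__":
    # "ezs42" is a commonly cited example geohash, not a value from the dataset.
    lat, lon = decode_geohash("ezs42")
    print(round(lat, 3), round(lon, 3))
    print(make_date("2018", "9", "5").isoformat())
```

Note that a geohash identifies a rectangular cell rather than a point, so the decoded coordinates are the centre of that cell and their precision depends on the length of the geohash.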