ISSS608 2018-19 T1 Assign Tan Le Wen Angelina Data Preparation


Photo verybig 186361.jpg You take my breath away, Sofia.

Data Preparation and Data Visualisation

Data Preparation

EEA Dataset

This dataset consists of 26 CSV files. The files were combined into a single table in JMP Pro 14.

AT DP1.png


The combined data was then opened in Excel, where the following preparation work was done:

  1. The combined file was joined with the metadata file to add further information on the air quality stations (altitude, etc.).
  2. Dates were grouped by season.
  3. Hourly data were grouped into 3-hour and 6-hour blocks for analysis.
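
The grouping in steps 2 and 3 was done in Excel. As a rough illustration of the same logic, the sketch below assigns a season and an hour block to a timestamp in Python. The month-to-season mapping (meteorological seasons) and the function names `season_of` and `hour_block` are assumptions made for this example, not taken from the original workbook.

```python
from datetime import datetime

# Assumption: meteorological seasons for the northern hemisphere.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}


def season_of(ts: datetime) -> str:
    """Return the season label for a timestamp."""
    return SEASON_BY_MONTH[ts.month]


def hour_block(ts: datetime, width: int) -> str:
    """Label the block of `width` hours that contains the timestamp."""
    start = (ts.hour // width) * width
    end = start + width - 1
    return f"{start:02d}:00-{end:02d}:59"


reading_time = datetime(2018, 1, 15, 7, 0)
print(season_of(reading_time))      # Winter
print(hour_block(reading_time, 3))  # 06:00-08:59
print(hour_block(reading_time, 6))  # 06:00-11:59
```

Using integer division on the hour keeps the blocks aligned to midnight, so 3-hour and 6-hour blocks nest cleanly inside each other.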

Airtube Dataset

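The geohash-encoded sensor locations in this dataset were decoded using a geohash function in R. The combined dataset has more than 3 million rows, which exceeds Excel's limit of 1,048,576 rows per worksheet, so Excel could not open it. The data for 2017 and 2018 were therefore joined in JMP Pro.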
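
The decoding itself was done with an R function. For readers unfamiliar with geohashes, the following is a minimal Python sketch of the standard decoding algorithm, not the R function used here. The name `decode_geohash` is chosen for this example only.

```python
from typing import Tuple

# Standard geohash base-32 alphabet (no "a", "i", "l" or "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash: str) -> Tuple[float, float]:
    """Return (latitude, longitude) of the centre of a geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash.lower():
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


lat, lon = decode_geohash("ezs42")
print(round(lat, 3), round(lon, 3))  # 42.605 -5.603
```

Each character halves the longitude and latitude intervals in turn, so longer geohashes decode to smaller cells and the returned centre point becomes more precise.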

The distributions of Temperature, Humidity and Pressure were plotted in JMP Pro and showed some anomalies: the ranges of all three parameters were incorrect because of some faulty sensors. These anomalies were removed before the data was opened in Tableau for analysis. The following image shows the range of each parameter within which the analysis for Task 2 was done.

AT DP2.jpg


I chose to take the ranges from www.weatheronline.co.uk rather than from the distribution found in the meteorology dataset (weather at Sofia Airport), because there are already some faulty sensor readings and I am not sure whether those sensors are also used to determine the parameters at Sofia Airport. I therefore used an independent source that has historical data, and extracted the data corresponding to the timeframe of the Airtube dataset.
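
The anomaly removal was done in JMP Pro. A minimal Python sketch of the same idea, keeping only rows where every parameter lies inside its accepted range, is given below. The numeric limits, units and sample rows are placeholders invented for this example. The limits actually used are the ones shown in the image above.

```python
# Placeholder limits for illustration only; not the ranges used in the analysis.
VALID_RANGES = {
    "temperature": (-25.0, 45.0),
    "humidity": (0.0, 100.0),
    "pressure": (900.0, 1100.0),
}


def in_range(row, ranges):
    """True if every parameter in `ranges` lies within its (low, high) limits."""
    return all(low <= row[name] <= high for name, (low, high) in ranges.items())


# Made-up sample readings; the second and third mimic faulty sensors.
readings = [
    {"temperature": 12.0, "humidity": 60.0, "pressure": 950.0},
    {"temperature": -140.0, "humidity": 60.0, "pressure": 950.0},
    {"temperature": 12.0, "humidity": 250.0, "pressure": 950.0},
]

clean = [row for row in readings if in_range(row, VALID_RANGES)]
print(len(clean))  # 1
```

A row is dropped as soon as any one parameter is out of range, which matches removing the whole reading from a faulty sensor rather than blanking individual values.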

This dataset consists of sensor readings from all over Bulgaria. Since the scope of the assignment is limited to Sofia City, I excluded all data that fall outside the city.
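
The text does not say how the city filter was applied. One simple possibility is a bounding-box test on the decoded coordinates, sketched below. The box limits are rough values assumed for this example and are not an official city boundary, and `in_sofia` is a hypothetical helper.

```python
# Rough, illustrative bounding box around Sofia (an assumption, not the
# boundary used in the original analysis).
SOFIA_BOX = {"lat_min": 42.60, "lat_max": 42.80, "lon_min": 23.20, "lon_max": 23.45}


def in_sofia(lat, lon, box=SOFIA_BOX):
    """True if the point lies inside the assumed bounding box."""
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


# Made-up sample points: one near central Sofia, two elsewhere in Bulgaria.
points = [(42.70, 23.32), (42.14, 24.75), (43.21, 27.91)]
sofia_points = [p for p in points if in_sofia(*p)]
print(sofia_points)  # [(42.7, 23.32)]
```

A rectangular box is only an approximation. Filtering against the actual city polygon would be more accurate but needs a boundary file.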

For the Task 3 analysis, I hard-coded the longitude and latitude of the three power plants mentioned:

  1. Sofia Power Plant
  2. Sofia Iztok Power Plant
  3. Pernik Republika Power Plant

I then used these coordinates to annotate the map in Tableau, so that each label sits at the exact coordinates of its power plant.

Data Visualisation

Calendar Plot:

AT DV1 calendarplot.jpg


Heat Map:

AT DV2 heatmap.jpg


Hexbins:

AT DV5 hexbins.jpg