IS428 AY2019-20T1 Assign Wendy Ng Sock Ling Transformation

From Visual Analytics for Business Intelligence
Jump to navigation Jump to search

MC1-2019.jpg VAST Challenge 2019 MC1: Crowdsourcing for Situational Awareness

Overview

Data Transformation

Dashboard Design

Tasks

Transforming Data Given

The data given consists of the following fields:

  • time: timestamp of incoming report/record, in the format YYYY-MM-DD hh:mm:ss
  • location: id of neighborhood where person reporting is feeling the shaking and/or seeing the damage
  • shake_intensity, sewer_and_water, power, roads_and_bridges, medical, buildings: reported categorical value of how violent the shaking was/how bad the damage was (0 - lowest, 10 - highest; missing data allowed)

A check on the data was done and the following are the fields with missing data which will be left as blank:

  1. Shake intensity: 12,144 missing data
  2. Buildings: 170 missing data
  3. Medical: 47,441 missing data
  4. Sewer_and_water: 171 missing data

The high number of missing data from medical is understandable as it is not found everywhere in a neighborhood and people might not be near any.

With the dataset given, the reports do not have an identifier. Hence, a Report ID column is created as seen below.

WN Create report ID.png

This file is then imported into tableau as a data source. Next, a pivot is done on the following columns in tableau:

  • Buildings
  • sewer_and_water
  • power
  • roads_and bridges
  • medical
  • shake_intensity

WN transforming pivot.png

The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".

WN transforming pivoted.png

Creating An Aggregated Data Set

The aggregated data set is created for the ease of analysis done on the higher level where all reports for the timestamp and location are aggregated into one row. This data set is created using tableau prep where the given data is aggregated by location and time column as seen below. The following are the aggregated fields:

  • Sum of the number of rows
  • Average damage on the buildings
  • Average damage on the medical facility
  • Average damage on the power
  • Average damage on the roads and bridges
  • Average damage on the sewer and water
  • Average shake intensity

WN tableau prep.png

The file is then saved as a csv output and imported into tableau as a data source. Next, a pivot is done on the following columns in tableau in the same way as the previous data set:

  • Buildings
  • Medical
  • Power
  • Roads and Bridges
  • Sewer and Water
  • Shake Intensity

The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".

WN aggregated pivoted.png


Creating the Location Coordinates

This step is done with the help of Ryan Sleeper from Playfair Data. [1]

Firstly, it is necessary to obtain the longitude and latitude from Minicase 2 description document as seen below.

WN Coordinates from mc2.png

The following are the steps taken to create the coordinates for each location:

1) Create a new excel file with the columns "location", "X coordinate" and "Y coordinate" then insert the longitude and latitude from minicase 2.

WN new excel file.png

2) Import this file into tableau as the data source with this sheet as the table. Change the X and Y coordinates to Longitude and Latitude respectively. Location should be a string

3) In tableau, Map > background images > Add Image, insert the town map file and key in the x and y coordinates as seen in the MC2 data description. Below is the image of a filled coordinates. At the options tab, check the boxes “Lock Aspect Ratio” and “Always Show Entire Image”. After which click on Maps > background maps and set it to none.

WN tableau map import.png

4) Put the X coordinate to the columns and Y coordinates to the rows.

WN step 4.png

5) Manually get the average coordinates of 19 locations and key into the excel file created previously. In order to get the X and Y coordinates, right click the graph and annotate with points. The excel file should look something like the following.

WN excel data entry.png

6) Refresh the data extract and perform a check to ensure locations are in the area it should be as below.

WN final map with coordinates.png


Putting the Data Sets Together

This step is combining transformed data set and the aggregated data set with the custom map created. Both data set will be left joined to the custom map created in order to get the coordinates of the locations. When joining the locations, ensure that both the “Location” in the “Location coordinates” are of the data type "string" to prevent error in joining. The X and Y coordinates should have the data type changed to Longitude and Latitude respectively if they are not already changed.

The following is the transformed data set with the location coordinates left joined to it.

WN transforming left join coordinates.png

The following is the aggregated data set with the location coordinates file left joined to it.

WN aggregated left join coordinates.png


Creating Black and White Map

This map is used later on in the analysis for easier identification of location and more readability by removing the colours. This is done by putting the image "StHimarkLabeledMap" into powerpoint and changing the colour under the tab "format" to grayscale. Below is the map to be used in analysis later on.

WN black and white labelled map.png