Difference between revisions of "IS428 AY2019-20T1 Assign Wendy Ng Sock Ling Transformation"

From Visual Analytics for Business Intelligence
Jump to navigation Jump to search
(Created page with "== Heading text ==")
 
 
(16 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Heading text ==
+
<div style=background:#C1C7C9 border:#A3BFB1>
 +
[[Image:MC1-2019.jpg|180px]]
 +
<b><font size = 5; color="#000000">VAST Challenge 2019 MC1: Crowdsourcing for Situational Awareness</font></b>
 +
</div>
 +
 
 +
{|style="background-color:#C1C7C9;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 +
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#107896; text-align:center;" width="25%" |
 +
;
 +
[[IS428_AY2019-20T1_Assign_Wendy_Ng_Sock_Ling| <font color="#FFFFFF">Overview</font>]]
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#107896; text-align:center;" width="25%" |
 +
;
 +
[[IS428_AY2019-20T1_Assign_Wendy_Ng_Sock_Ling_Transformation| <font color="#FFFFFF">Data Transformation</font>]]
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#107896; text-align:center;" width="25%" |
 +
;
 +
[[IS428_AY2019-20T1_Assign_Wendy_Ng_Sock_Ling_Dashboard_Design| <font color="#FFFFFF">Dashboard Design</font>]]
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#107896; text-align:center;" width="25%" |
 +
;
 +
[[IS428_AY2019-20T1_Assign_Wendy_Ng_Sock_Ling_Tasks| <font color="#FFFFFF">Tasks</font>]]
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#107896; text-align:center;" width="25%" |
 +
|}
 +
 
 +
== Transforming Data Given ==
 +
The data given consists of the following fields:
 +
* time: timestamp of incoming report/record, in the format YYYY-MM-DD hh:mm:ss
 +
* location: id of neighborhood where person reporting is feeling the shaking and/or seeing the damage
 +
* shake_intensity, sewer_and_water, power, roads_and_bridges, medical, buildings: reported categorical value of how violent the shaking was/how bad the damage was (0 - lowest, 10 - highest; missing data allowed)
 +
 
 +
A check on the data was done and the following are the fields with missing data which will be left as blank:
 +
# Shake intensity: 12,144 missing data
 +
# Buildings: 170 missing data
 +
# Medical: 47,441 missing data
 +
# Sewer_and_water: 171 missing data
 +
 
 +
The high number of missing data from medical is understandable as it is not found everywhere in a neighborhood and people might not be near any.
 +
 
 +
With the dataset given, the reports do not have an identifier. Hence, a Report ID column is created as seen below.
 +
<p>[[File:WN_Create report ID.png|600px|center]]</p>
 +
 
 +
This file is then imported into tableau as a data source. Next, a pivot is done on the following columns in tableau:
 +
* Buildings
 +
* sewer_and_water
 +
* power
 +
* roads_and bridges
 +
* medical
 +
* shake_intensity
 +
 
 +
<p>[[File:WN_transforming pivot.png|600px|center]]</p>
 +
 
 +
The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".
 +
<p>[[File:WN_transforming pivoted.png|600px|center]]</p>
 +
 
 +
== Creating An Aggregated Data Set ==
 +
The aggregated data set is created for the ease of analysis done on the higher level where all reports for the timestamp and location are aggregated into one row. This data set is created using tableau prep where the given data is aggregated by location and time column as seen below.
 +
The following are the aggregated fields:
 +
* Sum of the number of rows
 +
* Average damage on the buildings
 +
* Average damage on the medical facility
 +
* Average damage on the power
 +
* Average damage on the roads and bridges
 +
* Average damage on the sewer and water
 +
* Average shake intensity
 +
 
 +
<p>[[File:WN_tableau prep.png|600px|center]]</p>
 +
The file is then saved as a csv output and imported into tableau as a data source. Next, a pivot is done on the following columns in tableau in the same way as the previous data set:
 +
* Buildings
 +
* Medical
 +
* Power
 +
* Roads and Bridges
 +
* Sewer and Water
 +
* Shake Intensity
 +
 
 +
The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".
 +
<p>[[File:WN_aggregated pivoted.png|600px|center]]</p>
 +
 
 +
 
 +
== Creating the Location Coordinates ==
 +
This step is done with the help of Ryan Sleeper from Playfair Data. [https://playfairdata.com/make-custom-map-tableau/]
 +
 
 +
Firstly, it is necessary to obtain the longitude and latitude from Minicase 2 description document as seen below.
 +
<p>[[File:WN_Coordinates from mc2.png|300px|center]]</p>
 +
 
 +
The following are the steps taken to create the coordinates for each location:
 +
 
 +
1) Create a new excel file with the columns "location", "X coordinate" and "Y coordinate" then insert the longitude and latitude from minicase 2.
 +
<p>[[File:WN_new excel file.png|300px|center]]</p>
 +
 
 +
2) Import this file into tableau as the data source with this sheet as the table. Change the X and Y coordinates to Longitude and Latitude respectively. Location should be a string
 +
 
 +
3) In tableau, Map > background images > Add Image, insert the town map file and key in the x and y coordinates as seen in the MC2 data description. Below is the image of a filled coordinates. At the options tab, check the boxes “Lock Aspect Ratio” and “Always Show Entire Image”. After which click on Maps > background maps and set it to none. 
 +
<p>[[File:WN_tableau map import.png|300px|center]]</p>
 +
 
 +
4) Put the X coordinate to the columns and Y coordinates to the rows.
 +
<p>[[File:WN_ step 4.png|300px|center]]</p>
 +
 
 +
5) Manually get the average coordinates of 19 locations and key into the excel file created previously. In order to get the X and Y coordinates, right click the graph and annotate with points. The excel file should look something like the following.
 +
<p>[[File:WN_excel data entry.png|300px|center]]</p>
 +
 
 +
6) Refresh the data extract and perform a check to ensure locations are in the area it should be as below.
 +
<p>[[File:WN_final map with coordinates.png|300px|center]]</p>
 +
 
 +
 
 +
== Putting the Data Sets Together ==
 +
This step is combining transformed data set and the aggregated data set with the custom map created. Both data set will be left joined to the custom map created in order to get the coordinates of the locations. When joining the locations, ensure that both the “Location” in the “Location coordinates” are of the data type "string" to prevent error in joining. The X and Y coordinates should  have the data type changed to Longitude and Latitude respectively if they are not already changed.
 +
 
 +
The following is the transformed data set with the location coordinates left joined to it.
 +
<p>[[File:WN_transforming left join coordinates.png|600px|center]]</p>
 +
 
 +
The following is the aggregated data set with the location coordinates file left joined to it.
 +
<p>[[File:WN_aggregated left join coordinates.png|600px|center]]</p>
 +
 
 +
 
 +
== Creating Black and White Map ==
 +
This map is used later on in the analysis for easier identification of location and more readability by removing the colours. This is done by putting the image "StHimarkLabeledMap" into powerpoint and changing the colour under the tab "format" to grayscale. Below is the map to be used in analysis later on.
 +
<p>[[File:WN_black and white labelled map.png|300px|center]]</p>

Latest revision as of 23:51, 13 October 2019

MC1-2019.jpg VAST Challenge 2019 MC1: Crowdsourcing for Situational Awareness

Overview

Data Transformation

Dashboard Design

Tasks

Transforming Data Given

The data given consists of the following fields:

  • time: timestamp of incoming report/record, in the format YYYY-MM-DD hh:mm:ss
  • location: id of neighborhood where person reporting is feeling the shaking and/or seeing the damage
  • shake_intensity, sewer_and_water, power, roads_and_bridges, medical, buildings: reported categorical value of how violent the shaking was/how bad the damage was (0 - lowest, 10 - highest; missing data allowed)

A check on the data was done and the following are the fields with missing data which will be left as blank:

  1. Shake intensity: 12,144 missing data
  2. Buildings: 170 missing data
  3. Medical: 47,441 missing data
  4. Sewer_and_water: 171 missing data

The high number of missing data from medical is understandable as it is not found everywhere in a neighborhood and people might not be near any.

With the dataset given, the reports do not have an identifier. Hence, a Report ID column is created as seen below.

WN Create report ID.png

This file is then imported into tableau as a data source. Next, a pivot is done on the following columns in tableau:

  • Buildings
  • sewer_and_water
  • power
  • roads_and bridges
  • medical
  • shake_intensity

WN transforming pivot.png

The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".

WN transforming pivoted.png

Creating An Aggregated Data Set

The aggregated data set is created for the ease of analysis done on the higher level where all reports for the timestamp and location are aggregated into one row. This data set is created using tableau prep where the given data is aggregated by location and time column as seen below. The following are the aggregated fields:

  • Sum of the number of rows
  • Average damage on the buildings
  • Average damage on the medical facility
  • Average damage on the power
  • Average damage on the roads and bridges
  • Average damage on the sewer and water
  • Average shake intensity

WN tableau prep.png

The file is then saved as a csv output and imported into tableau as a data source. Next, a pivot is done on the following columns in tableau in the same way as the previous data set:

  • Buildings
  • Medical
  • Power
  • Roads and Bridges
  • Sewer and Water
  • Shake Intensity

The following is the data with the columns above pivoted. The columns are then renamed as "Objects/shake intensity" and "Extend of damage/ shake intensity".

WN aggregated pivoted.png


Creating the Location Coordinates

This step is done with the help of Ryan Sleeper from Playfair Data. [1]

Firstly, it is necessary to obtain the longitude and latitude from Minicase 2 description document as seen below.

WN Coordinates from mc2.png

The following are the steps taken to create the coordinates for each location:

1) Create a new excel file with the columns "location", "X coordinate" and "Y coordinate" then insert the longitude and latitude from minicase 2.

WN new excel file.png

2) Import this file into tableau as the data source with this sheet as the table. Change the X and Y coordinates to Longitude and Latitude respectively. Location should be a string

3) In tableau, Map > background images > Add Image, insert the town map file and key in the x and y coordinates as seen in the MC2 data description. Below is the image of a filled coordinates. At the options tab, check the boxes “Lock Aspect Ratio” and “Always Show Entire Image”. After which click on Maps > background maps and set it to none.

WN tableau map import.png

4) Put the X coordinate to the columns and Y coordinates to the rows.

WN step 4.png

5) Manually get the average coordinates of 19 locations and key into the excel file created previously. In order to get the X and Y coordinates, right click the graph and annotate with points. The excel file should look something like the following.

WN excel data entry.png

6) Refresh the data extract and perform a check to ensure locations are in the area it should be as below.

WN final map with coordinates.png


Putting the Data Sets Together

This step is combining transformed data set and the aggregated data set with the custom map created. Both data set will be left joined to the custom map created in order to get the coordinates of the locations. When joining the locations, ensure that both the “Location” in the “Location coordinates” are of the data type "string" to prevent error in joining. The X and Y coordinates should have the data type changed to Longitude and Latitude respectively if they are not already changed.

The following is the transformed data set with the location coordinates left joined to it.

WN transforming left join coordinates.png

The following is the aggregated data set with the location coordinates file left joined to it.

WN aggregated left join coordinates.png


Creating Black and White Map

This map is used later on in the analysis for easier identification of location and more readability by removing the colours. This is done by putting the image "StHimarkLabeledMap" into powerpoint and changing the colour under the tab "format" to grayscale. Below is the map to be used in analysis later on.

WN black and white labelled map.png