IS428 AY2019-20T1 Assign Damian Er Zhong Ying Analysis & Transformation
Dataset Analysis and Transformation
Before building our visualisation, we first need to examine the dataset provided and prepare it for analysis in Tableau.
The data provided in MC1 is a CSV file (mc1-reports-data.csv) that spans the entire length of the event. Citizens can submit reports at any time, but the server configuration records them only in 5-minute batches. Reports may also arrive late during power outages.
The following are the fields in the dataset:
- time: timestamp of incoming report/record, in the format YYYY-MM-DD hh:mm:ss
- location: id of neighborhood where person reporting is feeling the shaking and/or seeing the damage
- {shake_intensity, sewer_and_water, power, roads_and_bridges, medical, buildings}: reported categorical value of how violent the shaking was/how bad the damage was (0 - lowest, 10 - highest; missing data allowed)
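As a rough illustration of the field layout above, the following sketch parses report rows using only the Python standard library. The sample rows are made up for demonstration and are not taken from mc1-reports-data.csv, and the helper names (`parse_report`, `read_reports`) are hypothetical. It assumes ratings may appear either as `5` or `5.0`, and it keeps missing ratings as `None` rather than treating them as 0.

```python
import csv
import io
from datetime import datetime

# Rating columns as named in the dataset description.
DAMAGE_FIELDS = ["shake_intensity", "sewer_and_water", "power",
                 "roads_and_bridges", "medical", "buildings"]


def parse_report(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    parsed = {
        "time": datetime.strptime(row["time"], "%Y-%m-%d %H:%M:%S"),
        "location": int(row["location"]),
    }
    for field in DAMAGE_FIELDS:
        raw = (row.get(field) or "").strip()
        # Missing data is allowed, so an empty cell becomes None, not 0.
        parsed[field] = int(float(raw)) if raw else None
    return parsed


def read_reports(handle):
    """Read every report from an open CSV file or file-like object."""
    return [parse_report(row) for row in csv.DictReader(handle)]


# Made-up sample rows, for illustration only.
SAMPLE = """time,sewer_and_water,power,roads_and_bridges,medical,buildings,shake_intensity,location
2020-04-08 17:50:00,9,7,3,,5,1,1
2020-04-09 13:50:00,2,,1,0,5.0,7,13
"""

reports = read_reports(io.StringIO(SAMPLE))
```

To run this against the real file, `read_reports(open("mc1-reports-data.csv", newline=""))` would be the equivalent call.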
To aid our analysis, we will use the two shakemap (PNG) files, which indicate where each earthquake's epicenter is located and how much shaking can be felt across the city.
In addition to the data provided by Mini-Challenge 1, data from Mini-Challenge 2 is used. The shapefile (StHimark.shp) from Mini-Challenge 2 was used to generate the longitude and latitude needed to plot the map of St. Himark.
The following documents the issues faced during our analysis and how each was resolved using Tableau Prep.
Problem #1 | Naming Convention |
---|---|
Issue | The default column names in the dataset are typical of those extracted from a database, with underscores and uncapitalised letters. This makes it quite unsightly when the fields are shown in a visualisation. |
Solution | Using Tableau Prep, the columns are renamed to remove the underscores and capitalise the words as appropriate. This makes the fields look cleaner and more professional in our visualisation later on. |
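The renaming itself was done by hand in Tableau Prep. For readers who prefer a scripted equivalent, a minimal sketch of the same rule might look like this. The `prettify` helper and the `SMALL_WORDS` set are hypothetical, chosen so that the results match the field names used in Problem #2.

```python
# Connecting words that stay lowercase unless they start the name (an assumption).
SMALL_WORDS = {"and", "of", "the"}


def prettify(column_name):
    """Turn a database-style name such as 'sewer_and_water' into 'Sewer and Water'."""
    words = column_name.split("_")
    titled = [
        word if (index > 0 and word in SMALL_WORDS) else word.capitalize()
        for index, word in enumerate(words)
    ]
    return " ".join(titled)


raw_columns = ["time", "location", "shake_intensity", "sewer_and_water",
               "power", "roads_and_bridges", "medical", "buildings"]
renamed = {name: prettify(name) for name in raw_columns}
```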
Problem #2 | Pivoting of Damage Type |
---|---|
Issue | Each damage type (Shake Intensity, Sewer and Water, Power, Roads and Bridges, Medical, Buildings) is stored in its own column. This makes visualisation difficult, as the damage types cannot be used as filters. |
Solution | Using Tableau Prep, we pivot the damage types from crosstab format into columnar format. |
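To make the crosstab-to-columnar pivot concrete, here is a small sketch of the same reshaping in plain Python. It is not the Tableau Prep flow itself. The output column names `Damage Type` and `Damage Value` are assumptions, as is the sample row.

```python
DAMAGE_TYPES = ["Shake Intensity", "Sewer and Water", "Power",
                "Roads and Bridges", "Medical", "Buildings"]


def pivot_damage_types(rows, damage_types=DAMAGE_TYPES):
    """Turn one wide row per report into one long row per (report, damage type)."""
    long_rows = []
    for row in rows:
        for damage_type in damage_types:
            long_rows.append({
                "Time": row["Time"],
                "Location": row["Location"],
                # Hypothetical names for the two pivoted columns.
                "Damage Type": damage_type,
                "Damage Value": row.get(damage_type),
            })
    return long_rows


# Made-up wide row, for illustration only.
wide = [{"Time": "2020-04-08 17:50:00", "Location": 1,
         "Shake Intensity": 1, "Sewer and Water": 9, "Power": 7,
         "Roads and Bridges": 3, "Medical": None, "Buildings": 5}]
long_format = pivot_damage_types(wide)
```

After the pivot, a single `Damage Type` field can drive a filter, which is what the crosstab layout prevented.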
Dataset Import Structure and Process
After performing the necessary transformations in Tableau Prep, we save the output file and import it into Tableau Desktop, along with the StHimark.shp shapefile.
Once the dataset is imported, we face the following problems in Tableau Desktop, each of which can be resolved.
Problem #3 | Joining of data fields |
---|---|
Issue | After importing the dataset and shapefile into Tableau, the two files are not linked. This makes it difficult to visualise damage values on the map later on. |
Solution | To resolve this issue, simply perform an inner join on the two files. This allows us to generate polygons for the map later on in our visualisation. |
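The effect of the inner join can be sketched as follows. This assumes the two sources share a neighborhood identifier, which is called `Location` here purely for illustration, since the actual join field is not named above. The sample records and the `Geometry` placeholder are also made up.

```python
def inner_join(reports, neighborhoods, key="Location"):
    """Keep only reports whose key also exists in the shapefile records."""
    by_key = {record[key]: record for record in neighborhoods}
    joined = []
    for report in reports:
        match = by_key.get(report[key])
        if match is not None:
            # Combine the shape attributes with the report fields.
            joined.append({**match, **report})
    return joined


# Made-up records, for illustration only.
reports = [{"Location": 1, "Damage Value": 7},
           {"Location": 99, "Damage Value": 3}]
neighborhoods = [{"Location": 1, "Geometry": "polygon-1"},
                 {"Location": 2, "Geometry": "polygon-2"}]
joined = inner_join(reports, neighborhoods)
```

Because the join is an inner join, a report with no matching polygon (location 99 in the sample) is dropped, as is a polygon with no reports.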
Problem #4 | Difficult to tell what the numbers in locations and damage values represent |
---|---|
Issue | It is difficult to determine which number corresponds to which location, so the emergency responder has to refer to the provided map constantly to find the location behind each number. Furthermore, the damage values are coded as numbers representing the scale of damage, with no visual representation of what each number means. |
Solution | We use the aliases feature in Tableau to map the location numbers to their location names, and to map the damage values to the Modified Mercalli Intensity Scale shown in the shake maps. |
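Conceptually, an alias is just a lookup from a stored value to a display label. The sketch below shows the idea. The labels are placeholders only: the real neighborhood names come from the provided map and the real intensity labels from the shake maps, and neither is reproduced here. The `apply_alias` helper is hypothetical.

```python
def apply_alias(value, aliases, missing_label="No data"):
    """Return the display label for a value, falling back to the raw value."""
    if value is None:
        return missing_label
    return aliases.get(value, str(value))


# Placeholder labels, not the real names or scale descriptions.
LOCATION_ALIASES = {1: "Neighborhood A", 2: "Neighborhood B"}
DAMAGE_ALIASES = {0: "Label for 0", 10: "Label for 10"}

labels = [apply_alias(v, LOCATION_ALIASES) for v in (1, 2, 3)]
```

Values without an alias are shown unchanged, so an incomplete mapping does not hide any data.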
After resolving all 4 problems listed above, we will be ready to start on our visualisation in Tableau Desktop.