IS428 AY2019-20T1 Parth Goda Rajesh
Background
Welcome to St. Himark, the fictional city used in this visual case study. It is a city of 19 neighborhoods, each with its own characteristics and amenities. St. Himark has a population of 246,839 people and is located in the Oceanus Sea. It is home to the world-renowned St. Himark Museum, beautiful beaches, and the Wilson Forest Nature Preserve, and it is one of the best cities in which to raise a family and work. The Always Safe Nuclear Power Plant provides the majority of the city's power, as well as jobs in Safe Town. Mayor Jordan and the city council currently govern the city.
The city runs the following utilities:
- Water and Sewage
- Road and Bridge
- Gas
- Garbage
- Power
There is always construction going on in the above utilities.
St. Himark is divided into 19 neighborhoods:
- PALACE HILLS
- NORTHWEST
- OLD TOWN
- SAFE TOWN
- SOUTHWEST
- DOWNTOWN
- WILSON FOREST
- SCENIC VISTA
- BROADVIEW
- CHAPPARAL
- TERRAPIN SPRINGS
- PEPPER MILL
- CHEDDARFORD
- EASTON
- WESTON
- SOUTHTOWN
- OAK WILLOW
- EAST PARTON
- WEST PARTON
Problem - VAST challenge MC1
There was an earthquake northwest of St. Himark. It occurred between 6 April 2020 and 8 April 2020. The city's officials needed to collect data immediately to understand the extent of the damage. They can then dispatch their emergency services and allocate resources efficiently to the areas of town where they are needed.
At first, they had only the seismic readings of the earthquake and used those for their first round of dispatch. Now, however, they need more information to get a better gauge of what is going on at ground level.
Purpose
To gather the information the city officials need, they launched an app where citizens can report the shake intensity and the level of damage done to utility infrastructure. Citizens use the app to note the level of damage seen on a utility or infrastructure building in a neighborhood, and they can also record the shake intensity in that neighborhood. The officials use the tool to record the data that citizens provide. The data is stored every 5 minutes. There may also be some data loss or delay due to power shortages.
With this data, visualizations were created so that it can be understood quickly, and so that recommendations and decisions can be made sooner to get help to people faster.
The following questions also have to be answered:
- Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for the reaction? Which parts of the city are the hardest hit?
- Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response.
- How do conditions change over time? How does uncertainty in the data change over time? Describe the key changes you see.
Data Gathering and Clean up
The data was provided in the file mc1-reports-data.csv.
The headers were:
- Time: A timestamp of the report made by a citizen. The format is in DD/MM/YY HH:MM:SS
- sewer_and_water: Damage recorded on the sewer and water systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- power: Damage recorded on the power generation systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- roads_and_bridges: Damage recorded on the roads and bridges in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- medical: Damage recorded on the medical facilities in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- buildings: Damage recorded on the buildings in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- shake_intensity: How violent the shaking was in the neighborhood and at the timestamp.
- location: Id of the neighborhood for which the citizen is reporting readings. (This will be matched to the neighborhood data in the map file)
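As a rough illustration of the layout described above, the file could be loaded and its timestamps parsed with pandas. This is only a sketch: the two sample rows are made up, the lowercase column name `time` is an assumption, and the format string simply follows the DD/MM/YY HH:MM:SS description.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the column layout described above.
# The values are invented for illustration and are not from the real file.
SAMPLE_CSV = """time,sewer_and_water,power,roads_and_bridges,medical,buildings,shake_intensity,location
08/04/20 08:35:00,7,9,8,,6,5,3
08/04/20 08:40:00,2,,1,0,1,,14
"""

def load_reports(source):
    """Read the reports CSV and parse DD/MM/YY HH:MM:SS timestamps."""
    reports = pd.read_csv(source)
    reports["time"] = pd.to_datetime(reports["time"], format="%d/%m/%y %H:%M:%S")
    return reports

# Empty fields are read as missing values (NaN), not as zero damage.
reports = load_reports(io.StringIO(SAMPLE_CSV))
print(reports.dtypes)
```

Passing the real file path instead of the `StringIO` object should work the same way, provided the headers match.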
Cleaning up the data
Using Tableau Prep Builder, the data from the CSV file was reshaped and renamed to make it easier to visualize.
Pivoting
To start, I pivoted the medical, power, roads_and_bridges, sewer_and_water, buildings, and shake_intensity columns into rows. The pivoted column names are called "Source of reading," and the pivoted values are called "Readings."
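The pivot itself was done in Tableau Prep Builder. For readers without Tableau, the same columns-to-rows reshape can be sketched with `DataFrame.melt` in pandas. The sample rows below are invented, and the `time` column name is an assumption.

```python
import pandas as pd

# The six reading columns named in the header list.
READING_COLUMNS = [
    "medical", "power", "roads_and_bridges",
    "sewer_and_water", "buildings", "shake_intensity",
]

# Made-up wide-format sample: one row per report.
wide = pd.DataFrame({
    "time": ["08/04/20 08:35:00", "08/04/20 08:40:00"],
    "location": [3, 14],
    "medical": [None, 0],
    "power": [9, None],
    "roads_and_bridges": [8, 1],
    "sewer_and_water": [7, 2],
    "buildings": [6, 1],
    "shake_intensity": [5, None],
})

# Columns-to-rows pivot: one row per (report, source of reading) pair.
long_reports = wide.melt(
    id_vars=["time", "location"],
    value_vars=READING_COLUMNS,
    var_name="Source of reading",
    value_name="Readings",
)
print(long_reports.head())
```

Each report becomes six rows, one per source of reading, which is what lets a single filter switch between utilities later on.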
Cleaning up names
Next, I renamed the following sources of reading and capitalized the rest:
- roads_and_bridges into "Road and Bridges"
- sewer_and_water into "Sewer and Water"
- shake_intensity into "Shake Intensity"
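A minimal pandas sketch of the same renaming step follows. The mapping uses the column names from the header list, and the helper name `clean_source_name` is hypothetical.

```python
import pandas as pd

# Explicit renames listed above; every other source is just capitalized.
EXPLICIT_NAMES = {
    "roads_and_bridges": "Road and Bridges",
    "sewer_and_water": "Sewer and Water",
    "shake_intensity": "Shake Intensity",
}

def clean_source_name(raw):
    """Return the display name for a raw source-of-reading value."""
    return EXPLICIT_NAMES.get(raw, raw.capitalize())

sources = pd.Series([
    "medical", "power", "roads_and_bridges",
    "sewer_and_water", "buildings", "shake_intensity",
])
cleaned = sources.map(clean_source_name)
print(cleaned.tolist())
```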
Setting up Tableau
To start my visualization journey, I first added a file called StHimark.shp, taken from the VAST Challenge MC2 data files, to a Tableau workbook to create the interactive map. This file has the following fields:
- ID: Id of the neighborhood
- location: Name of the neighborhood
- Longitude: Longitude coordinate of the neighborhood
- Latitude: Latitude coordinate of the neighborhood
I then dragged and dropped the output file from Tableau Prep into Tableau. I inner joined the two sources on ID from StHimark.shp and location from mc1-reports-data.csv:
This was the result.
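The join was configured in Tableau, but its effect can be sketched in pandas. In this sketch the ID-to-name pairs and the readings are made up for illustration, and the shapefile's name field is called `neighborhood` here (an assumption) to avoid clashing with the reports' `location` column.

```python
import pandas as pd

# Hypothetical slice of the shapefile attributes (IDs invented for illustration).
neighborhoods = pd.DataFrame({
    "ID": [1, 2, 3],
    "neighborhood": ["Palace Hills", "Northwest", "Old Town"],
})

# Hypothetical pivoted readings; location 99 has no matching neighborhood.
readings = pd.DataFrame({
    "location": [1, 3, 3, 99],
    "Source of reading": ["Power", "Power", "Buildings", "Medical"],
    "Readings": [4, 9, 8, 2],
})

# Inner join: keep only readings whose location matches a neighborhood ID.
joined = readings.merge(neighborhoods, how="inner", left_on="location", right_on="ID")
print(joined)
```

Because the join is an inner join, any report whose location has no matching ID is silently dropped, so it is worth comparing row counts before and after.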
Visualization and Interactive techniques
See the charts here: https://public.tableau.com/profile/parth.goda#!/vizhome/VASTchallengemc1Parthgoda/MainPage?publish=yes
The visualizations I created are all reached from a simple main page that lets the user choose which view to see:
- Based on each neighborhood in St. Himark
- Based on each reading source. E.g., Building or Medical damage
- Based on the Map of St. Himark progression through the six days
The idea behind this design is that when city officials turn to the dashboard to decide how to allocate resources, they can start from a neighborhood, from a utility they want to work on, or from the timeline of the whole incident.
Main Page |
---|
Purpose / Description This is the landing page of the application. On this page the user gets a quick summary of the purpose of the application and the option to dive into three areas of reporting. I created this page as a starting point that the user can navigate away from and keep coming back to.
|
Interactive Technique I also use interactive techniques to make the dashboard more user friendly.
|
Neighborhood Dashboard |
---|
Purpose / Description This dashboard is designed to show what is going on in each neighborhood. When the user first lands on the page, all the charts show data for all neighborhoods and all sources of readings. When the user filters the data by selecting an area on the map or one of the buttons on the list, the charts change to show the picture in that neighborhood. This supports the use case where city officials need to know what is going on in each neighborhood. They can see which utility is reported most often and which reading value is reported most often. They can also see all the filtered data on a timeline, to understand when and how much was reported. |
Interactive Technique
|
Types of Charts used There are three types of charts used:
|
Utility Dashboard |
---|
Purpose / Description This dashboard is designed to show how damage to each utility affects each neighborhood. When the user first lands on the page, all the charts show data for all sources of readings and rank the neighborhoods by how badly damaged they are. When the user filters the data by selecting one of the buttons on the Source of the readings list, the ranking is recalculated for the selected source of reading. This supports the use case where city officials need to allocate specific resources, such as emergency power or building repairs. They can see which neighborhoods are reported most often and what the total and average readings are. They can also see all the filtered data on a timeline, to understand when and how much was reported. |
Interactive Technique
|
Types of Charts used There are three types of charts used:
|
Time Dashboard |
---|
Purpose / Description This dashboard was designed to understand what kind of data was reported during the earthquake. Users can see how each neighbourhood was affected in terms of the damage done to infrastructure and shake intensity over a period of time. They can specify a period of time in the maps or look at the overall picture on the heat maps. |
Interactive Technique
|
Types of Charts used There are two types of charts used:
|
Task Results
Question 1
Question: Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for the reaction? Which parts of the city are the hardest hit?
Point | Recommendation |
---|---|
1 | To start, I would go to the Utility Dashboard to understand the different ways the city has been hit. Change the "source of readings" filter to all, building, power, etc. to analyze which neighborhoods are worst hit and need help. |
2 | Let's start with overall reports:
On both average and total readings, the following towns seem to be hardest hit in descending order
However, for the first response, the city officials should use the shake intensity, as this represents the towns that are closest to the earthquake epicenter.
They should also pay special attention to Safe Town, as the nuclear reactor is there. There will be some building damage and power damage to the reactor. In the worst case, it could resemble the Fukushima Daiichi nuclear disaster. |
3 | However, to decide which neighborhoods deserve special attention, it is better to filter the data by source of reading.
From the charts, the following would need the most help to fix the buildings
|
4 | Next are the areas that need a special medical emergency response:
These are a priority, as these areas have seven hospitals shared between them. |
5 | For power, roads and bridges, and sewer and water, the data shows that all towns need about the same amount of help. The exceptions are Old Town and Scenic Vista, which stand out with the most reports recorded by citizens.
For example, here is the power total reading chart: Given the socio-economic class of the residents of Scenic Vista, with their better infrastructure and equipment and an upper-class mindset, their numbers might be over-reported, as they are not very close to the epicentre of the earthquake. It is a little suspicious that their numbers match those of neighborhoods closer to the epicentre. |
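The ranking used in the points above was built in Tableau. A rough pandas equivalent, ranking neighborhoods by average and total reading with an optional source filter, might look like the sketch below. The readings are invented purely to show the mechanics and do not reproduce the real results. The function name `rank_neighborhoods` is hypothetical.

```python
import pandas as pd

# Invented readings, only to demonstrate the ranking logic.
readings = pd.DataFrame({
    "neighborhood": ["Old Town", "Old Town", "Scenic Vista",
                     "Scenic Vista", "Weston", "Weston"],
    "Source of reading": ["Buildings", "Power", "Buildings",
                          "Power", "Buildings", "Power"],
    "Readings": [9, 8, 7, 8, 3, 2],
})

def rank_neighborhoods(df, source=None):
    """Rank neighborhoods by average reading, optionally for one source."""
    if source is not None:
        df = df[df["Source of reading"] == source]
    summary = df.groupby("neighborhood")["Readings"].agg(average="mean", total="sum")
    return summary.sort_values("average", ascending=False)

overall = rank_neighborhoods(readings)
buildings_only = rank_neighborhoods(readings, source="Buildings")
print(overall)
```

Showing the average and the total side by side matters here: a neighborhood with many reporters can top the total ranking without having the highest typical damage.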
Question 2
Question: Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response.
Point | Recommendation |
---|---|
1 | One reason the reports are not reliable is the many gaps in the data, especially on 8 April 2020, which is around the time the earthquake happened.
For example, in this heat map, we can clearly see the missing information in some neighborhoods close to the epicentre: These neighborhoods are Old Town, Broadview, Chapparal, and Scenic Vista. All the readings after the gaps are much higher. It seems that a backlog of data may have rushed in once the network connection was restored. Collecting data as the earthquake hit was the main goal of the app, so the missing data during the earthquake makes the whole process less reliable. |
2 | There is a possibility of inflated numbers due to mindset and socio-economic class. I would like to highlight the following:
These areas are expensive places with trendy and rich patrons, which also comes with better infrastructure, utilities, maintenance, and security. The people might be more educated and tech-savvy and more aware of the app than the rest of the town. It is also possible that an upper-class, elite, or entitled mentality affects the numbers citizens report and how often they report. In the case of Broadview, most of the citizens are of the older generation, and some level of fear or panic might affect the numbers more than usual. |
3 | Similar to the observations in point 1, the following areas had missing data a day or so after the earthquake happened:
This definitely affected the reliability and certainty of the data and the charts, especially for understanding the aftermath of the earthquake and seeing which areas still need attention after the first response has been sent out. |
4 | One area where the neighborhoods are providing more certain data is shake intensity. Areas close to the epicentre do show higher average readings than areas further away. This also mirrors the information in the earthquake shake map provided.
Thus, to some extent, data from the following towns can be said to be more certain:
|
5 | One issue with the data collected is in the Palace Hills and Northwest neighborhoods. As seen in this picture, these places have the highest concentrations of roads in St. Himark.
However, the data shows that not much road damage is seen or reported in this area. That contradicts common logic, as these areas should, by default, record the most damage to roads and bridges. On the other hand, Old Town and Scenic Vista record and report the most damage. This is not realistic, as these towns do not have as many roads as Palace Hills and Northwest. |
6 | One neighborhood that was not providing reliable reports is Wilson Forest. Most of the data is missing, either because the area is remote or because the neighborhood is not populated enough to provide the city officials with data.
In this heat map of all data collected in Wilson Forest, you can see that there is almost no data: |
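The gaps discussed in this table were read off heat maps. One way to put a number on them, sketched here under the assumption that every neighborhood should have at least one report per 5-minute slot, is to compute the share of slots that contain a report. The sample timestamps are invented, and the function name `slot_coverage` is hypothetical.

```python
import pandas as pd

# Made-up sample: Weston reports in every 5-minute slot, Old Town has a gap.
reports = pd.DataFrame({
    "neighborhood": ["Weston"] * 5 + ["Old Town"] * 2,
    "time": pd.to_datetime([
        "2020-04-08 08:00", "2020-04-08 08:05", "2020-04-08 08:10",
        "2020-04-08 08:15", "2020-04-08 08:20",
        "2020-04-08 08:01", "2020-04-08 08:22",
    ]),
})

def slot_coverage(df, start, end, freq="5min"):
    """Share of expected time slots with at least one report, per neighborhood."""
    expected_slots = len(pd.date_range(start, end, freq=freq))
    slots = df["time"].dt.floor(freq)  # assign each report to its slot
    filled = slots.groupby(df["neighborhood"]).nunique()
    return (filled / expected_slots).sort_values()

coverage = slot_coverage(reports, "2020-04-08 08:00", "2020-04-08 08:20")
print(coverage)
```

A low share only says that data is missing. It does not say why, so it should be read alongside the power-damage and population arguments made above.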
Question 3
Question: How do conditions change over time? How does uncertainty in the data change over time? Describe the key changes you see. Limit your response to 500 words and 8 images.
Point | Recommendation |
---|---|
1 | In most neighborhoods, for both utility damage and shake intensity, the data follows this trend:
Except on 6/4/2020 at 4-5 pm, when there might have been some pre-earthquake shaking, the readings are all low or manageable until about 8 am on 8/4/2020, which is approximately when the earthquake happened. Then there are some aftershock readings, or a pile-up of data from cut-off neighborhoods, early on 9/4/2020 and on the afternoon of 10/4/2020. |
2 | Uncertainty does increase over time, mostly after the earthquake. This might be due to infrastructure damage that caused some data to be missing in some neighborhoods of St. Himark. Before the earthquake, there was not much missing data (except for Wilson Forest), even though there was construction going on in some places, but afterward more and more data went missing, especially in neighborhoods that were closer to the epicentre and had more damaged buildings and roads.
This makes sense because, after the earthquake, there were reports of damage to power infrastructure all around town. Without power, many citizens would not be able to power their devices and log the damage and shake data on the app. We can see in these charts that every neighborhood in St. Himark experienced some power damage. |
3 | Another way uncertainty changed over time was during the pre-earthquake shaking on 6/4/2020 at 4-5 pm. The heat map shows that the average readings dropped across all neighborhoods:
However, the number of records coming in increased overall from that same period onwards: This is suspicious, as it is unlikely that more people would log onto the app after pre-earthquake shaking only to record lower readings of shaking and damage. |
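The comparison of average readings and record counts over time can also be sketched outside Tableau. The snippet below groups reports by hour and returns both figures side by side. The readings are invented so that the mean falls while the count rises, mimicking the pattern described in the last point. They are not taken from the dataset, and the function name `hourly_profile` is hypothetical.

```python
import pandas as pd

# Invented readings around the 4-5 pm window on 6 April 2020.
reports = pd.DataFrame({
    "time": pd.to_datetime([
        "2020-04-06 16:10", "2020-04-06 16:40",
        "2020-04-06 17:05", "2020-04-06 17:15", "2020-04-06 17:50",
    ]),
    "Readings": [2, 4, 1, 1, 1],
})

def hourly_profile(df):
    """Mean reading and number of records for each hour that has reports."""
    hour = df["time"].dt.floor("h")
    return df.groupby(hour)["Readings"].agg(["mean", "count"])

profile = hourly_profile(reports)
print(profile)
```

Plotting the two columns on a shared time axis makes it easy to spot hours where the volume of reports and the level of the readings move in opposite directions.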
Reference
Icon for gauge https://www.123rf.com/clipart-vector/guage.html?oriSearch=utility&sti=nh1v9o3lxchgqttrvu%7C
Home icon: https://www.iconfinder.com/icons/185038/home_house_streamline_icon
Fukushima disaster: https://en.wikipedia.org/wiki/Fukushima_Daiichi_nuclear_disaster
Model answers:
- Gwendoline Tan https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Gwendoline_Tan_Wan_Xin
- Lim Kim Yong https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Lim_Kim_Yong
- Chew Yuxi https://public.tableau.com/profile/yuxi7903#!/vizhome/VA_Assignment_Chew_Yuxi/OAQStationsTimeSeries
- Tan Kee Hock https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Tan_Kee_Hock