Difference between revisions of "IS428 AY2019-20T1 Parth Goda Rajesh"
Parthrg.2017 (talk | contribs) |
(17 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
− | Welcome to St. Himark! A fictional city that will is being used in this visual case study. It is a city of 19 neighborhoods, all of which have their unique characteristics and amenities. St. Himark has a population of 246,839 people and it's located in the Oceanus Sea. It is also home to the world-renowned St. Himark Museum, beautiful beaches, and the Wilson Forest Nature Preserve. It is one of the best cities to raise a family and work. Always Safe Nuclear Power Plant provides the majority of the power in the city and jobs in the Safe Town. Mayor Jordan and the city council current govern the city. | + | == Background == |
+ | Welcome to St. Himark, the fictional city used in this visual case study. It is a city of 19 neighborhoods, each with its own characteristics and amenities. St. Himark has a population of 246,839 people and is located in the Oceanus Sea. It is also home to the world-renowned St. Himark Museum, beautiful beaches, and the Wilson Forest Nature Preserve. It is one of the best cities in which to raise a family and work. The Always Safe Nuclear Power Plant provides the majority of the city's power as well as jobs in Safe Town. Mayor Jordan and the city council currently govern the city. | ||
The city runs the following utilities: | The city runs the following utilities: | ||
Line 32: | Line 33: | ||
# WEST PARTON | # WEST PARTON | ||
− | == Problem == | + | == Problem - VAST challenge MC1== |
There was an earthquake northwest of St. Himark. It occurred between 6 April 2020 and 8 April 2020. The city's officials needed to collect data immediately to understand the extent of the damage. They can then allocate resources efficiently to the areas of town where they are needed and dispatch their emergency services. | There was an earthquake northwest of St. Himark. It occurred between 6 April 2020 and 8 April 2020. The city's officials needed to collect data immediately to understand the extent of the damage. They can then allocate resources efficiently to the areas of town where they are needed and dispatch their emergency services. | ||
− | At first, they only have the seismic readings of the earthquake and used that for their first round of dispatch. Now, however, they need more information to get a better gauge of what is going on | + | At first, they only have the seismic readings of the earthquake and used that for their first round of dispatch. Now, however, they need more information to get a better gauge of what is going on at the ground level. |
== Purpose == | == Purpose == | ||
− | To gather the information the city official's need. They launched an app where the citizens can report the intensity of shake and level of damage done to utility infrastructure. The officials can use this tool to record data provided by citizens. The citizens use the app to note down the level of damage seen on a utility/infrastructure building in a Neighbourhood. They can also record the shake intensity in the neighborhood. The data is stored every 5 mins. They may also be some data loss or delay due to power shortages. | + | To gather the information the city officials need, they launched an app where citizens can report the shake intensity and the level of damage done to utility infrastructure. The officials can use this tool to record data provided by citizens. The citizens use the app to note down the level of damage seen on a utility or infrastructure building in a neighborhood. They can also record the shake intensity in the neighborhood. The data is stored every 5 minutes. There may also be some data loss or delay due to power shortages. | ||
− | With all this data, visualizations were created to understand the data faster. Recommendations and decisions can be churned out | + | With all this data, visualizations were created to understand the data faster. Recommendations and decisions can be churned out more quickly to get help to people faster. |
The following questions also have to be answered: | The following questions also have to be answered: | ||
− | # Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for | + | # Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for the reaction? Which parts of the city are the hardest hit? |
# Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response. | # Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response. | ||
# How do conditions change over time? How does uncertainty in data change over time? Describe the key changes you see. | # How do conditions change over time? How does uncertainty in data change over time? Describe the key changes you see. | ||
Line 50: | Line 51: | ||
== Data Gathering and Clean up == | == Data Gathering and Clean up == | ||
− | The data provided in | + | The data is provided in the mc1-reports-data.csv file, which contains the following: | ||
[[File:Screenshot 2019-10-12 at 5.10.14 PM.png|thumb|center|700px|The first few rows of the data provided in the CSV file]] | [[File:Screenshot 2019-10-12 at 5.10.14 PM.png|thumb|center|700px|The first few rows of the data provided in the CSV file]] | ||
The headers were: | The headers were: | ||
# Time: A timestamp of the report made by a citizen. The format is in DD/MM/YY HH:MM:SS | # Time: A timestamp of the report made by a citizen. The format is in DD/MM/YY HH:MM:SS | ||
− | # sewer_and_water: Damage recorded on the sewer and water systems in the neighborhood and at the timestamp. 0 is the lowest level of damage while 10 is the highest | + | # sewer_and_water: Damage recorded on the sewer and water systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest |
− | # power: Damage recorded on the power generation systems in the neighborhood and at the timestamp. 0 is the lowest level of damage while 10 is the highest | + | # power: Damage recorded on the power generation systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest |
− | # roads_and_bridges: Damage recorded on the roads and bridges in the neighborhood and at the timestamp. 0 is the lowest level of damage while 10 is the highest | + | # roads_and_bridges: Damage recorded on the roads and bridges in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest |
− | # medical: Damage recorded on the medical facilities in the neighborhood and at the timestamp. 0 is the lowest level of damage while 10 is the highest | + | # medical: Damage recorded on the medical facilities in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest |
− | # buildings: Damage recorded on the buildings in the neighborhood and at the timestamp. 0 is the lowest level of damage while 10 is the highest | + | # buildings: Damage recorded on the buildings in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest |
# shake_intensity: How violent the shaking was in the neighborhood and at the timestamp. | # shake_intensity: How violent the shaking was in the neighborhood and at the timestamp. | ||
− | # location: Id of the | + | # location: Id of the neighborhood for which the citizen is reporting his readings. (This will be matched to the neighborhood data in the map file) |
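As a rough illustration of how one row with these headers could be read into typed values, here is a minimal Python sketch. It assumes the timestamp really follows the DD/MM/YY HH:MM:SS layout described above and that the columns are spelled as listed (with a lowercase `time` key). The function `parse_report` and the sample row are hypothetical and are not part of the original workflow, which used Tableau Prep.

```python
from datetime import datetime

# Assumption: timestamps follow the DD/MM/YY HH:MM:SS layout described above.
TIME_FORMAT = "%d/%m/%y %H:%M:%S"

# Columns holding readings, as listed in the headers above.
READING_FIELDS = [
    "sewer_and_water", "power", "roads_and_bridges",
    "medical", "buildings", "shake_intensity",
]


def parse_report(row):
    """Turn one CSV row (a dict of strings) into typed values.

    Empty or missing cells become None, since a citizen may skip a reading.
    """
    report = {
        "time": datetime.strptime(row["time"], TIME_FORMAT),
        "location": int(row["location"]),
    }
    for field in READING_FIELDS:
        raw = (row.get(field) or "").strip()
        report[field] = float(raw) if raw else None
    return report


# Hypothetical sample row, for illustration only.
sample = parse_report({
    "time": "08/04/20 08:35:00",
    "sewer_and_water": "7",
    "power": "",
    "roads_and_bridges": "9",
    "medical": "3",
    "buildings": "8",
    "shake_intensity": "6",
    "location": "3",
})
```

Keeping blank cells as `None` rather than 0 matters here, because 0 is a valid "lowest damage" reading.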
− | = Cleaning up the data = | + | == Cleaning up the data == |
Using Tableau Prep Builder, data from the CSV file was moved around and changed a little to make visualizations better. | Using Tableau Prep Builder, data from the CSV file was moved around and changed a little to make visualizations better. | ||
− | To start, I first pivot the medical, power, road_and_bridges, sewer_and_water, buildings and | + | === Pivoting === |
+ | To start, I pivoted the medical, power, roads_and_bridges, sewer_and_water, buildings, and shake_intensity columns. The pivoted column names go into a field called "Source of reading," and their values go into a field called "Readings." | ||
− | [[File:Screenshot 2019-10-12 at 6.34.55 PM.png|1000px|thumb|center|Step 1: | + | [[File:Screenshot 2019-10-12 at 6.34.55 PM.png|1000px|thumb|center|Step 1: Pivoting the dashboard]] |
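The pivot step was done in Tableau Prep Builder, but the same wide-to-long reshaping can be sketched in plain Python to show what it does to the data. The function `pivot_to_long` and the sample row are hypothetical. Skipping blank cells is an assumption of this sketch, not something the original step is known to do.

```python
# Columns that are pivoted; every other column is carried along unchanged.
VALUE_COLUMNS = [
    "medical", "power", "roads_and_bridges",
    "sewer_and_water", "buildings", "shake_intensity",
]


def pivot_to_long(rows):
    """Turn each wide row into one row per non-empty reading."""
    long_rows = []
    for row in rows:
        for column in VALUE_COLUMNS:
            value = row.get(column)
            if value in (None, ""):
                continue  # assumption: skip readings the citizen left blank
            long_rows.append({
                "time": row["time"],
                "location": row["location"],
                "Source of reading": column,
                "Readings": float(value),
            })
    return long_rows


# Hypothetical wide row, for illustration only.
wide = [{
    "time": "08/04/20 08:35:00", "location": "3",
    "medical": "3", "power": "", "roads_and_bridges": "9",
    "sewer_and_water": "7", "buildings": "8", "shake_intensity": "6",
}]
long_rows = pivot_to_long(wide)
```

The long layout is what lets a single "Source of reading" filter drive every chart on a dashboard.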
− | Next | + | === Cleaning up names === |
+ | Next, I renamed the following sources of reading and capitalized the rest: | ||
+ | # roads_and_bridges into "Road and Bridges" | ||
+ | # sewer_and_water into "Sewer and Water" | ||
+ | # shake_intensity into "Shake Intensity" | ||
+ | |||
+ | [[File:Screenshot 2019-10-12 at 9.17.34 PM.png|500px|thumb|center| Step 2: Name clean up]] | ||
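The renaming rule above (three explicit renames, capitalize everything else) can be written as a small lookup. This is only a sketch of the rule as described. The name `clean_source_name` is hypothetical, and the `roads_and_bridges` key follows the spelling used in the CSV header list.

```python
# Explicit renames taken from the list above.
RENAMES = {
    "roads_and_bridges": "Road and Bridges",
    "sewer_and_water": "Sewer and Water",
    "shake_intensity": "Shake Intensity",
}


def clean_source_name(name):
    """Apply an explicit rename if one exists, otherwise capitalize the name."""
    return RENAMES.get(name, name.capitalize())


cleaned = [clean_source_name(n) for n in
           ["medical", "power", "buildings", "sewer_and_water"]]
```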
+ | |||
+ | == Setting up Tableau == | ||
+ | |||
+ | To start my visualization journey, I first added a file called StHimark.shp, taken from the VAST Challenge MC2 data files, to create the interactive map in a Tableau workbook. This file has the following fields: | ||
+ | |||
+ | # ID: Id of the neighborhood | ||
+ | # location: Name of the neighborhood | ||
+ | # Longitude: Longitude coordinate of the neighborhood | ||
+ | # Latitude: Latitude coordinate of the neighborhood | ||
+ | |||
+ | I then dragged and dropped the output file from Tableau Prep into Tableau and inner joined the two sources on ID from StHimark.shp and location from mc1-reports-data.csv: | ||
+ | [[File:Screenshot 2019-10-12 at 10.41.40 PM.png|700px|thumb|center|Step 3: Inner Join the two worksheets]] | ||
+ | |||
+ | This was the result. | ||
+ | |||
+ | [[File:Screenshot 2019-10-12 at 10.52.34 PM.png|700px|thumb|center|Final Result of Data Transformation]] | ||
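For readers unfamiliar with joins, the inner join on ID = location can be sketched as follows. The join itself was done in Tableau. The function `inner_join`, the output key `neighborhood`, and the ID values in the sample data are hypothetical and chosen only for illustration.

```python
def inner_join(reports, neighborhoods):
    """Attach the neighborhood name to each report whose location matches an ID."""
    names_by_id = {n["ID"]: n["location"] for n in neighborhoods}
    joined = []
    for report in reports:
        name = names_by_id.get(report["location"])
        if name is None:
            continue  # inner join: drop reports with no matching neighborhood
        joined.append({**report, "neighborhood": name})
    return joined


# Hypothetical IDs and readings, for illustration only.
neighborhoods = [
    {"ID": 1, "location": "Old Town"},
    {"ID": 2, "location": "Safe Town"},
]
reports = [
    {"location": 1, "Source of reading": "Power", "Readings": 7.0},
    {"location": 2, "Source of reading": "Power", "Readings": 4.0},
    {"location": 99, "Source of reading": "Power", "Readings": 5.0},
]
joined = inner_join(reports, neighborhoods)
```

An inner join silently drops unmatched rows, so it is worth checking that the joined row count equals the report count before building charts on it.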
+ | |||
+ | == Visualization and Interactive techniques== | ||
+ | See the charts here: https://public.tableau.com/profile/parth.goda#!/vizhome/VASTchallengemc1Parthgoda/MainPage?publish=yes | ||
+ | |||
+ | The visualizations I created are all linked from a simple main page that lets the user choose how to view the data: | ||
+ | |||
+ | # Based on each neighborhood in St. Himark | ||
+ | # Based on each reading source. E.g., Building or Medical damage | ||
+ | # Based on the Map of St. Himark progression through the six days | ||
+ | |||
+ | The idea behind this design is that when city officials turn to this dashboard to decide how to allocate resources, they can start from a neighborhood, from a utility they want to work on, or from the timeline of the whole incident. | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! style="font-weight: bold;background: #536a87;color:#ffffff;width: 20%;" | Main Page | ||
+ | |- | ||
+ | | <b>Purpose / Description</b><br> This is the landing page of the application. On this page, the user gets a quick summary of the purpose of the application and an option to dive into three areas of reporting. I created this page as a starting point that the user can navigate away from and keep coming back to. | ||
+ | [[File:Screenshot 2019-10-13 at 1.44.17 AM.png |800px|thumb|center|Main Page of Application]] | ||
+ | <br> | ||
+ | |- | ||
+ | | <b>Interactive Technique</b><br> | ||
+ | I also used interactive techniques to make the dashboard more user-friendly. | ||
+ | <ol> | ||
+ | <li>Select : Button Redirection</li>This uses the Button object from Tableau's catalogue of dashboard objects. When clicked on the Tableau Public website, it redirects the user to the page I mapped it to. | ||
+ | [[File:Screenshot 2019-10-13 at 1.47.47 AM.png|400px|thumb|center|Button used to redirect]] | ||
+ | <li>Select : Hover</li> | ||
+ | Hovering over a button shows extra information about what it does and where it leads. | ||
+ | [[File:Screenshot 2019-10-13 at 1.47.19 AM.png|400px|center|thumb|Button Tooltip]] | ||
+ | </ol> | ||
+ | |} | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! style="font-weight: bold;background: #536a87;color:#ffffff;width: 20%;" | Neighborhood Dashboard | ||
+ | |- | ||
+ | | <b>Purpose / Description</b><br> This dashboard is designed to show what is going on in each neighborhood. When the user first lands on the page, all the charts show data for all the neighborhoods and sources of readings. When the user filters the data by selecting an area on the map or one of the buttons on the list, the data changes to show the real picture in that neighborhood. This serves the use case where city officials need to know what is going on in each neighborhood. They can see which utility is reported most often and which reading value is reported most often. They can also see all the filtered data on a timeline, to understand when and how much was reported. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 4.17.36 PM.png|700px|thumb|center|The top half of the neighborhoods dashboard]] | ||
+ | [[File:Screenshot 2019-10-13 at 4.17.52 PM.png|700px|thumb|center|The bottom half of the neighborhoods dashboard]] | ||
+ | |||
+ | |- | ||
+ | | <b>Interactive Technique</b><br> | ||
+ | <ol> | ||
+ | <li>Tooltips </li> | ||
+ | A user can put his cursor over any data point or visual elements in the charts to find out specific details about that data point. A small box will appear with relevant information. | ||
+ | [[File:Screenshot 2019-10-13 at 4.26.14 PM.png|400px|thumb|center|Tooltips that appear when cursor is placed on an element]] | ||
+ | <br> | ||
+ | <li>Filter</li> | ||
+ | There are two filters on this dashboard: | ||
+ | # Neighborhood | ||
+ | # Source of reading | ||
+ | |||
+ | Using these two filters, users can find specific information and make decisions about the things they are interested in. On this dashboard, the main filter is the map filter on the right called "Map reference." By choosing one area, they can see the timeline, bar graphs, and heat maps for everything reported in that area. They can also apply a second filter from the "Source of reading" list to get more information. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 4.34.34 PM.png|700px|thumb|center|The two filters available on the neighborhood dashboard]] | ||
+ | <li>Connect</li> | ||
+ | Both the heat map and the total readings collected bar graph are time-based visualizations, so when a user hovers over one of the two charts, the same data point is highlighted in the other chart too. This is more user-friendly because it helps the user keep track of the data points. | ||
+ | [[File:Screenshot 2019-10-13 at 4.52.19 PM.png|700px|thumb|center|Two connected graphs highlight together when hovered over]] | ||
+ | </ol> | ||
+ | |- | ||
+ | | <b>Types of Charts used</b><br> | ||
+ | There are three types of charts used: | ||
+ | # Bar graphs | ||
+ | # Heat Maps | ||
+ | # Bar graphs over a timeline | ||
+ | [[File:Screenshot 2019-10-13 at 4.58.32 PM.png|700px|thumb|center|All the charts used in the neighbourhood dashboard]] | ||
+ | |} | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! style="font-weight: bold;background: #536a87;color:#ffffff;width: 20%;" | Utility Dashboard | ||
+ | |- | ||
+ | | <b>Purpose / Description</b><br> This dashboard is designed to show how each utility is affected in each neighborhood. When the user first lands on the page, all the charts show data for all sources of readings and rank how badly damaged the neighborhoods are. When the user filters the data by selecting one of the buttons on the Source of the readings list, the ranking is recomputed for the selected source of reading only. This serves the use case where city officials need to allocate specific resources such as emergency power or building repairs. They can see which neighborhoods are reported most often and what the total and average readings are. They can also see all the filtered data on a timeline, to understand when and how much was reported. | ||
+ | [[File:Screenshot 2019-10-13 at 5.38.57 PM.png|700px|thumb|center|The Utility Dashboard]] | ||
+ | |- | ||
+ | | <b>Interactive Technique</b><br> | ||
+ | <ol> | ||
+ | <li>Tooltips </li> | ||
+ | A user can put his cursor over any data point or visual elements in the charts to find out specific details about that data point. A small box will appear with relevant information. | ||
+ | [[File:Screenshot 2019-10-13 at 5.40.12 PM.png|700px|thumb|center|Tool tip observed in the utility dashboard]] | ||
+ | <br> | ||
+ | <li>Filter</li> | ||
+ | There are two filters on this dashboard: | ||
+ | # Neighborhood | ||
+ | # Source of reading | ||
+ | |||
+ | Using these two filters, users can find specific information and make decisions about the things they are interested in. On this dashboard, the main filter is the source of readings filter on the right called "Select Source of Reading." By choosing one reading type, they can see the timeline, bar graphs, and choropleth map for that source of reading. They can also apply a second filter from "Select Neighbourhood(s)" to get more area-specific information. | ||
+ | [[File:Screenshot 2019-10-13 at 5.44.17 PM.png|700px|thumb|center|Map and utility filter]] | ||
+ | |||
+ | <li>Connect</li> | ||
+ | These four graphs are all connected through hovering. The neighborhood under the cursor is highlighted in all four charts, so users can track that neighborhood's place in each of them. They can see its rank by both total and average in the comparison charts and on the maps. | ||
+ | [[File:Screenshot 2019-10-13 at 5.53.46 PM.png|700px|thumb|center|4 Connected charts]] | ||
+ | </ol> | ||
+ | |- | ||
+ | | <b>Types of Charts used</b><br> | ||
+ | There are three types of charts used: | ||
+ | # Bar graphs | ||
+ | [[File:Screenshot 2019-10-13 at 6.18.56 PM.png|500px|thumb|center|A sample bar graph in the Dashboard ]] | ||
+ | # Bar graphs over a timeline | ||
+ | [[File:Screenshot 2019-10-13 at 6.19.05 PM.png|500px|thumb|center|A timeline-based bar graph]] | ||
+ | # Choropleth map | ||
+ | [[File:Screenshot 2019-10-13 at 6.18.59 PM.png|500px|thumb|center|A choropleth map chart ]] | ||
+ | |} | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! style="font-weight: bold;background: #536a87;color:#ffffff;width: 20%;" | Time Dashboard | ||
+ | |- | ||
+ | | <b>Purpose / Description</b><br> | ||
+ | This dashboard was designed to understand what kind of data was reported during the earthquake. Users can see how each neighbourhood was affected in terms of the damage done to infrastructure and shake intensity over a period of time. They can specify a period of time in the maps or look at the overall picture on the heat maps. | ||
+ | [[File:Screenshot 2019-10-13 at 6.59.27 PM.png|700px|thumb|center|Top Half of the time Dashboard]] | ||
+ | [[File:Screenshot 2019-10-13 at 6.59.34 PM.png|700px|thumb|center|Bottom Half of the time Dashboard]] | ||
+ | |- | ||
+ | | <b>Interactive Technique</b><br> | ||
+ | <ol> | ||
+ | <li>Tooltips </li> | ||
+ | A user can put his cursor over any data point or visual elements in the charts to find out specific details about that data point. A small box will appear with relevant information. | ||
+ | [[File:Screenshot 2019-10-13 at 7.02.53 PM.png|500px|thumb|center|Tooltip in the time Dashboard ]] | ||
+ | <br> | ||
+ | <li>Filter</li> | ||
+ | There are two filters on this dashboard: | ||
+ | # Source of reading | ||
+ | # Time pages | ||
+ | |||
+ | A user can use the source of readings filter to get charts on each area of utility or the shake intensity. This filters the data in all four charts. However, the time pages filter the choropleth maps for the specific moment selected. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 7.06.44 PM.png|500px|thumb|center|Time Dashboard Filters]] | ||
+ | <li>Connect</li> | ||
+ | The two heat maps are connected, as they share the same timeline and data structure. When the user hovers over any point in one heat map, the same time period is highlighted in the other heat map. | ||
+ | [[File:Screenshot 2019-10-13 at 7.07.52 PM.png|500px|thumb|center|Hovering over one Heatmap]] | ||
+ | </ol> | ||
+ | |- | ||
+ | | <b>Types of Charts used</b><br> | ||
+ | There are two types of charts used: | ||
+ | # Choropleth map | ||
+ | [[File:Screenshot 2019-10-13 at 7.12.23 PM.png|500px|thumb|center|Time based Choropleth Map]] | ||
+ | # Heat Map | ||
+ | [[File:Screenshot 2019-10-13 at 6.59.34 PM.png|700px|thumb|center|Heat Maps]] | ||
+ | |} | ||
+ | |||
+ | == Task Results == | ||
+ | |||
+ | === Question 1 === | ||
+ | |||
+ | ==== Question: Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for response? Which parts of the city are the hardest hit?==== | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Point !! Recommendation | ||
+ | |- | ||
+ | | 1 || To start I would go to the Utility Dashboard to understand the different ways the city has been hit. Change the "source of readings" filter to all, building, power, etc. to analyze which neighborhoods need help and are worst hit. | ||
+ | |- | ||
+ | | 2 || Let's start with overall reports: | ||
+ | On both average and total readings, the following towns seem to be hardest hit in descending order | ||
+ | # Old Town | ||
+ | # Broadview | ||
+ | # Scenic Vista | ||
+ | # Easton and Terrapin Springs | ||
+ | [[File:Screenshot 2019-10-13 at 7.39.53 PM.png|700px|thumb|center|Overall Worst Hit]] | ||
+ | |||
+ | However, for the first response, the city officials should use the shake intensity as this represents the towns that are closest to the earthquake epicenter | ||
+ | # Old Town | ||
+ | # Pepper Mill | ||
+ | # Wilson Forest | ||
+ | # Safe Town | ||
+ | # Easton | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 7.52.23 PM.png|700px|thumb|center|Shake Intensity Map]] | ||
+ | They should also pay special attention to Safe Town, as the nuclear reactor is there. There will be some building damage and power damage at the reactor. In the worst case, this could resemble the Fukushima Daiichi nuclear disaster. | ||
+ | |- | ||
+ | | 3 || However, to make better decisions on which neighborhoods deserve special attention, it is better to filter the data by source of reading. | ||
+ | From the charts, the following would need the most help to fix the buildings | ||
+ | # Old Town | ||
+ | # Broadview | ||
+ | # Chapparal | ||
+ | # Scenic Vista | ||
+ | # East Parton | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 7.49.03 PM.png|700px|thumb|center|Most Building Damage]] | ||
+ | |- | ||
+ | | 4 || Next would be which areas need special medical emergency response: | ||
+ | # Old Town | ||
+ | # Broadview | ||
+ | # Palace Hills | ||
+ | # Southton | ||
+ | # Downtown | ||
+ | |||
+ | These areas are a priority because they share seven hospitals between them. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 7.51.44 PM.png|700px|thumb|center|Hospital Damage Chart]] | ||
+ | |- | ||
+ | | 5 || For power, roads and bridges, and sewer and water response, the data shows that all neighborhoods need about the same amount of help. The exceptions are Old Town and Scenic Vista, which stand out with the most reports recorded by citizens. | ||
+ | |||
+ | For example, here is the power total reading chart: | ||
+ | [[File:Screenshot 2019-10-13 at 8.08.28 PM.png|500px|thumb|center|Power Chart]] | ||
+ | |||
+ | Scenic Vista is not very close to the epicentre of the earthquake, so its numbers might be over-reported. Given the socio-economic class of its residents, who have better infrastructure and equipment and possibly an upper-class mindset, it is a little suspicious that its numbers match those of neighborhoods closer to the epicentre. | ||
+ | |} | ||
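The rankings in this table come from Tableau charts of total and average readings per neighborhood. The underlying calculation can be sketched in Python as shown below. The function `rank_neighborhoods`, the field names, and the sample readings are hypothetical and do not reproduce the actual data.

```python
from collections import defaultdict


def rank_neighborhoods(long_rows, source=None):
    """Rank neighborhoods by total reading, optionally for one source of reading.

    Returns a list of (neighborhood, total, average), highest total first.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in long_rows:
        if source is not None and row["Source of reading"] != source:
            continue
        totals[row["neighborhood"]] += row["Readings"]
        counts[row["neighborhood"]] += 1
    ranking = [(name, totals[name], totals[name] / counts[name])
               for name in totals]
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


# Made-up readings, for illustration only.
rows = [
    {"neighborhood": "Old Town", "Source of reading": "Buildings", "Readings": 9.0},
    {"neighborhood": "Old Town", "Source of reading": "Buildings", "Readings": 8.0},
    {"neighborhood": "Safe Town", "Source of reading": "Buildings", "Readings": 5.0},
    {"neighborhood": "Old Town", "Source of reading": "Power", "Readings": 6.0},
]
building_ranking = rank_neighborhoods(rows, source="Buildings")
```

Totals favor neighborhoods that submit many reports, while averages do not, which is why it helps to look at both before prioritizing.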
+ | |||
+ | === Question 2 === | ||
+ | |||
+ | ==== Question: Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response. ==== | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Point !! Recommendation | ||
+ | |- | ||
+ | | 1 || One reason the reports are not reliable is the many gaps in the data, especially on 8 April 2020, which is around the time the earthquake happened. | ||
+ | |||
+ | For example, in this heat map, we can clearly see the missing information in some neighborhoods close to the epicentre: | ||
+ | [[File:Screenshot 2019-10-13 at 8.29.46 PM.png|700px|thumb|center|Heat Map with missing data]] | ||
+ | |||
+ | These neighborhoods are Old Town, Broadview, Chapparal, and Scenic Vista. All the readings after the gaps are much higher. It seems that a backlog of data may have rushed in once the network connection was restored. Collecting data while the earthquake hit was the main goal of the app, so the missing data during the earthquake makes the whole process less reliable. | ||
+ | |- | ||
+ | | 2 || There is a possibility of inflated numbers due to mindsets and socio-economic class. I would like to highlight the following: | ||
+ | # SCENIC VISTA | ||
+ | # NORTHWEST | ||
+ | # PALACE HILLS | ||
+ | # BROADVIEW | ||
+ | |||
+ | These areas are expensive places with trendy and rich patrons, which also come with better infrastructure, utilities, maintenance, and security. | ||
+ | The people there might be more educated and tech-savvy and more aware of the app than the rest of the town. It is also possible that an upper-class, elite, or entitled mentality affects the numbers citizens report and how often they report. In the case of Broadview, most of its citizens are of the older generation, and some level of fear or panic might affect the numbers more than usual. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 8.54.15 PM.png|700px|thumb|center|Neighbourhoods of uncertainty]] | ||
+ | |||
+ | |- | ||
+ | | 3 || Similar to the observations in point 1, the following areas had missing data a day or so after the earthquake happened: | ||
+ | # Easton | ||
+ | # Oak Willow | ||
+ | # Old Town | ||
+ | # Pepper Mill | ||
+ | # Safe Town | ||
+ | # Scenic Vista | ||
+ | This definitely affected the reliability and certainty of the data and the charts, especially when trying to understand the aftermath of the earthquake and see which areas still need attention after the first response has been sent out. | ||
+ | [[File:Screenshot 2019-10-13 at 9.01.50 PM.png|500px|thumb|center|More missing data after the earthquake]] | ||
+ | |- | ||
+ | | 4 || One way the neighborhoods are providing certain data is the shake intensity. Areas close to the epicentre do show higher average readings compared to areas further away. This also mirrors the information in the earthquake shake map provided. | ||
+ | |||
+ | Thus, to some extent, data from the following towns can be said to be more certain: | ||
+ | # Old Town | ||
+ | # Safe Town | ||
+ | # Pepper Mill | ||
+ | # Wilson Forest | ||
+ | |||
+ | [[File:Mc1-majorquake-shakemap.png|500px|thumb|right|Provided shake map by VAST challenge organisers]] | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 9.23.25 PM.png|500px|thumb|left|Observed average shake intensity]] | ||
+ | |||
+ | |||
+ | |- | ||
+ | | 5 || One issue with the data collected is in the Palace Hills and Northwest neighborhoods. As seen in this picture, these places have the highest concentrations of roads in St. Himark. | ||
+ | [[File:Mc1-majorquake-shakemap.png|500px|thumb|center|Provided Roadmap by VAST challenge organisers]] | ||
+ | |||
+ | However, the data shows that not much damage is seen or reported on the roads in these areas. That is counterintuitive, as these areas would be expected to record the most damage to roads and bridges. On the other hand, Old Town and Scenic Vista record and report the most damage. This is not realistic, as these neighborhoods do not have as many roads as Palace Hills and Northwest. | ||
+ | |||
+ | [[File:Screenshot 2019-10-13 at 9.38.00 PM.png|800px|thumb|center|Roads and bridges Data]] | ||
+ | |- | ||
+ | | 6 || One neighborhood that was not providing reliable reports is Wilson Forest. Most of the data is missing, either because of the remoteness of the area or because the neighborhood is not populated enough to provide the city officials with data. | ||
+ | |||
+ | In this heat map of all data collected in Wilson Forest, you can see that there is almost no data: | ||
+ | [[File:Screenshot 2019-10-13 at 9.58.26 PM.png|500px|thumb|center|Wilson Forest heat map]] | ||
+ | |||
+ | |} | ||
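The gaps discussed in points 1 and 3 were spotted visually on the heat maps. A programmatic check could look like the sketch below, which lists stretches with no reports for one neighborhood. The function `find_gaps`, the one-hour threshold, and the sample timestamps are assumptions made for illustration. The only figure taken from the text is that data is stored every 5 minutes.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_silence=timedelta(hours=1)):
    """Return (last_seen, next_seen) pairs where reporting stopped for too long.

    max_silence is an assumed threshold; with data stored every 5 minutes,
    an hour without reports is treated here as a gap.
    """
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_silence:
            gaps.append((earlier, later))
    return gaps


# Made-up report times for one neighborhood, for illustration only.
times = [
    datetime(2020, 4, 8, 8, 0),
    datetime(2020, 4, 8, 8, 5),
    datetime(2020, 4, 8, 8, 10),
    datetime(2020, 4, 8, 14, 30),
    datetime(2020, 4, 8, 14, 35),
]
gaps = find_gaps(times)
```

Comparing the readings just before and just after each gap would help test the backlog explanation given in point 1.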
+ | |||
+ | === Question 3 === | ||
+ | |||
+ | ==== Question: How do conditions change over time? How does uncertainty in data change over time? Describe the key changes you see. Limit your response to 500 words and 8 images. ==== | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Point !! Recommendation | ||
+ | |- | ||
+ | | 1 || In most neighborhoods, areas of Utility and shake intensity, the data follows this trend: | ||
+ | [[File:Screenshot 2019-10-13 at 10.01.00 PM.png|500px|thumb|center|Total data collected on a timeline]] | ||
+ | |||
+ | Except on 6/4/2020 at 4-5 pm, when there may have been some pre-earthquake shaking, the readings are all mild or manageable until about 8 am on 8/4/2020, which is approximately when the earthquake happened. After that, there are some aftershock readings, or a pile-up of data from cut-off neighborhoods, early on 9/4/2020 and in the afternoon of 10/4/2020. | ||
+ | |- | ||
+ | | 2 || Uncertainty does increase over time, mostly after the earthquake. This might be due to damage to infrastructure that caused some data to be missing in some neighborhoods of St. Himark. Before the earthquake, there was not much missing data (except for Wilson Forest) even though there was construction going on in some places. Afterward, more and more data went missing, especially in neighborhoods that were closer to the epicentre and had more damaged buildings and roads. | ||
+ | |||
+ | This makes sense, as there were reports of damage to power infrastructure all around town after the earthquake. Without power, many citizens would not be able to power their devices and log the damage and shake data on the app. These charts show that every neighborhood in St. Himark experienced some power damage. | ||
+ | [[File:Screenshot 2019-10-13 at 11.10.06 PM.png|500px|thumb|center|Average Power Damage per neighbourhood]] | ||
+ | |||
+ | |- | ||
+ | | 3 || Time also affected uncertainty during the pre-earthquake shaking on 6/4/2020 at 4-5 pm. The heat map shows that the average readings dropped across all neighborhoods: | ||
+ | [[File:Screenshot 2019-10-13 at 10.35.10 PM.png|500px|thumb|center|Avg readings Heat Map: notice mid-day 6/4/2020]] | ||
+ | However, at the same time, the number of records coming in overall increased from that same period onwards: | ||
+ | [[File:Screenshot 2019-10-13 at 10.35.29 PM.png|500px|thumb|center|Total readings coming in. Notice 6/4/2020 mid-day]] | ||
+ | |||
+ | Something is wrong here: it is unlikely that more people would log onto the app after pre-earthquake shaking only to record lower readings of shaking and damage. | ||
+ | |} | ||
+ | |||
+ | == Reference == | ||
+ | |||
+ | Icon for neighbourhood: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&ved=2ahUKEwiL1o3QzpPlAhWw6XMBHfqeBi4Qjhx6BAgBEAI&url=https%3A%2F%2Fwww.logosurfer.com%2Flogo%2Fthe-neighbourhood-logo&psig=AOvVaw0BBHU8ychLfogor4jV3f7K&ust=1570862792003906 | ||
+ | |||
+ | Icon for gauge https://www.123rf.com/clipart-vector/guage.html?oriSearch=utility&sti=nh1v9o3lxchgqttrvu| | ||
+ | |||
+ | Home icon: https://www.iconfinder.com/icons/185038/home_house_streamline_icon | ||
+ | |||
+ | Fukushima disaster: https://en.wikipedia.org/wiki/Fukushima_Daiichi_nuclear_disaster | ||
+ | |||
+ | Model answers: | ||
+ | # Gwendoline Tan https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Gwendoline_Tan_Wan_Xin | ||
+ | # Lim Kim Yong https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Lim_Kim_Yong | ||
+ | # Chew Yuxi https://public.tableau.com/profile/yuxi7903#!/vizhome/VA_Assignment_Chew_Yuxi/OAQStationsTimeSeries | ||
+ | # Tan Kee Hock https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Tan_Kee_Hock | ||
+ | |||
+ | |||
+ | == Comments == |
Latest revision as of 23:26, 13 October 2019
Background
Welcome to St. Himark, the fictional city used in this visual case study. It is a city of 19 neighborhoods, each with its own characteristics and amenities. St. Himark has a population of 246,839 people and is located in the Oceanus Sea. It is also home to the world-renowned St. Himark Museum, beautiful beaches, and the Wilson Forest Nature Preserve. It is one of the best cities in which to raise a family and work. The Always Safe Nuclear Power Plant provides the majority of the city's power, as well as jobs in Safe Town. Mayor Jordan and the city council currently govern the city.
The city runs the following utilities:
- Water and Sewage
- Road and Bridge
- Gas
- Garbage
- Power
There is always construction going on in the above utilities.
St. Himark is divided into 19 neighborhoods:
- PALACE HILLS
- NORTHWEST
- OLD TOWN
- SAFE TOWN
- SOUTHWEST
- DOWNTOWN
- WILSON FOREST
- SCENIC VISTA
- BROADVIEW
- CHAPPARAL
- TERRAPIN SPRINGS
- PEPPER MILL
- CHEDDARFORD
- EASTON
- WESTON
- SOUTHTOWN
- OAK WILLOW
- EAST PARTON
- WEST PARTON
Problem - VAST challenge MC1
There was an earthquake northwest of St. Himark. It occurred between 6 April 2020 and 8 April 2020. The city's officials needed to collect data immediately to understand the extent of the damage. They could then dispatch their emergency services and allocate resources efficiently to the areas of town where they were needed.
At first, they had only the seismic readings of the earthquake and used those for their first round of dispatch. Now, however, they need more information to get a better gauge of what is going on at ground level.
Purpose
To gather the information the city officials need, they launched an app where citizens can report the intensity of shaking and the level of damage done to utility infrastructure. The officials use this tool to record the data provided by citizens. Citizens use the app to note the level of damage seen on a utility or infrastructure building in a neighbourhood. They can also record the shake intensity in the neighborhood. The data is stored every 5 minutes. There may also be some data loss or delay due to power shortages.
With all this data, visualizations were created to make the data faster to understand. Recommendations and decisions can then be made more quickly, so that help reaches people faster.
The following questions also have to be answered:
- Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for the reaction? Which parts of the city are the hardest hit?
- Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response.
- How do conditions change over time? How does uncertainty in the data change over time? Describe the key changes you see.
Data Gathering and Clean up
The data is provided in a file named mc1-reports-data.csv with the following headers:
- Time: A timestamp of the report made by a citizen. The format is in DD/MM/YY HH:MM:SS
- sewer_and_water: Damage recorded on the sewer and water systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- power: Damage recorded on the power generation systems in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- roads_and_bridges: Damage recorded on the roads and bridges in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- medical: Damage recorded on the medical facilities in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- buildings: Damage recorded on the buildings in the neighborhood and at the timestamp. 0 is the lowest level of damage, while 10 is the highest
- shake_intensity: How violent the shaking was in the neighborhood and at the timestamp.
- location: Id of the neighborhood for which the citizen is reporting his readings. (This will be matched to the neighborhood data in the map file)
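The column layout above can be read with a few lines of Python. This is a minimal sketch rather than the workflow actually used (the assignment used Tableau Prep Builder); the lowercase `time` column name, the sample rows, and the `parse_reports` helper are all assumptions made for illustration, and the timestamp pattern follows the DD/MM/YY HH:MM:SS format stated above.

```python
import csv
import io
from datetime import datetime

# Column names as listed above. The two sample rows are made up.
READING_FIELDS = ["sewer_and_water", "power", "roads_and_bridges",
                  "medical", "buildings", "shake_intensity"]

SAMPLE_CSV = """time,sewer_and_water,power,roads_and_bridges,medical,buildings,shake_intensity,location
06/04/20 00:00:00,1,2,3,,4,0,1
08/04/20 08:35:00,9,10,8,7,9,6,3
"""

def parse_reports(text, time_format="%d/%m/%y %H:%M:%S"):
    """Parse report rows; empty cells become None instead of 0."""
    reports = []
    for row in csv.DictReader(io.StringIO(text)):
        report = {
            "time": datetime.strptime(row["time"], time_format),
            "location": int(row["location"]),
        }
        for field in READING_FIELDS:
            cell = row[field].strip()
            report[field] = float(cell) if cell else None
        reports.append(report)
    return reports

reports = parse_reports(SAMPLE_CSV)
```

Keeping empty cells as `None` rather than 0 matters here, because a 0 means "lowest level of damage" while an empty cell means nothing was reported.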
Cleaning up the data
Using Tableau Prep Builder, the data from the CSV file was reshaped and renamed slightly to make it easier to visualize.
Pivoting
To start, I pivoted the medical, power, roads_and_bridges, sewer_and_water, buildings, and shake_intensity columns. The pivoted column names are called "Source of reading," and their values are called "Readings."
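The pivot step turns each wide report into one row per source of reading. The sketch below shows the same reshaping in plain Python, assuming a report is a dictionary keyed by the header names listed earlier; `pivot_report` and the sample report are hypothetical and stand in for what Tableau Prep does internally.

```python
from datetime import datetime

# The six columns that are pivoted, using the header names listed earlier.
SOURCE_COLUMNS = ["medical", "power", "roads_and_bridges",
                  "sewer_and_water", "buildings", "shake_intensity"]

def pivot_report(report):
    """Turn one wide report into one row per source of reading."""
    return [
        {
            "time": report["time"],
            "location": report["location"],
            "Source of reading": column,
            "Readings": report.get(column),  # None when the cell was empty
        }
        for column in SOURCE_COLUMNS
    ]

# Made-up report for illustration only.
wide = {"time": datetime(2020, 4, 8, 8, 35), "location": 3,
        "medical": None, "power": 10, "roads_and_bridges": 8,
        "sewer_and_water": 9, "buildings": 9, "shake_intensity": 6}
long_rows = pivot_report(wide)
```

With the data in this long form, a single "Source of reading" filter can drive every chart, which is what the dashboards rely on.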
Cleaning up names
Next, I renamed the following sources of reading and capitalized the rest:
- roads_and_bridges into "Road and Bridges"
- sewer_and_water into "Sewer and Water"
- shake_intensity into "Shake Intensity"
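The renaming rule can be expressed as a small lookup with a fallback. This is only an illustrative sketch: `RENAMES` and `display_name` are hypothetical names, and the key for the road column follows the header list (roads_and_bridges).

```python
# Explicit renames from the list above; every other source is just capitalized.
RENAMES = {
    "roads_and_bridges": "Road and Bridges",
    "sewer_and_water": "Sewer and Water",
    "shake_intensity": "Shake Intensity",
}

def display_name(source):
    """Return the label shown in the charts for a raw column name."""
    return RENAMES.get(source, source.capitalize())

labels = [display_name(s) for s in ["medical", "power", "sewer_and_water"]]
```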
Setting up Tableau
To start my visualization journey, I first added a file called StHimark.shp, taken from the data files of the VAST Challenge MC2, to create the interactive map in a Tableau workbook. This file has the following fields:
- ID: Id of the neighborhood
- location: Name of the neighborhood
- Longitude: Longitude coordinate of the neighborhood
- Latitude: Latitude coordinate of the neighborhood
I then dragged and dropped the output file from Tableau Prep into Tableau. I inner joined the two sources on ID from StHimark.shp and location from mc1-reports-data.csv:
This was the result.
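For readers without Tableau, the inner join can be sketched as follows. The ID-to-name mapping assumes that the shapefile IDs follow the order of the neighborhood list above, which is not confirmed here; the reports and the `inner_join` helper are made up for illustration.

```python
# Assumed mapping from shapefile ID to neighborhood name (first three only).
NEIGHBORHOODS = {1: "Palace Hills", 2: "Northwest", 3: "Old Town"}

def inner_join(reports, neighborhoods):
    """Keep only reports whose location matches a known neighborhood ID."""
    joined = []
    for report in reports:
        name = neighborhoods.get(report["location"])
        if name is not None:
            joined.append({**report, "neighborhood": name})
    return joined

# Hypothetical reports; location 99 has no match and is dropped by the join.
sample_reports = [
    {"location": 3, "power": 10},
    {"location": 99, "power": 4},
    {"location": 1, "power": 2},
]
joined = inner_join(sample_reports, NEIGHBORHOODS)
```

An inner join silently drops unmatched rows, so it is worth comparing the row counts before and after the join to confirm that no reports were lost.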
Visualization and Interactive techniques
See the charts here: https://public.tableau.com/profile/parth.goda#!/vizhome/VASTchallengemc1Parthgoda/MainPage?publish=yes
The visualizations I created are all linked from a simple main page that lets the user choose one of three views:
- By neighborhood in St. Himark
- By reading source, e.g., Building or Medical damage
- By map of St. Himark, showing the progression through the six days
The idea behind this design is that when city officials turn to this dashboard to decide how to allocate resources, they can base their decisions on a neighborhood, on a utility they want to work on, or on the timeline of the whole incident.
Main Page |
---|
Purpose / Description This is the landing page of the application. On this page, the user gets a quick summary of the purpose of the application and the option to dive into three areas of reporting. I created this page as a starting point that the user can navigate away from and keep coming back to.
|
Interactive Technique I also use interactive techniques to make the dashboard more user-friendly.
|
Neighborhood Dashboard |
---|
Purpose / Description This dashboard is designed to show what is going on in each neighbourhood. When the user first lands on the page, all the charts show data for all neighborhoods and sources of readings. When the user filters the data by selecting an area on the map or one of the buttons on the list, the charts update to show the picture in that neighborhood. This serves the use case where city officials need to know what is going on in each neighborhood. They can see which utility is reported most and which reading is reported most. They can also see all the filtered data on a timeline, to understand when and how much was reported. |
Interactive Technique
|
Types of Charts used There are three types of charts used:
|
Utility Dashboard |
---|
Purpose / Description This dashboard is designed to show how each utility is affected in each neighbourhood. When the user first lands on the page, all the charts show data for all sources of readings and rank how badly damaged the neighborhoods are. When the user filters the data by selecting one of the buttons on the Source of the readings list, the ranking is recalculated for the selected source of reading. This serves the use case where city officials need to allocate specific resources, such as emergency power or building repairs. They can see which neighborhoods report the most and what the total and average readings are. They can also see all the filtered data on a timeline, to understand when and how much was reported. |
Interactive Technique
|
Types of Charts used There are three types of charts used:
|
Time Dashboard |
---|
Purpose / Description This dashboard was designed to understand what kind of data was reported during the earthquake. Users can see how each neighbourhood was affected in terms of the damage done to infrastructure and shake intensity over a period of time. They can specify a period of time in the maps or look at the overall picture on the heat maps. |
Interactive Technique
|
Types of Charts used There are two types of charts used:
|
Task Results
Question 1
Question: Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for the reaction? Which parts of the city are the hardest hit?
Point | Recommendation |
---|---|
1 | To start, I would go to the Utility Dashboard to understand the different ways the city has been hit. Change the "source of readings" filter to all, building, power, etc. to analyze which neighborhoods are worst hit and need help. |
2 | Let's start with overall reports:
On both average and total readings, the following towns seem to be the hardest hit, in descending order:
However, for the first response, the city officials should use the shake intensity, as this represents the towns that are closest to the earthquake epicenter.
They should also pay special attention to Safe Town, as the nuclear reactor is there. There will be some building damage and power damage to the reactor. In the worst case, it could resemble the Fukushima Daiichi nuclear disaster. |
3 | However, to make better decisions on which neighborhoods deserve special attention, it is better to filter the data by source.
From the charts, the following would need the most help to fix their buildings:
|
4 | Next are the areas that need a special medical emergency response:
These are a priority, as these areas share seven hospitals between them. |
5 | For power, roads and bridges, and sewage response, the data shows that all towns need about the same amount of help. The exceptions are Old Town and Scenic Vista, which stand out with the largest number of reports recorded by citizens.
For example, here is the power total reading chart: Given the socio-economic class of the residents of Scenic Vista, who have better infrastructure and equipment and possibly an upper-class mindset, their numbers might be over-reported, since they are not very close to the epicentre of the earthquake. It is a little suspicious that their numbers match those of neighborhoods closer to the epicentre. |
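The ranking behind these recommendations (average and total reading per neighborhood, optionally filtered by source) can be sketched in a few lines. This is an assumption-laden illustration, not the Tableau calculation itself: the row keys, the `rank_neighborhoods` helper, and the sample values are hypothetical.

```python
from collections import defaultdict

def rank_neighborhoods(rows, source=None):
    """Rank neighborhoods by average reading, highest first.

    Returns a list of (neighborhood, average, total) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if source is not None and row["source"] != source:
            continue
        if row["reading"] is None:  # skip empty cells
            continue
        totals[row["neighborhood"]] += row["reading"]
        counts[row["neighborhood"]] += 1
    ranking = [(name, totals[name] / counts[name], totals[name])
               for name in totals]
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking

# Made-up readings for illustration only.
rows = [
    {"neighborhood": "Old Town", "source": "Power", "reading": 9},
    {"neighborhood": "Old Town", "source": "Power", "reading": 7},
    {"neighborhood": "Scenic Vista", "source": "Power", "reading": 6},
    {"neighborhood": "Old Town", "source": "Buildings", "reading": 10},
    {"neighborhood": "Easton", "source": "Power", "reading": 2},
]
power_ranking = rank_neighborhoods(rows, source="Power")
```

Looking at the average and the total side by side helps separate neighborhoods that are badly damaged from those that simply submit many reports.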
Question 2
Question: Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response.
Point | Recommendation |
---|---|
1 | One reason the reports are not reliable is the many gaps in the data, especially on 8 April 2020, which is around the time the earthquake happened.
For example, in this heat map, we can clearly see the missing information in some neighborhoods close to the epicentre: These neighborhoods are Old Town, Broadview, Chapparal, and Scenic Vista. All the readings after the gaps are much higher. It seems that a backlog of data rushed in once the network connection was restored. Collecting data while the earthquake hit was the main goal of the app, so the missing data during the earthquake makes the whole process less reliable. |
2 | There is a possibility of inflated numbers due to mindsets and socio-economic class. I would like to highlight the following:
These areas are expensive places with trendy and rich patrons, which also come with better infrastructure, utilities, maintenance, and security. The people might be more educated and tech-savvy and more aware of the app than the rest of the town. It is also possible that an upper-class, elite, or entitled mentality affects the numbers reported by the citizens and their frequency. In the case of Broadview, most of its citizens are of the older generation, and some level of fear or panic might affect the numbers more than usual. |
3 | Similar to the observations in point 1, the following areas had missing data a day or so after the earthquake happened:
This definitely affected the reliability and certainty of the data and the charts, especially for understanding the aftermath of the earthquake and seeing which areas still need attention after the first response has been sent out. |
4 | One area where the neighborhoods are providing reliable data is the shake intensity. Areas close to the epicentre do show higher average readings than areas further away. This also mirrors the information in the earthquake shake map provided.
Thus, to some extent, data from the following towns can be said to be more certain:
|
5 | One issue with the data collected is in the Palace Hills and Northwest neighborhoods. As seen in this picture, these places have the highest concentrations of roads in St. Himark.
However, the data shows that not much damage is seen or reported on the roads in this area. That contradicts common logic, as these areas should by default record the most damage to roads and bridges. On the other hand, Old Town and Scenic Vista record and report the most damage. This is not realistic, as these towns do not have as many roads as Palace Hills and Northwest. |
6 | One neighborhood that is not providing reliable reports is Wilson Forest. Most of the data is missing, either because of the remoteness of the area or because the neighborhood is not populated enough to provide the city officials with data.
In this heat map of all data collected in Wilson Forest, you can see that there is almost no data: |
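The gaps discussed in points 1, 3, and 6 can also be located programmatically by listing the hours in which a neighborhood sent no reports at all. The sketch below assumes hourly buckets and a `start` time that falls exactly on the hour; `missing_hours` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def missing_hours(report_times, start, end):
    """Return every hour in [start, end) that has no report at all.

    `start` is assumed to fall exactly on the hour.
    """
    seen = {t.replace(minute=0, second=0, microsecond=0) for t in report_times}
    missing = []
    hour = start
    while hour < end:
        if hour not in seen:
            missing.append(hour)
        hour += timedelta(hours=1)
    return missing

# Made-up report times for one neighborhood on the day of the earthquake.
times = [
    datetime(2020, 4, 8, 8, 5),
    datetime(2020, 4, 8, 8, 50),
    datetime(2020, 4, 8, 11, 15),
]
gaps = missing_hours(times, datetime(2020, 4, 8, 8), datetime(2020, 4, 8, 12))
```

Running this per neighborhood gives a simple count of silent hours that can be compared across neighborhoods as one rough measure of reliability.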
Question 3
Question: How do conditions change over time? How does uncertainty in the data change over time? Describe the key changes you see. Limit your response to 500 words and 8 images.
Point | Recommendation |
---|---|
1 | In most neighborhoods, across the utilities and shake intensity, the data follows this trend:
Except on 6/4/2020 at 4-5 pm, when there may have been some pre-earthquake shaking, the readings are all low or manageable until about 8 am on 8/4/2020, which is approximately when the earthquake happened. After that, there are some aftershock readings, or a pile-up of delayed data from cut-off neighborhoods, in the early hours of 9/4/2020 and the afternoon of 10/4/2020. |
2 | Uncertainty does increase over time, mostly after the earthquake. This might be due to damage to infrastructure that caused some data to go missing in some neighborhoods of St. Himark. Before the earthquake, there was not much missing data (except for Wilson Forest), even though there was construction going on in some places. Afterward, more and more data went missing, especially in neighborhoods that were closer to the epicentre and had more damaged buildings and roads.
This makes sense because, after the earthquake, there were reports of damage to power infrastructure all around town. Without power, many citizens would not be able to power their devices and log damage and shake data in the app. We can see in these charts that every neighborhood in St. Himark experienced some power damage. |
3 | Uncertainty was also affected by time during the pre-earthquake shaking on 6/4/2020 at 4-5 pm. The heatmap shows that the average readings dropped across all neighborhoods:
However, at the same time, the overall number of records coming in increased from that same period onwards: This looks wrong: it is unlikely that more people would log onto the app after pre-earthquake shaking only to record lower readings of shaking and damage. |
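The comparison between average readings and report counts can be reproduced by bucketing readings per hour and computing both figures together. This is a minimal sketch with made-up values; `hourly_summary` is a hypothetical helper and not part of the Tableau workbook.

```python
from collections import defaultdict
from datetime import datetime

def hourly_summary(readings):
    """Map each hour to (average reading, number of reports)."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: (sum(values) / len(values), len(values))
            for hour, values in sorted(buckets.items())}

# Made-up readings around 4 pm on 6 April 2020.
readings = [
    (datetime(2020, 4, 6, 15, 10), 6),
    (datetime(2020, 4, 6, 15, 40), 4),
    (datetime(2020, 4, 6, 16, 5), 2),
    (datetime(2020, 4, 6, 16, 20), 1),
    (datetime(2020, 4, 6, 16, 45), 3),
]
summary = hourly_summary(readings)
```

In this toy sample the 4 pm bucket has more reports but a lower average than the 3 pm bucket, which is the same pattern flagged as suspicious in the last point of the table above.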
Reference
Icon for gauge https://www.123rf.com/clipart-vector/guage.html?oriSearch=utility&sti=nh1v9o3lxchgqttrvu%7C
Home icon: https://www.iconfinder.com/icons/185038/home_house_streamline_icon
Fukushima disaster: https://en.wikipedia.org/wiki/Fukushima_Daiichi_nuclear_disaster
Model answers:
- Gwendoline Tan https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Gwendoline_Tan_Wan_Xin
- Lim Kim Yong https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Lim_Kim_Yong
- Chew Yuxi https://public.tableau.com/profile/yuxi7903#!/vizhome/VA_Assignment_Chew_Yuxi/OAQStationsTimeSeries
- Tan Kee Hock https://wiki.smu.edu.sg/1617t1IS428g1/IS428_2016-17_Term1_Assign3_Tan_Kee_Hock