IS428 AY2019-20T1 Assign Ronald Lay Answers

From Visual Analytics for Business Intelligence
Revision as of 17:00, 13 October 2019 by Ronald.lay.2017 (talk | contribs)

VAST Challenge 2019: Mini-Challenge 1

 



Q1: Emergency responders will base their initial response on the earthquake shake map. Use visual analytics to determine how their response should change based on damage reports from citizens on the ground. How would you prioritize neighborhoods for response? Which parts of the city are hardest hit?

Earthquake events

  • Figure 1.1 - Overall intensity readings over time period

Figure 1.1 is a scatter plot of damage reports over time. It reveals three key events, as follows:

Pre-Earthquake

A noticeable pattern appears before the earthquake, on Monday afternoon at 14:30. Despite relatively low damage and shake intensity, there is a significant number of damage reports, shown by the large dots. Buildings (blue dot) receive the highest damage reports, with intensity between 2 and 3.

Major Earthquake

The major earthquake begins at 8:30 AM. The highest damage reports are as follows:

Category | Colour | Intensity level
Power | Red | 5.5 - 8
Roads and Bridges | Cyan | 5 - 7
Sewer and water | Green | 4.5 - 7
Building | Blue | 4 - 5
Medical | Orange | 3 - 6

All categories show a significant number of damage reports, with Power the most heavily damaged. Power outages later become the root cause of report reliability issues, as discussed in Question 2.

Post Earthquake

Category | Colour | Intensity level
Sewer and water | Green | 4.5 - 7.5
Roads and Bridges | Cyan | 4 - 7
Power | Red | 3.5 - 6
Building | Blue | 3.5 - 5.5
Medical | Orange | 2.5 - 5

St. Himark faces a further challenge when another earthquake strikes on Thursday afternoon at 3 PM, after the major earthquake. Sewer and water has the most significant damage reports, as indicated by the highest intensity range in the table. This threatens public health, because damaged sewers and contaminated water allow disease-causing germs to spread around the town. Although the intensity is lower than in the major earthquake, Roads and Bridges and Power still suffer some damage, which contributes to the reliability issues discussed in Question 2.

Hardest-hit region

  • Figure 2.1 - Overall intensity readings over time period

The box-and-whisker plot gives an overview of which region is hardest hit, based on the range and the median. The range spans the highest and lowest observations and shows how high or low the damage is in each region, while the median is the middle value of the data. For comparison, consider the damage reports for Old Town and Pepper Mill.
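The range and median that the box-and-whisker plot encodes can also be computed directly. The following is a minimal Python sketch, assuming each report is reduced to a (neighbourhood, intensity) pair. The function name `summarize` and the sample values are hypothetical and are not taken from the challenge data.

```python
from statistics import median

def summarize(reports):
    """Group intensity values by neighbourhood and return min, median, max and range."""
    by_area = {}
    for area, intensity in reports:
        by_area.setdefault(area, []).append(intensity)
    return {
        area: {
            "min": min(values),
            "median": median(values),
            "max": max(values),
            "range": max(values) - min(values),
        }
        for area, values in by_area.items()
    }

# Made-up values for illustration only (assumption, not real report data).
sample = [
    ("Old Town", 7), ("Old Town", 9), ("Old Town", 8),
    ("Pepper Mill", 2), ("Pepper Mill", 6), ("Pepper Mill", 3),
]
summary = summarize(sample)
```

A neighbourhood with a high median and a narrow range would be a consistent candidate for the hardest-hit region, while a wide range hints at disagreement among reports.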

<Put picture here to show>

The neighbourhoods

Prioritization

Q2: Use visual analytics to show uncertainty in the data. Compare the reliability of neighborhood reports. Which neighborhoods are providing reliable reports? Provide a rationale for your response.

Missing reports among neighborhoods

Measure reliability among neighborhoods
  • Figure 2.1 - Overall intensity readings over time period

Based on Figure 2.1, there are three key findings:

  • Downtown, Northwest and Weston provide the most reliable reports among all the neighborhoods.
  • Wilson Forest provides the least reliable reports. One possible explanation is that Wilson Forest experienced a power outage even before the major earthquake. However, no ongoing repair is listed under the current Power projects.
  • As highlighted by the red ovals, there are occasional periods with no reports at all, possibly caused by power or server outages.
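Periods with no reports can also be located programmatically rather than by eye. The sketch below is one possible approach, assuming the report timestamps for a single neighborhood are available as `datetime` objects. The function name `find_gaps`, the one-hour threshold and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, min_gap=timedelta(hours=1)):
    """Return (last_report, next_report) pairs separated by at least min_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= min_gap]

# Made-up timestamps for illustration only (assumption, not real report data).
sample = [
    datetime(2000, 1, 1, 8, 0),
    datetime(2000, 1, 1, 8, 5),
    datetime(2000, 1, 1, 12, 0),
    datetime(2000, 1, 1, 12, 5),
]
gaps = find_gaps(sample)
```

Counting and summing such gaps per neighborhood would give a simple reliability score: the fewer and shorter the gaps, the more continuous the reporting.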

Delayed reports

Overall Delayed reports
  • Figure 2.2 - Delayed reporting

Power outages and other infrastructure problems delay reports (indicated by red ovals), because the server does not process the information until power is restored. The numbered annotations are explained as follows:

  • 1 & 2: The reported damage is noticeably offset in time. The timestamp is recorded only when power is restored, so reports that accumulated during the outage cause an increase in the number of damage reports on Thursday from 3 to 5 PM.
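A sudden burst of reports after an outage can be flagged by counting reports per time bucket and comparing each bucket with the typical count. This is a rough sketch under stated assumptions: 5-minute buckets, a spike defined as more than three times the median bucket count, and made-up timestamps. The names `bucket_counts` and `flag_spikes` are hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import median

def bucket_counts(timestamps, minutes=5):
    """Count reports per bucket by flooring each timestamp to a multiple of `minutes`."""
    counts = Counter()
    for ts in timestamps:
        floored = ts.replace(minute=ts.minute - ts.minute % minutes,
                             second=0, microsecond=0)
        counts[floored] += 1
    return counts

def flag_spikes(counts, factor=3):
    """Return buckets whose count exceeds `factor` times the median bucket count."""
    typical = median(counts.values())
    return sorted(b for b, n in counts.items() if n > factor * typical)

# Made-up timestamps for illustration only (assumption, not real report data):
# three isolated reports, then ten reports arriving within the same bucket.
sample = [
    datetime(2000, 1, 1, 8, 1),
    datetime(2000, 1, 1, 8, 6),
    datetime(2000, 1, 1, 8, 12),
] + [datetime(2000, 1, 1, 10, 0, s) for s in range(10)]

counts = bucket_counts(sample)
spikes = flag_spikes(counts)
```

A flagged bucket that directly follows a gap in reporting is a candidate for accumulated, delayed reports rather than new damage.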

Delayed reports by neighbourhood
  • Figure 2.3 - Delayed reporting per neighbourhood
  • Figure 2.4 - Medical reporting per neighbourhood

Based on Figure 2.3, the red box shows a sudden increase in the number of reports posted to the server at the same time. Most neighborhoods are affected at some point, with Broadview, Chapparal, Old Town and Scenic Vista the most vulnerable.
Filtering to include only medical reports leads to Figure 2.4, which shows two key findings:

  • Medical reports are mostly available on the 8th from 8 PM to 11 PM and on the 9th from 3 PM to 7 PM for most neighborhoods.
  • Cheddarford, Wilson Forest and Chapparal have the most low-density and missing reports across all dates.


Variation on reported intensity and number of reports
  • Figure 2.5 - Number of reports vs reported intensity

Based on Figure 2.5, the reported intensity varies widely across categories, which indicates inconsistent responses among the records; medical is particularly vulnerable to this reliability issue. This suggests that the records submitted by the devices are of little help in assessing damage over time, because they are recorded only in 5-minute batches and, as discussed earlier, power outages and other infrastructure damage strongly affect the accuracy of intensity readings at a specific time. Hence, monitoring tools should record intensity every second, and the server should be deployed with zero downtime.
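One way to quantify how much the reported intensity varies within each category is the standard deviation of its readings. The following is a minimal sketch, assuming the readings are already grouped by category. The function name `spread_by_category` and the sample values are hypothetical.

```python
from statistics import pstdev

def spread_by_category(readings):
    """Return the population standard deviation of intensity per category."""
    return {cat: pstdev(values) for cat, values in readings.items() if values}

# Made-up readings for illustration only (assumption, not real report data).
sample = {
    "power": [6, 6, 7, 6],
    "medical": [1, 9, 3, 8],
}
spread = spread_by_category(sample)
most_varied = max(spread, key=spread.get)
```

A larger spread means reports within the same category disagree more, which is one signal of lower reliability.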

Q3: How do conditions change over time? How does uncertainty in data change over time? Describe the key changes you see.

1. Discrepancy in reported and shake intensity

Damage intensity versus Shake Intensity (Yellow line)
  • Figure 3.1 - Damage report vs shake intensity

Based on the MC1 data description, all intensities are reported by the people of St. Himark. However, there is a discrepancy between reported damage intensity and shake intensity, which leads to two possibilities:

  • The shake intensity readings may be based on seismic monitors. This would explain the lower variation in shake intensity (refer to Figure 2.6) and the significantly lower number of reports, since seismic monitors or similar tools report to the server only when they detect vibrations or shakes.
  • There is a difference between what people feel and the damage they actually see. Figure 3.1 suggests that what people see has more influence on their judgement, which would explain the higher reported damage intensity.
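The size of the discrepancy can be estimated as the average difference between reported damage and shake intensity. This is a small sketch, assuming each record is a (damage, shake) pair in which either value may be missing (`None`). The function name `mean_discrepancy` and the sample values are hypothetical.

```python
from statistics import mean

def mean_discrepancy(records):
    """Average (reported damage - shake intensity) over records that have both values."""
    diffs = [d - s for d, s in records if d is not None and s is not None]
    return mean(diffs) if diffs else None

# Made-up records for illustration only (assumption, not real report data).
# The third record has no shake intensity and is skipped.
sample = [(7, 4), (8, 5), (6, None), (5, 3)]
gap = mean_discrepancy(sample)
```

A positive value means reported damage is, on average, higher than the reported shake intensity, which is the pattern described for Figure 3.1.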

2. Missing reports in Wilson Forest

As Figure 2.1 shows, Wilson Forest provides the least reliable reports because many reports are missing over time. Possible factors include power outages and other infrastructure damage that disrupts the electrical distribution system to Wilson Forest. According to the St. Himark report, there is an ongoing Wilson Forest Highway project that could affect the electrical distribution system, but more evidence is needed to confirm this. Another uncertainty is that, before and when the major earthquake happens, few damage reports are recorded on the server; hence, more exploration is needed.

3. Blackout Period

Damage intensity versus Shake Intensity (Yellow line)
  • Figure 3.2 - Blackout period
  • There are no live reports, and conditions could have changed during the blackout period (highlighted with red ovals).
  • It is difficult to distinguish a new damage report from one made earlier during a blackout period. As such, there is uncertainty in using the damage reports to indicate temporal change, as shown in Figures 2.2 and 2.3.