IS428 AY2019-20T1 Assign Ho Jue Hong Question

From Visual Analytics for Business Intelligence

Jh powerplant.jpg MINI-CHALLENGE 2: CITIZEN SCIENCE TO THE RESCUE

QUESTION AND ANSWERS

Question 1

Visualize radiation measurements over time from both static and mobile sensors to identify areas where radiation over background is detected. Characterize changes over time.

Q1 a.png



To answer this question, I plotted the readings for every minute of each day in the data provided. After merging the dataset with the Neighbourhood data, the chart shows the level of radiation in each neighbourhood over a 5-day period. Two types of sensors pick up the readings: the blue line denotes the mobile sensors and the orange line denotes the static sensors. Every graph has a dotted red line set at 75 CPM; any data point plotted above that line is treated as a dangerous level of radiation.
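The per-minute aggregation and the 75 CPM reference line can be sketched in a few lines of Python. This is only an illustrative sketch, not the dashboard's actual implementation: the function name, the tuple layout `(timestamp, neighbourhood, cpm)` and the sample rows (including the placeholder year) are assumptions.

```python
from collections import defaultdict
from datetime import datetime

THRESHOLD_CPM = 75  # the dotted red line in the charts

def minutes_over_threshold(readings, threshold=THRESHOLD_CPM):
    """Average readings per (neighbourhood, minute) and keep the minutes above threshold."""
    buckets = defaultdict(list)
    for timestamp, neighbourhood, cpm in readings:
        minute = timestamp.replace(second=0, microsecond=0)
        buckets[(neighbourhood, minute)].append(cpm)
    flagged = {}
    for key, values in buckets.items():
        mean_cpm = sum(values) / len(values)
        if mean_cpm > threshold:
            flagged[key] = mean_cpm
    return flagged

# Hypothetical sample rows: (timestamp, neighbourhood, CPM); the year is a placeholder
sample = [
    (datetime(2020, 4, 8, 13, 21, 0), "Safe Town", 90.0),
    (datetime(2020, 4, 8, 13, 21, 30), "Safe Town", 110.0),
    (datetime(2020, 4, 8, 13, 21, 5), "Old Town", 20.0),
]
flagged_sample = minutes_over_threshold(sample)
```

Here only the Safe Town minute is flagged, because its two readings average above the threshold while the Old Town reading stays below it.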

On 6th April, the radiation across all neighbourhoods is relatively safe, with some spikes in the data that last less than a minute. Hence there is nothing out of the ordinary on 6th April.

On 7th April, the picture is similar to 6th April: everything is relatively safe, with some random spikes in the data. Hence everything seems normal on 7th April.

On 8th April, there is a sharp spike in the radiation levels at Safe Town from 1:21 P.M. to 4:30 P.M., before Old Town had a similar spike from 4:55 P.M. all the way to 9:28 P.M. Other than these two towns, which had a prolonged period of unhealthy radiation, the rest of the towns seemed fine. The nuclear power plant is located in Safe Town, and Old Town is situated right beside Safe Town. This suggests that the wind was blowing westwards.

On 9th April, Safe Town has a sudden spike in radiation, tending towards the unhealthy zone, and the neighbourhood Scenic Vista shows large fluctuations in radiation, alternating between the healthy and unhealthy range. The place that took the worst hit was Wilson Forest, with an average of 1,093 CPM for the day, more than 14 times the unhealthy limit of 75 CPM. From the areas hit on 9th April, we could say the wind was travelling along a curved path.

On 10th April, the only two areas still affected by unhealthy radiation are Wilson Forest and Scenic Vista.
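The distinction drawn above between spikes lasting less than a minute and prolonged periods of unhealthy radiation can be made explicit with a small run-detection routine. The sketch below is an assumption about how one might do it: the 5-minute minimum duration, the function name and the sample series are all hypothetical.

```python
from datetime import datetime, timedelta

def sustained_episodes(series, threshold=75, min_duration=timedelta(minutes=5)):
    """Return (start, end) pairs where consecutive readings stay above threshold
    for at least min_duration. `series` is a time-sorted list of (timestamp, cpm)."""
    episodes = []
    start = last = None
    for timestamp, cpm in series:
        if cpm > threshold:
            if start is None:
                start = timestamp
            last = timestamp
        elif start is not None:
            if last - start >= min_duration:
                episodes.append((start, last))
            start = last = None
    # Close an episode that is still open at the end of the series
    if start is not None and last - start >= min_duration:
        episodes.append((start, last))
    return episodes

# Hypothetical per-minute series: one isolated spike, then a ten-minute episode
base = datetime(2020, 4, 8, 13, 0)
cpm_values = [20, 200, 20] + [150] * 10 + [20]
sample_series = [(base + timedelta(minutes=i), v) for i, v in enumerate(cpm_values)]
sample_episodes = sustained_episodes(sample_series)
```

The isolated one-minute spike is dropped, and only the run from minute 3 to minute 12 is reported as an episode.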



Here we can see the first spike in radiation near the nuclear power plant. I suspect that some sort of accident happened during this period.



There seems to be a problem with Static Sensor 12: when Mobile Sensor 10 was near it, the mobile sensor showed a huge spike in radiation, but the static sensor's readings remained normal. Either Static Sensor 12 or Mobile Sensor 10 had issues.



Lastly, we can see the spike in radiation for most of the mobile sensors that were stationed near Wilson Forest, from Mobile Sensor 20 to 25. The only sensor in that group that shows inconsistent readings is Mobile Sensor 23. Mobile Sensors 28 and 29 show equally high radiation readings, while Mobile Sensor 30 showed a small reading despite being in the same area. Because so many different mobile sensors recorded it, we can conclude that there was indeed a radiation spike.

Question 2

Use visual analytics to represent and analyze uncertainty in the measurement of radiation across the city.

Compare uncertainty of the static sensors to the mobile sensors. What anomalies can you see? Are there sensors that are too uncertain to trust?

Mobile 9.png



From the picture above, we can tell that Static Sensor 15 lost data on 9th and 10th April. We could infer that an earthquake struck the nuclear power plant on 8th April and damaged the infrastructure, and we could assume that Static Sensor 15 was damaged in the process. We can also tell that Mobile Sensor 9 was in the vicinity of the nuclear power plant and picked up a high level of radiation on 8th April. Hence Static Sensor 15 appears trustworthy up to the point of the earthquake.

Mobile 10.png



Furthermore, I would like to elaborate on Static Sensor 12 and Mobile Sensor 10, where there was a major discrepancy. At 7 PM on 8th April, Mobile Sensor 10 picked up a huge spike in radiation, but Static Sensor 12, which was in the same area, did not show any spike. Hence it is unclear which sensor to believe.
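A discrepancy of this kind can be searched for systematically by comparing a mobile sensor's readings with a static sensor's readings whenever the two are close together. The following is a minimal sketch under stated assumptions: the 0.5 km proximity radius, the 50 CPM difference limit, the function names and all coordinates and readings are hypothetical and not taken from the dataset.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def colocated_disagreements(mobile_rows, static_position, static_by_minute,
                            max_km=0.5, max_diff_cpm=50.0):
    """Return (minute, mobile_cpm, static_cpm) where the mobile sensor was within
    max_km of the static sensor but the readings differ by more than max_diff_cpm."""
    flagged = []
    for minute, lat, lon, mobile_cpm in mobile_rows:
        static_cpm = static_by_minute.get(minute)
        if static_cpm is None:
            continue  # no static reading to compare against
        if haversine_km(lat, lon, *static_position) > max_km:
            continue  # too far apart to expect similar readings
        if abs(mobile_cpm - static_cpm) > max_diff_cpm:
            flagged.append((minute, mobile_cpm, static_cpm))
    return flagged

# Hypothetical coordinates and readings, for illustration only
static_position = (0.0, 0.0)
static_by_minute = {0: 30.0, 1: 30.0, 2: 30.0}
mobile_rows = [
    (0, 0.001, 0.001, 400.0),  # close by, very different reading
    (1, 0.5, 0.5, 400.0),      # far away, ignored
    (2, 0.001, 0.0, 35.0),     # close by, similar reading
]
disagreements = colocated_disagreements(mobile_rows, static_position, static_by_minute)
```

A check like this only shows that the two sensors disagree; it does not say which of them is faulty.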

Lastly, Static Sensor 13 did not show any response to the nuclear power plant being struck by the earthquake: it maintained its regular readings without any spike from 8th April to 10th April, which makes things even more confusing. These are the three major anomalies that call the reliability of the data into question.

Q2 static.png



Q2 mobile.png



Comparing the amount of data collected, we can tell that the static sensors collect more complete data than the mobile sensors. There are huge chunks of missing data in the mobile sensors; hence, in terms of completeness, I would trust the static sensors more. However, given that the earthquake might have disrupted the static sensors' readings, the mobile sensors might be the readings we want to rely on.
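Completeness can be expressed as a single number per sensor: the share of minutes in the observation period that have at least one reading. The sketch below assumes per-minute granularity; the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def completeness(timestamps, start, end):
    """Share of minutes in [start, end) that have at least one reading."""
    expected = int((end - start) / timedelta(minutes=1))
    seen = {
        t.replace(second=0, microsecond=0)
        for t in timestamps
        if start <= t < end
    }
    return len(seen) / expected if expected else 0.0

# Hypothetical ten-minute window
window_start = datetime(2020, 4, 6, 0, 0)
window_end = window_start + timedelta(minutes=10)
static_times = [window_start + timedelta(minutes=i) for i in range(10)]
mobile_times = [window_start + timedelta(minutes=i) for i in (0, 1, 2, 7)]

static_score = completeness(static_times, window_start, window_end)
mobile_score = completeness(mobile_times, window_start, window_end)
```

In this made-up window the static sensor scores 1.0 and the mobile sensor 0.4, mirroring the kind of gap seen in the charts.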

Q2 scatter static.png
Q2 scatter mobile1.png



Lastly, the scatter chart lets us pick up readings that are totally off. But first, we can compare how randomly each sensor type picks up spikes. As seen above, the static sensors have a less random distribution of spikes in radiation level: there is usually a build-up before a spike, unlike in the mobile readings.



Static4.png
Mobile9.png



The two outliers we are able to pick up are from Static Sensor 4 and Mobile Sensor 12. Static Sensor 4 picked up a negative radiation value, which is impossible, so we can deem that something was wrong with the equipment. Mobile Sensor 12 picked up a reading of 57,345 CPM, which is also most likely impossible, making its readings inconsistent.
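The same two kinds of outlier can be flagged with a simple rule-based filter. This is a sketch only: the upper bound of 10,000 CPM is an assumed cut-off chosen for illustration, and apart from the 57,345 CPM value quoted above, the sample readings are hypothetical.

```python
def flag_implausible(readings, upper_cpm=10_000):
    """Return (sensor_id, cpm, reason) for readings that are negative or above upper_cpm."""
    flagged = []
    for sensor_id, cpm in readings:
        if cpm < 0:
            flagged.append((sensor_id, cpm, "negative"))
        elif cpm > upper_cpm:
            flagged.append((sensor_id, cpm, "above upper bound"))
    return flagged

sample_readings = [
    ("static-4", -5.0),      # hypothetical negative value
    ("mobile-12", 57345.0),  # value quoted in the text
    ("static-1", 25.0),      # hypothetical ordinary reading
]
implausible = flag_implausible(sample_readings)
```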

Which regions of the city have greater uncertainty of radiation measurement? Use visual analytics to explain your rationale.



Uncertainty map.png



Lastly, I plotted the map using the number of records. The areas shaded in light blue have fewer records, hence the data there is more uncertain. The area highlighted in red has no static sensors at all; if the mobile sensors move out of range, there will be no readings in that area, hence I would say these are the most uncertain areas.
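Using the number of records as a proxy for uncertainty amounts to a count per area, plus a check for areas that have no static sensor. A minimal sketch follows; the area names and rows are placeholders rather than the real neighbourhoods.

```python
from collections import Counter

def record_counts(rows):
    """Count records per area; fewer records is read as more uncertainty."""
    return Counter(area for area, _sensor_type in rows)

def areas_without_static(rows):
    """Areas that appear in the data but never with a static sensor."""
    all_areas = {area for area, _ in rows}
    with_static = {area for area, kind in rows if kind == "static"}
    return sorted(all_areas - with_static)

# Hypothetical rows: (area, sensor type)
sample_rows = [
    ("Area A", "static"),
    ("Area A", "mobile"),
    ("Area A", "mobile"),
    ("Area B", "mobile"),
]
counts = record_counts(sample_rows)
mobile_only = areas_without_static(sample_rows)
```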

What effects do you see in the sensor readings after the earthquake and other major events? What effect do these events have on uncertainty?



Q2 static missing.png
Q2 mobile missing.png



To visualise the amount of data we are missing, I used a calendar heat map. For the static sensors, only Static Sensor 15 lost data, from around 10 PM on 8th April to 10 PM on 10th April, so the calendar heat map confirms the completeness of the static dataset. By comparison, the mobile sensors have huge chunks of missing data on the 8th, 9th and 10th. This makes them quite inconsistent, and it is very uncertain whether the radiation is really spiking. The effects of the earthquake can be clearly seen here: mobile sensors are missing large chunks of data because infrastructure might be damaged and people might be injured, whereas the static sensors provide a more complete view of what is happening throughout the entire event.
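The cells of a calendar heat map of missing data can be computed by counting, for each date and hour, the minutes with no reading. The sketch below assumes one expected reading per minute; the function name and sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

def missing_by_hour(timestamps, start, end):
    """Count minutes with no reading in each (date, hour) cell between start and end."""
    seen = {t.replace(second=0, microsecond=0) for t in timestamps}
    missing = Counter()
    current = start
    while current < end:
        if current not in seen:
            missing[(current.date(), current.hour)] += 1
        current += timedelta(minutes=1)
    return missing

# Hypothetical sensor that reports every minute from 21:00 and stops at 22:00
period_start = datetime(2020, 4, 8, 21, 0)
period_end = datetime(2020, 4, 8, 23, 0)
reported = [period_start + timedelta(minutes=i) for i in range(60)]
missing_sample = missing_by_hour(reported, period_start, period_end)
```

Each key of the result corresponds to one cell of the heat map, and the value is how dark that cell would be drawn.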


Question 3

Given the uncertainty you observed in question 2, are the radiation measurements reliable enough to locate areas of concern?

Highlight potential locations of contamination, including the locations of contaminated cars. Should St. Himark officials be worried about contaminated cars moving around the city?

Q3 contam neigh.png

Yes. First, I refer back to the previous chart of radiation by neighbourhood. Given that the earthquake only happened on 8th April, I reduce the timestamp range to 8th, 9th and 10th April only. From this information, we can deduce that the following towns are likely to be contaminated: Old Town, Safe Town, Scenic Vista and Wilson Forest.

Filter contam.png

Safe Town had a spike of radiation on 8th April from 1:21 PM to 4:31 PM and on 9th April from 5:00 PM to 8:30 PM. Hence the cars that passed through Safe Town could be narrowed down to:

1)Mobile 44

2)Mobile 43

3)Mobile 39

4)Mobile 22

5)Mobile 15

6)Mobile 13

7)Mobile 9

8)Mobile 46

9)Mobile 36

10)Mobile 32

11)Mobile 12

12)Mobile 10

13)Mobile 8

14)Mobile 7

Hence from Safe Town alone there are a total of 14 contaminated cars.
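The filtering step used to produce a list like the one above (mobile sensors seen in a neighbourhood during its spike window) can be sketched as follows. The function name, row layout and sensor IDs are hypothetical; only the time window comes from the text, and the year is a placeholder.

```python
from datetime import datetime

def sensors_in_window(rows, neighbourhood, start, end):
    """Sorted sensor IDs with at least one reading in the neighbourhood during [start, end]."""
    return sorted({
        sensor_id
        for timestamp, sensor_id, area in rows
        if area == neighbourhood and start <= timestamp <= end
    })

# Spike window for Safe Town on 8th April, as stated in the text
spike_start = datetime(2020, 4, 8, 13, 21)
spike_end = datetime(2020, 4, 8, 16, 31)

# Hypothetical mobile sensor rows: (timestamp, sensor ID, neighbourhood)
sample_mobile_rows = [
    (datetime(2020, 4, 8, 14, 0), 101, "Safe Town"),
    (datetime(2020, 4, 8, 15, 0), 102, "Safe Town"),
    (datetime(2020, 4, 8, 10, 0), 103, "Safe Town"),  # before the spike
    (datetime(2020, 4, 8, 14, 30), 104, "Old Town"),  # different neighbourhood
]
exposed = sensors_in_window(sample_mobile_rows, "Safe Town", spike_start, spike_end)
```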

Q3 old town contam Car.png

Old Town had a spike of radiation on 8th April from 4:51 PM to 9:31 PM. Hence the cars that passed through Old Town could be narrowed down to:

1)Mobile 5

2)Mobile 7

3)Mobile 8

4)Mobile 9

5)Mobile 10

6)Mobile 11

7)Mobile 12

8)Mobile 21

9)Mobile 23

10)Mobile 36

11)Mobile 46

12)Mobile 47

Q3 scenic vista contam Car.png

Scenic Vista had a spike of radiation from 3:02 PM on 9th April to 1:02 PM on 10th April. Hence the cars that passed through Scenic Vista could be narrowed down to:

1)Mobile 20

2)Mobile 23

Q3 Wilson Forest contam Car.png

Wilson Forest had a spike of radiation from 7:43 AM on 9th April to 1:02 PM on 10th April. Hence the cars that passed through Wilson Forest could be narrowed down to:

1)Mobile 21

2)Mobile 22

3)Mobile 24

4)Mobile 25

5)Mobile 27

6)Mobile 28

7)Mobile 29

8)Mobile 30

9)Mobile 45

The final list of cars that were contaminated is as follows:

1)Mobile 5

2)Mobile 7

3)Mobile 8

4)Mobile 9

5)Mobile 10

6)Mobile 11

7)Mobile 12

8)Mobile 13

9)Mobile 15

10)Mobile 20

11)Mobile 21

12)Mobile 22

13)Mobile 23

14)Mobile 24

15)Mobile 25

16)Mobile 27

17)Mobile 28

18)Mobile 29

19)Mobile 30

20)Mobile 36

21)Mobile 39

22)Mobile 43

23)Mobile 44

24)Mobile 45

25)Mobile 46

26)Mobile 47

Therefore in total, there are 26 possibly contaminated cars.
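Combining the per-neighbourhood lists is a set union, which removes the cars that appear in more than one list. The sketch below uses the sensor IDs exactly as listed above; the variable names are illustrative.

```python
# Sensor IDs as listed per neighbourhood in the text
safe_town = {44, 43, 39, 22, 15, 13, 9, 46, 36, 32, 12, 10, 8, 7}
old_town = {5, 7, 8, 9, 10, 11, 12, 21, 23, 36, 46, 47}
scenic_vista = {20, 23}
wilson_forest = {21, 22, 24, 25, 27, 28, 29, 30, 45}

# The final list as written in the text
final_list = {5, 7, 8, 9, 10, 11, 12, 13, 15, 20, 21, 22, 23, 24, 25, 27,
              28, 29, 30, 36, 39, 43, 44, 45, 46, 47}

combined = safe_town | old_town | scenic_vista | wilson_forest
only_in_union = sorted(combined - final_list)
```

Taken exactly as written, the union has 27 IDs: Mobile 32 is in the Safe Town list but not in the final list of 26, so that sensor may be worth rechecking against the dashboard.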

The next thing to do is to map where these cars travelled, in order to understand the possible places that might be contaminated.

Q3 contam Car map travel.png



The map above shows the cars' positions for the corresponding time slots on the last three days. We can see that the 26 cars travelled through nearly the whole city, spreading contamination further. Therefore, immediate measures must be taken to avoid any further contamination.

Estimate how many cars may have been contaminated when coolant leaked from the Always Safe plant. Use visual analysis of radiation measurements to determine if any have left the area.

Q3 safe town radiation levels.png

First, we have to determine when radiation spiked in Safe Town. This has already been done previously: Safe Town had a spike of radiation on 8th April from 1:21 PM to 4:31 PM and on 9th April from 5:00 PM to 8:30 PM. Hence the cars that passed through Safe Town could be narrowed down to:

1)Mobile 44

2)Mobile 43

3)Mobile 39

4)Mobile 22

5)Mobile 15

6)Mobile 13

7)Mobile 9

8)Mobile 46

9)Mobile 36

10)Mobile 32

11)Mobile 12

12)Mobile 10

13)Mobile 8

14)Mobile 7

Hence from Safe Town alone there are a total of 14 contaminated cars. To follow up, we need to track where these cars travelled over the next two days after the contamination.

Q3 before-Contamination.png



This is the situation prior to contamination. All the cars chosen here were in the area when the radiation level was above 75 CPM. Hence we will be tracking the movement of these cars from 1:31 PM on 8th April to the end of 10th April.

Q3 after Contamination a.png



Q3 after-Contamination b.png



I split the selection of cars to show the full extent of where they travelled. By the end of the tracking period, only Mobile 9, Mobile 13 and Mobile 15 are situated in Safe Town. All the cars that were contaminated left the area, and only these 3 mobile sensors returned to Safe Town.
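Checking which cars end up back in a given area comes down to looking at each sensor's last known position. A minimal sketch, with hypothetical sensor IDs and rows:

```python
from datetime import datetime

def last_known_area(rows):
    """Map each sensor ID to the area of its latest reading."""
    latest = {}
    for timestamp, sensor_id, area in rows:
        if sensor_id not in latest or timestamp > latest[sensor_id][0]:
            latest[sensor_id] = (timestamp, area)
    return {sensor_id: area for sensor_id, (_, area) in latest.items()}

# Hypothetical tracks: both cars start in Safe Town, only "A" ends there
tracks = [
    (datetime(2020, 4, 8, 14, 0), "A", "Safe Town"),
    (datetime(2020, 4, 9, 9, 0), "A", "Old Town"),
    (datetime(2020, 4, 10, 23, 0), "A", "Safe Town"),
    (datetime(2020, 4, 8, 14, 0), "B", "Safe Town"),
    (datetime(2020, 4, 10, 22, 0), "B", "Old Town"),
]
final_areas = last_known_area(tracks)
still_in_safe_town = sorted(s for s, area in final_areas.items() if area == "Safe Town")
```

Car "A" in this example left and came back, which the last-position check alone cannot distinguish from never leaving; the intermediate rows are needed for that.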




Indicate where you would deploy more sensors to improve radiation monitoring in the city. Would you recommend more static sensors or more mobile sensors or both? Use your visualization of radiation measurement uncertainty to justify your recommendation.

Q3 Static location.png



The red stars show the proposed static sensor locations. One is placed at every intersection to capture radiation data, so that radiation across the city can be displayed accurately.

Q2 static missing.png



We can see that there is missing data for the static sensors. This might have been caused by the earthquake, hence we added two new sensors near the nuclear power plant to address this issue.

Q2 mobile missing.png



For the mobile sensors, we can generally see gaps around that area, therefore more mobile sensors could be included to help cover the missing areas. By using both static and mobile data, we are able to form a better view and have greater clarity about the amount of radiation across the city. With more sensors of both types, the problem of missing data should be reduced if an earthquake strikes again.

Question 4

Summarize the state of radiation measurements at the end of the available period. Use your novel visualizations and analysis approaches to suggest a course of action for the city. Use visual analytics to compare the static sensor network to the mobile sensor network. What are the strengths and weaknesses of each approach? How do they support each other?

Q1 a.png



We can see from the graphs for each neighbourhood that the radiation levels all fall below the 75 CPM mark, so it is once again safe to travel around the city. Most importantly, the city should start decontaminating all the cars that have been contaminated.

Mobile coverage across city.png



We can see that the mobile sensors give a bigger area of coverage. Other than the need for more mobile sensors along Scenic Vista and Wilson Forest, the coverage is pretty comprehensive. The main problem is that there will not always be live data: a mobile sensor can only be at one place at a given time, hence we might not be able to tell the current reading at a given location. Mobile sensors give a better average CPM reading, as they take the surrounding area into account, unlike static sensors. I would suggest deploying more mobile sensors around the areas that are not well covered, such as Scenic Vista and Wilson Forest. Furthermore, the city should increase the number of mobile sensors on the roads whenever it suspects a spike in radiation. This helps to ensure a continuous flow of data and hence a better analysis. Mobile sensors give the data more width but not as much depth as static sensors, because mobile sensors need to be operated by humans and hence cannot operate 24/7.

Static coverage across city.png



We can see that the static sensors provide some coverage, but it is not as extensive as that of the mobile sensors. Static sensors have a limited range. For example, we could be at Old Town's border next to Safe Town while the static sensor in Old Town is located right in the middle of the neighbourhood. Hence, even though the static sensor's reading might be safe, the actual radiation at the border might be higher than what is shown. The plus point of static sensors is that they give 24-hour coverage of radiation levels without needing anyone to operate them. This gives a continuous data flow, but with the problem that it has depth without width. A suggestion would be to install more static sensors in every neighbourhood. This improves the coverage and, coupled with the mobile sensors, gives the data more width and depth.

Combining both sensor types gives the data more depth and width, so the data can be used to produce accurate reporting of radiation levels, without the amount of uncertainty seen in this case.

Question 5

The data for this challenge can be analyzed either as a static collection or as a dynamic stream of data, as it would occur in a real emergency. Describe how you analyzed the data - as a static collection or a stream. How do you think this choice affected your analysis?

Q1 a.png



Static Data Analysis

Analysing the historical radiation data points with static methods enabled me to see the overall picture of the radiation levels around the city. It helped me to quickly discover clusters and patterns in the data. When sorted properly, the data naturally forms clusters that can be used for meaningful analysis.

Q3 contam Car map travel.png



Dynamic Data Analysis

Using dynamic analysis to track the movement of mobile sensors helps a lot with time-sensitive data: it gives us the current location of the mobile sensors, so the authorities can take immediate action to decontaminate them. Although this movement data can also be analysed statically, doing so would not show any useful insights.
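To illustrate what treating the data as a stream could look like, the sketch below processes readings one at a time and raises an alert when the rolling mean exceeds the 75 CPM line. It is a hypothetical example rather than the approach used in the dashboard; the window size of 5 readings and the sample values are assumptions.

```python
from collections import deque

def rolling_alerts(stream, window=5, threshold=75):
    """Yield (index, rolling_mean) whenever the mean of the last `window`
    readings exceeds threshold. Readings are consumed one at a time."""
    recent = deque(maxlen=window)
    for index, cpm in enumerate(stream):
        recent.append(cpm)
        if len(recent) == window:
            mean_cpm = sum(recent) / window
            if mean_cpm > threshold:
                yield index, mean_cpm

# Hypothetical stream: background readings followed by a sudden rise
incoming = [20, 20, 20, 20, 20, 400, 400]
alerts = list(rolling_alerts(incoming))
```

Because only the last few readings are kept in memory, an alert can be raised as soon as the data arrives, which is the main difference from analysing the full five-day collection after the fact.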

Hence choosing the right analysis method for the given problem is extremely important.