IS428 AY2019-20T1 Assign Sean Chai Shong Hee Q2
This section answers the questions for Mini Case 2.
Question 2
Calculated fields
Two calculated fields were created during the data preparation phase:
Calculated Field Name | Formula | Explanation |
---|---|---|
The calculated field Value > Average Value will help determine if the value recorded is higher than the average recorded radiation value of the neighbourhood for a particular day.
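The original formula for this field is not reproduced here, so the following is only a minimal Python sketch of the comparison as described: each reading is compared with the mean of all readings from the same neighbourhood on the same day. The function name, the tuple layout, and the sample readings are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (neighbourhood, day, sensor_id, value).
readings = [
    ("Old Town", "9 April", 21, 30.0),
    ("Old Town", "9 April", 22, 50.0),
    ("Old Town", "9 April", 24, 100.0),
    ("Wilson Forest", "9 April", 45, 20.0),
    ("Wilson Forest", "9 April", 46, 40.0),
]

def flag_above_average(rows):
    """Append a boolean to each row: is the value above its (neighbourhood, day) mean?"""
    groups = defaultdict(list)
    for hood, day, _sensor, value in rows:
        groups[(hood, day)].append(value)
    # One average per neighbourhood per day.
    averages = {key: mean(values) for key, values in groups.items()}
    return [
        (hood, day, sensor, value, value > averages[(hood, day)])
        for hood, day, sensor, value in rows
    ]

flagged = flag_above_average(readings)
```

In this made-up sample, only the readings from sensors 24 and 46 exceed their neighbourhood's daily mean.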
Certainty of static sensors
Before visualising the uncertainty of the static sensors, I first wanted to establish how certain their readings are. This was done by visualising the variation in values recorded by static sensors.
The area chart shows how much each value recorded by a static sensor differs from the previous value recorded. Generally, there is little variation between consecutive values. This indicates that the static sensors are consistent in their measurements of radiation values.
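The quantity plotted in the area chart can be approximated outside the visualisation tool as the change between each reading and the one before it. The sketch below assumes the readings of one static sensor are already sorted by time; the function name and the sample values are hypothetical.

```python
def consecutive_changes(values):
    """Absolute change between each reading and the reading before it."""
    return [abs(curr - prev) for prev, curr in zip(values, values[1:])]

# Hypothetical readings from a single static sensor, in time order.
sample = [15.0, 15.5, 15.25, 16.0, 15.75]

changes = consecutive_changes(sample)
largest_change = max(changes)
```

A sensor whose `largest_change` stays small relative to its typical reading would match the "little variation" pattern described above.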
Uncertainty of static sensors
Static sensor 15 stopped recording at 10pm on 8 April and only resumed at 8pm on 10 April. It may have malfunctioned due to exposure to high levels of nuclear radiation: being situated at the entrance of the nuclear power plant, it would have been exposed to the heaviest dose of radiation leakage. This pattern is also consistent with the behaviour of many of the mobile sensors, a number of which malfunctioned during the same period.
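An outage like this one can also be located programmatically by looking for unusually long intervals between consecutive timestamps. The following is a minimal sketch, not the method used for the visualisation: the function name, the 5-minute threshold, the 5-second spacing of the sample timestamps, and the year 2020 are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_gap):
    """Return (start, end) pairs where consecutive timestamps are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]

# Hypothetical timestamps for one sensor (year and spacing are assumed).
stamps = [
    datetime(2020, 4, 8, 21, 59, 50),
    datetime(2020, 4, 8, 21, 59, 55),
    datetime(2020, 4, 8, 22, 0, 0),
    datetime(2020, 4, 10, 20, 0, 0),
    datetime(2020, 4, 10, 20, 0, 5),
]

gaps = find_gaps(stamps, max_gap=timedelta(minutes=5))
outage_length = gaps[0][1] - gaps[0][0]
```

With these sample timestamps the single gap found runs from 10pm on 8 April to 8pm on 10 April, a span of 46 hours.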
Uncertainty of mobile sensors
A scatterplot was generated to identify sensors whose recorded radiation values were higher than the average radiation values of the neighbourhood where the measurements were made.
I observed a significant number of missing values in the data collected by mobile sensors. While missing data affects the reliability of mobile sensors, we cannot disregard the radiation values they recorded, because the data that was collected is still useful in helping us see changes in radiation levels.
I segmented the data into 3 segments based on time ranges:
- Segment 1: 9 April, 7AM - 9 April, 7PM
- Segment 2: 9 April, 7PM - 10 April, 7AM
- Segment 3: 10 April, 7AM - 10 April, 6PM
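The three time ranges above can be expressed as a small lookup. This is a sketch under two assumptions that the text does not state: the year is taken to be 2020, and each range is treated as half-open, so a reading at exactly 7PM on 9 April falls in Segment 2 rather than Segment 1. The names `SEGMENTS` and `segment_of` are hypothetical.

```python
from datetime import datetime

YEAR = 2020  # assumption: the year is not stated in the text

# (label, start, end) with each range treated as half-open: start <= t < end.
SEGMENTS = [
    ("Segment 1", datetime(YEAR, 4, 9, 7), datetime(YEAR, 4, 9, 19)),
    ("Segment 2", datetime(YEAR, 4, 9, 19), datetime(YEAR, 4, 10, 7)),
    ("Segment 3", datetime(YEAR, 4, 10, 7), datetime(YEAR, 4, 10, 18)),
]

def segment_of(timestamp):
    """Return the label of the segment containing timestamp, or None if it is outside all three."""
    for label, start, end in SEGMENTS:
        if start <= timestamp < end:
            return label
    return None

example = segment_of(datetime(YEAR, 4, 9, 23, 30))
```

Here `example` is `"Segment 2"`, and any reading before 7AM on 9 April or from 6PM on 10 April onwards returns `None`.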
It is interesting to note that these segments fall within the period when Static Sensor 15 stopped recording data (from 8 April, 10PM to 10 April, 8PM).
Below are the observations I made from the 3 segments:
Segment | Observation | Possible Explanation |
---|---|---|
Research has shown that nuclear radiation can cause effects similar to electromagnetic pulses, which disrupt the operation of electronics. In this case, it may have caused mobile sensors to malfunction.
Based on my observations, I conclude that values from these sensors taken between 7pm on 9 April and 7am on 10 April are too uncertain to trust.
Sensors with high probabilities of being contaminated: 21, 22, 24, 25, 27, 28, 29, 45, 46
There were also values for mobile sensors recorded for locations outside of the city. These measures gave an inaccurate representation of the values that were above the average values taken in the neighbourhood. As I was only concerned with data points inside the city, these values were discarded when an inner join was carried out.
Nevertheless, it is worth noting that these data points may have resulted from unreliable GPS readings.
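The effect of the inner join can be illustrated with a short sketch. It assumes each reading has already been matched to a neighbourhood name, with `None` for points that fall outside the city. The field names, the `region_id` values, and the sample readings are hypothetical. Unlike a plain inner join, the sketch also returns the dropped rows so that suspected GPS errors can still be inspected.

```python
def inner_join(readings, neighbourhoods):
    """Keep readings whose neighbourhood exists in the lookup; return (joined, dropped)."""
    joined, dropped = [], []
    for reading in readings:
        info = neighbourhoods.get(reading["neighbourhood"])
        if info is None:
            dropped.append(reading)  # outside the city: no matching neighbourhood
        else:
            joined.append({**reading, **info})
    return joined, dropped

# Hypothetical lookup table and readings.
neighbourhoods = {
    "Old Town": {"region_id": 1},
    "Wilson Forest": {"region_id": 2},
}
readings = [
    {"sensor": 21, "neighbourhood": "Old Town", "value": 40.0},
    {"sensor": 22, "neighbourhood": None, "value": 35.0},
    {"sensor": 45, "neighbourhood": "Wilson Forest", "value": 55.0},
]

joined, dropped = inner_join(readings, neighbourhoods)
```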
Regions in city with higher uncertainty of radiation measurements
Factor 1
The distribution of sensors contributes to uncertainty in radiation measurements in different neighbourhoods.
It is difficult to cross-validate data collected by mobile sensors in areas where there is no static sensor. Static sensors collect the most reliable measurements due to their professional calibration. This makes data collected by mobile sensors in regions without static sensors susceptible to uncertainty.
Mobile sensors are unevenly distributed across the neighbourhoods as well. Areas with fewer unique mobile sensors might be subject to data bias, as there is less data for cross-validation.
Lastly, the amount of data collected in each region also plays a part in determining the certainty of measurements. Less data collected means less data available for cross-validation.
Based on the three criteria mentioned, measurements of radiation in the following neighbourhoods might be uncertain:
- Oak Willow
- Chapparal
- Wilson Forest
- Terrapin Springs
- Pepper Mill
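The three criteria (presence of a static sensor, number of unique mobile sensors, and number of readings) can be summarised per neighbourhood as in the sketch below. This is an illustration only, not the procedure used to produce the list above: the function name, the placeholder neighbourhood names "A" and "B", and the sample data are hypothetical.

```python
from collections import defaultdict

def coverage_by_neighbourhood(mobile_readings, static_neighbourhoods):
    """Per neighbourhood: static sensor present, unique mobile sensors, number of readings.

    Only neighbourhoods with at least one mobile reading appear in the result.
    """
    sensors = defaultdict(set)
    counts = defaultdict(int)
    for hood, sensor_id in mobile_readings:
        sensors[hood].add(sensor_id)
        counts[hood] += 1
    return {
        hood: {
            "has_static": hood in static_neighbourhoods,
            "unique_mobile": len(sensors[hood]),
            "readings": counts[hood],
        }
        for hood in counts
    }

# Hypothetical data: (neighbourhood, mobile sensor id) per reading.
mobile_readings = [("A", 1), ("A", 1), ("A", 2), ("B", 3)]
static_neighbourhoods = {"A"}

summary = coverage_by_neighbourhood(mobile_readings, static_neighbourhoods)
```

In this sample, neighbourhood "B" has no static sensor, one mobile sensor, and a single reading, so it would score poorly on all three criteria.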
Factor 2
Radiation measurements may also be uncertain due to radioactive contamination. Following the radiation leakage after the earthquake, some mobile sensors were contaminated, which led them to record readings consistently higher than the average readings of the neighbourhood.
The above dashboard shows the path of travel of vehicles suspected to be contaminated by radiation between 9 April 7PM - 10 April 7AM. We can see that values were concentrated in Wilson Forest and Old Town.
The readings in the following regions are more likely to be uncertain because contaminated mobile sensors travelled through them:
- Wilson Forest
- Old Town