ISSS608 2017-18 T3 Assign Lim Wee Kiong Insights

From Visual Analytics and Applications

VAST 2018 Mini-Challenge 2: Like a Duck to Water



Question 1

Characterize the past and most recent situation with respect to chemical contamination in the Boonsong Lekagul waterways. Do you see any trends of possible interest in this investigation?

Insights from Data Visualization

The number of readings taken over the years is uneven. There seem to be fewer readings in the distant past and in recent years, while more readings were taken in the middle years (2005 – 2008).

LWKInsights0.jpg
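
The unevenness described above can be checked with a simple per-year count. The following is a minimal sketch, not the method used to build the dashboard: the record layout (location, measure, sample date, value) and all sample values are assumptions made up for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (location, measure, sample_date, value).
# These values are placeholders, not taken from the challenge data.
readings = [
    ("Achara", "Iron", date(1998, 1, 11), 0.2),
    ("Achara", "Iron", date(2006, 3, 2), 0.4),
    ("Boonsri", "Zinc", date(2006, 7, 19), 0.03),
    ("Boonsri", "Zinc", date(2007, 5, 5), 0.05),
    ("Sakda", "Arsenic", date(2016, 8, 30), 1.1),
]


def readings_per_year(records):
    """Count how many readings fall in each calendar year."""
    return Counter(sample_date.year for _, _, sample_date, _ in records)


counts = readings_per_year(readings)
for year in sorted(counts):
    print(year, counts[year])
```

Plotting these counts as a bar chart would give a view comparable to the one in the figure.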


Taking arsenic as an example, data collection has also been sporadic. No data were taken from 1999 to 2003 or from 2005 to 2007. Any pattern extracted across these gaps will be unreliable.

LWKInsights2.jpg
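
Gaps like these can be listed programmatically for any measure. This is a small sketch under stated assumptions: the function name `missing_years` is hypothetical, and the list of years with arsenic data is a placeholder chosen to mirror the gaps described above, not the real record.

```python
def missing_years(years_with_data, first_year, last_year):
    """Return the years in [first_year, last_year] that have no readings."""
    observed = set(years_with_data)
    return [y for y in range(first_year, last_year + 1) if y not in observed]


# Hypothetical years in which arsenic was sampled (placeholder values).
arsenic_years = [1998, 2004, 2008, 2009, 2010]

gaps = missing_years(arsenic_years, 1998, 2010)
print(gaps)
```

Running the same check for every measure and location would show where trend analysis is safe and where it is not.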


In an overview chart of all the measures taken for all locations, we can see that some measures had huge peaks at certain times, e.g. iron in 2003, and that deposits of dissolved salts increased from 2005. These observations are inconclusive unless we delve deeper into each location and examine its seasonality.

LWKInsights3.jpg


For instance, heavy metals such as iron and zinc show interesting differences when the past and present are compared. These metals are potentially dangerous when found in high quantities in water.

LWKInsights4.jpg
LWKInsights5.jpg


Water quality measures are important as well, as dissolved minerals in high quantities could result in bacteria and viruses permeating the water, which could harm the animals and people who drink it.

LWKInsights6.jpg


When I looked at the seasonality of each measure across all the locations, I found different trends between the past and present. For example, in selected locations such as Achara, I found elevated levels of mercury from 2010 to 2011.

LWKInsights7.jpg


There were also high levels of zinc in Kannika and Boonsri:

LWKInsights8.jpg


I also saw increased volatility in one location: Tansanee. The volatility is apparent in several measures: bicarbonates, chlorides, nitrates and total nitrogen.

LWKInsights9.jpg


The last observation on seasonality is that some measures exhibit unstable readings in the early years and stable readings recently, e.g. chromium, while others show the reverse behaviour, e.g. total hardness. Water conditions vary constantly over the years and across the different regions.

LWKInsights10.jpg
LWKInsights11.jpg


Summary from Insights

The readings do not show consistent behaviour from the past to the present. At times the measures stabilized, and at other times the reverse happened. Some areas have missing data from the past but exhibit curious behaviour in the present, e.g. Tansanee. It is definitely worthwhile to investigate all these phenomena in greater detail, which leads us to the next question… can we improve the sampling process?


Question 2

What anomalies do you find in the waterway samples dataset? How do these affect your analysis of potential problems to the environment? Is the Hydrology Department collecting sufficient data to understand the comprehensive situation across the Preserve? What changes would you propose to make in the sampling approach to best understand the situation?

Insights from Data Visualization

As discovered in the last question, the data collected are uneven and irregular. From the calendar plot, we can see that there are times when more data were collected and times when collection was very low.

LWKInsights12.jpg


It is also curious that for three sites, no data were collected before 2009. This is probably because the water sensors there were installed at a later stage.

LWKInsights13.jpg


Another major problem with the data is the lack of timestamps. For certain measures at specific locations, multiple readings could have been taken on the same date. But as there is no time stamp, we cannot know how the reading for that measure changed throughout the day. In this example, there were 0 – 3 readings taken for arsenic in Sakda over the period of observation.

LWKInsights14.jpg
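
Same-day repeats of this kind can be surfaced by counting readings per location, measure and date. The sketch below assumes a record layout of (location, measure, sample date, value); the function name and all values are hypothetical placeholders.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (location, measure, sample_date, value).
records = [
    ("Sakda", "Arsenic", date(2015, 3, 1), 0.9),
    ("Sakda", "Arsenic", date(2015, 3, 1), 1.4),
    ("Sakda", "Arsenic", date(2015, 3, 1), 1.1),
    ("Sakda", "Arsenic", date(2015, 4, 2), 1.0),
    ("Chai", "Iron", date(2015, 3, 1), 0.3),
]


def same_day_repeats(rows):
    """Return {(location, measure, date): count} for keys seen more than once."""
    per_key = Counter((loc, measure, day) for loc, measure, day, _ in rows)
    return {key: n for key, n in per_key.items() if n > 1}


repeats = same_day_repeats(records)
for (loc, measure, day), n in repeats.items():
    print(loc, measure, day.isoformat(), n)
```

Without a time of day, the order of the repeated readings within a date cannot be recovered, so they can only be summarized (for example by mean or maximum), not sequenced.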


Another observation concerns the placement of the water sensors. Water readings are highly influenced by tributaries, and some of the sensor locations are connected to one another by the waterways. The ten locations can be separated into four clusters (the red boxes below denote the four clusters). For instance, one cluster has 5 sensors, and their readings may be interdependent.

LWKInsights15.jpg


Summary from Insights

The anomalies arise from the lack of data and the irregular collection of readings. This is likely due to two factors:

- Lack of resources to constantly read the data

- Uneven distribution of water sensors over the four river streams

Also, if we had a way to overlay the Kasios factory location on the data, we could probably find more correlations between the factory discharge and the water measure readings.

My recommended changes to the sampling approach include:

1. Additional water sensors for each main river stream and at least one sensor at each major tributary

2. Given the resource constraints, focus the readings on critical water contaminants (as I have outlined in the data preparation portion)

3. Reduced but focused reading of data. Since the water data are largely seasonal, one way to reduce effort is to concentrate the readings for a few measures in a fixed period. We can take 3 – 5 measures at all locations from Jan – Mar, then switch to another 3 – 5 measures for the next period. This will help to alleviate the stress of limited resources yet provide a better apples-to-apples comparison.
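
The rotation proposed in item 3 can be sketched as a simple schedule generator. This is only an illustration under assumptions: the quarter boundaries after Jan – Mar, the batch size of 4, the function name and the choice of which measures count as critical are all hypothetical.

```python
def rotation_schedule(measures, batch_size=4):
    """Assign batches of measures to quarters, wrapping around if needed."""
    if not measures or batch_size < 1:
        raise ValueError("need at least one measure and a positive batch size")
    quarters = ["Jan-Mar", "Apr-Jun", "Jul-Sep", "Oct-Dec"]
    batches = [measures[i:i + batch_size]
               for i in range(0, len(measures), batch_size)]
    # Reuse batches cyclically so every quarter has something to sample.
    return {q: batches[i % len(batches)] for i, q in enumerate(quarters)}


# Hypothetical list of critical measures (names taken from this page).
measures = ["Methylosmoline", "Arsenic", "Iron", "Zinc",
            "Mercury", "Chromium", "Nitrates", "Coliforms"]

schedule = rotation_schedule(measures, batch_size=4)
for quarter, batch in schedule.items():
    print(quarter, batch)
```

Keeping the same batch in the same quarter every year is what makes the year-on-year comparison like for like.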

Now that we understand the shortcomings of this data set, our primary problem still remains unsolved. Are the water data showing anything that is related to the beautiful birds?


Question 3

After reviewing the data, do any of your findings cause particular concern for the Pipit or other wildlife? Would you suggest any changes in the sampling strategy to better understand the waterways situation in the Preserve?

Insights from Data Visualization

The culprit in question: Methylosmoline.

The Methylosmoline data do not stretch far into the past; they are fairly recent. There is a sharp increase in the readings at both Kohsoom and Somchair, with Chai having a higher-than-average reading as well.

LWKInsights16.jpg


Surprisingly, the increase in Methylosmoline readings in the two areas did not affect the mineral deposits. So, the evidence of correlation between Methylosmoline and the other measures is inconclusive.

LWKInsights17.jpg
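
One way to put a number on this visual impression is a Pearson correlation between Methylosmoline and another measure sampled on the same dates at one location. This is a minimal sketch, not the analysis behind the chart: the paired values are made up, and pairing readings by date is an assumption about how the data would be aligned.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical paired readings taken on the same dates at one location.
methylosmoline = [10.0, 20.0, 30.0, 40.0]
chlorides = [5.0, 4.0, 6.0, 5.0]

r = pearson(methylosmoline, chlorides)
print(round(r, 3))
```

A coefficient near zero would be consistent with the inconclusive picture above, although with sparse and irregular samples even a larger value should be read with caution.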


The behaviour of Methylosmoline at Chai, Kohsoom and Somchair is not consistent either.

There are two curious observations. In Kohsoom, there was an extremely high reading of 145 µg/l on Aug 16, and the whole of 2016 had very high values. The readings tapered off in 2017, but there seems to have been suspicious activity in 2016.

For Somchair, the high value of 130 µg/l has been sustained from mid-2016 to now, which suggests that dumping activities may still be continuing in that region.

These two readings pose a serious threat to the wellbeing of the Pipit population (as well as other wildlife).

LWKInsights18.jpg
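
The year-by-year comparison above can be reproduced by taking the peak reading per location and year. In this sketch the 145 and 130 values echo the figures quoted in the text, but the dates and every other value are placeholders, and the function name is hypothetical.

```python
from datetime import date

# Hypothetical Methylosmoline readings: (location, sample_date, value in µg/l).
# Only the 145 and 130 values echo the text; dates and the rest are made up.
rows = [
    ("Kohsoom", date(2016, 3, 1), 90.0),
    ("Kohsoom", date(2016, 8, 1), 145.0),
    ("Kohsoom", date(2017, 2, 1), 40.0),
    ("Somchair", date(2016, 7, 1), 130.0),
    ("Somchair", date(2017, 5, 1), 130.0),
    ("Chai", date(2016, 6, 1), 25.0),
]


def yearly_peaks(readings):
    """Return {(location, year): highest value seen in that year}."""
    peaks = {}
    for location, day, value in readings:
        key = (location, day.year)
        if key not in peaks or value > peaks[key]:
            peaks[key] = value
    return peaks


peaks = yearly_peaks(rows)
for (location, year), value in sorted(peaks.items()):
    print(location, year, value)
```

A peak that drops from one year to the next suggests a tapering pattern, while an unchanged peak suggests a sustained one.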


The total dissolved salt in the water is a cause for concern as well, as the Pipit will drink from the water and too much salt could lead to death. Tansanee, Busarakhan and Somchair have been experiencing high levels of dissolved salt in recent years.

LWKInsights19.jpg


The coliform level is elevated in Kohsoom as well, which could indicate harmful bacteria and lead to more deaths in the Pipit population.

LWKInsights20.jpg


Summary from Insights

The findings on methylosmoline, dissolved salt and coliforms are of particular concern for the Pipit and other wildlife in the waterways. There seem to be elevated readings of these measures, and all of them could cause harm, as they could:

- Affect the water they consume

- Cause illnesses such as skin disease with constant contact

- Cause harm to neighbouring flora that uses the water for irrigation, which in turn causes harm as the animals eat the flora

However, I cannot find a direct connection between the various measures, e.g. an increased level of methylosmoline does not directly affect any other water pollutant in the same area.

This lack of correlation is likely due to the factors noted in Question 2:

- Lack of resources to constantly read the data

- Uneven distribution of water sensors over the four river streams

The proposed changes in sampling strategy are the same:

1. Additional water sensors for each main river stream and at least one sensor at each major tributary

2. Reduced but focused reading of critical water contaminants data

