Difference between revisions of "Observations & Insights"

From Visual Analytics and Applications

Revision as of 01:46, 9 July 2018


Mini-Challenge 2 Overview: Like a Duck to Water



Question 1:Data selection for chemical components

We use only the data from 2011 to 2016 to select the chemicals of concern, giving a research period of 6 years. Using a histogram, we filter to these 6 years and view the calendar chart below, which shows 29 measures with no blank records. We import the data file “Boonsong Lekagul waterways readings.csv” and explore it in JMP. We found that the dataset contains multiple different values for the same sample date, measure, and location, as shown in the picture below. However, Tableau can compute with the average value, so we decided to leave these records as they are at first.

An4 1.png
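The averaging of repeated readings described above can be sketched in a few lines of Python. This is only a minimal illustration, not the Tableau computation itself: the row layout `(location, sample_date, measure, value)`, the function name `average_duplicates`, and the sample values are all assumptions made for the example and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (location, sample_date, measure, value).
# The numbers are illustrative only, not real readings.
readings = [
    ("Boonsri", "2015-03-01", "Zinc", 10.0),
    ("Boonsri", "2015-03-01", "Zinc", 14.0),  # same key, different value
    ("Kohsoom", "2015-03-01", "Zinc", 8.0),
]

def average_duplicates(rows):
    """Collapse rows sharing (location, date, measure) into their mean value."""
    grouped = defaultdict(list)
    for location, sample_date, measure, value in rows:
        grouped[(location, sample_date, measure)].append(value)
    return {key: mean(values) for key, values in grouped.items()}

averaged = average_duplicates(readings)
```

Averaging keeps one value per location, date, and measure, but it also hides how far apart the repeated readings were, which is why the duplicates are raised again as a data quality issue under Question 2b.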

Question 1 asks about chemical contamination. Therefore, we exclude Biochemical Oxygen, Chemical Oxygen Demand (Cr), Chemical Oxygen Demand (Mn), Dissolved oxygen, Oxygen saturation, Total dissolved salts, and Water Temperature, because most of these measures are used for evaluating general water quality. We also exclude Nitrates because its values are small. Manually grouping the average values for 2011 to 2016 gives 2 main groups, plus 6 chemicals that cannot be grouped and need further exploration.

Group 1: Bicarbonates, Calcium, Chlorides, Copper, Nitrites, Orthophosphate-phosphorus, Sodium, Sulphates (8 chemicals). Most of the average values are large, except for Bicarbonates. The dominant pattern spikes in early 2012 and from the end of 2014 into 2015. Most values are distributed around the trend line, which is stable. These chemicals were being managed well.

Group 2: Ammonium, Chromium, Lead, Nickel, Potassium, Total nitrogen, Total phosphorus (7 chemicals). The average values are not that large. The dominant pattern spikes in 2012 to 2013, 2014, and 2016. The overall trend is downward.

An4 2.png

The remaining 6 chemicals cannot be grouped together.

An4 3.png

In the 6 charts below, Zinc, Magnesium, Cadmium, and Mercury trend downwards, while Anionic active surfactants and Arsenic trend upwards. However, when we zoom in on the recent years from 2013 to 2016, it looks like Zinc, Cadmium, Anionic active surfactants, and Arsenic have been released recently.

An4 4.png

We explore the 4 chemicals that rose recently, broken down by location.

i) Zinc: in the line chart below, Zinc was detected at 157.5 µg/l in March 2015. At Tansanee, Zinc was detected at 154 µg/l, although the sensor at Tansanee did not record Zinc for the full period.

ii) Cadmium: in the line chart below, we can tell that Cadmium was released at Busarakhan and Somchair from 2013 to 2016. However, it did not exceed the expected range, so this heavy metal was still manageable.

iii) Anionic active surfactants: this contaminant was detected at 0.2683 mg/l in February 2014, exceeding the expected range. Tracing by location, we find that Kohsoom and Boonsri are on the same river system and that these two released the most. Tansanee exceeded the expected standard in July 2017, and Decha exceeded it in May and November 2016. Decha should be watched, because these two readings are close together and recent.

iv) Arsenic: Arsenic was detected at 9.334 µg/l in August 2015, exceeding the expected range. Tansanee released the most, at 17.14 µg/l. As with Zinc, the sensor at Tansanee did not record Arsenic for the full period.

An4 5.png An4 6.png An4 7.png An4 8.png

Question 2a: What anomalies do you find in the waterway samples dataset? How do these affect your analysis of potential problems to the environment?

(1) No consistency: some waterway data have no records. Using the calendar chart, we can easily tell that some chemicals have no records at all in certain years. This makes the sampling methodology inaccurate. Therefore, when we compare the values of measures, we can only use measures from years without gaps.

An4 9.png

(2) Haphazard sampling should not be used. Haphazard sampling is a non-statistical sampling method. It does not consider suitability and simply picks samples arbitrarily. The problems are that the sampling risk cannot be quantified and the sampling error may be larger. It is also inefficient, although the method is cheap.

The chart below is the line chart for Nickel. We can tell that the dots are not evenly spaced, because the samples were not taken at regular intervals.

An4 10.png

(3) More sensors should be placed in some locations. To be more accurate, sensors should be placed both upstream and downstream on each river. The existing sensors at Decha and Tansanee are not enough, so more sensors should be placed upstream and downstream on the rivers at these two places. Also, if the direction of the factory is already known, nearby places should have sensors too.

An4 11.png

(4) Data from Tansanee are much higher than at the other 9 places. From the chart below, we can easily tell that the average value at Tansanee is much higher than at the other 9 places. We found that Tansanee only has records for 2011 to 2015. However, Decha and Achara have records for the same years as Tansanee, i.e. 2011 to 2015.

Then we aggregate the number of records to see the accumulated count. Tansanee has the smallest count. We assume that the amount of data is not sufficient, which causes the skewness.

An4 12.png
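Two of the checks above, finding years with no records for a measure (anomaly 1) and counting records per location (anomaly 4), can be sketched as follows. This is a hedged illustration rather than the calendar chart or Tableau aggregation actually used: the record layout `(location, measure, year)`, the helper names `gap_years` and `records_per_location`, and the sample rows are assumptions for the example only.

```python
from collections import Counter

# Hypothetical records: (location, measure, year). Illustrative only.
records = [
    ("Boonsri", "Nickel", 2011), ("Boonsri", "Nickel", 2012),
    ("Boonsri", "Nickel", 2014), ("Boonsri", "Nickel", 2015),
    ("Boonsri", "Nickel", 2016), ("Tansanee", "Nickel", 2011),
]

def gap_years(rows, first_year=2011, last_year=2016):
    """Return, per measure, the years in the study period with no records."""
    seen = {}
    for _location, measure, year in rows:
        seen.setdefault(measure, set()).add(year)
    expected = set(range(first_year, last_year + 1))
    return {measure: sorted(expected - years) for measure, years in seen.items()}

def records_per_location(rows):
    """Count how many records each location contributes."""
    return Counter(location for location, _measure, _year in rows)

gaps = gap_years(records)
counts = records_per_location(records)
smallest = min(counts, key=counts.get)  # location with the fewest records
```

A measure with an empty gap list is safe to compare across the whole period. A location with a very small record count, as assumed here for Tansanee, deserves caution before its averages are compared with the others.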

Question 2b:Is the Hydrology Department collecting sufficient data to understand the comprehensive situation across the Preserve?

No, the department is not collecting sufficient data. Firstly, as mentioned previously, there should be more upstream and downstream sensors to better understand the whole waterway, and sensors should also be placed in the factory area. Secondly, the data are not collected consecutively or at regular periods, which easily leads to errors. For example, some chemicals have different values at the same location and date, which makes the analysis inaccurate.

Question 2c:What changes would you propose to make in the sampling approach to best understand the situation?

Use statistical sampling. We could use systematic sampling, cluster sampling, stratified sampling, or random sampling. The advantage is that these methods are objective and allow the sampling error to be quantified. The larger the population is, the more accurate statistical sampling is. If the sampling approach is changed to systematic sampling, we should set the amount of data that we want and fix the sampling period and time.

Document the data accurately: if a result does not contain one of the chemicals, it should be recorded as 0 and not left null. Otherwise, readers cannot tell whether nothing was detected or the data are missing.
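As a rough illustration of the systematic approach, the sketch below generates sampling dates at a fixed interval between two dates. The function name `systematic_schedule`, the 30-day interval, and the date range are assumptions chosen for the example; the actual period would depend on how much data the department wants to collect.

```python
from datetime import date, timedelta

def systematic_schedule(start, end, interval_days):
    """Return sampling dates from start to end (inclusive) at a fixed interval."""
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates

# Hypothetical plan: one sample every 30 days in the first quarter of 2016.
schedule = systematic_schedule(date(2016, 1, 1), date(2016, 3, 31), 30)
```

Applying the same schedule to every location and measure would make the readings directly comparable, which the current irregular records are not.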

Question 3a:After reviewing the data, do any of your findings cause particular concern for the Pipit or other wildlife?

Having read the 2017 VAST Challenge mini case, we understand that the following three dominant chemical contaminants may harm nature and the atmosphere. Thus, we focus on places where factories might be located and might disturb the Pipit or other wildlife. We have already explored these three chemicals and found that they appear from 2014 to 2016, i.e. in recent years. Therefore, we focus only on these three years.

(1) AGOC-3A: We use the map combined with line chart 1 and line chart 2. Hovering over Boonsri and Kohsoom shows the average value in the upper left, and we can see that this chemical exceeds the expected line in 2015. Busarakhan also exceeds the range, but in 2016.

Somchair also shows AGOC-3A exceeding the range, again in 2016, but its river system is different from that of the three locations above.

An4 13.png

An4 14.png

An4 15.png

An4 16.png

(2) Chlorodinine: We used the location check (2) dashboard. No location's value exceeds three times the standard deviation.

(3) Methylosmolene: We use the same map again and change the chemical. We can confirm that Boonsri and Kohsoom do have contamination, because the average value exceeds the expected range by a large margin in 2015 and 2016.

To sum up, we suspect that Boonsri, Kohsoom, Busarakhan and Somchair have factories nearby that would destroy the habitat of wildlife.

An4 17.png An4 18.png
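The three-standard-deviation check used for Chlorodinine above can be sketched as below. The exact rule behind the dashboard is not stated, so this assumes the common reading: a value is flagged when it lies more than three sample standard deviations from the mean. The function name `exceeds_three_sigma` and the sample values are hypothetical.

```python
from statistics import mean, stdev

def exceeds_three_sigma(values):
    """Return values lying more than 3 sample standard deviations from the mean."""
    if len(values) < 2:
        return []  # a standard deviation needs at least two values
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > 3 * sigma]

# Illustrative readings only: eleven low values and one extreme value.
with_spike = [1.0] * 11 + [100.0]
steady = [1.0, 2.0, 3.0, 4.0, 5.0]

flagged_spike = exceeds_three_sigma(with_spike)
flagged_steady = exceeds_three_sigma(steady)
```

With very few readings, a single extreme value inflates the standard deviation so much that it can never be flagged, so this check is only meaningful for locations with enough records.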

Question 4b:Would you suggest any changes in the sampling strategy to better understand the waterways situation in the Preserve?

The sampling strategy should be changed. First, water samples should be collected automatically or monitored in real time, because a factory could secretly discharge waste, poison, or heavy metals. Dumping would happen at irregular times, so it is hard to predict when it occurs. Second, the sample data contain physical and chemical measurements: Biochemical Oxygen, Chemical Oxygen Demand (Cr), Chemical Oxygen Demand (Mn), Dissolved oxygen, Oxygen saturation, Total dissolved salts, and Water Temperature. These are very useful for analyzing water quality, although we did not have space to use them this time. I would suggest adding pH, topography, river flow speed, and biological indicators (such as Ephemeroptera).

Thirdly, measurements are often made in a laboratory, which requires a water sample to be collected, preserved, transported, and analyzed at that location. Every step should be handled carefully, or it will affect our testing outcome.