ISSS608 2017-18 T3 Assignment SHI CHEN Data Exploratory

From Visual Analytics and Applications

Cover1.jpg VAST Challenge 2018: Trying to be an environmental detective



Dataset Description

Below is a brief description of the data provided by the VAST Challenge. There are two files in total; one is "Boonsong Lekagul waterways readings", which records each chemical measurement over time and by observation location.

Variable | Description
Id | ID of each record
Value | Measured value of the chemical or property in this record
Location | Name of the location the sample was taken from: Achara, Boonsri, Busarakhan, Chai, Decha, Kannika, Kohsoom, Sakda, Somchair or Tansanee
Sample Date | Sampling date (the records run from Jan 1998 to Dec 2016)
Measure | Measured indicator: 104 chemicals, plus water temperature and one biological quantity
Unit | Unit of the measure: mg/l, µg/l or C (degrees Celsius). Macrozoobenthos is listed without a unit.

Methodology

Workflow

I. Data preparation:

Combine the three tables to get a full picture of the changes across all observation years, observation locations and chemicals.

2-1.jpg
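The combining step can be sketched in plain Python. This is only a minimal illustration, not the Tableau join actually used: the field names (`sample_date`, `measure`, `unit`), the sample rows and the `combine` helper are all assumptions made for the example.

```python
# Hypothetical miniature versions of the readings and units tables.
readings = [
    {"id": 1, "value": 0.3, "location": "Boonsri",
     "sample_date": "1998-01-11", "measure": "Ammonium"},
    {"id": 2, "value": 7.5, "location": "Chai",
     "sample_date": "1998-01-11", "measure": "Water temperature"},
    {"id": 3, "value": 12.0, "location": "Kannika",
     "sample_date": "2003-05-02", "measure": "Macrozoobenthos"},
]
units = {"Ammonium": "mg/l", "Water temperature": "C"}


def combine(readings, units):
    """Left-join each reading with its unit and derive the sample year."""
    combined = []
    for row in readings:
        merged = dict(row)
        # A measure without a listed unit (e.g. Macrozoobenthos) gets None.
        merged["unit"] = units.get(row["measure"])
        merged["year"] = int(row["sample_date"][:4])
        combined.append(merged)
    return combined


combined = combine(readings, units)
```

A left join is chosen here so that readings whose measure has no unit are kept rather than dropped.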

II. Geographical preparation:

A topographic map of the area is provided, but it comes without latitude or longitude information. It can therefore only be used to generate relative coordinates for each observation location.

2-2.jpg
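One possible way to derive relative coordinates is to read each station's pixel position off the map image and scale it to the image size. The sketch below assumes this approach; the pixel values, the image size and the `to_relative` helper are hypothetical and not taken from the actual map.

```python
def to_relative(pixels, width, height):
    """Scale pixel positions to the range [0, 1].

    Image y grows downward, so y is flipped to keep north at the top.
    """
    return {
        name: (x / width, 1 - y / height)
        for name, (x, y) in pixels.items()
    }


# Hypothetical pixel positions on a 200 x 200 map image.
pixels = {"Boonsri": (120, 20), "Kannika": (140, 180)}
coords = to_relative(pixels, width=200, height=200)
```

Coordinates produced this way only preserve relative position on the image, which is all the map can support.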

III. Data exploration:

Geography, the number of species, the number and type of measurements, the number of detections, and the timeline are explored together to discover their associations and identify potential factors affecting the birds.

IV. Statistical application:

To find abnormal chemicals faster, in addition to looking at the changes in each substance in turn, I also used the z-score to assist the judgment.

2-3.jpg

The larger the magnitude of the z-score, the greater the fluctuation it represents.
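As a reminder, the z-score of a reading x is (x - mean) / standard deviation. The sketch below is a minimal illustration with made-up values; it assumes the population standard deviation, which may differ from the setting used in the Tableau workbook.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of each value relative to the whole list."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # All readings identical: no fluctuation at all.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical readings for one chemical: four stable values and one spike.
sample_values = [2.0, 2.0, 2.0, 2.0, 12.0]
z = z_scores(sample_values)
```

Here the spike of 12.0 lies two standard deviations above the mean, while the stable readings sit half a standard deviation below it.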

V. Final deliverable: The analysis for this challenge was done in Tableau. Link:

Tool

  • Tableau: used for EDA and to build the dashboard
  • JMP: used to look through the distribution of the data

Data exploratory

Hydrological overview

*How many observation locations and streams are there in this area?

In total, the Hydrology Department gradually set up 10 stations between 1998 and 2016. More specifically, Achara, Decha and Tansanee were added as observation stations from 2009.

2-4.jpg
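The station count and the year each station first appears can be checked with a short script. This is a minimal sketch on hypothetical (location, year) pairs; the `first_year_by_location` helper is an illustrative name, not part of the original workflow.

```python
def first_year_by_location(samples):
    """Map each location to the earliest year in which it was sampled."""
    first = {}
    for location, year in samples:
        first[location] = min(year, first.get(location, year))
    return first


# Hypothetical (location, year) pairs extracted from the readings.
samples = [
    ("Boonsri", 1998),
    ("Achara", 2009),
    ("Boonsri", 2004),
    ("Achara", 2012),
    ("Chai", 1998),
]
first = first_year_by_location(samples)
station_count = len(first)
```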

According to the river distribution on the map, the 10 stations are spread across the upper, middle and lower reaches of 4 different streams. To facilitate analysis of the correlation between points on the same river, the observation points were grouped by stream. From right to left, these are paths 1, 2, 3 and 4. In the graph below, locations with the same color are on the same stream.

2-5.jpg

* What is the flow direction of the river?

As all the streams originate in the north, it can be speculated that the terrain in the north is higher than in the other areas. The river as a whole flows southward.

* What are the monitoring period and frequency of the Hydrological Department?

Monitoring Time:

The current data indicates that monitoring by the Hydrological Department began in 1998 and that three new locations were added in 2009.

Conjecture 1: This may indicate that environmental problems had appeared at these three locations in earlier years.

Frequency of Detection:

2-6.jpg

Group | Findings
Boonsri and Chai | Belong to the upstream and midstream sections of path 1.
Kannika and Sakda | Belong to the downstream part of the river and are close to each other.
Kohsoom, Busarakhan and Somchair | Belong to the upper area of the map and originate from the same mountain.
Achara, Decha and Tansanee | All were added from 2009.

Over time, the department increased the measurement frequency from 2004 to 2007 at almost all observation locations, most prominently at Boonsri and Chai.
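The detection frequency behind this observation is simply the number of readings per location and year. A minimal sketch, assuming ISO-formatted date strings and using made-up sample rows:

```python
from collections import Counter


def readings_per_location_year(samples):
    """Count readings for each (location, year) pair."""
    return Counter((location, date[:4]) for location, date in samples)


# Hypothetical (location, sample date) pairs.
samples = [
    ("Boonsri", "2003-03-01"),
    ("Boonsri", "2005-01-10"),
    ("Boonsri", "2005-02-14"),
    ("Boonsri", "2005-03-09"),
    ("Chai", "2005-01-10"),
]
counts = readings_per_location_year(samples)
```

Comparing these counts year by year is what reveals a rise in measurement frequency at a given location.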

Conjecture 2: A serious environmental problem may have occurred before 2004, especially at Boonsri and Chai, and the Hydrological Department may already have suspected environmental problems at these two locations.

Observations overview

*How many kinds of chemicals are there?

A distinct count of the measures shows 104 kinds of chemicals, plus two other indicators: water temperature and Macrozoobenthos (a biological quantity).

*How do the measured substance types change over time? 2003 might be an important year to note.

An area chart is used to show how the average chemical values change over time.

2-7.jpg
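The values behind such an area chart are the yearly averages per chemical. The sketch below shows one way to compute them; the chemical name, the values and the `yearly_average` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def yearly_average(rows):
    """Average the values for each (measure, year) pair."""
    grouped = defaultdict(list)
    for year, measure, value in rows:
        grouped[(measure, year)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical (year, measure, value) rows.
rows = [
    (2002, "Iron", 1.0),
    (2002, "Iron", 3.0),
    (2003, "Iron", 4.0),
    (2003, "Iron", 6.0),
]
averages = yearly_average(rows)
```

Plotting these averages per year, stacked by measure, gives the kind of chart in which a sudden jump in one year stands out.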

In 2003, certain chemicals showed explosive growth; this is explained further in the anomaly analysis part. This phenomenon also supports the earlier conjecture: with the large-scale growth of certain substances, environmental problems gradually became prominent.

*What is the difference in the type of measurement at each location?

This can be explored interactively through the dashboard.

*Other observations

Based on unit of measures, it can be noticed that the Hydrological Department

Anomaly Analysis:

*Were there any anomalous chemicals across the different locations?

2-8.jpg

In the graph above, each dot represents a chemical. A green-to-red color scale shows the z-score: the deeper the green, the more stable the chemical; the deeper the red, the greater the fluctuation. You can see not only how many chemicals were detected in a particular year, but also quickly find the volatile chemicals in that period.
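The same view can be approximated outside Tableau by keeping, for each year and chemical, the largest absolute z-score. This is only a sketch under assumptions: the z-score is taken against all readings of the same chemical using the population standard deviation, and the data and the `yearly_volatility` helper are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def yearly_volatility(rows):
    """Return {year: {measure: largest absolute z-score in that year}}."""
    by_measure = defaultdict(list)
    for _, measure, value in rows:
        by_measure[measure].append(value)
    stats = {m: (mean(v), pstdev(v)) for m, v in by_measure.items()}

    result = defaultdict(dict)
    for year, measure, value in rows:
        mu, sigma = stats[measure]
        z = 0.0 if sigma == 0 else abs(value - mu) / sigma
        result[year][measure] = max(z, result[year].get(measure, 0.0))
    return dict(result)


# Hypothetical (year, measure, value) rows: Iron spikes in 2003, Zinc is flat.
rows = [
    (2002, "Iron", 2.0), (2002, "Iron", 2.0),
    (2003, "Iron", 2.0), (2003, "Iron", 2.0), (2003, "Iron", 12.0),
    (2002, "Zinc", 1.0), (2003, "Zinc", 1.0),
]
volatility = yearly_volatility(rows)
chemicals_in_2003 = len(volatility[2003])
```

The number of keys per year corresponds to the number of dots in that year, and the stored value corresponds to the dot's position on the green-to-red scale.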

*Digging deeper, how did these chemicals fluctuate over the past years?

Clicking the dot you want to explore shows, in the graph below, how that chemical changed over time.

2-9.jpg 2-10.jpg

*How was it distributed across the different locations?


The map on the right-hand side lets you see when and where it was distributed in past years.
