Difference between revisions of "ISSS608 2017-18 T3 Assign Song Xuejing Data Preparation"


Revision as of 10:49, 8 July 2018

VAST Challenge 2018: Suspense at the Wildlife Preserve
Mini Challenge 2 - Like a duck to water


Data Preparation

1.1 Choosing Chemicals
We have two datasets: the chemical units of measure and the Boonsong Lekagul waterways readings. The readings cover 106 different chemical measurements of the water. Some are chemicals that damage water quality and could plausibly be discharged by the furniture factory, while others are simply minerals such as iron and zinc. The samples were taken at irregular times from 1998 to 2016. The first step is therefore to select the chemicals we need.

We should use only chemicals that have enough data points, at least in recent years, because we need to determine whether water quality has deteriorated since the furniture factory was built. If a chemical is no longer measured, it cannot show recent changes in water quality. After the first round of filtering, the chemicals below remain; only these have data up to 2016.
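The first-round filter described above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used (the analysis was done in Tableau): the tuple layout `(measure, year, value)`, the function name, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict

def chemicals_measured_through(readings, last_year=2016, min_recent=1):
    """Return the measures with at least `min_recent` readings in `last_year`.

    `readings` is an iterable of (measure, year, value) tuples; this layout
    is an assumption for the sketch, not the real file format.
    """
    counts = defaultdict(int)
    for measure, year, _value in readings:
        if year == last_year:
            counts[measure] += 1
    return sorted(m for m, n in counts.items() if n >= min_recent)

# Illustrative, made-up readings (not taken from the dataset).
readings = [
    ("Chlorides", 1998, 12.0),
    ("Chlorides", 2016, 21.5),
    ("Total hardness", 2016, 190.0),
    ("Hypothetical discontinued measure", 2005, 0.3),
]
print(chemicals_measured_through(readings))
```

Raising `min_recent` would make the filter stricter, which matches the idea of requiring "enough" recent data points rather than a single reading.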

Song1.png

1.2 Map
Since we have a map of the sampling locations and the waterways, we place our sample points on it in order to detect anomalies in different waterways. We use this map as a background image in Tableau and create coordinates for each location. The location points and map are shown below. Based on location, we also divide the sites into four streams, numbered 1 to 4 from right to left on the map, because once the upstream part of a river is polluted, the downstream part will be affected to some extent. The streams are shown below.

Map1.png
Map2.png

Question 1

After filtering out all the unnecessary chemicals, we can look at an overview of each chemical at the different locations one by one. For Question 1, we need to characterize the past and most recent situation with respect to chemical contamination in the Boonsong Lekagul waterways. Because we do not know in which year the factory was built, we define "recent" as data points after 2013 and "past" as data points before 2013, in order to see whether any chemical measurements show a trend. After going through all the chemicals, we found three apparent types of trend. First, the values of some chemicals increased from past to recent years. These chemicals are shown below.

Q1.2.png
Q1.3.png
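The past/recent comparison described above can be sketched as follows. It is a minimal illustration under stated assumptions: samples are `(year, value)` pairs, the function name is hypothetical, and readings from 2013 itself are counted as recent, since the text leaves that boundary year open. The numbers are made up.

```python
from statistics import median

def median_by_period(samples, split_year=2013):
    """Split (year, value) samples at `split_year` and return both medians.

    Assumption: the split year itself is treated as "recent".
    A period with no samples yields None.
    """
    past = [v for y, v in samples if y < split_year]
    recent = [v for y, v in samples if y >= split_year]
    return {
        "past": median(past) if past else None,
        "recent": median(recent) if recent else None,
    }

# Illustrative values only.
samples = [(2005, 10.0), (2010, 14.0), (2014, 22.0), (2016, 26.0)]
print(median_by_period(samples))
```

A recent median clearly above the past median is the kind of signal that places a chemical in the first group.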

Secondly, the values of some chemicals tend to have more outliers in recent years than in the past. These chemicals are shown below.

Q1.1.png
Q1.4.png
Q1.5.png
Q1.6.png
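One way to make "more outliers" concrete is to count the points that fall outside the usual box-plot whiskers in each period. The sketch below assumes the common 1.5 × IQR rule, which the write-up does not state explicitly, and uses made-up values; `count_outliers` is a hypothetical helper name.

```python
from statistics import quantiles

def count_outliers(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR].

    The 1.5*IQR rule is an assumption here; with fewer than four
    values the quartiles are not meaningful, so 0 is returned.
    """
    if len(values) < 4:
        return 0
    q1, _q2, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)

# Illustrative values only: the recent period has one extreme reading.
past_values = [1, 2, 3, 4, 5, 6, 7, 8]
recent_values = [1, 2, 3, 4, 5, 6, 7, 50]
print(count_outliers(past_values), count_outliers(recent_values))
```

Comparing the two counts per chemical gives a rough numeric counterpart to what the box plots show visually.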

Thirdly, the values of some chemicals even decreased from past to recent years, because the samples include not only chemical contaminants but also some minerals. These chemicals are shown below.

Q1.7.png
Q1.8.png
Q1.9.png
Q1.10.png
Q1.11.png
Q1.12.png
Q1.13.png
Q1.14.png
Q1.15.png

In summary, chemical measurements such as Chlorides and Total hardness have increased from past years to recent years.

Almost all natural waters contain chloride and sulfate ions. Their concentrations vary considerably with the mineral content of the earth in a given area. In small amounts they are not significant, but in large concentrations they cause problems. Chloride concentrations are usually low. The EPA Secondary Drinking Water Regulations recommend a maximum concentration of 250 mg/l for chloride ions and 250 mg/l for sulfate ions (expressed as Cl⁻ and SO₄²⁻, not as CaCO₃). Chloride values in the river have exceeded 20 mg/l in recent years. This is still well below the guideline, but it is higher than in the past, which is not a good sign, and we can dig deeper in Question 2.
Total hardness reflects the dissolved mineral content of the water, mainly calcium and magnesium. Its value has increased, which means the river has a higher mineral content. In past years, total hardness generally varied from 60 to 180 mg/l. In recent years, however, it has ranged from 100 to 380 mg/l. According to the hardness categories, water with a value above 180 mg/l is classified as very hard.
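The hardness categories can be expressed as a small lookup. Only the "very hard above 180 mg/l" band comes from the text above; the lower bands follow the commonly cited USGS scale and are included here as an assumption, as is the function name.

```python
def hardness_category(mg_per_l):
    """Classify total hardness (mg/l as CaCO3).

    Bands below 180 mg/l follow the commonly cited USGS scale
    (an assumption); the text itself only states the >180 band.
    """
    if mg_per_l <= 60:
        return "soft"
    if mg_per_l <= 120:
        return "moderately hard"
    if mg_per_l <= 180:
        return "hard"
    return "very hard"

# The ends of the past (60-180) and recent (100-380) ranges quoted above.
for value in (60, 180, 100, 380):
    print(value, hardness_category(value))
```

Under this scale the past range stays at or below "hard", while the upper end of the recent range falls in "very hard".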

In group 2, there are more outliers in chemical measurements such as Anionic active surfactants, Total nitrogen, Macrozoobenthos, and Chemical Oxygen Demand (Cr).
Surfactants are compounds that lower the surface tension (or interfacial tension) between two liquids, between a gas and a liquid, or between a liquid and a solid. They may act as detergents, wetting agents, emulsifiers, foaming agents, and dispersants. Anionic surfactants represent, by volume, the most important group of surfactants used in cleaning products. Recent values of Anionic active surfactants are more dispersed: most are 0 mg/l, while some are higher than 0.1 mg/l, leaving a gap between 0 and 0.1 mg/l that was not present in past years. A reading of 0 mg/l means no surfactant was detected, so the concern lies with the isolated high readings, which may point to occasional discharges of cleaning products into the water.

Nitrogen is an essential nutrient for plants and animals. However, an excess of nitrogen in a waterway may lead to low levels of dissolved oxygen and harm plant life and other organisms. In our case, total nitrogen has shown more outliers since 2013. Three forms of nitrogen are commonly measured in water bodies: ammonia, nitrates and nitrites. In further analysis, we can drill down to see which of these contributes most to the measurement.

Macrozoobenthos is, in practice, defined as the invertebrate community living in or on the sediment or hard substrates and retained on a 1 mm² mesh sieve. Macrozoobenthos values have been very low throughout these years; however, the outliers in 2014 differ markedly from those in other years.

In group 3, the values of chemical measurements such as Atrazine, Cadmium, Chromium, Dissolved silicates, gamma-Hexachlorocyclohexane, Lead, Mercury, p,p-DDT, and Petroleum hydrocarbons have decreased from past years to recent years.

Atrazine is a herbicide that is well tolerated by actively growing corn and sorghum, which absorb and metabolize it and thereby detoxify it. In recent years, Atrazine values have almost all been 0 µg/l. The outlier values of gamma-Hexachlorocyclohexane, Cadmium and Chromium decreased considerably, and in recent years the values of these chemicals have been very stable. Lindane (gamma-Hexachlorocyclohexane) has been detected in groundwater and surface water samples collected near hazardous waste sites; in our case, however, recent values are zero. To some extent, therefore, the river may also have improved.