Difference between revisions of "ISSS608 2017-18 T3 Assign Liu Yuqing Analysis & Insights"

From Visual Analytics and Applications
Line 23: Line 23:
 
[[ISSS608 2017-18 T3 Assign Liu Yuqing_Feedback| <font color="#FFFFFF">'''Feedback'''</font>]]
 
[[ISSS608 2017-18 T3 Assign Liu Yuqing_Feedback| <font color="#FFFFFF">'''Feedback'''</font>]]
 
|}
 
|}
==Methodology==
 
===Parameters===
 
Set range of standard deviation
 
  
[[image:image2-1.jpg|400px]]
+
==Visualization Dashboad==
 +
Please find the link to dashboard here:https://public.tableau.com/profile/liu.yuqing4353#!/vizhome/LIUYUQING_Mini-Challenge2/LIUYUQING-MiniChallenge2?publish=yes
  
Set start date of visualization to exclude data points not after the day
+
==Visualization analysis==
 +
===Overview of measures===
 +
====Trend of each measure====
 +
Measured time interval of each chemical is quite different, so first of all, we observed value trend over time of each measure using boxplot.
 +
 
 +
[[image:image3-1.jpg|500px]]
 +
[[image:Image3-2.jpg|600px]]
 +
 
 +
To compare trends of different chemicals over the time, we group measures according to measured time interval regularity. Create set of each group in tableau.
 +
 
 +
Measure groups
 +
 
 +
[[image:Image3-3.png|300px]]
 +
 
 +
There are 12 chemicals are excluded compare to MCL level, 6 of them belongs to Group A and only 1 chemical in Group D.
 +
Perhaps this also shows that the relatively stable material measurement time is relatively short, and the opposite is not a very stable material measurement time span is longer
 +
1.1.2 Pattern of Macrozoobenthos
 +
Almost measured units of all chemicals are mg/l or ug/l, there is no unit for .Macrozoobenthos are macro-benthic animals, their life and appearance are relevant with some chemicals and the environment.
 +
 
 +
[[image:Image3-4.jpg|800px]]
  
[[image:image2-2.jpg|400px]]
 
  
Set end date of visualization to exclude data points not before the day
+
Average value of Macrozoobenthos surged in 2014, this pattern is remarkable in Kannika and Sakda. In this case, we zoom in Kannika and Sakda, observe the standard variance of Macrozoobenthos in the two locations.
  
[[image:image2-3.jpg|400px]]
+
[[image:Image3-5.jpg|600px]]
===Creating calculated fields===
+
[[image:Image3-6.jpg|600px]]
Create isInRangeDate
 
  
[[image:image2-4.jpg|600px]]
+
Observe same pattern in 2014 in upper stream locations. Upper stream locations of Sakda are Somchair and Achara, upper stream locations of Kannika is Chai.
  
Create average value of measures cross window
+
===Analysis by year===
 +
====2008 – 2009====
 +
Average value of several chemicals of Boonsri and Kohsoom are higher than the average value of other locations
  
[[image:image2-5.jpg|600px]]
+
[[image:Image3-7.jpg|800px]]
  
Create lower value of control interval
 
  
[[image:image2-6.jpg|600px]]
+
Somchair, Achara and Sakda belong to same main river, we focus on these 3 locations to observe distribution of measures. There are 3 main measures appeared at the 3 locations from 2008 – 2009 – Bicarbonates, Oxygen saturation, Calcium. Value of each measure at Sakda is higher than other two locations.
  
Create upper value of control interval
+
[[image:Image3-8.jpg|800px]]
  
[[image:image2-11.jpg|600px]]
 
  
Create outliers to mark datapoints out of control
+
Boonsri, Kohsoom, Busarakhan are upper streams of Chai, Chai is upper stream of Kannika, so we focus on these 5 locations together. Bicarbonates, Oxygen saturation, Calcium are also very high in these locations. For the measure distribution graph, size stands for numbers of records. Based on this graph, we can notice that records of Chia of many measures are more than Boosri and other locations. Besides, measures are concentrate on Biochemical Oxygen, Chemical Oxygen demand and total nitrogen.
  
[[image:image2-8.jpg|600px]]
+
[[image:Image3-9.jpg|800px]]
  
 +
====2005 – 2007====
 +
Total coliforms had different trend from other measures – average value decreases from 2005 to 2006 and increases from 2006 to 2007, trends of other measures are opposite.
  
==Visualization analysis==
+
[[image:Image3-10.jpg|800px]]
===Overview of measures===
+
 
====Trend of each measure====
+
 
Measured time interval of each chemical is quite different, so first of all, we observed value trend over time of each measure using boxplot.
+
Trend of total coliforms at Kohsoon is different from Boonsri, Chai, Busarak and Kanika.
 +
 
 +
[[image:Image3-11.jpg|800px]]
 +
 
 +
====2008 – 2016====
 +
Total dissolved phosphorus only appeared in 2015 and 2016, which average value at Kohsoom is much higher than average value of other locations belong to same main river.
  
[[image:image3-1.jpg|600px|left]]
+
[[image:Image3-12.jpg|800px]]
[[image:Image3-2.jpg|600px|right]]
 
  
  
==Data Discription==
+
Among the continuously changing measures from 2005 to 2007, there are no measured values in Decha and Tansanee.
Entry gates are positioned at the Preserve entrances.  Each vehicle receives an entry ticket at the gate and is assigned a vehicle class; the entry is recorded.  The entry ticket contains an RF-tag that enables the Preserve sensors to pick up the passage of a vehicle through the Preserve. Each vehicle surrenders their entry ticket when exiting the Preserve and the exit is recorded.When vehicles enter the Preserve, they must proceed through a gate and obtain a pass.
 
===Details Discription===
 
1. Entrances: All vehicles pass through an Entrance when entering or leaving the Preserve.  
 
  
2. General-gates: All vehicles may pass through these gates.  These sensors provide valuable information for the Preserve Rangers trying to understand the flow of traffic through the Preserve.
+
[[image:Image3-13.jpg|400px]]
 
3. Gates: These are gates that prevent general traffic from passing.  Preserve Ranger vehicles have tags that allow them to pass through these gates to inspect or perform work on the roadway beyond.
 
 
 
4. Ranger-stops: These sensors represent working areas for the Rangers, so you will often see a Ranger-stop sensor at the end of a road managed by a Gate.  Some Ranger-stops are in other locations, however, so these sensors record all traffic passing by.
 
 
5. Camping: These sensors record visitors to the Preserve camping areas.  Visitors pass by these entering and exiting a campground.
 
  
==Step by Step Discription==
 
===Exclude misleading records===
 
<table cellspacing="10" border="1">
 
<tr>
 
<td>
 
Exclusion criterion:
 
  
(1) id without full movement records( all visitors should come in and get out from both entrances so those ids that only have one entrance should be excluded).
+
For Sakda, Somchair and Achara streams, trends of Petroleum hydrocarbons are quite abnormal from 2008 to 2016. The most obvious phenomenon is Petroleum hydrocarbons increased significantly in 2012, which maybe caused by sewage of factory around Somchair.
  
(2)id without full timestamp records.( those id without the come in and get out records).
+
[[image:Image3-14.jpg|800px]]
  
(3) repeat records
+
====1998 – 2016====
<td>[[File:Entrances number2.PNG|700px|right]]</td>
+
Iron grew significantly in third Quarter, 2003 at all marked locations. For other places, the average value of iron shows a fluctuated trend. Most metal materials have a continuous measurement time over the 18 years from 1998 – 2016.
</tr>
 
</table>
 
  
===Extract the car sequences===
+
[[image:Image3-15.jpg|800px]]
<table cellspacing="10" border="1">
 
<tr>
 
<td> The most essentail step for data preparation is to extract the car sequences which helps to summary the life patterns. This step is conducted in JMP with the function[Col Rank(Timestamp,car-id)]
 
<td>[[File:Car sequences.PNG|800px|right]]</td>
 
</tr>
 
</table>
 
===Label vehicles in their movement order and then create Episode based on the movement order and id===
 
<table cellspacing="10" border="1">
 
<tr>
 
<td> Number the movement order for each car-id and then list the car-id with episode.
 
  
Level: The order of movement(entrance---gate :level of entrance is one and the level of gate is two)
 
  
Episode: Each entrance & exit is one episode, so normally each id should has only one episode.
+
For two metal chemicals, their average values were out of control level from 1998 – 2002. For Lead, Chai and Boonsri the two locations are mainly influenced the trend. For manganese, Chai showed the most obvious influence, followed by Busarakhan, Kannika and Kohsoom.
<td>[[File:Level_and_eposide.PNG|800px|right]]</td>
 
</tr>
 
</table>
 
  
===Compute the last duration, camping hour and single checkpoint duration===
+
[[image:Image3-16.jpg|600px]]
<table cellspacing="10" border="1">
+
[[image:Image3-17.jpg|600px]]
<tr>
+
==Conclusions==
<td> Compute the multipul duration in excel for better visualization.Last duration(day),camping hour(hour),single checkpoint duration(min)
+
1. The factory sewage location may be somewhere between Kohsoom and Busarakhan and near Somchair. According to analysis, most chemicals averages are high at Kohsoom, Chai, Sakda and Kannika.
<td>[[File:Camping hour.PNG|800px|right]]</td>
 
</tr>
 
</table>
 
  
===Coordinates Extractation and Preparation for Gephi===
+
2. Trend of upstream and downstream are quite similar, but different upstreams may have different effects on downstream. For example, Boonsri has more influence than Kohsoom on Chai.  
<table cellspacing="10" border="1">
 
<tr>
 
<td> Extract the coordinates in tableau and then prepare the coordinates for Gephi visualization
 
<td>[[File:Gephi.PNG|600px|right]]</td>
 
</tr>
 
</table>
 
  
===Participent Segamentation===
+
3. The locations of the sewage by factories may be different at different times. Sewage pollution may occurred around 2014 – 2016 near Kohsoom or Busarakhan, however, this should be happened around 2003 – 2004 near Boonsri.
<table cellspacing="10" border="1">
 
<tr>
 
<td> All the participents are classified into Camper,No-camper and Ranger
 
<td>[[File:Visitors type.PNG|700px|right]]</td>
 
</tr>
 
</table>
 

Revision as of 18:46, 8 July 2018

VAST Mini Challenge 2: Like a duck to water


Visualization Dashboard

The dashboard is available here: https://public.tableau.com/profile/liu.yuqing4353#!/vizhome/LIUYUQING_Mini-Challenge2/LIUYUQING-MiniChallenge2?publish=yes

Visualization analysis

Overview of measures

Trend of each measure

The measurement intervals differ considerably between chemicals, so we first examined the trend of each measure's values over time using box plots.

Image3-1.jpg Image3-2.jpg

To compare the trends of different chemicals over time, we grouped the measures according to the regularity of their measurement intervals and created a set for each group in Tableau.

Measure groups

Image3-3.png
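
The grouping by measurement-interval regularity could be approximated outside Tableau along the following lines. This is only a rough sketch: the coefficient of variation of the gaps between samples, the 0.5 threshold, the "regular"/"irregular" labels and the sample dates are all assumptions made for illustration, and they do not reproduce the Groups A to D shown in the figure above.

```python
from datetime import date
from statistics import mean, pstdev


def interval_regularity(sample_dates):
    """Coefficient of variation of the gaps (in days) between consecutive samples.

    Values near 0 mean evenly spaced sampling; larger values mean irregular
    sampling. Returns None when there are fewer than two gaps to compare.
    """
    ordered = sorted(sample_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


def assign_group(cv, threshold=0.5):
    """Map a regularity score to a hypothetical group label."""
    if cv is None:
        return "insufficient data"
    return "regular" if cv <= threshold else "irregular"


# Made-up sampling dates, used only to exercise the functions.
samples = {
    "Iron": [date(2005, 1, 1), date(2005, 2, 1), date(2005, 3, 1), date(2005, 4, 1)],
    "Total dissolved phosphorus": [
        date(2015, 1, 1), date(2015, 1, 5), date(2016, 6, 1), date(2016, 6, 3),
    ],
}

groups = {name: assign_group(interval_regularity(dates)) for name, dates in samples.items()}
```

With these made-up dates, the monthly series is labelled "regular" and the unevenly sampled one "irregular".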

Twelve chemicals are excluded when compared against the MCL level; 6 of them belong to Group A and only 1 to Group D. This may also suggest that relatively stable substances were measured over a relatively short time span, whereas less stable substances were measured over a longer one.

Pattern of Macrozoobenthos

Almost all chemicals are measured in mg/l or ug/l, but there is no unit for Macrozoobenthos. Macrozoobenthos are macro-benthic animals whose survival and occurrence depend on certain chemicals and on the environment.

Image3-4.jpg


The average value of Macrozoobenthos surged in 2014, and the pattern is most pronounced at Kannika and Sakda. We therefore zoom in on these two locations and examine the standard deviation of Macrozoobenthos there.

Image3-5.jpg Image3-6.jpg

The same 2014 pattern appears at the upstream locations. The locations upstream of Sakda are Somchair and Achara; the location upstream of Kannika is Chai.

Analysis by year

2008 – 2009

The average values of several chemicals at Boonsri and Kohsoom are higher than the averages at the other locations.

Image3-7.jpg


Somchair, Achara and Sakda lie on the same main river, so we focus on these 3 locations to observe the distribution of measures. Three main measures appear at these locations in 2008 – 2009: Bicarbonates, Oxygen saturation and Calcium. The value of each measure at Sakda is higher than at the other two locations.

Image3-8.jpg


Boonsri, Kohsoom and Busarakhan are upstream of Chai, and Chai is upstream of Kannika, so we examine these 5 locations together. Bicarbonates, Oxygen saturation and Calcium are also very high at these locations. In the measure distribution graph, size represents the number of records. The graph shows that Chai has more records than Boonsri and the other locations for many measures. In addition, the records are concentrated in Biochemical Oxygen, Chemical Oxygen demand and Total nitrogen.

Image3-9.jpg

2005 – 2007

Total coliforms followed a different trend from the other measures: its average value decreased from 2005 to 2006 and increased from 2006 to 2007, whereas the other measures moved in the opposite direction.

Image3-10.jpg
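
The comparison of yearly trends described above can be sketched in a few lines of Python. This is an illustrative sketch only: the (year, value) record layout, the function names and the sample numbers are assumptions, not the author's Tableau calculations or the challenge data.

```python
from collections import defaultdict
from statistics import mean


def yearly_means(records):
    """Average the values of (year, value) records per year."""
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}


def directions(means):
    """Describe each year-to-year step of the yearly averages as up, down or flat."""
    years = sorted(means)
    steps = []
    for previous, current in zip(years, years[1:]):
        diff = means[current] - means[previous]
        steps.append("up" if diff > 0 else "down" if diff < 0 else "flat")
    return steps


def moves_opposite(steps_a, steps_b):
    """True when two measures move in opposite directions at every step."""
    opposite = {("up", "down"), ("down", "up")}
    return len(steps_a) == len(steps_b) and all(
        pair in opposite for pair in zip(steps_a, steps_b)
    )


# Made-up readings, used only to exercise the functions.
coliforms = [(2005, 10.0), (2005, 14.0), (2006, 8.0), (2007, 15.0)]
other_measure = [(2005, 1.0), (2006, 3.0), (2007, 2.0)]

coliform_steps = directions(yearly_means(coliforms))
other_steps = directions(yearly_means(other_measure))
```

Here `coliform_steps` is `["down", "up"]` and `other_steps` is `["up", "down"]`, so `moves_opposite` reports the kind of reversal discussed for total coliforms.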


The trend of total coliforms at Kohsoom differs from that at Boonsri, Chai, Busarakhan and Kannika.

Image3-11.jpg

2008 – 2016

Total dissolved phosphorus appears only in 2015 and 2016, and its average value at Kohsoom is much higher than at the other locations on the same main river.

Image3-12.jpg


Among the measures that change continuously from 2005 to 2007, there are no measured values at Decha and Tansanee.

Image3-13.jpg


For the Sakda, Somchair and Achara streams, the trend of Petroleum hydrocarbons is quite abnormal from 2008 to 2016. Most notably, Petroleum hydrocarbons increased significantly in 2012, which may have been caused by factory sewage around Somchair.

Image3-14.jpg

1998 – 2016

Iron rose significantly in the third quarter of 2003 at all marked locations. At the other locations, the average value of iron fluctuates. Most metals have continuous measurements over the 18 years from 1998 to 2016.

Image3-15.jpg


For two metals, the average values were outside the control limits from 1998 – 2002. For Lead, the trend is driven mainly by two locations, Chai and Boonsri. For manganese, Chai shows the strongest influence, followed by Busarakhan, Kannika and Kohsoom.

Image3-16.jpg Image3-17.jpg
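
The idea of values falling outside a control level can be illustrated with a minimal sketch. It assumes the control limits are the mean plus or minus k sample standard deviations over the whole series; the exact Tableau calculated fields are not reproduced here, and the function names, the default k and the sample readings are hypothetical.

```python
from statistics import mean, stdev


def control_limits(values, k=3.0):
    """Return (lower, upper) limits as mean -/+ k sample standard deviations."""
    centre = mean(values)
    spread = stdev(values)
    return centre - k * spread, centre + k * spread


def flag_out_of_control(values, k=3.0):
    """Mark each value that lies outside the control limits of its own series."""
    lower, upper = control_limits(values, k)
    return [value < lower or value > upper for value in values]


# Made-up readings with one obvious spike, used only to exercise the functions.
readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 5.0]
flags = flag_out_of_control(readings, k=2.0)
```

With k = 2 only the final reading is flagged. A larger k widens the limits, and in this small example k = 3 no longer flags the spike, which is why the range of standard deviations is worth exposing as a parameter.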

Conclusions

1. The factory sewage may be discharged somewhere between Kohsoom and Busarakhan and near Somchair. According to the analysis, the averages of most chemicals are high at Kohsoom, Chai, Sakda and Kannika.

2. Upstream and downstream trends are quite similar, but different upstream locations may affect the downstream differently. For example, Boonsri has more influence on Chai than Kohsoom does.

3. The location of the factory sewage may differ over time. Sewage pollution may have occurred around 2014 – 2016 near Kohsoom or Busarakhan, whereas around 2003 – 2004 it likely occurred near Boonsri.