Preparation

From Visual Analytics and Applications
Revision as of 00:01, 9 July 2018 by Pachen.2017

Mini-Challenge 2 Overview: Like a Duck to Water


Explore data

Import the data file “Boonsong Lekagul waterways readings.csv” and explore it in JMP. We found that the dataset contains multiple different values for the same sample date, measure, and location, as shown in the picture below. However, Tableau can aggregate such readings by their average value, so we decided to leave the duplicates in place for now.

An2.png
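The averaging that Tableau would do for these repeated readings can be sketched in a few lines of Python. This is only an illustration: the site names, measure name, and values below are made up, and the tuple layout (location, sample date, measure, value) is an assumption about how the rows are read in.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (location, sample date, measure, value).
readings = [
    ("Site A", "01/11/1998", "Water temperature", 4.2),
    ("Site A", "01/11/1998", "Water temperature", 4.6),
    ("Site B", "01/11/1998", "Water temperature", 5.0),
]

def average_duplicates(rows):
    """Average all values that share the same location, date, and measure."""
    groups = defaultdict(list)
    for location, sample_date, measure, value in rows:
        groups[(location, sample_date, measure)].append(value)
    return {key: mean(values) for key, values in groups.items()}

averaged = average_duplicates(readings)
for key, value in averaged.items():
    print(key, round(value, 2))
```

The two Site A readings collapse into one averaged value, while the single Site B reading is kept unchanged.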

Use Analyze > Distribution and drag Sample date to Y, Columns. The resulting distribution shows that the data run from 01/11/1998 to 12/31/2016, 19 years in total.
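The same date range check can be done outside JMP. The sketch below assumes the dates are stored as month/day/year strings; the three dates in the example are just the two endpoints quoted above plus one made-up date in between.

```python
from datetime import datetime

def date_span(date_strings, fmt="%m/%d/%Y"):
    """Return the first date, the last date, and the number of calendar years covered."""
    dates = [datetime.strptime(s, fmt).date() for s in date_strings]
    first, last = min(dates), max(dates)
    n_years = last.year - first.year + 1  # inclusive count of calendar years
    return first, last, n_years

first, last, n_years = date_span(["12/31/2016", "01/11/1998", "06/15/2007"])
print(first, last, n_years)
```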

Explore the Location picture

Waterways Final.jpg is the picture we use. It shows the names of 10 places. Comparing it with the Boonsong Lekagul waterways readings dataset, we found that the location column matches the place names in the picture. These are the places where the water quality sensors are located.

Next, we look carefully at this picture. Small rivers normally merge into bigger ones, so we can tell that the rivers flow from north to south. Based on the direction of streamflow, we divide the 10 locations into 4 parts, which makes question 3 easier to analyze later. We save this file as “Waterways Final_changed.”

Using Tableau and Excel to add X, Y coordinates

The dataset above, Boonsong Lekagul waterways readings, contains most of the information we need to analyze. Next, we use the second dataset, Location.csv. It contains an X coordinate column and a Y coordinate column, but both are blank, so we need to fill them in ourselves. Note that the LR and UL values are 249. We need them later to set the picture’s format when we insert it into Tableau.

Open the Location file in Tableau and drag X coord and Y coord into Columns and Rows respectively. Drag Location into Detail and also into Filters, then select All.

To add the picture to this sheet, click Map, then Background Image, and choose Sheet 1. Then click Add Image.

Insert Waterways Final_changed.jpg
X Field on Right: type 240
Y Field on Top: type 249

When we click a dot near each location, Tableau shows its X and Y coordinates. We write them into Location.txt and repeat the process until all 10 places have been picked.
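As a rough alternative to reading the coordinates off by hand, a pixel position in the picture could be rescaled to the axis range of the background image. The sketch below is an assumption-heavy illustration: it assumes the left and bottom bounds are 0, uses the Right (240) and Top (249) values typed above as defaults, and the picture size of 480 by 498 pixels is hypothetical.

```python
def pixel_to_axis(px, py, img_w, img_h, x_right=240.0, y_top=249.0):
    """Map an image pixel (origin at top-left) to axis coordinates (origin at bottom-left).

    Assumes the left and bottom bounds of the background image are 0.
    """
    x = px / img_w * x_right
    y = (1 - py / img_h) * y_top  # flip: pixel rows grow downward, the Y axis grows upward
    return x, y

# Hypothetical picture size of 480 x 498 pixels.
print(pixel_to_axis(240, 249, 480, 498))
```

The Y flip is the step that is easy to forget, since image editors count rows from the top while the chart axis counts from the bottom.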

Merge two files in Tableau

Merge “Boonsong Lekagul waterways readings.txt” and “Location” into one file. I had already renamed “Boonsong Lekagul waterways readings” to “data”. Use a left outer join so that the rows match properly.
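For readers who want to see what the left outer join does, here is a minimal Python sketch. The field names (location, x, y, value) and the sample rows are hypothetical. Every reading is kept, and readings whose location has no match simply get empty coordinates.

```python
def left_outer_join(readings, locations, key="location"):
    """Keep every reading; attach x/y from the matching location row, or None if absent."""
    lookup = {row[key]: row for row in locations}
    joined = []
    for row in readings:
        match = lookup.get(row[key], {})
        merged = dict(row)
        merged["x"] = match.get("x")
        merged["y"] = match.get("y")
        joined.append(merged)
    return joined

# Hypothetical rows for illustration only.
demo_readings = [
    {"location": "Site A", "value": 1.0},
    {"location": "Site Z", "value": 2.0},  # no coordinates recorded for this site
]
demo_locations = [{"location": "Site A", "x": 10, "y": 20}]

joined = left_outer_join(demo_readings, demo_locations)
for row in joined:
    print(row)
```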

Calendar chart

Drag Sample date into Columns and Measure into Rows. Set Sample date to the Year level and change it to discrete. Drag Number of Records into Color and Label, then change the color to red. Change the mark type from Automatic to Square so that the color fills each cell, and set the mark labels to “All”.

From the calendar chart we can tell that not every measure has records in every year, and that the sampling records within each year are incomplete. Therefore, we add Measure and Sample date to the filters so that we can pick the years we are interested in. The Histogram section describes how we use a cascading Year (Sample Date) filter to link the sheets together.
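The numbers behind the calendar chart are simply record counts per measure and year. A minimal sketch, assuming month/day/year date strings and using made-up measure names:

```python
from collections import Counter
from datetime import datetime

def records_per_measure_year(rows, fmt="%m/%d/%Y"):
    """Count records for each (measure, year) pair; missing pairs have a count of 0."""
    counts = Counter()
    for measure, sample_date in rows:
        year = datetime.strptime(sample_date, fmt).year
        counts[(measure, year)] += 1
    return counts

# Hypothetical rows: (measure, sample date).
demo_rows = [
    ("Iron", "01/11/1998"),
    ("Iron", "03/02/1998"),
    ("Iron", "05/05/1999"),
    ("Zinc", "07/07/1999"),
]
calendar_counts = records_per_measure_year(demo_rows)
print(calendar_counts[("Iron", 1998)], calendar_counts[("Zinc", 1998)])
```

A count of 0 corresponds to an empty cell in the calendar chart, which is how gaps in the sampling show up.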

Histogram

For the histogram, I create a calculation called “Count_appear_year”, which counts how many years each measure appears in. Drag Measure into Columns, and Count_appear_year and Number of Records into Rows. Drag Count_appear_year into Label so that the upper histogram displays the number of years counted. Set the mark labels to “All”. Next, we sort the measures by this year count in ascending order.
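The idea behind Count_appear_year can be sketched in Python as a count of distinct years per measure. This is not the Tableau formula itself, only an illustration of the logic, and the measure names and years are made up.

```python
from collections import defaultdict

def count_appear_year(rows):
    """Return {measure: number of distinct years in which the measure has records}."""
    years = defaultdict(set)
    for measure, year in rows:
        years[measure].add(year)
    return {measure: len(seen) for measure, seen in years.items()}

# Hypothetical rows: (measure, year of the sample date).
demo_rows = [("Iron", 1998), ("Iron", 1998), ("Iron", 1999), ("Zinc", 1999)]
year_counts = count_appear_year(demo_rows)

# Sort measures by their year count in ascending order, as in the histogram.
ranked = sorted(year_counts.items(), key=lambda item: item[1])
print(ranked)
```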

Because Number of Records is also on Rows, drag Number of Records into Label as well. Set its mark labels to “Highlighted” so that, once I select the Count_appear_year values I want, the Number of Records is displayed too. From the second histogram we can see that the number of records differs for every measure. This tells us that the sensors do not take readings consistently; the water sampling appears to be random.

Also add Sample date (at the Year level), Count_appear_year, and Measure to the filters. The trick here is to click the drop-down triangle on the right of the Year (Sample Date) filter, click Apply to Worksheets, and choose Selected Worksheets. I had already completed my whole Tableau file, so the list displays all the sheets. I tick every sheet except Map, because the map does not need a time filter. As a result, the years I select on the histogram are applied to every sheet that contains the Year (Sample Date) filter.

Line chart

Create an Anomaly calculation to flag outliers, which we treat as anomalies. A value is identified as an outlier when it lies more than three standard deviations from the mean.
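The three-standard-deviation rule can be sketched as follows. This is a plain Python illustration rather than the Tableau calculation. It assumes the sample standard deviation and a single group of values, whereas in practice the rule would likely be applied per measure and location. The readings are made up.

```python
from statistics import mean, stdev

def flag_anomalies(values, k=3.0):
    """Return one boolean per value: True when it lies more than k standard deviations from the mean."""
    if len(values) < 2:
        return [False] * len(values)  # the sample standard deviation needs two or more points
    mu = mean(values)
    sigma = stdev(values)
    return [abs(v - mu) > k * sigma for v in values]

# Hypothetical readings: twenty ordinary values and one extreme value.
demo_values = [10.0] * 20 + [100.0]
flags = flag_anomalies(demo_values)
print(flags.count(True), "anomaly found at index", flags.index(True))
```

Note that with very few readings a single extreme value inflates the standard deviation so much that it may not be flagged, so the rule works best on groups with many records.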





Data Consolidation

First of all, we need to combine the data from all 12 images into one consolidated file, which the subsequent analysis requires.

The following R code shows the process of data consolidation, including adding a "Date" column to each file.

File:Imagedata r YR.PNG
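Since the R code is only available as a screenshot, here is a rough Python sketch of the same idea for comparison. It is not a translation of the original script. The column names (X, Y, B1) and the date labels are hypothetical, and small in-memory CSV strings stand in for the 12 real files.

```python
import csv
import io

def consolidate(sources):
    """Read each CSV source and tag every row with the date label of the file it came from.

    sources maps a date label to a file-like object containing CSV text.
    """
    rows = []
    for date_label, handle in sources.items():
        for row in csv.DictReader(handle):
            row["Date"] = date_label
            rows.append(row)
    return rows

# Hypothetical stand-ins for two of the image data files.
demo_sources = {
    "2014-03-17": io.StringIO("X,Y,B1\n0,0,10\n0,1,12\n"),
    "2015-06-24": io.StringIO("X,Y,B1\n0,0,11\n"),
}
combined = consolidate(demo_sources)
print(len(combined), combined[0])
```

With real files, each `io.StringIO` object would be replaced by an opened file handle.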

Adding Region Clustering

After using the relevant measurements to define the properties of certain regions on a particular date, we need to apply that definition to all dates.

We export the clustered Region data to a CSV file and use R to duplicate it for all dates.

File:Region csv YR.PNG

File:Addregion r YR.PNG
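Duplicating the region table for every date amounts to a cross product of regions and dates. A minimal Python sketch of that step, with hypothetical field names, region labels, and dates (the original was done in R):

```python
from itertools import product

def replicate_regions(regions, dates):
    """Return one copy of every region row for every date, with a Date field added."""
    return [dict(region, Date=date) for date, region in product(dates, regions)]

# Hypothetical clustered region rows and date labels.
demo_regions = [
    {"X": 0, "Y": 0, "Region": "Lake"},
    {"X": 0, "Y": 1, "Region": "Forest"},
]
demo_dates = ["2014-03-17", "2015-06-24", "2016-03-06"]

replicated = replicate_regions(demo_regions, demo_dates)
print(len(replicated), replicated[0])
```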

JMP

Imagedata

After consolidation, the dataset contains more than 5 million records, which exceeds Excel's limit of 1,048,576 rows per worksheet.

So we use JMP to view and modify the data. The consolidated dataset is shown below.

File:Imagedata jmp YR.PNG

NDVI difference

To see whether Plant Health changed noticeably between 2016 and earlier years, we calculate the NDVI difference for two pairs of years: 2016 versus 2014, and 2016 versus 2015.

We import the data into JMP and use a formula to calculate the difference between two particular dates.

File:Difference jmp YR.png
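The difference calculation is a per-pixel subtraction between two dates. The sketch below illustrates it in Python. The data layout (a dictionary of NDVI values keyed by pixel position for each date), the date labels, and the values are all assumptions.

```python
def ndvi_difference(ndvi_by_date, later, earlier):
    """Subtract the earlier NDVI from the later NDVI for every pixel present on both dates."""
    late, early = ndvi_by_date[later], ndvi_by_date[earlier]
    return {pixel: late[pixel] - early[pixel] for pixel in late if pixel in early}

# Hypothetical NDVI values keyed by (x, y) pixel position.
demo_ndvi = {
    "2014": {(0, 0): 0.25, (0, 1): 0.5},
    "2016": {(0, 0): 0.5, (0, 1): 0.25},
}
diff = ndvi_difference(demo_ndvi, "2016", "2014")
print(diff)
```

A positive difference means NDVI is higher on the later date, and a negative difference means it is lower.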

Tableau

Calculated field

To discover features and better evaluate particular performance, we use Calculated Fields to create more measurements, including NDVI, NDWI, NDMI, NDSI, AVI, BSI, etc.

File:Cal1 YR.png
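Several of these measurements share the same normalized-difference form. The sketch below uses the commonly cited definitions of NDVI, NDWI (the green and near-infrared version), and NDMI. The exact formulas in the calculated fields are only visible in the screenshot, so treat these as assumptions. The band arguments are named by wavelength rather than by band number, because the band numbering is not stated here.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else None

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Normalized Difference Water Index: (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)

def ndmi(nir, swir1):
    """Normalized Difference Moisture Index: (NIR - SWIR1) / (NIR + SWIR1)."""
    return normalized_difference(nir, swir1)

# Hypothetical band values for a single pixel.
print(ndvi(nir=3, red=1), ndwi(green=1, nir=3), ndmi(nir=3, swir1=1))
```

Returning None for a zero denominator avoids a division error on pixels where both bands are 0.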

Bin

To see the distribution of each band and measurement in a histogram, we need to create bins for them. Otherwise they remain continuous variables, which cannot show the distribution accurately.

File:Bin1 YR.png

File:Bin2 YR.png

File:Bin3 YR.png
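The effect of binning can be illustrated in Python: each value is assigned to a fixed-width bin, labelled here by its lower bound, and the histogram is the count per bin. The bin size and values below are made up.

```python
import math
from collections import Counter

def bin_counts(values, bin_size):
    """Group values into fixed-width bins labelled by their lower bound and count each bin."""
    counts = Counter(math.floor(v / bin_size) * bin_size for v in values)
    return dict(sorted(counts.items()))

# Hypothetical band values with a bin size of 5.
histogram = bin_counts([1, 4, 5, 9, 10, 14], 5)
print(histogram)
```

Choosing a larger bin size gives a smoother but coarser histogram, while a smaller one shows more detail and more noise.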

Parameter

To give users better interaction, we need to create parameters for the different measurements. This lets users choose a measurement and view its related graphs in a single dashboard.

File:Index par YR.png

File:Par1 YR.png