Preparation
Explore data
We imported “Boonsong Lekagul waterways readings.csv” and explored it in JMP. The dataset contains multiple different values for the same sample date, measure, and location, as shown in the picture below. Because Tableau can aggregate these repeated readings with an average, we decided to leave them in place at first.
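The averaging mentioned above can be sketched outside Tableau as well. The following Python snippet is only an illustration, not the method used in this project: the rows, site names, and measure name are made up, and the real file's column layout is not assumed.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration: (sample date, measure, location, value).
readings = [
    ("2016-01-05", "Measure X", "Site A", 10.0),
    ("2016-01-05", "Measure X", "Site A", 14.0),  # same key, different value
    ("2016-01-05", "Measure X", "Site B", 9.5),
]

def average_duplicates(rows):
    """Collapse rows sharing (date, measure, location) to their mean value."""
    groups = defaultdict(list)
    for date, measure, location, value in rows:
        groups[(date, measure, location)].append(value)
    return {key: mean(values) for key, values in groups.items()}

averaged = average_duplicates(readings)
```

Each (date, measure, location) key ends up with a single value, which is roughly what an AVG aggregation produces when those three fields are the dimensions of a view.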
This CSV file contains all information for 2016-09-06; the other 11 CSV files have the same format.
Tools
R
Data Consolidation
First, we need to combine all 12 image data files into one consolidated file, which the subsequent analysis requires.
The following R code shows the consolidation process, including adding a "Date" column to the records of each file.
File:Imagedata r YR.PNG
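Since the R code is only available as a screenshot, here is a rough Python equivalent of the same idea, using only the standard library. It is a sketch under assumptions: the function name `consolidate`, the input mapping, and the requirement that every file shares the same header are all hypothetical choices, not taken from the original script.

```python
import csv

def consolidate(files_by_date, out_path):
    """Append every input CSV into one file, adding a 'Date' column.

    files_by_date maps a date string to the path of that date's CSV.
    All inputs are assumed to share the same header (an assumption).
    Returns the number of data rows written.
    """
    writer = None
    total = 0
    with open(out_path, "w", newline="") as out:
        for date, path in sorted(files_by_date.items()):
            with open(path, newline="") as src:
                reader = csv.DictReader(src)
                if writer is None:
                    # Reuse the first file's header and append the new column.
                    fields = list(reader.fieldnames) + ["Date"]
                    writer = csv.DictWriter(out, fieldnames=fields)
                    writer.writeheader()
                for row in reader:
                    row["Date"] = date
                    writer.writerow(row)
                    total += 1
    return total
```

Rows are streamed one at a time, so the full dataset never has to fit in memory, which matters once the combined file reaches millions of records.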
Adding Region Clustering
After using the relevant measurements to classify the regions on one particular date, we need to apply that classification to all dates.
We export the clustered Region data to a CSV file and use R to duplicate it for every date.
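The duplication step amounts to a cross join between the region rows and the list of dates. A minimal Python sketch of that idea follows; the column names, region labels, and the second date are placeholders for illustration, and the project itself did this step in R.

```python
from itertools import product

def replicate_for_dates(region_rows, dates):
    """Return one copy of every region row per date (a cross join)."""
    return [dict(row, Date=date) for date, row in product(dates, region_rows)]

# Placeholder data: two pixels with a cluster label, and two dates.
regions = [
    {"X": 0, "Y": 0, "Region": "A"},
    {"X": 0, "Y": 1, "Region": "B"},
]
dates = ["2015-01-01", "2016-09-06"]  # the first date is made up

expanded = replicate_for_dates(regions, dates)
```

The result has `len(regions) * len(dates)` rows, and the original region rows are left unmodified because `dict(row, Date=date)` builds a new dictionary each time.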
JMP
Imagedata
After consolidation, the dataset holds more than 5 million records, which exceeds Excel's limit of 1,048,576 rows per worksheet.
We therefore use JMP to view and modify the data. The consolidated dataset is shown below.
NDVI difference
To see whether plant health in 2016 changed noticeably from earlier years, we calculate the NDVI difference for two pairs of years: 2016 versus 2014 and 2016 versus 2015.
We import the data into JMP and use a formula column to calculate the difference between the two dates in each pair.
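The difference is a per-pixel subtraction between two dates. The Python sketch below illustrates it under the assumption that each date's NDVI values are keyed by pixel coordinates; the function name and data layout are hypothetical and do not reproduce the JMP formula itself.

```python
def ndvi_difference(ndvi_later, ndvi_earlier):
    """Per-pixel NDVI change between two dates.

    Each argument maps an (x, y) pixel to its NDVI on one date.
    Pixels missing from either date are skipped.
    """
    shared = ndvi_later.keys() & ndvi_earlier.keys()
    return {pixel: ndvi_later[pixel] - ndvi_earlier[pixel] for pixel in shared}
```

A positive result means NDVI was higher on the later date at that pixel, and a negative result means it was lower.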
Tableau
Calculated field
To bring out land features and evaluate them more precisely, we use Calculated Fields to create additional measures, including NDVI, NDWI, NDMI, NDSI, AVI, and BSI. The calculated fields are shown in the following screenshots:
- NDVI cal YR.png
- NDWI cal YR.png
- NDMI cal YR.png
- AVI cal YR.png
- BSI1 cal YR.png
- BSI2 cal YR.png
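Because the calculated fields appear only as screenshots, the sketch below restates the commonly published definitions of several of these indices in Python. These may differ from the exact expressions in the screenshots (in particular, it is not known which BSI variants BSI1 and BSI2 correspond to), AVI is omitted because its usual form depends on how reflectance is scaled, and the mapping from the dataset's band columns to blue, green, red, NIR, and SWIR is left to the reader.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else None

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Normalized Difference Water Index (green/NIR form)."""
    return normalized_difference(green, nir)

def ndmi(nir, swir1):
    """Normalized Difference Moisture Index."""
    return normalized_difference(nir, swir1)

def ndsi(green, swir1):
    """Normalized Difference Snow Index."""
    return normalized_difference(green, swir1)

def bsi(blue, red, nir, swir1):
    """Bare Soil Index in one commonly used form."""
    return normalized_difference(swir1 + red, nir + blue)
```

All five share the same normalized-difference shape, so they fall between -1 and 1 for non-negative band values, and the zero-denominator guard avoids a division error on empty pixels.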
Bin
To see the distribution of each band and measure in a histogram, we need to create bins for them. Otherwise they remain continuous variables, and plotting the raw values does not show the distribution accurately.
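Fixed-width binning can be illustrated in a few lines of Python. This is a sketch of the general technique and does not claim to reproduce Tableau's internal bin calculation; the function names and bin size are illustrative.

```python
import math
from collections import Counter

def bin_start(value, bin_size):
    """Lower edge of the fixed-width bin that contains value."""
    return math.floor(value / bin_size) * bin_size

def histogram(values, bin_size):
    """Count how many values fall into each bin, keyed by lower edge."""
    return Counter(bin_start(v, bin_size) for v in values)

counts = histogram([1, 4, 5, 9, 10, 14], 5)
```

Each value is mapped to the lower edge of its bin, so counting the edges gives the bar heights of the histogram. A smaller `bin_size` shows more detail at the cost of noisier bars.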
Parameter
To make the dashboard more interactive, we create a parameter for the different measures. It lets users choose a measure and view its related graphs in a single dashboard.