ISSS608 2016-17 T3 Assign ZHANG YANRONG Data Preparation

From Visual Analytics and Applications

VAST Challenge 2017 MC3

Description

The whole dataset for MC3 contains 12 CSV files and 12 TIF files.

Each CSV file contains the coordinate information and band values for one particular date, as shown below:

MC3data YR.png

This CSV file contains all the information for 2016-09-06; the other 11 CSV files have the same format.

MC3data csv YR.png

Tools

R

Data Consolidation

First of all, we need to combine all 12 image datasets into one consolidated file, which the subsequent analysis requires.

The following R code shows the data consolidation process, including adding a "Date" column to each file.

Imagedata r YR.PNG
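The R code itself survives only as a screenshot, so the consolidation step is sketched here in Python instead. This is a minimal illustration, not the original script: the column names (X, Y, B1 to B6), the second date and all band values are assumptions, and in-memory strings stand in for the real CSV files.

```python
import csv
import io

# Hypothetical stand-ins for two of the 12 per-date CSV files. The real files
# would be opened from disk; the column names and values are assumptions.
SAMPLE_FILES = {
    "2014-03-17": "X,Y,B1,B2,B3,B4,B5,B6\n0,0,10,20,30,40,50,60\n1,0,11,21,31,41,51,61\n",
    "2016-09-06": "X,Y,B1,B2,B3,B4,B5,B6\n0,0,12,22,32,42,52,62\n1,0,13,23,33,43,53,63\n",
}


def consolidate(files):
    """Stack every per-date table into one list of rows, tagging each row with its date."""
    rows = []
    for date, text in sorted(files.items()):
        for record in csv.DictReader(io.StringIO(text)):
            record["Date"] = date  # the extra column that identifies the source file
            rows.append(record)
    return rows


imagedata = consolidate(SAMPLE_FILES)
print(len(imagedata), imagedata[0]["Date"], imagedata[-1]["Date"])
```

Every row keeps its original columns and gains one "Date" value, so the stacked table can later be filtered or grouped by date.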

Adding Region Clustering

After using the relevant measurements to characterize regions on one particular date, we need to apply the same region labels to all dates.

We export the clustered region data to a CSV file and use R to duplicate it for all dates.

Region csv YR.PNG

Addregion r YR.PNG
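One plausible way to duplicate the region labels across dates is to look each pixel up by its coordinates. The Python sketch below assumes that approach; the region names, the coordinate keys and the sample rows are hypothetical, and the original R code may work differently.

```python
# Hypothetical region table exported for one date: (X, Y) -> cluster label.
region_by_pixel = {(0, 0): "Lake", (1, 0): "Forest"}

# Hypothetical consolidated rows covering two dates.
imagedata = [
    {"X": 0, "Y": 0, "Date": "2014-03-17"},
    {"X": 1, "Y": 0, "Date": "2014-03-17"},
    {"X": 0, "Y": 0, "Date": "2016-09-06"},
    {"X": 1, "Y": 0, "Date": "2016-09-06"},
]


def add_region(rows, regions):
    """Return copies of rows with a Region column looked up by pixel coordinates."""
    return [dict(row, Region=regions.get((row["X"], row["Y"]))) for row in rows]


with_region = add_region(imagedata, region_by_pixel)
for row in with_region:
    print(row)
```

Because the lookup depends only on X and Y, the same pixel receives the same label on every date. Pixels missing from the region table get None rather than raising an error.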

JMP

Imagedata

After consolidation, the dataset holds more than 5 million records, which exceeds Excel's limit of 1,048,576 rows per worksheet.

So we use JMP to view and modify the data. The consolidated dataset is shown below.

Imagedata jmp YR.PNG

NDVI difference

To see whether plant health changed noticeably between 2016 and earlier years, we calculate the NDVI difference for two pairs of years: 2016 versus 2014, and 2016 versus 2015.

We import the data into JMP and use a formula column to calculate the difference between two particular dates.

Difference jmp YR.png
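The JMP formula is visible only in the screenshot, so the following Python sketch shows the same idea under stated assumptions: the difference is taken as the 2016 value minus the earlier value, pixels are matched by (X, Y), and all NDVI values are made up for illustration.

```python
# Hypothetical NDVI values keyed by pixel coordinates (X, Y) for each year.
ndvi = {
    "2014": {(0, 0): 0.50, (1, 0): 0.25},
    "2015": {(0, 0): 0.75, (1, 0): 0.25},
    "2016": {(0, 0): 0.25, (1, 0): 0.50},
}


def ndvi_difference(later, earlier):
    """Per-pixel NDVI change (later minus earlier) for pixels present in both years."""
    return {pixel: later[pixel] - earlier[pixel] for pixel in later if pixel in earlier}


diff_2016_2014 = ndvi_difference(ndvi["2016"], ndvi["2014"])
diff_2016_2015 = ndvi_difference(ndvi["2016"], ndvi["2015"])
print(diff_2016_2014)
print(diff_2016_2015)
```

Under this sign convention, a negative value means NDVI was lower in 2016 than in the earlier year.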

Tableau

Calculated field

To reveal features in the images and evaluate them more precisely, we use Calculated Fields to create additional measures, including NDVI, NDWI, NDMI, NDSI, AVI, BSI, etc.

Cal1 YR.png
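Several of these measures share the normalized-difference form (a - b) / (a + b). The Python sketch below illustrates that form only. The band mapping (B2 green, B3 red, B4 near-infrared, B5 shortwave infrared) is an assumption, as are the green/NIR form of NDWI and the green/SWIR form of NDSI; the exact formulas in the Tableau workbook are visible only in the screenshot. AVI and BSI are omitted because their definitions vary between sources.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return None if total == 0 else (a - b) / total


def indices(pixel):
    """Compute four normalized-difference indices under the assumed band mapping."""
    green, red, nir, swir1 = pixel["B2"], pixel["B3"], pixel["B4"], pixel["B5"]
    return {
        "NDVI": normalized_difference(nir, red),      # vegetation
        "NDWI": normalized_difference(green, nir),    # water (green/NIR form assumed)
        "NDMI": normalized_difference(nir, swir1),    # moisture
        "NDSI": normalized_difference(green, swir1),  # snow (green/SWIR form assumed)
    }


# Hypothetical band values for a single pixel.
sample = {"B1": 10, "B2": 20, "B3": 30, "B4": 90, "B5": 60, "B6": 40}
result = indices(sample)
print(result)
```

Returning None for a zero denominator keeps an all-zero pixel from raising a division error.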

Bin

To see the distribution of each band and measure in a histogram, we need to create bins for them. Otherwise they remain continuous variables, and a histogram cannot show their distribution.

Bin1 YR.png

Bin2 YR.png

Bin3 YR.png
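Fixed-width binning can be sketched in a few lines of Python. The bin size of 0.25 and the sample values are hypothetical, and labelling each bin by its lower edge is an assumption about how the bins are displayed.

```python
import math
from collections import Counter


def bin_floor(value, size):
    """Return the lower edge of the fixed-width bin that contains value."""
    return math.floor(value / size) * size


# Hypothetical NDVI values and bin size.
values = [-0.5, -0.25, 0.0, 0.1, 0.2, 0.3, 0.6]
bin_size = 0.25

histogram = Counter(bin_floor(v, bin_size) for v in values)
for edge in sorted(histogram):
    print(edge, histogram[edge])
```

Each count corresponds to the height of one histogram bar, so a smaller bin size gives a finer view of the distribution at the cost of more bars.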

Parameter

To give users better interaction, we create a parameter that lists the different measures. Users can then choose a measure and see its related graphs in a single dashboard.

Index par YR.png

Par1 YR.png
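The idea behind the parameter is that one user choice decides which measure feeds the charts. A rough Python analogue is sketched below; the list of allowed names and the sample rows are hypothetical, and it only mirrors the selection logic, not Tableau itself.

```python
# Hypothetical list of measures offered to the user.
ALLOWED_MEASURES = ("NDVI", "NDWI", "NDMI")

# Hypothetical rows that already carry the calculated measures.
rows = [
    {"X": 0, "Y": 0, "NDVI": 0.5, "NDWI": -0.6, "NDMI": 0.2},
    {"X": 1, "Y": 0, "NDVI": 0.1, "NDWI": 0.3, "NDMI": -0.1},
]


def select_measure(rows, choice):
    """Return the chosen measure for every row, rejecting names outside the list."""
    if choice not in ALLOWED_MEASURES:
        raise ValueError(f"unknown measure: {choice}")
    return [row[choice] for row in rows]


print(select_measure(rows, "NDVI"))
print(select_measure(rows, "NDMI"))
```

Restricting the choice to a fixed list plays the same role as the parameter's list of allowed values: every chart reads from whichever measure is currently selected.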