ISSS608 2017-18 T3 Assign Lim Wee Kiong Data Preparation

From Visual Analytics and Applications
Revision as of 08:07, 8 July 2018 by Wklim.2017

DuckFam.jpg    VAST 2018 Mini-Challenge 2: Like a Duck to Water

Data Preparation

Understanding the Raw Data – Samples Readings and Measures

The data is provided in two main files: Boonsong Lekagul waterways readings.csv and chemical units of measure.csv.

Data cleaning is done in Excel and data visualization in Tableau.

The data fields in Boonsong Lekagul waterways readings.csv are described as follows:

Field        Description
ID           Identification number for the record (only for bookkeeping)
Value        Measured value for the chemical or property in this record
Location     Name of the location the sample was taken from. See the map for the geo-location of the sampling site.
Sample Date  Date the sample was taken from the location
Measure      Chemical (e.g., Sodium) or water property (e.g., Water temperature) measured in the record

A sample of the data is shown here:

LWKdataprep1.jpg


There are a total of 136,825 sample data points across 104 different measures.
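The counts above could be reproduced with a short script instead of by hand in Excel. The following is only a minimal sketch: the header names (ID, Value, Location, Sample Date, Measure) are assumed to match the field descriptions, and the three sample rows are made up for illustration, not taken from the real file.

```python
import csv
import io


def summarize_readings(csv_file):
    """Return (number of records, number of distinct measures)."""
    record_count = 0
    measures = set()
    for row in csv.DictReader(csv_file):
        record_count += 1
        # Assumed header name "Measure"; adjust if the real file differs.
        measures.add(row["Measure"].strip())
    return record_count, len(measures)


# Made-up rows for illustration only (not the real readings).
SAMPLE = """ID,Value,Location,Sample Date,Measure
1,2.0,Boonsri,1998-01-11,Water temperature
2,9.1,Boonsri,1998-01-11,Dissolved oxygen
3,0.33,Kohsoom,1998-01-11,Water temperature
"""

records, measure_count = summarize_readings(io.StringIO(SAMPLE))
print(records, measure_count)
```

To run it on the real data, the `io.StringIO(SAMPLE)` argument would be replaced with the opened readings file.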

The chemical units of measure.csv file lists the same measures with one additional field giving the unit of measurement for each. A sample of the data is shown below:

LWKdataprep2.jpg
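Since the units file is keyed by measure, it can be used as a lookup to attach a unit to every reading. The sketch below assumes the units file has columns named Measure and Unit (hypothetical header names), and both samples are made-up rows for illustration.

```python
import csv
import io


def load_units(units_file):
    """Build a measure -> unit lookup from the units file."""
    # "Measure" and "Unit" are assumed header names.
    return {
        row["Measure"].strip(): row["Unit"].strip()
        for row in csv.DictReader(units_file)
    }


def attach_units(readings_file, units):
    """Return the readings as dicts with an extra "Unit" key."""
    rows = []
    for row in csv.DictReader(readings_file):
        row = dict(row)
        # Measures without a known unit get an empty string.
        row["Unit"] = units.get(row["Measure"].strip(), "")
        rows.append(row)
    return rows


# Made-up rows for illustration only.
UNITS_SAMPLE = """Measure,Unit
Water temperature,C
Sodium,mg/l
"""
READINGS_SAMPLE = """ID,Value,Location,Sample Date,Measure
1,2.0,Boonsri,1998-01-11,Water temperature
2,0.5,Kohsoom,1998-01-11,Unlisted measure
"""

units = load_units(io.StringIO(UNITS_SAMPLE))
rows = attach_units(io.StringIO(READINGS_SAMPLE), units)
print(rows[0]["Measure"], rows[0]["Unit"])
```

Readings whose measure is not in the lookup keep an empty unit, which makes them easy to spot afterwards.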


At first glance the data looks usable and does not seem to need cleaning. However, an initial scan of the csv file suggests that data may be missing for several, if not all, of the measures.
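One way to check for missing data more systematically is to list, for each measure, the years in which it has no readings at all. This is a rough sketch rather than the check actually performed: the header names and the date format (passed in as `date_format`) are assumptions, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def years_without_readings(readings_file, date_format="%Y-%m-%d"):
    """Map each measure to the sorted years in which it has no readings.

    The year range is taken from the earliest to the latest year seen
    across all measures. Measures with no gaps are left out.
    """
    years_by_measure = defaultdict(set)
    for row in csv.DictReader(readings_file):
        # The date format is an assumption; pass the real one if it differs.
        year = datetime.strptime(row["Sample Date"].strip(), date_format).year
        years_by_measure[row["Measure"].strip()].add(year)
    if not years_by_measure:
        return {}
    all_years = set().union(*years_by_measure.values())
    full_range = set(range(min(all_years), max(all_years) + 1))
    gaps = {}
    for measure, years in years_by_measure.items():
        missing = sorted(full_range - years)
        if missing:
            gaps[measure] = missing
    return gaps


# Made-up rows for illustration only.
GAP_SAMPLE = """ID,Value,Location,Sample Date,Measure
1,2.0,Boonsri,1998-01-11,Water temperature
2,3.0,Boonsri,1999-01-11,Water temperature
3,4.0,Boonsri,2000-01-11,Water temperature
4,0.5,Boonsri,1998-01-11,Sodium
5,0.6,Boonsri,2000-01-11,Sodium
"""

gaps = years_without_readings(io.StringIO(GAP_SAMPLE))
print(gaps)
```

The same idea could be refined to per-location or per-month gaps by adding those fields to the grouping key.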


Deriving Auxiliary Data – The Map

The other file provided is Waterways Final.jpg, a low-resolution map of the preserve that shows the locations of the sampling points. I believe there is value in knowing the exact coordinates of each point, so I created a Tableau version of the map.

Step 1: A new file, location.csv, is created containing the coordinates of the sampling locations and of the four corners of the map:

Region X Y
UL 0 249
LL 0 0
UR 249 249
LR 249 0
Achara 106.5 161.18
Boonsri 134.88 196.48
Busarakhan 184.7 141.8
Chai 153.6 126.6
Decha 38 101
Kannika 165.3 70.6
Kohsoom 185.4 166
Sakda 133.5 34.6
Somchair 85.1 132.1
Tansanee 84.4 78.9

Step 2: location.csv is loaded into Tableau, with X placed on [Columns] and Y on [Rows]. The location name (the Region field) is mapped to [Details].

Step 3: Waterways Final.jpg is loaded as a background image via [Map] > [Background Images] > [Add Images] > [Waterways Final] to obtain the final output.
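The location.csv from Step 1 could also be generated with a few lines of code instead of being typed by hand. The sketch below uses the coordinates from the table in Step 1. The names `MAP_SIZE` and `write_locations` are only illustrative.

```python
import csv
import io

MAP_SIZE = 249  # extent of the map in both X and Y, from the corner rows

# The four corners pin the plot axes to the full extent of the image.
CORNERS = [
    ("UL", 0, MAP_SIZE),
    ("LL", 0, 0),
    ("UR", MAP_SIZE, MAP_SIZE),
    ("LR", MAP_SIZE, 0),
]

# Sampling locations and coordinates as listed in Step 1.
STATIONS = [
    ("Achara", 106.5, 161.18),
    ("Boonsri", 134.88, 196.48),
    ("Busarakhan", 184.7, 141.8),
    ("Chai", 153.6, 126.6),
    ("Decha", 38, 101),
    ("Kannika", 165.3, 70.6),
    ("Kohsoom", 185.4, 166),
    ("Sakda", 133.5, 34.6),
    ("Somchair", 85.1, 132.1),
    ("Tansanee", 84.4, 78.9),
]


def write_locations(out_file):
    """Write the header, the corner rows and the station rows as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(["Region", "X", "Y"])
    writer.writerows(CORNERS + STATIONS)


# Written to an in-memory buffer here; to create the real file, use
# open("location.csv", "w", newline="") in place of the buffer.
buffer = io.StringIO()
write_locations(buffer)
print(buffer.getvalue())
```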


LWKdataprep3.jpg
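For reference, the relationship between a position on the image and a coordinate on the plot can be written out explicitly. This sketch assumes the background image is stretched so that its left and right edges sit at X = 0 and X = 249 and its bottom and top edges at Y = 0 and Y = 249, which is inferred from the corner rows in location.csv. The image size in pixels used in the example is hypothetical.

```python
MAP_EXTENT = 249  # assumed axis range of the background image in X and Y


def pixel_to_map(px, py, width, height, extent=MAP_EXTENT):
    """Convert image pixels (origin top-left, y down) to map coordinates."""
    x = px / width * extent
    y = (1 - py / height) * extent  # flip: the map's Y axis points up
    return x, y


def map_to_pixel(x, y, width, height, extent=MAP_EXTENT):
    """Convert map coordinates back to image pixels (origin top-left)."""
    px = x / extent * width
    py = (1 - y / extent) * height
    return px, py


# Hypothetical 1000 x 1000 pixel image: the centre pixel is the map centre.
print(pixel_to_map(500, 500, 1000, 1000))
# The upper-left corner of the map (UL: 0, 249) is pixel (0, 0).
print(map_to_pixel(0, 249, 1000, 1000))
```

The Y flip is the main thing to watch for, since image editors usually count pixels downward from the top edge while the plot counts upward from the bottom.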


The points are also annotated, so that hovering the cursor over a location shows the exact coordinates of that station.

