Difference between revisions of "ISSS608 2017-18 T3 Assign Lim Wee Kiong Data Preparation"
Revision as of 08:12, 8 July 2018
Data Preparation
Understanding the Raw Data – Samples Readings and Measures
The data provided comes in two main files: Boonsong Lekagul waterways readings.csv and chemical units of measure.csv.
Data cleaning is done in Excel and data visualization in Tableau.
Descriptions of the data fields for Boonsong Lekagul waterways readings.csv are as follows:
Field | Description |
---|---|
ID | Identification number for the record (only for bookkeeping) |
Value | Measured value for the chemical or property in this record |
Location | Name of the location sample was taken from. See the map for geo-location of the sampling site. |
Sample Date | Date sample was taken from the location |
Measure | Chemicals (e.g., Sodium) or water properties (e.g., Water temperature) measured in the record |
A sample of the data is shown here:
There are a total of 136,825 sample data points across 104 different measures.
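Counts like these can be checked outside Excel. The following is a minimal sketch using Python's standard library; the three-row excerpt is invented for illustration, and the column headers are assumed to match the field names in the table above (the real file may spell them differently).

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt standing in for "Boonsong Lekagul waterways readings.csv".
# The values are invented; only the column names come from the field table.
SAMPLE = """ID,Value,Location,Sample Date,Measure
1,2.0,Boonsri,11-Jan-98,Water temperature
2,9.1,Boonsri,11-Jan-98,Dissolved oxygen
3,0.33,Kannika,12-Jan-98,Water temperature
"""


def summarize(csv_file):
    """Return (number of readings, Counter of readings per measure)."""
    reader = csv.DictReader(csv_file)
    per_measure = Counter(row["Measure"] for row in reader)
    return sum(per_measure.values()), per_measure


total, per_measure = summarize(io.StringIO(SAMPLE))
print(total, "readings across", len(per_measure), "measures")
```

To run it against the real file, pass `open("Boonsong Lekagul waterways readings.csv", newline="")` to `summarize` instead of the in-memory sample.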
The chemical units of measure.csv file lists the same measures with an additional field for the unit of measurement. A sample of the data is shown below:
At this point, the data looks usable and does not seem to need cleaning. However, an initial scan of the csv file suggests that data could be missing for several, if not all, of the measures.
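One way to make the "missing data" scan concrete is to list every location and measure combination that has no reading at all. This is only a sketch of that idea, not the procedure used in this assignment; the sample rows are invented and the column names are assumed from the field table above.

```python
import csv
import io
from itertools import product

# Hypothetical excerpt; values are invented for illustration only.
SAMPLE = """ID,Value,Location,Sample Date,Measure
1,2.0,Boonsri,11-Jan-98,Water temperature
2,9.1,Boonsri,11-Jan-98,Dissolved oxygen
3,0.33,Kannika,12-Jan-98,Water temperature
"""


def missing_pairs(csv_file):
    """Return sorted (location, measure) pairs that never appear in the data."""
    rows = list(csv.DictReader(csv_file))
    locations = {row["Location"] for row in rows}
    measures = {row["Measure"] for row in rows}
    seen = {(row["Location"], row["Measure"]) for row in rows}
    # Every possible combination minus the ones actually observed.
    return sorted(set(product(locations, measures)) - seen)


gaps = missing_pairs(io.StringIO(SAMPLE))
print(gaps)
```

This only detects measures that were never sampled at a location. Gaps in time (months or years without readings) would need an extra grouping on Sample Date, whose format is not stated here.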
Deriving Auxiliary Data – The Map
Another file provided is Waterways Final.jpg, a low-resolution map of the preserve that shows the locations of the various sampling points. I believe there is value in knowing the exact coordinates of each point, and hence I have created a Tableau version of the map.
Step 1: A new location.csv is created with the coordinates of the preserve locations and the 4 corners of the map:
Region | X | Y |
---|---|---|
UL | 0 | 249 |
LL | 0 | 0 |
UR | 249 | 249 |
LR | 249 | 0 |
Achara | 106.5 | 161.18 |
Boonsri | 134.88 | 196.48 |
Busarakhan | 184.7 | 141.8 |
Chai | 153.6 | 126.6 |
Decha | 38 | 101 |
Kannika | 165.3 | 70.6 |
Kohsoom | 185.4 | 166 |
Sakda | 133.5 | 34.6 |
Somchair | 85.1 | 132.1 |
Tansanee | 84.4 | 78.9 |
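The file in Step 1 can also be generated with a short script instead of being typed by hand. The sketch below reproduces the table above as CSV text; the function name `locations_csv` is hypothetical, and the coordinates are copied from the table, not derived from the image.

```python
import csv
import io

# Map corners followed by the sampling locations, copied from the table above.
ROWS = [
    ("UL", 0, 249), ("LL", 0, 0), ("UR", 249, 249), ("LR", 249, 0),
    ("Achara", 106.5, 161.18), ("Boonsri", 134.88, 196.48),
    ("Busarakhan", 184.7, 141.8), ("Chai", 153.6, 126.6),
    ("Decha", 38, 101), ("Kannika", 165.3, 70.6),
    ("Kohsoom", 185.4, 166), ("Sakda", 133.5, 34.6),
    ("Somchair", 85.1, 132.1), ("Tansanee", 84.4, 78.9),
]


def locations_csv(rows=ROWS):
    """Return the location table as CSV text with a Region,X,Y header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Region", "X", "Y"])
    writer.writerows(rows)
    return buf.getvalue()


print(locations_csv())
```

Saving the returned text as location.csv gives a file that can be loaded into Tableau in the next step.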
Step 2: location.csv is loaded into Tableau, with X placed on [Columns] and Y on [Rows]. Location is mapped to [Details].
Step 3: The Waterways Final.jpg image is loaded via [Map] > [Background Images] > [Add Images] > [Waterways Final] to obtain the final output.
The points are also annotated, so that hovering the cursor over a location shows the exact coordinates of that station.
This map is important because it serves as auxiliary data for our analysis when we try to determine whether water flow contributes to the readings.
Obtaining Knowledge on Hydrology
While this is not a requirement, it seems useful to learn more about hydrology and water pollution as we embark on this task.
We have established earlier that Methylosmolene is the main toxic compound in question. But what other chemicals or measures would be useful in understanding its impact on the fauna in the preserve, especially the birds?