Difference between revisions of "ISSS608 2017-18 T3 Assign Vaishnavi Praveen Agarwal DataPrep"

From Visual Analytics and Applications
Jump to navigation Jump to search
 
(115 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
<div>
 +
[[Image:pic.jpg|550px]]
 +
<font size = 5; font color="#21618C"> &nbsp; VAST Challenge: Mini Challenge 2</font>   
 +
</div>
  
 
<!--MAIN HEADER -->  
 
<!--MAIN HEADER -->  
 
{|style="background-color:#0B5345;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |  
 
{|style="background-color:#0B5345;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#0B5345; text-align:center;" width="20%" |   
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#21618C; text-align:center;" width="17%" |   
 
;  
 
;  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal| <font color="#FFFFFF">The Challenge</font>]]  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal| <font color="#FFFFFF">The Challenge</font>]]  
 
   
 
   
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#7D6608; text-align:center;" width="20%" |   
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#808B96; text-align:center;" width="16%" |   
 
;  
 
;  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_DataPrep| <font color="#FFFFFF">Data Preparation</font>]]  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_DataPrep| <font color="#FFFFFF">Data Preparation</font>]]  
 
   
 
   
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#0B5345; text-align:center;" width="20%" |   
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#21618C; text-align:center;" width="16%" |   
 
;  
 
;  
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Visual| <font color="#FFFFFF">Visualization Tools</font>]]  
+
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Visual| <font color="#FFFFFF">Visualizations</font>]]  
 
   
 
   
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#0B5345; text-align:center;" width="20%" |   
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#21618C; text-align:center;" width="18%" |   
 
;  
 
;  
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Vast| <font color="#FFFFFF">VAST Submissions</font>]]  
+
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Vast| <font color="#FFFFFF">Insights and Conclusion</font>]]  
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#0B5345; text-align:center;" width="20%" |   
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#21618C; text-align:center;" width="19%" |   
 
;  
 
;  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Feedback| <font color="#FFFFFF">Feedback and Comments</font>]]  
 
[[ISSS608_2017-18_T3_Assign_Vaishnavi_Praveen_Agarwal_Feedback| <font color="#FFFFFF">Feedback and Comments</font>]]  
 +
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#21618C; text-align:center;" width="14%" | 
 +
;
 +
[[Assignment_Dropbox_G2| <font color="#FFFFFF">All Assignments</font>]]
 
   
 
   
 
|  &nbsp;  
 
|  &nbsp;  
|}  
+
|}
<!--MAIN HEADER-->
 
  
  
<!--The visualization exercises covered in the next 2 tabs make use of 4 distinct datasets. A brief overview of the data preparation and univariate analysis for all 4 datasets is covered in subsequent sections. These datasets and corresponding analysis has been prepared using both JMP Pro 13 and Tableau.
+
=Data Description=
 +
<div style="margin:0px; padding: 2px; font-family: Arial; border-radius: 1px; text-align:left">
 +
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 +
|-
 +
|style="text-align:center"|
 +
;
 +
<b>File Name</b>
 +
|style="text-align:center"|
 +
<b>Variables</b>
  
A brief description of all 4 datasets and preparatory steps is given below:
+
|-
 +
|
 +
<b>Boonsong Lekagul waterways readings (.csv file)</b>
 +
||
  
=Geolocation Data=
+
The given data file has reading of 106 chemicals at 10 different locations from 1998 - 2016.
 +
 
 +
<b>i. id</b> (numeric , unique)
 +
<br><b>ii. value</b> (numeric)
 +
<br><b>iii. location</b> (string)
 +
<br><b>iv. sample date</b> (string)
 +
<br><b>v. measure</b> (string)
 +
 +
|-
 +
|
 +
<b>chemical units of measure (.csv file)</b>
 +
||
  
This dataset was created based on the location X and Y coordinate points provided as part of the challenge description. The VAST challenge documentation provides geographical coordinates for all 4 factories and 9 sensors. In addition, a “Type” column was added to differentiate between the type of infrastructural construction (Factory or Sensor) represented by the corresponding coordinates.
+
The given data file has the unit of each measure (106 measures) in which the readings were taken.
  
Based on the given information, the table below with 4 columns was created which when plotted in a 200x200 coordinate block map shows the following geographical layout of all factories and sensors.
+
<b>i. measure</b> (string, unique)
 +
<br><b>ii. unit</b> (string)
  
[[]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''''</u> </div>
+
|
 +
<b>Waterways Final (.jpg file)</b>
 +
||
  
 +
The Waterways Final is a map image that shows the location of dumping site and the waterways.
 +
<span style=text-align:center;">[[Image:Waterways Final.jpg|200px|thumb|left]]</span>
  
=Sensor Data=
+
|}
<div>
 
<div style=float:right; width:20%;">
 
[[|300px|right|border]]
 
<div style="float:center;text-align:center;font-size:13px;"> <u>''''</u> </div>
 
 
</div>
 
</div>
<div style=float:left;width:70%;">
 
This dataset provided by the VAST challenge documentation in its original format, contains hourly readings for each of the 4 chemicals captured by every Sensor. The number of records span across 3 months i.e. April, August and December, 2016. The adjacent image shows a sample of the dataset provided.
 
Using JMP Pro 13, missing values for this dataset were checked to confirm that all 79,243 rows in this dataset did not have any missing values.
 
  
  
[[|500px|center|border]]
+
=Data Preparation=
<div style="float:center;text-align:center;font-size:13px;"> <u>''''</u> </div>
+
===1. Grouping Measures===
  
 +
There are a total of 106 measures given in the data file, and it is difficult to view all the measures at a glance. In order to make the visualization simple I have grouped the measure based on their Chemical Composition and Behavior.
  
Univariate analysis of the readings also show that the measure of chemical readings is highly skewed. It is noteworthy to highlight the degree of skewness by stating that the 99.5th percentile measure is 6.46 and the maximum value at 101.1. The median reading is at 0.39. These insights have been used in the preparation of selected visualizations, as covered in the next tab.
+
- A new column was added in the file <b>chemical units of measure.csv</b>
 +
<br>- Name the column as <b>groups</b>
 +
<br>- Chemical properties of each measure was studied and they were divided in <b>12 groups</b>.
 +
<br>- Manually assigned the relevant group to each measure.
  
The distribution of the numeric nominal Monitor field was also analysed to ascertain the frequency of all readings for all Sensors. As shown in the adjacent image, all 9 Sensors capture similar number of readings in the 3 months of data provided with minor differences as shown in the frequency count below:
+
<b>The 12 groups formed are:</b>
  
<div style="float:left;;width:50%;">
+
<div style="margin:0px; padding: 2px; font-family: Arial; border-radius: 1px; text-align:left">
[[|500px|center|border]]<br/>
+
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
<center style="font-size:13px;"> <u>''''</u> </center>
+
|-
</div>
+
|style="text-align:center"|
<div style="float:left;;width:50%;>
+
;
[[|450px|center|border]]
+
<b>Groups</b>
<center style="font-size:13px;"> <u>''''</u> </center>
+
|style="text-align:center"|
</div>
+
<b>Measures</b>
</div>
 
</div>
 
  
 +
|-
 +
|
 +
<b>Field measurements</b>
 +
||
  
 +
Anionic active surfactants, Fecal coliforms, Fecal streptococci, Total coliforms, Total extractable matter, Total hardness
 +
 +
|-
 +
|
 +
<b>Heavy metals</b>
 +
||
  
=Meteorological Data=
+
Arsenic, Cadmium, Chromium, Copper, Iron, Lead, Manganese, Mercury, Nickel, Zinc
  
 +
|-
 +
|
 +
<b>Hydrocarbon</b>
 +
||
  
<div>
+
Acenaphthene, Acenaphthylene, Anthracene, Benzo(a)anthracene, Benzo(a)pyrene, Benzo(b)fluoranthene, Benzo(g,h,i)perylene, Benzo(k)fluoranthene, Chrysene, Fluoranthene, Fluorene, Indeno(1,2,3-c,d)pyrene, Naphthalene, PAHs, Pentachlorobenzene, Petroleum hydrocarbons, Phenanthrene, Pyrene, Tetrachloromethane
<div style=float:right; width:40%;">
 
  
[[|530px|right|border]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''''</u> </div>
+
|
</div>
+
<b>Insecticides, Pesticides and Herbicides</b>  
<div style=float:left;width:60%;">
+
||
This dataset, as provided by the VAST challenge documentation in its original format, contains atmospheric wind related information. The data covers meteorological readings once every 3 hours with the general wind direction and wind speed captured for that time frame. A snippet of this dataset is shown in the adjacent image.
 
  
The relevance of the data field for “Elevation” given in this dataset is unclear and requires additional information. For the purpose of this investigation, this field has been ignored until further information can be provided.
+
Alachlor, Aldrin, Atrazine, Endosulfan (alpha), Endosulfan (beta), gamma-Hexachlorocyclohexane, Isodrin, Methoxychlor, Metolachlor, p,p-DDD, p,p-DDE, p,p-DDT, Simazine, Trifluralin
  
Possible missing values were checked with the following results:
+
|-
 +
|
 +
<b>Metal</b>
 +
||
  
[[|500px|center|border]]
+
Aluminium, Barium, Berilium, Boron, Calcium, Cesium, Magnesium, Potassium, Selenium, Sodium
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 6''</u> </div>
 
</div>
 
  
As shown above, there are 2 anomalous records. Further investigations showed that the 2 records are - a empty row and missing values on 30th August, 2016 at 3AM. These 2 records were removed as part of the data cleaning process.
+
|-
 +
|
 +
<b>Mineral and Nutrients</b>
 +
||
  
Univariate analysis on the Wind Speed (m/s) shows the frequency of wind speed is well distributed between 0.1 m/s and 6.8 m/s.
+
Dissolved silicates, Orthophosphate-phosphorus, Silica (SiO2), Total dissolved phosphorus, Total nitrogen, Total phosphorus
  
[[|700px|center|border]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 8''</u> </div>
+
|
 +
<b>Nitrogen</b>  
 +
||
  
Frequency analysis of Wind Direction shows that the wind direction is more skewed between 150 to 360 degrees. Since the given wind direction data is north-facing, the frequency distribution shows that the wind direction is mostly from west to east.
+
Inorganic nitrogen, Organic nitrogen, Total organic carbon
  
[[|700px|center|border]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 9''</u> </div>
+
|
 +
<b>Organic</b>  
 +
||
  
 +
alpha-Hexachlorocyclohexane, AOX, beta-Hexaxchlorocyclohexane, Dieldrin, Endrin, Heptachlor, Heptachloroepoxide, Hexachlorobenzene, PCB 101, PCB 118, PCB 138, PCB 153, PCB 180, PCB 28, PCB 52
  
 +
|-
 +
|
 +
<b>Others</b>
 +
||
  
=Factory Identification Data=
+
1,2,3-Trichlorobenzene, 1,2,4-Trichlorobenzene, AGOC-3A, Macrozoobenthos, Methylosmoline
  
This is a manually created and derived dataset that uses the all of the aforementioned data tables.  The purpose and usage of this dataset is to map the chemical readings captured by Sensors at each time frame, with the general wind speed and direction, in an effort to perform geolocation analysis and estimate the factory origin of chemical emissions.
+
|-
 +
|
 +
<b>Oxygen</b>
 +
||
  
This dataset combines the Sensor data, Meteorological data and Geolocation data to perform an overall geographical analysis. Steps to recreate this dataset are as follows:
+
Biochemical Oxygen, Chemical Oxygen Demand (Cr), Chemical Oxygen Demand (Mn), Dissolved organic carbon, Dissolved oxygen, Oxygen saturation
  
* Using JMP Pro 13, a Cartesian join was performed on Meteorological data and Geolocation data.
+
|-
 +
|
 +
<b>Salt</b>
 +
||
  
 +
Ammonium, Bicarbonates, Carbonates, Chlorides, Chlorodinine, Cyanides, Nitrates, Nitrites, Sulfides, Sulphates, Total dissolved salts
  
[[|500px|center|border]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 10''</u> </div>
+
|
 +
<b>Water temperature</b>  
 +
||
  
 +
Water temperature
  
<div style="padding-left:2%;">The resultant dataset contains all Meteorological records for all 9 sensors and 4 factories, as shown below.  Our investigation only requires us to consider the Sensor data and estimate the factory origin of chemical emissions. The next 2 steps show the process of retrieving this subset of data.</div>
+
|}
 +
</div>
  
* The Sensor names were recoded to its numerical equivalent before creating a subset, as shown below:
+
===2. Coordinate Plotting===
  
[[|500px|center|border]]
+
Waterways Final is a Map image provided in the data, but the data files does not contain Latitude and Longitude for the Places shown on the Map. Therefore, to use the Map as an interactive sheet, I will create the Latitude and Longitude manually in the following way:
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 11''</u> </div>
 
  
 +
<div style="margin:0px; padding: 2px; font-family: Arial; border-radius: 1px; text-align:left">
 +
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 +
|-
 +
|style="text-align:center"|
 +
;
 +
<b>Steps</b>
 +
|style="text-align:center"|
 +
<b>Image</b>
  
*A subset of this dataset was created to filter out records only for the 9 Sensors and not the 4 factories. This was done using the Subset functionality of JMP on the field “Type”.
+
|-
 +
|
 +
<b>Adding Data and Background Image for Map</b>
  
[[|500px|center|border]]
+
*Open Boonsong Lekagul waterways readings (.csv file) in Tableau, right click on Location Dimension, click on Geographic Role and select ZIP Code / Postcode. This will generate Latitude and Longitude in Measures. Add the generated fields to Columns and Rows respectively.
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 12''</u> </div>
+
*Go to Map options and click on Background Images. In the dialogue box, browse the Waterways Final (.jpg file). In the X Field set Right value to 249 and in the Y Field set the Top value to 249. Go to options and select Always show entire image option and Apply.
 +
 +
||
 +
<span style=text-align:center;">[[Image:Vaishnavi_1.png|300px|thumb|center]]</span>
  
 +
|-
 +
|
 +
<b>Annotate Point to generate Coordinates</b>
  
*The resultant Subset created for Sensors with 6,345 rows is shown below:
+
*Create a new excel file Locations, add a column named Location and manually enter the places. Add other two column named X and Y.
 +
*Right click on the tip of the arrow on map, click on Annotate and select Point.
 +
*In the dialog box, insert fields Latitude(generated) and Longitude(generated).
 +
*Manually copy paste the annotated values in the Location.xlsx file, X and Y columns.
 +
*Do this for all the locations.
  
 +
||
 +
<span style=text-align:center;">[[Image:Vaishnavi_2.png|300px|thumb|center]]</span>
  
[[|500px|center|border]]
+
|}
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 13''</u> </div>
+
</div>
  
 +
===3. Combining Files===
  
 +
After cleaning and preparing the data, we have total 3 files: <b>Boonsong Lekagul waterways readings (.csv file)</b>, <b>chemical units of measure (.csv file)</b> and <b>Location (.xlsx file)</b>. We will combine these file using the common columns as Join clause and then do the Visualization.
  
<div>
+
<div style="margin:0px; padding: 2px; font-family: Arial; border-radius: 1px; text-align:left">
<div style=float:right; width:25%;">
+
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
[[|150px|right|border]]
+
|-
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 14''</u> </div>
+
|style="text-align:center"|
</div>
+
;
<div style=float:left;width:70%;">
+
<b>Steps</b>
*Further investigations showed that in order to plot the influence area of each Sensor to trace back to the Factory origin of chemical emissions, any visualization will require a polygonal mapping of its geographically cover the meteorological influence.
+
|style="text-align:center"|
<div style="padding-left:2%;">By applying and modifying a theoretical understanding of the Windrose model, a triangle was used as the decided upon polygon. In order to plot a triangle for each Sensor based on the wind speed and direction at every given time-frame; the required data should contain 3 coordinate points to capture the 3 ends of the triangle.
+
<b>Image</b>
  
In order to do this, a dummy table containing the 3 coordinate types was created, as shown in the adjacent image.</div>
+
|-
*A Cartesian join was performed on these 2 data tables, which resulted in a consolidated data table with 19,035 records in the following format:
+
|
</div>
+
<b>Combining Boonsong Lekagul waterways readings (.csv file) and chemical units of measure (.csv file)</b>  
  
 +
||
 +
*Open the Data Source, add a new Text File <b>Boonsong Lekagul waterways readings (.csv file)</b>.
 +
*Again click on Add in Connections, click on Text File and add <b>chemical units of measure (.csv file)</b>.
 +
*Click on Join and select <b>Left Join</b>.
 +
<span style=text-align:center;">[[Image:Vaishnavi_3.png|550px|thumb|center]]</span>
 +
*Use <b>Measure</b> as the Join clause between the two files.
 +
<span style=text-align:center;">[[Image:Vaishnavi_4.png|300px|thumb|center]]</span>
  
[[|600px|center|border]]
 
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 15''</u> </div>
 
</div>
 
  
 +
|-
 +
|
 +
<b>Combining Boonsong Lekagul waterways readings (.csv file) and Location (.xlsx file)</b>
  
*Finally, this dataset was loaded in Tableau to perform visual analysis. The corresponding Sensor readings for the time frames captured in this dataset were mapped by performing an inner join with the Sensor data in Tableau. The screenshot below shows the inner join performed on Sensor and the Date time fields.
+
||
 +
*Open the Data Source, add a new Microsoft Excel File <b>Location (.xlsx file)</b>.  
 +
*Click on Join and select <b>Left Join</b>.
 +
<span style=text-align:center;">[[Image:Vaishnavi_5.png|400px|thumb|center]]</span>
 +
*Use <b>Location</b> as the Join clause between the two files.
 +
<span style=text-align:center;">[[Image:Vaishnavi_6.png|300px|thumb|center]]</span>
  
  
[[|600px|center|border]]
+
|}
<div style="float:center;text-align:center;font-size:13px;"> <u>''Figure 16''</u> </div>
 
 
</div>
 
</div>
 +
 +
===4. Final Dataset===
  
  
  
The limitation of the last step is that the analysis performed in Question 3 of Mini Challenge 2 from the VAST Challenge 2017, is based on Sensor readings every 3 hours and not every hour.
+
<span style=text-align:center;">[[Image:Vaishnavi_7.png|1100px|thumb|left]]</span>
-->
 

Latest revision as of 11:44, 8 July 2018

Pic.jpg   VAST Challenge: Mini Challenge 2

The Challenge

Data Preparation

Visualizations

Insights and Conclusion

Feedback and Comments

All Assignments

 


Data Description

File Name

Variables

Boonsong Lekagul waterways readings (.csv file)

The given data file has reading of 106 chemicals at 10 different locations from 1998 - 2016.

i. id (numeric , unique)
ii. value (numeric)
iii. location (string)
iv. sample date (string)
v. measure (string)

chemical units of measure (.csv file)

The given data file has the unit of each measure (106 measures) in which the readings were taken.

i. measure (string, unique)
ii. unit (string)

Waterways Final (.jpg file)

The Waterways Final is a map image that shows the location of dumping site and the waterways.

Waterways Final.jpg


Data Preparation

1. Grouping Measures

There are a total of 106 measures given in the data file, and it is difficult to view all the measures at a glance. In order to make the visualization simple I have grouped the measure based on their Chemical Composition and Behavior.

- A new column was added in the file chemical units of measure.csv
- Name the column as groups
- Chemical properties of each measure was studied and they were divided in 12 groups.
- Manually assigned the relevant group to each measure.

The 12 groups formed are:

Groups

Measures

Field measurements

Anionic active surfactants, Fecal coliforms, Fecal streptococci, Total coliforms, Total extractable matter, Total hardness

Heavy metals

Arsenic, Cadmium, Chromium, Copper, Iron, Lead, Manganese, Mercury, Nickel, Zinc

Hydrocarbon

Acenaphthene, Acenaphthylene, Anthracene, Benzo(a)anthracene, Benzo(a)pyrene, Benzo(b)fluoranthene, Benzo(g,h,i)perylene, Benzo(k)fluoranthene, Chrysene, Fluoranthene, Fluorene, Indeno(1,2,3-c,d)pyrene, Naphthalene, PAHs, Pentachlorobenzene, Petroleum hydrocarbons, Phenanthrene, Pyrene, Tetrachloromethane

Insecticides, Pesticides and Herbicides

Alachlor, Aldrin, Atrazine, Endosulfan (alpha), Endosulfan (beta), gamma-Hexachlorocyclohexane, Isodrin, Methoxychlor, Metolachlor, p,p-DDD, p,p-DDE, p,p-DDT, Simazine, Trifluralin

Metal

Aluminium, Barium, Berilium, Boron, Calcium, Cesium, Magnesium, Potassium, Selenium, Sodium

Mineral and Nutrients

Dissolved silicates, Orthophosphate-phosphorus, Silica (SiO2), Total dissolved phosphorus, Total nitrogen, Total phosphorus

Nitrogen

Inorganic nitrogen, Organic nitrogen, Total organic carbon

Organic

alpha-Hexachlorocyclohexane, AOX, beta-Hexaxchlorocyclohexane, Dieldrin, Endrin, Heptachlor, Heptachloroepoxide, Hexachlorobenzene, PCB 101, PCB 118, PCB 138, PCB 153, PCB 180, PCB 28, PCB 52

Others

1,2,3-Trichlorobenzene, 1,2,4-Trichlorobenzene, AGOC-3A, Macrozoobenthos, Methylosmoline

Oxygen

Biochemical Oxygen, Chemical Oxygen Demand (Cr), Chemical Oxygen Demand (Mn), Dissolved organic carbon, Dissolved oxygen, Oxygen saturation

Salt

Ammonium, Bicarbonates, Carbonates, Chlorides, Chlorodinine, Cyanides, Nitrates, Nitrites, Sulfides, Sulphates, Total dissolved salts

Water temperature

Water temperature

2. Coordinate Plotting

Waterways Final is a Map image provided in the data, but the data files does not contain Latitude and Longitude for the Places shown on the Map. Therefore, to use the Map as an interactive sheet, I will create the Latitude and Longitude manually in the following way:

Steps

Image

Adding Data and Background Image for Map

  • Open Boonsong Lekagul waterways readings (.csv file) in Tableau, right click on Location Dimension, click on Geographic Role and select ZIP Code / Postcode. This will generate Latitude and Longitude in Measures. Add the generated fields to Columns and Rows respectively.
  • Go to Map options and click on Background Images. In the dialogue box, browse the Waterways Final (.jpg file). In the X Field set Right value to 249 and in the Y Field set the Top value to 249. Go to options and select Always show entire image option and Apply.
Vaishnavi 1.png

Annotate Point to generate Coordinates

  • Create a new excel file Locations, add a column named Location and manually enter the places. Add other two column named X and Y.
  • Right click on the tip of the arrow on map, click on Annotate and select Point.
  • In the dialog box, insert fields Latitude(generated) and Longitude(generated).
  • Manually copy paste the annotated values in the Location.xlsx file, X and Y columns.
  • Do this for all the locations.
Vaishnavi 2.png

3. Combining Files

After cleaning and preparing the data, we have total 3 files: Boonsong Lekagul waterways readings (.csv file), chemical units of measure (.csv file) and Location (.xlsx file). We will combine these file using the common columns as Join clause and then do the Visualization.

Steps

Image

Combining Boonsong Lekagul waterways readings (.csv file) and chemical units of measure (.csv file)

  • Open the Data Source, add a new Text File Boonsong Lekagul waterways readings (.csv file).
  • Again click on Add in Connections, click on Text File and add chemical units of measure (.csv file).
  • Click on Join and select Left Join.
Vaishnavi 3.png
  • Use Measure as the Join clause between the two files.
Vaishnavi 4.png


Combining Boonsong Lekagul waterways readings (.csv file) and Location (.xlsx file)

  • Open the Data Source, add a new Microsoft Excel File Location (.xlsx file).
  • Click on Join and select Left Join.
Vaishnavi 5.png
  • Use Location as the Join clause between the two files.
Vaishnavi 6.png


4. Final Dataset

Vaishnavi 7.png