ISSS608 2017-18 T3 Assign Vaishnavi Praveen Agarwal DataPrep
Data Description
File Name | Variables
Boonsong Lekagul waterways readings (.csv file) | Readings of 106 chemicals at 10 different locations from 1998 to 2016. i. id (numeric, unique)
chemical units of measure (.csv file) | The unit in which each of the 106 measures was recorded. i. measure (string, unique)
Waterways Final (.jpg file) | A map image that shows the location of the dumping site and the waterways.
Data Preparation
1. Grouping Measures
There are 106 measures in the data file, which makes it difficult to view them all at a glance. To simplify the visualization, I grouped the measures by their chemical composition and behavior.
- A new column was added to the file chemical units of measure.csv.
- The column was named groups.
- The chemical properties of each measure were studied, and the measures were divided into 12 groups.
- The relevant group was manually assigned to each measure.
The 12 groups formed are:
Groups | Measures
Field measurements | Anionic active surfactants, Fecal coliforms, Fecal streptococci, Total coliforms, Total extractable matter, Total hardness
Heavy metals | Arsenic, Cadmium, Chromium, Copper, Iron, Lead, Manganese, Mercury, Nickel, Zinc
Hydrocarbon | Acenaphthene, Acenaphthylene, Anthracene, Benzo(a)anthracene, Benzo(a)pyrene, Benzo(b)fluoranthene, Benzo(g,h,i)perylene, Benzo(k)fluoranthene, Chrysene, Fluoranthene, Fluorene, Indeno(1,2,3-c,d)pyrene, Naphthalene, PAHs, Pentachlorobenzene, Petroleum hydrocarbons, Phenanthrene, Pyrene, Tetrachloromethane
Insecticides, Pesticides and Herbicides | Alachlor, Aldrin, Atrazine, Endosulfan (alpha), Endosulfan (beta), gamma-Hexachlorocyclohexane, Isodrin, Methoxychlor, Metolachlor, p,p-DDD, p,p-DDE, p,p-DDT, Simazine, Trifluralin
Metal | Aluminium, Barium, Berilium, Boron, Calcium, Cesium, Magnesium, Potassium, Selenium, Sodium
Mineral and Nutrients | Dissolved silicates, Orthophosphate-phosphorus, Silica (SiO2), Total dissolved phosphorus, Total nitrogen, Total phosphorus
Nitrogen | Inorganic nitrogen, Organic nitrogen, Total organic carbon
Organic | alpha-Hexachlorocyclohexane, AOX, beta-Hexaxchlorocyclohexane, Dieldrin, Endrin, Heptachlor, Heptachloroepoxide, Hexachlorobenzene, PCB 101, PCB 118, PCB 138, PCB 153, PCB 180, PCB 28, PCB 52
Others | 1,2,3-Trichlorobenzene, 1,2,4-Trichlorobenzene, AGOC-3A, Macrozoobenthos, Methylosmoline
Oxygen | Biochemical Oxygen, Chemical Oxygen Demand (Cr), Chemical Oxygen Demand (Mn), Dissolved organic carbon, Dissolved oxygen, Oxygen saturation
Salt | Ammonium, Bicarbonates, Carbonates, Chlorides, Chlorodinine, Cyanides, Nitrates, Nitrites, Sulfides, Sulphates, Total dissolved salts
Water temperature | Water temperature
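The grouping was assigned by hand, but the same step could also be scripted. The sketch below is only an illustration: it copies three of the 12 groups from the table above, assumes the measure column is named `measure` and the new column `groups`, and uses a made-up `unit` column with illustrative values in place of the real file.

```python
import csv
import io

# Partial mapping copied from the table above; the remaining groups
# would be added in the same way.
GROUPS = {
    "Heavy metals": ["Arsenic", "Cadmium", "Chromium", "Copper", "Iron",
                     "Lead", "Manganese", "Mercury", "Nickel", "Zinc"],
    "Nitrogen": ["Inorganic nitrogen", "Organic nitrogen", "Total organic carbon"],
    "Water temperature": ["Water temperature"],
}

# Invert the table to measure -> group for direct lookup.
MEASURE_TO_GROUP = {m: g for g, members in GROUPS.items() for m in members}


def add_groups(rows):
    """Return new rows with a `groups` field; unmapped measures get ''."""
    return [{**row, "groups": MEASURE_TO_GROUP.get(row["measure"], "")}
            for row in rows]


# In-memory stand-in for chemical units of measure.csv
# (the `unit` column name and its values are assumptions).
sample = io.StringIO(
    "measure,unit\nArsenic,mg/l\nWater temperature,C\nOrganic nitrogen,mg/l\n"
)
rows = add_groups(list(csv.DictReader(sample)))
for row in rows:
    print(row["measure"], "->", row["groups"])
```

Leaving unmapped measures blank rather than raising an error makes it easy to spot any measure that has not yet been assigned a group.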
2. Coordinate Plotting
Waterways Final is a map image provided with the data, but the data files do not contain the latitude and longitude of the places shown on the map. Therefore, to use the map as an interactive sheet, I will create the latitude and longitude values manually in the following way:
Steps | Image
Adding Data and Background Image for Map | (image)
Annotate Point to generate Coordinates | (image)
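Once the coordinates have been read off the image, they can be stored in a small location table. The sketch below is one possible way to do this and rests on several assumptions: the site names, pixel positions and image height are hypothetical, the positions are assumed to be measured from the top-left corner of the image, and the plot is assumed to place its origin at the bottom-left, so the y value is flipped.

```python
import csv
import io

IMAGE_HEIGHT = 600  # hypothetical pixel height of the map image

# Hypothetical pixel positions (x from the left, y from the top).
PIXEL_POSITIONS = {"Site A": (120, 450), "Site B": (300, 90)}


def to_plot_coords(x_px, y_px, image_height):
    """Flip the y axis so the origin moves from top-left to bottom-left."""
    return x_px, image_height - y_px


def build_location_rows(positions, image_height):
    """Build one row per location with plot-ready X and Y values."""
    rows = []
    for name, (x_px, y_px) in positions.items():
        x, y = to_plot_coords(x_px, y_px, image_height)
        rows.append({"location": name, "X": x, "Y": y})
    return rows


location_rows = build_location_rows(PIXEL_POSITIONS, IMAGE_HEIGHT)

# Write the table as CSV text; a real run would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["location", "X", "Y"])
writer.writeheader()
writer.writerows(location_rows)
print(buffer.getvalue())
```

If the coordinates are taken directly from point annotations on the plot, they are already in plot units and the flip is not needed.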
3. Combining Files
After cleaning and preparing the data, we have a total of 3 files: Boonsong Lekagul waterways readings (.csv file), chemical units of measure (.csv file) and Location (.xlsx file). We will combine these files, using their common columns as the join clause, and then build the visualization.
Steps | Image
Combining Boonsong Lekagul waterways readings (.csv file) and chemical units of measure (.csv file) | (image)
Combining Boonsong Lekagul waterways readings (.csv file) and Location (.xlsx file) | (image)
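The two joins can also be sketched in code. The example below is a minimal illustration, not the procedure used for the images above. It assumes the readings share a `measure` column with the units file and a `location` column with the location file. All sample rows, and the `value`, `unit`, `X` and `Y` column names, are hypothetical.

```python
import csv
import io


def left_join(left_rows, right_rows, key):
    """Left-join two lists of dict rows on `key`.

    Values of `key` must be unique on the right side. Left rows with no
    match are kept without the right-side columns.
    """
    lookup = {row[key]: row for row in right_rows}
    return [{**lookup.get(row[key], {}), **row} for row in left_rows]


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical stand-ins for the three prepared files.
readings = read_rows(
    "id,value,location,measure\n"
    "1,2.5,Site A,Arsenic\n"
    "2,11.0,Site B,Water temperature\n"
)
units = read_rows(
    "measure,unit,groups\n"
    "Arsenic,mg/l,Heavy metals\n"
    "Water temperature,C,Water temperature\n"
)
locations = read_rows("location,X,Y\nSite A,120,150\nSite B,300,510\n")

# Join readings to units on measure, then to locations on location.
combined = left_join(left_join(readings, units, "measure"), locations, "location")
for row in combined:
    print(row["id"], row["measure"], row["groups"], row["X"], row["Y"])
```

A left join keeps every reading even when a measure or location has no match, so gaps in the lookup files show up as missing columns instead of silently dropped rows.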