IS428 2017-18 T1 Assign Sarah Jane Tong

From Visual Analytics for Business Intelligence
Revision as of 20:31, 8 October 2017 by Sarahtong.2014

Problem and Motivation

Background

Rose-Crested Blue Pipits are dwindling in numbers, and the likely cause is one or more of the four chemicals emitted by the four factories. The factories are located south-west of the nature preserve where the birds reside, and 9 sensors are placed in the vicinity of the 4 factories.

The 4 factories are:
  • Roadrunner Fitness Electronics
  • Kasios Office Furniture
  • Radiance ColourTek
  • Indigo Sol Boards
A brief description of each of the 4 chemicals is as follows:

1. Appluimonia – An airborne odor which can possibly cause serious injury, long-term health effects, or death to humans or animals.

2. Chlorodinine – Corrosive chemical which can attack and chemically destroy exposed body tissues. It has been used as a disinfectant and sterilizing agent as well as other uses. It is harmful if inhaled or swallowed.

3. Methylosmolene – Has potent effects on vertebrates. Liquid forms of Methylosmolene are required by law to be chemically neutralized before disposal. Strictly regulated.

4. AGOC-3A – A solvent which is not extremely harmful to human and environmental health.

General Task Overview

The overall task is to determine the cause of the dwindling bird numbers. This involves isolating each chemical in the dataset and identifying whether there are correlations between the emissions and the rest of the data.

Subtasks
  1. Characterize the sensors’ performance and operation. Are they all working properly at all times? Can you detect any unexpected behaviors of the sensors through analyzing the readings they capture?
  2. Which chemicals are being detected by the sensor group? What patterns of chemical releases do you see, as being reported in the data?
  3. Which factories are responsible for which chemical releases? Carefully describe how you determined this using all the data you have available. For the factories you identified, describe any observed patterns of operation revealed in the data.

Given the large number of variables and measures, there is a need to build an interactive data visualization tool to help to analyse the correlations between various variables. The visualisations will draw links between each factory, chemical and monitor.

Dataset Analysis & Transformation Process

The assignment provided 3 Excel datasets with different formats and attributes, and 1 document with general information about the environment described in the problem.

This section will elaborate on the dataset analysis and transformation process for each dataset in order to prepare the data for import and analysis on an interactive visualization.

Dataset Name | Brief Description
Sensor Location.xlsx | 9 sensors and their locations as X-Y coordinates in 2 columns.
Sensor Data.xlsx | Approximately 80,000 rows, each with a unique combination of chemical, monitor, date and time, together with a reading. Readings cover 3 months: April, August and December. One reading is taken at the start of each hour (24 per day) for the entire month.
Meteorological Data.xlsx | Approximately 2,000 rows, each with a unique date and time stamp and the corresponding wind direction and wind speed. Elevation is also given as 340 m.
MC2 Data Descriptions.doc | Provides the X-Y coordinates of the factories and the sensors, as well as information on the map, which is a 200 x 200 grid.
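
As a rough sanity check on the size of the sensor dataset, the expected number of readings can be worked out from the sampling schedule. The sketch below assumes that every one of the 9 sensors reports all 4 chemicals every hour, which the assignment material does not state outright, so the result is an upper bound and not the actual row count. The constant and function names are illustrative.

```python
# Days in the three sampled months (April, August, December).
DAYS_PER_MONTH = {"April": 30, "August": 31, "December": 31}

SENSORS = 9            # sensors listed in Sensor Location.xlsx
CHEMICALS = 4          # Appluimonia, Chlorodinine, Methylosmolene, AGOC-3A
READINGS_PER_DAY = 24  # one reading at the start of each hour


def expected_hours() -> int:
    """Total number of hourly timestamps across the three months."""
    return sum(DAYS_PER_MONTH.values()) * READINGS_PER_DAY


def expected_sensor_rows() -> int:
    """Upper bound on rows, ASSUMING every sensor reports every chemical each hour."""
    return expected_hours() * SENSORS * CHEMICALS


print(expected_hours())        # 2208
print(expected_sensor_rows())  # 79488
```

The upper bound of 79,488 rows is consistent with the roughly 80,000 rows reported for Sensor Data.xlsx. A noticeably different actual count would point to missing or duplicated readings.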

Data Cleaning

Meteorological Data

The Elevation column (Col E) was removed.
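
The column removal can be sketched in Python as follows, assuming the sheet has first been exported to CSV. The header names (`Date`, `Wind Direction`, `Wind Speed (m/s)`, `Elevation`), the sample values and the helper `drop_column` are all hypothetical; the original cleaning may well have been done by hand in Excel.

```python
import csv
import io


def drop_column(rows, column):
    """Return copies of the row dicts without the given column."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


# Hypothetical sample standing in for a CSV export of Meteorological Data.xlsx.
# Header names and values are illustrative assumptions, not the real data.
SAMPLE = """Date,Wind Direction,Wind Speed (m/s),Elevation
2016-04-01 00:00,190.5,4.0,340
2016-04-01 01:00,203.1,3.2,340
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned = drop_column(rows, "Elevation")

# Only the remaining columns are kept, in their original order.
print(list(cleaned[0].keys()))
```

Returning new dictionaries leaves the loaded rows untouched, so the raw data can still be compared against the cleaned version.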