IS428 2017-18 T1 Assign Dinh Viet Nguyen

From Visual Analytics for Business Intelligence
Revision as of 20:10, 8 October 2017 by Vndinh.2014

Overview

Mistford is a mid-size city located to the southwest of a large nature preserve. The city has a small industrial area with four light-manufacturing endeavors. Mitch Vogel is a post-doc student studying ornithology at Mistford College and has been discovering signs that the number of nesting pairs of the Rose-Crested Blue Pipit, a popular local bird due to its attractive plumage and pleasant songs, is decreasing! The decrease is sufficiently significant that the Pangera Ornithology Conservation Society is sponsoring Mitch to undertake additional studies to identify the possible reasons. Mitch is gaining access to several datasets that may help him in his work, and he has asked you (and your colleagues) as experts in visual analytics to help him analyze these datasets.

Mitch Vogel was immediately suspicious of the noxious gases pouring out of the smokestacks of the four manufacturing factories south of the nature preserve. He was almost certain that all of these companies were contributing to the downfall of the poor Rose-Crested Blue Pipit. But when he talked to company representatives and workers, they all seemed to be nice people and actually pretty respectful of the environment.

In fact, Mitch was surprised to learn that the factories had recently taken steps to make their processes more environmentally friendly, even though this raised their production costs. Mitch discovered that the state government has been monitoring the gaseous effluents from the factories through a set of sensors distributed around the factories and placed between the smokestacks, the city of Mistford and the nature preserve. The state has given Mitch access to its air sampler data, meteorological data, and a map of locations. Mitch is very good at Excel, but he knows that there are better tools for data discovery, and he knows that you are very clever at visual analytics and would be able to help perform an analysis.

Companies

Roadrunner Fitness Electronics
Roadrunner produces personal fitness trackers, heart rate monitors, headlamps, GPS watches, and other sport-related consumer electronics.

Kasios Office Furniture
Kasios Office Furniture manufactures metal and composite-wood office furniture including desks, tables, and chairs.

Radiance ColourTek
Radiance produces solvent-based, optically variable metallic flake paints.

Indigo Sol Boards
Indigo Sol produces skateboards and snowboards.

Chemicals

Appluimonia
An airborne odor is caused by a substance in the air that you can smell.
Chlorodinine
Corrosives are materials that can attack and chemically destroy exposed body tissues.
Methylosmolene
This is a trade name for a family of volatile organic solvents.
AGOC-3A
New environmental regulations, and consumer demand, have led to the development of low-VOC and zero-VOC solvents.

Data

Map of Lekagul Wildlife Preserve Area

The factory and sensor locations are given as x,y coordinates on a 200x200 grid representing a 12x12 mile area, with (0,0) at the lower left-hand (southwest) corner. The sensor map identifies the sensors by number and the factories by name.

Factory Locations

The following are the factory locations:
Roadrunner Fitness Electronics: 89,27
Kasios Office Furniture: 90,21
Radiance ColourTek: 109,26
Indigo Sol Boards: 120,22

Sensor Locations

Sensor locations are provided in an Excel workbook with the following information:
62,21
66,35
76,41
88,45
103,43
102,22
89,3
74,7
119,42
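To make the grid coordinates easier to interpret, the following is a minimal Python sketch that converts grid units to miles (200 units across 12 miles gives 0.06 miles per unit) and finds the sensor closest to each factory. It assumes that the sensor coordinates above are listed in sensor-number order (first pair is sensor 1), which the text does not state explicitly. The function names are illustrative only.

```python
import math

# 200 x 200 grid units cover a 12 x 12 mile area (from the map description).
MILES_PER_UNIT = 12 / 200  # 0.06 miles per grid unit

factories = {
    "Roadrunner Fitness Electronics": (89, 27),
    "Kasios Office Furniture": (90, 21),
    "Radiance ColourTek": (109, 26),
    "Indigo Sol Boards": (120, 22),
}

# Assumption: the coordinate list is in sensor-number order (1 to 9).
sensors = {
    1: (62, 21), 2: (66, 35), 3: (76, 41), 4: (88, 45), 5: (103, 43),
    6: (102, 22), 7: (89, 3), 8: (74, 7), 9: (119, 42),
}

def distance_miles(a, b):
    """Straight-line distance between two grid points, in miles."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) * MILES_PER_UNIT

def nearest_sensor(factory_xy):
    """Return (sensor_id, miles) for the sensor closest to a factory."""
    return min(
        ((sid, distance_miles(factory_xy, xy)) for sid, xy in sensors.items()),
        key=lambda pair: pair[1],
    )

for name, xy in factories.items():
    sid, miles = nearest_sensor(xy)
    print(f"{name}: nearest sensor {sid} at {miles:.2f} miles")
```

Straight-line distance ignores wind direction, so this is only a geometric reference and not evidence of which factory a sensor is detecting.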


Problem 1

Characterize the sensors’ performance and operation. Are they all working properly at all times? Can you detect any unexpected behaviors of the sensors through analyzing the readings they capture?

According to the problem statement, the sensor data contains three months' worth of readings from nine sensors, each monitoring four substances. A cursory inspection of the data in Excel shows that each sensor takes a reading every hour. Measurements cover April, August and December, on every day of each month. This gives two possible criteria by which the performance and operation of the sensors can be assessed.
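The sampling scheme described above implies a fixed number of readings for a gap-free dataset, which is a useful baseline to compare the actual row count against. A small Python sketch of that arithmetic follows; it assumes exactly one reading per chemical per sensor per hour, and the constant names are illustrative.

```python
# Days in each monitored month (these three months have fixed lengths).
DAYS_IN_MONTH = {"April": 30, "August": 31, "December": 31}
READINGS_PER_DAY = 24  # one reading per hour
N_CHEMICALS = 4
N_SENSORS = 9

def expected_readings(days_in_month=DAYS_IN_MONTH):
    """Return (hours, per_sensor, total) for a dataset with no gaps."""
    hours = sum(days_in_month.values()) * READINGS_PER_DAY
    per_sensor = hours * N_CHEMICALS
    total = per_sensor * N_SENSORS
    return hours, per_sensor, total

hours, per_sensor, total = expected_readings()
print(hours, per_sensor, total)  # 2208 8832 79488
```

A dataset with fewer rows than the total has missing readings, and one with more rows has duplicates, although the count alone does not say where they occur.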

Missing reading assessment

To be considered as operating properly, each sensor should record a reading every hour on every day on which measurements were scheduled. A calendar heatmap is an effective way to see at a glance whether a sensor has all the readings it is supposed to have.

01 hourly sensor reading heatmap.jpg
In Tableau, the setup for this heatmap is as follows:
Columns: MONTH (Date Time), HOUR (Date Time)
Rows: DAY (Date Time)
Filters: Monitor
Color: SUM(Reading)
Once the setup is done, I can use the ‘Monitor’ filter to view a heatmap of the readings for each individual sensor. The heatmaps reveal several empty slots with no data, i.e. no tooltip appears when the cursor hovers over them. This indicates that the sensor was not functioning properly, if at all, at the times represented by those slots, so no readings are available. Looking at each heatmap, I made the intriguing observation that all the missing readings occur at 00:00 (midnight). The following table summarises the dates with missing readings.
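Independently of Tableau, the empty heatmap slots can be cross-checked by comparing the timestamps each sensor actually reported against the full hourly schedule. The Python sketch below is one way to do this, not the method used for the analysis above. The field names "Monitor", "Date Time" and "Reading" follow the Tableau setup; "Chemical" is an assumed field name, and the rows and the year are made-up toy values rather than real readings.

```python
from datetime import datetime, timedelta

def expected_hours(year, month, n_days):
    """All hourly timestamps for the first n_days of a month."""
    start = datetime(year, month, 1)
    return [start + timedelta(hours=h) for h in range(n_days * 24)]

def missing_slots(rows, expected):
    """Map each (monitor, chemical) pair to the expected timestamps it lacks.

    rows: iterable of dicts with keys "Monitor", "Chemical", "Date Time".
    """
    seen = {}
    for row in rows:
        key = (row["Monitor"], row["Chemical"])
        seen.setdefault(key, set()).add(row["Date Time"])
    wanted = set(expected)
    return {key: sorted(wanted - stamps) for key, stamps in seen.items()}

# Toy data (made up): one monitor, one chemical, two days of hourly
# readings with the second midnight removed. The year is a placeholder.
expected = expected_hours(2016, 4, 2)
dropped = datetime(2016, 4, 2, 0)
rows = [
    {"Monitor": 1, "Chemical": "Methylosmolene", "Date Time": ts, "Reading": 0.5}
    for ts in expected
    if ts != dropped
]

gaps = missing_slots(rows, expected)
for key, stamps in gaps.items():
    print(key, [ts.strftime("%d %b %H:%M") for ts in stamps])
```

Note that a (monitor, chemical) pair with no rows at all would not appear in the result, so the keys should also be compared against the full list of sensors and chemicals.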


Problem 2

Now turn your attention to the chemicals themselves. Which chemicals are being detected by the sensor group? What patterns of chemical releases do you see, as being reported in the data?

Problem 3

Which factories are responsible for which chemical releases? Carefully describe how you determined this using all the data you have available. For the factories you identified, describe any observed patterns of operation revealed in the data.