IS428 AY2019-20T1 Assign Christine Data Analysis Transformation

From Visual Analytics for Business Intelligence
Revision as of 17:15, 10 October 2019 by Christine.2016 (talk | contribs)

VISUALIZATION OF ALWAYS SAFE NUCLEAR POWER PLANT



The first step of the analysis is data cleaning and transformation, so that the data can add value to the analysis that follows. The raw dataset, downloaded as a zip file from VAST Challenge – Mini Challenge 2, contains the following:

  • StHimarkNeighborhoodShapefile folder (contains the shapefile, which provides the map in geometry format along with the ID and name of each neighbourhood)
  • StaticSensorReadings.csv (contains the readings of each static sensor, identified by sensor ID, over a period of time)
  • StaticSensorLocations.csv (contains the location of each static sensor, identified by sensor ID)
  • MobileSensorReadings.csv (contains the readings and location of each mobile sensor, identified by sensor ID, over a period of time)

Before the raw dataset is used, this section explains the data analysis and transformation process that prepares the data for import, so that the analysis can be conducted.

COMBINE STATIC READINGS AND STATIC LOCATIONS

Files used: StaticSensorReadings.csv and StaticSensorLocations.csv

Christine.2016 figure b.1 staticReadingLocations.png
Figure b.1 - Combine Static Readings and Locations

The two files are joined on Sensor-id, which appears in both of them, using the Applied Join Clauses setting (Figure b.2). This join produces tidier (tall and skinny) data.

Christine.2016 figure b.2 staticIDCleaning.png
Figure b.2 - Clause in Join and Calculated Field
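
For readers who want to reproduce the join outside the data-preparation tool, the step can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows are made up, the column names other than Sensor-id (Timestamp, Value, Lat, Long) are assumed, the helper name join_on_sensor_id is hypothetical, and an inner join is assumed because the join type is not stated above.

```python
import csv
import io

# Made-up sample rows standing in for the two CSV files.
# Only the "Sensor-id" column name comes from the text; the rest are assumptions.
READINGS_CSV = """Timestamp,Sensor-id,Value
t1,1,12.5
t1,4,14.1
t2,1,12.9
t2,9,11.0
"""

LOCATIONS_CSV = """Sensor-id,Lat,Long
1,0.10,-119.90
4,0.20,-119.80
"""


def join_on_sensor_id(readings_text, locations_text):
    """Inner-join readings to locations on the shared Sensor-id column."""
    # Index the (small) locations table by Sensor-id for constant-time lookup.
    locations = {
        row["Sensor-id"]: row
        for row in csv.DictReader(io.StringIO(locations_text))
    }
    joined = []
    for reading in csv.DictReader(io.StringIO(readings_text)):
        location = locations.get(reading["Sensor-id"])
        if location is None:
            continue  # inner join: skip readings with no known location
        # One output row per reading, with the location columns appended.
        joined.append({**reading, **location})
    return joined


static_rows = join_on_sensor_id(READINGS_CSV, LOCATIONS_CSV)
```

Each reading keeps its own row and gains the location columns of its sensor, which is the tall and skinny shape described above. The reading for sensor 9 is dropped in this sketch because it has no matching location.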

In the next few steps, the static sensor readings will be combined with the mobile sensor readings. To avoid confusion between the Sensor-id values of the two sources, I decided to add an identifier (e.g. Static) to the original Sensor-id. Sensor-id was initially in Integer (numeric) format, so it needs to be changed to String type before a String can be added to it. The next step is to create a calculated field named “SensorID_Static”, which consists of the Sensor-id along with the word ‘Static’. The final table looks like Figure b.3.

Christine.2016 figure b.3 afterProcess.png
Figure b.3 - Final look of table
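
The type change followed by the calculated field can be sketched in plain Python as below. This is a minimal sketch, not the exact formula used in the workflow: the helper name add_sensor_label is hypothetical, the sample rows are made up, and the order of the label and the ID as well as the "-" separator are assumptions, since the text only says that the field combines the Sensor-id with the word 'Static'.

```python
def add_sensor_label(rows, label, new_field):
    """Return copies of rows with an extra text field combining label and Sensor-id."""
    labelled = []
    for row in rows:
        # Cast the numeric ID to text first; adding a string to an integer fails.
        sensor_id = str(row["Sensor-id"])
        # Assumed layout: label, then a "-" separator, then the ID.
        labelled.append({**row, new_field: f"{label}-{sensor_id}"})
    return labelled


# Made-up rows with integer IDs, as in the original Sensor-id column.
static_rows = [
    {"Sensor-id": 1, "Value": 12.5},
    {"Sensor-id": 4, "Value": 14.1},
]

labelled_static = add_sensor_label(static_rows, "Static", "SensorID_Static")
```

The original Sensor-id column is left untouched and the labelled value goes into a new column, mirroring the calculated field described above.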


DATA CLEANING ON MOBILE SENSOR READINGS

File used: MobileSensorReadings.csv

Christine.2016 figure b.4 MobileCleaning.png
Figure b.4 - Cleaning of Mobile Sensor Readings
Christine.2016 figure b.5 mobileIDCleaning.png
Figure b.5 - Cleaning of Mobile Sensor ID and Calculated Field

A similar process is applied to the mobile readings. To keep the Sensor-id values of the two sources distinguishable, Sensor-id is converted to String type and an additional identifier is added to it through a calculated field. This creates a new column named SensorID_Mobile. The final table looks like Figure b.6.

Christine.2016 figure b.6 afterProcess.png
Figure b.6 - Final look of table
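
The reason for labelling both sets of IDs can be shown with a small check in plain Python. The ID values below are made up for illustration, and the helper name label_ids and the "Label-ID" layout are assumptions rather than the exact calculated field used in the workflow.

```python
def label_ids(sensor_ids, label):
    """Build the set of labelled text IDs, e.g. 'Mobile-2' (layout is assumed)."""
    return {f"{label}-{sensor_id}" for sensor_id in sensor_ids}


# Made-up ID values: both sensor types number their sensors from small integers.
static_ids = [1, 4, 6]
mobile_ids = [1, 2, 4]

# Without labels, some static and mobile sensors share the same numeric ID.
raw_overlap = set(static_ids) & set(mobile_ids)

# With labels, every ID is unique across the two sources.
labelled_overlap = label_ids(static_ids, "Static") & label_ids(mobile_ids, "Mobile")
```

In this sketch raw_overlap contains the shared numeric IDs, while labelled_overlap is empty, so the two sources can be combined without mixing up sensors.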


COMBINE STATIC SENSORS AND MOBILE SENSORS' READINGS


FINAL WORKFLOW AND DATASET


Figure b.1