IS428 AY2019-20T1 Assign Ho Jue Hong Data Cleaning

From Visual Analytics for Business Intelligence
Revision as of 23:47, 12 October 2019 by Juehong.ho.2017 (talk | contribs)
Jump to navigation Jump to search

MINI-CHALLENGE 2: CITIZEN SCIENCE TO THE RESCUE

PROBLEM & MOTIVATION

 

DATA CLEANING

 

DASHBOARD

 

PROBLEMS

 

REFERENCE

Data Flow

Data flow.png

Merging of Static Sensor Location and Readings

Firstly to make the dataset easier to use, I would clean the dataset first, removing the redundant columns. After looking through the datasets given, the two datasets StaticSensorLocation.csv and StaticSensorReading.csv have a similar column, Sensor-id. Hence we are able to do an inner join between both datasets and remove the extra columns.

1st cleaning.png



Join clause.png



The applied join clause would be the Sensor-id columns as shown above. Now that we have merged both columns together, we need to remove the extra sensor-id column.

Change1.png



Here we can see that we made 4 changes, the first change is to add a calculated field where we use the formula "Static"+str([Sensor-id]) to create a [new_id] to make this unique. Hence when I combine with the mobile reading dataset later, there won't be any clashes.

The second thing I did was to remove [sensor-id] since we are using the new column [new_id]

The third thing I did was to remove the other [sensor-id] field.

Lastly, the removal of [Units] column because it serves no other purpose

Cleaning Mobile Sensor Reading

Mobile1.png



Mobile changes.png



To be able to union both the new output of Static-sensors.csv and mobile.csv, we have the make sure the sensor-keys are well labelled. Hence the first thing we did here was to create a calculated field where [New_id] = "Mobile"+str([Sensor-id]) this creates the the ID for mobile sensors.

The second thing we did was once again to remove the [Units] column as it serves no purpose anymore.

Lastly, because we have the new field [New_id] that we created [Sensor-id] is redudant and hence it will be removed too.