IS428 AY2019-20T1 Assign Ho Jue Hong Data Cleaning
Contents
Data Flow
Merging of Static Sensor Location and Readings
Firstly to make the dataset easier to use, I would clean the dataset first, removing the redundant columns. After looking through the datasets given, the two datasets StaticSensorLocation.csv and StaticSensorReading.csv have a similar column, Sensor-id. Hence we are able to do an inner join between both datasets and remove the extra columns.
The applied join clause would be the Sensor-id columns as shown above. Now that we have merged both columns together, we need to remove the extra sensor-id column.
Here we can see that we made 4 changes, the first change is to add a calculated field where we use the formula "Static"+str([Sensor-id]) to create a [new_id] to make this unique. Hence when I combine with the mobile reading dataset later, there won't be any clashes.
The second thing I did was to remove [sensor-id] since we are using the new column [new_id]
The third thing I did was to remove the other [sensor-id] field.
The fourth thing, the removal of [Units] column because it serves no other purpose.
Lastly, adding the new field [User-ID] so values won't be null when you append it to the the cleaned mobile dataset.
Cleaning Mobile Sensor Reading
To be able to union both the new output of Static-sensors.csv and mobile.csv, we have the make sure the sensor-keys are well labelled. Hence the first thing we did here was to create a calculated field where [New_id] = "Mobile"+str([Sensor-id]) this creates the the ID for mobile sensors.
The second thing we did was once again to remove the [Units] column as it serves no purpose anymore.
Lastly, because we have the new field [New_id] that we created [Sensor-id] is redudant and hence it will be removed too.
Union of both MobileSensorReading Dataset and combine dataset of SensorLocation and SensorReading Dataset
To union means to append, given that both datasets are "Cleaned" and have the same columns the could be appended without any problems. Once the flow is done we export it into a CSV file and the file is ready to be used in Tableau
Inner Joint of output.csv file and Geometry of St Himark File
Here we are given the StHimark.shp file, essentially it tells us about the area code and name, by using the formula MAKEPOINT, we are able to map the [lat] and [long] to the Geometry shapefile.