DataCleaning

From Visual Analytics for Business Intelligence

Home-header.jpg VAST 2019 MC2: Citizen Science to the Rescue

Cleaning of sensor data

Data flow

Dataflow.png

Static sensor readings

To append the static sensor readings to the mobile sensor readings, both files must have the same fields. The static sensor readings lack the latitude and longitude of each sensor, which are available in the static sensor location data. Hence, I first joined the StaticSensorReadings file to the StaticSensorLocations file on the sensor id.

Joinstaticlocation.png
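As a rough illustration of this join outside the data-preparation tool, the sketch below does the same lookup in plain Python. The column names "Timestamp", "Value", "Lat" and "Long" and the sample rows are assumptions made for the example; only "Sensor-id" and "Units" are named on this page. The helper name join_locations is hypothetical.

```python
def join_locations(readings, locations):
    """Inner-join readings to sensor locations on 'Sensor-id'."""
    # Build a lookup from sensor id to its location record.
    by_id = {loc["Sensor-id"]: loc for loc in locations}
    joined = []
    for row in readings:
        loc = by_id.get(row["Sensor-id"])
        if loc is None:
            continue  # inner join: drop readings without a known location
        # Copy the reading and add the coordinates (assumed column names).
        joined.append({**row, "Lat": loc["Lat"], "Long": loc["Long"]})
    return joined


# Hypothetical sample rows, not taken from the real files.
static_readings = [
    {"Timestamp": "2020-04-06 00:00:05", "Sensor-id": "1", "Value": "16.2", "Units": "cpm"},
    {"Timestamp": "2020-04-06 00:00:05", "Sensor-id": "99", "Value": "15.0", "Units": "cpm"},
]
static_locations = [{"Sensor-id": "1", "Lat": "0.15", "Long": "-119.9"}]

joined = join_locations(static_readings, static_locations)
```

Unlike the join in the screenshot, this sketch does not produce a duplicate "Sensor-id1" column, because only the coordinates are copied across.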

To distinguish between static and mobile sensors, I added a column “Sensor Type”.

Staticsensortype.png

Both files use the same sensor id format, and some sensor ids in the static sensor readings also appear in the mobile sensor readings file even though they refer to different sensors. Thus, a new id has to be generated. For the mobile sensor readings file, I prefixed the sensor id with the letter “M”. For the static sensor readings, I prefixed the sensor id with the letter “S”.

Str s.png

The following columns are redundant for analysis:
1. Sensor-id (old sensor id)
2. Sensor-id1 (old sensor id from the static sensor location file)
3. Units

Thus, I decided to delete these 3 columns.
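The tagging, re-keying and column-dropping steps can be sketched in plain Python as follows. This is only an illustration under assumptions: the helper name tag_readings, the "Value" column and the sample row are hypothetical, and the new id is written to a column called "Sensor id" for both sensor types.

```python
def tag_readings(rows, sensor_type, prefix,
                 drop=("Sensor-id", "Sensor-id1", "Units")):
    """Add 'Sensor Type', build a prefixed 'Sensor id', drop old columns."""
    tagged = []
    for row in rows:
        # Keep every column except the ones listed as redundant.
        new_row = {k: v for k, v in row.items() if k not in drop}
        new_row["Sensor Type"] = sensor_type
        # Prefix the old id so static and mobile ids can no longer collide.
        new_row["Sensor id"] = prefix + str(row["Sensor-id"])
        tagged.append(new_row)
    return tagged


# Hypothetical joined static row (values are made up).
static_rows = [{"Sensor-id": "1", "Sensor-id1": "1", "Units": "cpm", "Value": "16.2"}]
static_tagged = tag_readings(static_rows, "Static", "S")
```

The same helper would cover the mobile file by calling it with "Mobile" and "M", which keeps the two branches of the flow consistent.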

Mobile sensor readings

As with the static sensor readings, I added the sensor type “Mobile” to the mobile sensor readings file.

Mobilesensortype.png

I then added a new column “Sensor id” that prefixes the original sensor id with “M” to differentiate the mobile sensors from the static sensors.

Sensoridmobile.png

Union

Union.png

Next, I appended the mobile and static sensor data together. Once done, I exported the dataset as a CSV file.
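For readers who want to reproduce the union and export in code, a minimal sketch using only the Python standard library is shown below. The helper names union_rows and write_csv, the column names and the sample rows are assumptions for illustration. Columns present in only one input are filled with empty strings in the other.

```python
import csv
import io


def union_rows(*tables):
    """Stack tables of dict rows, aligning columns by name."""
    # Collect all column names in first-seen order.
    fields = []
    for table in tables:
        for row in table:
            for key in row:
                if key not in fields:
                    fields.append(key)
    # Fill columns that a row does not have with an empty string.
    rows = [{f: row.get(f, "") for f in fields}
            for table in tables for row in table]
    return fields, rows


def write_csv(fields, rows, handle):
    """Write the unioned rows to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical cleaned rows (values are made up).
static = [{"Sensor id": "S1", "Sensor Type": "Static", "Value": "16.2"}]
mobile = [{"Sensor id": "M1", "Sensor Type": "Mobile", "Value": "20.1",
           "User-id": "CitizenScientist1"}]

fields, rows = union_rows(static, mobile)
buffer = io.StringIO()  # stand-in for a file opened with newline=""
write_csv(fields, rows, buffer)
```

When writing to a real file, opening it with newline="" avoids blank lines between rows on some platforms.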

Import into Tableau

To join the data to the St Himark file, the longitude and latitude of each reading must be matched against the geometry polygons of the St Himark file. To do so, I used the function “MAKEPOINT” to build a point from each reading, then checked which polygon each point intersects to obtain the neighbourhood for each reading.

Makepoint.png
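The idea behind the point-to-polygon match can be illustrated with a small ray-casting test in plain Python. This is not how Tableau implements its spatial join. It is a simplified sketch, and the neighbourhood names and square polygons below are hypothetical stand-ins for the St Himark geometry.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips the state
    return inside


def neighbourhood_of(lon, lat, neighbourhoods):
    """Return the name of the first polygon containing the point, else None."""
    for name, polygon in neighbourhoods.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical neighbourhoods: two adjacent unit squares.
neighbourhoods = {
    "Neighbourhood A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "Neighbourhood B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

label = neighbourhood_of(0.5, 0.5, neighbourhoods)
```

Points that fall outside every polygon return None, which makes readings with coordinates outside the map easy to spot.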