IS428 AY2019-20T1 Assign Kok Jim Meng Data Preparation

From Visual Analytics for Business Intelligence
Jump to navigation Jump to search

StHimarkNeighborhoodMap.png IS428 VAST 2019 Mini Challenge 2

Problem & Motivation

Data Preparation

Interactive Visualisation

Tasks Questions and Answers

References and Comments

 

Data Preparation

The dataset zip file given includes:

  • Mobile Sensor Readings
  • Static Sensor Readings
  • Static Sensor Location
  • Data Description
  • Maps of St Himark

Static Sensor Readings and Static Sensor Location

JM DP Fig1.png
Figure 1: Screenshot of the two Static CSVs Tables

Issue: In the CSV file of Static Sensor Readings, there are no geographic coordinates of the static sensors as the data are in the CSV file of Static Sensor Locations. Solution: Use Tableau Prep to join the two tables into one CSV based on the common field which is Sensor-id and cleaning it by removing the extra sensor-id field. With this, it would be possible to perform a map visualization for Static Sensors using the longitude and latitude in Tableau. The following is the data preparation used in Tableau Prep for Static Sensor data.

JM DP Fig2.png
Figure 2: Join based on Sensor-id
JM DP Fig3.png
Figure 3: Clean the newly joined table by removing the extra sensor-id field
JM DP Fig4.png
Figure 4: Newly joined table is created and preparing to output as a CSV file

Newly created Static Sensor Readings with Location and Motion Sensor Readings CSVs

Issue: Both the newly created Static Sensor Readings with Location and given Motion Sensor Readings tables are separated. Moreover, both tables’ Sensor IDs are just numbers where same numbers appear in both tables. This does not make sense as both static and mobile sensors are different and same numbers appear. Solution: Merge both newly created Static Sensor Readings with Location table with the given Motion Sensor Reading table based on the timestamp when the sensors have detected. Furthermore, I have classified the sensors based on their types – Static, and Mobile. In addition, I have also reassigned the sensors IDs where, X is a number, M-XX is a mobile sensor and S-XX is a static sensor.

The following steps or process is how I have merged both the newly created Static Sensor Readings with Location table with the given Motion Sensor Reading table:

JM DP Fig5.png
Figure 5: Process of merging the two tables

First of all, I have created a new calculated field in the Mobile Sensor Readings table called Mobile-Sensor-id where the formula I have used is:

“M” + “-” + STR([Sensor-id])

This means the concatenation of M and the hyphen and the original Sensor-id data to form the Mobile Sensor ID values. Thereafter, remove the original Sensor-Id field which is not relevant anymore.

JM DP Fig6.png
Figure 6: Create a new field called Mobile-Sensor-id and remove the original Sensor-id

Next, same procedure as above for the Mobile, I have created a new calculated field in the Static Sensor Readings with Locations table called Static-Sensor-id where the formula I have used is:

“S” + “-” + STR([Sensor-id])

This means the concatenation of S and the hyphen and the original Sensor-id data to form the Static Sensor ID values. Thereafter, remove the original Sensor-Id field in this table as it is not relevant anymore.

JM DP Fig7.png
Figure 7: Create a new field called Static-Sensor-id and remove the original Sensor-id

After remaking the sensor ID values in the both tables, I will use Left Join for both tables based on Timestamp as Timestamp is the common field for both tables. The reason why I use Left Join is because Mobile Sensor table has more data than the newly created Static Sensor table. Hence, all data from both tables will be merged together based on Timestamp.

JM DP Fig8.png
Figure 8: Use Left Join to join both tables based on Timestamp field

After merging both tables, there are redundant fields appear that need to be removed. These fields include Units fields and User-Id field, and an extra Timestamp field. Furthermore, some fields are necessary to be renamed such as Lat & Long, and Lat-1 & Long-1 which will be renamed to Mobile-Lat & Mobile-Long, and Static-Lat & Static Long as these are belonged to their sensor types.

JM DP Fig9.png
Figure 9: Remove and rename fields

Thereafter, it is time to classify the sensors based on its type and correspond the values of the sensors based on the classification. Hence, I have created two calculated fields called “Sensor-Classification” and “Value-Combined”. In Sensor-Classification, I have applied this formula to classify the sensors into Mobile Sensors and Static Sensors:

IF Find([Sensor-Id], ‘M’) = 1
THEN “Mobile Sensors”
ELSE “Static Sensors”
END

The above formula means that if the Sensor-Id contains the letter “M” then the sensors is belongs to Mobile Sensors else it’s Static Sensors. As for the “Value-Combined” field, I have applied this formula:

IF [Sensor-Classification] = ‘Mobile Sensors’
THEN [Value]
ELSE [Value-1]
END

This means that if it is Mobile Sensor, the value will be based on the original Mobile Sensors Readings table’s Value field else if it’s Static Sensor then will be based on the Static Sensors Readings with Locations table’s Value field. Thereafter, I have removed both the Value field and Value-1 field.

JM DP Fig10.png
Figure 10: Classify the sensors into Mobile and Static

Lastly, I separate the newly created Latitude field and Longitude field into Mobile’s and Static’s as part of merging into one Latitude field and one Longitude field. This is the formula that I have applied for both Latitude and Longitude:

IF [Sensor-Classification] = 'Static Sensors'
THEN [Static-Lat]
ELSE [Mobile-Lat]
END

IF [Sensor-Classification] = 'Static Sensors'
THEN [Static-Long]
ELSE [Mobile-Long]
END

This means that based on the sensor type, that latitude and longitude are belonged to the respective sensor type. Thereafter, I removed the Mobile-Latitude, Mobile-Longitude, Static-Latitude, and Static-Longitude fields.

JM DP Fig11.png
Figure 11: Managing the Latitude and Longitude fields.

Finally, I have created the new output that is used for the Tableau visualisation for this assignment.

JM DP Fig12.png
Figure 12: Preparing to output as a CSV file which needs to be used in constructing charts and dashboards in Tableau