IS428 AY2019-20T1 Assign Kok Jim Meng Data Preparation
Data Preparation
The dataset zip file given includes:
- Mobile Sensor Readings
- Static Sensor Readings
- Static Sensor Location
- Data Description
- Maps of St Himark
Static Sensor Readings and Static Sensor Location
Issue: The Static Sensor Readings CSV file has no geographic coordinates for the static sensors; these are stored in the Static Sensor Location CSV file. Solution: Use Tableau Prep to join the two tables into one CSV on their common field, Sensor-id, and clean the result by removing the duplicate Sensor-id field. This makes it possible to build a map visualization of the static sensors in Tableau using their longitude and latitude. The following is the data preparation used in Tableau Prep for Static Sensor data.
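Outside Tableau Prep, the same join can be illustrated with a minimal Python sketch. The sample rows, the values in them, and the function name join_on_sensor_id are made up for illustration, and the sketch assumes an inner join, which the text above does not specify.

```python
# Hypothetical sample rows; the real CSV columns and values may differ.
static_readings = [
    {"Timestamp": "t1", "Sensor-id": 1, "Value": 12.5},
    {"Timestamp": "t1", "Sensor-id": 4, "Value": 14.1},
]
static_locations = [
    {"Sensor-id": 1, "Lat": 0.15, "Long": -119.9},
    {"Sensor-id": 4, "Lat": 0.12, "Long": -119.8},
]


def join_on_sensor_id(readings, locations):
    """Join readings to locations on Sensor-id, keeping a single Sensor-id field."""
    location_by_id = {loc["Sensor-id"]: loc for loc in locations}
    joined = []
    for row in readings:
        loc = location_by_id.get(row["Sensor-id"])
        if loc is None:
            continue  # assumed inner join: skip readings without a known location
        # Copy only Lat and Long so the Sensor-id field is not duplicated.
        joined.append({**row, "Lat": loc["Lat"], "Long": loc["Long"]})
    return joined


static_with_location = join_on_sensor_id(static_readings, static_locations)
```

Each output row carries the reading together with its coordinates, which is the shape a map visualization needs.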
Newly created Static Sensor Readings with Location and Mobile Sensor Readings CSVs
Issue: The newly created Static Sensor Readings with Location table and the given Mobile Sensor Readings table are separate. Moreover, the Sensor IDs in both tables are plain numbers, and the same numbers appear in both tables. This is confusing, because static and mobile sensors are different sensors that happen to share the same numbers. Solution: Merge the newly created Static Sensor Readings with Location table with the given Mobile Sensor Readings table based on the timestamp at which the sensors recorded their readings. Furthermore, I have classified the sensors by type: Static and Mobile. In addition, I have reassigned the sensor IDs so that, where XX is the original number, M-XX is a mobile sensor and S-XX is a static sensor.
The following steps describe how I merged the newly created Static Sensor Readings with Location table with the given Mobile Sensor Readings table:
First of all, I created a new calculated field in the Mobile Sensor Readings table called Mobile-Sensor-id, where the formula I have used is:
This concatenates the letter M, a hyphen, and the original Sensor-id value to form the Mobile Sensor ID values. Thereafter, I removed the original Sensor-id field, which is no longer relevant.
Next, following the same procedure as for the Mobile table, I created a new calculated field in the Static Sensor Readings with Location table called Static-Sensor-id, where the formula I have used is:
This concatenates the letter S, a hyphen, and the original Sensor-id value to form the Static Sensor ID values. Thereafter, I removed the original Sensor-id field in this table, as it is no longer relevant.
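The two ID steps above can be sketched in Python as follows. This is only an illustration of the concatenation described, not the Tableau Prep formula itself; the helper name prefix_sensor_id and the sample rows are hypothetical.

```python
def prefix_sensor_id(rows, prefix, new_field):
    """Return copies of rows with new_field set to '<prefix>-<Sensor-id>'
    and the original Sensor-id field removed."""
    result = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != "Sensor-id"}
        new_row[new_field] = f"{prefix}-{row['Sensor-id']}"
        result.append(new_row)
    return result


# Hypothetical rows: the same number 7 appears in both tables.
mobile_rows = [{"Sensor-id": 7, "Value": 30.2}]
static_rows = [{"Sensor-id": 7, "Value": 11.8}]

mobile_tagged = prefix_sensor_id(mobile_rows, "M", "Mobile-Sensor-id")
static_tagged = prefix_sensor_id(static_rows, "S", "Static-Sensor-id")
```

After this step, sensor 7 in the mobile table and sensor 7 in the static table can no longer be confused, since they become M-7 and S-7.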
After remaking the sensor ID values in both tables, I use a Left Join on Timestamp, as Timestamp is the common field of both tables. I use a Left Join, with the Mobile Sensor table on the left, because the Mobile Sensor table has more data than the newly created Static Sensor table. Hence, every Mobile Sensor row is kept, and the Static Sensor rows are attached to it wherever the Timestamp matches.
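To make the Left Join behaviour concrete, here is a minimal Python sketch under stated assumptions: the rows and timestamps are made up, the function name left_join_on_timestamp is hypothetical, and only the Static-Sensor-id and Value fields are carried over from the right table (Value is renamed Value-1, mirroring the field name mentioned later in this section).

```python
from collections import defaultdict


def left_join_on_timestamp(left_rows, right_rows):
    """Left join: keep every left row; attach each right row sharing its Timestamp."""
    right_by_timestamp = defaultdict(list)
    for right in right_rows:
        right_by_timestamp[right["Timestamp"]].append(right)

    joined = []
    for left in left_rows:
        matches = right_by_timestamp.get(left["Timestamp"])
        if not matches:
            # No static reading at this timestamp: keep the mobile row anyway.
            joined.append({**left, "Static-Sensor-id": None, "Value-1": None})
            continue
        for right in matches:
            joined.append({
                **left,
                "Static-Sensor-id": right["Static-Sensor-id"],
                "Value-1": right["Value"],
            })
    return joined


# Hypothetical data: t1 is shared, t2 is mobile-only, t3 is static-only.
mobile = [
    {"Timestamp": "t1", "Mobile-Sensor-id": "M-1", "Value": 30.0},
    {"Timestamp": "t2", "Mobile-Sensor-id": "M-1", "Value": 31.0},
]
static = [
    {"Timestamp": "t1", "Static-Sensor-id": "S-1", "Value": 10.0},
    {"Timestamp": "t1", "Static-Sensor-id": "S-4", "Value": 11.0},
    {"Timestamp": "t3", "Static-Sensor-id": "S-1", "Value": 12.0},
]

merged = left_join_on_timestamp(mobile, static)
```

In this sketch the mobile row at t1 is repeated once per matching static row, the mobile row at t2 is kept with empty static fields, and the static row at t3 does not appear in the result because no mobile row shares its timestamp.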
After merging both tables, some redundant fields appear and need to be removed. These include the Units fields, the User-Id field, and an extra Timestamp field. Furthermore, some fields need to be renamed: Lat & Long become Mobile-Lat & Mobile-Long, and Lat-1 & Long-1 become Static-Lat & Static-Long, according to the sensor type they belong to.
Thereafter, it is time to classify the sensors by type and to assign each sensor its value according to that classification. Hence, I have created two calculated fields called “Sensor-Classification” and “Value-Combined”. In Sensor-Classification, I have applied this formula to classify the sensors into Mobile Sensors and Static Sensors:
The above formula means that if the Sensor-Id contains the letter “M”, the sensor belongs to Mobile Sensors; otherwise it belongs to Static Sensors. As for the “Value-Combined” field, I have applied this formula:
This means that for a Mobile Sensor, the value is taken from the Value field of the original Mobile Sensor Readings table; for a Static Sensor, it is taken from the Value field of the Static Sensor Readings with Location table. Thereafter, I have removed both the Value field and the Value-1 field.
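The two calculated fields described above can be sketched in Python as below. This is an illustration of the logic only, not the Tableau Prep formulas. It assumes each row carries a single Sensor-Id string such as M-12 or S-4, and that Value holds the mobile reading while Value-1 holds the static reading; the function names and sample rows are hypothetical.

```python
def classify_sensor(sensor_id):
    """Return 'Mobile' if the ID contains the letter M, otherwise 'Static'."""
    return "Mobile" if "M" in sensor_id else "Static"


def combined_value(row):
    """Pick the reading that matches the sensor's classification."""
    if classify_sensor(row["Sensor-Id"]) == "Mobile":
        return row["Value"]    # assumed to come from the Mobile table
    return row["Value-1"]      # assumed to come from the Static table


# Hypothetical rows after the merge.
rows = [
    {"Sensor-Id": "M-12", "Value": 30.2, "Value-1": None},
    {"Sensor-Id": "S-4", "Value": None, "Value-1": 11.8},
]

for row in rows:
    row["Sensor-Classification"] = classify_sensor(row["Sensor-Id"])
    row["Value-Combined"] = combined_value(row)
```

Checking for the letter M works here only because the reassigned IDs always start with M- or S-; a stricter test would check the prefix rather than any position in the string.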
Lastly, I merge the separate Mobile and Static latitude and longitude fields into a single Latitude field and a single Longitude field. This is the formula that I have applied for both Latitude and Longitude:
This means that, depending on the sensor type, the latitude and longitude are taken from the fields belonging to that sensor type. Thereafter, I removed the Mobile-Latitude, Mobile-Longitude, Static-Latitude, and Static-Longitude fields.
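The coordinate step follows the same pattern and can be sketched in Python as below. The field names Mobile-Lat, Mobile-Long, Static-Lat and Static-Long are assumed from the renaming step earlier in this section; the function name and the coordinate values are made up for illustration.

```python
def combined_coordinates(row):
    """Return (latitude, longitude) from the fields matching the sensor type."""
    if row["Sensor-Classification"] == "Mobile":
        return row["Mobile-Lat"], row["Mobile-Long"]
    return row["Static-Lat"], row["Static-Long"]


def merge_coordinates(row):
    """Return a copy of row with single Latitude and Longitude fields
    and the four per-type coordinate fields removed."""
    latitude, longitude = combined_coordinates(row)
    per_type_fields = {"Mobile-Lat", "Mobile-Long", "Static-Lat", "Static-Long"}
    merged_row = {k: v for k, v in row.items() if k not in per_type_fields}
    merged_row["Latitude"] = latitude
    merged_row["Longitude"] = longitude
    return merged_row


# Hypothetical row for a static sensor.
static_row = {
    "Sensor-Classification": "Static",
    "Mobile-Lat": None, "Mobile-Long": None,
    "Static-Lat": 0.15, "Static-Long": -119.9,
}
cleaned = merge_coordinates(static_row)
```

With one Latitude and one Longitude field per row, both sensor types can be plotted on the same map layer.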
Finally, I have created the new output that is used for the Tableau visualisation for this assignment.