IS428 AY2019-20T1 Assign Kok Jim Meng Data Preparation
Data Preparation
The dataset zip file given includes:
- Mobile Sensor Readings
- Static Sensor Readings
- Static Sensor Location
- Data Description
- Maps of St Himark
Static Sensor Readings and Static Sensor Location
Issue: The Static Sensor Readings CSV file has no geographic coordinates for the static sensors; the coordinates are stored separately in the Static Sensor Location CSV file. Solution: Use Tableau Prep to join the two tables into one CSV on their common field, Sensor-id, and clean the result by removing the duplicate Sensor-id field. The static sensors can then be plotted on a map in Tableau using their longitude and latitude. The following is the data preparation flow used in Tableau Prep for the Static Sensor data.
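For readers without Tableau Prep, the same join can be sketched in plain Python. This is only an illustration, not the Tableau Prep flow itself: the sample rows are invented, the field names Value, Lat and Long are assumed, and an inner join is assumed because the text does not state the join type.

```python
# Invented sample rows; only the Sensor-id field name is taken from the text.
static_readings = [
    {"Timestamp": "t1", "Sensor-id": 1, "Value": 15.2},
    {"Timestamp": "t1", "Sensor-id": 4, "Value": 14.8},
]
static_locations = [
    {"Sensor-id": 1, "Lat": 0.15, "Long": -119.95},
    {"Sensor-id": 4, "Lat": 0.12, "Long": -119.90},
]


def join_readings_with_location(readings, locations):
    """Join readings to locations on Sensor-id, keeping a single Sensor-id field."""
    location_by_id = {loc["Sensor-id"]: loc for loc in locations}
    joined = []
    for row in readings:
        loc = location_by_id.get(row["Sensor-id"])
        if loc is None:
            continue  # assumed inner join: drop readings with no known location
        # Copy only Lat/Long so the Sensor-id field is not duplicated.
        joined.append({**row, "Lat": loc["Lat"], "Long": loc["Long"]})
    return joined


static_with_location = join_readings_with_location(static_readings, static_locations)
```

Each joined row carries the reading plus its coordinates, which is what a map visualization needs.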
Newly created Static Sensor Readings with Location and Mobile Sensor Readings CSVs
Issue: The newly created Static Sensor Readings with Location table and the given Mobile Sensor Readings table are separate. Moreover, the Sensor IDs in both tables are plain numbers, and the same numbers appear in both tables. This is misleading, because a static sensor and a mobile sensor that share a number are different devices. Solution: Merge the newly created Static Sensor Readings with Location table with the given Mobile Sensor Readings table on the timestamp at which each reading was recorded. I have also classified the sensors by type (Static or Mobile) and reassigned the sensor IDs so that, where XX is the original number, M-XX is a mobile sensor and S-XX is a static sensor.
The following steps show how I merged the newly created Static Sensor Readings with Location table with the given Mobile Sensor Readings table:
First, I created a new calculated field in the Mobile Sensor Readings table called Mobile-Sensor-id, using the formula:
This concatenates the letter M, a hyphen, and the original Sensor-id value to form the Mobile Sensor ID. The original Sensor-id field is then removed, as it is no longer needed.
Next, following the same procedure, I created a new calculated field in the Static Sensor Readings with Location table called Static-Sensor-id, using the formula:
This concatenates the letter S, a hyphen, and the original Sensor-id value to form the Static Sensor ID. The original Sensor-id field in this table is then removed, as it is no longer needed.
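The two ID-prefixing steps can be illustrated outside Tableau Prep with a short Python sketch. It is not the calculated-field formula used in the workflow; the helper name prefix_sensor_id and the sample rows are hypothetical.

```python
def prefix_sensor_id(rows, prefix, new_field):
    """Return copies of rows where Sensor-id is replaced by '<prefix>-<id>' in new_field."""
    result = []
    for row in rows:
        # Drop the original Sensor-id, mirroring the removal step in the text.
        new_row = {key: value for key, value in row.items() if key != "Sensor-id"}
        new_row[new_field] = f"{prefix}-{row['Sensor-id']}"
        result.append(new_row)
    return result


# Invented rows: the same number 1 appears as both a mobile and a static sensor.
mobile_rows = [{"Sensor-id": 1, "Value": 30.1}]
static_rows = [{"Sensor-id": 1, "Value": 15.2}]

mobile_tagged = prefix_sensor_id(mobile_rows, "M", "Mobile-Sensor-id")
static_tagged = prefix_sensor_id(static_rows, "S", "Static-Sensor-id")
```

After this step, sensor 1 in the mobile table becomes M-1 and sensor 1 in the static table becomes S-1, so the two can no longer be confused.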
After rebuilding the sensor ID values in both tables, I join the two tables with a Left Join on Timestamp, the field common to both. I use a Left Join because the Mobile Sensor table has more data than the newly created Static Sensor table. With the Mobile Sensor table on the left, every mobile reading is kept and the static readings that share its Timestamp are attached to it.
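As a rough illustration of what a left join on Timestamp produces, here is a minimal Python sketch. It assumes the Mobile Sensor table is the left table, uses invented sample rows, and gives clashing right-hand fields a "-1" suffix to mirror the Lat-1 and Long-1 names that appear after the merge; the function name is hypothetical.

```python
from collections import defaultdict


def left_join_on_timestamp(left_rows, right_rows):
    """Keep every left row; attach each right row that shares its Timestamp."""
    right_by_ts = defaultdict(list)
    for row in right_rows:
        right_by_ts[row["Timestamp"]].append(row)
    right_fields = list(right_rows[0].keys()) if right_rows else []

    joined = []
    for left in left_rows:
        # A left row with no match is still kept, with empty right-hand fields.
        matches = right_by_ts.get(left["Timestamp"], [None])
        for right in matches:
            merged = dict(left)
            for field in right_fields:
                # Assumed naming: clashing fields from the right table get "-1".
                name = f"{field}-1" if field in left else field
                merged[name] = None if right is None else right[field]
            joined.append(merged)
    return joined


mobile = [
    {"Timestamp": "t1", "Mobile-Sensor-id": "M-1", "Value": 30.1, "Lat": 0.10, "Long": -119.80},
    {"Timestamp": "t2", "Mobile-Sensor-id": "M-2", "Value": 28.4, "Lat": 0.11, "Long": -119.85},
]
static = [
    {"Timestamp": "t1", "Static-Sensor-id": "S-1", "Value": 15.2, "Lat": 0.15, "Long": -119.95},
]

merged = left_join_on_timestamp(mobile, static)
```

In this sample the mobile row at t2 has no static match, so it is kept with empty static fields, while a static reading whose Timestamp is absent from the mobile table would not appear in the result.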
After merging the tables, several redundant fields appear and need to be removed: the Units fields, the User-Id field, and the extra Timestamp field. Some fields also need to be renamed: Lat & Long become Mobile-Lat & Mobile-Long, and Lat-1 & Long-1 become Static-Lat & Static-Long, so that each coordinate pair is labelled with its sensor type.
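The clean-up step amounts to dropping some fields and renaming others, sketched below in Python. The exact spellings Units-1 and Timestamp-1 are assumptions about how the duplicated fields are named after the join, and the sample row is invented.

```python
# Fields to drop after the merge (duplicate-field names are assumed spellings).
DROP_FIELDS = {"Units", "Units-1", "User-Id", "Timestamp-1"}

# Coordinate fields relabelled by sensor type, as described in the text.
RENAME_FIELDS = {
    "Lat": "Mobile-Lat",
    "Long": "Mobile-Long",
    "Lat-1": "Static-Lat",
    "Long-1": "Static-Long",
}


def tidy_row(row):
    """Drop redundant fields and rename the coordinate fields of one merged row."""
    return {
        RENAME_FIELDS.get(key, key): value
        for key, value in row.items()
        if key not in DROP_FIELDS
    }


merged_row = {
    "Timestamp": "t1", "Timestamp-1": "t1",
    "Units": "cpm", "Units-1": "cpm", "User-Id": "u1",
    "Lat": 0.10, "Long": -119.80, "Lat-1": 0.15, "Long-1": -119.95,
}
tidy = tidy_row(merged_row)
```

Keeping the drop list and rename map as named constants makes it easy to check them against the field list shown in Tableau Prep.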
Next, the sensors are classified by type and their values are matched to that classification. To do this, I created two calculated fields called “Sensor-Classification” and “Value-Combined”. In Sensor-Classification, I applied this formula to classify the sensors into Mobile Sensors and Static Sensors:
The above formula means that if the Sensor-Id contains the letter “M”, the sensor is classified as a Mobile Sensor; otherwise it is a Static Sensor. As for the “Value-Combined” field, I applied this formula: