Difference between revisions of "DataCleaning"
(Undo revision 4845 by Juehong.ho.2017 (talk)) Tag: Undo |
|||
(11 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | <div style=background:# | + | <div style=background:#1c2e4a border:#A3BFB1> |
− | <font size = 5; color="#FFFFFF"> | + | [[File:Home-header.jpg|frameless]] |
+ | <font size = 5; color="#FFFFFF">VAST 2019 MC2: Citizen Science to the Rescue</font> | ||
</div> | </div> | ||
+ | |||
<!--MAIN HEADER --> | <!--MAIN HEADER --> | ||
− | {|style="background-color:# | + | {|style="background-color:#23395d;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0" | |
− | | style="font-family:Century Gothic; font-size:100%; solid #000000; background:# | + | | style="font-family:Century Gothic; font-size:100%; solid #000000; background:#23395d; text-align:center;" width="25%" | |
; | ; | ||
− | [[ | + | [[IS428_AY2019-20T1_Assign_Jazreel Tho Wei Wen| <font color="#FFFFFF">Overview</font>]] |
− | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:# | + | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" | |
; | ; | ||
[[DataCleaning| <font color="#FFFFFF">Data Cleaning</font>]] | [[DataCleaning| <font color="#FFFFFF">Data Cleaning</font>]] | ||
− | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:# | + | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" | |
; | ; | ||
[[Dashboard| <font color="#FFFFFF">Dashboard</font>]] | [[Dashboard| <font color="#FFFFFF">Dashboard</font>]] | ||
− | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:# | + | | style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" | |
; | ; | ||
Line 23: | Line 25: | ||
| | | | ||
|} | |} | ||
+ | |||
+ | |||
+ | |||
<br/> | <br/> | ||
<font size="5">'''Cleaning of sensors data'''</font><br> | <font size="5">'''Cleaning of sensors data'''</font><br> | ||
− | [[File:Dataflow.png| | + | ==Data flow== |
+ | [[File:Dataflow.png|frameless|upright=3]] | ||
==Static sensor readings== | ==Static sensor readings== | ||
Line 46: | Line 52: | ||
Thus, I decided to delete these 3 columns. | Thus, I decided to delete these 3 columns. | ||
− | == | + | ==Mobile sensor readings== |
Same for the mobile sensor readings file, I added the sensor type as “Mobile”. | Same for the mobile sensor readings file, I added the sensor type as “Mobile”. | ||
[[File:Mobilesensortype.png|frame|center]] | [[File:Mobilesensortype.png|frame|center]] | ||
+ | |||
+ | I then changed added a new column “Sensor id” to concatenate “M” in front of the original sensor id to differentiate it from the statics sensors. | ||
+ | |||
+ | [[File:Sensoridmobile.png|frame|center]] | ||
+ | |||
+ | ==Union== | ||
+ | [[File:Union.png|frameless|upright=3]] | ||
+ | |||
+ | Next, I appended both the mobile and static sensors data together. Once done, I exported the dataset as a csv file. | ||
+ | |||
+ | ==Import into Tableau== | ||
+ | In order to join the data to the St Himark file, there is a need to join them by the longitude and latitude to the geometry polygon of the St Himark file. In order to do so, I used the function “MAKEPOINT” and see which points intersects which polygon to obtain the neighbourhood for each reading. | ||
+ | |||
+ | [[File:Makepoint.png|frameless|upright=3]] |
Latest revision as of 01:28, 13 October 2019
Cleaning of sensor data
Data flow
Static sensor readings
I wanted to append the static sensor readings to the mobile sensor readings, which requires both files to have the same fields. The static sensor readings lack the latitude and longitude of each sensor, which are found in the static location data. Hence, I first joined the StaticSensorReadings file to the StaticSensorLocations file on the sensor id.
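The join can be sketched in plain Python as follows. This is only an illustration of the idea, not the tool used for the actual cleaning: the sample rows are invented, the column names other than "Sensor-id" are assumptions about the file layout, and `join_locations` is a hypothetical helper.

```python
# Invented sample rows; column names other than "Sensor-id" are assumptions.
static_readings = [
    {"Timestamp": "2020-04-06 00:00:00", "Sensor-id": "1", "Value": "15.2", "Units": "cpm"},
    {"Timestamp": "2020-04-06 00:00:00", "Sensor-id": "4", "Value": "14.8", "Units": "cpm"},
]
static_locations = [
    {"Sensor-id": "1", "Lat": "0.15", "Long": "-119.95"},
    {"Sensor-id": "4", "Lat": "0.08", "Long": "-119.90"},
]


def join_locations(readings, locations, key="Sensor-id"):
    """Inner-join each reading to its sensor's location on `key`."""
    by_id = {loc[key]: loc for loc in locations}
    joined = []
    for row in readings:
        loc = by_id.get(row[key])
        if loc is None:
            continue  # inner join: skip readings without a known location
        merged = dict(row)
        for name, value in loc.items():
            # The location file's key is kept under a suffixed name, which is
            # where a duplicate column such as "Sensor-id1" comes from.
            merged[name + "1" if name == key else name] = value
        joined.append(merged)
    return joined


static_joined = join_locations(static_readings, static_locations)
```

After the join, every static reading carries "Lat" and "Long", so its fields can be lined up with those of the mobile readings.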
I added a “Sensor Type” column to distinguish static sensors from mobile sensors.
Both files use the same sensor id format, and some sensor ids in the static sensor readings also appear in the mobile sensor readings file even though they refer to different sensors. Thus, a new id has to be generated. For the mobile sensor readings file, I prefixed the sensor id with the letter “M”. For the static sensor readings file, I prefixed the sensor id with the letter “S”.
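A minimal sketch of the tagging and re-keying step, assuming each file is loaded as a list of dictionaries. `tag_rows` is a hypothetical helper, the sample rows are invented, and the type value "Static" is my assumption by analogy with “Mobile”.

```python
def tag_rows(rows, sensor_type, prefix, id_column="Sensor-id"):
    """Return copies of rows with a "Sensor Type" and a prefixed "Sensor id"."""
    tagged = []
    for row in rows:
        new_row = dict(row)
        new_row["Sensor Type"] = sensor_type
        # Prefixing makes ids unique across the two files.
        new_row["Sensor id"] = prefix + str(row[id_column])
        tagged.append(new_row)
    return tagged


# Both files contain a sensor with the raw id "1", but they are different sensors.
static_rows = tag_rows([{"Sensor-id": "1", "Value": "15.2"}], "Static", "S")
mobile_rows = tag_rows([{"Sensor-id": "1", "Value": "30.1"}], "Mobile", "M")
```

The two sensors that shared the raw id "1" now carry the distinct ids "S1" and "M1", so they stay separate once the files are combined.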
The following columns are redundant for analysis:
1. Sensor-id (old sensor id)
2. Sensor-id1 (old sensor id from the static sensor location file)
3. Units
Thus, I decided to delete these 3 columns.
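Dropping the three columns could look like this in Python. The column names are taken from the list above; `drop_columns` and the sample row are hypothetical.

```python
REDUNDANT_COLUMNS = ("Sensor-id", "Sensor-id1", "Units")


def drop_columns(rows, columns=REDUNDANT_COLUMNS):
    """Return copies of rows without the named columns."""
    return [
        {name: value for name, value in row.items() if name not in columns}
        for row in rows
    ]


# Invented sample row for illustration.
row = {
    "Sensor id": "S1",
    "Sensor-id": "1",
    "Sensor-id1": "1",
    "Units": "cpm",
    "Value": "15.2",
}
cleaned = drop_columns([row])
```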
Mobile sensor readings
As with the static file, I added a “Sensor Type” column to the mobile sensor readings file, with the value “Mobile”.
I then added a new column “Sensor id” that concatenates “M” in front of the original sensor id to differentiate the mobile sensors from the static sensors.
Union
Next, I appended the mobile and static sensor data together. Once done, I exported the combined dataset as a CSV file.
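A sketch of the union and export using Python's `csv` module. The field list and the sample rows are assumptions for illustration, and `union_to_csv` is a hypothetical helper. It writes to an in-memory buffer here, but any text file opened with `newline=""` would work the same way.

```python
import csv
import io

# Assumed shared field list; adjust to the actual cleaned columns.
FIELDS = ["Timestamp", "Sensor id", "Sensor Type", "Lat", "Long", "Value"]


def union_to_csv(static_rows, mobile_rows, out):
    """Append both row lists under one header and write them as CSV."""
    # extrasaction="ignore" skips any column that is not in FIELDS.
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in static_rows + mobile_rows:
        writer.writerow(row)


# Invented sample rows for illustration.
static_rows = [{"Timestamp": "2020-04-06 00:00:00", "Sensor id": "S1",
                "Sensor Type": "Static", "Lat": "0.15", "Long": "-119.95",
                "Value": "15.2"}]
mobile_rows = [{"Timestamp": "2020-04-06 00:00:05", "Sensor id": "M1",
                "Sensor Type": "Mobile", "Lat": "0.10", "Long": "-119.80",
                "Value": "30.1"}]

buffer = io.StringIO()
union_to_csv(static_rows, mobile_rows, buffer)
lines = buffer.getvalue().splitlines()
```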
Import into Tableau
To obtain the neighbourhood for each reading, the sensor data has to be joined to the St Himark file by matching each reading's longitude and latitude against the neighbourhood geometry polygons. To do so, I used the “MAKEPOINT” function to turn each latitude and longitude pair into a point and checked which polygon each point intersects.
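The idea behind matching a point to a polygon can be illustrated outside Tableau with a ray-casting test. This is a stand-in for the concept, not a description of how Tableau implements the spatial join. The function names and the two square "neighbourhoods" are made up for the example.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


def neighbourhood_of(lon, lat, polygons):
    """Return the name of the first polygon containing the point, else None."""
    for name, polygon in polygons.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Two made-up square neighbourhoods side by side.
neighbourhoods = {
    "A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
```

With these shapes, `neighbourhood_of(0.5, 0.5, neighbourhoods)` falls in "A" and a point outside both squares returns `None`, which corresponds to a reading that matches no neighbourhood.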