Difference between revisions of "DataCleaning"

From Visual Analytics for Business Intelligence
(Undo revision 4845 by Juehong.ho.2017 (talk))
Tag: Undo
 
(16 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<div style=background:#CD5C5C border:#A3BFB1>
+
<div style=background:#1c2e4a border:#A3BFB1>
<font size = 5; color="#FFFFFF">St. Hilmark Radiation Analysis</font>
+
[[File:Home-header.jpg|frameless]]
 +
<font size = 5; color="#FFFFFF">VAST 2019 MC2: Citizen Science to the Rescue</font>
 
</div>
 
</div>
 +
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
{|style="background-color:#FA8072;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
+
{|style="background-color:#23395d;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#A52A2A; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#23395d; text-align:center;" width="25%" |  
 
;
 
;
[[Overview| <font color="#FFFFFF">Overview</font>]]
+
[[IS428_AY2019-20T1_Assign_Jazreel Tho Wei Wen| <font color="#FFFFFF">Overview</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#A52A2A; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" |  
 
;
 
;
 
[[DataCleaning| <font color="#FFFFFF">Data Cleaning</font>]]
 
[[DataCleaning| <font color="#FFFFFF">Data Cleaning</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#A52A2A; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" |  
 
;
 
;
 
[[Dashboard| <font color="#FFFFFF">Dashboard</font>]]
 
[[Dashboard| <font color="#FFFFFF">Dashboard</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#A52A2A; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#23395d; text-align:center;" width="25%" |  
 
;
 
;
  
Line 23: Line 25:
 
|  &nbsp;
 
|  &nbsp;
 
|}
 
|}
 +
 +
 +
 
<br/>
 
<br/>
  
 
<font size="5">'''Cleaning of sensors data'''</font><br>
 
<font size="5">'''Cleaning of sensors data'''</font><br>
[[File:Dataflow.png|thumb]]
+
==Data flow==
{| class="wikitable centered" width="90%"
+
[[File:Dataflow.png|frameless|upright=3]]
!1
+
 
!2
+
==Static sensor readings==
 +
I wanted to append the static sensor readings to the mobile sensor readings, which requires both files to have the same fields. The static sensor readings lack the latitude and longitude of each sensor, which are recorded in the static sensor location data. Hence, I first joined the StaticSensorReadings file to the StaticSensorLocations file.
 +
 
 +
[[File:Joinstaticlocation.png|frame|center]]
 +
 
 +
I then added a “Sensor Type” column to distinguish between static and mobile sensors.
 +
[[File:Staticsensortype.png|frame|center]]
 +
 
 +
Both files use the same sensor id format, and some sensor ids in the static sensor readings also appear in the mobile sensor readings even though they refer to different sensors. Thus, a new id has to be generated. For the mobile sensor readings, I prefixed the sensor id with the letter “M”. For the static sensor readings, I prefixed it with the letter “S”.
 +
 
 +
[[File:Str s.png|frame|center]]
 +
 
 +
The following columns are redundant for analysis:
 +
1. Sensor-id (old sensor id)
 +
2. Sensor-id1 (old sensor id from the static sensor location file)
 +
3. Units
 +
Thus, I decided to delete these 3 columns.
 +
 
 +
==Mobile sensor readings==
 +
Similarly, for the mobile sensor readings file, I added the sensor type as “Mobile”.
 +
[[File:Mobilesensortype.png|frame|center]]
 +
 
 +
I then added a new column “Sensor id” that prefixes the original sensor id with “M” to differentiate it from the static sensors.
 +
 
 +
[[File:Sensoridmobile.png|frame|center]]
 +
 
 +
==Union==
 +
[[File:Union.png|frameless|upright=3]]
 +
 
 +
Next, I appended the mobile and static sensor data together. Once done, I exported the dataset as a CSV file.
 +
 
 +
==Import into Tableau==
 +
To join the data to the St Himark file, the longitude and latitude of each reading must be matched to the geometry polygons of the St Himark file. To do so, I used the “MAKEPOINT” function to create a point for each reading and checked which polygon each point intersects, which gives the neighbourhood for each reading.
 +
 
 +
[[File:Makepoint.png|frameless|upright=3]]

Latest revision as of 01:28, 13 October 2019

VAST 2019 MC2: Citizen Science to the Rescue

 



Cleaning of sensors data

Data flow

Dataflow.png

Static sensor readings

I wanted to append the static sensor readings to the mobile sensor readings, which requires both files to have the same fields. The static sensor readings lack the latitude and longitude of each sensor, which are recorded in the static sensor location data. Hence, I first joined the StaticSensorReadings file to the StaticSensorLocations file.

Joinstaticlocation.png
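
The join above was done in a visual data-preparation tool. As a rough illustration only, the same step might look like this in plain Python. The sample rows and every column name other than "Sensor-id" (here "Timestamp", "Value", "Units", "Lat", "Long") are assumptions, not taken from the actual files.

```python
# Hypothetical sample rows; only the "Sensor-id" column name comes from the text.
static_readings = [
    {"Timestamp": "2020-04-06 00:00:00", "Sensor-id": "1", "Value": 15.2, "Units": "cpm"},
    {"Timestamp": "2020-04-06 00:00:05", "Sensor-id": "4", "Value": 14.8, "Units": "cpm"},
]
static_locations = [
    {"Sensor-id": "1", "Lat": 0.15, "Long": -119.95},
    {"Sensor-id": "4", "Lat": 0.10, "Long": -119.90},
]


def join_locations(readings, locations):
    """Inner-join readings to locations on "Sensor-id", adding Lat and Long."""
    by_id = {loc["Sensor-id"]: loc for loc in locations}
    joined = []
    for row in readings:
        loc = by_id.get(row["Sensor-id"])
        if loc is None:
            continue  # inner join: skip readings whose sensor has no location
        joined.append({**row, "Lat": loc["Lat"], "Long": loc["Long"]})
    return joined


static_joined = join_locations(static_readings, static_locations)
```

After this step each static reading carries a latitude and longitude, so its fields can line up with the mobile readings.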

I then added a “Sensor Type” column to distinguish between static and mobile sensors.

Staticsensortype.png

Both files use the same sensor id format, and some sensor ids in the static sensor readings also appear in the mobile sensor readings even though they refer to different sensors. Thus, a new id has to be generated. For the mobile sensor readings, I prefixed the sensor id with the letter “M”. For the static sensor readings, I prefixed it with the letter “S”.

Str s.png
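
A minimal Python sketch of the sensor type column and the prefixed id, applied to both the static and the mobile readings. The helper name `tag_rows`, the "Value" column, and the sensor type value "Static" are assumptions for illustration; the text only names the value “Mobile” explicitly.

```python
def tag_rows(rows, sensor_type, prefix):
    """Return copies of rows with a "Sensor Type" column and a prefixed "Sensor id"."""
    return [
        {**row, "Sensor Type": sensor_type, "Sensor id": f"{prefix}{row['Sensor-id']}"}
        for row in rows
    ]


# Hypothetical rows: both files contain a sensor with id "1",
# but they are different physical sensors.
static_rows = [{"Sensor-id": "1", "Value": 15.2}]
mobile_rows = [{"Sensor-id": "1", "Value": 31.7}]

static_tagged = tag_rows(static_rows, "Static", "S")
mobile_tagged = tag_rows(mobile_rows, "Mobile", "M")
```

With the prefix in place, the two sensors that shared the id "1" become "S1" and "M1" and can no longer be confused once the files are appended.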

The following columns are redundant for analysis:

1. Sensor-id (old sensor id)
2. Sensor-id1 (old sensor id from the static sensor location file)
3. Units

Thus, I decided to delete these 3 columns.
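
Dropping the three redundant columns could be sketched in Python as follows. The sample row and the "Sensor id" and "Value" columns it keeps are hypothetical.

```python
# Column names listed above as redundant for analysis.
REDUNDANT = {"Sensor-id", "Sensor-id1", "Units"}


def drop_columns(rows, columns):
    """Return copies of rows without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


# Hypothetical row as it might look after the join and the new id column.
rows = [
    {"Sensor-id": "1", "Sensor-id1": "1", "Units": "cpm", "Sensor id": "S1", "Value": 15.2}
]
cleaned = drop_columns(rows, REDUNDANT)
```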

Mobile sensor readings

Similarly, for the mobile sensor readings file, I added the sensor type as “Mobile”.

Mobilesensortype.png

I then added a new column “Sensor id” that prefixes the original sensor id with “M” to differentiate it from the static sensors.

Sensoridmobile.png

Union

Union.png

Next, I appended the mobile and static sensor data together. Once done, I exported the dataset as a CSV file.
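
The append and export can be sketched with Python's standard `csv` module. This is an illustration under assumptions, not the tool used above: the helper names `union_rows` and `to_csv` and the sample rows are hypothetical, and the check mirrors the requirement that both files have the same fields.

```python
import csv
import io


def union_rows(*tables):
    """Append tables row-wise; every row must have the same field names.

    Assumes the first table is non-empty, since its first row defines the fields.
    """
    fields = list(tables[0][0].keys())
    for table in tables:
        for row in table:
            if set(row) != set(fields):
                raise ValueError(f"field mismatch: {sorted(row)} vs {sorted(fields)}")
    return fields, [row for table in tables for row in table]


def to_csv(fields, rows):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical cleaned rows with identical fields.
static_rows = [{"Sensor id": "S1", "Sensor Type": "Static", "Lat": 0.15, "Long": -119.95, "Value": 15.2}]
mobile_rows = [{"Sensor id": "M1", "Sensor Type": "Mobile", "Lat": 0.12, "Long": -119.91, "Value": 31.7}]

fields, combined = union_rows(static_rows, mobile_rows)
csv_text = to_csv(fields, combined)
```

To write an actual file instead of a string, the same `csv.DictWriter` can be pointed at a file opened with `open(path, "w", newline="")`.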

Import into Tableau

To join the data to the St Himark file, the longitude and latitude of each reading must be matched to the geometry polygons of the St Himark file. To do so, I used the “MAKEPOINT” function to create a point for each reading and checked which polygon each point intersects, which gives the neighbourhood for each reading.

Makepoint.png
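
Tableau performs the point-to-polygon match internally. Note that its MAKEPOINT function takes the latitude first and then the longitude. To illustrate the underlying idea only, here is a small ray-casting sketch in Python. The neighbourhood names and square polygons are made up and do not correspond to the St Himark geometry.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: toggle 'inside' for each polygon edge crossed by a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def neighbourhood_of(lon, lat, neighbourhoods):
    """Return the name of the first polygon containing the point, or None."""
    for name, polygon in neighbourhoods.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical neighbourhoods as lists of (longitude, latitude) vertices.
neighbourhoods = {
    "Neighbourhood A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "Neighbourhood B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

first = neighbourhood_of(0.5, 0.5, neighbourhoods)
second = neighbourhood_of(1.5, 0.5, neighbourhoods)
outside = neighbourhood_of(3.0, 3.0, neighbourhoods)
```

Readings that fall outside every polygon return `None`, which is worth checking for, because mobile sensors may report positions outside the mapped area.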