Difference between revisions of "IS428 AY2019-20T1 Assign Sean Chai Shong Hee Data Preprosessing"

From Visual Analytics for Business Intelligence
 
(24 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
<div style="background: #f48024 ; letter-spacing:-0.08em;font-size:20px"><font color=#f48024 face="Century Gothic">Mini Case Challenge 2: Visualising Radiation Measurements in St. Himark</font></div>
 +
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
+
{|style="background-color:#f48024;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="20%" |  
+
| style="font-family:Century Gothic; font-size:90%; solid #f48024 ; border-bottom:0px solid #f48024 ; background:#f48024 ; text-align:center;" width="25%" |  
;
+
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee| <font color="#fff">Problem And Motivation</font>]]
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee| <font color="#FFFFFF">Problem And Motivation</font>]]
+
| &nbsp;
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="20%" |  
+
| style="font-family:Century Gothic; font-size:90%; solid #f48024 ; border-bottom:0px solid #f48024 ; background:#bcbbbb; text-align:center;" width="25%" |  
;
+
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Data_Preprosessing| <font color="#fff">Data Preprocessing</font>]]
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Data_Preprosessing| <font color="#FFFFFF">Data Preprocessing</font>]]
+
| &nbsp;
  
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="20%" |  
+
| style="font-family:Century Gothic; font-size:90%; solid #f48024 ; border-bottom:0px solid #f48024 ; background:#f48024 ; text-align:center;" width="25%" |  
;
+
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Interactive_Visualisation| <font color="#fff">Interactive Visualisation</font>]]
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Interactive_Visualisation| <font color="#FFFFFF">Interactive Visualisation</font>]]
+
| &nbsp;
  
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="20%" |  
+
| style="font-family:Century Gothic; font-size:90%; solid #f48024 ; border-bottom:0px solid #f48024 ; background:#f48024 ; text-align:center;" width="30%" |  
;
+
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Interesting_Anomalies_&_Observations| <font color="#fff">Interesting Anomalies & Observations</font>]]
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_Interesting_Anomalies_&_Observations| <font color="#FFFFFF">Interesting Anomalies & Observations</font>]]
+
| &nbsp;
  
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="20%" |
 
;
 
[[IS428_AY2019-20T1_Assign_Sean_Chai_Shong_Hee_References| <font color="#FFFFFF">References</font>]]
 
 
|  &nbsp;
 
 
|}
 
|}
  
== Data Preprocessing ==
+
== Data Preprocessing with Pandas ==
 
 
 
<p>Data was provided in the form of 3 csv files:</p>
 
<p>Data was provided in the form of 3 csv files:</p>
 
*MobileSensorReadings.csv: Consists of measures of radiation values in count per minute. Also contained the longitude and latitude of where the values were collected, and the respective time they were recorded
 
*MobileSensorReadings.csv: Consists of measures of radiation values in count per minute. Also contained the longitude and latitude of where the values were collected, and the respective time they were recorded
Line 31: Line 27:
 
*StaticSensorLocations.csv: Consists of the longitude and latitude of each static sensor  
 
*StaticSensorLocations.csv: Consists of the longitude and latitude of each static sensor  
  
<p>A Shapefile was also provided, which contained Polygons of the region of St. Himark</p>
+
<p>A polygon Shapefile was also provided, which provides the spatial representation of St. Himark</p>
  
<p>Before any meaningful observations could be made with Tableau, the csv files had to be preprocessed to allow Tableau to carry out a spatial join between the point and polygon data. As such, the longitude and latitude values were converted to Point geometry values using geoPandas. The geoPandas Dataframe was later saved as a Shapefile.</p>
+
<p>Before any meaningful observations can be made with Tableau, the csv files have to be preprocessed so that Tableau can carry out a spatial join between point and polygon geometric data. As such, the longitude and latitude values need to be converted to Point geometry values using GeoPandas. The GeoDataFrame is later exported to a Shapefile.</p>
  
== 1. Importing python libraries ==
+
=== Importing python libraries ===
 
[[File:Importing relevant libraries.png|thumb|none]]
 
[[File:Importing relevant libraries.png|thumb|none]]
<p>We first start off by importing libraries that are necessary in our creation of our new Shapefile. These libraries include Pandas, Shapely and GeoPandas.</p>
+
<p>I first start off by importing libraries that are necessary in the creation of the new Shapefiles. These libraries include Pandas, Shapely and GeoPandas.</p>
  
== 2. Reading csv files to Pandas Dataframe ==  
+
=== Reading csv files to Pandas Dataframe ===
 
[[File:Reading StaticSensorReadings.csv and StaticSensorLocations.csv.png|thumb|none]]
 
[[File:Reading StaticSensorReadings.csv and StaticSensorLocations.csv.png|thumb|none]]
<p>Next, we read our csv files to a Pandas Dataframe.</p>
+
<p>Next, I read the csv files into a Pandas Dataframe.</p>
 +
 
 +
=== Merging StaticSensorReadings and StaticSensorLocations ===
 +
[[File:Merging dataframes.png|thumb|none]]
 +
<p>I proceed by merging the static_reading dataframe with the static_loc dataframe. Here, a full outer join on "Sensor-id" is done to ensure all values are accounted for, even if they are null.</p>
 +
 
 +
=== Creating the GeoDataframe ===
 +
[[File:GeoPandas Dataframe.png|thumb|none]]
 +
<p>Longitude and Latitude values are first converted into geometry point values. Here, I append the geometry point values into a new column called "geometry" and create my new GeoDataframe</p>
 +
 
 +
[[File:Sample of dataframe.png|thumb|none]]
 +
<p>We can see that I now have a new column called "geometry", which contains geometry point values for each of the rows</p>
 +
 
 +
=== Exporting our Dataframe to a Shapefile ===
 +
[[File:Exporting Shapefile.png|thumb|none]]
 +
<p>Finally, I export my GeoDataframe to a Shapefile. This Shapefile can be later used for spatial join with the provided "StHimark.shp"</p>
 +
 
 +
=== Creating Shapefile for MobileSensorReadings ===
 +
[[File:Creating MobileSensorReadings Shapefile.png|thumb|none]]
 +
<p>Following the steps above, I did the same for "MobileSensorReadings.csv" and generated my mobile sensors Shapefile.</p>
 +
 
 +
== Tableau ==
 +
<p>With my new Shapefile created, I will now begin to work on visualisations in Tableau.</p>
 +
 
 +
<p>Tableau supports the matching of locations of geographic points from one data table to polygons of another data table. This is done by using the predicate "intersect".
  
== 3. Merging StaticSensorReadings and StaticSensorLocations ==
+
To leverage on Tableau's "intersect" function, I opened my static sensor Shapefile as my primary data file, followed by the provided "StHimark.shp" Shapefile.  
[[File:Merging dataframes.png|thumb]]
 
<p>We proceed by merging our static_reading dataframe with our static_loc dataframe. Here, we do a full outer join on "Sensor-id" to ensure all values are accounted for, even if they are null.</p>
 
  
== 4. Creating our GeoPandas Dataframe ==
+
[[File:Spatial join (intersect).png|thumb|none]]
[[File:GeoPandas Dataframe.png|thumb]]
 
<p>We first convert our Longitude and Latitude into geometry point values. Here, we append the geometry point values into a new column called "geometry" and create our new geo Dataframe</p>
 
  
[[File:Sample of dataframe.png|thumb]]
+
Establish a full outer join for both files to ensure no values are left out. The two data tables will be connected by their geometries, where point geometry from my static sensor Shapefile intersects the polygon of the "StHimark" Shapefile.
<p>We can see that we now have a new column called "geometry", which contains geometry point values for each of the rows</p>
 
  
 +
[[File:Mobile sensor spatial join (intersect).png|thumb|none]]
  
 +
The same will be done for my mobile sensor Shapefile. A new data source is created, with my mobile sensor Shapefile being my primary data file, followed by the "StHimark.shp" Shapefile.
  
== 5.
+
[[File:Mobile and Static Data.png|thumb|none]]
 +
The above picture shows a sample of how my data files look like after intersecting point geometry values with polygon.</p>

Latest revision as of 00:37, 12 October 2019

Mini Case Challenge 2: Visualising Radiation Measurements in St. Himark


 

Data Preprocessing with Pandas

Data was provided in the form of 3 csv files:

  • MobileSensorReadings.csv: Contains radiation measurements in counts per minute, together with the longitude and latitude where each value was collected and the time it was recorded.
  • StaticSensorReadings.csv: Contains radiation measurements in counts per minute, together with the time at which each value was recorded.
  • StaticSensorLocations.csv: Contains the longitude and latitude of each static sensor.

A polygon Shapefile was also provided, which gives the spatial representation of St. Himark.

Before any meaningful observations can be made with Tableau, the csv files have to be preprocessed so that Tableau can carry out a spatial join between point and polygon geometric data. As such, the longitude and latitude values need to be converted to Point geometry values using GeoPandas. The GeoDataFrame is later exported to a Shapefile.

Importing python libraries

Importing relevant libraries.png

I start by importing the libraries needed to create the new Shapefiles: Pandas, Shapely and GeoPandas.

Reading csv files to Pandas Dataframe

Reading StaticSensorReadings.csv and StaticSensorLocations.csv.png

Next, I read each csv file into its own Pandas DataFrame.
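The original code for this step is only available as a screenshot, so the following is a minimal sketch of what reading the two static sensor files could look like. The inline text stands in for the real csv files, the values are made up for illustration, and every column name other than "Sensor-id" is an assumption.

```python
import io

import pandas as pd

# Small in-memory stand-ins for the real files. With the actual data this
# would be pd.read_csv("StaticSensorReadings.csv") and
# pd.read_csv("StaticSensorLocations.csv").
# The values are made up; column names other than "Sensor-id" are assumed.
static_readings_csv = io.StringIO(
    "Timestamp,Sensor-id,Value\n"
    "2020-04-06 00:00:05,1,15.0\n"
    "2020-04-06 00:00:05,4,14.0\n"
)
static_locations_csv = io.StringIO(
    "Sensor-id,Lat,Long\n"
    "1,0.15,-119.96\n"
    "4,0.16,-119.91\n"
)

# parse_dates converts the timestamp column to datetime values on load
static_reading = pd.read_csv(static_readings_csv, parse_dates=["Timestamp"])
static_loc = pd.read_csv(static_locations_csv)

print(static_reading.dtypes)
print(static_loc.head())
```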

Merging StaticSensorReadings and StaticSensorLocations

Merging dataframes.png

I proceed by merging the static_reading dataframe with the static_loc dataframe. Here, a full outer join on "Sensor-id" is done to ensure all values are accounted for, even if they are null.
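As a rough sketch of the merge described above, assuming toy data and assumed column names apart from "Sensor-id": a full outer join keeps readings without a known location and locations without any readings, filling the missing side with nulls. The `indicator` option is an addition here, used only to show where each row came from.

```python
import pandas as pd

# Toy frames: sensor 4 has a reading but no location,
# sensor 9 has a location but no readings.
static_reading = pd.DataFrame(
    {"Sensor-id": [1, 1, 4], "Value": [15.0, 16.0, 14.0]}
)
static_loc = pd.DataFrame(
    {"Sensor-id": [1, 9], "Lat": [0.15, 0.20], "Long": [-119.96, -119.80]}
)

# how="outer" keeps unmatched rows from both sides; indicator=True adds a
# "_merge" column saying whether a row matched on both sides or only one.
static = pd.merge(
    static_reading, static_loc, on="Sensor-id", how="outer", indicator=True
)

print(static)
```

Rows marked "left_only" or "right_only" in the "_merge" column are the ones that would have been silently dropped by an inner join.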

Creating the GeoDataframe

GeoPandas Dataframe.png

Longitude and latitude values are first converted into Point geometry values. I then store these points in a new column called "geometry" and create the new GeoDataFrame.

Sample of dataframe.png

The dataframe now has a new column called "geometry", which contains a Point geometry value for each row.
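A minimal sketch of this conversion, under two assumptions that the text does not state: the coordinate columns are called "Long" and "Lat", and the coordinates are plain WGS84 longitude and latitude (EPSG:4326). The values are illustrative only.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Toy stand-in for the merged static sensor dataframe (made-up values)
static = pd.DataFrame(
    {
        "Sensor-id": [1, 4],
        "Value": [15.0, 14.0],
        "Lat": [0.15, 0.16],
        "Long": [-119.96, -119.91],
    }
)

# Shapely points take (x, y), which means (longitude, latitude)
geometry = [Point(lon, lat) for lon, lat in zip(static["Long"], static["Lat"])]

# The CRS is an assumption: unprojected WGS84 longitude/latitude
static_gdf = gpd.GeoDataFrame(static, geometry=geometry, crs="EPSG:4326")

print(static_gdf.head())
```

The order of the coordinates matters: passing latitude first would silently place every sensor at the wrong location.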

Exporting our Dataframe to a Shapefile

Exporting Shapefile.png

Finally, I export my GeoDataFrame to a Shapefile. This Shapefile can later be used for a spatial join with the provided "StHimark.shp".

Creating Shapefile for MobileSensorReadings

Creating MobileSensorReadings Shapefile.png

Following the steps above, I did the same for "MobileSensorReadings.csv" and generated my mobile sensors Shapefile.
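Because the mobile readings already carry their own coordinates, no merge is needed for them. One possible way to reuse the steps is a small helper such as the hypothetical `readings_to_points` below; the column names "Long" and "Lat", the CRS and the sample row are all assumptions. It uses `geopandas.points_from_xy` instead of building the points one at a time.

```python
import io

import geopandas as gpd
import pandas as pd


def readings_to_points(csv_source, lon_col="Long", lat_col="Lat"):
    """Read a csv with coordinate columns and return a point GeoDataFrame."""
    df = pd.read_csv(csv_source)
    # points_from_xy expects x (longitude) first, then y (latitude)
    points = gpd.points_from_xy(df[lon_col], df[lat_col])
    return gpd.GeoDataFrame(df, geometry=points, crs="EPSG:4326")


# In-memory stand-in for "MobileSensorReadings.csv" (made-up row)
sample = io.StringIO(
    "Timestamp,Sensor-id,Long,Lat,Value\n"
    "2020-04-06 00:00:10,7,-119.95,0.16,20.5\n"
)

mobile_gdf = readings_to_points(sample)
print(mobile_gdf.head())
```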

Tableau

With my new Shapefiles created, I can now begin to work on visualisations in Tableau.

Tableau can match geographic points in one data table to the polygons of another data table through a spatial join that uses the "intersect" predicate. To make use of this, I opened my static sensor Shapefile as my primary data file, followed by the provided "StHimark.shp" Shapefile.

Spatial join (intersect).png

Establish a full outer join for both files to ensure no values are left out. The two data tables will be connected by their geometries, where point geometry from my static sensor Shapefile intersects the polygon of the "StHimark" Shapefile.
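Tableau performs this join through its interface, so there is no code to show for it. The sketch below only illustrates what the "intersect" predicate tests, using Shapely with a made-up square region and two made-up points; it is not how Tableau implements the join.

```python
from shapely.geometry import Point, Polygon

# Hypothetical square region standing in for one polygon of the St. Himark map
regions = {"Region A": Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])}

# Hypothetical sensor positions: one inside the region, one outside it
sensors = {"sensor 1": Point(1, 1), "sensor 2": Point(3, 3)}

matches = {}
for sensor_name, point in sensors.items():
    # A point is matched to the first polygon it intersects; None otherwise
    matches[sensor_name] = next(
        (name for name, poly in regions.items() if poly.intersects(point)),
        None,
    )

print(matches)
```

With a full outer join, a point that intersects no polygon (such as "sensor 2" here) is still kept in the result, with null values in the polygon columns.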

Mobile sensor spatial join (intersect).png

The same will be done for my mobile sensor Shapefile. A new data source is created, with my mobile sensor Shapefile being my primary data file, followed by the "StHimark.shp" Shapefile.

Mobile and Static Data.png

The picture above shows a sample of what my data files look like after intersecting the point geometry values with the polygons.