Difference between revisions of "ISSS608 2017-18 T3 Assign Tan Yong Ying Data Overview and Cleaning"

From Visual Analytics and Applications
Line 28: Line 28:
 
{| class="wikitable"
 
{| class="wikitable"
 
|-
 
|-
! # !! Dataset Description !! Screenshot and Metadata
+
! Dataset Description !! Screenshot and Metadata
 
|-
 
|-
| 1 || Calls and songs from known birds in the Boonsong Lekagul Wildlife Preserve. They are in MP3 format with varying lengths. The file name contains the species name and an integer that can be referenced with AllBirdsv4.csv for its metadata. || [[Image:AllBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Mistford College''']]
+
| Calls and songs from known birds in the Boonsong Lekagul Wildlife Preserve. They are in MP3 format with varying lengths. The file name contains the species name and an integer that can be referenced with AllBirdsv4.csv for its metadata. || [[Image:AllBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Mistford College''']]
 
|-
 
|-
| 2 || “AllBirdsv4.csv” contains metadata for the calls and songs from known birds in eight variables.  
+
| “AllBirdsv4.csv” contains metadata for the calls and songs from known birds in eight variables.  
 
||  
 
||  
 
{| class="wikitable" style="margin: auto;"
 
{| class="wikitable" style="margin: auto;"
Line 56: Line 56:
 
|} <br> [[Image:AllBirdsMetadata.PNG|800px|border]]
 
|} <br> [[Image:AllBirdsMetadata.PNG|800px|border]]
 
|-
 
|-
| 3 || Fifteen bird sound files provided by Kasios to support their claim that Rose-crested Blue Pipits are still plentiful across the Preserve. They were recorded over the past few months and are in MP3 format. The file names are integers between 1 and 15, and its metadata can be referenced with Test Bird Locations.csv. || [[Image:TestBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Kasios''']]  
+
| Fifteen bird sound files provided by Kasios to support their claim that Rose-crested Blue Pipits are still plentiful across the Preserve. They were recorded over the past few months and are in MP3 format. The file names are integers between 1 and 15, and its metadata can be referenced with Test Bird Locations.csv. || [[Image:TestBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Kasios''']]  
 
|-
 
|-
| 4 || “Test Bird Locations.csv” contains metadata for the Kasios files in three variables.  
+
| “Test Bird Locations.csv” contains metadata for the Kasios files in three variables.  
 
||
 
||
 
{| class="wikitable" style="margin: auto;"
 
{| class="wikitable" style="margin: auto;"
Line 72: Line 72:
 
|} <br> [[Image:TestBirdsMetadata.PNG|center|frame|'''Screenshot of Test Birds Locations.csv''']]
 
|} <br> [[Image:TestBirdsMetadata.PNG|center|frame|'''Screenshot of Test Birds Locations.csv''']]
 
|-
 
|-
| 5 || “Lekagul Roadways 2018.bmp” is a 200 by 200 pixel map of the Preserve which outlines the roadways through the site. The bottom left of the map has coordinate (0,0) and the top right of the map has coordinate (199,199). Note that the alleged dumping site was centered around coordinate (148, 159) and its extent has not been thoroughly studied. || [[Image:Roadways.jpg|center|frame|'''Map of Lekagul Preserve with roadways''']]
+
| “Lekagul Roadways 2018.bmp” is a 200 by 200 pixel map of the Preserve which outlines the roadways through the site. The bottom left of the map has coordinate (0,0) and the top right of the map has coordinate (199,199). Note that the alleged dumping site was centered around coordinate (148, 159) and its extent has not been thoroughly studied. || [[Image:Roadways.jpg|center|frame|'''Map of Lekagul Preserve with roadways''']]
 
|}
 
|}
 
+
<br>
 +
<div style="border-style: solid; border-width:0; background: #c8bdb9; padding: 7px; font-weight: bold; text-align:left; line-height: wrap_content; text-indent: 20px; font-size:20px; font-family:Century Gothic;border-bottom:5px solid white; border-top:5px solid black"><font color= #000000>Data Cleaning</font></div>
 +
Out of the 5 pieces of data listed above, only AllBirdsv4.csv requires data cleaning to remove values that cannot be imputed or replaced manually through guessing or inference. The data cleaning outcome for each variable in AllBirdsv4.csv is as follows:
 +
# '''File ID'''
 
Banner image credit to: [https://www.flickr.com/photos/23660854@N07/24385545393 Marshal Hedin]
 
Banner image credit to: [https://www.flickr.com/photos/23660854@N07/24385545393 Marshal Hedin]

Revision as of 21:05, 6 July 2018

VAST Challenge 2018: Suspense at the Wildlife Preserve



Data Overview

For this challenge, we were provided with the following data:

{| class="wikitable"
|-
! Dataset Description !! Screenshot and Metadata
|-
| Calls and songs from known birds in the Boonsong Lekagul Wildlife Preserve. They are in MP3 format with varying lengths. Each file name contains the species name and an integer that can be looked up in AllBirdsv4.csv for its metadata. || [[Image:AllBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Mistford College''']]
|-
| “AllBirdsv4.csv” contains metadata for the calls and songs from known birds in eight variables.
||
{| class="wikitable" style="margin: auto;"
|+ Description of each variable in AllBirdsv4.csv
|-
! Variable Name !! Description
|-
| File ID || The integer index of the file names for calls and songs from known birds
|-
| English_name || The common English name for the recorded bird
|-
| Vocalization_type || Type of bird sound, typically a “call” or a “song”
|-
| Quality || Quality of the recorded bird sound (A, B, C, D, E or no score). A recording with loud background noise or low bird sound purity has a lower quality.
|-
| Time || Time the recording was captured
|-
| Date || Date the recording was captured
|-
| X || X coordinate of the location the recording was captured at
|-
| Y || Y coordinate of the location the recording was captured at
|} <br> [[Image:AllBirdsMetadata.PNG|800px|border]]
|-
| Fifteen bird sound files provided by Kasios to support their claim that Rose-crested Blue Pipits are still plentiful across the Preserve. They were recorded over the past few months and are in MP3 format. The file names are integers between 1 and 15, and their metadata can be looked up in Test Bird Locations.csv. || [[Image:TestBirdsFileNames.PNG|center|frame|'''Screenshot of file names provided by Kasios''']]
|-
| “Test Bird Locations.csv” contains metadata for the Kasios files in three variables.
||
{| class="wikitable" style="margin: auto;"
|+ Description of each variable in Test Bird Locations.csv
|-
! Variable Name !! Description
|-
| ID || The integer index of the file names provided by Kasios
|-
| X || X coordinate of the location the Kasios recording was captured at
|-
| Y || Y coordinate of the location the Kasios recording was captured at
|} <br> [[Image:TestBirdsMetadata.PNG|center|frame|'''Screenshot of Test Birds Locations.csv''']]
|-
| “Lekagul Roadways 2018.bmp” is a 200 by 200 pixel map of the Preserve which outlines the roadways through the site. The bottom left of the map has coordinate (0,0) and the top right of the map has coordinate (199,199). Note that the alleged dumping site was centered around coordinate (148, 159) and its extent has not been thoroughly studied. || [[Image:Roadways.jpg|center|frame|'''Map of Lekagul Preserve with roadways''']]
|}
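
The coordinate convention of the roadway map can be illustrated with a minimal Python sketch. It is not part of the original submission. It assumes the bitmap is loaded as an image array whose row 0 is the top row (the usual convention for image libraries), so the Y axis has to be flipped. The function names `map_to_pixel` and `distance_to_dump_site` are hypothetical helpers introduced only for this illustration.

```python
import math

MAP_SIZE = 200          # the map is 200 by 200 pixels (from the description above)
DUMP_SITE = (148, 159)  # centre of the alleged dumping site (from the description above)


def map_to_pixel(x, y, size=MAP_SIZE):
    """Convert map coordinates (origin at bottom left) to (row, col)
    indices of an image array whose origin is at the top left.

    Assumption: row 0 of the loaded bitmap is the top row of the map.
    """
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError(f"coordinate ({x}, {y}) is outside the {size}x{size} map")
    return size - 1 - y, x


def distance_to_dump_site(x, y, site=DUMP_SITE):
    """Euclidean distance, in map units, from (x, y) to the dump site centre."""
    return math.hypot(x - site[0], y - site[1])
```

For example, `map_to_pixel(0, 0)` addresses the bottom-left pixel of the image array, and `distance_to_dump_site(x, y)` can be applied to the X and Y columns of either CSV file to rank recordings by proximity to the alleged dumping site.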


Data Cleaning

Of the five datasets listed above, only AllBirdsv4.csv requires cleaning, which removes values that cannot be imputed or replaced manually through guessing or inference. The cleaning outcome for each variable in AllBirdsv4.csv is as follows:

  1. File ID
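
The idea of removing values that cannot be recovered might look like the following Python sketch. It is an illustration only: this revision does not state the actual cleaning rules, so the rule used here (X and Y must parse as numbers inside the 200 by 200 map) is an assumption, the helper names are hypothetical, and the sample rows are made up rather than taken from AllBirdsv4.csv.

```python
import csv
import io


def is_valid_coordinate(value, size=200):
    """Return True if value parses as a number inside the map extent.

    Assumed rule for illustration; the real cleaning criteria are not
    given in this revision of the page.
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return 0 <= number < size


def drop_unrecoverable_rows(rows):
    """Split rows into those kept and those dropped because X or Y is unusable."""
    kept, dropped = [], []
    for row in rows:
        if is_valid_coordinate(row.get("X")) and is_valid_coordinate(row.get("Y")):
            kept.append(row)
        else:
            dropped.append(row)
    return kept, dropped


# Made-up sample rows using a subset of the column names described above.
SAMPLE = """File ID,English_name,X,Y
1,Rose-crested Blue Pipit,49,63
2,Rose-crested Blue Pipit,?,71
3,Rose-crested Blue Pipit,120,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept, dropped = drop_unrecoverable_rows(rows)
print(f"kept {len(kept)} rows, dropped {len(dropped)} rows")
```

Keeping the dropped rows in a separate list, rather than discarding them silently, makes it possible to review what was removed before committing to the cleaned file.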

Banner image credit to: Marshal Hedin