ISSS608 2018-19 T1 Assign Oh Zhen Yao Matthias Data Processing
Data Overview
For this challenge, we were provided with the following data:
Dataset Description | Screenshot and Metadata
---|---
1. Calls and songs from known birds in the Boonsong Lekagul Wildlife Preserve. They are in MP3 format with varying lengths. The file name contains the species name and an integer that can be referenced with AllBirdsv4.csv for its metadata. | File:AllBirdsFileNames.PNG Screenshot of file names provided by Mistford College
2. “AllBirdsv4.csv” contains metadata for the calls and songs from known birds in eight variables. |
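
As a rough illustration of how a file name could be tied back to its row in AllBirdsv4.csv, the sketch below splits a name into its species part and its integer. The naming pattern `<species words joined by hyphens>-<integer>.mp3`, the function name `parse_recording_name` and the example file name are all assumptions made for illustration; the actual pattern is the one shown in the screenshot of file names and may differ.

```python
import re

# ASSUMPTION: file names look like "<Species-Words>-<integer>.mp3".
# This pattern is hypothetical and not confirmed by the source data.
FILENAME_RE = re.compile(r"^(?P<species>.+)-(?P<file_id>\d+)\.mp3$", re.IGNORECASE)


def parse_recording_name(filename):
    """Split a recording file name into (species name, integer File ID)."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name format: {filename!r}")
    # Hyphens inside the species part are treated as word separators.
    species = match.group("species").replace("-", " ")
    return species, int(match.group("file_id"))


# Hypothetical file name, used only to show the call.
species, file_id = parse_recording_name("Example-Bird-12345.mp3")
print(species, file_id)
```

The returned integer can then be used as the lookup key against the File ID variable in AllBirdsv4.csv.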
Data Cleaning
Of the 5 pieces of data listed above, only AllBirdsv4.csv requires cleaning, to remove values that can be neither imputed nor replaced manually through inference. The cleaning outcome for each variable in AllBirdsv4.csv is as follows:
- File ID
This variable has no invalid values, meaning all File IDs are valid integer values.
- English_name
This variable has no invalid values. The summary shows that the recordings provided to us cover 19 unique known species in the Preserve.
- Y
Two records have invalid values for the Y variable, and they are removed from further analysis.
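
The checks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for this assignment: the sample rows are made up, the helper names `is_number` and `clean_records` are hypothetical, and "invalid" for Y is assumed to mean "does not parse as a finite number", which the source does not specify. Only the column names File ID, English_name and Y come from the text above.

```python
import csv
import io
import math

# Made-up sample rows for illustration; the real AllBirdsv4.csv has eight variables.
SAMPLE_CSV = """File ID,English_name,Y
1001,Species A,120
1002,Species B,?
1003,Species A,87
1004,Species C,
"""


def is_number(text):
    """Return True if text parses as a finite number (assumed validity rule)."""
    try:
        return math.isfinite(float(text))
    except (TypeError, ValueError):
        return False


def clean_records(csv_text):
    """Check File ID, count species, and drop rows whose Y value is invalid."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        # Rows whose File ID is not a plain non-negative integer.
        "bad_file_ids": [r for r in rows if not r["File ID"].strip().isdigit()],
        # Unique species names across all rows.
        "species": sorted({r["English_name"] for r in rows}),
        # Rows kept for analysis versus rows dropped because of Y.
        "kept": [r for r in rows if is_number(r["Y"])],
        "dropped": [r for r in rows if not is_number(r["Y"])],
    }


result = clean_records(SAMPLE_CSV)
print(len(result["kept"]), "kept,", len(result["dropped"]), "dropped")
```

Rows are dropped here instead of imputed because, as noted above, the invalid values cannot be replaced through inference.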
Banner image credit to: Marshal Hedin