Revision as of 21:24, 22 June 2018
“Mine dear rose pipits, whence did do thou vanish?”
Tools
R is the primary tool used in this analysis. The packages listed below cover the project’s scope: data cleaning, data visualisation, geospatial analysis and audio processing.
- R libraries
- sp
- rgdal
- sf
- raster
- spatstat
- maptools
- gplots
- ggplot2
- ggmap
- rasterVis
- lattice
- latticeExtra
- tidyverse
- zoo
- tmap
- reshape2
- quantmod
- ggTimeSeries
- viridis
- rlang
- soundgen
- tuneR
- phonTools
- seewave
Approach Taken
The following outlines the approach used for the analysis.
Step | Approach | Description
1. Data Understanding
i. Read in Raster Layer (Lekagul Roadways Map)
- It is a single-layer raster file of 200 x 200 cells.
class : RasterLayer
dimensions : 200, 200, 40000 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : 0, 200, 0, 200 (xmin, xmax, ymin, ymax)
coord. ref. : NA
names : Lekagul_Roadways_2018
values : 0, 255 (min, max)
ii. Find out structure of Raster Layer
Extent : 40000
CRS arguments : NA
File Size : 41078
Object Size : 14376 bytes
Layer : 1
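The inspection step above can be sketched as follows. This is a minimal illustration only: an in-memory raster with random 0/255 values stands in for the roadways map, so the cell values are an assumption, not the project data. With the real file, `raster()` would be pointed at the map image instead.

```r
library(raster)

# Stand-in for the roadways map: 200 x 200 cells, unit resolution.
# No CRS is supplied, matching the "coord. ref. : NA" in the summary above.
lekagul <- raster(nrows = 200, ncols = 200,
                  xmn = 0, xmx = 200, ymn = 0, ymx = 200)

# Assumed placeholder values (0 or 255), not the real map content
set.seed(1)
values(lekagul) <- sample(c(0, 255), ncell(lekagul), replace = TRUE)
names(lekagul) <- "Lekagul_Roadways_2018"

print(lekagul)        # class, dimensions, resolution, extent, value range
dim(lekagul)          # rows, columns, layers
ncell(lekagul)        # total number of cells
res(lekagul)          # cell size in x and y
extent(lekagul)       # bounding box
object.size(lekagul)  # memory footprint of the R object
```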
2. Data Cleaning
i. Import two CSV Files (Birds)
- 2081 Training Birds (Metadata)
- 15 Test Birds (Provided by Kasios)
ii. Fix Data Quality Issues
- Change File ID from numeric to character
- Change coordinates to numeric
- Change Date from Character to Date
- Omit the two NA values for the Y coordinate.
- Clean the dates: standardise all to m/d/y; replace dates with a missing month or year with NA; impute a missing day as the 1st of the month.
- Clean the times: standardise all to 24-hour format, using “.” instead of “:” as the separator.
- Clean the vocalisation type: standardise all to lower case. Values containing both ‘song’ and ‘call’ are recoded as ‘call’, on the assumption that a call signals distress while a song is the default.
- Clean the quality: recode ‘no score’ as NA.
iii. Data Manipulation
- Extract out the “Year” and “Month” from the date, as new columns
- Create a new column for Quarter (Q1,Q2,Q3,Q4) & Season (Spring, Summer, Fall, Winter)
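The cleaning and manipulation steps above might look like the base R sketch below. The toy data frame, its column names, the use of `0` or `?` to mark missing values, and the month-to-season mapping are all assumptions made for illustration; the real metadata may encode these differently.

```r
# Toy stand-in for the bird metadata (column names are assumed)
birds <- data.frame(
  File.ID = c(101, 102, 103, 104),
  X = c("55", "120", "87", "140"),
  Y = c("102", "64", "33", "?"),
  Date = c("3/14/2016", "0/0/2015", "7/0/2017", "11/2/2014"),
  Time = c("7:30", "18:05", "9:00", "6:45"),
  Vocalization_type = c("Song", "CALL", "Song, call", "call"),
  Quality = c("A", "no score", "B", "C"),
  stringsAsFactors = FALSE
)

# Type fixes: ID to character, coordinates to numeric
birds$File.ID <- as.character(birds$File.ID)
birds$X <- suppressWarnings(as.numeric(birds$X))
birds$Y <- suppressWarnings(as.numeric(birds$Y))
birds <- birds[!is.na(birds$Y), ]  # omit rows with a missing Y coordinate

# Dates: m/d/y, missing month/year -> NA, missing day -> 1st of the month
parts <- do.call(rbind, strsplit(birds$Date, "/", fixed = TRUE))
month <- as.integer(parts[, 1])
day   <- as.integer(parts[, 2])
year  <- as.integer(parts[, 3])
day[is.na(day) | day == 0] <- 1L
month[is.na(month) | month == 0] <- NA
year[is.na(year) | year == 0] <- NA
birds$Date <- as.Date(ISOdate(year, month, day))

# Times: "." as separator (the toy values are already in 24-hour form)
birds$Time <- sub(":", ".", birds$Time, fixed = TRUE)

# Vocalisation type: lower case; "song and call" becomes "call"
voc <- tolower(trimws(birds$Vocalization_type))
voc[grepl("song", voc) & grepl("call", voc)] <- "call"
birds$Vocalization_type <- voc

# Quality: "no score" -> NA
birds$Quality[birds$Quality == "no score"] <- NA

# Derived columns: Year, Month, Quarter, Season
birds$Year  <- as.integer(format(birds$Date, "%Y"))
birds$Month <- as.integer(format(birds$Date, "%m"))
birds$Quarter <- ifelse(is.na(birds$Month), NA_character_,
                        paste0("Q", (birds$Month - 1) %/% 3 + 1))
# Assumed mapping: Dec-Feb Winter, Mar-May Spring, Jun-Aug Summer, Sep-Nov Fall
season_of <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
               "Summer", "Summer", "Fall", "Fall", "Fall", "Winter")
birds$Season <- season_of[birds$Month]

print(birds)
```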
iv. Geospatial File Compatibility
- Convert CSV file (2081 birds) into the following:
- spatial point data frame
- sp format
- shp format
- st_read compatible format
- readOGR compatible format
- ppp format (for spatstat compatibility)
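As a rough sketch of these conversions, the snippet below turns a small data frame into sp, sf and spatstat objects. The three-row data frame, its column names and the output path in the commented `st_write()` call are hypothetical; the 200 x 200 window follows the raster extent described earlier.

```r
library(sp)
library(sf)
library(spatstat)

# Toy stand-in for the cleaned bird table (assumed column names)
birds <- data.frame(
  X = c(55, 120, 87),
  Y = c(102, 64, 33),
  English_name = c("Rose Pipit", "Ordinary Snape", "Lesser Birchbeere")
)

# sp: promote the data frame to a SpatialPointsDataFrame
birds_sp <- birds
coordinates(birds_sp) <- ~ X + Y

# sf: simple-features object; writing it out gives a shapefile
# that st_read() (and readOGR()) can load again
birds_sf <- st_as_sf(birds, coords = c("X", "Y"))
# st_write(birds_sf, "birds.shp")  # hypothetical output path

# spatstat: planar point pattern inside the 200 x 200 map window
birds_ppp <- ppp(birds$X, birds$Y, window = owin(c(0, 200), c(0, 200)))

class(birds_sp)
class(birds_sf)
class(birds_ppp)
```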
v. Data Overview & Exploration
- Overlay 2081 Birds, Raster Map & Dumping Site, for an integrated overview using `plot()`
- Use `facet_wrap` to identify location of clustering across species, across time, and across season, and by call/song
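A faceted overview of this kind could be built roughly as below. The sightings are simulated, and the column names and the dumping-site coordinates are placeholders chosen for the example; swapping the facet variable (for instance `~ Year` or `~ Season`) gives the other views.

```r
library(ggplot2)

set.seed(1)
# Simulated sightings on the 200 x 200 map (not the real data)
birds <- data.frame(
  X = runif(60, 0, 200),
  Y = runif(60, 0, 200),
  English_name = rep(c("Rose Pipit", "Ordinary Snape", "Lesser Birchbeere"),
                     each = 20),
  Year = sample(2012:2017, 60, replace = TRUE)
)

# Placeholder location for the dumping site
dump <- data.frame(X = 148, Y = 159)

p <- ggplot(birds, aes(X, Y)) +
  geom_point(alpha = 0.6) +
  geom_point(data = dump, colour = "red", shape = 4, size = 3) +
  coord_fixed(xlim = c(0, 200), ylim = c(0, 200)) +
  facet_wrap(~ English_name)   # one panel per species

print(p)
```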
vi. Segregation of Treatment & Control Groups
- Use ‘Rose Pipits’ as Treatment Group
- Use ‘Ordinary Snape’ and ‘Lesser Birchbeere’ as Control Groups
- Use ‘All Birds’ as third control
3. Geospatial Visualisation
Spatial Point Pattern Visualisation (Density-Based Measure)
i. Prepare polygon layer
- Create a 200x200 spatial polygon to depict the boundaries of Lekagul raster map
- Merge Raster Polygon with Rose Pipit Layer, using `owin` from spatstat package
ii. Kernel Density Plot
- First, set sigma=bw.diggle
- Apply the Kernel Density Plot (By Year; 2012-2017)
- For All Birds
- For Rose Pipits only (Treatment Group)
- For OS & LB only (Control Groups)
iii. Adjust Parameters (sigma)
- Adjust the plots by using the sigma of the densest cluster
- This is typically the largest sigma
iv. Fine-Tune for Clearer Visualisation
- Add the dumping site and adjust its colour/size
- This makes the clusters visible relative to the dumping site
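The density workflow above can be sketched with spatstat as follows. The yearly point patterns are simulated uniformly at random, the dumping-site coordinates are a placeholder, and taking the maximum of the per-year `bw.diggle` bandwidths is used here as a simple stand-in for "the sigma of the densest cluster"; the original analysis may have chosen the common sigma differently.

```r
library(spatstat)

set.seed(42)
win <- owin(c(0, 200), c(0, 200))   # 200 x 200 map boundary

# Simulated sightings, one point pattern per year (not the real data)
years <- 2012:2017
patterns <- lapply(years, function(y) runifpoint(60, win = win))
names(patterns) <- years

# Step 1: data-driven bandwidth per year
sigmas <- sapply(patterns, function(p) as.numeric(bw.diggle(p)))

# Step 2: reuse one common sigma so the yearly maps are comparable
sigma_common <- max(sigmas)
dens <- lapply(patterns, density, sigma = sigma_common)

# Step 3: plot one year and mark the (placeholder) dumping site
plot(dens[["2017"]], main = "Kernel density, 2017")
points(148, 159, pch = 4, col = "red", cex = 1.5)
```

Using a single sigma across years keeps the smoothing constant, so differences between panels reflect the data rather than the bandwidth.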
4. Statistical Confirmation
Spatial Point Pattern Analysis (Distance-Based Measure)
i. Quadrat Analysis
- Apply Monte Carlo Simulation
- Followed by Quadrat Test to test for clustering
ii. K-Nearest Neighbour
- Apply Monte Carlo Simulation
- Followed by Clark-Evans Test to test for clustering
iii. K-Function
- Apply Monte Carlo simulation
- Visualise significance based on grey band
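The three tests above map onto spatstat functions roughly as in this sketch. The point pattern is simulated, and the quadrat grid size and number of simulations are arbitrary choices for the example. The `method = "MonteCarlo"` argument of `clarkevans.test()` is assumed to be available, which depends on the installed spatstat version.

```r
library(spatstat)

set.seed(7)
win <- owin(c(0, 200), c(0, 200))
pipits <- runifpoint(100, win = win)   # simulated sightings, not the real data

# i. Quadrat test with a Monte Carlo p-value (5 x 5 grid is an arbitrary choice)
qt <- quadrat.test(pipits, nx = 5, ny = 5,
                   alternative = "clustered",
                   method = "MonteCarlo", nsim = 199)

# ii. Clark-Evans nearest-neighbour test against the clustered alternative
ce <- clarkevans.test(pipits, alternative = "clustered",
                      method = "MonteCarlo", nsim = 199)

# iii. K-function with a simulation envelope; the observed curve leaving
# the grey band suggests departure from complete spatial randomness
k_env <- envelope(pipits, Kest, nsim = 99, verbose = FALSE)
plot(k_env, main = "K-function with simulation envelope")

qt$p.value
ce$p.value
```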
5. Audio Processing
i. Data Preparation (Density-Based Measure)
- Read in MP3 Files (Training & Testing Data)
- Convert to .wav format using `writeWave()` (tuneR)
- Convert .wav files to data frame using `analyzeFolder()`
- Read in data frame
ii. Audio Extraction & Manipulation
- Extract one of the two channels (the left channel).
- Convert each sound array to floating point values ranging from -1 to 1.
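The channel extraction and scaling steps can be illustrated with tuneR as below. A short synthetic 16-bit stereo clip stands in for a recording, so the tones and sample rate are assumptions; the commented lines show where the MP3-to-WAV conversion would sit, with placeholder file names.

```r
library(tuneR)

# With real recordings, the conversion step would come first, e.g.
#   wav <- readMP3("<input>.mp3")
#   writeWave(wav, "<output>.wav")
# Here a synthetic 16-bit stereo clip is used instead.
sr <- 44100
t  <- seq(0, 0.5, by = 1 / sr)
left  <- as.integer(round(20000 * sin(2 * pi * 440 * t)))
right <- as.integer(round(10000 * sin(2 * pi * 880 * t)))
clip  <- Wave(left = left, right = right, samp.rate = sr, bit = 16)

# Keep only the left channel
left_only <- mono(clip, which = "left")

# Scale integer PCM samples to floating point values in [-1, 1]
samples <- left_only@left / 2^(left_only@bit - 1)

range(samples)
```

Dividing by `2^(bit - 1)` uses the bit depth stored in the `Wave` object, so the same line also works for clips that are not 16-bit.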