ISSS608 2016-17 T3 Assign Chan En Ying Grace Methodology

From Visual Analytics and Applications
Jump to navigation Jump to search

Rose Pipits.png “Mine dear rose pipits, whence did do thou vanish?”

Background

Methodology

Did Rose Pipit kicketh the bucket?

Which song belongs to thee?

Conclusion

 


Tools

R is the primary tool used in this analysis. The following lists the packages used for the project’s scope - for data cleaning, data visualisation, geospatial analysis and audio processing.

  • R libraries
    • sp
    • rgdal
    • sf
    • raster
    • spatstat
    • maptools
    • gplots
    • ggplot2
    • ggmap
    • rasterVis
    • lattice
    • latticeExtra
    • tidyverse
    • zoo
    • tmap
    • reshape2
    • quantmod
    • ggTimeSeries
    • viridis
    • rlang
    • soundgen
    • tuneR
    • phonTools
    • seewave


Approach Taken

The following outlines the approach used for the analysis.

Step

Approach

Description

1.

Data Understanding

1. Read in Raster Layer (Lekagul Roadways Map)
- It is a single layer raster file. 200x200.
class : RasterLayer
dimensions : 200, 200, 40000 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : 0, 200, 0, 200 (xmin, xmax, ymin, ymax)
coord. ref. : NA
names : Lekagul_Roadways_2018
values : 0, 255 (min, max)


2. Find out structure of Raster Layer
Extent : 40000
CRS arguments : NA
File Size : 41078
Object Size : 14376 bytes
Layer : 1

2.

Data Cleaning

1. Import two CSV Files (Birds)

  • 2081 Training Birds (Metadata)
  • 15 Test Birds (Provided by Kasios)


2. Fix Data Quality Issues

  • Change File ID from numeric to character
  • Change coordinates to numeric
  • Change Date from Character to Date
  • Omit the two NA values for the Y coordinate.
  • Clean the Dates (All standardise to m/d/y. For missing month/year, I will replace with NA. For missing day, I will impute as 1st day of the month.)
  • Clean the Timing (Standardise all to 24 hour formatting. Use “.” instead of ":")
  • Clean the Vocalisation Type (Standardise all to lower case. For values consisting of both ‘song and call’, change to ‘call’, assumed as a sign of distress while ‘song’ is assumed as the default)
  • Clean the Quality (Recode ‘no score’ as ‘NA’)


3. Data Manipulation

  • Extract out the “Year” and “Month” from the date, as new columns
  • Create a new column for Quarter (Q1,Q2,Q3,Q4) & Season (Spring, Summer, Fall, Winter)


4. Geospatial File Compatibility

  • Convert CSV file (2081 birds) into the following:
    • spatial point data frame
    • sp format
    • shp format
    • st_read compatible format
    • readOGR compatible format
    • ppp format (for spatstat compatibility)


5. Data Overview & Exploration

  • Overlay 2081 Birds, Raster Map & Dumping Site, for an integrated overview using `plot()`
  • Use `facet_wrap` to identify location of clustering across species, across time, and across season, and by call/song


6. Segregation of Treatment & Control Groups

  • Use ‘Rose Pipits’ as Treatment Group
  • Use ‘Ordinary Snape’ and ‘Lesse Birchbeere’ as Control Groups
  • Use ‘All Birds’ as third control