ISSS608 2017-18 T3 Assign Li Hongxin Methodology
Revision as of 20:28, 6 July 2018 by Hongxin.li.2017 (talk | contribs)
Tools
a. R: used for data cleaning.
Packages: tidyverse
b. Tableau: used for Map & Pattern visualization.
c. Python: used for density visualization, audio visualization and audio classification.
Packages: os, glob, pandas, numpy, matplotlib, seaborn, librosa, sklearn
Process for Data Preparation
The following are the key data cleaning and data manipulation steps that prepare the data for visualization and analysis.
Step 1: Deal with Missing Values. Replace all placeholders that stand for missing values, such as "?", "??:??" in Time, and "No score" in Quality, with NA.
Step 2: Fix Data Quality Issues. Convert all letters to uppercase for consistency, and remove extra spaces and stray "?" characters.
Step 3: Unify the Date & Time Format. Convert every Date to the "%Y-%m-%d" format and every Time to the "HH:mm" format.
Step 4: Modify Data Types. Convert the X and Y coordinates from character to int.
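The four steps above can be sketched in code. The cleaning itself was done in R with tidyverse; the version below is only an illustration in Python (also used in this project), written with the standard library so it runs on its own. The function names, the raw date layouts in DATE_FORMATS, and the choice to treat any Time containing "?" as missing are assumptions, not taken from the original script.

```python
import re
from datetime import datetime

# Step 1: placeholder strings treated as missing; None plays the role of NA.
MISSING_TOKENS = {"?", "??:??", "No score", ""}

# Assumed raw date layouts; the actual input formats are not stated in the text.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def clean_text(value):
    """Steps 1-2: map placeholders to None, drop '?', collapse spaces, uppercase."""
    if value is None:
        return None
    value = value.strip()
    if value in MISSING_TOKENS:
        return None
    value = re.sub(r"\s+", " ", value.replace("?", "")).strip()
    return value.upper() or None


def clean_date(value):
    """Step 3: return the date as '%Y-%m-%d', or None if it cannot be parsed."""
    value = clean_text(value)
    if value is None:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def clean_time(value):
    """Step 3: return the time as zero-padded 'HH:mm', or None.

    Assumption: a time with any '?' in it is treated as missing.
    """
    if value is None or "?" in value:
        return None
    match = re.fullmatch(r"(\d{1,2}):(\d{2})", value.strip())
    if not match:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    if hour > 23 or minute > 59:
        return None
    return f"{hour:02d}:{minute:02d}"


def clean_coordinate(value):
    """Step 4: convert a character coordinate to int, or None if not numeric."""
    value = clean_text(value)
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def clean_record(record):
    """Apply all four steps to one row given as a dict of strings."""
    return {
        "Date": clean_date(record.get("Date")),
        "Time": clean_time(record.get("Time")),
        "Quality": clean_text(record.get("Quality")),
        "X": clean_coordinate(record.get("X")),
        "Y": clean_coordinate(record.get("Y")),
    }
```

A row can be cleaned with clean_record({"Date": "6/7/2018", "Time": "8:05", "Quality": "No score", "X": "12", "Y": "?"}); every value that cannot be interpreted comes back as None rather than raising an error, which mirrors replacing bad entries with NA.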
Pattern Visualization and Analysis
b
Audio Visualization and Classification
c