Difference between revisions of "ISSS608 2017-18 T3 Assign Lu Yanzhang Data Preparation Methodology"

From Visual Analytics and Applications
 
(9 intermediate revisions by the same user not shown)
Line 45: Line 45:
 
For further use, the timestamp data needs to be converted from the raw seconds format to the '''YYYY/MM/DD''' date format.
  
==Audio File Processing==
+
==Join operation among diverse tables in JMP==
1. To process the audio files, the following R packages are loaded in this assignment: soundgen, tuneR, seewave.
 
  
2. Because the function analyzeFolder(), which converts audio files into a dataframe, can only read the WAV format, I first convert all the MP3 audio files to WAV format.
+
Join the tables, keep the records whose source or target is flagged as “suspicious”, and extract these suspicious transactions for further visualization in Tableau and social network analysis in Gephi.
  
3. Not all the audio files are of good quality. Some contain noise that would interfere with the audio classification task, so only the audio files whose quality is 'A' are extracted.
+
==Social network modeling in Gephi==
 +
Import the suspicious data file into Gephi and model the data with two methodologies:
  
4. Call analyzeFolder() to read all the WAV files into a dataframe, and store the dataframe in CSV format. The audio files in both All Birds and Test Birds from Kasios must be processed as above.
+
1. Eigenvector centrality to measure vertex importance.
  
The actual code for audio processing can be found at [https://github.com/runyu/B6_Visual_Analytics/blob/master/Assignment_MC1_audio_preparation.Rmd here]
+
2. Modularity to detect clusters (communities).
 +
 
 +
==Visualization in Tableau==
 +
 
 +
1. Visualize the communication table by day and by month to interpret the growth from 2015 to 2017.
 +
 
 +
2. Visualize the activities of the suspicious staff members.

Latest revision as of 14:52, 10 July 2018

MC3 2018.jpg

VAST Challenge 2018 MC3:
Who hurts the bird?

INTRODUCTION

DATA PREPARATION & METHODOLOGY

OBSERVATION AND INSIGHTS


 


Tools

The following tools have been used in this assignment:

1. Python - Timestamp transformation and new data source generation.

The following packages are used in this assignment: pandas, numpy, glob, datetime.

2. JMP Pro - Data preparation

3. Tableau - Visualization

4. Gephi - Social network modeling and visualization

Timestamp Transformation in Python

The raw timestamps are recorded as the number of seconds elapsed since '2015-05-11 14:00:00'.

For further use, the timestamp data needs to be converted from the raw seconds format to the YYYY/MM/DD date format.
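
The conversion can be sketched with the standard datetime module as follows. This is a minimal illustration rather than the code used in the assignment: the function name seconds_to_date and the sample offsets are hypothetical, and only the reference point '2015-05-11 14:00:00' comes from the text above.

```python
from datetime import datetime, timedelta

# Reference point stated in the text; raw values are seconds elapsed since it.
EPOCH = datetime(2015, 5, 11, 14, 0, 0)


def seconds_to_date(raw_seconds):
    """Convert a raw offset in seconds to a 'YYYY/MM/DD' string."""
    return (EPOCH + timedelta(seconds=raw_seconds)).strftime("%Y/%m/%d")


# Hypothetical sample offsets, for illustration only.
for raw in (0, 36000, 86400):
    print(raw, "->", seconds_to_date(raw))
```

The same idea carries over to a pandas column, where the offsets would be added to the reference point before formatting.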

Join operation among diverse tables in JMP

Join the tables, keep the records whose source or target is flagged as “suspicious”, and extract these suspicious transactions for further visualization in Tableau and social network analysis in Gephi.
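
The selection logic behind this join can be illustrated in plain Python. The assignment performs the step in JMP, so the snippet below is only a sketch of the idea: the field names (source, target, timestamp), the ID values, and the helper select_suspicious are all assumptions made for the example.

```python
# Hypothetical staff IDs flagged as suspicious (illustrative values only).
suspicious_ids = {101, 202}

# Hypothetical communication records; the field names are assumptions.
records = [
    {"source": 101, "target": 7, "timestamp": 120},
    {"source": 5, "target": 6, "timestamp": 450},
    {"source": 8, "target": 202, "timestamp": 900},
]


def select_suspicious(rows, flagged):
    """Keep rows whose source or target is in the flagged set."""
    return [r for r in rows if r["source"] in flagged or r["target"] in flagged]


suspicious_rows = select_suspicious(records, suspicious_ids)
print(len(suspicious_rows), "of", len(records), "records involve a flagged ID")
```

Matching on either endpoint keeps both outgoing and incoming contacts of a flagged person, which is what a network view of the suspicious group needs.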

Social network modeling in Gephi

Import the suspicious data file into Gephi and model the data with two methodologies:

1. Eigenvector centrality to measure vertex importance.

2. Modularity to detect clusters (communities).
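
To make the two measures concrete, the sketch below computes them by hand on a small toy graph. It is not Gephi's implementation: the centrality is a simple power iteration, and the modularity function only scores a partition that is already given, whereas Gephi searches for a good partition itself (its modularity statistic is based on the Louvain method). The graph, node names, and function names are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration on (A + I); scores are scaled so the maximum is 1."""
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # Adding the node's own score (the "+ I" shift) avoids oscillation
        # on bipartite graphs without changing the eigenvectors.
        new = {n: scores[n] + sum(scores[m] for m in adj[n]) for n in adj}
        top = max(new.values())
        new = {n: value / top for n, value in new.items()}
        converged = max(abs(new[n] - scores[n]) for n in adj) < tol
        scores = new
        if converged:
            break
    return scores


def modularity(adj, communities):
    """Newman modularity Q of a given partition of an undirected graph."""
    m = sum(len(neigh) for neigh in adj.values()) / 2  # number of edges
    q = 0.0
    for group in communities:
        internal = sum(1 for n in group for nb in adj[n] if nb in group) / 2
        degree_sum = sum(len(adj[n]) for n in group)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Toy graph: two triangles joined by the bridge c-d (illustrative only).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = build_adjacency(edges)
centrality = eigenvector_centrality(adj)
q = modularity(adj, [{"a", "b", "c"}, {"d", "e", "f"}])
print(sorted(centrality, key=centrality.get, reverse=True)[:2], round(q, 4))
```

In this toy graph the two bridge endpoints receive the highest centrality, and splitting the graph into its two triangles yields a positive modularity, which is the kind of structure the Gephi analysis looks for in the suspicious network.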

Visualization in Tableau

1. Visualize the communication table by day and by month to interpret the growth from 2015 to 2017.

2. Visualize the activities of the suspicious staff members.
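
The daily and monthly views rest on a simple aggregation of record counts, which can be sketched as follows. Tableau performs this grouping itself; the snippet only illustrates the idea, and the raw offsets are hypothetical values measured in seconds from the reference point '2015-05-11 14:00:00'.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(2015, 5, 11, 14, 0, 0)  # reference point given in the text

# Hypothetical raw offsets in seconds, for illustration only.
raw_offsets = [0, 3600, 90000, 2700000, 2700500]

moments = [EPOCH + timedelta(seconds=s) for s in raw_offsets]

# Count records per calendar day and per calendar month.
per_day = Counter(t.strftime("%Y/%m/%d") for t in moments)
per_month = Counter(t.strftime("%Y/%m") for t in moments)

for month, count in sorted(per_month.items()):
    print(month, count)
```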