ISSS608 2017-18 T3 Assign Aakanksha Kumari Data Preparation

From Visual Analytics and Applications

MC Challenge 3


Data Set

The Kasios Insider has provided data from across the company. There are call records, emails, purchases, and meetings. The data only includes the source of each transaction, the recipient (destination), and the time of the transaction. Contents of emails or phone calls are not available.

Dataset         Description                              Size
calls.csv       Information on 10.6 million calls        251 MB uncompressed
emails.csv      Information on 14.6 million emails       345 MB uncompressed
purchases.csv   Information on 762 thousand purchases    18.8 MB uncompressed
meetings.csv    Information on 127 thousand meetings     3.26 MB uncompressed


There are four data files that contain information about individuals that the Insider has indicated as suspicious:
[Image: D2.PNG]
All provided data files have the same format. The data are provided in comma-separated format with four columns:
[Image: D3.PNG]

Tools

  • Python for Data Cleaning
  • Excel for Data Cleaning
  • Gephi for Network Visualization
  • Tableau Desktop for Visualization


Data Prep

The Time field in all the CSV files was converted from seconds to a standard date-time format, using May 11, 2015 at 14:00 as the baseline (time zero). The conversion was done with Python's datetime module and the pandas library.
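
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used for the assignment. The column names (Source, Etype, Destination, Time), the helper add_timestamp, and the sample rows are assumptions made for the example, since the original column layout is only shown in an image.

```python
from datetime import datetime

import pandas as pd

# Baseline stated in the text: May 11, 2015 at 14:00.
BASELINE = datetime(2015, 5, 11, 14, 0, 0)

# Assumed column names (hypothetical; the real layout is only shown in an image).
COLUMNS = ["Source", "Etype", "Destination", "Time"]


def add_timestamp(df, seconds_col="Time", out_col="Timestamp"):
    """Return a copy of df with seconds-since-baseline converted to datetimes."""
    out = df.copy()
    # Interpret the seconds column as an offset and add it to the baseline.
    offsets = pd.to_timedelta(out[seconds_col], unit="s")
    out[out_col] = pd.Timestamp(BASELINE) + offsets
    return out


# Made-up rows standing in for one of the CSV files.
sample = pd.DataFrame(
    [[1, 0, 2, 0], [3, 0, 4, 3600], [5, 0, 6, 86400]],
    columns=COLUMNS,
)
converted = add_timestamp(sample)
print(converted)
```

For the real files, the same helper could be applied to a frame loaded with something like pd.read_csv("calls.csv", names=COLUMNS), assuming the files have no header row. Keeping the original seconds column alongside the new Timestamp column makes it easy to cross-check the conversion later in Excel or Tableau.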