ISSS608 2017-18 T3 Assign Pooja Manohar Sawant Data Preparation

From Visual Analytics and Applications

About the data

The data provided by the insider consists of 10 CSV files, listed below.

Index | File name | Number of Rows | Description
1 | calls.csv | 10,606,835 | Call details for the entire organization
2 | email.csv | 14,550,085 | Email details for the entire organization
3 | meeting.csv | 127,351 | Meeting details for the entire organization
4 | purchaces.csv | 762,200 | Purchase details for the entire organization
5 | CompanyIndex.csv | 642,631 | Company employee ID and name list
6 | Suspicious_calls.csv | 70 | Call details involving a suspicious group of people within the organization
7 | Suspicious_emails.csv | 61 | Email details involving a suspicious group of people within the organization
8 | Suspicious_meetings.csv | 1 | Meeting details involving a suspicious group of people within the organization
9 | Suspicious_purchases.csv | 5 | Purchase details involving a suspicious group of people within the organization
10 | Other_suspicious_purchases.csv | 7 | A list of 4 individuals who made 7 suspicious purchases
Table 1 - Details of input files


All of the above files except CompanyIndex.csv have the following four fields:

  1. Source (contains the company ID# for the person who called, sent an email, purchased something, or invited people to a meeting)
  2. Etype (contains a number designating what kind of connection is made)
    1. 0 is for calls
    2. 1 is for emails
    3. 2 is for purchases
    4. 3 is for meetings
  3. Destination (contains the company ID# for the person who is receiving a call, receiving an email, selling something to a buyer, or being invited to a meeting)
  4. Time stamp – in seconds starting on May 11, 2015 at 14:00.
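
The four fields listed above can be read into typed records with a short script. The following is a minimal sketch, not the method used in this assignment: it assumes each file stores headerless, comma-separated rows in the order Source, Etype, Destination, Time stamp, and it treats the May 11, 2015 14:00 start time as a naive datetime because the time zone is not stated. The names `Event`, `read_events`, and `ETYPE_LABELS`, and the sample rows, are hypothetical and only for illustration.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import Iterator, NamedTuple, TextIO

# Start of the time axis as stated in the field list (time zone unknown, so naive).
EPOCH = datetime(2015, 5, 11, 14, 0, 0)

# Etype codes as given in the field list.
ETYPE_LABELS = {0: "call", 1: "email", 2: "purchase", 3: "meeting"}


class Event(NamedTuple):
    source: int        # company ID of the initiator
    etype: str         # human-readable connection type
    destination: int   # company ID of the recipient
    when: datetime     # time stamp converted to a calendar datetime


def read_events(handle: TextIO) -> Iterator[Event]:
    """Yield one Event per non-empty row.

    Assumes headerless rows of Source, Etype, Destination, Time stamp;
    adjust if the real files carry a header line.
    """
    for row in csv.reader(handle):
        if not row:
            continue
        source, etype, destination, seconds = (int(value) for value in row[:4])
        yield Event(
            source,
            ETYPE_LABELS[etype],
            destination,
            EPOCH + timedelta(seconds=seconds),
        )


# Made-up sample rows, not taken from the real files.
sample = io.StringIO("101,0,202,0\n101,1,303,86400\n")
events = list(read_events(sample))
print(events[1].when)  # 2015-05-12 14:00:00
```

For a real file, the in-memory sample could be replaced with `open("calls.csv", newline="")`. Because the function yields rows one at a time, the multi-million-row files do not need to be held in memory at once.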