ISSS608 2017-18 T3 Assign Pooja Manohar Sawant Data Preparation

From Visual Analytics and Applications
Revision as of 02:58, 8 July 2018 by Poojams.2017

About the data

The data provided by the insider consists of 10 CSV files, listed below.

Index | File name | Number of Rows | Description
1 | calls.csv | 10,606,835 | Call details for the entire organization
2 | email.csv | 14,550,085 | Email details for the entire organization
3 | meeting.csv | 127,351 | Meeting details for the entire organization
4 | purchaces.csv | 762,200 | Purchase details for the entire organization
5 | CompanyIndex.csv | 642,631 | Company employee ID and name list
6 | Suspicious_calls.csv | 70 | Call details involving the suspicious group of people within the organization
7 | Suspicious_emails.csv | 61 | Email details involving the suspicious group of people within the organization
8 | Suspicious_meetings.csv | 1 | Meeting details involving the suspicious group of people within the organization
9 | Suspicious_purchases.csv | 5 | Purchase details involving the suspicious group of people within the organization
10 | Other_suspicious_purchases.csv | 7 | A list of 4 individuals who made 7 suspicious purchases
Table 1 - Details of input files


All of the above files except CompanyIndex.csv have the following four fields:

  1. Source (contains the company ID# for the person who called, sent an email, purchased something, or invited people to a meeting)
  2. Etype (contains a number designating what kind of connection is made)
    1. 0 is for calls
    2. 1 is for emails
    3. 2 is for purchases
    4. 3 is for meetings
  3. Destination (contains the company ID# for the person who is receiving a call, receiving an email, selling something to a buyer, or being invited to a meeting)
  4. Time stamp (in seconds, counted from May 11, 2015 at 14:00)
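
The four fields above can be turned into readable records with a few lines of Python. The following is a minimal sketch, not the method used in this assignment. It assumes that the files have no header row, that the columns appear in the order listed (Source, Etype, Destination, Time stamp), and that the IDs and time stamps are integers. The time zone of the reference instant is not stated, so a naive datetime is used. The names `EPOCH`, `ETYPE_LABELS` and `parse_rows` are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io
from datetime import datetime, timedelta

# Reference instant from the field list: May 11, 2015 at 14:00.
# Assumption: no time zone is given, so this is a naive datetime.
EPOCH = datetime(2015, 5, 11, 14, 0)

# Etype codes as listed in the field description.
ETYPE_LABELS = {0: "call", 1: "email", 2: "purchase", 3: "meeting"}


def parse_rows(handle):
    """Yield one dict per record from a header-less, four-column CSV stream."""
    for source, etype, destination, seconds in csv.reader(handle):
        yield {
            "source": int(source),
            "etype": ETYPE_LABELS[int(etype)],
            "destination": int(destination),
            # Convert the offset in seconds to a calendar date and time.
            "time": EPOCH + timedelta(seconds=int(seconds)),
        }


# Made-up sample rows for illustration only; not taken from the real files.
sample = io.StringIO("101,0,202,0\n303,2,404,86400\n")
records = list(parse_rows(sample))
for record in records:
    print(record)
```

To read one of the real files, the in-memory sample could be replaced by `open("calls.csv", newline="")`. If the files do carry a header row, it would have to be skipped first.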