Difference between revisions of "ISSS608 2017-18 T3 Assign Pooja Manohar Sawant Data Preparation"
Detecting Suspicious Activities at Kasios International
Revision as of 03:09, 8 July 2018
About the data
The data provided by the insider consists of 10 CSV files, listed below.
Index | File name | Number of Rows | Description |
---|---|---|---|
1 | calls.csv | 10,606,835 | Call details for the entire organization |
2 | email.csv | 14,550,085 | Email details for the entire organization |
3 | meeting.csv | 127,351 | Meeting details for the entire organization |
4 | purchaces.csv | 762,200 | Purchase details for the entire organization |
5 | CompanyIndex.csv | 642,631 | List of company employee IDs and names |
6 | Suspicious_calls.csv | 70 | Call details involving the suspicious group of people within the organization |
7 | Suspicious_emails.csv | 61 | Email details involving the suspicious group of people within the organization |
8 | Suspicious_meetings.csv | 1 | Meeting details involving the suspicious group of people within the organization |
9 | Suspicious_purchases.csv | 5 | Purchase details involving the suspicious group of people within the organization |
10 | Other_suspicious_purchases.csv | 7 | A list of 4 individuals who made 7 suspicious purchases |
Table 1 - Details of input files
All of the above files except CompanyIndex.csv have the following four fields:
- Source (contains the company ID# for the person who called, sent an email, purchased something, or invited people to a meeting)
- Etype (contains a number designating what kind of connection is made)
  - 0 is for calls
  - 1 is for emails
  - 2 is for purchases
  - 3 is for meetings
- Destination (contains the company ID# for the person who is receiving a call, receiving an email, selling something to a buyer, or being invited to a meeting)
- Time stamp (in seconds, counted from May 11, 2015 at 14:00)
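The four-field layout above can be illustrated with a short Python sketch. This is not part of the assignment workflow, which used JMP Pro for data preparation; it only shows how one record could be decoded. It assumes that the files have no header row, that the columns appear in the order Source, Etype, Destination, Time stamp, and that company IDs are integers. The names `EPOCH`, `ETYPE_LABELS`, `parse_record` and `read_records` are hypothetical, and the sample rows are made up.

```python
import csv
import io
from datetime import datetime, timedelta

# Reference instant from the field description: May 11, 2015 at 14:00.
EPOCH = datetime(2015, 5, 11, 14, 0, 0)

# Etype codes as listed in the field description.
ETYPE_LABELS = {0: "call", 1: "email", 2: "purchase", 3: "meeting"}


def parse_record(row):
    """Turn one [source, etype, destination, seconds] row into a dict.

    Assumption: all four values are integers stored as text.
    """
    source, etype, destination, seconds = (int(value) for value in row)
    return {
        "source": source,
        "etype": ETYPE_LABELS[etype],
        "destination": destination,
        # Seconds are counted from EPOCH, so add them as an offset.
        "timestamp": EPOCH + timedelta(seconds=seconds),
    }


def read_records(handle):
    """Yield parsed records from a header-less CSV file object (assumed layout)."""
    for row in csv.reader(handle):
        if row:  # skip blank lines
            yield parse_record(row)


# Made-up sample rows, not taken from the real files.
sample = io.StringIO("101,0,202,0\n303,2,404,86400\n")
records = list(read_records(sample))
for record in records:
    print(record)
```

In this sample the second row has a time stamp of 86,400 seconds, which is exactly one day after the reference instant, so it maps to May 12, 2015 at 14:00. For the real files, `read_records` could be given an opened file such as `open("calls.csv", newline="")`, provided the layout assumptions hold.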
Tools used for data preparation and visualization
- JMP Pro – for data cleaning and transformation
- Tableau – to visualize how communication and purchasing patterns evolved over a period of two and a half years at Kasios International
- Gephi – to visualize how the suspected employees of Kasios International are connected with each other and with other employees in the organization