From Visual Analytics and Applications
		
		
		
		
		
		
 
Unraveling the Secrets of Kasios : VAST Mini Challenge 3 
 
Data Set
| 
The Kasios Insider has provided data from across the company. There are call records, emails, purchases, and meetings. The data only includes the source of each transaction, the recipient (destination), and the time of the transaction. Contents of emails or phone calls are not available.
 
| Dataset | Description | Size |  
| calls.csv | Information on 10.6 million calls | 251 MB uncompressed |  
| emails.csv | Information on 14.6 million emails | 345 MB uncompressed |  
| purchases.csv | Information on 762 thousand purchases | 18.8 MB uncompressed |  
| meetings.csv | Information on 127 thousand meetings | 3.26 MB uncompressed |  
  
Four data files contain information about individuals that the Insider has indicated as suspicious; a fifth file, Other_suspicious_purchases.csv, is provided for Question 4:
 
| Dataset | Description | Size |  
| Suspicious_calls.csv | Information on suspicious calls | 1.76 KB uncompressed |  
| Suspicious_emails.csv | Information on suspicious emails | 1.55 KB uncompressed |  
| Suspicious_purchases.csv | Information on suspicious purchases | 27 B uncompressed |  
| Suspicious_meetings.csv | Information on suspicious meetings | 130 B uncompressed |  
| Other_suspicious_purchases.csv | List of 4 individuals who made 7 suspicious purchases (for Question 4) | 378 B uncompressed |  
  
All provided data files have the same format. The data are provided in comma-separated format with four columns: 
 
| Column Name | Description |  
| Source | Contains the company ID# for the person who called, sent an email, purchased something, or invited people to a meeting |  
| Etype | Contains a number designating what kind of connection is made: 0 for calls, 1 for emails, 2 for purchases, 3 for meetings |  
| Destination | Contains company ID# for the person who is receiving a call, receiving an email, selling something to a buyer, or being invited to a meeting |  
| Time stamp | In seconds starting on May 11, 2015 at 14:00. |  | 
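As a rough illustration of the four-column layout described above, the following sketch parses a few rows with Python's standard `csv` module. The sample rows are made up, and the absence of a header row, the column order (Source, Etype, Destination, Time stamp), and the names `ETYPE_LABELS` and `parse_rows` are assumptions for illustration, not details confirmed by the data set.

```python
import csv
import io

# Etype codes as listed in the column description above.
ETYPE_LABELS = {0: "call", 1: "email", 2: "purchase", 3: "meeting"}


def parse_rows(handle):
    """Yield one dict per record.

    Assumes no header row and the column order
    Source, Etype, Destination, Time stamp (an assumption).
    """
    for source, etype, destination, seconds in csv.reader(handle):
        yield {
            "source": int(source),
            "etype": ETYPE_LABELS[int(etype)],
            "destination": int(destination),
            "seconds": int(seconds),
        }


# Made-up sample rows, not taken from the real files.
sample = io.StringIO("101,0,202,3600\n303,2,404,86400\n")
records = list(parse_rows(sample))
print(records[0]["etype"], records[1]["etype"])
```

If the real files do carry a header row, `csv.DictReader` would likely be a better fit than positional unpacking.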
Tools
|  
Python for Data Cleaning
Excel for Data Cleaning
Gephi for Network Visualization
Tableau Desktop for Visualization
 | 
Data Cleaning
|  
The time stamps in all the CSV files were converted from seconds to a standard date-time format, using May 11, 2015 at 14:00 as the baseline.
The relative times were converted to absolute date-times using Python's datetime module and the pandas library.
 
 |
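The conversion described in Data Cleaning can be sketched as follows. This is a minimal illustration using only the standard `datetime` module, not the exact script used for this project; the names `BASELINE` and `to_absolute` are hypothetical, and the sketch assumes the time stamps carry no time-zone information.

```python
from datetime import datetime, timedelta

# Baseline stated in the data description: May 11, 2015 at 14:00.
BASELINE = datetime(2015, 5, 11, 14, 0, 0)


def to_absolute(seconds):
    """Convert seconds elapsed since the baseline to an absolute datetime."""
    return BASELINE + timedelta(seconds=seconds)


print(to_absolute(0))      # 2015-05-11 14:00:00
print(to_absolute(86400))  # 2015-05-12 14:00:00
```

With pandas, a whole column can typically be converted in one step by adding `pd.to_timedelta(column, unit="s")` to a `pd.Timestamp` for the baseline, which is likely to be faster on files with millions of rows.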