Fu Yi - Data Preparation
Data Preparation Question 1
a) Add titles
Open the 4 large tables (calls, emails, purchases, meetings) in Excel. Add a title for each column (source, eType, target, time) in each of the 4 tables.
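The same header step could also be scripted instead of done by hand in Excel. The following is only a sketch in Python, not the procedure actually used: it assumes each table is a comma-separated text file with no header row, and the function name `add_header` is hypothetical.

```python
import csv
import io

# Column titles taken from step a).
COLUMNS = ["source", "eType", "target", "time"]

def add_header(raw_text):
    """Return the CSV text with the column titles prepended.

    Assumption: raw_text is comma-separated and has no header row yet.
    """
    reader = csv.reader(io.StringIO(raw_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)   # write the title row first
    writer.writerows(reader)   # then copy every data row unchanged
    return out.getvalue()
```

For example, `add_header("1,0,2,30\n")` returns the same row preceded by a `source,eType,target,time` line.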
b) Change date
Import the tables into JMP. The real time should start from 11/05/2015 (11 May 2015), 14:00, so I created 2 new columns holding 11/05/2015 and 14:00 respectively, and combined the old time, the date, and the time of day to get the correct date.
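The date correction in step b) can be sketched in Python as follows. This is an illustration, not the JMP formula that was used: it assumes the old time column holds the number of seconds elapsed since the start of the data, which the text does not state, and the function name `to_timestamp` is hypothetical.

```python
from datetime import datetime, timedelta

# Start of the data according to step b): 11 May 2015, 14:00.
START = datetime(2015, 5, 11, 14, 0)

def to_timestamp(old_time):
    """Convert an old time value into a calendar timestamp.

    Assumption: old_time is an offset in seconds from START.
    """
    return START + timedelta(seconds=old_time)
```

Under that assumption, an old time of 0 maps to 11/05/2015 14:00 and an old time of 3600 maps to 11/05/2015 15:00.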
c) No duplication
I checked the summary of each table and eliminated any duplicate rows.
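Removing duplicates can also be expressed in code. The sketch below treats two rows as duplicates only when all four columns (source, eType, target, time) match exactly; that definition is an assumption, as is the hypothetical function name `drop_duplicates`.

```python
def drop_duplicates(rows):
    """Return rows with exact repeats removed, keeping the first occurrence.

    Assumption: a duplicate is a row whose values all match an earlier row.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)       # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Comparing `len(rows)` with `len(drop_duplicates(rows))` gives the number of duplicate rows in a table.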
d) Clear out incomplete month
The data start in May 2015, but the first 2 months are incomplete. I deleted the data for these 2 months (May and June 2015) so that the dataset covers a complete cycle. The final 4 tables are:
- Calls table: 10,091,409 rows
- Emails table: 13,846,639 rows
- Purchases table: 723,586 rows
- Meetings table: 127,110 rows
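
The filter in step d) amounts to keeping only rows dated 1 July 2015 or later. A minimal sketch in Python, assuming each row is a dictionary whose `time` field already holds a corrected `datetime` value (the row layout and the function name `drop_incomplete_months` are hypothetical):

```python
from datetime import datetime

# First moment after the two incomplete months (May and June 2015).
CUTOFF = datetime(2015, 7, 1)

def drop_incomplete_months(rows):
    """Keep only the rows whose timestamp is on or after 1 July 2015."""
    return [row for row in rows if row["time"] >= CUTOFF]
```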