ISSS608 2017-18 T3 Assign Jyoti Bukkapatil Methodology & Dashboard Design
Tools Used
I used the following four tools for data analysis and visualization.
- JMP Pro 13 - Used for Data Preparation
- Tableau 2018.1 - Used to create a Calendar View and Time Series Graph for all transactions in the company
- Gephi 0.9.2 - Used to create Network Graph
- Microsoft Excel - Used to map Employee ID with Name
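The ID-to-name mapping was done in Excel. As a rough equivalent, the same lookup could be sketched in Python as below. The column names `ID` and `Name` for the index file are assumptions (the actual headers are not shown here); `Source`/`Target` and `Id`/`Label` are the column names Gephi recognises when importing edge and node spreadsheets.

```python
def build_node_list(index_rows, edge_rows):
    """Build a Gephi-style node list for every ID that appears in the edges.

    index_rows: dicts with hypothetical keys "ID" and "Name".
    edge_rows:  dicts with keys "Source" and "Target".
    """
    # Lookup table from employee ID to employee name.
    names = {row["ID"]: row["Name"] for row in index_rows}

    # Every ID that occurs at either end of an edge becomes a node.
    ids = sorted({e["Source"] for e in edge_rows} | {e["Target"] for e in edge_rows})

    # IDs missing from the index are kept and labelled "Unknown".
    return [{"Id": i, "Label": names.get(i, "Unknown")} for i in ids]


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    index = [{"ID": "1", "Name": "Ann"}, {"ID": "2", "Name": "Bob"}]
    edges = [{"Source": "1", "Target": "2"}, {"Source": "2", "Target": "3"}]
    for node in build_node_list(index, edges):
        print(node)
```

In practice the rows would be read from the index and edge CSV files with `csv.DictReader` and the result written out with `csv.DictWriter`.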
Methodology
Three visualisation methods were used to analyse and visualise the data provided by the insider:
- Time Series Plot
- Calendar Plot
- Network Graph
Time Series Graph
A time series graph was created for the four big data files to visualize the monthly and hourly patterns of the company's communication and purchase habits. I used Tableau to create the time series plots. To make the above plot, the fields added to Tableau rows and columns and the filters applied are shown below:
Below are the fields added to the Tableau Marks card and the colour legend used for the graph. The same colour legend is also used for the hourly time series plot.
The hourly time series was created with a single vertical axis. It was later added to the tooltip of the calendar view. Below is the combined hourly graph for all four transaction types.
To make the above plot, the fields added to Tableau rows and columns and the filters applied are shown below:
The Tooltip filter in Figure 8 is generated automatically because this graph is used in the tooltip of the calendar plot. Below are the color legends used for both the monthly and hourly time series graphs.
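The monthly and hourly aggregation that Tableau performs for these graphs can be approximated outside Tableau. The sketch below is only illustrative: the record layout `(kind, timestamp)` and the labels `"emails"` and `"purchases"` are placeholders, not the actual file or field names.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(records):
    """Count records per (kind, year, month) for the monthly time series."""
    return Counter((kind, ts.year, ts.month) for kind, ts in records)


def hourly_counts(records):
    """Count records per (kind, hour of day) for the hourly time series."""
    return Counter((kind, ts.hour) for kind, ts in records)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        ("emails", datetime(2015, 11, 2, 9, 15)),
        ("emails", datetime(2015, 11, 20, 9, 45)),
        ("purchases", datetime(2015, 12, 1, 14, 0)),
    ]
    print(monthly_counts(sample))
    print(hourly_counts(sample))
```

Keeping the transaction kind in the key mirrors the use of one colour per transaction type on a shared axis.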
Calendar Plot
The calendar plot is a monthly calendar representation that shows the number of interactions on a daily basis. Data from May 2015 to Oct 2015 was excluded to better visualize the trend over the following two years and two months, so the calendar view actually runs from 2015 Q4 to 2017 Q4.
To make the above plot, the fields added to Tableau rows and columns and the filters applied are shown below:
As shown above, the DateTime field was modified to show the month, week and weekdays in the respective columns and rows. The Marks card and Tooltip configuration in Tableau for this chart is shown below:
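To make the date handling concrete, the following sketch reproduces the idea behind the calendar view: drop everything before November 2015, count interactions per day, and split each date into month, week and weekday. The cutoff of 1 Nov 2015 is inferred from the exclusion of May to Oct 2015, and ISO week numbers are used as a stand-in, so the week values may differ from Tableau's own week numbering.

```python
from collections import Counter
from datetime import date, datetime

CUTOFF = date(2015, 11, 1)  # assumption: first day kept in the calendar view
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def calendar_cells(timestamps):
    """Return one dict per day with its month, week, weekday and count."""
    # Daily interaction counts, ignoring everything before the cutoff.
    daily = Counter(ts.date() for ts in timestamps if ts.date() >= CUTOFF)

    cells = []
    for day, count in sorted(daily.items()):
        cells.append({
            "month": (day.year, day.month),
            "week": day.isocalendar()[1],       # ISO week, not Tableau's
            "weekday": WEEKDAYS[day.weekday()],
            "count": count,
        })
    return cells


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    stamps = [
        datetime(2015, 10, 30, 8, 0),   # before the cutoff, dropped
        datetime(2015, 11, 1, 9, 0),
        datetime(2015, 11, 1, 17, 30),
    ]
    for cell in calendar_cells(stamps):
        print(cell)
```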
Network Graph
The network graph represents the communications and purchases that took place among suspicious employees and other employees in the company. The tool used to create the network graph is Gephi 0.9.2. Below are the steps used to load the edge and node files into the tool. Two different workspaces were created in Gephi.
1. To create the workspace "Suspicious Only", the Suspicious_All.csv file was loaded as the edge file. The node list for suspicious employees was created by mapping IDs to names from the CompanyIndex.csv file provided in the data sources. The node file is named “Suspicious_Node list.csv”. In total, 20 nodes and 137 edges (suspicious connections) were loaded. This network graph represents the suspicious interactions between the 20 suspicious employees.
I colored the suspicious nodes by modularity class, which represents the group they belong to in the organization.
2. A second workspace was created for the total interactions of suspicious employees with other employees in the company. The edge file Suspicious Association Total.csv was used for this graph. The node list for suspicious employees was created by mapping IDs to names from the CompanyIndex.csv file provided in the data sources. The node file is named “Suspicious Association Node list.csv”. The size of all suspicious nodes was fixed at the maximum of the range (500) so that they stand out in bigger group interactions.
3. To find other employees who might have been closely associated with the 20 suspicious employees, the network graph in Figure 18 was filtered on betweenness centrality and the data was copied to a new workspace for further analysis. Details of these findings are described in the "Observations and Insight" section.
Fixed color legends are used throughout the network graph analysis. Node colors are based on the modularity class of the suspicious employees' organizational group in the company (from the network graph in Figure 18), and edge colors represent the communication mode.
4. To analyze data over time intervals, I grouped the interactions of each month together and created a start and end date for each group. For example, for all interactions that happened in Nov 2015, the end date is 1st Dec 2015.
5. To analyze data over time I used two filters: first a dynamic time interval filter, and then, after filtering edges for a particular time interval, I recalculated degree and added a degree subfilter to remove all nodes with degree 0, i.e. those without any connections.
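Steps 4 and 5 can be sketched in plain Python as follows. This is not Gephi's implementation, only an approximation of the idea: each interaction gets a monthly interval, edges overlapping the chosen window are kept, degree is recomputed on those edges, and nodes left with degree 0 are dropped. The start date being the first day of the month is an assumption (only the end date is stated above), and the edge tuple layout is hypothetical.

```python
from datetime import date


def month_interval(d):
    """Return (start, end) for the month of d; end is the 1st of the next month."""
    start = date(d.year, d.month, 1)  # assumption: interval starts on the 1st
    if d.month == 12:
        end = date(d.year + 1, 1, 1)
    else:
        end = date(d.year, d.month + 1, 1)
    return start, end


def filter_window(nodes, edges, win_start, win_end):
    """Keep edges overlapping the window and nodes with degree > 0.

    edges: tuples of (source, target, start, end).
    """
    # Time interval filter: an edge stays if its interval overlaps the window.
    kept = [e for e in edges if e[2] < win_end and e[3] > win_start]

    # Recompute degree using only the edges that survived the filter.
    degree = {n: 0 for n in nodes}
    for source, target, _, _ in kept:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1

    # Degree subfilter: drop nodes without any remaining connection.
    return kept, {n: d for n, d in degree.items() if d > 0}


if __name__ == "__main__":
    # Made-up nodes and interactions, for illustration only.
    nov = month_interval(date(2015, 11, 17))
    dec = month_interval(date(2015, 12, 3))
    edges = [("A", "B", *nov), ("B", "C", *dec)]
    print(filter_window(["A", "B", "C", "D"], edges, *nov))
```

Recomputing degree after the time filter matters because a node can be well connected overall yet have no interactions inside a given month.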
Dashboard Design
- Tableau Dashboard:
The Tableau dashboard was designed to give a single-picture view of communication and purchase patterns over the two-and-a-half-year period. The view shows patterns based on monthly, daily and hourly transactions.
- Interactive network graphs were exported using the SigmaExporter plugin for Gephi.