ISSS608 2017-18 T3 Assign Joel Choo Peng Yeow Methodology

From Visual Analytics and Applications

"You See, But You Do Not Observe" - Sherlock
Looking Deeper Into The Network Of Connected Individuals



Methodology

We will use two software tools, Gephi and Tableau, to explore the dataset. Because the full transactional dataset is large (26 million rows), we will filter it using the suspicious list provided, which reduces the employee count from 642k to 2k.
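The filtering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page does not spell out the exact filter rule, so the sketch assumes a transaction is kept when either party appears on the suspicious list. The field names `source` and `target`, the sample rows, and the function name `filter_transactions` are all hypothetical.

```python
def filter_transactions(transactions, suspicious_names):
    """Keep rows where either party is on the suspicious list (assumed rule)."""
    suspicious = set(suspicious_names)  # set lookup keeps each check O(1)
    return [
        row for row in transactions
        if row["source"] in suspicious or row["target"] in suspicious
    ]


# Hypothetical toy data standing in for the 26-million-row dataset.
transactions = [
    {"source": "Ann", "target": "Bob"},
    {"source": "Bob", "target": "Cat"},
    {"source": "Dan", "target": "Eve"},
]
suspicious_names = ["Ann", "Cat"]

filtered = filter_transactions(transactions, suspicious_names)

# Employees that remain in the reduced network.
remaining_employees = {row["source"] for row in filtered} | {
    row["target"] for row in filtered
}
```

In this toy case the third row is dropped because neither Dan nor Eve is on the list, so the remaining network contains only Ann, Bob and Cat. The filtered rows would then be exported as an edge list for the tools mentioned above.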

Understanding Centrality Index

Since our objective is to identify the suspects, we will use two centrality metrics, closeness and betweenness, to guide the investigation.

Closeness

Closeness centrality (or closeness) of a node is a measure of centrality in a network, calculated as the reciprocal of the sum of the lengths of the shortest paths between the node and all other nodes in the graph. That sum is the node's farness, so closeness, as defined by Bavelas (1950), is the reciprocal of the farness, where d(pi,pk) is the distance between vertices pi and pk.
$$C(p_k) = \frac{1}{\sum_{i \neq k} d(p_i, p_k)}$$

Thus, the more central a node is, the closer it is to all other nodes. We will use this metric to identify individuals who are close to the group of suspects.
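To make the definition concrete, here is a minimal sketch that computes closeness as the reciprocal of farness on a small unweighted, undirected graph using breadth-first search. It assumes a connected graph and is not a description of how Gephi computes or normalises its own closeness value. The toy graph and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(graph, node):
    """Reciprocal of farness, where farness is the sum of distances to all other nodes."""
    dist = bfs_distances(graph, node)
    farness = sum(d for other, d in dist.items() if other != node)
    return 1 / farness if farness else 0.0


# Hypothetical toy network: a simple chain A - B - C - D.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

closeness_scores = {n: closeness(toy_graph, n) for n in toy_graph}
```

In the chain, B is 1, 1 and 2 hops from the other nodes (farness 4, closeness 0.25), while the end node A is 1, 2 and 3 hops away (farness 6, closeness 1/6), so the middle nodes score higher.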

Betweenness

Betweenness, on the other hand, measures how often a node lies on the shortest paths between other nodes; a node with high betweenness has more information passing through it. Removing such a node can cut off a large part of the graph. This will help us identify the key players in the organisation since "most approvals would probably have to go through these people".
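The idea can be illustrated with a short sketch of betweenness on an unweighted, undirected graph, using Brandes' shortest-path counting approach. It returns raw, unnormalised scores and is not meant to reproduce Gephi's output exactly. The toy graph (two triangles joined by a single link) and the function name are hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from source.
        order = []                              # nodes in order of discovery
        preds = {node: [] for node in graph}    # predecessors on shortest paths
        sigma = {node: 0 for node in graph}     # number of shortest paths
        dist = {node: -1 for node in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                 # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical toy network: triangle A-B-C and triangle D-E-F joined by C - D.
toy_graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

betweenness_scores = betweenness(toy_graph)
```

Here C and D each sit on the shortest paths of six node pairs, while every other node scores zero. Removing either one splits the network in two, which is the "key player" behaviour the metric is meant to surface.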