Difference between revisions of "Forensic Ninja"

From Visual Analytics for Business Intelligence

Revision as of 13:43, 7 October 2016


Problem and Motivation

Benford’s Law has been widely used by forensic data analysts to detect anomalies or possible fraudulent activity in an organisation. However, it applies to numerical data, while the majority of data in practice is textual. For example, in accounts payable data, 70% of the data is textual whereas only 10% is numerical (Lanza, 2016).


Furthermore, fraudsters tend to work in groups rather than alone. In 2015, 62 percent of fraudsters colluded with others (KPMG International, 2016). As 74 percent of fraud is perpetrated by internal staff or through collusion between internal staff and external parties (KPMG International, 2016), fraud examiners need tools that not only analyse a firm's available textual data but also visualise the interactions among its employees.


Because email is one of the preferred modes of business communication, analysing emails can help uncover potential red flags in an organisation's structure or culture. Using GAStech's email exchanges as a case study, we seek to analyse the connectivity among employees and the topics they discuss most frequently.

Objectives

In this project, we seek to build an interactive visualisation application that helps users analyse connectivity and frequently discussed topics among the employees of an organisation. This allows users to better visualise the organisation's structure and the interactions among employees, which may point to potential wrongdoing.


Using GAStech's email exchanges as a case study, the application aims to help users do the following:

  • Understand GAStech organisational structure
  • Analyse frequently discussed topics among GAStech employees

Data Source

The dataset used in this project comes from the VAST Challenge 2014 (Link to Dataset).
It consists mainly of GAStech employee records and the email headers from two weeks of internal GAStech company email.
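
As a rough illustration of how email headers could feed a connectivity analysis, the sketch below counts how many emails each sender sent to each recipient. The field names `From` and `To`, the comma-separated recipient list, and the sample addresses are assumptions made for illustration; they are not taken from the actual dataset.

```javascript
// Hypothetical header rows; the field names and addresses are assumptions.
const sampleHeaders = [
  { From: "alice@example.com", To: "bob@example.com, carol@example.com" },
  { From: "bob@example.com", To: "alice@example.com" },
  { From: "alice@example.com", To: "bob@example.com" },
];

// Count how many emails each sender sent to each recipient.
function countEdges(headers) {
  const counts = new Map();
  for (const row of headers) {
    const sender = row.From.trim().toLowerCase();
    const recipients = row.To.split(",")
      .map((r) => r.trim().toLowerCase())
      .filter(Boolean);
    for (const recipient of recipients) {
      if (recipient === sender) continue; // ignore self-addressed copies
      const key = `${sender} -> ${recipient}`;
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

const edgeCounts = countEdges(sampleHeaders);
```

Each key in `edgeCounts` is a directed sender-to-recipient pair, which could later serve as a weighted edge in a network view of the organisation.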

References to Related Work

Screenshots | What can we learn

Parallel Coordinates of Employee Characteristics [Image: Forensic Ninja ParallelVizTianjin.png]

  • A parallel coordinates plot shows each employee characteristic as a vertical axis, with each line representing one employee
  • With this method, characteristics that employees have in common become visible
  • Common characteristics that can be seen include who served in the military together, which military branch they were in and how they obtained their citizenship
Put Pic here
  • ABC
  • DEF
  • GHI
  • JKL
Put Pic here
  • ABC
  • DEF
  • GHI
  • JKL
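
The parallel coordinates idea in the first row of the table can be sketched in plain JavaScript as follows. This is only a minimal sketch under stated assumptions: the field names `militaryBranch` and `citizenshipBasis`, the sample rows, and the helper functions are hypothetical, and a real implementation would likely rely on D3 scales rather than the hand-written mapping shown here.

```javascript
// Hypothetical employee rows; field names and values are assumptions.
const employees = [
  { name: "A", militaryBranch: "Army", citizenshipBasis: "Birth" },
  { name: "B", militaryBranch: "Army", citizenshipBasis: "Birth" },
  { name: "C", militaryBranch: "Navy", citizenshipBasis: "Naturalisation" },
];
const dimensions = ["militaryBranch", "citizenshipBasis"];

// Map each distinct value of one characteristic to an evenly spaced
// height on that characteristic's vertical axis.
function axisPositions(rows, dimension, height) {
  const values = [...new Set(rows.map((r) => r[dimension]))].sort();
  const step = values.length > 1 ? height / (values.length - 1) : 0;
  return new Map(values.map((v, i) => [v, i * step]));
}

// Build one polyline per employee: an [x, y] point on every axis.
function toPolylines(rows, dims, width, height) {
  const gap = dims.length > 1 ? width / (dims.length - 1) : 0;
  const scales = dims.map((d) => axisPositions(rows, d, height));
  return rows.map((row) =>
    dims.map((d, i) => [i * gap, scales[i].get(row[d])])
  );
}

const polylines = toPolylines(employees, dimensions, 400, 200);
```

Employees who share the same characteristics produce identical polylines, so their lines overlap on screen; that overlap is what makes common characteristics stand out.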

Storyboard

Key Technical Challenges

Firstly, one key technical challenge is that we will be working with two datasets, namely Employee Records and Email Headers. The two must be linked so that they can be used together effectively. A possible solution is to join them on the email address, which is available in both datasets.
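
A minimal sketch of joining the two datasets on the email address is shown below. The field names `EmailAddress`, `From` and `Department`, the sample rows, and the helper functions are assumptions for illustration only; the actual column names should be checked against the dataset.

```javascript
// Hypothetical rows; field names and values are assumptions.
const employeeRecords = [
  { EmailAddress: "alice@example.com", Department: "Engineering" },
  { EmailAddress: "bob@example.com", Department: "Security" },
];
const emailHeaders = [
  { From: "Alice@example.com", Subject: "Schedule" },
  { From: "bob@example.com", Subject: "RE: Schedule" },
  { From: "unknown@example.com", Subject: "Hello" },
];

// Normalise addresses so that letter case or stray spaces do not break the match.
function normaliseEmail(address) {
  return address.trim().toLowerCase();
}

// Left join: attach the sender's employee record to each header,
// or null when the sender is not in the employee records.
function joinOnEmail(headers, records) {
  const byEmail = new Map(
    records.map((r) => [normaliseEmail(r.EmailAddress), r])
  );
  return headers.map((h) => ({
    ...h,
    sender: byEmail.get(normaliseEmail(h.From)) || null,
  }));
}

const joined = joinOnEmail(emailHeaders, employeeRecords);
```

Indexing the employee records in a `Map` first keeps the join to a single pass over the headers. Keeping unmatched senders with a `null` record, rather than dropping them, makes it easier to spot addresses that do not belong to any known employee.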


Another key technical challenge is publishing our visualisations using D3.js. Linking the datasets will also involve JavaScript coding with the D3 library. Our group has only recently started learning the language and the library, and the learning curve is steep. To bridge the gap between the project's expectations and our programming ability, we will study the code of published D3 visualisations and learn best practices from them. This should help us understand the logic of the code and use it to make our visualisations more interactive and more useful to the end user.

Project Schedule

Forensic Ninja Timeline.PNG

References

  1. KPMG International. (2016). Global profiles of the fraudster: Technology enables and weak controls fuel the fraud. Retrieved from here
  2. Lanza, R. B. (2016, March). Blazing a trail for the Benford's Law of words, part 1. Retrieved from here

Our Team

Group 13
1. Lim Hui Ting
2. Jonathan Eduard Chua Lim

Comments