1718t1is428T12

From Visual Analytics for Business Intelligence
Revision as of 12:17, 24 November 2017 by Victoriakoh.2015 (talk | contribs)
Los tres mascatero logo.png



Introduction & Motivation

Panama intro photo.jpg


The Panama Papers (2016) are a huge leak of 11.5 million financial documents (approximately 2.6 TB) that reveal the financial holdings of the rich and powerful. The global investigation into the secretive industry of offshore companies exposes how politicians, celebrities, sportsmen and high-net-worth individuals set up front companies in remote jurisdictions to protect their cash from higher taxes and to facilitate bribery, arms deals, financial fraud and drug trafficking. Lying within the trove of leaked files are also the names of the rich and powerful in the Asia-Pacific (APAC) region, overshadowed by the media's interest in the more prominent names of the West.

Objectives

News coverage, even in Singapore, focused mainly on the West. This diverted attention from what may be more important: the details of individuals and businesses in APAC that also appear in the leaked documents. The result is a lack of information and coverage on the APAC region.

Our goal is to shed light on the individuals and businesses involved in the APAC region in the following ways:

  1. Present the complexity and structure of relationships between entities and individuals in each country in the APAC region.
  2. Identify key parties in the offshore investments.

Background Survey of Related Works

Visualization Description
Is428 databg ucb viz.png


Data source: http://people.ischool.berkeley.edu/~yhfan/W209-Final-Project/

Geospatial map chart

This visualization shows the interconnectedness of countries involved in the offshore industry over a forty-year period. The map view is scalable, and a specific date and time can be selected on the timeline.

Pros:

  • Able to see an overview of countries that are connected and involved in the offshore industry.
  • Location circle markers are clickable, with a pop-up displaying more details on the total number of officer, intermediary and entity connections in each country.
  • Location circle markers are dynamically sized according to the number of connections each country has.


Cons:

  • No country name labels on the location pins; names appear only on hover.
  • Although location circle markers are sized according to the number of connections each country has, all markers above a certain threshold are capped at the same size, so the largest countries cannot be told apart.
  • The colors of the location pins and network lines are slightly jarring and clash with the overall muted color scheme of the visualization.
Is428 databg ucb viz2.png


Data source: http://people.ischool.berkeley.edu/~yhfan/W209-Final-Project/

Network graph

This visualization is a drill-down from the previous visualization, shown when you click on a particular country. It shows the interconnectedness of companies in that country with companies outside it, across the world. The nodes show more details on hover.

Pros:

  • Able to see an overview of entities, officers and intermediaries situated outside and inside a particular country.
  • More details of nodes upon hover.


Cons:

  • Difficult to understand at a glance.
  • The labels for "Entities", "Intermediaries" and "Officers" are positioned in an imaginary 'T' shape, but the ordered nodes are positioned in a 'Y' shape, which makes it hard to match labels to nodes.
  • Network lines are faint against the grey background, and the colors of the nodes are slightly jarring and clash with the rest of the visualization.
Is428 databg viz1.png


Data source: http://www.arcgis.com/apps/MapJournal/index.html?appid=1f611be658e74ad48f899d1d6152bdb4

Interactive map

A map showing companies in the Mossack Fonseca database "connected" to a particular country by address. The data also shows clients, beneficiaries, and shareholders by country. The visualization uses scaled circle location markers to show the number of companies in each country mentioned in the database. Each country's circle location marker is clickable and reveals the number of clients, beneficiaries, and shareholders from that country mentioned in the papers.

Pros:

  • Able to see at a glance which countries have the largest number of companies, clients, beneficiaries, and shareholders based on their circle location marker size.
  • The map is scalable. Users can zoom in for a closer look at the smaller countries, and zoom out for an overview of the concentration of companies.
  • Minimal yet effective color choices that are also pleasing to the eye.


Cons:

  • Unable to see the connections of each country to other countries (e.g. which individual from a particular country has offshore companies in Switzerland).
  • No timeline provided: the map shows all the data from 1974 to 2015 at once, so some of the companies shown might already have been dissolved.

Proposed Storyboard

Page Description
Is428 t12 storyboard1.JPG

Homepage

When the user enters our application, he is shown the home screen with our project topic and prompted to scroll down to read more. He can click the quick links at the top right-hand corner to view details about our team or the project.

Is428 t12 storyboard2.JPG
Is428 t12 storyboard3.JPG

Story

As the user scrolls down to read more, he is introduced to the story of an individual who intends to set up an offshore company for asset protection, but is unsure of which country he should set up his company in.

The user then clicks through the story (displayed as a carousel slider) to view the different offshore networks of each APAC country (e.g. Hong Kong, Malaysia, Singapore). He is able to see at a glance which countries have the most complex offshore networks and which do not. He can also click "View it in action" to open the interactive network graph we have implemented and explore countries on his own.

Is428 t12 storyboard4.JPG

Try it out: Interactive network graph

The user is now in our interactive network graph. He can use the filters to select a Country, Node type, Jurisdiction and Relationship, then click Apply to run the filters and show the offshore networks of a particular country.

Each node has a label (the Officer, Entity or Intermediary name), and a legend at the top left explains the different node types on the screen. The user can zoom in and out of the network graph for a closer look at the relationships between Officers, Entities and Intermediaries. He can also drag nodes around the screen to rearrange them and form a better understanding of the offshore network.
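
The filter step described above can be sketched as follows. This is a minimal illustration rather than our actual implementation: the function `apply_filters`, the field names (`type`, `country`, `source`, `target`, `relationship`) and the sample records are all hypothetical. It assumes that selecting a country or node type picks a set of "seed" nodes, and that every link touching a seed is kept so that connections leaving the country stay visible. The Jurisdiction filter is left out for brevity; it would be one more condition on the seed nodes.

```python
# Hypothetical sample network; names and field layout are assumptions.
NODES = {
    "o1": {"name": "Jane Doe", "type": "Officer", "country": "Singapore"},
    "e1": {"name": "Alpha Holdings Ltd", "type": "Entity", "country": "Singapore"},
    "e2": {"name": "Beta Trading Inc", "type": "Entity", "country": "Hong Kong"},
    "i1": {"name": "Gamma Services", "type": "Intermediary", "country": "Malaysia"},
}
LINKS = [
    {"source": "o1", "target": "e1", "relationship": "shareholder of"},
    {"source": "o1", "target": "e2", "relationship": "director of"},
    {"source": "i1", "target": "e2", "relationship": "intermediary of"},
]


def apply_filters(nodes, links, country=None, node_type=None, relationship=None):
    """Return (visible_nodes, visible_links) for the chosen filters.

    A filter set to None is ignored. Seed nodes are those matching the
    country and node-type filters; a link is kept when it matches the
    relationship filter and touches at least one seed node.
    """
    seeds = {
        node_id
        for node_id, node in nodes.items()
        if (country is None or node["country"] == country)
        and (node_type is None or node["type"] == node_type)
    }
    kept_links = [
        link
        for link in links
        if (relationship is None or link["relationship"] == relationship)
        and (link["source"] in seeds or link["target"] in seeds)
    ]
    # Show the seeds plus any node reached through a kept link.
    visible_ids = set(seeds)
    for link in kept_links:
        visible_ids.update((link["source"], link["target"]))
    return {node_id: nodes[node_id] for node_id in visible_ids}, kept_links


if __name__ == "__main__":
    shown_nodes, shown_links = apply_filters(NODES, LINKS, country="Singapore")
    print(sorted(shown_nodes), len(shown_links))
```

With the sample data, filtering on Singapore keeps the Singapore officer and entity together with the Hong Kong entity they link to, while the Malaysian intermediary is hidden because none of its links touch a Singapore node.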

Data Source

We gathered the Panama Papers data for this project from the following sources:

Dataset Description

Offshore Leaks Database by The International Consortium of Investigative Journalists

Data source: https://offshoreleaks.icij.org/pages/database

Contains information on more than 520,000 offshore entities that are part of the Panama Papers, the Offshore Leaks, the Bahamas Leaks and Appleby data from the Paradise Papers as well as from some politicians featured in the Paradise Papers investigation. The data covers nearly 70 years up to early 2016 and links to people and companies in more than 200 countries and territories.

Paradise-Panama-Papers: Data Scientists United Against Corruption dataset

Data source: https://www.kaggle.com/zusmani/paradisepanamapapers/version/1/data

A compilation of data from the Paradise and Panama Papers leaks in .csv format, with separate files for Addresses, Entities, Intermediaries and Officers plus a file of edges between nodes. We will use the node and edge files to build our network graph.
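
As a rough sketch of how the node and edge files could be combined into a single network, the example below parses small in-memory samples with Python's standard `csv` module. The column names (`node_id`, `name`, `countries`, `node_1`, `rel_type`, `node_2`), the sample rows and the helper functions are assumptions made for illustration; the headers of the actual files should be checked before reusing this.

```python
import csv
import io

# Made-up samples standing in for the real CSV files.
# Column names are assumptions; verify them against the actual headers.
ENTITIES_CSV = """node_id,name,countries
1,Alpha Holdings Ltd,Singapore
2,Beta Trading Inc,Hong Kong
"""
OFFICERS_CSV = """node_id,name,countries
10, Jane Doe ,Singapore
"""
EDGES_CSV = """node_1,rel_type,node_2
10,shareholder of,1
10,director of,2
99,officer of,1
"""


def load_nodes(text, node_type):
    """Parse one node table and tag each row with its node type."""
    nodes = {}
    for row in csv.DictReader(io.StringIO(text)):
        nodes[row["node_id"]] = {
            "name": row["name"].strip(),  # trim stray whitespace
            "country": row["countries"].strip(),
            "type": node_type,
        }
    return nodes


def load_edges(text, nodes):
    """Parse the edge table, skipping edges with an unknown endpoint."""
    edges = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["node_1"] in nodes and row["node_2"] in nodes:
            edges.append(
                {
                    "source": row["node_1"],
                    "target": row["node_2"],
                    "relationship": row["rel_type"],
                }
            )
    return edges


def build_graph():
    """Merge the node tables and attach the edges that reference them."""
    nodes = {}
    nodes.update(load_nodes(ENTITIES_CSV, "Entity"))
    nodes.update(load_nodes(OFFICERS_CSV, "Officer"))
    return {"nodes": nodes, "links": load_edges(EDGES_CSV, nodes)}


if __name__ == "__main__":
    graph = build_graph()
    print(len(graph["nodes"]), "nodes,", len(graph["links"]), "links")
```

The edge that points at node 99 is dropped because no node table contains that id, which is one simple way to keep the graph consistent when only some of the node files are loaded.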

Tools

The following are the tools we will be using for the project:

Tools team12.png

Technical Challenges

Key Technical Challenges

Challenge: Unfamiliarity with D3.js libraries
Description: D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers.
Solution:
  • Attend the D3 workshop
  • Self-learning
  • Peer learning

Challenge: Data Cleaning and Transformation
Description: The data sets come in text and several other formats. Integrating them is challenging, as a lot of manual work is required.
Solution:
  • Delegate the workload for cleaning the data sets

Challenge: Determining the Most Suitable Interactive Elements
Description: For users to understand the data sets, the interactive elements need to be suitable for this project.
Solution:
  • Develop a storyboard
  • Research network graph visualization

Project Timeline & Task Assignments

Projecttimeline team12 v2.png

References
