Difference between revisions of "1718t1is428T12"
(9 intermediate revisions by the same user not shown) | |||
Line 43: | Line 43: | ||
==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Introduction & Motivation</font></div>== | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Introduction & Motivation</font></div>== | ||
<div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | <div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | ||
− | [[File:Panama_intro_photo.jpg|center| | + | [[File:Panama_intro_photo.jpg|center|550px|link=]] |
<br> | <br> | ||
The Panama Papers (2016) are a huge leak of 11.5 million financial documents (approximately 2.6 TB) that reveal the financial holdings of the rich and powerful. The global investigation into the secretive industry of offshore companies exposes how politicians, celebrities, sportsmen and high-net-worth individuals set up front companies in remote jurisdictions to protect their cash from higher taxes and to facilitate bribery, arms deals, financial fraud and drug trafficking. Lying within the trove of leaked files are also the names of the rich and powerful in the Asia-Pacific (APAC) region, overshadowed by the media’s interest in more prominent names from the West. | The Panama Papers (2016) are a huge leak of 11.5 million financial documents (approximately 2.6 TB) that reveal the financial holdings of the rich and powerful. The global investigation into the secretive industry of offshore companies exposes how politicians, celebrities, sportsmen and high-net-worth individuals set up front companies in remote jurisdictions to protect their cash from higher taxes and to facilitate bribery, arms deals, financial fraud and drug trafficking. Lying within the trove of leaked files are also the names of the rich and powerful in the Asia-Pacific (APAC) region, overshadowed by the media’s interest in more prominent names from the West. | ||
Line 58: | Line 58: | ||
==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Background Survey of Related Works</font></div>== | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Background Survey of Related Works</font></div>== | ||
− | <div style=" | + | <div style="padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> |
{| class="wikitable" style="background-color:#FFFFFF;" width="100%" | {| class="wikitable" style="background-color:#FFFFFF;" width="100%" | ||
|- | |- | ||
− | ! style="font-weight: bold;background: #5E2705;color:#fff;width: 50%;" | | + | ! style="font-weight: bold;background: #5E2705;color:#fff;width: 50%;padding:10px;" | Visualization |
− | ! style="font-weight: bold;background: #5E2705;color:#fff;" | | + | ! style="font-weight: bold;background: #5E2705;color:#fff;padding:10px; " | Description |
|- | |- | ||
| [[Image:is428_databg_ucb_viz.png|550px|center]] | | [[Image:is428_databg_ucb_viz.png|550px|center]] | ||
<br> | <br> | ||
− | <center>Data source: http://people.ischool.berkeley.edu/~yhfan/W209-Final-Project/</center> | + | <center>'''Data source:''' http://people.ischool.berkeley.edu/~yhfan/W209-Final-Project/</center> |
− | || | + | | style="padding:10px;" | |
− | + | ==== Geospatial map chart ==== | |
+ | This visualization shows the interconnectedness of countries involved in the offshore industry over a forty-year period. The map view is scalable, and a specific date and time can be selected on the timeline.<br> | ||
+ | <br> | ||
+ | '''Pros:'''<br> | ||
+ | * Able to see an overview of countries that are connected and involved in the offshore industry. | ||
+ | * Location circle markers are clickable with a pop-up displaying more details on the total number of officer, intermediary and entity connections in each country. | ||
+ | * Location circle marker sizes are dynamically sized according to the number of connections each country has. | ||
+ | <br> | ||
+ | '''Cons:'''<br> | ||
+ | * No country name labels on the location pins; the names appear only on hover. | ||
+ | * Although location circle markers are sized according to the number of connections each country has, every marker above a certain threshold is capped at the same size. | ||
+ | * The colors of the location pins and network lines are slightly jarring and clash with the overall muted color scheme of the visualization. | ||
|- | |- | ||
− | | [[Image:|550px|center]] | + | | [[Image:is428_databg_ucb_viz2.png|550px|center]] |
+ | <br> | ||
+ | <center>'''Data source:''' http://people.ischool.berkeley.edu/~yhfan/W209-Final-Project/</center> | ||
+ | | style="padding:10px;" | | ||
+ | ==== Network graph ==== | ||
+ | This visualization is a drill-down from the previous visualization, shown when you click on a particular country. It shows the interconnectedness of companies in that country with companies elsewhere in the world. The nodes show more details on hover.<br> | ||
+ | <br> | ||
+ | '''Pros:'''<br> | ||
+ | * Able to see an overview of entities, officers and intermediaries situated outside and inside a particular country. | ||
+ | * More details of nodes upon hover. | ||
<br> | <br> | ||
− | + | '''Cons:'''<br> | |
− | + | * Difficult to understand at a glance. | |
+ | * The labels for "Entities", "Intermediaries" and "Officers" are laid out in an imaginary 'T' shape, but the ordered nodes are laid out in a 'Y' shape, which makes it hard to match labels to nodes. | ||
+ | * Network lines are faint against the grey background, and the node colors are slightly jarring and clash with the rest of the visualization. | ||
|- | |- | ||
− | | [[Image:|550px|center]] | + | | [[Image:is428_databg_viz1.png|550px|center]] |
+ | <br> | ||
+ | <center>'''Data source:''' http://www.arcgis.com/apps/MapJournal/index.html?appid=1f611be658e74ad48f899d1d6152bdb4</center> | ||
+ | | style="padding:10px;" | | ||
+ | ==== Interactive map ==== | ||
+ | Map showing companies in the Mossack Fonseca database “connected” to a particular country by address. The data also shows clients, beneficiaries, and shareholders by country. The visualization uses scaled circle location markers to show the number of companies in each country mentioned in the database. Each country's circle marker is clickable and reveals the number of clients, beneficiaries, and shareholders mentioned in the papers from the selected country.<br> | ||
+ | <br> | ||
+ | '''Pros:'''<br> | ||
+ | * Able to see at a glance which countries have the most companies, clients, beneficiaries, and shareholders based on their circle location marker size. | ||
+ | * Map is scalable. User is able to zoom in to have a closer look at the smaller countries, and zoom out to have an overview of the concentration of companies. | ||
+ | * Minimal yet effective color choices that are also pleasing to the eye. | ||
<br> | <br> | ||
− | < | + | '''Cons:'''<br> |
+ | * Unable to see the connections of each country to other countries (e.g. which individuals from a particular country have offshore companies in Switzerland). | ||
+ | * No timeline provided: the map shows all the data from 1974 to 2015 at once, so some of the companies shown might already have been dissolved. | ||
+ | |} | ||
+ | </div> | ||
+ | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Proposed Storyboard</font></div>== | ||
+ | <div style="padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | ||
+ | {| class="wikitable" style="background-color:#FFFFFF;" width="100%" | ||
+ | |- | ||
+ | ! style="font-weight: bold;background: #5E2705;color:#fff;width: 50%;padding:10px;" | Page | ||
+ | ! style="font-weight: bold;background: #5E2705;color:#fff;padding:10px; " | Description | ||
+ | |- | ||
+ | | [[Image:is428_t12_storyboard1.JPG|550px|center]] | ||
+ | | style="padding:10px;" | | ||
+ | ==== Homepage ==== | ||
+ | When the user enters our application, he is introduced to the home screen with our project topic and prompted to scroll down to read more. He can click the quick links at the top right hand corner to view details about our team or the project. | ||
+ | |- | ||
+ | | [[Image:is428_t12_storyboard2.JPG|550px|center]] [[Image:is428_t12_storyboard3.JPG|550px|center]] | ||
+ | | style="padding:10px;" | | ||
+ | ==== Story ==== | ||
+ | As the user scrolls down to read more, he is introduced to the story of an individual who intends to set up an offshore company for asset protection, but is unsure of which country he should set up his company in. | ||
+ | <br><br> | ||
+ | The user then clicks through the story (displayed as a carousel slider) to view the different offshore networks of each APAC country (e.g. Hong Kong, Malaysia, Singapore). He is able to see at a glance which countries have the most complex offshore networks and which do not. He can also click "View it in action" to open the interactive network graph we have implemented and explore countries on his own. | ||
+ | |- | ||
+ | | [[Image:is428_t12_storyboard4.JPG|550px|center]] | ||
+ | | style="padding:10px;" | | ||
+ | ==== Try it out: Interactive network graph ==== | ||
+ | Now, the user is in our interactive network graph. He can click through the filters to select a ''Country'', ''Node type'', ''Jurisdiction'', ''Relationship'', and click ''Apply'' to execute the filters to show the offshore networks of a particular country. | ||
+ | <br><br> | ||
+ | Each node has a label (of ''Officer''/''Entity''/''Intermediary'' names) and there is a legend on the top left for him to refer to the different node types on the screen. He is able to zoom in and out of the network graph to have a closer look at the relationships between ''Officers'', ''Entities'' and ''Intermediaries''. The nodes are also draggable around the screen for the user to shift and form a better understanding of the offshore network. | ||
|} | |} | ||
</div> | </div> | ||
Line 86: | Line 145: | ||
==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Data Source</font></div>== | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Data Source</font></div>== | ||
<div style="font-size: 13px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | <div style="font-size: 13px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | ||
+ | We gathered the Panama Papers data for this project from the following sources:<br><br> | ||
{| class="wikitable" style="background-color:#FFFFFF;" width="100%" | {| class="wikitable" style="background-color:#FFFFFF;" width="100%" | ||
|- | |- | ||
− | ! style="font-weight: bold;background: #5E2705;color:#fff;width: 50%;" | Dataset | + | ! style="font-weight: bold;background: #5E2705;color:#fff;width: 50%;padding:10px;" | Dataset |
− | ! style="font-weight: bold;background: #5E2705;color:#fff;" | | + | ! style="font-weight: bold;background: #5E2705;color:#fff;padding:10px;" | Description |
|- | |- | ||
| | | | ||
Line 96: | Line 156: | ||
Data source: https://offshoreleaks.icij.org/pages/database | Data source: https://offshoreleaks.icij.org/pages/database | ||
</center> | </center> | ||
− | || | + | | style="padding:10px;" | |
+ | Contains information on more than 520,000 offshore entities that are part of the Panama Papers, the Offshore Leaks, the Bahamas Leaks and Appleby data from the Paradise Papers as well as from some politicians featured in the Paradise Papers investigation. The data covers nearly 70 years up to early 2016 and links to people and companies in more than 200 countries and territories. | ||
|- | |- | ||
| | | | ||
<center> | <center> | ||
====Paradise-Panama-Papers: Data Scientists United Against Corruption dataset==== | ====Paradise-Panama-Papers: Data Scientists United Against Corruption dataset==== | ||
− | Data source: https://www.kaggle.com/zusmani/paradisepanamapapers/data | + | Data source: https://www.kaggle.com/zusmani/paradisepanamapapers/version/1/data |
</center> | </center> | ||
− | || | + | | style="padding:10px;" | |
+ | A compilation of data from the Paradise and Panama Papers leaks in .csv format, covering ''Addresses'', ''Entities'', ''Intermediaries'' and ''Officers'' as well as ''node edges'', which we will use to build our network graph. | ||
|} | |} | ||
</div> | </div> | ||
− | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Tools | + | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Tools</font></div>== |
<div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | <div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | ||
− | + | The following are the tools we will be using for the project:<br><br> | |
− | + | [[Image: tools_team12.png |1100px|center]] | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
</div> | </div> | ||
Line 123: | Line 179: | ||
{| class="wikitable" style="background-color:#FFFFFF;" width="100%" | {| class="wikitable" style="background-color:#FFFFFF;" width="100%" | ||
|- | |- | ||
− | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 40%;" | Key Technical Challenges | + | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 40%;padding:10px;" | Key Technical Challenges |
− | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 30%;" | | + | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 30%;padding:10px;" | Description |
− | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 30%;" | | + | ! style="font-weight: bold;background: #5E2705;color:#FFFFFF; width: 30%;padding:10px;" | Solution |
|- | |- | ||
− | | <center> Unfamiliar with D3.js libraries </center> || | + | | <center> Unfamiliar with D3.js libraries </center> |
+ | | style="padding:10px;" | | ||
D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. | D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. | ||
− | || | + | |style="padding:10px;"| |
*Attend the D3 workshop | *Attend the D3 workshop | ||
*Self learning | *Self learning | ||
*Peer Learning | *Peer Learning | ||
|- | |- | ||
− | | <center> Data Cleaning and Transformation </center> || | + | | <center> Data Cleaning and Transformation </center> |
+ | | style="padding:10px;" | | ||
The datasets come in text and several other formats. Integration is challenging as there is a lot of manual work to be done. | The datasets come in text and several other formats. Integration is challenging as there is a lot of manual work to be done. | ||
− | || | + | |style="padding:10px;"| |
* Delegate workload for cleaning datasets | * Delegate workload for cleaning datasets | ||
|- | |- | ||
− | | <center> Determining the Most Optimal Interactive Elements </center> || | + | | <center> Determining the Most Optimal Interactive Elements </center> |
+ | | style="padding:10px;" | | ||
In order for users to understand the datasets, the interactive elements need to be suitable for this project. | In order for users to understand the datasets, the interactive elements need to be suitable for this project. | ||
− | || | + | |style="padding:10px;"| |
*Develop storyboard | *Develop storyboard | ||
*Research on network graph visualization | *Research on network graph visualization | ||
Line 149: | Line 208: | ||
==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Project Timeline & Task Assignments</font></div>== | ==<div style="background: #F6B419; padding-top: 20px; padding-bottom: 20px; line-height: 0.3em; text-indent: 15px; font-size:20px; font-family:Trebuchet MS; "><font color= #5E2705>Project Timeline & Task Assignments</font></div>== | ||
<div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px;"> | <div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px;"> | ||
− | [[Image: projecttimeline_team12_v2.png | | + | [[Image: projecttimeline_team12_v2.png |1180px|center]] |
</div> | </div> | ||
Line 155: | Line 214: | ||
<div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | <div style="font-size: 15px; padding-top: 15px; padding-bottom: 30px; padding-left: 15px; padding-right: 15px;"> | ||
*Databases: https://offshoreleaks.icij.org/pages/database | *Databases: https://offshoreleaks.icij.org/pages/database | ||
− | *Kaggle Dataset: https://www.kaggle.com/zusmani/paradisepanamapapers/data | + | *Kaggle Dataset: https://www.kaggle.com/zusmani/paradisepanamapapers/version/1/data |
*D3.js: https://d3js.org/ | *D3.js: https://d3js.org/ | ||
*Neo4j: https://neo4j.com/ | *Neo4j: https://neo4j.com/ |
Latest revision as of 12:17, 24 November 2017
Introduction & Motivation

The Panama Papers (2016) are a huge leak of 11.5 million financial documents (approximately 2.6 TB) that reveal the financial holdings of the rich and powerful. The global investigation into the secretive industry of offshore companies exposes how politicians, celebrities, sportsmen and high-net-worth individuals set up front companies in remote jurisdictions to protect their cash from higher taxes and to facilitate bribery, arms deals, financial fraud and drug trafficking. Lying within the trove of leaked files are also the names of the rich and powerful in the Asia-Pacific (APAC) region, overshadowed by the media’s interest in more prominent names from the West.
Objectives
As news coverage, even in Singapore, was focused mainly on the West, attention was diverted from what may be more important: the details of individuals and businesses in APAC that are also found in the leaked documents. This results in a lack of information and coverage on the APAC region.
Our goal is to shed light on the individuals and businesses involved in the APAC region in the following ways:
- To present the complexity and structure of relationships between entities and individuals in each country in the APAC region.
- To identify key parties in the offshore investments.
Background Survey of Related Works
Visualization | Description |
---|---|
|
Geospatial map chart. This visualization shows the interconnectedness of countries involved in the offshore industry over a forty-year period. The map view is scalable, and a specific date and time can be selected on the timeline.
|
|
Network graph. This visualization is a drill-down from the previous visualization, shown when you click on a particular country. It shows the interconnectedness of companies in that country with companies elsewhere in the world. The nodes show more details on hover.
|
|
Interactive map. Map showing companies in the Mossack Fonseca database “connected” to a particular country by address. The data also shows clients, beneficiaries, and shareholders by country. The visualization uses scaled circle location markers to show the number of companies in each country mentioned in the database. Each country's circle marker is clickable and reveals the number of clients, beneficiaries, and shareholders mentioned in the papers from the selected country.
|
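The scaled circle markers described in the maps above can be sketched as follows. This is only an illustration under stated assumptions, not how either surveyed map is implemented: the function name `marker_radius` and its parameters are hypothetical, and square-root scaling is assumed so that a circle's area (rather than its radius) grows in proportion to the count.

```python
import math


def marker_radius(count, max_count, max_radius=40.0, min_radius=3.0):
    """Return a circle radius whose area is proportional to `count`.

    All names and default values here are hypothetical, chosen for illustration.
    """
    if max_count <= 0 or count <= 0:
        # Nothing to show: fall back to the smallest visible marker.
        return min_radius
    # Square-root scaling: doubling the count doubles the area, not the radius.
    radius = max_radius * math.sqrt(count / max_count)
    # Clamp so tiny countries stay visible and very large ones do not swamp the map.
    return max(min_radius, min(radius, max_radius))


# Example: a country with a quarter of the maximum count gets half the radius.
print(marker_radius(25, 100))
```

Note that the upper clamp makes every count above `max_count` look the same size, which is the trade-off criticized in the first surveyed map; passing the true maximum as `max_count` avoids it.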
Proposed Storyboard
Page | Description |
---|---|
Homepage. When the user enters our application, he is introduced to the home screen with our project topic and prompted to scroll down to read more. He can click the quick links at the top right hand corner to view details about our team or the project. | |
Story. As the user scrolls down to read more, he is introduced to the story of an individual who intends to set up an offshore company for asset protection, but is unsure of which country he should set up his company in.
| |
Try it out: Interactive network graph. Now, the user is in our interactive network graph. He can click through the filters to select a Country, Node type, Jurisdiction, Relationship, and click Apply to execute the filters to show the offshore networks of a particular country.
|
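One way the Country, Node type, Jurisdiction and Relationship filters in the storyboard could behave when Apply is clicked is sketched below. This is a minimal sketch under assumptions, not the team's implementation: the `Node` and `Edge` classes, their field names, the `apply_filters` function and the sample records are all hypothetical. It assumes that nodes matching the filters act as "seeds" and that their direct neighbours stay visible, so the result is still a network rather than a set of isolated nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical fields; the real datasets may name these differently.
    node_id: str
    name: str
    node_type: str        # "Officer", "Entity" or "Intermediary"
    country: str
    jurisdiction: str = ""


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relationship: str


def apply_filters(nodes, edges, country=None, node_type=None,
                  jurisdiction=None, relationship=None):
    """Return (visible_nodes, visible_edges) for the chosen filters.

    A filter left as None is ignored.
    """
    def matches(node):
        return ((country is None or node.country == country)
                and (node_type is None or node.node_type == node_type)
                and (jurisdiction is None or node.jurisdiction == jurisdiction))

    known_ids = {node.node_id for node in nodes}
    seed_ids = {node.node_id for node in nodes if matches(node)}

    # Keep an edge if it has the requested relationship, both endpoints exist,
    # and at least one endpoint is a seed node.
    kept_edges = [
        edge for edge in edges
        if (relationship is None or edge.relationship == relationship)
        and edge.source in known_ids and edge.target in known_ids
        and (edge.source in seed_ids or edge.target in seed_ids)
    ]

    # Seeds plus their direct neighbours remain visible.
    visible_ids = set(seed_ids)
    for edge in kept_edges:
        visible_ids.update((edge.source, edge.target))
    visible_nodes = [node for node in nodes if node.node_id in visible_ids]
    return visible_nodes, kept_edges


# Made-up sample data, for illustration only.
SAMPLE_NODES = [
    Node("o1", "Officer A", "Officer", "Singapore"),
    Node("e1", "Entity X", "Entity", "Hong Kong", "BVI"),
    Node("i1", "Intermediary B", "Intermediary", "Singapore"),
    Node("e2", "Entity Y", "Entity", "Malaysia", "PAN"),
]
SAMPLE_EDGES = [
    Edge("o1", "e1", "shareholder of"),
    Edge("i1", "e1", "intermediary of"),
    Edge("i1", "e2", "intermediary of"),
]

shown_nodes, shown_edges = apply_filters(
    SAMPLE_NODES, SAMPLE_EDGES, country="Singapore", node_type="Officer")
print([node.name for node in shown_nodes], len(shown_edges))
```

Keeping neighbours of the seed nodes is a design choice: a stricter variant would require both endpoints of an edge to match, which would hide most cross-border links.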
Data Source
We gathered the Panama Papers data for this project from the following sources:
Dataset | Description |
---|---|
Offshore Leaks Database by The International Consortium of Investigative Journalists. Data source: https://offshoreleaks.icij.org/pages/database | Contains information on more than 520,000 offshore entities that are part of the Panama Papers, the Offshore Leaks, the Bahamas Leaks and Appleby data from the Paradise Papers as well as from some politicians featured in the Paradise Papers investigation. The data covers nearly 70 years up to early 2016 and links to people and companies in more than 200 countries and territories. |
Paradise-Panama-Papers: Data Scientists United Against Corruption dataset. Data source: https://www.kaggle.com/zusmani/paradisepanamapapers/version/1/data | A compilation of data from the Paradise and Panama Papers leaks in .csv format, covering Addresses, Entities, Intermediaries and Officers as well as node edges, which we will use to build our network graph. |
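To illustrate how node and edge files of this kind could feed a network graph and help identify key parties, here is a minimal sketch. The column names (`node_id`, `name`, `source_id`, `rel_type`, `target_id`), the helper functions and the sample rows are assumptions made for illustration; the actual .csv files may use different headers, so check them before adapting this.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data standing in for the real .csv files.
ENTITIES_CSV = """node_id,name
10,Alpha Ltd
11,Beta Inc
"""
OFFICERS_CSV = """node_id,name
20,Jane Doe
"""
EDGES_CSV = """source_id,rel_type,target_id
20,officer_of,10
20,officer_of,11
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_adjacency(edge_rows):
    """Build an undirected adjacency map: node id -> set of neighbour ids."""
    adjacency = defaultdict(set)
    for row in edge_rows:
        a, b = row["source_id"], row["target_id"]
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def top_connected(adjacency, names, k=3):
    """Return the k best-connected nodes as (name, degree) pairs.

    Ties are broken by node id so the result is deterministic.
    """
    ranked = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
    return [(names.get(n, n), len(adjacency[n])) for n in ranked[:k]]


node_rows = load_rows(ENTITIES_CSV) + load_rows(OFFICERS_CSV)
names = {row["node_id"]: row["name"] for row in node_rows}
adjacency = build_adjacency(load_rows(EDGES_CSV))
print(top_connected(adjacency, names, k=2))
```

Degree (the number of direct connections) is only one possible measure of a "key party"; it is used here because it is simple to compute from the edge list.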
Tools
Technical Challenges
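Key Technical Challenges | Description | Solution |
---|---|---|
Unfamiliar with D3.js libraries | D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. | Attend the D3 workshop; self-learning; peer learning |
Data Cleaning and Transformation | The datasets come in text and several other formats. Integration is challenging as there is a lot of manual work to be done. | Delegate workload for cleaning datasets |
Determining the Most Optimal Interactive Elements | In order for users to understand the datasets, the interactive elements need to be suitable for this project. | Develop storyboard; research on network graph visualization |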
Project Timeline & Task Assignments
References
- Databases: https://offshoreleaks.icij.org/pages/database
- Kaggle Dataset: https://www.kaggle.com/zusmani/paradisepanamapapers/version/1/data
- D3.js: https://d3js.org/
- Neo4j: https://neo4j.com/
Comments
Please leave comments here.