Difference between revisions of "ISSS608 2016 17T3 Group11 Report"

From Visual Analytics and Applications
 
(11 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
<div style="background:#2B3856; border:#2B3856; padding-left:15px; text-align:center;">
 +
<font size = 5; color="#FFFFFF"><span style="font-family:Century Gothic;">Performance Decomposition of China Listed Firms
 +
</span></font>
 +
</div>
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
Line 15: Line 19:
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
 
;
 
;
[[ISSS608_2016_17T3_Group11_Report| <font color="#FFFFFF">Report</font>]]
+
[[ISSS608_2016_17T3_Group11_Report| <font color="#C00000">Report</font>]]
 +
|&nbsp;
 +
|}
  
|  &nbsp;
 
|}
 
  
 
= Motivation of Our Work =
 
= Motivation of Our Work =
Line 27: Line 31:
 
<br/><br/>
 
<br/><br/>
  
= Data =
+
= Data Preparation=
 +
The dataset the project is built on combines multiple datasets, and each field in the final dataset was carefully selected to serve our project goal.
 +
 
 +
One of the most important datasets used is the China listed private enterprise research data, which covers listed companies that were non-state-owned at the end of the year or that have been converted into non-state-owned enterprises since 2003. The database holds complete yearly information on these companies since 2003, including Stock Code, Listing Exchange, Area, Province, Privatization Sign, Listing Date, Total Share Number, Number of Employees, Operating Revenue, Net Profit, etc.
 +
 
 +
In addition, data preparation involved collecting geographic information, including a map of China, and information on financial performance indicators across industries and regions; in the current dataset, regions are recorded explicitly as province and city.
 +
 
 +
*The data structure of the final dataset is as follows: <br/>
 +
 
 +
[[image:20170805154852.png|1000px]]<br/>
 +
 
 +
 
 +
[[image:20170805154928.png|1000px]]<br/>
  
  
Line 76: Line 92:
 
The sparklines are built in another tab of the dashboard, which is a convenient way to view the performance indicators of each company based on the selection made in the tree map. In total, 5 indicators are displayed in the spark table, and these are represented by box plots and line graphs.
 
The sparklines are built in another tab of the dashboard, which is a convenient way to view the performance indicators of each company based on the selection made in the tree map. In total, 5 indicators are displayed in the spark table, and these are represented by box plots and line graphs.
 
<br/>
 
<br/>
[[image:1501641949(1).png|400px]]<br/>
+
[[image:1501641949(1).png|600px]]<br/>
 
 
 
 
= Demonstration =
 
For example, the team would like to investigate the sales report in October. First, the month filter is selected to ‘10’ for October. <br/><br/>
 
  
Firstly, looking at the overview (1), lower revenue for the month could be observed. The month revenue was lower than the previous month and monthly averages. Moreover, the boxes were red showing the decrease in revenue and orders which were more than 20% dropped. Then, checking at the revenue breakdown (2), most channels faced performance drop except ‘Direct’ and ‘Line’. This might due to lower spending – more investigation is needed here. Moreover, from the daily revenue (3), the revenue from 15th to 26th was significantly lower than the same period in previous month. This raised the question whether there was something wrong with the website or lower ads spending during the period.<br/>
 
[[image:DashboardViz_Overview_12.png|600px]]<br/>
 
  
 
= Discussion =
 
= Discussion =
  
Using visualization software such as Tableau will help us interactively analyze the data with the visualization. Moreover, Tableau also have the data connection where it could connect to various data sources such as SQL databases. With Tableau user-friendly interface, a dashboard could be created effortlessly. <br/><br/>
+
For Visual Analytics and Applications, Tableau and R are the two main tools we used during the course. Generally, Tableau is a fantastic tool for pattern discovery through data visualization. It is an ideal choice when we want to load some data and keep exploring it to see whether any patterns emerge.
 
 
Seeking more customization than what Tableau can provide from data import, data manipulation, or data visualization, enthusiasts could also use R Markdown to create the interactive dashboard with the help of the following packages: rmarkdown, shiny, flexdashboard, ggplot2, plotly, treemap. There are several packages in R to choose from when creating this dashboard which allows high customization level. Although the platform itself requires coding, it is a good start for those who want to try out data visualization programming. The interactive part of graph which is done by ‘plotly’  transformed simple R codes into JavaScript which learning curve is much steeper.<br/><br/>
 
 
 
When choosing between both platforms, users should look at their requirements of their dashboard and use the one that is most suitable for the project. Both approaches have its advantages and disadvantages which are summarized below:<br/>
 
[[image:DashboardViz_Overview_13.png|500px]]<br/>
 
  
 +
R, by contrast, has a relatively steep learning curve. However, any investment we make in R is returned with significant rewards. R is more than a programming language; it is almost a whole framework. There are countless libraries ready to lend a helping hand; for instance, quite a few R packages were used to build our project application. In short, there are two important reasons why we used R for the project: reproducibility and repeatability. Provided with the application code, you can easily reproduce the whole application to play with it and explore the data!
 +
<br/><br/>
  
 
= Future Work =
 
= Future Work =
Line 136: Line 143:
 
<sup>[14]</sup> [https://leonawicz.github.io/HtmlWidgetExamples/ex_dt_sparkline.html Combining data tables and sparklines]<br/>
 
<sup>[14]</sup> [https://leonawicz.github.io/HtmlWidgetExamples/ex_dt_sparkline.html Combining data tables and sparklines]<br/>
 
<sup>[15]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster<br/>
 
<sup>[15]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster<br/>
 +
<sup>[16]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group5_Proposal<br/>
  
  
 
= Special Thanks =
 
= Special Thanks =
 
We would like to give special thanks to Prof. Kam for his dedicated support and guidance! We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most appreciated and welcome!
 
We would like to give special thanks to Prof. Kam for his dedicated support and guidance! We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most appreciated and welcome!

Latest revision as of 16:05, 5 August 2017

Performance Decomposition of China Listed Firms


 


Motivation of Our Work

Since initiating market reforms in 1978, China has shifted from a centrally-planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the world's second largest economy and plays an increasingly important and influential role in development and in the global economy.

An increasing number of foreign investors are looking for opportunities to invest in China, so understanding the performance of China listed firms has become essential. This project addresses the need for an interactive dashboard that displays the performance of China listed firms, which will be very helpful in understanding how companies in different industries have developed over the past 12 years.

Data Preparation

The dataset the project is built on combines multiple datasets, and each field in the final dataset was carefully selected to serve our project goal.

One of the most important datasets used is the China listed private enterprise research data, which covers listed companies that were non-state-owned at the end of the year or that have been converted into non-state-owned enterprises since 2003. The database holds complete yearly information on these companies since 2003, including Stock Code, Listing Exchange, Area, Province, Privatization Sign, Listing Date, Total Share Number, Number of Employees, Operating Revenue, Net Profit, etc.

In addition, data preparation involved collecting geographic information, including a map of China, and information on financial performance indicators across industries and regions; in the current dataset, regions are recorded explicitly as province and city.
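As a rough illustration of how firm-level records and geographic information can be combined, the sketch below left-joins a handful of records to a city-to-province lookup and reports the rows that fail to match. It is written in Python only as a language-neutral illustration (the application itself is built with R packages), and the field names, stock codes, and values are hypothetical rather than taken from the project data.

```python
# Hypothetical firm-level records; codes and values are made up for illustration.
firms = [
    {"stock_code": "A001", "year": 2015, "city": "Shenzhen", "net_profit": 12.5},
    {"stock_code": "A002", "year": 2015, "city": "Hangzhou", "net_profit": 8.0},
    {"stock_code": "A003", "year": 2015, "city": "Atlantis", "net_profit": 1.0},
]

# Hypothetical geographic lookup table: city -> province.
city_to_province = {"Shenzhen": "Guangdong", "Hangzhou": "Zhejiang"}


def attach_province(records, lookup):
    """Left-join each record to the lookup; unmatched cities get province None."""
    return [{**row, "province": lookup.get(row["city"])} for row in records]


merged = attach_province(firms, city_to_province)

# Rows whose city is missing from the lookup need attention before plotting.
unmatched = [row["stock_code"] for row in merged if row["province"] is None]
print(unmatched)
```

Keeping the unmatched rows visible, instead of silently dropping them, makes it easier to spot spelling differences between sources before the data reaches the map.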

  • The data structure of the final dataset is as follows:

20170805154852.png


20170805154928.png


Dashboard Design framework

To give a comprehensive view of the performance of China listed firms, our application covers four parts. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map gives a top-down overview of how listed firms are distributed in terms of total assets and profits. The second part is a geo_facet line graph, which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance of each province by industry from 2003 to 2015. The third part is a scatter plot, which is also linked to the tree map by click action. The scatter plot shows the performance of each stock in terms of earnings and cash flow for the selected province, which provides a more granular view of the stock’s performance. The fourth part is a spark table, an information dashboard design that displays all the important indicators of each stock so that investors can examine the data in more detail.
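The top-down drill-down behind the tree map can be pictured as repeated filtering and summing along the industry, province, and city levels. The following is a minimal sketch in Python under that assumption; it is not the application's code, and the function name `drill_down`, the field names, and the sample values are hypothetical.

```python
from collections import defaultdict

# Hierarchy levels named in the text, from coarsest to finest.
LEVELS = ("industry", "province", "city")

# Hypothetical rows; the asset figures are made up for illustration.
rows = [
    {"industry": "Manufacturing", "province": "Guangdong", "city": "Shenzhen", "total_assets": 40.0},
    {"industry": "Manufacturing", "province": "Guangdong", "city": "Guangzhou", "total_assets": 25.0},
    {"industry": "Manufacturing", "province": "Zhejiang", "city": "Hangzhou", "total_assets": 30.0},
    {"industry": "Retail", "province": "Guangdong", "city": "Shenzhen", "total_assets": 10.0},
]


def drill_down(records, selection, value="total_assets"):
    """Sum `value` over the level directly below the current selection.

    `selection` holds the values clicked so far, e.g. () for the top layer
    or ("Manufacturing",) after an industry has been chosen.
    """
    depth = len(selection)
    if depth >= len(LEVELS):
        raise ValueError("already at the deepest level")
    totals = defaultdict(float)
    for row in records:
        # Keep only rows that match every level selected so far.
        if all(row[LEVELS[i]] == selection[i] for i in range(depth)):
            totals[row[LEVELS[depth]]] += row[value]
    return dict(totals)


print(drill_down(rows, ()))                  # totals per industry
print(drill_down(rows, ("Manufacturing",)))  # totals per province within one industry
```

Each click simply extends `selection` by one value, so the same function serves every layer of the hierarchy.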


Graphic Visualization

Overview

The first layer of the tree map displays the industry distribution of China listed firms. Users can select any industry to drill down to the province distribution.

1501558280.png


After the first layer of the tree map is clicked, the application automatically generates the China map, faceted by province, for the selected industry.

1501558280(1).png


Detailed Info

Performance Breakdown by Industry & Province

The second layer of the tree map shows the province distribution of the industry selected in the first layer. Users can then select any province of interest to navigate to a more granular level.
1501559751(1).png


After a province of interest is clicked, the application displays a box plot for every city, which allows users to compare the performance of cities and gives a view of outliers.
1501559042(1).png
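For readers unfamiliar with how a box plot flags outliers, the sketch below computes quartiles per city and marks values beyond 1.5 times the interquartile range from the box. The 1.5 × IQR rule is the common Tukey convention and is assumed here; the report does not state which rule or which metric the application uses, and the city names and values are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Return quartiles and the values lying outside the 1.5 * IQR fences."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical per-city values of some performance metric.
metric_by_city = {
    "CityA": [1, 2, 3, 4, 100],
    "CityB": [5, 6, 7, 8, 9],
}

for city, values in metric_by_city.items():
    print(city, box_stats(values))
```

In this made-up data, CityA has one extreme value that would be drawn as an isolated point, while CityB has none.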


Performance Breakdown by Industry & City

The third layer of the tree map displays the city distribution for the industry selected in the first layer and the province selected in the second layer. The third layer is linked to a scatter plot.
1501559000(1).png


After a city of interest is clicked, the application displays the firm distribution in terms of net cash flow and earnings per share. The scatter plot allows users to view each firm in detail.
1501559775(1).png


Performance of Individual Stock

The sparklines are built in another tab of the dashboard, which is a convenient way to view the performance indicators of each company based on the selection made in the tree map. In total, 5 indicators are displayed in the spark table, and these are represented by box plots and line graphs.
1501641949(1).png
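The idea of a sparkline, a word-sized chart that shows the shape of a series without axes, can be illustrated with a text-only stand-in. The sketch below maps each value of a yearly series to one of eight Unicode block characters; it is only an illustration of the concept in Python, not how the application's spark table is rendered, and the sample series is hypothetical.

```python
# Eight block characters, from lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(series):
    """Render a numeric series as a one-line string of block characters."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A flat series has no range to scale; draw it at the lowest level.
        return BARS[0] * len(series)
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in series)


# Hypothetical yearly values of one indicator for a single stock.
print(sparkline([1, 2, 3, 4, 5, 6, 7, 8]))
```

Because each series is scaled to its own minimum and maximum, such a sparkline shows the trend of one stock rather than allowing level comparisons across stocks.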


Discussion

For Visual Analytics and Applications, Tableau and R are the two main tools we used during the course. Generally, Tableau is a fantastic tool for pattern discovery through data visualization. It is an ideal choice when we want to load some data and keep exploring it to see whether any patterns emerge.

R, by contrast, has a relatively steep learning curve. However, any investment we make in R is returned with significant rewards. R is more than a programming language; it is almost a whole framework. There are countless libraries ready to lend a helping hand; for instance, quite a few R packages were used to build our project application. In short, there are two important reasons why we used R for the project: reproducibility and repeatability. Provided with the application code, you can easily reproduce the whole application to play with it and explore the data!

Future Work

1 Data Capture
The data is currently downloaded manually. In the future, we can build a pipeline that reads the data directly from an API, which will automate the whole visualization process.

2 More interactive
Some of the graphs can be converted from static to interactive so that the whole application is more user-friendly.

3 Faster
The interactive part of the visualization is a bit slow. In the future, the whole application can be deployed on the cloud instead of a local server to make it much faster to use.

4 More detailed
The finer details of the graphs can be further tuned.

User Guide

The functions are explained in the picture below. The application is quite user-friendly: just click wherever you are interested and view the result!

G11 userguide.png


Reference

[1] shiny
[2] DT
[3] geofacet
[4] tibble
[5] d3Tree
[6] d3treeR
[7] tidyverse
[8] sparkline
[9] dplyr
[10] reshape2
[11] shinydashboard
[12] ggthemes
[13] ggplot2
[14] Combining data tables and sparklines
[15] https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster
[16] https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group5_Proposal


Special Thanks

We would like to give special thanks to Prof. Kam for his dedicated support and guidance! We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most appreciated and welcome!