Difference between revisions of "ISSS608 2016 17T3 Group11 Report"

 
(15 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
<div style="background:#2B3856; border:#2B3856; padding-left:15px; text-align:center;">
 +
<font size = 5; color="#FFFFFF"><span style="font-family:Century Gothic;">Performance Decomposition of China Listed Firms
 +
</span></font>
 +
</div>
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
Line 15: Line 19:
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
 
;
 
;
[[ISSS608_2016_17T3_Group11_Report| <font color="#FFFFFF">Report</font>]]
+
[[ISSS608_2016_17T3_Group11_Report| <font color="#C00000">Report</font>]]
 +
|&nbsp;
 +
|}
  
|  &nbsp;
 
|}
 
  
= Motivation of the application =
+
= Motivation of Our Work =
  
 
Since initiating market reforms in 1978, China has shifted from a centrally-planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second largest economy and is increasingly playing an important and influential role in development and in the global economy.  
 
Since initiating market reforms in 1978, China has shifted from a centrally-planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second largest economy and is increasingly playing an important and influential role in development and in the global economy.  
Line 27: Line 31:
 
<br/><br/>
 
<br/><br/>
  
= Data =
+
= Data Preparation=
 +
The dataset the project built on is a combination of mutiple date sets and each item in the final dataset is cautiously selected to achieve our project goal.
  
 +
One of the most important dataset used is China listed private enterprise research data, which offers data of companies whose nature of equity are listed non-state-owned enterprises at the end of the year, or later transferred into non-state-owned enterprises since 2003. The data of the database covers the complete information of the above companies every year since 2003, including Stock Code, Listing Exchange, Area, Province, Privatization Sign, Listing Date , Total Share Number, Number of Employees, Operating Revenue, Net Profit, etc.
  
= Design framework =
+
Besides, data preparation process also involves the data collection of the geographic information including China map and the information on  financial performance indicators across different industries and regions—with current dataset, explicitly as province and city.
  
== Dashboard Design Framework ==
+
*The data structure of the final dataset is as follow: <br/>
To develop a comprehensive display of the performance of China listed firms, four parts are covered in our application. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map provides an overview of listed firms’ distribution in terms of total assets and profits in a top-down method. The second part is a geo_facet line graph which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance from 2003-2015 of each province by industry. The third part is scatter plot, which is also linked to the tree map by click action. The scatter plot shows each stock performance in terms of earning and cash flow based on the selected province, which provides a more granular view of the stock’s performance. The forth part is a spark table, which is an information dashboard design. This spark table is a way that displays all the important indicators of each stock so that the investors have a pipeline to observe the data in a more detailed manner. <br/>
+
 
 +
[[image:20170805154852.png|1000px]]<br/>
 +
 
 +
 
 +
[[image:20170805154928.png|1000px]]<br/>
  
[[image:DashboardViz_Data_2.png|200px]]<br/><br/>
 
  
Filtering was also introduced in the dashboard by filtering by month (the current month) or the month we would like investigate. Also, there is the option to filter the performance by channel such as SEM or FacebookPaid.<br/>
+
= Dashboard Design framework =
  
[[image:DashboardViz_Data_3.png|200px]]<br/><br/>
+
To develop a comprehensive display of the performance of China listed firms, four parts are covered in our application. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map provides an overview of listed firms’ distribution in terms of total assets and profits in a top-down method. The second part is a geo_facet line graph which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance from 2003-2015 of each province by industry. The third part is scatter plot, which is also linked to the tree map by click action. The scatter plot shows each stock performance in terms of earning and cash flow based on the selected province, which provides a more granular view of the stock’s performance. The forth part is a spark table, which is an information dashboard design. This spark table is a way that displays all the important indicators of each stock so that the investors have a pipeline to observe the data in a more detailed manner. <br/>
  
  
Line 45: Line 54:
  
 
== Overview ==
 
== Overview ==
[[image:DashboardViz_Overview_4.png|400px]]<br/>
+
The first layer of tree map displays the industry distribution of China listed firms. Users can select any industry to go further to view province distribution.
 +
<br/>
 +
 
 +
[[image:1501558280.png|400px]]<br/>
  
The overview chart of monthly revenue was drawn using bar charts with dark blue color. The selected month will be highlighted in light blue. Moreover, the reference line of monthly average in grey was added. The data was shown as ‘thousands’ instead of 1 baht unit to avoid unnecessary zeroes. <br/><br/>
 
Although line graph is also suitable for this data, as we only view the limited timeline of 12 months, bar chart could show the data clearer and the categories (months) are not too much for bar charts.<br/><br/>
 
Since the graph is overview chart, it would not be effected by ‘channel’ filter since we would like to fix this as reference when investigating other sources.<br/>
 
  
== Revenue and Order ==
+
After clicking the first layer of the tree map, the application will automatically  generate the china map with province as facet according to the selected industry.<br/>
[[image:DashboardViz_Overview_5.png|200px]][[image:DashboardViz_Overview_6.png|200px]]<br/>
 
  
The revenue and order were shown for selected month and channel in the box with large font size to emphasize the numbers. Also, % change from previous month is shown in the bracket and reflected in box colors. <br/>
+
[[image:1501558280(1).png|400px]]<br/>
  
== Revenue Breakdown by Channel / Revenue Breakdown by Day ==
 
With ‘All’ channel selected, a bullet chart of overall revenue breakdown is shown. The grey area shown average revenue for that channel in each month. The black dash show the performance of specific channel in previous month as a KPI to hit for current month. Lastly, the dark grey bar chart show the revenue of selected channel. For example, in the sample below, FacebookPaid performance was lower than the average but higher than previous month.<br/><br/>
 
[[image:DashboardViz_Overview_7.png|400px]]<br/>
 
  
The bullet chart format was chosen for the graph as it could help measure the month performance with last month and average performance.<br/><br/>
+
== Detailed Info ==
After viewing the overview, the drill down of channel is given by selecting specific channels. Viewing this bar chart in comparison with the overview on the left, the performance can be easily compared.<br/><br/>
 
[[image:DashboardViz_Overview_8.png|400px]]<br/>
 
  
== Revenue Breakdown by Day ==
+
=== Performance Breakdown by Industry & Province ===
The revenue breakdown by day is also shown in line graph. With last month’s performance as light grey for reference, comparison could be easier seen and dark blue highlight of this month’s revenue. Line graph was chosen as it could show movement during the month.<br/><br/>
+
The second layer of tree map shows the province distribution of selected industry through first layer. Users can then select any interested city to navigate to a more granular level.
[[image:DashboardViz_Overview_9.png|400px]]<br/>
+
<br/>
 +
[[image:1501559751(1).png|400px]]<br/>
  
== Category and Brand ==
 
[[image:DashboardViz_Overview_10.png|200px]] [[image:DashboardViz_Overview_11.png|200px]]<br/><br/>
 
  
Sales by category is shown in bubble chart where the size of bubble is total number of orders, x-axis is total revenue, and y-axis is how large average order is. For example, watches category has high revenue and large order as well as higher average order revenue. On the treemap, the brand breakdown by categories is shown. <br/><br/>
+
After clicking the interested province, the application will then display a box plot of every city, which allows users to compare performance of cities and presents an view of outliers.
 +
<br/>
 +
[[image:1501559042(1).png|400px]]<br/>
  
Investigating both charts, best-seller category brand and products could be seen. This is useful in planning the strategy and product mix whether the sales is dominated by one category or not as well as which category could generate high revenue.<br/><br/>
 
  
= Demonstration =
+
=== Performance Breakdown by Industry & City ===
For example, the team would like to investigate the sales report in October. First, the month filter is selected to ‘10’ for October. <br/><br/>
+
The third layer of  tree map displays the city distribution of selected industry from first layer and selected province from second layer. The third layer is linked to a scatter plot. 
 +
<br/>
 +
[[image:1501559000(1).png|400px]]<br/>
 +
 
 +
 
 +
After clicking the interested city, the application will display the firm distribution in terms of net cashflow and earing per share. The scatter plot allows the users to view the firm in details.
 +
<br/>
 +
[[image:1501559775(1).png|400px]]<br/>
 +
 
 +
 
 +
=== Performance of Individual Stock ===
 +
Sparkline is built in another tab of the dashboard, which is a great way to view the performance indicators for each company based on the selection through the tree map. In total there are 5 indicators that displayed in the spark tables and these are represented respectively by box plot and line graph.
 +
<br/>
 +
[[image:1501641949(1).png|600px]]<br/>
  
Firstly, looking at the overview (1), lower revenue for the month could be observed. The month revenue was lower than the previous month and monthly averages. Moreover, the boxes were red showing the decrease in revenue and orders which were more than 20% dropped. Then, checking at the revenue breakdown (2), most channels faced performance drop except ‘Direct’ and ‘Line’. This might due to lower spending – more investigation is needed here. Moreover, from the daily revenue (3), the revenue from 15th to 26th was significantly lower than the same period in previous month. This raised the question whether there was something wrong with the website or lower ads spending during the period.<br/>
 
[[image:DashboardViz_Overview_12.png|600px]]<br/>
 
  
 
= Discussion =
 
= Discussion =
  
Using visualization software such as Tableau will help us interactively analyze the data with the visualization. Moreover, Tableau also have the data connection where it could connect to various data sources such as SQL databases. With Tableau user-friendly interface, a dashboard could be created effortlessly. <br/><br/>
+
For Visual Analytics and Applications, Tableau and R are two main softwares we apply during the course. Generally, tableau is a fantastic tool for pattern discovery using data visualization. It is an ideal tool of choice when we want to throw some data and keep playing with the data to see whether any patterns emerge.  
  
Seeking more customization than what Tableau can provide from data import, data manipulation, or data visualization, enthusiasts could also use R Markdown to create the interactive dashboard with the help of the following packages: rmarkdown, shiny, flexdashboard, ggplot2, plotly, treemap. There are several packages in R to choose from when creating this dashboard which allows high customization level. Although the platform itself requires coding, it is a good start for those who want to try out data visualization programming. The interactive part of graph which is done by ‘plotly’  transformed simple R codes into JavaScript which learning curve is much steeper.<br/><br/>
+
Whereas, R has a relatively steep learning curve. However, any investment we make in R, will be returned to us with significant rewards. In fact, R is easily more than a programming language; it is almost a whole framework. There are countless libraries ready to give us a helping hand. For instance, quite a few R packages were used to build our project application. In a word, there are two important reasons why we use R for the project: reproducibility and repeatability. Provided with the application code, you can easily reproduce the whole application to play with it and explore the data!
 
+
<br/><br/>
When choosing between both platforms, users should look at their requirements of their dashboard and use the one that is most suitable for the project. Both approaches have its advantages and disadvantages which are summarized below:<br/>
 
[[image:DashboardViz_Overview_13.png|500px]]<br/>
 
  
 
= Future Work =
 
= Future Work =
Line 108: Line 120:
 
<br/><br/>
 
<br/><br/>
  
 +
= User Guide =
 +
 +
The explanation of function is shown in the picture below. Actually the application is quite user-friendly. You just need to click wherever you are interested in and view it!<br/>
  
= User Guide =
+
[[image:G11 userguide.png|650px]]<br/>
  
The explanation of function is shown in the picture below:<br/>
 
[[image:DashboardViz_Overview_14.png|500px]]<br/>
 
  
 
= Reference =
 
= Reference =
Line 130: Line 143:
 
<sup>[14]</sup> [https://leonawicz.github.io/HtmlWidgetExamples/ex_dt_sparkline.html Combining data tables and sparklines]<br/>
 
<sup>[14]</sup> [https://leonawicz.github.io/HtmlWidgetExamples/ex_dt_sparkline.html Combining data tables and sparklines]<br/>
 
<sup>[15]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster<br/>
 
<sup>[15]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster<br/>
 +
<sup>[16]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group5_Proposal<br/>
  
  
 
= Special Thanks =
 
= Special Thanks =
 
We would like to give a special thanks to Prof. Kam for his dedicated support and guidance! Moreover, thanks whoever spend your time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions & feedbacks are most appreciated and welcome!
 
We would like to give a special thanks to Prof. Kam for his dedicated support and guidance! Moreover, thanks whoever spend your time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions & feedbacks are most appreciated and welcome!

Latest revision as of 16:05, 5 August 2017

Performance Decomposition of China Listed Firms



Motivation of Our Work

Since initiating market reforms in 1978, China has shifted from a centrally planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the world's second-largest economy and plays an increasingly important and influential role in development and in the global economy.

An increasing number of foreign investors are looking for opportunities to invest in China, so understanding the performance of China listed firms has become essential. This project addresses the need for an interactive dashboard that displays the performance of China listed firms and helps users understand how companies in different industries have developed over the past 12 years.

Data Preparation

The dataset the project is built on combines multiple data sets, and each item in the final dataset was carefully selected to serve the project goal.

One of the most important datasets used is the China listed private enterprise research data. It covers listed companies that were non-state-owned enterprises at the end of the year, or that were later converted into non-state-owned enterprises, since 2003. The database provides complete yearly information on these companies from 2003 onward, including Stock Code, Listing Exchange, Area, Province, Privatization Sign, Listing Date, Total Share Number, Number of Employees, Operating Revenue, Net Profit, etc.

Data preparation also involved collecting geographic information, including a map of China, and information on financial performance indicators across different industries and regions. In the current dataset, regions are recorded explicitly as province and city.

  • The data structure of the final dataset is as follows:

20170805154852.png


20170805154928.png
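
The combination of firm records with geographic information described above can be sketched as a simple keyed join. The application itself was built in R, so this Python snippet is only an illustration of the data logic, not the team's code. The field names loosely follow the columns listed in the report, while the sample rows, the `city_coords` lookup, and the `attach_geo` helper are hypothetical.

```python
# Hypothetical firm rows; values are made up for illustration only.
sample_firms = [
    {"stock_code": "A001", "province": "Guangdong", "city": "Shenzhen",
     "year": 2015, "operating_revenue": 120.0, "net_profit": 15.0},
    {"stock_code": "B002", "province": "Zhejiang", "city": "Hangzhou",
     "year": 2015, "operating_revenue": 80.0, "net_profit": 9.0},
]

# Hypothetical geographic lookup keyed by (province, city);
# the coordinates are approximate and purely illustrative.
city_coords = {("Guangdong", "Shenzhen"): (22.54, 114.06)}


def attach_geo(firms, coords):
    """Left-join each firm row with its city coordinates.

    Rows without a matching (province, city) key are kept, with
    lat/lon set to None so gaps in the lookup remain visible.
    """
    merged = []
    for row in firms:
        lat_lon = coords.get((row["province"], row["city"]))
        merged.append({
            **row,
            "lat": lat_lon[0] if lat_lon else None,
            "lon": lat_lon[1] if lat_lon else None,
        })
    return merged


merged_rows = attach_geo(sample_firms, city_coords)
```

Keeping unmatched rows, rather than dropping them, makes it easier to spot cities that are missing from the geographic lookup before they silently disappear from a map.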


Dashboard Design Framework

To give a comprehensive view of the performance of China listed firms, our application has four parts. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map provides a top-down overview of the distribution of listed firms in terms of total assets and profits. The second part is a geo_facet line graph, which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance of each province from 2003 to 2015 by industry. The third part is a scatter plot, which is also linked to the tree map by click action. The scatter plot shows the performance of each stock in terms of earnings and cash flow based on the selected province, which provides a more granular view of each stock's performance. The fourth part is a spark table, an information dashboard design that displays all the important indicators of each stock so that investors can examine the data in more detail.
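
The top-down drill-down behind the tree map can be sketched as a nested roll-up of one measure by industry, province, and city. This is a minimal Python illustration under assumed names (`rollup`, `children`, and the sample rows are hypothetical); the application itself is an R dashboard and may organise its data differently.

```python
from collections import defaultdict

# Hypothetical rows; total_assets values are made up for illustration.
tree_rows = [
    {"industry": "Manufacturing", "province": "Guangdong", "city": "Shenzhen", "total_assets": 50.0},
    {"industry": "Manufacturing", "province": "Guangdong", "city": "Guangzhou", "total_assets": 30.0},
    {"industry": "Manufacturing", "province": "Zhejiang", "city": "Hangzhou", "total_assets": 20.0},
    {"industry": "Real Estate", "province": "Guangdong", "city": "Shenzhen", "total_assets": 40.0},
]


def rollup(rows, value_key="total_assets"):
    """Sum one measure into an industry -> province -> city hierarchy."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    for r in rows:
        tree[r["industry"]][r["province"]][r["city"]] += r[value_key]
    return tree


def children(tree, industry=None, province=None):
    """Return {label: total} for the layer below the current selection."""
    if industry is None:  # first layer: one tile per industry
        return {ind: sum(sum(cities.values()) for cities in provs.values())
                for ind, provs in tree.items()}
    if province is None:  # second layer: provinces of the chosen industry
        return {p: sum(cities.values()) for p, cities in tree[industry].items()}
    return dict(tree[industry][province])  # third layer: cities


asset_tree = rollup(tree_rows)
layer_one = children(asset_tree)
layer_two = children(asset_tree, "Manufacturing")
layer_three = children(asset_tree, "Manufacturing", "Guangdong")
```

Each click in the tree map corresponds to calling `children` with one more argument filled in, so tile sizes at every layer always add up to the tile that was clicked.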


Graphic Visualization

Overview

The first layer of the tree map displays the industry distribution of China listed firms. Users can select any industry to drill down to its province distribution.

1501558280.png


After the user clicks an industry in the first layer of the tree map, the application automatically generates a map of China faceted by province for the selected industry.

1501558280(1).png
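
The data behind such a province-faceted view can be sketched as one yearly revenue series per province, filtered to the selected industry. The Python snippet below is only an illustration of that reshaping step; the function name `revenue_by_province` and the sample records are hypothetical, and the application itself is written in R.

```python
from collections import defaultdict

# Hypothetical records; revenue values are made up for illustration.
revenue_records = [
    {"industry": "Manufacturing", "province": "Guangdong", "year": 2004, "operating_revenue": 12.0},
    {"industry": "Manufacturing", "province": "Guangdong", "year": 2003, "operating_revenue": 10.0},
    {"industry": "Manufacturing", "province": "Guangdong", "year": 2003, "operating_revenue": 5.0},
    {"industry": "Manufacturing", "province": "Zhejiang", "year": 2003, "operating_revenue": 7.0},
    {"industry": "Real Estate", "province": "Guangdong", "year": 2003, "operating_revenue": 99.0},
]


def revenue_by_province(records, industry):
    """Build one (year, total revenue) series per province for an industry."""
    panels = defaultdict(lambda: defaultdict(float))
    for r in records:
        if r["industry"] == industry:
            panels[r["province"]][r["year"]] += r["operating_revenue"]
    # Sort each series by year so it can be drawn directly as a line.
    return {prov: sorted(series.items()) for prov, series in panels.items()}


facet_panels = revenue_by_province(revenue_records, "Manufacturing")
```

Each key of `facet_panels` corresponds to one small panel on the map, and each value is the line drawn inside that panel.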


Detailed Info

Performance Breakdown by Industry & Province

The second layer of the tree map shows the province distribution of the industry selected in the first layer. Users can then select any province of interest to navigate to a more granular level.
1501559751(1).png


After the user clicks a province of interest, the application displays a box plot for every city, which allows users to compare the performance of cities and gives a view of outliers.
1501559042(1).png
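
To make the notion of an outlier in a box plot concrete, the sketch below computes the quartiles for one city and flags points beyond 1.5 times the interquartile range. That 1.5 × IQR rule is the common box plot convention and is an assumption here, since the report does not state which rule the application uses. The function name and sample values are hypothetical, and `statistics.quantiles` requires Python 3.8 or later and at least two data points.

```python
import statistics

# Hypothetical per-firm values for one city; made up for illustration.
city_values = [1, 2, 3, 4, 5, 6, 7, 8, 100]


def box_summary(values, whisker=1.5):
    """Return quartiles and the points outside the whisker fences."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


summary = box_summary(city_values)
```

Placing one such summary per city side by side is what lets a reader compare typical performance across cities while still seeing the few firms that sit far from the rest.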


Performance Breakdown by Industry & City

The third layer of the tree map displays the city distribution for the industry selected in the first layer and the province selected in the second layer. The third layer is linked to a scatter plot.
1501559000(1).png


After the user clicks a city of interest, the application displays the distribution of firms in terms of net cash flow and earnings per share. The scatter plot allows users to examine individual firms in detail.
1501559775(1).png


Performance of Individual Stock

A spark table is built in a separate tab of the dashboard. It gives a compact view of the performance indicators for each company, based on the selection made through the tree map. In total, 5 indicators are displayed in the spark table, represented by box plots and line graphs.
1501641949(1).png
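
The idea of a sparkline, a word-sized chart placed inside a table cell, can be illustrated with a few lines of Python that map a series onto Unicode block characters. This is only a text-based sketch of the concept under hypothetical names (`BARS`, `sparkline`); the application's spark table is rendered graphically by the R packages listed in the references.

```python
# Eight block characters from lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a numeric series as a one-line string of block characters."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to scale; draw the lowest bar throughout.
        return BARS[0] * len(values)
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)


rising = sparkline(list(range(8)))
flat = sparkline([3, 3, 3])
```

Because each series is scaled to its own minimum and maximum, a sparkline shows the shape of a trend rather than its absolute level, which is why spark tables usually print the actual figures alongside.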


Discussion

Tableau and R are the two main software tools used in the Visual Analytics and Applications course. Tableau is well suited to pattern discovery through data visualization. It is the tool of choice when we want to load some data and keep exploring it to see whether any patterns emerge.

R, by contrast, has a relatively steep learning curve. However, any investment made in R is returned with significant rewards. R is more than a programming language; it is almost a whole framework, with a large number of libraries ready to help. For instance, quite a few R packages were used to build our application. In short, there are two important reasons why we used R for the project: reproducibility and repeatability. Given the application code, anyone can reproduce the whole application and explore the data.

Future Work

1 Data Capture
The data is currently downloaded manually. In the future, we can build a pipeline that reads the data directly from an API, which would automate the whole visualization process.

2 More interactive
Some of the graphs can be developed from static to interactive so that the whole application is more user-friendly.

3 Faster
The interactive part of the visualization is a bit slow. In the future, the whole application can be deployed to the cloud instead of a local server to make it faster to use.

4 More detailed
The finer details of the graphs can be tuned further.

User Guide

The functions of the application are explained in the picture below. The application is straightforward to use: click on whatever you are interested in to view it.

G11 userguide.png


Reference

[1] shiny
[2] DT
[3] geofacet
[4] tibble
[5] d3Tree
[6] d3treeR
[7] tidyverse
[8] sparkline
[9] dplyr
[10] reshape2
[11] shinydashboard
[12] ggthemes
[13] ggplot2
[14] Combining data tables and sparklines
[15] https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster
[16] https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group5_Proposal


Special Thanks

We would like to give special thanks to Prof. Kam for his dedicated support and guidance. We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most welcome.