Difference between revisions of "ISSS608 2016 17T3 Group11 Report"

From Visual Analytics and Applications
Line 22: Line 22:
 
= Motivation of the application =
 
= Motivation of the application =
  
Since initiating market reforms in 1978, China has shifted from a centrally-planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second largest economy and is increasingly playing an important and influential role in development and in the global economy. China has been the largest contributor to world growth since the global financial crisis of 2008. An increasing number of foreign investors are looking for opportunities to invest in China. Therefore, understanding the performance of China listed firms becomes very essential. This project identifies the need of developing an interactive dashboard to display the performance of China listed firms, which will be very helpful in understanding how these companies of different industries have developed over the past 12 years. <br/><br/>
+
Since initiating market reforms in 1978, China has shifted from a centrally planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second-largest economy and plays an increasingly important and influential role in development and in the global economy.
 +
 
 +
An increasing number of foreign investors are looking for opportunities to invest in China, so understanding the performance of Chinese listed firms is essential. This project addresses that need with an interactive dashboard that displays the performance of Chinese listed firms and helps users understand how companies in different industries have developed over the past 12 years.
 +
<br/><br/>
  
 
= Data =
 
= Data =
  
The dataset used in this project is a real sales dataset from the SQL databases of an e-commerce website. However, for confidentiality, we masked the data and used an exported CSV file instead of the original database connection. The dataset contains information about source, medium, campaign, orders, revenue, and status. <br/><br/>
 
Source, medium, and campaign are the UTM parameters that the databases collect to record where traffic comes from<sup>[1]</sup>. From the UTM parameters, the traffic channel can be identified, such as Facebook Ads or Google Ads. Moreover, the medium or campaign parameters often reflect which brand a product belongs to. Using this information, the traffic channel, product brand, and product category can be obtained.<br/><br/>
 
[[image:DashboardViz_Data_1.png|600px]]<br/>
 
 
To visualize the data better, similar data was created for the missing months. The data used covers April to December 2016 and does not take the status field into consideration. This is only part of the data available to e-commerce marketers, but the project serves as a proof of concept that R and Tableau can be used to create dashboards effectively. Both platforms also support connections to SQL databases.<br/><br/>
 
  
 
= Design framework =
 
= Design framework =
Line 43: Line 41:
 
[[image:DashboardViz_Data_3.png|200px]]<br/><br/>
 
[[image:DashboardViz_Data_3.png|200px]]<br/><br/>
  
== Color ==
 
Following color design principles, a common hue helps signify the sequence of data<sup>[3]</sup> and avoids unnecessary use of different colors. A sequential blue palette was chosen because blue is easy on the eyes, contrasts clearly with the grey background, and matches the default Shiny / flexdashboard theme.<br/><br/>
 
However, the indicator boxes, which show this month's revenue and orders as well as the % change from last month, use red and green. Green typically signifies an increase, so it is used for increased or equal performance. If performance worsens, red is used to draw attention and signal a warning.<br/><br/>
 
  
 
= Graphic Visualization =
 
= Graphic Visualization =
Line 97: Line 92:
 
= Future Work =
 
= Future Work =
  
''1 Data integration with various sources'' <br/>
+
'''''1 Data Capture'''''  <br/>
With various libraries available in R, using API to integrate with current platforms such as Google Analytics, Google AdWords, Facebook Ads Management, Sales SQL Databases, etc. is possible. This means that marketers no longer have to fetch the data one by one but can fetch it and combined it within R using the libraries. <br/><br/>
+
The data is currently downloaded manually. In the future, we can build a pipeline that reads the data directly from an API, which would automate the whole visualization process.
 
+
<br/><br/>
''2 Real-time Data''  <br/>
 
Also, with the integration of above APIs, this dashboard could be utilized to fetch the latest data when going live.<br/><br/>
 
 
 
''3 Dashboard Function'' <br/>
 
In terms of functions, possible improvements include adding more type of graph to reflect more metrics with different data sources and more flexible filtering such as customizable date ranges, brand filtering which can be done through R.<br/><br/>
 
  
''4 Report Generation'' <br/>
+
'''''2 More interactive''''' <br/>
With the needs of reporting, the application would be more useful if the graphs and dataset are downloadable for management report. Moreover, with the ability to download filtered data, in-depth analysis which might not be present in the dashboard design could be discovered.<br/><br/>
+
Some of the graphs can be converted from static to interactive so that the whole application is more user-friendly.
 +
<br/><br/>
  
= Installation guide =
+
'''''3 Faster''''' <br/>
Both Tableau and R Markdown application could be accessed online via the link or QR codes below. <br/>
+
The interactive part of the visualization is somewhat slow. In the future, the whole application could be deployed to the cloud instead of a local server, which should make it faster to use.
[[image:QR_Code.png|200px]]
+
<br/><br/>
  
== R Markdown ==
+
'''''4 More detailed''''' <br/>
R Application: https://goo.gl/b42hOS
+
The finer details of the graphs can be further tuned.
The application could be accessed online via mobile or laptops from the link above. The recommended browser is Google Chrome for both desktop and mobile version. Despite the convenience of viewing the data on mobile, its performances might be slower than desktop.
+
<br/><br/>
  
== Tableau ==
 
Tableau Application: https://goo.gl/fH4qiz
 
As mentioned earlier, Tableau is not mobile responsive if the design was created using desktop size. Therefore, the recommended device is laptop computer. Google Chrome is the recommended browser.<br/><br/>
 
  
 
= User Guide =
 
= User Guide =
Line 127: Line 115:
  
 
= Reference =
 
= Reference =
<sup>[1]</sup>  https://ga-dev-tools.appspot.com/campaign-url-builder/<br/>
+
<sup>[1]</sup>  [https://cran.r-project.org/web/packages/shiny/index.html shiny]<br/>
<sup>[2]</sup>  Visual Information-Seeking Mantra [Shneiderman,1996]<br/>
+
<sup>[2]</sup>  [https://cran.r-project.org/web/packages/DT/index.html DT]<br/>
<sup>[3]</sup> http://www.perceptualedge.com/articles/b-eye/choosing_colors.pdf<br/>
+
<sup>[3]</sup>  [https://cran.r-project.org/web/packages/geofacet/index.html geofacet]<br/>
 +
<sup>[4]</sup>  [https://cran.r-project.org/web/packages/tibble/index.html tibble]<br/>
 +
<sup>[5]</sup>  [https://cran.r-project.org/web/packages/d3Tree/index.html d3Tree]<br/>
 +
<sup>[6]</sup>  [https://github.com/timelyportfolio/d3treeR d3treeR]<br/>
 +
<sup>[7]</sup>  [https://cran.r-project.org/web/packages/tidyverse/index.html tidyverse]<br/>
 +
<sup>[8]</sup>  [https://cran.r-project.org/web/packages/sparkline/index.html sparkline]<br/>
 +
<sup>[9]</sup>  [https://cran.r-project.org/web/packages/dplyr/index.html dplyr]<br/>
 +
<sup>[10]</sup> [https://cran.r-project.org/web/packages/reshape2/index.html reshape2]<br/>
 +
<sup>[11]</sup> [https://cran.r-project.org/web/packages/shinydashboard/index.html shinydashboard]<br/>
 +
<sup>[12]</sup> [https://cran.r-project.org/web/packages/ggthemes/index.html ggthemes]<br/>
 +
<sup>[13]</sup> [https://cran.r-project.org/web/packages/ggplot2/index.html ggplot2]<br/>
 +
<sup>[14]</sup> [https://leonawicz.github.io/HtmlWidgetExamples/ex_dt_sparkline.html Combining data tables and sparklines]<br/>
 +
<sup>[15]</sup> https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster<br/>
 +
 
 +
 
 +
= Special Thanks =
 +
We would like to give special thanks to Prof. Kam for his dedicated support and guidance! We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most appreciated and welcome!

Revision as of 12:06, 4 August 2017

Motivation of the application

Since initiating market reforms in 1978, China has shifted from a centrally planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second-largest economy and plays an increasingly important and influential role in development and in the global economy.

An increasing number of foreign investors are looking for opportunities to invest in China, so understanding the performance of Chinese listed firms is essential. This project addresses that need with an interactive dashboard that displays the performance of Chinese listed firms and helps users understand how companies in different industries have developed over the past 12 years.

Data

Design framework

Dashboard Design Framework

To give a comprehensive display of the performance of Chinese listed firms, our application covers four parts. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map provides a top-down overview of how listed firms are distributed in terms of total assets and profits. The second part is a geo_facet line graph, which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance of each province by industry from 2003 to 2015. The third part is a scatter plot, which is also linked to the tree map by click action. The scatter plot shows each stock's performance in terms of earnings and cash flow for the selected province, which provides a more granular view of the stock's performance. The fourth part is a spark table, which follows an information dashboard design. The spark table displays all the important indicators of each stock so that investors can examine the data in more detail.
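The top-down overview that the tree map provides amounts to summing a measure at each level of the hierarchy. The following is a minimal sketch of that roll-up in Python rather than the R packages the application uses. The firm records, the `rollup` helper, and all figures are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical records: (industry, province, city, total_assets).
# The values are made up for illustration and are not project data.
firms = [
    ("Manufacturing", "Guangdong", "Shenzhen", 120.0),
    ("Manufacturing", "Guangdong", "Guangzhou", 80.0),
    ("Manufacturing", "Jiangsu", "Nanjing", 50.0),
    ("Finance", "Shanghai", "Shanghai", 300.0),
]


def rollup(records, depth):
    """Sum total assets over the first `depth` levels of the hierarchy."""
    totals = defaultdict(float)
    for *levels, assets in records:
        totals[tuple(levels[:depth])] += assets
    return dict(totals)


# Each tree map layer is one roll-up depth: industry, then province, then city.
by_industry = rollup(firms, 1)
by_province = rollup(firms, 2)
by_city = rollup(firms, 3)
```

Clicking a tile in a tree map corresponds to moving from one depth to the next while keeping the selected prefix fixed, for example from `("Manufacturing",)` to every key that starts with it at depth 2.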

[Image: dashboard design framework]

The dashboard also supports filtering by month, either the current month or any month we would like to investigate. Performance can also be filtered by channel, such as SEM or FacebookPaid.

[Image: month and channel filters]


Graphic Visualization

Overview

[Image: overview bar chart of monthly revenue]

The overview chart of monthly revenue is a bar chart in dark blue, with the selected month highlighted in light blue. A grey reference line marks the monthly average. Values are shown in thousands of baht rather than single baht to avoid unnecessary zeroes.

Although a line graph would also suit this data, the timeline is limited to 12 months, so a bar chart shows the data more clearly and the number of categories (months) is not too large for bars.

Because this graph is the overview chart, it is not affected by the ‘channel’ filter; we keep it fixed as a reference when investigating other sources.

Revenue and Order

[Images: revenue and order indicator boxes]

The revenue and orders for the selected month and channel are shown in boxes with a large font size to emphasize the numbers. The % change from the previous month is shown in brackets and reflected in the box color.
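The indicator logic can be sketched in a few lines. This is an illustrative Python version, not the code behind the dashboard. The function names are hypothetical, and the color rule (green for an increase or no change, red for a decrease) follows the color notes of this report.

```python
def pct_change(current, previous):
    """Percentage change from previous to current; None if previous is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


def box_color(current, previous):
    """Green for an increase or equal performance, red for a decrease."""
    return "green" if current >= previous else "red"


def indicator(current, previous):
    """Return the text shown in the box and the box color."""
    change = pct_change(current, previous)
    label = "n/a" if change is None else f"{change:+.1f}%"
    return f"{current:,.0f} ({label})", box_color(current, previous)


# A month with 80,000 in revenue after a month with 100,000.
text, color = indicator(80000, 100000)
```

Returning `None` when the previous month is zero is an assumption made here to avoid a division by zero; the report does not say how that case is displayed.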

Revenue Breakdown by Channel / Revenue Breakdown by Day

With the ‘All’ channel selected, a bullet chart of the overall revenue breakdown is shown. The grey area shows the average monthly revenue for each channel. The black dash shows the channel's performance in the previous month, which serves as a KPI to hit in the current month. Lastly, the dark grey bar shows the revenue of the channel in the selected month. For example, in the sample below, FacebookPaid performance was lower than the average but higher than the previous month.

[Image: bullet chart of revenue breakdown by channel]

The bullet chart format was chosen because it compares the selected month's performance against both last month and the average.

After viewing the overview, users can drill down by selecting a specific channel. Viewing this bar chart alongside the overview on the left makes the performance easy to compare.

[Image: revenue bar chart for a selected channel]
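The three measures that each bullet encodes can be computed per channel as in the minimal Python sketch below. The revenue figures and the `bullet_measures` helper are hypothetical, the average is assumed to be taken over all available months, and the previous month is assumed to be `month - 1`, which ignores year boundaries.

```python
from statistics import mean

# Hypothetical monthly revenue per channel, keyed by month number.
revenue = {
    "FacebookPaid": {8: 105.0, 9: 60.0, 10: 75.0},
    "SEM": {8: 40.0, 9: 50.0, 10: 30.0},
}


def bullet_measures(monthly, month):
    """Return the average, previous-month and selected-month revenue."""
    return {
        "average": mean(monthly.values()),   # grey area
        "previous": monthly.get(month - 1),  # black dash (KPI to hit)
        "current": monthly[month],           # dark grey bar
    }


# Measures for October (month 10) for every channel.
measures = {channel: bullet_measures(monthly, 10)
            for channel, monthly in revenue.items()}
```

With these made-up numbers, FacebookPaid lands below its average but above the previous month, which is the situation described in the example above.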

Revenue Breakdown by Day

The revenue breakdown by day is shown as a line graph. Last month's performance is drawn in light grey for reference and this month's revenue is highlighted in dark blue, which makes comparison easier. A line graph was chosen because it shows movement during the month.

[Image: line graph of daily revenue]

Category and Brand

[Images: category bubble chart and brand treemap]

Sales by category are shown in a bubble chart where bubble size is the total number of orders, the x-axis is total revenue, and the y-axis is the average order revenue. For example, the watches category has high revenue, many orders, and a higher average order revenue. The treemap shows the brand breakdown by category.

Examining both charts reveals the best-selling categories, brands, and products. This is useful when planning strategy and product mix: it shows whether sales are dominated by one category and which categories generate high revenue.
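The three bubble chart encodings are simple per-category aggregates. The Python sketch below illustrates them with hypothetical order rows and a hypothetical `category_bubbles` helper; it is not the code used in the application.

```python
from collections import defaultdict

# Hypothetical order rows: (category, order_revenue). Values are made up.
orders = [
    ("Watches", 3000.0),
    ("Watches", 5000.0),
    ("Bags", 1000.0),
    ("Bags", 500.0),
    ("Bags", 1500.0),
]


def category_bubbles(rows):
    """Aggregate orders into the three bubble chart encodings per category."""
    count = defaultdict(int)
    total = defaultdict(float)
    for category, order_revenue in rows:
        count[category] += 1
        total[category] += order_revenue
    return {
        category: {
            "orders": count[category],                       # bubble size
            "revenue": total[category],                      # x-axis
            "avg_order": total[category] / count[category],  # y-axis
        }
        for category in count
    }


bubbles = category_bubbles(orders)
```

A category that sits far to the right and high up, like Watches in this made-up data, earns its revenue from fewer but larger orders.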

Demonstration

For example, suppose the team would like to investigate the sales report for October. First, the month filter is set to ‘10’ for October.

Looking at the overview (1), lower revenue for the month can be observed: it was lower than both the previous month and the monthly average. Moreover, the boxes were red, showing that revenue and orders each dropped by more than 20%. Next, in the revenue breakdown (2), most channels saw a performance drop except ‘Direct’ and ‘Line’. This might be due to lower spending; more investigation is needed here. Finally, in the daily revenue (3), revenue from the 15th to the 26th was significantly lower than in the same period of the previous month. This raises the question of whether something was wrong with the website or ad spending was lower during the period.
[Image: annotated dashboard for October]

Discussion

Visualization software such as Tableau helps us analyze the data interactively. Tableau also offers data connections to various sources such as SQL databases. With Tableau's user-friendly interface, a dashboard can be created with little effort.

Those seeking more customization than Tableau provides for data import, data manipulation, or data visualization can use R Markdown to create an interactive dashboard with the help of the following packages: rmarkdown, shiny, flexdashboard, ggplot2, plotly, treemap. R offers several packages to choose from when creating such a dashboard, which allows a high level of customization. Although the platform requires coding, it is a good start for those who want to try data visualization programming. The interactive part of the graphs is handled by ‘plotly’, which turns simple R code into JavaScript-based graphics and spares users the much steeper learning curve of JavaScript.

When choosing between the two platforms, users should consider the requirements of their dashboard and use the one most suitable for the project. Both approaches have advantages and disadvantages, which are summarized below:
[Image: summary of advantages and disadvantages of Tableau and R Markdown]

Future Work

1 Data Capture
The data is currently downloaded manually. In the future, we can build a pipeline that reads the data directly from an API, which would automate the whole visualization process.

2 More interactive
Some of the graphs can be converted from static to interactive so that the whole application is more user-friendly.

3 Faster
The interactive part of the visualization is somewhat slow. In the future, the whole application could be deployed to the cloud instead of a local server, which should make it faster to use.

4 More detailed
The finer details of the graphs can be further tuned.


User Guide

The functions of the application are explained in the picture below:
[Image: annotated guide to the application's functions]

Reference

[1] shiny
[2] DT
[3] geofacet
[4] tibble
[5] d3Tree
[6] d3treeR
[7] tidyverse
[8] sparkline
[9] dplyr
[10] reshape2
[11] shinydashboard
[12] ggthemes
[13] ggplot2
[14] Combining data tables and sparklines
[15] https://wiki.smu.edu.sg/1617t1ISSS608g1/ISSS608_2016_17T1_Group6_Poster


Special Thanks

We would like to give special thanks to Prof. Kam for his dedicated support and guidance! We also thank everyone who spends time on the fruits of our labor. We sincerely hope you enjoy it! Your suggestions and feedback are most appreciated and welcome!