Difference between revisions of "ANLY482 AY2016-17 T2 Group19 Documentation"

<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Arial, sans-serif; border-radius: 7px; ">
Reports and Minutes '''will not be available''' due to client confidentiality
</div><br>
  
=<div style="background: #4d4d4d; padding: 20px;  line-height: 0.1em;  text-indent: 10px; font-size:20px; font-family: Trajan Pro;  border-radius: 7px; border-bottom:3px solid #ba3749"><font color= #ffffff>Interactive Visual Analytical Dashboard</font></div>=
  
 
<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Arial, sans-serif; border-radius: 7px; ">
 
<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Arial, sans-serif; border-radius: 7px; ">
'''1. User Interface Design'''
 
 
[[File:1st.png]]
 
 
'''''Figure 1'''''
 
 
The Interactive Visual Analytical Dashboard (IVAD) presents transactional data from three main perspectives to provide a holistic view of the data: Geospatial, Product, and Customer Type. This approach is based on the 5 Ws of information gathering, with Geospatial relating to Where, Product relating to What, and Customer Type relating to Who. Alongside these three perspectives, the UI design follows a consistent underlying philosophy, Shneiderman’s mantra: overview first, zoom and filter, then details on demand. This was implemented by creating three distinct portions in every perspective view: Snapshot, Filter, and Trend.
 
The Snapshot portion acts as an overview and navigation panel that takes the form of either a web-map or a tree-map, alongside a simple bar chart or heatmap. It uses data from the time frame selected in the sidebar and offers two levels of data visualisation: a snapshot of that time frame (overview), displayed alongside trend information (detail), as shown in Figure 1. Each perspective is elaborated on below.
 
 
'''Geospatial Perspective'''
 
 
When the data is initialised, a geospatial overview is provided: a snapshot of the past quarter’s customers is displayed as a proportional symbol map, next to a synced control map that is zoomed out to show the entire map. This acts as an overview of the data, and the proportional symbols also act as a navigation panel: clicking a symbol reactively filters the data and generates trend charts of subzone performance alongside the performance of the specific customer over time, as seen in Figure 2.
 
In addition, users can probe the composition of the performance of each subzone or individual customer with a tree-map, which provides a more effective visualisation of multivariate information on both sales and quantity with respect to different categories or specific products. This also allows a specific customer to be compared against its peers within the same subzone (Figure 3).
 
 
[[File:2nd.png]]
 
 
'''''Figure 2'''''
 
 
[[File:3rd.png]]
 
 
'''''Figure 3'''''
 
 
'''Product Perspective'''
 
 
In the product perspective, an interactive treemap based on data from the selected timeframe is used instead of a webmap.
 
 
[[File:4th.png]]
 
 
'''''Figure 4'''''
 
 
The tree-map serves as both a navigation tool and an overview: it allows the user to drill down from a broad product category to a single product and then view the sales trends at any level, from broad product category to item, at a choice of intervals. A simple bar chart is also available next to the tree-map for a quick overview of sales of the different categories over a specified timeframe.
 
 
'''Customer Perspective'''
 
 
In this perspective, a tree-map of the customer type is used, with the following levels:
 
1. Broadest customer grouping
 
2. Customer grouping
 
3. Category of products
 
4. Subcategories of product
 
 
[[File:5.png]]
 
 
'''''Figure 5'''''
 
 
Using a tree-map as a navigation tool again allows the user to track changes in the composition of buying patterns (in the form of reactive trend charts) for different customer types over time, while also providing a quick overview of the buying patterns of a certain customer group over a time frame (Figure 5). A heatmap also visualises the purchasing patterns of the various customer segments and allows for quick identification of strong product categories across all customer segments.
 
 
[[File:6.png]]
 
 
'''''Figure 6'''''
 
 
'''''Data Visualisation Processes'''''
 
 
The Shiny app is written in the single-file format and carries out the bulk of the data visualisation processes, namely web-mapping, interactive treemapping, and reactive charting.
 
 
'''I. Generation of webmap'''
 
 
Using the leaflet R package, the web-map is generated in two main sections, a control map and a zoomed-in map, as shown in Fig. 7. The two maps are synced through proxies, so that a click on the control map zooms the right map onto the corresponding area. This allows for convenient navigation of the map. On the right map, proportional symbols representing sales within a specified timeframe are generated, with the colour of each symbol corresponding to the customer type. When a symbol is clicked, it is highlighted in red and key details appear in a popup (redacted for confidentiality).
 
 
[[File:7.png]]
 
 
'''''Figure 7'''''
 
 
Alternatively, navigation can be done using dropdown menus that are populated with items from the corresponding time frame. Both routes run through the same series of observer events: a click input first updates the dropdown input, and the dropdown input then sets the view and zoom on the right map. Routing every navigation through the dropdown reduces the probability of a trigger cascade.
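
The click-to-dropdown-to-map chain described above can be illustrated with a small sketch. It is written in plain Python rather than in the app's R/Shiny code, purely to show the control flow; the class `Navigator`, its method names, the zoom level of 15 and the coordinates are all hypothetical and are not taken from the IVAD source.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Navigator:
    """Hypothetical stand-in for the click -> dropdown -> map view chain."""

    centroids: Dict[str, Tuple[float, float]]  # subzone -> (lat, lng), illustrative values
    selected: Optional[str] = None  # current dropdown value
    view: Optional[Tuple[float, float, int]] = None  # (lat, lng, zoom) of the right map
    view_updates: int = 0  # number of times the right map was re-centred

    def on_map_click(self, subzone: str) -> None:
        # A click never moves the map itself; it only updates the dropdown.
        self.set_dropdown(subzone)

    def set_dropdown(self, subzone: str) -> None:
        # Ignore a value that is already selected, so nothing re-fires.
        if subzone == self.selected:
            return
        self.selected = subzone
        self._on_dropdown_change()

    def _on_dropdown_change(self) -> None:
        # The dropdown handler is the only place that sets view and zoom.
        lat, lng = self.centroids[self.selected]
        self.view = (lat, lng, 15)  # zoom level 15 is an assumed value
        self.view_updates += 1


nav = Navigator(centroids={"Bedok North": (1.33, 103.93)})  # made-up coordinates
nav.on_map_click("Bedok North")  # click on the control map
nav.set_dropdown("Bedok North")  # same value picked in the dropdown: no second update
print(nav.view_updates)  # 1
```

Because only one handler ever moves the map, a click and a dropdown selection cannot trigger each other in a loop, which is the cascade the text refers to.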
 
 
'''II. Interactive treemapping'''
 
 
Modifications were made to the d3treeR package, which extends the treemap package. Because of their complexity, the tree-maps used in this app were preloaded along with the environment image. d3treeR allows for interactive hover-over-to-reveal visualisations and for tracking of clicks, which enables the tree-map to be used for reactive charting.
 
 
'''III. Reactive charting'''
 
 
Click inputs from both the leaflet webmap and d3treeR populate the dropdown inputs, which, together with other input parameters prepared in the global script, are used as reactive inputs to filter the datasets. The filtered data is then charted using plotly in addition to ggplot. The key benefit of plotly is that it allows charts to be exported easily as images and provides brush-zooming and hover information. These functions improve accessibility and usability, as users can brush-zoom into areas of interest and hover over interesting trends to gain greater insight.
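
As a rough illustration of the filter-then-chart step, the sketch below filters transactions by the selected inputs and totals sales per quarter, which is the kind of table a trend chart would be drawn from. It is plain Python rather than the app's R code; the function names, the record layout and every record in `transactions` are assumptions made up for the example (the real sales data is confidential).

```python
from collections import defaultdict
from datetime import date


def quarter_label(d):
    """Return a sortable label such as '2017-Q1' for a date."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def quarterly_sales(transactions, subzone=None, customer=None):
    """Filter by the selected inputs, then total sales per quarter."""
    totals = defaultdict(float)
    for t in transactions:
        if subzone is not None and t["subzone"] != subzone:
            continue
        if customer is not None and t["customer"] != customer:
            continue
        totals[quarter_label(t["date"])] += t["sales"]
    return dict(sorted(totals.items()))


# Made-up records for illustration only.
transactions = [
    {"date": date(2016, 11, 3), "subzone": "Bedok North", "customer": "S0297", "sales": 100.0},
    {"date": date(2017, 1, 15), "subzone": "Bedok North", "customer": "S0297", "sales": 50.0},
    {"date": date(2017, 2, 1), "subzone": "Bedok North", "customer": "S0001", "sales": 25.0},
    {"date": date(2017, 2, 9), "subzone": "Other", "customer": "S0002", "sales": 10.0},
]

subzone_trend = quarterly_sales(transactions, subzone="Bedok North")
customer_trend = quarterly_sales(transactions, subzone="Bedok North", customer="S0297")
```

Calling the same function with and without the `customer` argument gives the subzone series and the individual series side by side, mirroring the paired charts described in the text.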
 
 
=<div style="background: #4d4d4d; padding: 20px;  line-height: 0.1em;  text-indent: 10px; font-size:20px; font-family: Trajan Pro;  border-radius: 7px; border-bottom:3px solid #ba3749"><font color= #ffffff> Case Examples </font></div>=
 
 
'''A. GEOSPATIAL PERSPECTIVE'''
 
 
Suppose that a user wishes to run a query on the subzone of ‘Bedok North’. This can be done either by navigating through the smaller map navigator on the left, which has hover-over labels, or through the drop-down menu in the filter area. Either route zooms into the subzone on the display map on the right, where points of interest are presented as a proportional symbol map, as seen in Fig. 8.
 
 
[[File:8.png|1000px]]
 
 
'''''Figure 8'''''
 
 
After that, clicking on a particular point within the subzone, for example S0297, or selecting this customer through the dropdown, displays both subzone and individual data in the form of a bar chart and a line chart respectively. The bar chart also displays the contribution of the individual within the subzone. This allows for a quick view of the relative importance of this customer to the subzone, as well as an assessment of which product classes the customer has been purchasing, as seen in Fig. 9.
 
 
[[File:9.png|1000px]]
 
 
'''''Figure 9'''''
 
 
Besides the trend tab, a product mix comparison is also available using treemaps that are 2 levels deep, allowing the user to compare the individual’s purchasing patterns against the subzone’s total market basket.
 
 
[[File:10.png|1000px]]
 
 
'''''Figure 10'''''
 
 
'''B. PRODUCT PERSPECTIVE'''
 
 
Moving to the product perspective, a user can analyse the sales performance of a specific product class, e.g. Respiratory System, by selecting the item on the treemap.
 
 
[[File:111.png|1000px]]
 
 
'''''Figure 11'''''
 
 
This, in turn, populates the filter section, which also plots a stacked bar chart of the trend over a specified time frame; the default is ‘Quarterly’ due to contextual requirements.
 
 
[[File:122.png|1000px]]
 
 
'''''Figure 12'''''
 
 
The user can also drill down further into the subclass group and the specific item through the treemap. This gives a clearer understanding of specific products, as comparisons can easily be made across the product, subclass, and product class.
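
The drill-down can be thought of as re-aggregating sales one level below the node that was clicked. The following is a minimal Python sketch of that idea, not the d3treeR implementation; the function `drill_down`, the row layout and all subclass and item names are hypothetical placeholders.

```python
from collections import defaultdict


def drill_down(rows, path=()):
    """Total sales of the nodes one level below `path` in the hierarchy."""
    path = tuple(path)
    depth = len(path)
    totals = defaultdict(float)
    for levels, sales in rows:
        # Keep only rows that sit underneath the clicked node.
        if len(levels) > depth and tuple(levels[:depth]) == path:
            totals[levels[depth]] += sales
    return dict(totals)


# Each row: ((product class, subclass, item), sales). Values are made up.
rows = [
    (("Respiratory System", "Subclass A", "Item 1"), 40.0),
    (("Respiratory System", "Subclass A", "Item 2"), 10.0),
    (("Respiratory System", "Subclass B", "Item 3"), 30.0),
    (("Other Class", "Subclass C", "Item 4"), 20.0),
]

top_level = drill_down(rows)
subclasses = drill_down(rows, ["Respiratory System"])
items = drill_down(rows, ["Respiratory System", "Subclass A"])
```

Each call returns the tiles that would be shown at that depth, so comparing `top_level`, `subclasses` and `items` corresponds to comparing across product class, subclass and product.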
 
 
[[File:13.png|1000px]]
 
 
'''''Figure 13'''''
 
 
'''C. CUSTOMER TYPE PERSPECTIVE'''
 
 
The customer perspective is very similar to the product case. However, in the customer perspective, a heat map is also provided to show the percentage of sales of the various product classes against the various customer groups. Navigation is done in a similar way to the product perspective, with the treemap being 5 levels deep (customer grouping level 1, customer grouping level 2, product class, product subclass, item). This depth allows the user to understand exactly which products a customer group or subgroup is purchasing, and gives a quick view of the composition of its market basket. In this case, by navigating through UNI>PUN, we see that Gastrointestinal & Hepatobiliary System products are driving growth.
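
The percentages behind such a heat map could be computed along the following lines. This Python sketch assumes that each cell is a product class's share of its customer group's total sales; the source does not state how the percentages are normalised, so that choice, the function name `sales_share` and the sample figures are all assumptions.

```python
def sales_share(rows):
    """Map (customer group, product class, sales) rows to percentage shares.

    Returns {group: {product_class: percent of that group's total sales}}.
    """
    totals = {}
    cells = {}
    for group, product_class, sales in rows:
        totals[group] = totals.get(group, 0.0) + sales
        row = cells.setdefault(group, {})
        row[product_class] = row.get(product_class, 0.0) + sales
    return {
        group: {pc: 100.0 * value / totals[group] for pc, value in row.items()}
        for group, row in cells.items()
        if totals[group] > 0  # skip groups with no sales to avoid dividing by zero
    }


# Made-up figures for illustration only.
rows = [
    ("UNI", "Respiratory System", 30.0),
    ("UNI", "Gastrointestinal & Hepatobiliary System", 70.0),
    ("Other Group", "Respiratory System", 50.0),
]
shares = sales_share(rows)
```

Normalising within each customer group makes rows comparable even when groups differ greatly in total sales, which suits the cross-segment comparison the heat map is used for.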
 
 
[[File:14.png|1000px]]
 
 
'''''Figure 14'''''
 
 
Deeper drilling through the relevant subclass (Other GI Drugs) would then allow for the narrowing down of growth drivers.
 
 
[[File:155.png|1000px]]
 
 
'''''Figure 15'''''
 
 
In addition, the interactivity of the treemap further enables the comparison of market baskets across customer segments and the identification of potential opportunities for upselling or cross-selling.
 
 
</div><br>
 
=<div style="background: #4d4d4d; padding: 20px;  line-height: 0.1em;  text-indent: 10px; font-size:20px; font-family: Trajan Pro;  border-radius: 7px; border-bottom:3px solid #ba3749"><font color= #ffffff> Acknowledgements </font></div>=
 
<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Arial, sans-serif; border-radius: 7px; ">
 
Additions to the data, namely the demographic data and the subzone data, were sourced from Data.gov.sg, a publicly accessible database run by the Singapore Government.  
 
 
We would like to extend our gratitude to our data sponsor for entrusting us with the sales data as well as for the efficient assistance rendered. We appreciate the timely correspondence with regard to our enquiries and clarifications.
 
We would also like to extend our gratitude to Professor Kam Tin Seong, our project supervisor, for his valuable insights regarding dashboard construction and for guiding us throughout the entirety of the practicum.
 
</div><br>
 
</div><br>

Latest revision as of 00:31, 24 April 2017




=<div style="background: #4d4d4d; padding: 20px;  line-height: 0.1em;  text-indent: 10px; font-size:20px; font-family: Trajan Pro;  border-radius: 7px; border-bottom:3px solid #ba3749"><font color= #ffffff> Conclusion </font></div>=

<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Arial, sans-serif; border-radius: 7px; ">
'''A. LIMITATIONS'''

This project was limited to a span of roughly 6 months, from the sourcing of a data sponsor to the exploratory data analysis (EDA) and finally to the dashboard construction. With severe time constraints, the full capabilities of the R and Shiny platforms could not be represented fully and accurately. Furthermore, the insights generated from the sales data in this project are not exhaustive, as not all relevant techniques and tools were used.

'''B. RECOMMENDATIONS / IMPLICATIONS'''

We recommend that future work in this field explore the use of clustering on top of geospatial analysis to provide a deeper understanding of the business. Furthermore, should the data set be operational in nature, it is critical to analyse stock movements with holding and transportation costs in mind. Predictive analytics can be employed to forecast demand and thus schedule restocking better. With regard to transportation costs, delivery schedules or routes can be analysed to increase operational efficiency and lower costs for the business. This applied research into the R and Shiny platform has implications for how businesses other than large corporations can employ big data analytics in a more affordable way. With more businesses making use of their untapped wealth of data, greater value can be generated to benefit end-consumers and the country’s overall productivity.

'''C. ENDING REMARKS'''

IVAD’s initial development was done with the data sponsor’s interest in mind. However, the dashboard has the potential to benefit SMEs whose business models are based on wholesaling. IVAD’s open-source nature and its intermediate IT requirements present a viable alternative for SMEs to leverage the benefits of data analytics. The geospatial aspect of IVAD provides users with a new avenue to analyse their sales transactions, thus allowing them to improve upon logistical processes and the precision of marketing efforts. The combination of the geospatial, product and customer aspects, along with its reactive charts and interactivity, can be used to improve decision making through data-driven insights. However, a significant barrier that remains is the nonchalance of many SMEs towards using newer technologies to drive productivity. Common concerns behind this nonchalance relate to ease of use, which should be alleviated by the user-friendly interface and operation of IVAD.
</div><br>

