Difference between revisions of "Project insights"

From Analytics Practicum

Revision as of 23:21, 14 April 2018


Project Pain Points

We learnt that working with an external organisation has its difficulties, and that stakeholder management needs constant attention. Real-world data is often unorganised, unclean and difficult to make sense of. Dealing with it was part of the learning experience and helped to widen our perspectives.

The main problem we faced when we received the data from the sponsor was that it was messy. There was no proper storage of the data, and the files were scattered across different folders. On top of that, the data was given to us in raw form and in inconsistent formats. Hence, it was a huge challenge for the team to understand the data, let alone use it effectively.

Despite this, we identified two key pain points that the sponsor is experiencing:

  1. There is no proper database for storing the data, which prevents the business user from getting an aggregated, high-level view of the records for analysis.
  2. The data is disorganised, despite being held in very detailed and deliberately formatted tables. The invoices we examined did not share a common format, and the wording often changed even when it referred to the same thing.
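
The second pain point, the same item appearing under different wording, is the kind of problem a small normalisation step can absorb. The sketch below is only an illustration in Python (our own analysis was written in R). The activity names and the `ALIASES` table are hypothetical and are not taken from the sponsor's invoices.

```python
import re

# Hypothetical alias table: maps cleaned-up variants to one canonical label.
ALIASES = {
    "rock climbing": "rock climbing",
    "rockclimbing": "rock climbing",
    "rock wall climbing": "rock climbing",
    "kayak": "kayaking",
    "kayaking": "kayaking",
}

def normalise(label: str) -> str:
    """Lower-case, strip punctuation and extra spaces, then apply the alias table."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", label.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Unknown labels come back in cleaned form so they can be reviewed by hand.
    return ALIASES.get(cleaned, cleaned)
```

Labels that are not in the table are left for manual review rather than guessed at, which keeps the mapping auditable.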


Dealing with the Data

The data available to us comes in different file formats, and the documents do not all have the same number of pages. What is consistent across them, however, is that each one records the items or activities picked by the school as part of its programme package. With this in mind, we decided to extract the key activities and save them into a separate data file that aggregates the information from every invoice we have been given. We will then use this new dataset for market basket analysis as well as for exploratory data analysis (EDA), to continue improving our understanding of the data.
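
To make the idea concrete, the sketch below shows how an aggregated dataset of this shape (one set of activities per invoice) feeds a very simple market basket calculation. It is a minimal Python illustration under stated assumptions, not our actual R code: the `invoices` data is made up, `pair_rules` is a hypothetical helper, and only item pairs are considered, whereas a full analysis would normally use a dedicated association-rule package.

```python
from collections import Counter
from itertools import combinations

# Hypothetical aggregated dataset: one set of activities per invoice.
invoices = [
    {"kayaking", "rock climbing", "campfire"},
    {"kayaking", "campfire"},
    {"rock climbing", "campfire"},
    {"kayaking", "rock climbing"},
]

def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every rule a -> b over item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of invoices containing both items
        rules[(a, b)] = (support, count / item_counts[a])  # confidence of a -> b
        rules[(b, a)] = (support, count / item_counts[b])  # confidence of b -> a
    return rules

rules = pair_rules(invoices)
```

Here support is the share of invoices that contain both activities, and confidence is the share of invoices containing the first activity that also contain the second.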


Building an End-to-End Solution

After the interim phase, we decided to focus on building an end-to-end solution for our sponsor. One reason was the limited data that we managed to retrieve from the invoices. In addition, relying on market basket analysis as our only core analysis proved insufficient to solve the sponsor's problem, which was largely about resource allocation and aggregation of data. Following recommendations from our supervisor (Prof. Kam), we began to look at the bigger picture and at how we could go about creating this end-to-end solution. We finally settled on R Shiny: our back-end market basket analysis was already written in R, so visualisation could be integrated into the solution easily.