Difference between revisions of "Project insights"

From Analytics Practicum

Revision as of 23:03, 14 April 2018


Project Pain Points

We learnt that working with an external organisation has its difficulties, and that we need to stay mindful of stakeholder management throughout. Real-world data that is unorganised, unclean and difficult to make sense of is also part of the learning process, and working with it helped to widen our perspectives.

The main problem we faced when we received the data from the sponsor was that it was messy. There was no proper storage of the data, with different files scattered across different folders. On top of that, the data was given to us in raw form, in inconsistent formats. Hence, it was a huge challenge for the team to understand the data, let alone use it effectively.

Despite this, we have identified 2 key pain points that the sponsor is experiencing:

  1. There is no proper database storage for the data, which prevents the business user from having an aggregated, high-level view of the records for analysis purposes.
  2. The data is too disorganised, despite containing very detailed and deliberately formatted tables. We observed that the recorded invoices rarely share the same format, and that the phrasing often changes even when two entries refer to the same thing.
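
To illustrate the second pain point, the sketch below shows one possible way to map differently phrased invoice entries onto a single canonical activity name. It is a minimal sketch, not the team's actual method: the activity names, the `CANONICAL` mapping and the `normalise` helper are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical mapping from a cleaned label to one canonical activity name.
# The real invoice wording is not shown in this page, so these are made up.
CANONICAL = {
    "high elements": "High Elements",
    "high element course": "High Elements",
    "kayaking": "Kayaking",
    "kayak session": "Kayaking",
}


def normalise(label):
    """Lower-case, replace punctuation with spaces, collapse whitespace, then map."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", label.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Unknown labels return None so they can be reviewed by hand.
    return CANONICAL.get(cleaned)
```

Returning `None` for unrecognised labels, rather than guessing, keeps unmatched invoice entries visible so that the mapping can be extended manually.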


Dealing with the Data

The data available to us comes in different file formats, and the files do not have the same number of pages. However, one thing we identified as consistent across all of them is that each invoice captures the items or activities picked by the school to be part of its programme package. With this in mind, we decided to pick out the key activities and save them into a separate data file that ultimately aggregates the information across every invoice we have been given. We will then use this new dataset for market basket analysis, as well as for further EDA, to continue improving our understanding of the data.
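
The aggregation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the invoice IDs, the activity names and the helper functions `build_baskets` and `pair_support` are hypothetical. It computes only pairwise support, which is the basic building block of market basket analysis, rather than a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical aggregated data file: one (invoice_id, activity) row per line item.
rows = [
    ("INV-001", "Archery"),
    ("INV-001", "Kayaking"),
    ("INV-001", "Campfire"),
    ("INV-002", "Archery"),
    ("INV-002", "Kayaking"),
    ("INV-003", "Kayaking"),
    ("INV-003", "Campfire"),
]


def build_baskets(rows):
    """Group activities by invoice so that each invoice becomes one basket."""
    baskets = {}
    for invoice_id, activity in rows:
        baskets.setdefault(invoice_id, set()).add(activity)
    return baskets


def pair_support(baskets):
    """Return the share of baskets that contain each pair of activities."""
    counts = Counter()
    for items in baskets.values():
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


baskets = build_baskets(rows)
support = pair_support(baskets)
```

In this made-up sample, a pair with high support is one that schools tend to pick together in the same programme package.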


Building an End-to-End solution