Project insights

From Analytics Practicum


Project Pain Points

We learnt that working with an external organisation has its difficulties, and that stakeholder management needs constant attention. Real-world data is often unorganised, unclean and difficult to make sense of; dealing with this was part of the learning experience and helped to widen our perspectives.

The main problem we faced when we received the data from the sponsor was that it was messy. The data was not stored properly, with files scattered across different folders. On top of that, the data was given to us in raw form with inconsistent formats. Hence, it was a huge challenge for the team to understand the data, let alone use it effectively.

Despite this, we identified 2 key pain points that the sponsor is experiencing:

  1. There is no proper database storage for the data, which prevents the business user from having an aggregated, high-level view of the records for analysis purposes.
  2. The data is disorganised, even though the individual tables are detailed and deliberately formatted. We observed that the recorded invoices did not share a common format, and that the phrasing often changes even when it refers to the same item.


Dealing with the Data

The data available to us comes in different file formats, and the files do not have the same number of pages. However, one thing is consistent across them: each file captures the items or activities picked by the school to be part of its programme package. With this in mind, we decided to extract the key activities and save them into a separate data file that aggregates the information across every invoice we have been given. We will then use this new dataset for market basket analysis and for further EDA, to continue improving our understanding of the data.
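
As a rough illustration of this approach, the sketch below groups extracted activities by invoice and computes support and confidence for pairs of activities. It is only a minimal sketch: the invoice IDs, the activity names, the `pair_rules` helper and the 0.5 support threshold are all hypothetical and do not come from the sponsor's data or from our actual pipeline.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical aggregated dataset: one (invoice_id, activity) row per
# activity extracted from an invoice. All values are made up.
rows = [
    ("INV-001", "kayaking"), ("INV-001", "campfire"), ("INV-001", "high ropes"),
    ("INV-002", "kayaking"), ("INV-002", "campfire"),
    ("INV-003", "campfire"), ("INV-003", "archery"),
    ("INV-004", "kayaking"), ("INV-004", "campfire"), ("INV-004", "archery"),
]

# Aggregate the rows into one basket (set of activities) per invoice.
baskets_by_invoice = defaultdict(set)
for invoice_id, activity in rows:
    baskets_by_invoice[invoice_id].add(activity)


def pair_rules(transactions, min_support=0.5):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for every pair of items whose support is at least min_support."""
    baskets = [set(items) for items in transactions]
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of invoices containing both items
        if support < min_support:
            continue
        # confidence(x -> y) = count(x and y) / count(x)
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


rules = pair_rules(baskets_by_invoice.values())
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Only pairs are considered here to keep the example short; a full market basket analysis would normally use a frequent-itemset algorithm such as Apriori to cover larger combinations.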


Building an End-to-End solution