Project insights

Project Pain Points

We learnt that working with an external organisation has its difficulties, and that we need to pay attention to stakeholder management throughout. Real-world data that is unorganised, unclean and difficult to make sense of is also part of the learning experience, and dealing with it helped to widen our perspectives.

The main problem we faced when we received the data from the sponsor was that it was messy. There was no proper storage of the data, with different files kept in different folders. On top of that, the data was given to us in raw form with inconsistent formats. Hence, it was a huge challenge for the team to understand the data, let alone use it effectively.

Despite this, we identified two key pain points that the sponsor is experiencing:

  1. There is no proper database storage for the data, which prevents the business user from having an aggregated, high-level view of the records for analysis purposes.
  2. The data is disorganised, despite the tables being very detailed and deliberately formatted. We observed that recorded invoices did not share a consistent format, and that the phrasing often changes even when it refers to the same thing.


Dealing with the Data

The data available to us comes in different file formats, and the files do not have the same number of pages. However, one thing is consistent across them: each invoice captures the items or activities picked by the school to be part of its programme package. With this in mind, we decided to extract the key activities and save them into a separate data file that aggregates the information across every invoice we have been given. We will then use this new dataset for market basket analysis and for our exploratory data analysis (EDA), to continue improving our understanding of the data.
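
The aggregation and market basket step described above can be sketched roughly as follows. This is a minimal illustration written in Python rather than the team's R code. The invoice IDs and activity names are made up, and the measures computed (support, confidence and lift for pairs of activities) are the standard market basket measures, which may differ from what the team actually reported.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical line items extracted from invoices: (invoice_id, activity).
# Real invoices would first need their inconsistent phrasing normalised.
invoice_items = [
    ("INV001", "kayaking"),
    ("INV001", "team building"),
    ("INV001", "campfire"),
    ("INV002", "kayaking"),
    ("INV002", "team building"),
    ("INV003", "campfire"),
    ("INV003", "team building"),
    ("INV004", "kayaking"),
]


def build_baskets(items):
    """Aggregate line items into one set of activities per invoice."""
    baskets = defaultdict(set)
    for invoice_id, activity in items:
        baskets[invoice_id].add(activity.strip().lower())
    return dict(baskets)


def pair_rules(baskets):
    """Compute support, confidence and lift for every ordered activity pair."""
    n = len(baskets)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for activities in baskets.values():
        for a in activities:
            item_count[a] += 1
        for a, b in combinations(sorted(activities), 2):
            pair_count[(a, b)] += 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of invoices containing both activities
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            lift = confidence / (item_count[rhs] / n)
            rules.append({"lhs": lhs, "rhs": rhs, "support": support,
                          "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


baskets = build_baskets(invoice_items)
rules = pair_rules(baskets)
for rule in rules:
    print(f'{rule["lhs"]} -> {rule["rhs"]}: support={rule["support"]:.2f}, '
          f'confidence={rule["confidence"]:.2f}, lift={rule["lift"]:.2f}')
```

A lift above 1 suggests that two activities are chosen together more often than chance would predict, which is the kind of signal that could inform programme packages.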


Building an End-to-End solution

After the interim, we decided to focus on building an end-to-end solution for our sponsor, because we managed to retrieve only limited data from the invoices. Additionally, market basket analysis as our only core analysis proved insufficient to solve the sponsor's problem, which was largely related to resource allocation and aggregation of data. Based on recommendations from our supervisor (Prof. Kam), we began to look at the bigger picture and at how we could create this end-to-end solution. We finally settled on R Shiny: our backend for market basket analysis was already written in R, so it would be easy to integrate visualisation into the solution.

Bytesyzed-rshiny.png


Methodology

1. Workflow Diagram

Bytesyzed-workflow.png

The first step in this project was to create the workflow diagram. The workflow diagram helps us understand:

  1. Logical flow of business process
  2. How our end-user would interact with the application
  3. What data needs to be collected into our database


2. Entity-Relationship Diagram (ER Diagram)

Bytesyzed-erdiagram.png

Subsequently, we built an Entity-Relationship diagram as the basis for the database. The ER diagram shows how the databases are linked to each other and what data will be stored in each table. It also helps us understand the data types that we need to capture.
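
To illustrate how an ER diagram translates into linked tables, the sketch below defines three tables in an in-memory SQLite database. It is written in Python for brevity and is not the team's R code. Every table name, column name and sample value is a hypothetical stand-in; the actual fields are those shown in the ER diagram above.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions,
# not the fields from the team's actual ER diagram.
SCHEMA = """
CREATE TABLE client (
    client_id   INTEGER PRIMARY KEY,
    school_name TEXT NOT NULL
);
CREATE TABLE activity (
    activity_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE booking (
    booking_id  INTEGER PRIMARY KEY,
    client_id   INTEGER NOT NULL REFERENCES client(client_id),
    activity_id INTEGER NOT NULL REFERENCES activity(activity_id),
    booked_on   TEXT NOT NULL  -- ISO date string
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)

# Made-up sample rows, one per table.
conn.execute("INSERT INTO client VALUES (1, 'Example Primary School')")
conn.execute("INSERT INTO activity VALUES (1, 'kayaking')")
conn.execute("INSERT INTO booking VALUES (1, 1, 1, '2018-03-01')")
conn.commit()

# Join the three tables to get a readable view of each booking.
rows = conn.execute("""
    SELECT c.school_name, a.name, b.booked_on
    FROM booking AS b
    JOIN client AS c ON c.client_id = b.client_id
    JOIN activity AS a ON a.activity_id = b.activity_id
""").fetchall()
print(rows)

# A booking that points at a non-existent client should be rejected.
try:
    conn.execute("INSERT INTO booking VALUES (2, 99, 1, '2018-03-02')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("orphan booking rejected:", fk_rejected)
```

Declaring the links as foreign keys means the database itself refuses a booking for a client or activity that does not exist, which is one way to keep aggregated records consistent.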


3. End-to-End Framework using Shiny

Bytesyzed-shinyframework.png

Lastly, we worked out how our R Shiny application would work. The database is built on helper methods that we create for each of the client, bookings and activity databases. Each helper file (e.g. clienthelpers.R) includes "Create", "Update" and "Read" functions that allow the database to be built and kept up to date. We also utilised dashboardhelpers and mbahelpers to do the data manipulation and visualisation for the dashboard and the market basket analysis respectively.
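
The helper-file pattern described above can be sketched as below. This is a Python analogue and not the contents of clienthelpers.R: the function names, the `client` table and its columns are all hypothetical, and an in-memory SQLite database stands in for whatever storage the application actually used.

```python
import sqlite3


def create_client_table(conn):
    """Create the (hypothetical) client table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS client ("
        "client_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "school_name TEXT NOT NULL, "
        "contact_email TEXT)"
    )


def create_client(conn, school_name, contact_email=None):
    """'Create': insert one client and return its generated id."""
    cur = conn.execute(
        "INSERT INTO client (school_name, contact_email) VALUES (?, ?)",
        (school_name, contact_email),
    )
    conn.commit()
    return cur.lastrowid


def read_clients(conn):
    """'Read': return all clients as a list of dicts, ordered by id."""
    cur = conn.execute(
        "SELECT client_id, school_name, contact_email "
        "FROM client ORDER BY client_id"
    )
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def update_client_email(conn, client_id, contact_email):
    """'Update': change one client's email; return True if a row was changed."""
    cur = conn.execute(
        "UPDATE client SET contact_email = ? WHERE client_id = ?",
        (contact_email, client_id),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
create_client_table(conn)
new_id = create_client(conn, "Example Primary School")
updated = update_client_email(conn, new_id, "office@example.edu")
clients = read_clients(conn)
print(clients)
```

Keeping each table's "Create", "Read" and "Update" logic in its own helper module means the user-interface code only calls small, named functions and never builds SQL itself.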


Application Prototype

Bytesyzed-dashboard.png