ANLY482 AY2016-17 T2 Group06



Project Overview

Khoo Teck Puat Hospital (KTPH) is a 590-bed general and acute care hospital, managed by Alexandra Health System. Opened in June 2010, KTPH offers a comprehensive range of medical services and specialist care to the community in the north.

Every month, KTPH manually prepares 25 reports from 4 data files, each containing around 60k rows, and submits them to the Ministry of Health (MOH). This creates three major problems for the hospital.

First, the process is very time-consuming. The staff must manually go through all the data records to identify suspicious ones, and then contact the relevant departments for verification. Only when all the suspicious records have been verified can the staff proceed to prepare the 25 reports based on different templates. Verifying the data and preparing the reports takes the hospital about two weeks each month.

Second, the process is error-prone because it involves intensive manual work, where human errors are hard to avoid.

Third, the process is tedious. The staff repeat the same data cleaning and report preparation procedures every month, taking up time that could be spent on more intellectually demanding work.

As such, our project focuses on automating the data cleaning and report generation process for KTPH, so as to improve efficiency and accuracy and to free up staff time. Moreover, KTPH currently lacks any data visualization for viewing the changes in the data records documented in the reports, so our project will also implement a dashboard that shows the important changes in those records.

Project Motivation

In recent years, digital disruption has spread to every aspect of the commercial world. The growing trend of digital transformation can be attributed to the widespread adoption of business intelligence and data analytics approaches. In Singapore, the Smart Nation initiative drives the adoption of data analytics among industry partners by harnessing data technologies to create business benefits. As a result, many organizations are looking for more systematic and time-efficient technological approaches to replace manual activities and streamline ongoing business processes.

At Khoo Teck Puat Hospital (KTPH), datasets covering essential metrics such as attendances and patient days are extracted from internal information systems on a monthly basis. These datasets must be cleaned manually and turned into standard reports, which are submitted to MOH every month. Excel spreadsheet functions are the only tool used to clean and process the datasets. The process is extremely tedious because of the large data volume, and it consumes many man-hours. Moreover, the approach is error-prone, which compromises the accuracy and quality of the reports generated. KTPH is therefore looking for a technological approach to automate the monthly data cleaning and report generation process.

R can address the issues KTPH faces by bridging the gap between KTPH's business functions and its technical capabilities. R is chosen for our project primarily because of its versatility in the field of statistics. It focuses on user-friendly data analysis and on statistical and graphical models. It is also supported by a large ecosystem, including RStudio and CRAN, which provides a wide range of statistical tools and packages.

There are other data processing tools on the market, such as Python and Visual Basic, which share benefits such as open-source availability and large online communities. However, as Python is a general-purpose language, most of its data analysis functionality is only available through packages like NumPy and Pandas. R, on the other hand, includes data analysis functionality by default, which greatly reduces its reliance on add-in statistical packages. Compared to Python, statistical models can be written in a few lines of R, not to mention that a great variety of statistical tests and models are readily available in R. On top of this, packages like ggplot2 and Shiny make R easier to use and more customizable for data visualization, which is an essential aspect of our project.

Project Objectives

1. To automate the data cleaning process according to the required procedures for the Specialist Outpatient Clinic (SOC), Acute and Emergency (A&E), and Inpatient (Inpt) data files, writing the clean records and the dirty records to two separate data files.

2. To use the clean data to automatically generate the monthly reports for the Ministry of Health (MOH) in the standard format.

3. To analyze the monthly performance data and present meaningful insights through data visualization.

Project Progress

[Figure: Team 06 midterm progress]