Difference between revisions of "ANLY482 AY2016-17 T2 Group06"

 
(14 intermediate revisions by 3 users not shown)
Line 15: Line 15:
 
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="12%" |  
 
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="12%" |  
 
[[ANLY482_Enigma_Project_Management|<font color="#F5F5F5" size=2><b>PROJECT MANAGEMENT</b></font>]]
 
[[ANLY482_Enigma_Project_Management|<font color="#F5F5F5" size=2><b>PROJECT MANAGEMENT</b></font>]]
 +
 +
| style="background:none;" width="1%" | &nbsp;
 +
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="10%" |
 +
[[ANLY482_ Enigma_Methodology|<font color="#F5F5F5" size=2><b>METHODOLOGY</b></font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
Line 23: Line 27:
 
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="10%" |  
 
[[ANLY482_ Enigma_Documentation|<font color="#F5F5F5" size=2><b>DOCUMENTATION</b></font>]]
 
[[ANLY482_ Enigma_Documentation|<font color="#F5F5F5" size=2><b>DOCUMENTATION</b></font>]]
 +
 +
| style="background:none;" width="1%" | &nbsp;
 +
| style="padding:0.3em; font-size:100%; background-color:#134E84;  border-bottom:0px solid #AEC6CF; text-align:center; color:#F5F5F5" width="10%" |
 +
[[Main_Page|<font color="#F5F5F5" size=2><b>ANLY 482 HOMEPAGE</b></font>]]
 
|}  
 
|}  
 
<!--/Header-->
 
<!--/Header-->
  
== Current Progress ==  
+
== Project Overview ==
Iteration 2 (updated wiki, proposal report uploaded)
+
Khoo Teck Puat Hospital (KTPH) is a 590-bed general and acute care hospital, managed by Alexandra Health System. Opened in June 2010, KTPH offers a comprehensive range of medical services and specialist care to the community in the north.
 +
 
 +
Every month, KTPH manually prepares 25 reports from 4 data files, each containing around 60,000 rows, and submits them to the Ministry of Health (MOH). This creates three major problems for the hospital.
  
== Sponsor Background ==
+
First, the process is very time-consuming. Staff must manually go through all the data records to identify suspicious ones and then contact the relevant departments for verification. Only when all the suspicious records have been verified can the staff proceed to prepare the 25 reports from their different templates. Verifying the data and preparing the reports takes the hospital about two weeks each month.
Kosmebox is Vietnam’s top beauty e-tailer and is expanding its influence across Southeast Asia. Driven by its customer-centric mission statement and its commitment to efficient customer service, Kosmebox aims to make cosmetic products easily accessible to Southeast Asian consumers. Despite its late entry to the market in 2015, its customer base has reached 10,000 and is still growing rapidly. Through Kosmebox’s online portal, customers can browse product offerings from over 100 cosmetic brands, most of which are imported directly from Korea, the US and Vietnam.
+
 +
Second, the process is error-prone because it involves intensive manual work, which makes human error hard to avoid.
  
Like many other start-ups, Kosmebox has found that, despite its initial success, its current business model can no longer meet the needs of its expanding market. It has started to face issues in managing its ongoing business processes, especially in areas such as inventory management and warehouse selection. As a result, Kosmebox needs an analytics solution that can streamline its inventory management process and optimize resource allocation in cases such as warehouse selection.
+
Third, the process is tedious. Staff repeat the same data cleaning and report preparation procedures every month, time that could instead be spent on more intellectually demanding work.
 +
 
 +
As such, our project focuses on automating the data cleaning and report generation process for KTPH to improve efficiency and accuracy and to free up staff time. Moreover, because KTPH currently lacks a data visualization that shows how the data records documented in the reports change, our project will also implement a dashboard for KTPH to view the important changes in those records.
  
 
== Project Motivation ==
 
== Project Motivation ==
Currently, Kosmebox has no systematic way to decide on the replenishment quantity for different products. The forecast of which product types and quantities to order is based mainly on human interpretation of current sales and inventory levels. As a result, forecasts are inaccurate and unreliable because unforeseen external factors cause sales quantities to fluctuate from month to month. In addition, the factors that drive peak-season sales are not taken into consideration when structuring the forecasting model. Furthermore, because Kosmebox decides on the replenishment quantity on the 20th of each month, the forecasted sales quantity may deviate from that month’s actual sales quantity.
+
In recent years, digital disruption has spread to every aspect of the commercial world. The growing trend of digital transformation can be attributed to the widespread adoption of business intelligence and data analytics. In Singapore, the Smart Nation initiative drives the adoption of data analytics among industry partners by harnessing data technologies to create business benefits. As a result, many organizations are looking for more systematic and time-efficient technologies to replace manual activities and streamline their business processes.
 +
At Khoo Teck Puat Hospital (KTPH), datasets covering essential metrics such as attendances and patient days are extracted from internal information systems every month. These datasets must be cleaned manually and turned into standard reports for monthly submission to MOH. Excel spreadsheet functions are the only tool used to clean and process them. Because of the large data volume, the entire process is extremely tedious and consumes many man-hours. Moreover, the approach is error-prone, which compromises the accuracy and quality of the reports generated. As a result, KTPH is actively looking for a technological approach to automating the monthly data cleaning and report generation process.
 +
 
 +
R can address the issues KTPH faces by bridging the gap between KTPH’s business functions and technical capabilities. We chose R for our project primarily because of its versatility in statistics. It focuses on user-friendly data analysis and on statistical and graphical models. It also has a large ecosystem of associated tools, such as RStudio and CRAN, which provide a wide range of statistical tools and packages.
  
Furthermore, as Kosmebox’s sales and business have grown, it plans to set up one more warehouse to improve operational efficiency, potentially by saving delivery cost and processing time when deliveries are made from warehouses in the region. Kosmebox is therefore looking for an optimal warehouse location that balances the workload with its operating warehouses. To facilitate our sponsor’s decision-making, we would need to analyse the new revenue and cost (including labor and warehouse rental costs) of putting a new warehouse into operation in the region. If the new warehouse location is deemed feasible, we would also need to determine how much stock Kosmebox should keep in the new warehouse to operate for a 1- to 2-month period.
+
There are other data processing tools on the market, such as Python and Visual Basic, some of which share benefits such as being open source and having large online communities. However, as Python is a general-purpose language, most of its data analysis functionality comes from packages such as NumPy and pandas. R, on the other hand, has data analysis functionality built in by default, which largely reduces its reliance on add-on statistical packages. Compared with Python, statistical models can be written in a few lines of R, and R also offers a great variety of readily available statistical tests and models. On top of this, packages such as ggplot2 and Shiny make data visualization in R easier and more customizable, which is essential for our project.
  
 
== Project Objectives ==
 
== Project Objectives ==
1. To analyse the past 2 years of sales data to provide an inventory management and replenishment forecasting model<br>
+
1. To automate the data cleaning process according to the required procedures for the Specialized Outpatient Clinic (SOC), Acute and Emergency (A&E), and Inpatient (Inpt) data files. The output is two separate data files: one containing the clean data and one containing all the dirty data.
2. To use the past 2 years of sales data and Vietnam's geographical information to help the sponsor decide whether it is worthwhile to set up a new warehouse
+
 
 +
2. To use the clean data to automatically generate the monthly reports for the Ministry of Health (MOH) in the standard format.
 +
 
 +
3. To analyze the monthly performance data and use data visualization to present meaningful insights.
 +
 
 +
== Project Progress ==
 +
[[File:Team06 midterm progress 40.JPG|center|600px]]

Latest revision as of 11:14, 23 April 2017


Project Overview

Khoo Teck Puat Hospital (KTPH) is a 590-bed general and acute care hospital, managed by Alexandra Health System. Opened in June 2010, KTPH offers a comprehensive range of medical services and specialist care to the community in the north.

Every month, KTPH manually prepares 25 reports from 4 data files, each containing around 60,000 rows, and submits them to the Ministry of Health (MOH). This creates three major problems for the hospital.

First, the process is very time-consuming. Staff must manually go through all the data records to identify suspicious ones and then contact the relevant departments for verification. Only when all the suspicious records have been verified can the staff proceed to prepare the 25 reports from their different templates. Verifying the data and preparing the reports takes the hospital about two weeks each month.

Second, the process is error-prone because it involves intensive manual work, which makes human error hard to avoid.

Third, the process is tedious. Staff repeat the same data cleaning and report preparation procedures every month, time that could instead be spent on more intellectually demanding work.

As such, our project focuses on automating the data cleaning and report generation process for KTPH to improve efficiency and accuracy and to free up staff time. Moreover, because KTPH currently lacks a data visualization that shows how the data records documented in the reports change, our project will also implement a dashboard for KTPH to view the important changes in those records.

Project Motivation

In recent years, digital disruption has spread to every aspect of the commercial world. The growing trend of digital transformation can be attributed to the widespread adoption of business intelligence and data analytics. In Singapore, the Smart Nation initiative drives the adoption of data analytics among industry partners by harnessing data technologies to create business benefits. As a result, many organizations are looking for more systematic and time-efficient technologies to replace manual activities and streamline their business processes.

At Khoo Teck Puat Hospital (KTPH), datasets covering essential metrics such as attendances and patient days are extracted from internal information systems every month. These datasets must be cleaned manually and turned into standard reports for monthly submission to MOH. Excel spreadsheet functions are the only tool used to clean and process them. Because of the large data volume, the entire process is extremely tedious and consumes many man-hours. Moreover, the approach is error-prone, which compromises the accuracy and quality of the reports generated. As a result, KTPH is actively looking for a technological approach to automating the monthly data cleaning and report generation process.

R can address the issues KTPH faces by bridging the gap between KTPH’s business functions and technical capabilities. We chose R for our project primarily because of its versatility in statistics. It focuses on user-friendly data analysis and on statistical and graphical models. It also has a large ecosystem of associated tools, such as RStudio and CRAN, which provide a wide range of statistical tools and packages.

There are other data processing tools on the market, such as Python and Visual Basic, some of which share benefits such as being open source and having large online communities. However, as Python is a general-purpose language, most of its data analysis functionality comes from packages such as NumPy and pandas. R, on the other hand, has data analysis functionality built in by default, which largely reduces its reliance on add-on statistical packages. Compared with Python, statistical models can be written in a few lines of R, and R also offers a great variety of readily available statistical tests and models. On top of this, packages such as ggplot2 and Shiny make data visualization in R easier and more customizable, which is essential for our project.

Project Objectives

1. To automate the data cleaning process according to the required procedures for the Specialized Outpatient Clinic (SOC), Acute and Emergency (A&E), and Inpatient (Inpt) data files. The output is two separate data files: one containing the clean data and one containing all the dirty data.

2. To use the clean data to automatically generate the monthly reports for the Ministry of Health (MOH) in the standard format.

3. To analyze the monthly performance data and use data visualization to present meaningful insights.
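
To make the first objective concrete, the split into a clean file and a dirty file can be sketched as below. This is only an illustration and not the team's implementation: the project itself uses R, while the sketch uses Python's standard library so that it is self-contained. The column names (patient_id, visit_date, clinic), the sample rows and the single "no blank required field" rule are all assumptions, since this page does not list KTPH's actual cleaning procedures.

```python
import csv
import io

# Assumed column names; the real KTPH data files are not described on this page.
REQUIRED_FIELDS = ("patient_id", "visit_date", "clinic")


def is_clean(row):
    """Return True when every assumed required field is present and non-blank."""
    return all((row.get(field) or "").strip() for field in REQUIRED_FIELDS)


def split_records(rows):
    """Partition rows into (clean, dirty) lists, preserving input order."""
    clean, dirty = [], []
    for row in rows:
        (clean if is_clean(row) else dirty).append(row)
    return clean, dirty


def write_csv(rows, fieldnames):
    """Serialise rows to CSV text; a real pipeline would write to a file instead."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data standing in for one monthly extract.
SAMPLE = """patient_id,visit_date,clinic
P001,2017-01-03,SOC
,2017-01-04,A&E
P003,,Inpt
P004,2017-01-05,SOC
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
clean_rows, dirty_rows = split_records(list(reader))
clean_csv = write_csv(clean_rows, reader.fieldnames)  # would become the clean data file
dirty_csv = write_csv(dirty_rows, reader.fieldnames)  # would become the dirty data file
```

Keeping the rule in one small predicate (is_clean) means further checks, such as date-format or clinic-code validation, could be added without touching the splitting or writing steps.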

Project Progress

Team06 midterm progress 40.JPG