Difference between revisions of "ZAN Project Overview"

From Analytics Practicum
 
(11 intermediate revisions by the same user not shown)
Line 25: Line 25:
 
| style="padding:0.3em; font-family:Arimo; font-size:110%; border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22; text-align:center;" width="10%" |
 
| style="padding:0.3em; font-family:Arimo; font-size:110%; border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22; text-align:center;" width="10%" |
 
[[Team_ZAN|<font  face ="Lucida Grande" color="#FFFFFF"><strong>ABOUT US </strong></font>]]
 
[[Team_ZAN|<font  face ="Lucida Grande" color="#FFFFFF"><strong>ABOUT US </strong></font>]]
 +
| style="border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22;" width="1%" | &nbsp;
 +
 +
| style="padding:0.3em; font-family:Helvetica; font-size:110%; border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22; text-align:center;" width="10%" |
 +
[[ANLY482_AY2016-17_Term_2|<font  face ="Lucida Grande" color="#FFFFFF"><strong>BACK TO MAIN ANLY82 </strong></font>]]
 
| style="border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22;" width="1%" | &nbsp;
 
| style="border-bottom:2px solid #228B22; border-top:2px solid #228B22; background:#228B22;" width="1%" | &nbsp;
 
|}
 
|}
Line 36: Line 40:
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Motivation</font></div>
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Motivation</font></div>
 
<br/>
 
<br/>
Our project sponsor is a medical consultant working for Hospital X. He specialises in tending to younger patients from the age of 18 years old and below. He hopes to tap into the under-utilised administrative data that is collected by the hospital daily.
+
'''Zoey's Motivation'''
 +
<br/>
 +
Working with real-world data provided to us by Hospital X is interesting. People’s behaviour, even when it follows expectations, is often varied and unique, and there is always a possibility of observing unexpected behaviours in the data. While we start off seeking to improve the productivity of our sponsor’s colleagues, there is no telling what our analysis of the data might uncover about the situation and the problem, and that makes the project exciting.
  
Patients are usually referred to Hospital X by other medical institutions or they booked an appointment directly. Currently, Hospital X experiences high no-show  appointments rate of about 21% for first visits and 19% for review visits. Our project sponsor is keen on improving productivity for the doctors and psychologists as missed appointments lead to longer appointment lead times, idle time and overall lower quality of care.
+
It is very encouraging that our work has the potential to impact people’s lives. The sponsor’s concern for the community he works in is motivating as well. That this concern has led him to do more for the hospital, even on top of his normal duties, inspires me to dig through the data and see if we can find anything that could help the hospital improve its processes.
  
Freeing up the time wasted by patients’ no-show would improve utilisation of slots, and even reduce appointment wait time for other patients.  
+
<br/>
 +
'''Aishwarya's Motivation'''
 +
<br/>
 +
Many hospitals and clinics face the problem of cancelled appointments and patient no-shows. The time that doctors and consultants allocate to these patients goes to waste, time that could have been put to good use by another patient in need of it. Our motivation is to come up with a solution that maximizes the utilization of the doctors’ time so that they can serve as many patients as possible. In the process, the doctors would no longer have to sit idle during time allocated to patients.
  
 +
Another question that we wish to look into through this project is the link between a patient’s details and his or her probability of cancelling an appointment. Various factors could influence the chance of a patient not showing up, such as the time of the appointment, the location of the patient, the age of the patient, the experience of a previous visit and so on. We hope to study these factors and find out whether they are linked to missed appointments. In doing so, we would be able to predict the probability of a patient not turning up for an appointment, and subsequently allocate the time to another patient in need.
  
 
<br/>
 
<br/>
 +
'''Nas's Motivation'''
 +
<br/>
 +
This project is a follow-up to a previous project that I did with a subsidiary of Hospital X. During that project, I learned a lot about the organization and the amount of effort put in by the various stakeholders to improve the mental wellbeing of the general population. As an operations management student, I am keen to explore ways of improving the productivity of the staff as well as maximizing patients’ access to care.
 +
 +
This project interests me as it offers a unique opportunity to further explore an unfamiliar domain (the medical sector). I believe that we can learn much from our project sponsor as he is a champion for data analytics and has considerable experience.
  
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Secondary Research</font></div>
 
 
<br/>
 
<br/>
Hospital X is a pioneer tertiary hospital that provides a comprehensive range of medical and rehabilitative services for children, adolescents, adults and the elderly. This project plans to make use of the dataset provided by our project sponsor to analyse if there is any relationship between the variables and to create a predictive model for likelihood of a patient in defaulting appointments.
 
  
 
<div align="left">
 
<div align="left">
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Objective & Goals</font></div>
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Objective & Goals</font></div>
 
 
 
<br/>
 
<br/>
 
 
 
The objectives of the project would be the following:
 
The objectives of the project would be the following:
# Analysis of Hospital X's inpatients data
+
# Business objective: To identify factors that relate to no-show appointments and to predict patients’ attendance rates, in order to improve Hospital X’s scheduling of appointments and utilisation of appointment slots.
 +
# Technical objective: To use data analytical tools and statistical methods to study the data and obtain insights that would facilitate the business objective.
 
#* To understand the data domains
 
#* To understand the data domains
 
#* To understand the workflow of scheduling a patient’s consultation process
 
#* To understand the workflow of scheduling a patient’s consultation process
Line 63: Line 73:
 
#* To conduct what-if analyses to understand changes in appointment rates if the patient is referred to a medical professional nearer to them
 
#* To conduct what-if analyses to understand changes in appointment rates if the patient is referred to a medical professional nearer to them
 
#* To evaluate the feasibility of creating a predictive model  
 
#* To evaluate the feasibility of creating a predictive model  
# Recommendations based on findings
 
#* To help stakeholders understand the analysis of the findings
 
#* To consider the feasibility of a visual aid such as dashboard to aid in the stakeholders' future reference
 
 
<br/>
 
<br/>
  
Line 71: Line 78:
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Provided Data</font></div>
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Provided Data</font></div>
 
<br/>
 
<br/>
The data is withheld until the NDA agreement is signed. The dataset that will be given to us is based on our project sponsor’s department inpatient records. The inpatient records are processed by the hospital staff working on the front desk.  
+
The dataset is based on Hospital X's child and adolescent department inpatient records. The inpatient records are processed by the hospital staff working on the front desk. The patient visits are mainly categorised into 1) ''first appointment with a doctor'', 2) ''review appointment with a doctor'', 3) ''first appointment with a psychologist'' and 4) ''review appointment with a psychologist''.
 
 
 
<br/>
 
<br/>
  
Line 78: Line 84:
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Methodology</font></div>
 
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Methodology</font></div>
 
<br/>
 
<br/>
As we have not obtained the data until the NDA is signed, we will only share our initial thought process of how we will tackle the project. We shall adopt closely to the Data Analytics Lifecycle approach.
+
For this project, we will follow the Data Analytics Lifecycle approach closely.
 
 
Prior to obtaining the actual data, we have been researching on the project topic to familiarise ourselves with the field domain and to understand the perspectives of the various stakeholders. We identified doctors, hospital frontline staff, patients who defaulted their appointments, patients who are unable to book an appointment due to no slots and Hospital X as the relevant stakeholders in  this project.
 
  
At this phase of the project, we will focus on understanding the given dataset and clean the data. Concurrently, we will decide on the analytical model and prepare the data accordingly.
+
With reference to several research papers, such as Michelle. K. (2011) and Molfenter. T. (2013), our secondary findings are the following:
 +
* Younger patients are significantly less likely to keep their initial outpatient mental health appointments
 +
* No-show behavior is positively correlated with lower income and lower socioeconomic status
 +
* The patient’s previous appointment experience, such as the number of previous appointments, their types and lead times, plays a part in whether the patient defaults on his or her appointment
 +
* The longer a patient has to wait for an appointment to be scheduled, the less likely the patient is to keep his or her first appointment
  
 +
These secondary findings will be a useful starting point for our analysis. We will test the given data against the secondary findings to see whether it is consistent with them.
 
<br/>
 
<br/>
  
Line 93: Line 102:
 
#The dataset only pertains to our project sponsor’s department, which administers only younger patients of ages 18 years and below.
 
#The dataset only pertains to our project sponsor’s department, which administers only younger patients of ages 18 years and below.
 
<br/>
 
<br/>
 +
 +
'''Phase 0: Learning about the Case Context'''
 +
<br/>
 +
We will gather and map out all information on patient appointment scheduling. This includes:
 +
*Mapping out the workflow in Hospital X’s context, as well as the general process as described in existing literature
 +
*Consolidating significant factors of no-show from literature review
 +
*Reviewing existing methods/recommendations for the problem of no-show
 +
<br/>
 +
 +
'''Phase 1: Data Cleaning'''
 +
<br/>
 +
In the first phase, we will closely study the dataset to understand each of its variables and values so as to prepare it for analysis. This involves the following steps:
 +
#Recording the description and range for each variable and its values
 +
#Identifying irrelevant or duplicate fields
 +
#Resolving missing and invalid values
 +
#Cross-checking related variables to verify accuracy
 +
#Transforming skewed variables and merging variables for ease of analysis
 +
#Recording assumptions made
 +
#Documenting all of the above
 +
<br/>
 +
 +
Depending on our findings regarding the dataset and its errors, we will also consult the sponsor to understand more specifically how some fields and their values are entered.
 +
 +
<br/>
 +
 +
'''Phase 2: Data Exploration'''
 +
<br/>
 +
In the second phase, we will conduct exploratory data analysis, as well as test some of the hypotheses gleaned from our literature review findings.
 +
 +
Exploratory data analysis steps include:
 +
*Studying the distributions of variables
 +
*Identifying and treating outliers/anomalies
 +
*Finding clusters or groupings (cluster analysis)
 +
*Comparing two or more locations or time periods (any cycles/seasonal trends)
 +
*Examining relationships between variables (regression analysis)
 +
*Developing hypotheses based on literature and conducting hypothesis testing
 +
<br/>
 +
 +
This analysis should go through a number of iterations, as we will continually compare our findings to existing literature as well as what we know of Hospital X’s processes.
 +
 +
<br/>
 +
 +
'''Phase 3: Data Modelling'''
 +
<br/>
 +
By the final phase, we will have a good understanding of the data and the case, and will develop models for predicting patient no-shows. From these, we will be able to develop solutions for Hospital X.
 +
 +
Steps include:
 +
#Develop various (multiple regression) models
 +
#Compare models and select the best model based on testing
 +
#Interpret the model to develop strategies that Hospital X can adopt
 +
 +
<br/>
 +
  
 
<!--Content End-->
 
<!--Content End-->

Latest revision as of 13:54, 23 April 2017


Motivation


Zoey's Motivation
Working with real-world data provided to us by Hospital X is interesting. People’s behaviour, even when it follows expectations, is often varied and unique, and there is always a possibility of observing unexpected behaviours in the data. While we start off seeking to improve the productivity of our sponsor’s colleagues, there is no telling what our analysis of the data might uncover about the situation and the problem, and that makes the project exciting.

It is very encouraging that our work has the potential to impact people’s lives. The sponsor’s concern for the community he works in is motivating as well. That this concern has led him to do more for the hospital, even on top of his normal duties, inspires me to dig through the data and see if we can find anything that could help the hospital improve its processes.


Aishwarya's Motivation
Many hospitals and clinics face the problem of cancelled appointments and patient no-shows. The time that doctors and consultants allocate to these patients goes to waste, time that could have been put to good use by another patient in need of it. Our motivation is to come up with a solution that maximizes the utilization of the doctors’ time so that they can serve as many patients as possible. In the process, the doctors would no longer have to sit idle during time allocated to patients.

Another question that we wish to look into through this project is the link between a patient’s details and his or her probability of cancelling an appointment. Various factors could influence the chance of a patient not showing up, such as the time of the appointment, the location of the patient, the age of the patient, the experience of a previous visit and so on. We hope to study these factors and find out whether they are linked to missed appointments. In doing so, we would be able to predict the probability of a patient not turning up for an appointment, and subsequently allocate the time to another patient in need.


Nas's Motivation
This project is a follow-up to a previous project that I did with a subsidiary of Hospital X. During that project, I learned a lot about the organization and the amount of effort put in by the various stakeholders to improve the mental wellbeing of the general population. As an operations management student, I am keen to explore ways of improving the productivity of the staff as well as maximizing patients’ access to care.

This project interests me as it offers a unique opportunity to further explore an unfamiliar domain (the medical sector). I believe that we can learn much from our project sponsor as he is a champion for data analytics and has considerable experience.


Objective & Goals


The objectives of the project would be the following:

  1. Business objective: To identify factors that relate to no-show appointments and to predict patients’ attendance rates, in order to improve Hospital X’s scheduling of appointments and utilisation of appointment slots.
  2. Technical objective: To use data analytical tools and statistical methods to study the data and obtain insights that would facilitate the business objective.
    • To understand the data domains
    • To understand the workflow of scheduling a patient’s consultation process
    • To identify the contributing factors that lead patients to defaulting appointments
    • To conduct what-if analyses to understand changes in appointment rates if the patient is referred to a medical professional nearer to them
    • To evaluate the feasibility of creating a predictive model


Provided Data


The dataset is based on Hospital X's child and adolescent department inpatient records. The inpatient records are processed by the hospital staff working on the front desk. The patient visits are mainly categorised into 1) first appointment with a doctor, 2) review appointment with a doctor, 3) first appointment with a psychologist and 4) review appointment with a psychologist.

Methodology


For this project, we will follow the Data Analytics Lifecycle approach closely.

With reference to several research papers, such as Michelle. K. (2011) and Molfenter. T. (2013), our secondary findings are the following:

  • Younger patients are significantly less likely to keep their initial outpatient mental health appointments
  • No-show behavior is positively correlated with lower income and lower socioeconomic status
  • The patient’s previous appointment experience, such as the number of previous appointments, their types and lead times, plays a part in whether the patient defaults on his or her appointment
  • The longer a patient has to wait for an appointment to be scheduled, the less likely the patient is to keep his or her first appointment

These secondary findings will be a useful starting point for our analysis. We will test the given data against the secondary findings to see whether it is consistent with them.

Project Scope


While the project will revolve around the above objectives, our project sponsor is open to letting us explore other relevant analytical tools or techniques that would enhance the findings. The scope is limited in two ways:

  1. The dataset is limited to records from 2015 to 2016, which prevents any seasonal or yearly analysis.
  2. The dataset pertains only to our project sponsor’s department, which sees only younger patients aged 18 years and below.


Phase 0: Learning about the Case Context
We will gather and map out all information on patient appointment scheduling. This includes:

  • Mapping out the workflow in Hospital X’s context, as well as the general process as described in existing literature
  • Consolidating significant factors of no-show from literature review
  • Reviewing existing methods/recommendations for the problem of no-show


Phase 1: Data Cleaning
In the first phase, we will closely study the dataset to understand each of its variables and values so as to prepare it for analysis. This involves the following steps:

  1. Recording the description and range for each variable and its values
  2. Identifying irrelevant or duplicate fields
  3. Resolving missing and invalid values
  4. Cross-checking related variables to verify accuracy
  5. Transforming skewed variables and merging variables for ease of analysis
  6. Recording assumptions made
  7. Documenting all of the above


Depending on our findings regarding the dataset and its errors, we will also consult the sponsor to understand more specifically how some fields and their values are entered.
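
As a rough illustration of the cleaning steps in Phase 1, the sketch below removes duplicates, handles missing and invalid values, cross-checks two related date fields, and keeps a log of every decision. It is a minimal sketch only: the records, the field names (patient_id, age, booked_on, appt_on, status) and the rule of keeping rows with a missing age are all assumptions made for illustration, since the actual dataset layout is not described here.

```python
from datetime import date

# Hypothetical appointment records; every field name and value is an assumption.
RAW = [
    {"patient_id": "P1", "age": 12, "booked_on": date(2016, 1, 4), "appt_on": date(2016, 2, 1), "status": "attended"},
    {"patient_id": "P2", "age": None, "booked_on": date(2016, 1, 5), "appt_on": date(2016, 1, 20), "status": "no-show"},
    {"patient_id": "P3", "age": 45, "booked_on": date(2016, 3, 1), "appt_on": date(2016, 3, 9), "status": "attended"},
    {"patient_id": "P4", "age": 16, "booked_on": date(2016, 3, 9), "appt_on": date(2016, 3, 1), "status": "no-show"},
    {"patient_id": "P1", "age": 12, "booked_on": date(2016, 1, 4), "appt_on": date(2016, 2, 1), "status": "attended"},
]


def clean(records, max_age=18):
    """Return (kept_rows, issues), where issues logs every flagged or dropped row."""
    seen = set()
    kept, issues = [], []
    for index, row in enumerate(records):
        # Duplicate check: same patient and same appointment date.
        key = (row["patient_id"], row["appt_on"])
        if key in seen:
            issues.append((index, "duplicate record"))
            continue
        seen.add(key)

        # Missing value: keep the row but document it (an assumed policy).
        age = row["age"]
        if age is None:
            issues.append((index, "missing age"))
        # Invalid value: the department only sees patients aged 18 and below.
        elif not 0 <= age <= max_age:
            issues.append((index, "age out of range"))
            continue

        # Cross-check related variables: an appointment cannot precede its booking.
        if row["appt_on"] < row["booked_on"]:
            issues.append((index, "appointment before booking"))
            continue

        # Derived variable for later analysis: lead time in days.
        lead_days = (row["appt_on"] - row["booked_on"]).days
        kept.append({**row, "lead_days": lead_days})
    return kept, issues


kept, issues = clean(RAW)
for index, reason in issues:
    print(f"row {index}: {reason}")
```

The issues list doubles as the documentation called for in the last step, and it gives a concrete set of questions to bring to the sponsor.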


Phase 2: Data Exploration
In the second phase, we will conduct exploratory data analysis, as well as test some of the hypotheses gleaned from our literature review findings.

Exploratory data analysis steps include:

  • Studying the distributions of variables
  • Identifying and treating outliers/anomalies
  • Finding clusters or groupings (cluster analysis)
  • Comparing two or more locations or time periods (any cycles/seasonal trends)
  • Examining relationships between variables (regression analysis)
  • Developing hypotheses based on literature and conducting hypothesis testing


This analysis should go through a number of iterations, as we will continually compare our findings to existing literature as well as what we know of Hospital X’s processes.
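
One of the hypotheses from the literature, that longer waits go with more missed appointments, could be tested as sketched below. This is a hedged example rather than the team's chosen method: it uses a two-sided two-proportion z-test, the split into "long" and "short" lead-time groups is an assumption, and the counts are invented for illustration and are not Hospital X figures.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(no_shows_a, n_a, no_shows_b, n_b):
    """Two-sided z-test for a difference between two no-show rates."""
    rate_a = no_shows_a / n_a
    rate_b = no_shows_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (no_shows_a + no_shows_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts only: 60 of 200 long-lead appointments missed,
# versus 40 of 200 short-lead appointments.
z, p_value = two_proportion_z_test(60, 200, 40, 200)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would suggest that the two groups differ in no-show rate, but it would not by itself show that longer waits cause missed appointments.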


Phase 3: Data Modelling
By the final phase, we will have a good understanding of the data and the case, and will develop models for predicting patient no-shows. From these, we will be able to develop solutions for Hospital X.

Steps include:

  1. Develop various (multiple regression) models
  2. Compare models and select the best model based on testing
  3. Interpret the model to develop strategies that Hospital X can adopt
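
The compare-and-select step could look roughly like the sketch below. Note the assumptions: the two candidates are deliberately simple stand-ins (an overall-rate baseline and a lead-time bucket model), not the regression models planned above; the 30-day cutoff, the Brier score as the test metric, and the synthetic rows are all illustrative choices. The point is only the loop of fitting on training data and scoring on held-out data.

```python
def make_rows(lead_days, total, no_shows):
    """Build synthetic (lead_days, missed) rows; purely illustrative data."""
    return [(lead_days, 1 if i < no_shows else 0) for i in range(total)]


TRAIN = make_rows(10, 10, 2) + make_rows(45, 10, 5)
TEST = make_rows(10, 5, 1) + make_rows(45, 5, 3)


def fit_baseline(rows):
    """Predict the overall training no-show rate for every appointment."""
    rate = sum(missed for _, missed in rows) / len(rows)
    return lambda lead_days: rate


def fit_lead_bucket(rows, cutoff=30):
    """Predict the training no-show rate of the matching lead-time bucket.

    Assumes both buckets contain at least one training row.
    """
    short = [missed for lead, missed in rows if lead <= cutoff]
    long_ = [missed for lead, missed in rows if lead > cutoff]
    short_rate = sum(short) / len(short)
    long_rate = sum(long_) / len(long_)
    return lambda lead_days: short_rate if lead_days <= cutoff else long_rate


def brier(model, rows):
    """Mean squared error between predicted probability and outcome (lower is better)."""
    return sum((model(lead) - missed) ** 2 for lead, missed in rows) / len(rows)


def select_best(candidates, train, test):
    """Fit each candidate on train, score it on test, and return the best name."""
    scores = {name: brier(fit(train), test) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


best, scores = select_best(
    {"baseline": fit_baseline, "lead_bucket": fit_lead_bucket}, TRAIN, TEST
)
print(best, scores)
```

Scoring on rows that were not used for fitting is what keeps the comparison honest; any regression model that returns a no-show probability could be dropped into the same dictionary of candidates.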