ANLY482 AY2016-17 T2 Group23 Silver Daisies Project Overview

From Analytics Practicum
Revision as of 18:55, 8 April 2018 by Carol.chong.2014 (talk | contribs) (→‎Project Objectives)

Business Problem

Each Person Needing Care (PNC) requires a different level of care, different treatments, and different appointment clinics. Current scheduling depends heavily on intuition: once a PNC calls to say that the appointment is over, any available driver is sent to pick the PNC up from the hospital. This may result in long wait times for PNCs. With an accurate estimate of appointment duration, there is an opportunity to 'batch' PNCs, so that PNCs who end their appointments at around the same time and location can be picked up by the same driver.

MET Process

[Figure: MET Process.png]

Problem Analysis

Our data exploration showed that, 6% of the time, a driver could have waited 15 minutes or less to pick up another client from the same clinic before leaving. Waiting and ferrying an additional PNC would reduce operational cost and free up capacity. Accurate prediction of appointment duration would therefore help our sponsor schedule such “batching”, improving efficiency and reducing the number of trips.
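The batching idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not the sponsor's scheduling procedure: the tuple layout `(pnc_id, clinic, end_minute)`, the function name `batch_pickups`, and the rule that a batch stays open for 15 minutes after its first PNC finishes are all hypothetical.

```python
from collections import defaultdict

def batch_pickups(appointments, window_minutes=15):
    """Group PNCs at the same clinic whose predicted end times fall within
    `window_minutes` of the first PNC in the batch.

    `appointments` is a list of (pnc_id, clinic, end_minute) tuples, where
    end_minute is minutes after midnight (hypothetical input format).
    """
    by_clinic = defaultdict(list)
    for pnc_id, clinic, end_minute in appointments:
        by_clinic[clinic].append((end_minute, pnc_id))

    batches = []
    for clinic, items in by_clinic.items():
        items.sort()  # earliest predicted end time first
        current, batch_start = [], None
        for end_minute, pnc_id in items:
            # Close the open batch if this PNC finishes too late to join it.
            if current and end_minute - batch_start > window_minutes:
                batches.append((clinic, current))
                current = []
            if not current:
                batch_start = end_minute
            current.append(pnc_id)
        if current:
            batches.append((clinic, current))
    return batches

# Made-up example: A and B finish 10 minutes apart at the same clinic.
example = [
    ("A", "Clinic X", 600),
    ("B", "Clinic X", 610),
    ("C", "Clinic X", 640),
    ("D", "Clinic Y", 605),
]
batches = batch_pickups(example)
```

In this made-up example, A and B share a trip, while C and D each need their own. The quality of such batches depends directly on how well the end times are predicted.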

Project Objectives

This motivates the team to analyze the factors affecting appointment duration. The team will research the topic "Using Regression Analysis To Build Predictive Model For Appointment Duration".


The Multiple Linear Regression Analysis will potentially contribute to:

  • Identification of significant variables affecting Appointment Duration
  • Prediction of Appointment Duration
  • Enabling “batching” when picking up clients

The desired impact is:

  • Reduction of operational cost due to increased efficiency
  • Improved satisfaction of Persons Needing Care due to possibly shorter waiting times
  • Freeing up capacity to cater for greater demand

Methodology

Variables Identification

The outcome measure is the duration of the PNC's appointment. There are 7 input variables, which can be classified into:

  • Demographics related variables (Gender, Walkability)
  • Appointment related variables (Day of the week, Time of the day, Escort accompaniment, Appointment clinic, Appointment purpose)

Data Preparation

Data preparation is an important aspect of data analysis, as it allows us to interpret the data accurately and easily in later stages. We received a set of data from our client covering a period of 3 months. The PNC data contains each PNC's address, first appointment time, appointment clinic, appointment purpose, presence of a medical escort, pick-up vehicle (from home to clinic), and drop-off vehicle.

To analyze the median duration for each appointment purpose, we first kept only purposes with at least 15 cases to ensure sufficient data points. Records with missing values or negative appointment durations were also removed. Exceptionally long appointment durations that occurred during lunch hours were identified as outliers and excluded.
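The filtering steps above could look roughly like the following Python sketch. The record layout (dicts with `purpose` and `duration` keys), the function names, and the order of the steps (invalid rows dropped before purposes are counted) are assumptions made for illustration. The lunch-hour outlier rule is left out because no threshold is stated for it.

```python
from collections import Counter
from statistics import median

def prepare(records, min_cases=15):
    """Drop invalid rows, then keep purposes with at least `min_cases` rows.

    `records` is a list of dicts with hypothetical keys 'purpose' and
    'duration' (minutes; None when missing).
    """
    # Remove missing and negative appointment durations.
    valid = [r for r in records
             if r["duration"] is not None and r["duration"] >= 0]
    # Keep only purposes that still have enough data points.
    counts = Counter(r["purpose"] for r in valid)
    return [r for r in valid if counts[r["purpose"]] >= min_cases]

def median_by_purpose(records):
    """Median appointment duration for each appointment purpose."""
    groups = {}
    for r in records:
        groups.setdefault(r["purpose"], []).append(r["duration"])
    return {purpose: median(durations) for purpose, durations in groups.items()}

# Made-up data: 15 valid "Review" rows, 2 invalid rows, 3 "Scan" rows.
records = [{"purpose": "Review", "duration": d} for d in range(20, 35)]
records += [{"purpose": "Review", "duration": -5},
            {"purpose": "Review", "duration": None}]
records += [{"purpose": "Scan", "duration": 60}] * 3

kept = prepare(records)
medians = median_by_purpose(kept)
```

Here "Scan" is dropped because it has only 3 cases, and the two invalid "Review" rows do not count toward the 15-case minimum.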

Analysis Tools

We performed the following data analysis procedures:

  • Univariate analysis: We performed univariate analysis on our response variable, appointment duration, to find its current distribution and summary statistics.
  • Bivariate analysis: The analysis of X and Y variables to determine the empirical relationship between them. We performed bivariate analysis on each of our independent variables to find its individual effect on the dependent variable, appointment duration. Bar charts, line charts, and box plots were used for graphical representation, and significance tests were performed to validate the findings.
  • Stepwise regression: Finding the subset of independent variables involves two opposing objectives: we want the regression model to be complete and realistic, but we also want to include as few variables as possible, because irrelevant regressors decrease the precision of the estimated coefficients and predicted values. We therefore performed stepwise regression over all candidate variables, adding and removing variables from the model.
  • Recursive partitioning: We then performed recursive partitioning to create a decision tree that groups similar response values together. This lowers the number of potential predictor variables and generates more intuitive models.
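To make the recursive partitioning step more concrete, here is a minimal Python sketch of how a regression tree might choose a single split. It uses a generic criterion (minimising the sum of squared errors within the two groups), which is not necessarily the criterion used by the team's software. The function names, the `duration` key, and the sample rows are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, predictors, target="duration"):
    """Return (predictor, level, sse_after) for the binary split
    'predictor == level' versus the rest that gives the lowest total SSE."""
    best = None
    for p in predictors:
        for level in sorted({r[p] for r in rows}):
            left = [r[target] for r in rows if r[p] == level]
            right = [r[target] for r in rows if r[p] != level]
            if not left or not right:
                continue  # not a real split
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (p, level, total)
    return best

# Made-up rows: escorted appointments run longer, the day barely matters.
rows = [
    {"escort": "yes", "day": "Mon", "duration": 90},
    {"escort": "yes", "day": "Tue", "duration": 100},
    {"escort": "no", "day": "Mon", "duration": 40},
    {"escort": "no", "day": "Tue", "duration": 50},
]
split = best_split(rows, ["escort", "day"])
```

A full tree would apply `best_split` again to each resulting group until a stopping rule is met. Predictors that are never chosen for a split are candidates to drop, which is how partitioning helps narrow the variable list.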

Statistical Software

  • SAS
  • JMP
  • Tableau