ANLY482 AY2017-18T2 Group10 Analysis & Findings: Analysis

From Analytics Practicum

Overview

Many time-series forecasting methods are available. Our research paper compares two of the more commonly used techniques, (1) exponential smoothing and (2) autoregressive integrated moving average (ARIMA), to determine which model is more appropriate for our data.

Box-Jenkins Autoregressive Integrated Moving Average (ARIMA)

The Box-Jenkins ARIMA model combines Autoregressive (AR), Integrated (I) and Moving Average (MA) components. The AR and MA components assume that the time series is stationary; if it is not, the series must be differenced until it becomes stationary, which is what the I component captures. Fitting a Box-Jenkins model effectively requires at least a moderately long series; a minimum of 50 to 100 observations has been recommended.
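As a minimal sketch of what differencing does, the snippet below applies first-order differencing one or more times using only the Python standard library. The function name `difference` and the sample series are illustrative assumptions, not part of the project's code or data.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'_t = y_t - y_{t-1}."""
    result = list(series)
    for _ in range(d):
        # Each pass shortens the series by one observation.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Hypothetical series with a quadratic trend (not the project's data).
quadratic = [1, 4, 9, 16, 25]
once = difference(quadratic, d=1)   # still trending upward
twice = difference(quadratic, d=2)  # constant, so no trend remains
```

Each round of differencing removes one order of polynomial trend, which is why the number of rounds needed corresponds to d in ARIMA(p,d,q).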

The standard notation for ARIMA is denoted by: ARIMA(p,d,q)

  • p: the number of autoregressive terms (AR)
  • d: the number of times the series is differenced before it becomes stationary (I)
  • q: the number of moving average terms (MA)
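For reference, the three orders can be placed in the usual textbook form of the model, written with the backshift operator B (where B y_t = y_{t-1}). The symbols φ, θ, c and ε are standard notation assumed here; they do not appear in the original text.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d} \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

Here the first bracket is the AR part of order p, (1 - B)^d applies differencing d times, and the last bracket is the MA part of order q acting on the white-noise errors ε_t.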

We used the Box-Jenkins ARIMA approach to build our model, which consists of the following iterative steps:

  1. Identification. Using the customer data, we produced autocorrelation and partial autocorrelation plots and ran the augmented Dickey-Fuller (ADF) stationarity test. We then used the results to estimate appropriate values for p, d and q.
  2. Estimation and testing. The model parameters are estimated by numerically approximating the solutions of nonlinear equations, using techniques such as nonlinear least squares and maximum likelihood estimation.
  3. Diagnostic Checking. The fitted model is checked for inadequacies by considering the autocorrelations of the residual series (the series of residuals, or error values).
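The autocorrelations used in the identification and diagnostic-checking steps can be sketched as follows. This is a plain-Python illustration under stated assumptions: the function name `sample_acf`, the sample residuals and the rough ±1.96/√n white-noise bound are ours, and a real analysis would normally rely on a statistics package instead.

```python
def sample_acf(series, max_lag):
    """Return sample autocorrelations r_0..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denom = sum(dev * dev for dev in deviations)  # assumes a non-constant series
    acf = []
    for lag in range(max_lag + 1):
        num = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        acf.append(num / denom)
    return acf


# Hypothetical residuals that alternate in sign (not the project's data).
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf = sample_acf(residuals, max_lag=2)

# Rough 95% bound for white noise: +/- 1.96 / sqrt(n).
bound = 1.96 / len(residuals) ** 0.5
significant_lags = [lag for lag, r in enumerate(acf) if lag > 0 and abs(r) > bound]
```

If the residuals of a fitted model still show autocorrelations outside the bound, as in this artificial example, the model has left structure unexplained and the identification step should be revisited.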

In Model Identification, we determined that the data is stationary and non-seasonal. This can be done using the Augmented Dickey-Fuller (ADF) test and

Exponential Smoothing

Exponential smoothing aims to isolate trends or seasonality from irregular variation and has been found to be most effective when the components describing the time series vary slowly over time [3]. Each new estimate combines the estimate for the current period with a portion of the current period’s random error. Past observations are weighted unequally: the weight given to an observation declines exponentially as it recedes into the past.
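The update described above can be sketched in its error-correction form. This is a minimal illustration, assuming the first observation is used as the initial level; the function name, the smoothing parameter value and the sample series are hypothetical and are not taken from the project.

```python
def simple_exponential_smoothing(series, alpha, initial_level=None):
    """Return the one-step-ahead forecasts and the final smoothed level."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Assumption: start from the first observation unless a level is given.
    level = series[0] if initial_level is None else initial_level
    forecasts = []
    for observed in series:
        forecasts.append(level)        # forecast made before seeing `observed`
        error = observed - level       # one-step forecast error
        level = level + alpha * error  # old estimate plus a portion of the error
    return forecasts, level


# Hypothetical series (not the project's data).
demand = [10.0, 12.0, 11.0, 13.0]
forecasts, next_forecast = simple_exponential_smoothing(demand, alpha=0.5)
```

Expanding the recursion shows that an observation k periods old receives weight alpha * (1 - alpha)^k, which is the exponential decline in weights mentioned in the text.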

In the paper “A state space framework for automatic forecasting using exponential smoothing methods”, the authors adopt a well-established taxonomy as a framework to choose between various exponential smoothing methods [4]. This framework identifies the presence or absence of a trend component and seasonality component within the data being analysed.

To decide on the best model, we implemented Rob J. Hyndman’s state space framework [4], which was also covered in the literature review. It uses the general notation ETS(Error, Trend, Seasonal), where:

  • Error: The type of error function
  • Trend: Function of trend
  • Seasonal: Function of seasonality

Depending on the component, the possible values are Not present, Additive, Additive Damped, Multiplicative or Multiplicative Damped (the damped variants apply only to the trend component). For example, ETS(A,N,N) denotes Simple Exponential Smoothing: additive errors, no trend and no seasonality.
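To make the notation concrete, the state space equations for ETS(A,N,N) and, for comparison, its multiplicative-error counterpart ETS(M,N,N) can be written as below. The symbols (level ℓ, smoothing parameter α, error ε) follow the usual textbook presentation and are assumed here rather than taken from the original text.

```latex
\begin{aligned}
\text{ETS(A,N,N):}\quad & y_t = \ell_{t-1} + \varepsilon_t,
  & \ell_t &= \ell_{t-1} + \alpha\,\varepsilon_t \\
\text{ETS(M,N,N):}\quad & y_t = \ell_{t-1}\,(1 + \varepsilon_t),
  & \ell_t &= \ell_{t-1}\,(1 + \alpha\,\varepsilon_t)
\end{aligned}
```

Both models produce the same point forecasts for a given α and initial level; they differ in how the error enters, and therefore in their likelihoods and prediction intervals.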

Hyndman’s framework fits each of the 24 possible exponential smoothing models in the state space framework to our data set and selects the best one using the AIC, BIC and AICc. For our data set, the optimal model was ETS(M,N,N), which represents multiplicative errors with no trend and no seasonality. We also included ETS(A,N,N), which is Simple Exponential Smoothing (SES), as a candidate model for the final comparison with ARIMA.
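The selection criteria can be sketched from their standard definitions. The log-likelihoods, parameter counts and candidate names below are hypothetical placeholders for illustration only; they are not the project's results, and how parameters are counted for an ETS model is an assumption left to the caller.

```python
import math


def information_criteria(log_likelihood, num_params, num_obs):
    """Compute AIC, AICc and BIC from a fitted model's log-likelihood."""
    aic = 2 * num_params - 2 * log_likelihood
    # Small-sample correction; requires num_obs > num_params + 1.
    aicc = aic + (2 * num_params * (num_params + 1)) / (num_obs - num_params - 1)
    bic = num_params * math.log(num_obs) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical fits for illustration only (not the project's results).
candidates = {
    "candidate_1": information_criteria(log_likelihood=-120.0, num_params=2, num_obs=60),
    "candidate_2": information_criteria(log_likelihood=-118.5, num_params=2, num_obs=60),
}

# Lower is better for all three criteria; here we rank by AICc.
best = min(candidates, key=lambda name: candidates[name]["AICc"])
```

All three criteria trade goodness of fit against model complexity, so a model with more parameters is preferred only when its likelihood improves enough to offset the penalty.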


Comparison between ARIMA and Exponential Smoothing Forecasting Models