The Efficiency of Artificial Neural Networks for Forecasting in the Presence of Autocorrelated Disturbances

We compare three forecasting methods: Artificial Neural Networks (ANNs), Autoregressive Integrated Moving Average (ARIMA) models, and regression models. Computer simulations show that, in the presence of autocorrelated errors, ANNs perform favorably compared to ARIMA and regression for nonlinear models. The accuracy of the ANN model is evaluated by comparing its forecasts with real unemployment data for Palestine; the two were found to be in excellent agreement.


Introduction
A good forecasting model is a key component of proper planning. Many different approaches exist for developing a forecasting model, each designed to address special situations that arise in time series. In this paper, we compare two traditional methods, linear regression and Autoregressive Integrated Moving Average (ARIMA), to Artificial Neural Networks (ANNs). Given the complex contexts in which time series arise, there is a need for a robust forecasting model that is flexible enough to be of use in a variety of situations. Previous research indicates that ANNs may provide such an approach; see, for example, Potočnik, et al. (2015), Adhikari & Agrawal (2014), Thielbar & Dickey (2011), Khashei & Bijari (2010), Aksoy & Dahamsheh (2009), and Yasdi (1999), among many others.
Neural networks are the preferred tool for many predictive data mining applications because of their flexibility, power, accuracy and ease of use. Traditional statistical methods assume that data are linearly related, which is often not true in real-life applications. The ANN, which is inherently nonlinear and makes no such assumption, is therefore well suited to prediction (Safi, 2013).
We use a data set of unemployment rates from the Palestinian Central Bureau of Statistics (PCBS). The data set contains the quarterly unemployment rates in Palestine from the first quarter of 2000 through the second quarter of 2015. The R statistical software is used to fit ANN, ARIMA, and regression models to the unemployment rate time series.
In this paper, ANN, ARIMA and regression models are used to forecast unemployment rates in Palestine. The main purpose is to find a more accurate and reliable forecasting model for these rates. This paper is organized as follows: Section 2 reviews the ANN literature; Section 3 presents the computer simulation results; Section 4 fits ARIMA, ANN, and regression models to the unemployment data for Palestine; and Section 5 summarizes the main results and suggests future research.

Review of ANN Literature
Artificial neural networks (ANNs) have received a great deal of attention in recent years. They are used in prediction and classification, areas where regression and related statistical techniques have traditionally been used (Cheng & Titterington, 1994). Box, et al. (1995) developed the autoregressive integrated moving average (ARIMA) methodology for fitting a class of linear time series models. However, such statistical methods assume that data are linearly related, which is typically not true in real-life applications. The ANN, which is inherently nonlinear and makes no such assumption, has become popular as an alternative. A good forecasting model is crucial for policy makers, and good policy requires first identifying whether the relationship in the data is linear or nonlinear. Artificial neural networks have been used successfully in a variety of areas.
Research evidence shows that the ANN methodology serves well for any system with nonlinear instability patterns, such as the housing market (Bahramianfar, 2015). Valipour, et al. (2013) compared root mean square error (RMSE) and mean bias error (MBE) and chose a dynamic artificial neural network model as the best model for forecasting inflow to the Dez dam reservoir. KÖLMEK & Navruz (2015) conducted simulation studies of price modeling with artificial neural networks and of suitable network configurations. They showed that the neural network model gave better results than a time series model. Potočnik, et al. (2015) showed that neural network models exhibited the best overall forecasting performance, and suggested that neural network (NN) structures, or neural network models with a direct linear link (NNLL), should be considered as solutions for applied forecasting in district heating markets.

The Simulation Setup
In this section, we present a computer simulation comparing the robustness of three forecasting techniques: ANNs, ARIMA, and regression. These simulations examine the sensitivity of the forecasting approaches to model misspecification. The efficiency of the approaches is compared using the root mean squared forecast error (RMSFE) of the ANN model relative to the ARIMA and regression models. Each generated time series combines a linking function $f$ with an error term. The errors follow a first-order autoregressive AR(1) model with coefficient $0 < \phi < 1$ and independent standard normal innovations. Eighteen cases are considered in the results shown below: three finite series lengths ($T = 20$, 50, and 100), two choices of the linking function $f$ (S-curve and linear), and three autoregressive coefficients ($\phi = 0.1$, 0.5, and 0.9). For each case, 1000 series were generated.
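The error process and the grid of cases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper's simulations were run in R, the function name `simulate_ar1_errors`, the burn-in length, and the seed are hypothetical choices, and the linking function itself is not reproduced here.

```python
import itertools
import random


def simulate_ar1_errors(length, phi, rng, burn_in=200):
    """Return `length` AR(1) errors e_t = phi * e_{t-1} + a_t with a_t ~ N(0, 1).

    The burn-in period is an assumption of this sketch; it lets the process
    move away from its arbitrary starting value of zero.
    """
    error = 0.0
    errors = []
    for step in range(burn_in + length):
        error = phi * error + rng.gauss(0.0, 1.0)
        if step >= burn_in:
            errors.append(error)
    return errors


# The 18 cases: 3 series lengths x 2 linking functions x 3 AR coefficients.
SERIES_LENGTHS = (20, 50, 100)
LINKING_FUNCTIONS = ("S-curve", "linear")
AR_COEFFICIENTS = (0.1, 0.5, 0.9)
CASES = list(itertools.product(SERIES_LENGTHS, LINKING_FUNCTIONS, AR_COEFFICIENTS))

rng = random.Random(2016)  # arbitrary seed, used only for reproducibility
example_errors = simulate_ar1_errors(length=20, phi=0.5, rng=rng)
print(len(CASES), len(example_errors))  # prints: 18 20
```

In a full replication, each of the 18 cases would be repeated for 1000 generated series, with the errors combined with the chosen linking function before fitting the three models.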
In this simulation we used AR(1) errors because the data generating processes of most economic time series can be explained by an autoregressive process of order one or two. The most commonly assumed process in both theoretical and empirical studies is the first-order autoregressive process, or AR(1) for short. At one time, the AR(1) process was the only autocorrelation process considered by economists. Most economic data series were annual, for which the AR(1) process is reasonable. This may explain the wide use of this process, as estimating more complicated processes is not manageable without the aid of a computer (Safi & White, 2006).
We introduce definitions of the simulation RMSFE, the relative efficiency, and the two selected models of the dependent variable.
Definition 1: The simulation RMSFE is a measure of the size of the forecast error, that is, the magnitude of a typical mistake made using a forecasting model. The RMSFE is given by
$$\mathrm{RMSFE} = \sqrt{E\left[\left(Y_{T+1} - \hat{Y}_{T+1|T}\right)^2\right]},$$
where $\hat{Y}_{T+1|T}$ is the forecast of $Y_{T+1}$ based on information through period $T$, using a model estimated with data through period $T$ (Stock & Watson, 2015).
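Definition 1 can be sketched numerically. The version below assumes that the expectation is approximated by averaging squared forecast errors over a set of paired actual and forecast values; the function name `rmsfe` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def rmsfe(actual, forecast):
    """Root mean squared forecast error over paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values, for illustration only.
actual_values = [10.0, 12.0, 11.0, 13.0]
forecast_values = [11.0, 11.0, 12.0, 12.0]
print(rmsfe(actual_values, forecast_values))  # every error is +/-1, so this prints 1.0
```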
Definition 2: The efficiency of the ANN forecasts relative to that of ARIMA, in terms of the simulation RMSFE, is given by the ratio
$$\frac{\mathrm{RMSFE}_{\mathrm{ANN}}}{\mathrm{RMSFE}_{\mathrm{ARIMA}}}.$$
A ratio less than one indicates that the ANN forecast is more efficient than ARIMA, and if the ratio is close to one, then the ANN forecast is nearly as efficient as the ARIMA forecast. Otherwise, the ANN performs poorly (Safi, 2016).
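The ratio in Definition 2, and its reading as a percentage, can be sketched as follows. The helper names `relative_efficiency` and `percent_difference` and the input values are hypothetical; they only illustrate the arithmetic.

```python
def relative_efficiency(rmsfe_ann, rmsfe_benchmark):
    """Ratio of the ANN RMSFE to the RMSFE of a benchmark model."""
    if rmsfe_benchmark <= 0:
        raise ValueError("benchmark RMSFE must be positive")
    return rmsfe_ann / rmsfe_benchmark


def percent_difference(ratio):
    """How much larger (positive) or smaller (negative) the ANN RMSFE is, in percent."""
    return (ratio - 1.0) * 100.0


# Hypothetical RMSFE values, for illustration only.
ratio = relative_efficiency(1.5, 2.0)
print(ratio)                      # 0.75: the ANN RMSFE is 75% of the benchmark's
print(percent_difference(ratio))  # -25.0: that is, 25% smaller
```

Read the other way, a ratio above one, such as 1.2373, means the ANN RMSFE is about 23.73% larger than the benchmark's.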

Definition 3: The two models of the dependent variable used in the simulation are the S-curve model and the linear model. The model coefficients $b_0$ and $b_1$ were each set equal to one.

Simulation Results
We discuss the simulation results based on the ratio of the estimated RMSFE of ANNs to that of ARIMA and regression. Table 1 shows the complete simulation results for the ratios of RMSFE of ANNs to that of ARIMA and regression for the two models, all selected sample sizes, and all autocorrelation coefficients.

A) For the S-curve model
- ANNs are more efficient than ARIMA and regression for $\phi = 0.1$. For example,
  o For $n = 20$, the relative efficiencies of ANNs to ARIMA and regression equal 0.8822 and 0.6074, respectively. That is, the RMSFE for ANNs equals 88.22% and 60.74% of that of the ARIMA and regression models, respectively. Therefore, ANNs are superior to ARIMA and regression for a small sample size and $\phi = 0.1$.
  o For $n = 50$, the relative efficiencies equal 0.9406 and 0.8217, respectively. That is, the RMSFE for ANNs equals 94.06% and 82.17% of that of the ARIMA and regression models, respectively. Hence, ANNs are more efficient than ARIMA and regression for a moderate sample size and $\phi = 0.1$.
  o For $n = 100$, the relative efficiencies equal 0.9984 and 0.9328, respectively. That is, the RMSFE for ANNs equals 99.84% and 93.28% of that of the ARIMA and regression models, respectively. Hence, ANNs perform nearly as efficiently as the ARIMA and regression models for a large sample size and $\phi = 0.1$.
- ANNs perform nearly as efficiently as ARIMA, and more efficiently than regression, for the S-curve model for all selected sample sizes and autocorrelation coefficients $\phi = 0.5$ and $\phi = 0.9$. For example,
  o For $n = 20$ and $\phi = 0.5$, the relative efficiencies equal 0.9633 and 0.6890, respectively. That is, the RMSFE for ANNs equals 96.33% and 68.90% of that of the ARIMA and regression models, respectively. Therefore, ANNs perform nearly as efficiently as ARIMA and are superior to regression for a small sample size and $\phi = 0.5$.
  o For $n = 50$ and $\phi = 0.9$, the relative efficiencies equal 0.9952 and 0.7082, respectively. That is, the RMSFE for ANNs equals 99.52% and 70.82% of that of the ARIMA and regression models, respectively. Hence, ANNs perform nearly as efficiently as ARIMA and more efficiently than regression for a moderate sample size and $\phi = 0.9$.
  o For $n = 100$ and $\phi = 0.9$, the relative efficiencies equal 0.9760 and 0.8392, respectively. That is, the RMSFE for ANNs equals 97.60% and 83.92% of that of the ARIMA and regression models, respectively. Hence, ANNs perform nearly as efficiently as ARIMA and more efficiently than regression for a large sample size and $\phi = 0.9$.
- There is one situation in which ANNs perform poorly compared to ARIMA. When $n = 20$ and $\phi = 0.9$, the relative efficiency of ANNs to ARIMA equals 1.2373, which means that the RMSFE for ANNs is 23.73% larger than that for ARIMA.

B) For the linear model
ANNs perform poorly compared to ARIMA, but are superior to regression, for the linear model for all selected sample sizes and autocorrelation coefficients. For example,
- For $n = 20$ and $\phi = 0.1$, the relative efficiencies of ANNs to ARIMA and regression equal 2.2491 and 0.1440, respectively. That is, the RMSFE for ANNs equals 224.91% and 14.40% of that of the ARIMA and regression models, respectively.
- For $n = 50$ and $\phi = 0.5$, the relative efficiencies equal 3.2729 and 0.0907, respectively. That is, the RMSFE for ANNs equals 327.29% and 9.07% of that of the ARIMA and regression models, respectively.
- For $n = 100$ and $\phi = 0.9$, the relative efficiencies equal 14.7658 and 0.3530, respectively. That is, the RMSFE for ANNs equals 1476.58% and 35.30% of that of the ARIMA and regression models, respectively.

Summary
For the nonlinear model, ANNs were more efficient than ARIMA for all selected sample sizes when $\phi = 0.1$, and performed nearly as efficiently as ARIMA when $\phi = 0.5$ and $\phi = 0.9$, with one exception: ANNs performed poorly compared to ARIMA when $n = 20$ and $\phi = 0.9$. In addition, ANNs were more efficient than regression for all selected sample sizes and autocorrelation coefficients. For the linear model, ANNs performed poorly compared to ARIMA, but were superior to regression for all selected sample sizes and autocorrelation coefficients.

Fitting Models for Unemployment Data
This section presents models fitted to the unemployment rate data using three different approaches: ANN, ARIMA(p,d,q), and regression. Figure 1 displays the time series plot of quarterly unemployment in Palestine from the first quarter of 2000 through the second quarter of 2015. The series displays considerable fluctuations over time, especially in 2000 and 2003, and a stationary model does not seem reasonable. The higher values display considerably more variation than the lower values. The forecasting results are presented in the following subsections.

Fitting ANN Model for Unemployment Data
The ANN is fitted as the average of 20 networks, each of which is a 1-1-1 network. R software is used to fit the ANN model to the time series; the nnetar function is used to fit the neural networks (Venables & Ripley, 2002). The estimated noise variance is $\hat{\sigma}_e^2 = 5.587$. RMSFE is used as the stopping criterion for the network; smaller values of RMSFE indicate higher forecasting accuracy. The neural network results show that the minimum RMSFE equals 2.3636.

Fitting ARIMA Model for Unemployment Data
We use maximum likelihood estimation; Table 2 shows the results obtained from the auto.arima command in the R statistical software. Here we see that $\hat{\phi} = 0.6980$. We also see that the estimated noise variance is $\hat{\sigma}_e^2 = 10.91$. Judging by the p-value, the estimate of the autoregressive coefficient is statistically significantly different from zero, as is the intercept term.
The AR(1) results show that the RMSFE equals 3.3034.
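As a rough cross-check of the reported figures, the relative efficiency ratio from the simulation section can also be applied to the two RMSFE values reported for the unemployment data. The sketch below uses only numbers stated in the text; the variable names are hypothetical, and the comparison with the square roots of the noise variances is an observation about the reported values, not a claim about how the RMSFE was computed.

```python
import math

# Values reported in the text for the unemployment data.
RMSFE_ANN = 2.3636
RMSFE_ARIMA = 3.3034
NOISE_VARIANCE_ANN = 5.587
NOISE_VARIANCE_ARIMA = 10.91

# Ratio of the ANN RMSFE to the ARIMA RMSFE on the real data.
ratio = RMSFE_ANN / RMSFE_ARIMA
print(round(ratio, 4))  # 0.7155

# Square roots of the estimated noise variances, for comparison with the RMSFEs.
print(math.sqrt(NOISE_VARIANCE_ANN), math.sqrt(NOISE_VARIANCE_ARIMA))
```

On these figures the ANN RMSFE is roughly 72% of the ARIMA RMSFE, and each reported RMSFE is close to the square root of the corresponding estimated noise variance.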

Fitting Regression Model for Unemployment Data
The estimated linear regression model in (3) is obtained using the ordinary least squares (OLS) estimation method. Table 3 and Figure 2 show the forecasting results for unemployment for 2015:Q3 through 2017:Q4 based on the ANN, ARIMA(1,0,0), and regression models.

Conclusion and Future Research
In this paper, we show that artificial neural networks provide a good alternative to traditional approaches for fitting time series. In the simulation results for time series with autocorrelated errors, the ANN approach was much more efficient than regression in forecasting future observations of the time series. Compared to ARIMA, ANNs performed well when the linking function was nonlinear, outperforming ARIMA in 7 of the 9 cases and performing nearly the same in 1 of the 2 remaining cases. However, when the linking function was linear, ARIMA outperformed ANN. This is not surprising, since ARIMA is designed to fit this particular situation. It is interesting to note, however, that the difference in performance was smaller when $\phi$ is close to 1.
All three approaches were used to forecast unemployment in Palestine. The regression model predicts a nearly constant unemployment rate for the next 10 quarters, and the fit of the regression line was poor. Comparing the ANN and ARIMA models, we see considerable differences. The ANN forecasts exhibit more short-term variation than the ARIMA forecasts, and the RMSFE indicates that the ANN provides a better fit to the data. Taken together, these results show that ANN is the preferable approach for this data set.
The models studied here were restricted to univariate time series. In the context of economic forecasting, there are typically many correlated time series available for analysis. One avenue for future investigation is to use ANNs in the context of multivariate time series to forecast multiple time series simultaneously. Another is to investigate the effectiveness of ANNs in modeling data where seasonality is known to exist.

Figure 2. Forecasting results of ANN, ARIMA and Regression Models for Unemployment Rates

Table 1. Ratios of RMSFE for ANN to ARIMA and Regression

Table 2. Maximum Likelihood Estimates from R Software: Unemployment Rates

Table 3. Forecasting results of ANN, ARIMA and Regression Models for Unemployment