A Forecasting Model for Thailand's Unemployment Rate

This study examines two approaches, the Box-Jenkins method and an artificial neural network, for forecasting the unemployment rate in Thailand. The Box-Jenkins approach performs better in estimating the unemployment rate in Thailand, with a lower MAPE than the neural network model. The forecast values are consistent with the actual values and tend to decrease.


Introduce the Problem
The unemployment rate (UR) is a key indicator of economic status, and UR forecasting is a basic tool for planning and risk management in tax, finance, education, agricultural and industrial policies. Two approaches are very popular in prediction: the Box-Jenkins technique, which combines moving average (MA) and autoregressive (AR) models, and data mining via an artificial neural network (ANN) model. Both approaches are flexible for complicated non-linear data, and their advantages include computational speed, low cost, and ease of design for operators with little technical experience. Box-Jenkins imposes a very strict assumption on the residuals, which is checked during the diagnostic checking stage before proceeding to forecasting. The ANN approach offers a very good approximation capability, and additional advantages such as fast processing times where the mathematical formulae and prior knowledge on the relationship between inputs and outputs are unknown (Kankal, Akpinar, Kömürcü, & Özşahin, 2011; Sözen, Arcaklioglu, & Ozkaymak, 2005; Sözen & Arcaklioglu, 2007).
The National Statistical Office (NSO) collects the national UR data, but updated reports take time to be released. This is the motivation for our study. The objective of this study is to evaluate models for forecasting the UR in Thailand based on economic variables defined by the NSO, comparing the ANN with the Box-Jenkins technique. The results provide important information for assessing UR patterns and selecting a more accurate approach to estimate the future UR. The remainder of this paper is organized as follows. Section 2 presents the forecasting methodology of the ANN and Box-Jenkins approaches. Section 3 presents the modelling of Thailand's UR, and some conclusions are stated in the last section.

Methodology and Data
Two different forecasting approaches, ANNs and SARIMA from Box-Jenkins, are investigated to model the UR in Thailand. Six different models from these two approaches include twelve economic variables defined by the NSO, viz. the total number of workers (x1), the number of seasonal workers (x2), those compulsorily insured (x3), the number employed (x4), the use of electricity (x5), car sales (x6), the industrial production index (x7), the SET index (x8), the private investment index (x9), Thailand's economic indicator (x10), the industrial labor productivity index (x11) and the industrial worker index (x12). The response variable is the UR (y).
The monthly data used were collected by the Labour Force in Thailand project of the NSO, from January 2003 to December 2011 (cf. Figure 1). The UR clearly decreases over this period.

Artificial Neural Network Model Approach
The ANN methodology consists of two processes, viz. training and testing. Training an ANN usually involves modifying the connection weights by means of a learning rule. The total error, based on the squared difference between the predicted and actual output, is computed for the whole training set. The connection weights are adjusted using the standard error back-propagation algorithm, which minimizes the total error by the gradient descent method. More details on the back-propagation algorithm shown in Figure 2 are given in Kankal, Akpinar, Kömürcü and Özşahin (2011). The testing data are then used to check generalization.
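As a sketch of the quantities described above, the total error and one gradient descent step can be written as follows. The factor 1/2, the learning rate symbol η, and the symbols E, w_ij, y_k and ŷ_k are notation assumed here for illustration; they are not taken from the cited references.

```latex
E = \frac{1}{2}\sum_{k=1}^{N}\left(y_k - \hat{y}_k\right)^2,
\qquad
w_{ij} \leftarrow w_{ij} - \eta\,\frac{\partial E}{\partial w_{ij}}
```

Here N is the number of training patterns, y_k the actual output and ŷ_k the network output for pattern k. Back-propagation supplies the partial derivatives layer by layer through the chain rule.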

Box-Jenkins Approach
The purpose of Box-Jenkins is to find an appropriate model based on statistical concepts. There are statistical tests of the validity of the model and statistical measures of forecast uncertainty. The iterative approach, with three steps of model building, is presented in Figure 3.
In the ARMA model, y_t is the observation at time t; the φ's and θ's are the parameters of the model; and a_t is the residual at time t. The residuals have constant mean 0 and variance σ², and are uncorrelated with each other; such a series is called "white noise" (Dobre & Adeiana, 2008).
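For reference, an ARMA(p, q) model is commonly written in the form below. The order symbols p and q, the omission of a constant term, and the minus signs on the moving-average terms (the usual Box-Jenkins convention) are assumptions made here for illustration.

```latex
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p}
      + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \dots - \theta_q a_{t-q}
```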
Stationarity, i.e. constant mean and constant variance, is necessary in a Box-Jenkins model. A non-stationary series must be differenced one or more times to achieve stationarity; the "I" stands for "integrated", and the model thus becomes ARIMA. The Box-Jenkins approach can be extended to include a seasonal term (S) in the model, giving SARIMA. At stage 1, the orders of the seasonal autoregressive and seasonal moving average terms can be included in the model, i.e. it is not necessary to remove seasonality before fitting the model.
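The differencing step can be illustrated with a minimal sketch. The helper name `difference`, the synthetic series, and the choice of a 12-month seasonal lag for monthly data are assumptions for illustration only; the values are not the NSO data.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: z[t] = y[t] - y[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical monthly values: a linear downward trend plus a repeating
# 12-month pattern (three years of synthetic data).
seasonal = [0.3, 0.2, 0.1, 0.0, -0.1, -0.2, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2]
y = [2.0 - 0.01 * t + seasonal[t % 12] for t in range(36)]

# Regular first difference: removes the linear trend, keeps the seasonal swing.
first_diff = difference(y, lag=1)

# Seasonal difference at lag 12: removes the repeating 12-month pattern.
seasonal_diff = difference(y, lag=12)
```

Each differencing pass shortens the series by `lag` observations, which matters when the series is short.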

Accuracy of Model
The accuracy of the forecast is evaluated based on the estimation of the error or residual. The smaller the values of the root mean square error (RMSE) and the mean absolute percent error (MAPE), the better the forecast. The MAPE criterion is the decisive factor, because it is expressed in easily interpreted percentage terms. The following equations are the respective formulas used in computing the RMSE (Mustafa et al., 2012) and MAPE:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{Raw}_i - \mathrm{Predict}_i\right)^2}$$
and
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{\mathrm{Raw}_i - \mathrm{Predict}_i}{\mathrm{Raw}_i}\right|$$
where Raw_i and Predict_i are the actual and predicted values observed at time i, respectively, and n is the total number of predictions. The criterion of MAPE for model evaluation is based on Lewis (1982).
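The two error measures can be computed as in the following minimal sketch. The function names and the sample values are hypothetical. The category thresholds are the ones commonly quoted from Lewis (1982) and are stated here as an assumption, apart from the "inaccurate" class above 50%.

```python
import math


def rmse(raw, predict):
    """Root mean square error between actual and predicted values."""
    n = len(raw)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(raw, predict)) / n)


def mape(raw, predict):
    """Mean absolute percent error, in percent. Actual values must be non-zero."""
    n = len(raw)
    return 100.0 / n * sum(abs((r - p) / r) for r, p in zip(raw, predict))


def lewis_category(mape_percent):
    """Interpretation scale commonly attributed to Lewis (1982); assumed here."""
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent < 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


# Hypothetical actual and predicted values, not taken from the study.
raw_values = [1.0, 2.0, 4.0]
predicted_values = [1.1, 1.9, 4.4]

error_rmse = rmse(raw_values, predicted_values)
error_mape = mape(raw_values, predicted_values)
category = lewis_category(error_mape)
```

Because MAPE divides by the actual value, it is undefined when an actual observation is zero; this does not arise for a strictly positive rate such as the UR.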

Construction, Teaching and Testing of Artificial Neural Network (ANN)
From the historical data, the appropriate ANN for forecasting the UR is identified from the training data. The determination of the number of nodes in the hidden layer is not an "exact science" (cf. Kankal et al., 2011).
The network is therefore tested with different numbers of hidden layer nodes, in order to find the optimum and good convergence for the ANN structure. A problem in training an ANN is memorization; training is stopped when the network starts to memorize. To prevent this, the error values of the training set may be greater than those of the testing set in the models. The accuracy in training is monitored by the RMSE of the training and testing patterns separately. In this study, the initial weights are set to random values between -0.5 and 0.5, the learning rate equals 0.025, and the momentum is 0.8. After the learning set of data was presented to the ANN models, we stopped the learning process when the number of epochs reached 50,000. The best ANN result in Table 1 for forecasting the UR is ANN-1, with a 12-11-1 structure, although its forecasts are classed as inaccurate (MAPE > 50%).
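The training set-up described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the 12-11-1 label is read as 12 inputs, 11 hidden nodes and 1 output, and the sigmoid hidden units, linear output unit, per-pattern updates, synthetic data, and the class name `TinyMLP` are all assumptions. Only the weight range, learning rate and momentum come from the text, and far fewer than 50,000 epochs are run to keep the example fast.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """Feed-forward network with one hidden layer, trained by back-propagation with momentum."""

    def __init__(self, n_in=12, n_hidden=11, lr=0.025, momentum=0.8, seed=0):
        rng = random.Random(seed)
        # Weights start as random values in [-0.5, 0.5]; the last entry of each row is a bias.
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.v_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v_out = [0.0] * (n_hidden + 1)
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w_hid] + [1.0]
        y = sum(w * v for w, v in zip(self.w_out, hb))  # linear output unit (assumed)
        return xb, hb, y

    def predict(self, x):
        return self.forward(x)[2]

    def train_pattern(self, x, target):
        xb, hb, y = self.forward(x)
        delta_out = y - target  # derivative of 0.5 * (y - target)^2 with respect to y
        # Hidden deltas use the output weights as they were before this update.
        delta_hid = [delta_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                     for j in range(len(self.w_hid))]
        for j in range(len(self.w_out)):
            self.v_out[j] = self.momentum * self.v_out[j] - self.lr * delta_out * hb[j]
            self.w_out[j] += self.v_out[j]
        for j, row in enumerate(self.w_hid):
            for i in range(len(row)):
                self.v_hid[j][i] = self.momentum * self.v_hid[j][i] - self.lr * delta_hid[j] * xb[i]
                row[i] += self.v_hid[j][i]


def network_rmse(net, data):
    return math.sqrt(sum((net.predict(x) - t) ** 2 for x, t in data) / len(data))


# Synthetic patterns: 12 scaled inputs and a hypothetical target (their mean).
data_rng = random.Random(1)
patterns = []
for _ in range(60):
    inputs = [data_rng.random() for _ in range(12)]
    patterns.append((inputs, sum(inputs) / 12.0))
train_set, test_set = patterns[:42], patterns[42:]

net = TinyMLP()
rmse_before = network_rmse(net, train_set)
for epoch in range(100):  # the study stopped at 50,000 epochs
    for inputs, target in train_set:
        net.train_pattern(inputs, target)
rmse_train = network_rmse(net, train_set)
rmse_test = network_rmse(net, test_set)
```

Tracking `rmse_train` and `rmse_test` separately mirrors the monitoring described above: a training error that keeps falling while the testing error rises is the usual sign of memorization.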

Model Building by Box-Jenkins
In the Box-Jenkins approach, the data are split into two parts (70% and 30%). The first step is to determine stationarity and seasonality, using the Augmented Dickey-Fuller (ADF) test and the autocorrelation function (ACF), respectively. The ADF unit root test shows that the original series has a unit root and is therefore non-stationary (cf. Table 2). After first-order differencing, the series becomes stationary (p-value < 0.00001).
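A 70/30 split of a monthly series can be sketched as follows. The text does not state how the two parts are formed, so the chronological split (earliest 70% for model fitting, latest 30% held out) and the helper name `chronological_split` are assumptions; the month indices stand in for the actual UR values.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered series into an earlier part and a later part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# January 2003 to December 2011 gives 9 years of monthly observations.
n_months = 9 * 12
month_index = list(range(n_months))  # placeholder for the UR series

fit_part, holdout_part = chronological_split(month_index)
```

Keeping the time order intact avoids fitting the model on observations that come after the ones it is evaluated on.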

Figure 1. The unemployment rate in Thailand from January 2003 to December 2011

Table 1. ANN models and their training and testing errors

Table 2. The unit root test for stationarity by the ADF test

Time plot of the actual values (line), predicted values from ANN-1 (line with triangles) and predicted values from SARIMA (dashed line)