The Advantages of Dynamic Factor Models as Techniques for Forecasting: Evidence from Taiwanese Macroeconomic Data

This study applies an approximate dynamic factor model to forecast three macroeconomic variables of Taiwan: inflation based on the consumer price index, the unemployment rate, and the industrial production growth rate. Our data contain 95 macroeconomic variables of Taiwan and 89 international time series for 1981Q1-2006Q4. We perform out-of-sample forecasting under a rolling-window estimation scheme and compare our models with a univariate autoregressive model and a vector autoregressive model. We find that our dynamic factor model has superior performance in predicting inflation at all forecasting horizons. However, it shows only limited superiority for the industrial production growth rate and the unemployment rate. Moreover, we do not find that including international variables helps to improve the performance of a dynamic factor model in our application.


Introduction
Macroeconomic forecasting is a challenging topic that has attracted significant attention among researchers. A common way to perform macroeconomic forecasting is to use a univariate autoregressive model or a vector autoregressive model. The latter has an advantage over the former in that the system contains more information than the variable of interest alone. This paper compares the forecasting performance of three methods on Taiwan's data: an approximate dynamic factor model (DFM), an autoregressive model (AR), and a vector autoregressive model (VAR).
Including more information in a model may help generate better predictions. A univariate time series model uses only a limited subset of the whole information set, and thus a natural extension is a VAR. On the other hand, although abundant time series are available, it is unwise to include all the variables in one system. Estimating more parameters may result in imprecise estimates because of low degrees of freedom, especially in a VAR system. To use as much information as possible without the problem of parameter proliferation, we require statistical techniques that summarize the information in many series with few parameters.
One way to deal with abundant information is to use dimension reduction techniques. If the movements of all macroeconomic variables are driven by a few common sources, then a model that includes a few common factors should be able to explain most of the variation. Stock and Watson (2002) propose an approximate dynamic factor model to forecast macroeconomic variables. In contrast to a strict factor model, an approximate dynamic factor model allows for weak correlations in the idiosyncratic errors. They find that the most accurate forecasts of US inflation use lags of inflation together with a single factor. As for the method used to generate the factors, that is, to reduce the dimension of the variables, Principal Component Analysis (PCA) is perhaps the most popular choice, and it is also used in Stock and Watson (2002). Several researchers have applied this method to a range of issues. For example, Camacho and Sancho (2003) use monthly macroeconomic data from 1975 to 2001 to construct a diffusion index model. They show that their model outperforms AR and VAR models in forecasting Spain's price and output variables, although the outperformance for output variables is significant only at longer forecasting horizons. Marcellino et al. (2003) apply the dynamic factor model to forecasting European macroeconomic variables. Their results suggest that forecasts constructed by aggregating the country-specific models are more accurate than forecasts constructed using the aggregate data. Eickmeier (2005) investigates economic co-movements in the euro area. Matheson (2006) uses quarterly series from 1992 to 2004 for New Zealand to construct a dynamic factor model. He compares the forecasts of four variables (price, output, interest rate, and exchange rate) and finds that the dynamic factor model with few factors outperforms the method used at the Reserve Bank of New Zealand at longer forecasting horizons. Hsu et al. (2005) compare forecasts from their diffusion index model with those from several public and private economic institutes. Because of the limited number of pseudo out-of-sample observations, they use a sign test to compare the forecasting results and conclude that their model outperforms the others in forecasting the economic growth rate.
Our study uses an approximate dynamic factor model to forecast three macroeconomic variables: the industrial production growth rate, the unemployment rate, and the inflation rate based on the consumer price index. Motivated by Marcellino et al. (2003), we examine whether including international data improves forecasts of the variables of a specific country. We first estimate the model using only Taiwan's own data. Next, we incorporate data for the US and Japan, since these two countries are the main trading partners of Taiwan during our data period. We compare the forecasting performance of the three models by looking at their pseudo out-of-sample forecasts.

Methodology
The approximate dynamic factor model used in this paper follows the one proposed in Stock and Watson (2002). Let $x_{it}$ be the observed data for the i-th economic variable at time t, for i=1,2,…,N and t=1,2,…,T, and let $X_t = (x_{1t}, x_{2t}, \ldots, x_{Nt})'$. Consider the following model:
$$y_{t+1} = \beta' F_t + \gamma(L) y_t + \varepsilon_{t+1} \qquad (1)$$
$$X_t = \Lambda F_t + e_t \qquad (2)$$
where $y_{t+1}$ denotes the variable to be forecast; $F_t$ is the $r \times 1$ vector of common factors with coefficient vector $\beta$; $\gamma(L)$ is the lag polynomial; $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_N)'$ is an $N \times r$ factor loading matrix; $\lambda_i$ is the factor loading vector of the i-th variable with dimension $r \times 1$; and $\varepsilon_t$ and $e_t$ are disturbances. Equation (1) implies that $y_{t+1}$ is formed by previous factors, its own lags, and a disturbance. Equation (2) implies that the variation of the N variables can be explained by r factors. When N is large and r is small, we can forecast $y_{t+1}$ using r factors instead of N variables without losing the main information.
A popular way to construct factors from variables is principal component analysis, which is also used in Stock and Watson (2002). The first principal component, or the first factor, is formed by:
$$F_{1t} = B_1' X_t$$
where $B_1 = \arg\max_{B'B=1} B' \Sigma_x B$ and $\Sigma_x$ is the covariance matrix of $X_t$. Thus, $B_1$ is the eigenvector associated with the largest eigenvalue of $\Sigma_x$. The second principal component is constructed in the same way, subject to being orthogonal to the first factor. Thus, $B_2$ is the eigenvector corresponding to the second largest eigenvalue of $\Sigma_x$. The other factors can be formed in a similar way.
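The principal-component construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the authors' code: the function name `pca_factors`, the standardization step, and the simulated toy panel are all hypothetical choices made for the example.

```python
import numpy as np

def pca_factors(X, r):
    """Estimate r factors from a T x N panel by principal components."""
    # Assumption of this sketch: standardize each series so that no
    # variable dominates the factors through its scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Sample covariance matrix of the standardized panel (N x N).
    sigma_x = Z.T @ Z / Z.shape[0]
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(sigma_x)
    order = np.argsort(eigvals)[::-1][:r]   # indices of the r largest
    B = eigvecs[:, order]                   # N x r, columns B_1, ..., B_r
    F = Z @ B                               # T x r estimated factors
    return F, B, eigvals[order]

# Toy panel (simulated, not the paper's data): 2 latent factors, 20 series.
rng = np.random.default_rng(0)
T, N, r = 104, 20, 2
F_true = rng.standard_normal((T, r))
loadings = rng.standard_normal((N, r))
X = F_true @ loadings.T + 0.5 * rng.standard_normal((T, N))

F_hat, B, top_eigs = pca_factors(X, r)
```

Because the columns of `B` are orthonormal eigenvectors, the estimated factors are mutually uncorrelated in sample, and the variance of each factor equals the corresponding eigenvalue.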
To select the number of factors in the approximate dynamic factor model, we rely on the Bayesian Information Criterion (BIC). Bai and Ng (2002) propose more general panel criteria for selecting the number of factors, and this method has become popular in dynamic factor model applications. We apply both methods to select the number of factors in our dynamic factor models, but we find that the forecasting performance of the models selected by the Bai-Ng criteria is poor. For brevity, we report only the forecasting results of the dynamic factor model selected by BIC.
Once we construct the factors, we are able to form the one-step-ahead forecast from Equation (1). More generally, we can obtain the h-step-ahead forecasts by:
$$\hat{y}_{t+h|t} = \hat{\beta}_h' \hat{F}_t + \hat{\gamma}_h(L) y_t$$
where $\hat{F}_t$ is the vector of r estimated factors and $\hat{\gamma}_h(L)$ is the estimated lag polynomial containing p autoregressive lags. The number of factors, r, and the number of autoregressive lags, p, are selected by the Bayesian Information Criterion (BIC) during in-sample estimation. The maximum value of r is set to 12 and the maximum value of p to 6. We use direct forecasting instead of indirect (iterated) forecasting, since direct forecasting has been shown to be more accurate in Lin and Tsay (1996) and Ing (2003).
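The direct h-step-ahead regression with BIC selection of r and p might be sketched as follows. Everything here is an illustrative assumption rather than the paper's implementation: the function names `direct_forecast` and `select_by_bic`, the inclusion of an intercept, the particular BIC formula, the restriction p ≥ 1, and the simulated data. All candidate models are estimated on the same sample so that their BIC values are comparable.

```python
import numpy as np

def direct_forecast(y, F, h, r, p, start):
    """OLS regression of y[t+h] on a constant (an assumption of this
    sketch), the first r factors at t, and y[t], ..., y[t-p+1]."""
    T = len(y)

    def regressors(t):
        lags = y[t - p + 1:t + 1][::-1]          # y[t], y[t-1], ...
        return np.concatenate(([1.0], F[t, :r], lags))

    rows = range(start, T - h)                   # t with a known target
    Z = np.array([regressors(t) for t in rows])
    target = np.array([y[t + h] for t in rows])
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    n, k = Z.shape
    bic = n * np.log(resid @ resid / n) + k * np.log(n)
    forecast = regressors(T - 1) @ coef          # forecast of y[T-1+h]
    return forecast, bic

def select_by_bic(y, F, h, r_max=12, p_max=6):
    """Grid search over r and p; returns (bic, r, p, forecast)."""
    best = None
    for r in range(1, min(r_max, F.shape[1]) + 1):
        for p in range(1, p_max + 1):
            fc, bic = direct_forecast(y, F, h, r, p, start=p_max - 1)
            if best is None or bic < best[0]:
                best = (bic, r, p, fc)
    return best

# Simulated example: y depends on the first factor and its own first lag.
rng = np.random.default_rng(1)
T = 120
F = rng.standard_normal((T, 3))
y = np.zeros(T)
for t in range(T - 1):
    y[t + 1] = 1.0 + 2.0 * F[t, 0] + 0.5 * y[t] + 0.01 * rng.standard_normal()

bic, r, p, forecast = select_by_bic(y, F, h=1)
```

For each horizon h a separate regression is estimated, which is what distinguishes the direct approach from iterating a one-step model forward.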
The two benchmark forecasting models are the AR and the VAR. We choose the lag length of the AR based on BIC. Since there is no specific way to choose a good VAR model for forecasting, we follow the one in Stock and Watson (2002). Moreover, including more variables in a VAR usually leads to imprecise estimation because of the large loss of degrees of freedom. Thus, we consider a three-variable VAR adequate for our application. The variables in the VAR model are the industrial production growth rate, the inflation rate based on the consumer price index, and the 90-day interest rate. The industrial production growth rate is replaced with the unemployment rate when we forecast the unemployment rate. The series are transformed to be stationary if necessary. Stock and Watson (2002) find that the fixed-lag VAR performs better than the VAR selected by BIC, and we set the fixed lag length to 3, which is the length most often chosen by BIC in our application.

Data Description
The full data set contains 95 quarterly time series for Taiwan, 53 quarterly time series for the United States, and 36 quarterly time series for Japan from 1981Q1 to 2006Q4. We use quarterly data instead of monthly data because more series are available at the quarterly frequency. We choose the series of the US and Japan because they are the main trading partners of Taiwan during this period. The first 15 years of data, 1981Q1-1995Q4, are used for the initial in-sample estimation, and the remaining observations are reserved for our pseudo out-of-sample forecasting. The in-sample estimation is based on a 15-year rolling-window scheme. We do not report the in-sample estimation results because, for the AR models, model selection over different estimation windows leads to different model specifications. Since our goal is to compare forecasting performance among models, reporting all the in-sample estimation results would be less meaningful.
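The 15-year rolling-window scheme can be made concrete with a small index generator. This is a sketch under an assumed timing convention (each window ends at the forecast origin and the target lies h quarters later); the text does not spell out the convention for h > 1, and the helper names `rolling_windows` and `quarter_label` are hypothetical.

```python
def rolling_windows(n_obs, window, h):
    """Yield (start, end, target) index triples.

    Estimation uses observations start..end-1; the forecast target is
    the observation h steps after the last in-sample observation.
    """
    start = 0
    while start + window - 1 + h < n_obs:
        end = start + window
        yield start, end, end - 1 + h
        start += 1

def quarter_label(index, first_year=1981):
    """Map a 0-based quarterly index to a label such as '1996Q1'."""
    return f"{first_year + index // 4}Q{index % 4 + 1}"

n_obs = 26 * 4      # 1981Q1-2006Q4
window = 15 * 4     # 15-year estimation window

windows = list(rolling_windows(n_obs, window, h=1))
first_start, first_end, first_target = windows[0]
print(quarter_label(first_start), quarter_label(first_end - 1),
      quarter_label(first_target))
```

Under this convention the first window covers 1981Q1-1995Q4 and targets 1996Q1, and the number of available forecasts shrinks as the horizon h grows.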

Forecasting results
Table 1 presents our forecasting comparison. We use the mean squared error (MSE) from the AR as the reference MSE and report the relative MSE of the other models. DFM denotes the dynamic factor model selected by BIC using domestic data only, and DFM_INT denotes the dynamic factor model selected by BIC using international data. A relative MSE smaller than one implies that, on average, the squared forecast error generated by the model is smaller than that generated by the AR. We use the Diebold-Mariano statistic to determine whether the superior forecasting performance is statistically significant. A relative MSE only slightly below one can still be statistically significant if the model generates smaller forecast errors than the AR in most periods.
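The two evaluation measures can be sketched as below. This is an illustration only: the squared-error loss differential, the rectangular-kernel long-run variance with h-1 autocovariances, the sign convention (positive values favour the model over the benchmark), and the made-up error series are assumptions of the example, not details confirmed by the text.

```python
import math

def relative_mse(errors_model, errors_benchmark):
    """MSE of the model divided by MSE of the benchmark (e.g. the AR)."""
    mse_model = sum(e * e for e in errors_model) / len(errors_model)
    mse_bench = sum(e * e for e in errors_benchmark) / len(errors_benchmark)
    return mse_model / mse_bench

def diebold_mariano(errors_model, errors_benchmark, h=1):
    """Diebold-Mariano statistic under squared-error loss.

    Positive values indicate that the model has the smaller loss.
    """
    # Loss differential: benchmark loss minus model loss.
    d = [b * b - m * m for m, b in zip(errors_model, errors_benchmark)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar)
                   for t in range(k, n)) / n

    # Rectangular kernel with h-1 autocovariances; for h > 1 this
    # estimate is not guaranteed to be positive in small samples.
    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, h))
    return d_bar / math.sqrt(long_run_var / n)

# Made-up forecast errors, for illustration only.
errors_ar = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
errors_dfm = [0.5, -1.0, 1.0, -0.5, 1.5, -1.0]

rel = relative_mse(errors_dfm, errors_ar)
dm = diebold_mariano(errors_dfm, errors_ar, h=1)
```

Under the usual asymptotic normal approximation, a one-sided test at the 5% level would compare the statistic with 1.645.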
We find that our DFM consistently beats the AR in 1-step-ahead to 4-step-ahead forecasting of inflation. Moreover, the DFM also outperforms the VAR in 1-step-ahead and 4-step-ahead forecasting of inflation. As regards unemployment rate prediction, our DFM outperforms the AR in 1-step-ahead forecasting but not at the longer forecasting horizons. The DFM even performs worse than the AR in 4-step-ahead forecasting of the unemployment rate. For the industrial production growth rate, we find that the DFM is superior to both the AR and the VAR in 2-step-ahead forecasting but not at the other horizons.
Does including more data in a dynamic factor model improve prediction performance? We answer this question by including international time series in our dynamic factor model (DFM_INT). Similar to the DFM, the DFM_INT consistently outperforms the AR model in inflation prediction at all forecasting horizons. The DFM_INT also performs better than the VAR in 1-step-ahead and 4-step-ahead forecasting. In addition, the DFM_INT outperforms the AR in 1-step-ahead forecasting of the unemployment rate. However, we do not find any superior performance of the DFM_INT when we forecast the industrial production growth rate. Although the DFM_INT forecasts better than the DFM in some cases (e.g., 1-step-ahead and 4-step-ahead forecasting of inflation), it generates worse predictions than the DFM in others (e.g., 2-step-ahead and 4-step-ahead forecasting of the industrial production growth rate). Therefore, we conclude that including more data in a dynamic factor model does not help improve forecasting performance in our application.

Discussion and conclusion
This study uses the dynamic factor model proposed in Stock and Watson (2002) to forecast three macroeconomic variables of Taiwan: the inflation rate based on the consumer price index, the unemployment rate, and the industrial production growth rate. We compare the forecasting performance of a univariate autoregressive model (AR), a vector autoregressive model (VAR), and our dynamic factor model (DFM). We find that our dynamic factor model performs relatively well in predicting the inflation rate: it consistently outperforms the AR model at all forecasting horizons. It also outperforms the VAR model in 1-step-ahead and 4-step-ahead forecasting of inflation. However, the strong performance of our DFM does not generally carry over to the other two variables. Moreover, we find that the DFM makes worse predictions than the AR model in 4-step-ahead forecasting of the unemployment rate.
We also include international data to see whether more data help prediction. Our results show that the evidence is mixed. The weaker performance of our dynamic factor model using international data (DFM_INT) can be seen in the 2-step-ahead forecasting of the industrial production growth rate: the DFM outperforms the AR while the DFM_INT does not. This result supports the view that one should include only variables that exhibit high explanatory power with respect to the variable one aims to forecast (Breitung & Eickmeier, 2005; Marcellino et al., 2003).
This study has some limitations. First, we examine the performance of only one dynamic factor model, the one proposed in Stock and Watson (2002). Although this specification is perhaps the most popular one (Eickmeier & Ziegler, 2006), other specifications, such as the one proposed in Forni et al. (2000), may lead to different forecasting performance. Second, although we do not find the inclusion of international variables helpful in prediction, other international variables may bring useful information. For example, China has been the largest trading partner of Taiwan since 2005, and we expect the economic activities of China and Taiwan to be closely linked. We do not include China's data in this study because the data are largely missing before 1994. To retain enough observations for in-sample estimation and out-of-sample forecasting, we decided to exclude China's variables. Using monthly data may alleviate this problem, although we would still face the trade-off between the number of variables and the number of time observations.

b. DM indicates the Diebold-Mariano statistic proposed in Diebold and Mariano (1995).
DFM: dynamic factor model selected by BIC with domestic data only.
DFM_INT: dynamic factor model selected by BIC with international data.
* The forecast outperforms the AR's forecast at the 5% significance level.
+ The forecast outperforms the VAR's forecast at the 5% significance level.
Following Stock and Watson (2002), the data cover several important categories of macroeconomic variables: real output, tax, labor market, stock market and money, exchange rate and interest rate, and price index and wage. The data for Taiwan are obtained from the AREMOS database and the Statistical Database of the Directorate General of Budget, Accounting and Statistics, Executive Yuan, R.O.C. (Taiwan). The data for the United States and Japan are obtained from AREMOS and the International Financial Statistics of the International Monetary Fund. All variables are listed in the Appendix. We first remove the seasonality in the series by the moving average method used in Stock and Watson (2002). Next, we transform the series to be stationary and check them with the Augmented Dickey-Fuller unit root test. The transformation method for each series is described in the Appendix.

Table 1. Forecasting performance
a. Rel. MSE indicates the mean squared prediction error relative to that from the AR.