How Did the Financial Crisis Affect the Forecasting Performance of Time Series Exchange Rate Models? Evidence from Euro Rates

This paper uses monthly data on euro exchange rates vis-à-vis major currencies, covering the period 1999-2012, to compare the forecasting ability of alternative stochastic exchange rate representations. In particular, we test the out-of-sample forecasting performance of a random walk, a non-linear Markov switching regimes process, and a vector autoregressive representation reflecting the dynamics of linear structural exchange rate models. These statistical models are evaluated in terms of the root mean square error of one-month to twelve-month out-of-sample forecasts. The empirical evidence points to the random walk puzzle, that is, the superiority of the naïve model in forecasting exchange rates before the crisis of 2008. However, this outcome is consistently reversed following the 2008 financial turmoil and the naïve model seems to regain some of its forecasting power only after 2011. These results suggest that different stochastic representations are appropriate for the exchange rate depending on the presence of financial calmness or turbulence.


Introduction
The superiority of a naïve random walk model in out-of-sample exchange rate prediction, relative to structural approaches based on fundamentals, has long been documented (Meese and Rogoff, 1983). Recent evidence has also reproduced the random walk property of exchange rates (see Chortareas et al., 2011), and this line of research has converged on the widespread belief that exchange rate random walk forecasts are extremely difficult to beat. However, Engel and Hamilton (1990) presented evidence that refuted the result of Meese and Rogoff (1983) on the superiority of the random walk over other atheoretical time series representations. In particular, Engel and Hamilton (1990) reported that a Markov switching regimes model appears to beat the random walk in both in-sample and out-of-sample exchange rate forecasts. Also, Kirikos (2000), based on an extended data set, verified the in-sample superiority of a random walk, but he also reported evidence on the out-of-sample superiority of the Markov switching regimes process as the forecast window converged to the end of the full sample. More recently, Nikolsko-Rzhevskyy and Prodan (2012) found that the model of Engel and Hamilton (1990) outperforms the random walk in both short-run and long-run forecasting accuracy for US dollar exchange rates over the post-1973 floating period.
In this paper, we reconsider the forecasting ability of a random walk against that of a Markov stochastic segmented trends process and of a vector autoregression (VAR), using a data set on euro rates. Specifically, the forecasting performance of the three models is evaluated on the basis of the root mean square error (RMSE) of forecasts estimated on monthly data on the currencies of the USA, the UK, Japan, Norway, Sweden, Switzerland, Australia, and Canada relative to the euro (€) over the period 1999-2012. The empirical evidence points to the random walk puzzle, that is, the superiority of the naïve model in forecasting exchange rates before the crisis of 2008. However, it turns out that structural approaches and representations that allow for policy shifts provide a better setting for predicting euro rates after the outbreak of the financial crisis.
The methodological approach is outlined in Section 2 and the empirical results are reported in Section 3. The final section contains a discussion and conclusions.

Methodological Approach (Note 1)
Assume that $e_t$ ($t = 1, 2, \ldots, T$) denotes the logarithm of the exchange rate and $s_t$ the first difference of $e_t$ ($s_t = e_t - e_{t-1}$). If $e_t$ follows a random walk with a drift, the k-period-ahead forecast of $e_{t+k}$, based on information at time $t$, is:
$$E(e_{t+k} \mid I_t) = e_t + k\hat{\mu}, \qquad (1)$$
where $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} s_i$ is the estimated drift, and $n$ is any sub-sample ($n \le T$) used for out-of-sample forecasting.
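The random walk forecast can be illustrated with a minimal sketch. The function name `rw_drift_forecast` and the toy series are hypothetical and are not taken from the authors' GAUSS code. The sketch also assumes that the drift is estimated as the mean of the first differences available within the sub-sample.

```python
def rw_drift_forecast(e, n, k):
    """k-period-ahead forecast of the log exchange rate from a random walk with drift.

    e : list of log exchange rates (e_1, ..., e_T)
    n : size of the estimation sub-sample (first n observations)
    k : forecast horizon in periods
    """
    if n < 2 or n > len(e) or k < 1:
        raise ValueError("need 2 <= n <= len(e) and k >= 1")
    sample = e[:n]
    # First differences s_t = e_t - e_{t-1} within the sub-sample.
    diffs = [sample[i] - sample[i - 1] for i in range(1, n)]
    # Assumption: the drift is the sample mean of the available differences.
    drift = sum(diffs) / len(diffs)
    # Forecast: last observed level plus k times the estimated drift.
    return sample[-1] + k * drift


# Toy series with a constant monthly change of 0.1 in the log rate.
toy = [0.0, 0.1, 0.2, 0.3]
two_months_ahead = rw_drift_forecast(toy, n=4, k=2)
```

With a constant change of 0.1 per period, the two-month-ahead forecast is the last level plus 0.2.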
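Alternatively, the drift parameter may be allowed to vary as follows:
$$s_t = \mu_{h_t} + \sigma_{h_t} u_t, \qquad h_t \in \{1, 2\}, \qquad (2)$$
$$P = \begin{bmatrix} p_{11} & 1 - p_{11} \\ 1 - p_{22} & p_{22} \end{bmatrix}, \qquad (3)$$
where $P$ is the transition probability matrix of the unobserved state $h_t$ and $p_{ij} = \Pr(h_t = j \mid h_{t-1} = i)$, $i, j = 1, 2$. Under these circumstances, the forecast of $s_{t+k}$, on the basis of information available at time $t$, is (Hamilton, 1993; Kirikos, 1996):
$$E(s_{t+k} \mid S_t) = \pi_t' P^k \mu_h, \qquad (4)$$
where $S_t$ is the history of $s$ up to time $t$, $\pi_t' = [\Pr(h_t = 1 \mid S_t)\ \ \Pr(h_t = 2 \mid S_t)]$ is the vector of state probabilities at date $t$ (Hamilton, 1990, 1993), and $\mu_h' = [\mu_1\ \ \mu_2]$ is the vector of state means. The probabilities $\pi_t$ are based on a nonlinear filter and, therefore, forecasts given by (4) are nonlinear.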

Estimates of the parameters ($\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$, $p_{11}$, $p_{22}$) are based on the maximization of the sample likelihood function through the EM algorithm (see Hamilton, 1990).
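The forecast in (4) can be illustrated with a short sketch. It takes the filtered state probabilities and the parameter estimates as given, so neither the nonlinear filter nor the EM estimation is implemented here. The function names and the numerical values are hypothetical. The second function is an illustration that cumulates the forecast changes to obtain a forecast of the log level.

```python
def markov_forecast_s(pi_t, p11, p22, mu, k):
    """Forecast of the change s_{t+k} for a two-state Markov switching model.

    pi_t : [Pr(h_t=1 | S_t), Pr(h_t=2 | S_t)], taken as given (filter output)
    p11, p22 : probabilities of staying in state 1 and in state 2
    mu : [mu_1, mu_2], the state means
    k : forecast horizon (k >= 1)
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Transition matrix with rows indexed by the current state.
    P = [[p11, 1.0 - p11], [1.0 - p22, p22]]
    probs = list(pi_t)
    # Propagate the state probabilities k steps ahead (row vector times P).
    for _ in range(k):
        probs = [probs[0] * P[0][0] + probs[1] * P[1][0],
                 probs[0] * P[0][1] + probs[1] * P[1][1]]
    # Expected change: predicted state probabilities weighted by state means.
    return probs[0] * mu[0] + probs[1] * mu[1]


def markov_forecast_e(e_t, pi_t, p11, p22, mu, k):
    """Forecast of the log level e_{t+k}: e_t plus the cumulated forecast changes."""
    return e_t + sum(markov_forecast_s(pi_t, p11, p22, mu, j)
                     for j in range(1, k + 1))


# Hypothetical inputs: state 1 is certain today, with persistent regimes.
change_1 = markov_forecast_s([1.0, 0.0], 0.9, 0.8, [1.0, -1.0], 1)
level_2 = markov_forecast_e(0.0, [1.0, 0.0], 0.9, 0.8, [1.0, -1.0], 2)
```

Because the state probabilities drift toward the ergodic distribution of the chain as k grows, the forecast change is not constant across horizons, unlike the fixed drift of the random walk.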
Using (4), we obtain forecasts of the logarithm of the exchange rate by the following equation:
$$E(e_{t+k} \mid S_t) = e_t + \sum_{j=1}^{k} E(s_{t+j} \mid S_t). \qquad (5)$$
Next, we look at a class of linear forecasts along the lines of structural asset market models of the exchange rate.
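In particular, we consider vector autoregressive (VAR) representations for exchange rates and observed fundamentals as proposed by Engel and West (2004, 2005), that is, VARs in the exchange rate and the variables $y_t - y_t^*$, $m_t - m_t^*$, $p_t - p_t^*$, $r_t - r_t^*$, where $y_t$ is the logarithm of domestic GDP, $m_t$ is the logarithm of the domestic money supply, $p_t$ is the logarithm of the domestic price level, $r_t$ is the domestic interest rate, and starred variables are the foreign counterparts. More specifically, it is assumed that the vector series $x_t = [e_t,\ y_t - y_t^*,\ m_t - m_t^*,\ p_t - p_t^*,\ r_t - r_t^*]'$ follows the process:
$$x_t = a(L)\,x_t + \varepsilon_t, \qquad a(L) = [a_{il}(L)], \qquad (6)$$
where $a_{il}(L) = \sum_{m=1}^{j} a_{il,m} L^m$ ($i, l = 1, 2, 3, 4, 5$) are polynomials in the lag or backshift operator $L$, all of order $j$. The VAR process (6) can be equivalently written in the companion form:
$$\begin{bmatrix} x_t \\ x_{t-1} \\ \vdots \\ x_{t-j+1} \end{bmatrix} = \begin{bmatrix} A_1 & A_2 & \cdots & A_j \\ I & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & I & 0 \end{bmatrix} \begin{bmatrix} x_{t-1} \\ x_{t-2} \\ \vdots \\ x_{t-j} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad (7)$$
$$z_t = A z_{t-1} + v_t, \qquad (8)$$
where $A_m = [a_{il,m}]$ collects the coefficients on lag $m$, and $z_t$, $v_t$, and the matrix of coefficients $A$ are defined implicitly in (7) and (8). The first-order companion form (8) is very convenient for taking conditional expectations since $E(z_{t+k} \mid I_t) = A^k z_t$, $k > 0$, where $I_t$ is the information set which contains the history through time $t$ of the information variables included in the vector $z_t$.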
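The companion-form forecast $E(z_{t+k} \mid I_t) = A^k z_t$ can be sketched as follows. The helper names `companion_matrix` and `forecast_companion` are hypothetical, the coefficient matrices are assumed to have been estimated already, and the toy example uses a single variable with two lags rather than the five-variable system of the paper.

```python
def companion_matrix(lag_coeffs):
    """Stack j coefficient matrices (each m x m) into the (m*j) x (m*j) companion matrix."""
    j = len(lag_coeffs)
    m = len(lag_coeffs[0])
    size = m * j
    A = [[0.0] * size for _ in range(size)]
    # Top block row: the lag coefficient matrices side by side.
    for lag, coeff in enumerate(lag_coeffs):
        for r in range(m):
            for c in range(m):
                A[r][lag * m + c] = coeff[r][c]
    # Identity blocks below shift each lagged vector down by one position.
    for r in range(m, size):
        A[r][r - m] = 1.0
    return A


def forecast_companion(A, z_t, k):
    """Conditional expectation of z_{t+k}: apply A to the stacked vector z_t k times."""
    z = list(z_t)
    for _ in range(k):
        z = [sum(A[r][c] * z[c] for c in range(len(z))) for r in range(len(A))]
    return z


# Toy example: one variable, two lags, x_t = 0.5 x_{t-1} + 0.25 x_{t-2} + error.
A = companion_matrix([[[0.5]], [[0.25]]])
z_t = [1.0, 2.0]                     # stacked vector [x_t, x_{t-1}]
z_forecast = forecast_companion(A, z_t, 2)
x_forecast = z_forecast[0]           # first block holds the forecast of x_{t+2}
```

Only the first block of the forecast vector is needed, since the remaining blocks are forecasts of the lagged values that the stacking carries along.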
For all models, out-of-sample forecasts are taken by rolling estimation. More precisely, we select a sub-sample of size $n$ and obtain an initial prediction for the forecast horizon $k$. The next k-period-ahead forecast is computed by including in the sub-sample the next available observation, so that the sub-sample size becomes $n+1$, and this iteration continues until the maximum possible sub-sample size $T-k$, $T$ being the full sample size.
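The root mean square error (RMSE) of forecasts is computed by:
$$\mathrm{RMSE}(k) = \sqrt{\frac{1}{T-k-n+1} \sum_{t=n}^{T-k} \left(\hat{e}_{t+k|t} - e_{t+k}\right)^2}, \qquad (9)$$
where $\hat{e}_{t+k|t}$ is the forecast of $e_{t+k}$ based on the first $t$ observations, $n$ is the initial sub-sample size, and $k$ is the forecast horizon.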
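The estimation scheme and the RMSE calculation can be sketched together. The sketch grows the sub-sample from size n to T-k, produces one k-period-ahead forecast per sub-sample, and averages the squared errors. The function names are hypothetical, the default forecaster is a simple random walk with drift, and any other model can be plugged in as a function of the sub-sample and the horizon.

```python
import math


def rw_drift_forecast(history, k):
    """Random walk with drift: last level plus k times the mean first difference."""
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + k * drift


def expanding_window_rmse(e, n, k, forecaster=rw_drift_forecast):
    """RMSE of k-period-ahead forecasts from sub-samples of size n, n+1, ..., T-k."""
    T = len(e)
    if not (2 <= n <= T - k) or k < 1:
        raise ValueError("need k >= 1 and 2 <= n <= T - k")
    squared_errors = []
    for size in range(n, T - k + 1):
        forecast = forecaster(e[:size], k)   # uses the first `size` observations only
        actual = e[size + k - 1]             # realized value k periods after the last one used
        squared_errors.append((forecast - actual) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy series; a no-change forecaster is used to show how a model is plugged in.
toy = [0.0, 1.0, 3.0, 6.0]
rmse_no_change = expanding_window_rmse(toy, n=2, k=1,
                                       forecaster=lambda history, k: history[-1])
```

In the toy example the two forecast errors are 2 and 3, so the RMSE equals the square root of 6.5.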

Empirical Results
The empirical investigation is based on monthly data on the currencies of the USA (USD), the UK (GBP), Japan (JPY), Norway (NOK), Sweden (SEK), Switzerland (CHF), Australia (AUD), and Canada (CAD) vis-à-vis the euro (€) covering the period from January 1999 to August 2012 (164 observations). Data and sources are described in the appendix and all estimations are carried out by code written in GAUSS.
Variables included in the VAR representation are difference stationary and, thus, the first differences of the series are used for the estimation of the linear model. VAR estimates were obtained for lag lengths between 3 and 6, but the results do not change considerably and, therefore, forecast errors are reported for a VAR with 3 lags only. However, VAR forecasts are not reported for the currencies of Switzerland and Australia due to the lack of monthly data on industrial production, which is used to proxy output.
The RMSEs of out-of-sample forecasts at horizons of 1 to 12 months are presented in the following graphs for three different post-sample periods, namely 2006:9-2012:8, 2008:9-2012:8, and 2011:9-2012:8. Each row of the graphs is related to a specific euro exchange rate, while each column refers to a given post-sample period in the order outlined previously. The only exception is the Euro/CAD rate, for which the second post-sample period is 2009:9-2012:8. The notation is RW(dr) for the random walk with a drift, (Note 2) 'Markov' for the Markov switching regimes model, and VAR(3) for the vector autoregressive model with 3 lags.
The graphs in the first column show that the random walk by far outperforms both the Markov and the VAR models in terms of RMSEs of forecasts for all currencies and forecast horizons, when the post-sample period is 2006:9-2012:8. However, the middle column of graphs reveals a rather systematic reversal of this outcome for all currencies when the forecast window is limited to the period after the outbreak of the financial crisis in September 2008. Indeed, RMSEs of naïve model forecasts deteriorate for all currencies and most forecast horizons, when the post-sample period is 2008:9-2012:8. (Note 3) For the Euro/CAD rate (last row of graphs), this reversal sets in somewhat later and, for this reason, the second post-sample period is 2009:9-2012:8 in this case only.
When the forecast window is further limited to the period after September 2011, the naïve model seems to regain some of its predictive power as RW(dr) RMSEs improve considerably relative to those of the Markov and the VAR(3) models. This can be easily inferred from the graphs in the third column, for all currencies and forecast horizons. Nevertheless, the pre-crisis superiority of the random walk is not re-established, as Markov and VAR forecasts seem to be as good as, and in many cases (USD, JPY, CHF) better than, naïve forecasts.

Conclusions
Based on a monthly data set for euro rates over the period 1999-2012, we obtained evidence that, before the 2008 financial crisis, the out-of-sample forecasting performance of a random walk with a drift is superior to that of a Markov switching regimes model and of a VAR representation. However, this outcome is consistently reversed following the 2008 financial turmoil.
These findings strongly suggest that in periods of financial calmness there is no better use of the information included in past values of the exchange rate than a simple observation of the values themselves. This result is most likely due to the absence of exceptional monetary or fiscal actions that could produce noticeable effects on exchange rates. However, in periods of financial turmoil this naïve approach does not work since there are active monetary and fiscal interventions which are not accounted for when a simple reproduction of past behavior is projected into the future. Indeed, after September 2008 the forecasting superiority of the random walk is drastically reversed for all euro rates considered, and models that allow for policy shifts or include targeted fundamentals do a much better job in predicting the exchange rate. Since this was a period of exceptionally active fiscal consolidation in many euro area countries, due to the debt crisis, and of very active monetary interventions in most major economies, we can rather safely infer that the naïve random walk model does not capture exchange rate behavior under such conditions. Instead, both the Markov switching regimes model and the linear VAR model appear to be better representations for the stochastic behavior of the exchange rate in periods of turbulence which are characterized by very active policy actions. Also, it is worth noting that the non-linear Markov model and the linear VAR representation exhibit similar forecasting competence, suggesting that policy changes can be traced either by searching the data for structural shifts or by resorting to information variables that incorporate such changes. (Note 4) In any case, it seems that policy response explanations associated with non-linearities in euro rates (e.g., Kirikos, 2002, 2004) may be empirically relevant.
Finally, the reported evidence indicates that as we move away from the crisis outbreak and the financial turmoil recedes, the naïve model regains some of its predictive power. This behavior should be further gauged as time goes by and new observations become available, and in conjunction with the evolution of the ongoing crisis. Such additional information will provide a more solid basis for judging whether different stochastic representations are appropriate for the exchange rate depending on the presence of financial calmness or turbulence, as suggested by the evidence presented here. This is probably a non-trivial issue because, insofar as this emerging exchange rate behavior is replicated in the future, it opens up new directions both for theoretical exchange rate modeling and for more efficient statistical treatments.
In equation (2), $h_t$ is an unobserved state variable taking the values {1, 2}, and $u_t$ is an error term. State 1 can be thought of as the exchange rate depreciation state, while state 2 can be regarded as the revaluation state. Equation (2) allows for different means and variances across states, and the variable $h_t$ is assumed to follow a Markov chain with stationary transition probabilities $p_{ij}$.