Spectral Bandwidth Selection for Long Memory

Long-memory parameter estimation by log-periodogram regression depends heavily on the frequency bandwidth and the order of estimation. The literature shows that a data-dependent plug-in method for the bandwidth significantly increases the MSE. For a long-memory time series with a mild short-range effect, a simple approach to determining the bandwidth is suggested, based on spectral analysis. Monte Carlo simulation results and empirical applications show that the proposed bandwidth selection performs satisfactorily.


Introduction
There has been research on the estimation of the long-memory parameter of a time series using periodogram-based semi-parametric estimators, namely the local Whittle (Kunsch, 1987; Robinson, 1995a), average periodogram (Robinson, 1994) and log-periodogram (Geweke & Porter-Hudak, 1983; Robinson, 1995b) estimators. Semi-parametric estimation procedures are desirable in the time series analysis of financial measurements sampled at high frequencies (Barros, Gil-Alana & Payne, 2014; Bollerslev et al., 2013; Garvey & Gallagher, 2013), as they allow the long-run characteristics (low-frequency behaviour) of the series to be estimated without knowledge of its short-run (high-frequency) structure. Among these methods, the log-periodogram (LP) regression proposed by Geweke and Porter-Hudak (1983) (GPH) has become a popular tool for statistical inference in empirical research because of its simple implementation, pivotal asymptotic normality and robustness as a result of the local condition (Arteche & Orbe, 2009). Nonetheless, it has been criticized for its finite-sample bias (Agiakloglou, Newbold & Wohar, 1993). To reduce the bias, Andrews and Guggenberger (2003) (AG) proposed a bias-reduced log-periodogram estimator (BRLP) $\hat{d}_r$, $r \in \mathbb{Z}$. The literature shows that the rate of convergence to zero of the mean-squared error (MSE) of $\hat{d}_0$ is of order $n^{-4/5}$, whereas that of $\hat{d}_r$ is of order $n^{-(4+4r)/(5+4r)}$ with $r \geq 1$. The rate of convergence of the latter also exceeds that of the local Whittle estimator (Robinson, 1995a) and the average periodogram (Robinson, 1994), provided the spectral density is sufficiently smooth. Nonetheless, as with any semi-parametric estimator, the bandwidth $m$ plays an important role in the performance of $\hat{d}_r$. A large bandwidth reduces the variance at the cost of increased bias, and the estimates of the memory parameter vary significantly with the choice of $m$. To balance the squared bias and the variance, the optimal bandwidth is usually chosen to minimize an approximation of the MSE or of the root mean-squared error (RMSE).
There are basically three formal procedures for choosing the optimal bandwidth. Hurvich and Deo (1999) proposed a plug-in method that minimizes an asymptotic approximation of the MSE; Giraitis, Robinson and Samarov (2000) introduced an adaptive LP that chooses $m$ to adapt to an unknown local-to-zero spectral smoothness; and Arteche and Orbe (2009) suggested a bootstrap-based bandwidth choice that minimizes a local bootstrap MSE. The adaptive LP does not give a unique choice of bandwidth, only bandwidths with an optimal growth rate. The bootstrap-based bandwidth is claimed to be robust, provided the signals from the data are not misinterpreted. Nonetheless, this technique depends largely on the choice of resampling width and the range of bandwidths considered for optimization. The plug-in method for selecting the number of frequencies that minimizes the asymptotic MSE of $\hat{d}_0$ or $\hat{d}_r$, on the other hand, is easy to implement, but it is usually not adequate because it depends on unknowns that must themselves be estimated (Andrews & Guggenberger, 2003; Arteche, 2004; Delgado & Robinson, 1996; Henry & Robinson, 1996; Henry, 2001). Besides, as the convergence rate is an asymptotic property that may not be reflected in finite samples, an improvement in the search for an optimum bandwidth is deemed crucial. Such an effort is worthwhile because much research relies on long-memory parameter estimates obtained with an arbitrary choice of bandwidth (Charfeddine, 2014; Choi, Yu, & Zivot, 2010; Garvey & Gallagher, 2013). In a long-memory time series with a mild short-range effect, the search for the optimum bandwidth can be done easily with spectral analysis. The following section reviews the log-periodogram regression estimator. Section 3 considers an alternative approach to bandwidth selection using spectral analysis. Section 4 provides a simulation study in finite samples, Section 5 illustrates the proposed method with empirical examples and Section 6 offers the concluding remarks.

Log-periodogram Regression Estimator
The spectral density of a semi-parametric model for a stationary Gaussian long-memory time series $\{X_t : t = 1, \ldots, n\}$ is given by
$$f(\lambda) = |1 - e^{-i\lambda}|^{-2d} f^*(\lambda), \tag{1}$$
where $d \in (-0.5, 0.5)$ determines the low-frequency properties. Specifically, when $0 < d < 0.5$, the series exhibits long memory. $f^*(\cdot)$ is an even, positive, continuous function on $[-\pi, \pi]$ with $0 < f^*(0) < \infty$. It determines the high-frequency properties of the series, which relate to the short-term correlation structure. A model that takes a fractional difference of order $d$, a $p$-order autoregressive (AR) component and a $q$-order moving average (MA) component, abbreviated as ARFIMA($p$, $d$, $q$) and introduced by Granger and Joyeux (1980) and Hosking (1981), is a special case of a long-range process satisfying Eq. (1). Based on this, Robinson (1995b) wrote the GPH estimator in the form of the pseudo-regression model of Eq. (2). To avoid the bias in $\hat{d}_0$ (Hurvich et al., 1998) that arises because the error term $\{\varepsilon_j\}$ is not i.i.d. (Hurvich & Beltrao, 1993; Kunsch, 1986; Robinson, 1995b), AG proposed a bias-reduced log-periodogram regression (BRLP) that gives $\hat{d}_r$, which basically adds the regressors $\lambda_j^{2k}$, $k = 1, \ldots, r$, with $r \geq 1$, to the pseudo-regression model of Eq. (2). The asymptotic bias of $\hat{d}_r$ is of order $m^{2+2r}/n^{2+2r}$, compared with that of the GPH estimator, which is of order $m^2/n^2$. As such, a larger bandwidth $m$ is preferred for $\hat{d}_r$ because, with $m$ being the effective sample size used to estimate $d$, the standard deviation of the estimator declines at the approximate rate $1/\sqrt{m}$, which in effect yields a low RMSE.
Various efforts have been made to determine a suitable bandwidth $m$. For the plug-in method, Hurvich et al. (1998) suggested an optimal bandwidth choice for $\hat{d}_0$ which, following the lines of Delgado & Robinson (1996), can be estimated using a Taylor expansion of $f^*$ about $\lambda = 0$. As $f^*$ is unknown, this method is not fully automatic, and the estimation can be poor for some $f^*(\lambda)$ functions. For the $\hat{d}_r$ estimator, AG proposed the plug-in MSE-optimal choice of bandwidth. This method has the same problem as in the GPH case: $f^*(\lambda)$ is unknown and a non-parametric estimation is involved, which leads to an increase in the MSE of $\hat{d}_r$. Essentially, there is still no clear guidance on how large a bandwidth should be taken. As the performance of the estimator varies with the bandwidth, a more efficient approach to the data-dependent choice of bandwidth is desired in finite samples.
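The regression reviewed in this section can be sketched in a few lines of plain Python. This is a minimal illustration only: it assumes the regressor $-2\log\lambda_j$ used by AG (GPH's original regressor is based on $|1 - e^{-i\lambda_j}|$ instead), and the function names `periodogram`, `solve` and `brlp_from_periodogram` are hypothetical helpers, not part of any cited work. Setting `r=0` gives a GPH-type estimate.

```python
import cmath
import math


def periodogram(x, m):
    """Periodogram at the Fourier frequencies lambda_j = 2*pi*j/n, j = 1..m."""
    n = len(x)
    mean = sum(x) / n
    lams, vals = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        dft = sum((xt - mean) * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        lams.append(lam)
        vals.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return lams, vals


def solve(a, b):
    """Solve the small linear system a z = b by Gaussian elimination with pivoting."""
    k = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, k):
            f = aug[i][c] / aug[c][c]
            for col in range(c, k + 1):
                aug[i][col] -= f * aug[c][col]
    z = [0.0] * k
    for i in reversed(range(k)):
        z[i] = (aug[i][k] - sum(aug[i][c] * z[c] for c in range(i + 1, k))) / aug[i][i]
    return z


def brlp_from_periodogram(lams, vals, r=0):
    """OLS of log I_j on [1, -2 log lambda_j, lambda_j^2, ..., lambda_j^(2r)].

    Returns the coefficient on -2 log lambda_j, i.e. the estimate of d.
    """
    rows = [[1.0, -2.0 * math.log(l)] + [l ** (2 * k) for k in range(1, r + 1)]
            for l in lams]
    y = [math.log(v) for v in vals]
    p = r + 2
    xtx = [[sum(row[a] * row[b] for row in rows) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(rows, y)) for a in range(p)]
    return solve(xtx, xty)[1]
```

For a series `x` and bandwidth `m`, an estimate would be obtained as `brlp_from_periodogram(*periodogram(x, m), r=1)`. The direct DFT is quadratic in cost and is used here only to keep the sketch free of dependencies.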

An Alternative Bandwidth Selection Method for Long-memory Time Series
The spectral density is the variance per unit frequency (Chatfield, 2004), and the autocovariance function $\gamma(k)$ can be expressed as a cosine transform of the spectral density, as shown in Eq. (3).
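For reference, in the convention of Chatfield (2004), where the spectrum $f(\lambda)$ is defined on $(0, \pi)$, the cosine-transform pair can be written as follows. The exact normalization used in Eq. (3) is an assumption here.

$$\gamma(k) = \int_0^{\pi} f(\lambda)\cos(\lambda k)\,d\lambda, \qquad f(\lambda) = \frac{1}{\pi}\Big[\gamma(0) + 2\sum_{k=1}^{\infty}\gamma(k)\cos(\lambda k)\Big].$$

Setting $k = 0$ gives $\gamma(0) = \int_0^{\pi} f(\lambda)\,d\lambda$, so the total variance is the area under the spectrum.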
In other words, $f(\lambda)\,d\lambda$ is the contribution to the variance of a time series from frequencies in the range $(\lambda, \lambda + d\lambda)$. As persistence, or positive autocorrelation, is the characteristic of long memory, the variance in the time series due to the long-range effect is explained at the low frequencies, and it diminishes at the high frequencies. On the other hand, a process with a mild short-range effect has a fairly flat spectral density (see Figure 1).

Figure 1. Spectral density of short- and long-range effects
The characteristics of a stationary long-memory process with a mild short-range effect can be examined via the autocorrelation function (ACF) and the spectral density. Briefly, Figure 2 shows the characteristics of ARFIMA(0, $d$, 0), ARFIMA(1, $d$, 0) and ARFIMA(0, $d$, 1) processes through plots of the average modified periodogram, ACF and partial autocorrelation function (PACF). Basically, a long-memory process with a mild short-range effect has dominant long-memory characteristics: the autocorrelations are not too large, the ACF decays hyperbolically and the spectral density diverges at zero. However, a significant short-range effect increases the strength of the autocorrelations at low lags, and the ACF may tend to decay more rapidly (see Figure 2(b)). For a detailed discussion of the characteristics of long-memory processes, see Baillie (1996).

Figure 2. Characteristics of various ARFIMA processes
To obtain an accurate estimate of the long-memory parameter, we should include as much of the information pertaining to the long-range behaviour as possible. GPH suggested a bandwidth $m = \sqrt{n}$, but Qu (2011) showed that a larger bandwidth $m = [n^{0.7}]$ produces better results, as spurious long memory affects the periodogram only up to $j = \sqrt{n}$. This is supported by empirical analyses (Charfeddine, 2014; Choi et al., 2010; Garvey & Gallagher, 2013) in which a bandwidth larger than $\sqrt{n}$ is adopted. Concentrating on long-memory processes whose short-range effect is mild, this paper proposes to include as many frequencies as possible, up to the point where the spectral density is low enough to be regarded as noise. This approach is practicable because the spectral density $f^*$ of a mild short-range effect is rather flat on the interval $\lambda \in (\lambda_{\sqrt{n}}, \pi)$. From Eq. (1), we can deduce that the estimation of $d$ is not affected by $f^*$ on the interval $\lambda \in (\lambda_{\sqrt{n}}, \pi)$.
Hence, the inclusion of as many frequencies with a significant power spectrum as possible in the BRLP model is warranted. The spectral density at the high frequencies is very low, and the signals there are indistinguishable from noise. These signals are regarded as insignificant and are removed, as in noise elimination. The challenge is to define a level of spectral density that is low enough to be treated as noise. As the effect of spurious long memory is dominant on $j < \sqrt{n}$, one may take $f(\lambda_j)$, $j = \sqrt{n}$, as the reference for the variance per unit frequency that explains the long-memory trait. The noise is then defined as those spectral densities that have less than 5% of the amplitude at $\lambda_{\sqrt{n}}$, computed as $f_c = 0.05\,[f(\lambda_{\sqrt{n}}) - f_w(\lambda)] + f_w(\lambda)$, where $f_w(\lambda)$ represents the spectral density of white noise in the time series. It can be shown that the bandwidth selected with this criterion meets the statistical assumptions for a regression method. Following this procedure, we avoid the bias due to insufficient data, and it is believed that a more accurate estimate of the long-memory parameter can be attained.
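The 5% rule above amounts to a threshold that sits 5% of the way from the white-noise level up to the reference amplitude. A minimal sketch follows, assuming the spectral estimate is held as a list ordered from low to high frequency. The names `noise_threshold` and `cutoff_index` are hypothetical, and the white-noise level must be supplied by the user, since how it is estimated is not specified here.

```python
def noise_threshold(ref_amplitude, white_noise_level, fraction=0.05):
    """White-noise level plus a fraction of the reference amplitude's excess over it."""
    return fraction * (ref_amplitude - white_noise_level) + white_noise_level


def cutoff_index(spectrum, threshold):
    """Scan from the highest frequency downwards.

    Returns the index of the first ordinate at or above the threshold,
    or None if no ordinate reaches it.
    """
    for k in range(len(spectrum) - 1, -1, -1):
        if spectrum[k] >= threshold:
            return k
    return None
```

For example, a reference amplitude of 10 and a white-noise level of 1 give a threshold of about 1.45, so ordinates below that value at the top of the frequency range are discarded.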
Since the spectral density of a time series is unknown, an unbiased and consistent estimate of it is needed. Although the periodogram is a natural estimate of the spectral density, it is not a consistent estimator because its variance does not decrease as $n \to \infty$ (Jenkins & Watts, 1968). To improve the spectral estimation, this paper uses the average modified periodogram $\hat{f}(\lambda)$ obtained by the Welch method (Welch, 1967), evaluated on a frequency grid of $N$ points, where $N$ = max{256, the next power of 2 greater than the length of the segments}. With the segments overlapped by one half of their fixed length, Welch (1967) reported that the variance of this estimator depends on the number of segments, so the variance of $\hat{f}(\lambda)$ improves as the sample size increases. This implies that when the sample size is small, the reference amplitude at $\lambda_{\sqrt{n}}$ (the $\sqrt{n}$-th Fourier frequency) should be estimated from a broader neighbourhood in $\hat{f}(\lambda)$. To simplify the work, we propose that for $n < 1000$ it is estimated by the largest signal in an interval around $\lambda_{\sqrt{n}}$. As the sample size increases ($n \geq 1000$), the variance of the estimator decreases, and hence the search interval can be drawn closer to the point $\lambda_{\sqrt{n}}$ on the $\hat{f}(\lambda)$ plot. To include as many frequencies with a significant spectrum as possible, we suggest that, moving back from the last point of the spectral estimate, the cutoff frequency for the bandwidth is set at the first frequency whose amplitude equals the noise threshold. This cutoff frequency provides a preliminary value in the search for an optimum bandwidth for the long-memory parameter estimation. To refine the bandwidth, we refer to the graphical method of Taqqu and Teverovsky (1996), which plots $\hat{d}_r$ against $m$: the optimum bandwidth is searched for in the flat region of this plot around the cutoff Fourier frequency concluded from the spectral analysis. The flat region is defined as a region of frequencies over which the estimates $\hat{d}_r$ are almost the same, say a neighbourhood of three estimates with a standard deviation below a small tolerance. The average frequency of such a region gives the optimum bandwidth, and this point determines the number of frequencies to be included in the log-periodogram regression for the long-memory parameter estimation.
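A minimal sketch of the averaged modified periodogram with half-overlapping segments, in plain Python. Several details are assumptions rather than choices stated in the text: the Hann window, the $2\pi$ normalization, the segment length `seg_len`, and the half-width `half_width` of the search interval around $\lambda_{\sqrt{n}}$. The function names are hypothetical.

```python
import cmath
import math


def next_pow2_above(k):
    """Smallest power of two strictly greater than k."""
    p = 1
    while p <= k:
        p *= 2
    return p


def welch_spectrum(x, seg_len):
    """Averaged modified periodogram: Hann-windowed segments with 50% overlap."""
    nfft = max(256, next_pow2_above(seg_len))
    step = seg_len // 2
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / (seg_len - 1))
              for t in range(seg_len)]
    norm = sum(w * w for w in window)
    starts = range(0, len(x) - seg_len + 1, step)
    mean = sum(x) / len(x)
    freqs = [2.0 * math.pi * k / nfft for k in range(nfft // 2 + 1)]
    spec = [0.0] * len(freqs)
    for s in starts:
        seg = [(x[s + t] - mean) * window[t] for t in range(seg_len)]
        for k, lam in enumerate(freqs):
            dft = sum(v * cmath.exp(-1j * lam * t) for t, v in enumerate(seg))
            spec[k] += abs(dft) ** 2 / (2.0 * math.pi * norm)
    return freqs, [v / len(starts) for v in spec]


def reference_amplitude(freqs, spec, n, half_width):
    """Largest ordinate within half_width radians of lambda_sqrt(n) = 2*pi*sqrt(n)/n.

    half_width must be at least one grid step so that the interval is not empty.
    """
    centre = 2.0 * math.pi * math.sqrt(n) / n
    return max(s for f, s in zip(freqs, spec) if abs(f - centre) <= half_width)
```

A wider `half_width` would correspond to the broader neighbourhood suggested for small samples, and a narrower one to the case $n \geq 1000$. In practice a library routine based on the fast Fourier transform would replace the direct DFT used here for self-containment.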
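The above procedure to search for the optimum bandwidth in finite samples can be summarized in the steps below:
1. Examine the plots of the modified periodogram, ACF and PACF to check whether the process has a mild short-range effect.
2. Estimate the spectral density with the average modified periodogram of the Welch method.
3. Estimate the reference amplitude from the largest signal in the neighbourhood of the $\sqrt{n}$-th Fourier frequency.
4. Compute the noise threshold and, moving back from the highest frequency, take the first frequency whose amplitude reaches the threshold as the preliminary cutoff.
5. Search for the flat region in the plot of $\hat{d}_r$ against $m$ around this cutoff frequency.
6. The optimum bandwidth is the average frequency of the region identified in step (5).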
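Steps (5) and (6) can be sketched as a search over windows of three consecutive estimates. This is only an illustration: the tolerance `tol` is a hypothetical value, and `flat_region_bandwidth` is a hypothetical name. Among all windows that qualify as flat, the one whose average bandwidth lies closest to the preliminary cutoff is returned.

```python
import statistics


def flat_region_bandwidth(bandwidths, d_hats, m_prelim, tol=1e-3, width=3):
    """Average bandwidth of the flat window nearest to the preliminary cutoff.

    A window of `width` consecutive estimates is flat when the sample standard
    deviation of its d estimates is below `tol`. Returns None if none is flat.
    """
    best = None
    for i in range(len(d_hats) - width + 1):
        if statistics.stdev(d_hats[i:i + width]) < tol:
            centre = sum(bandwidths[i:i + width]) / width
            if best is None or abs(centre - m_prelim) < abs(best - m_prelim):
                best = centre
    return best
```

With bandwidths 100, 110, ..., 150 and estimates .50, .45, .4001, .4000, .4002, .35, only the window at 120 to 140 is flat, so the function returns 130.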
Whilst the above procedure suggests the number of frequencies needed in the BRLP model, the accuracy of $\hat{d}_r$ depends on $r$. It is noted in AG that, except for extremely large sample sizes, a small value of $r$ is preferred because the variance of $\hat{d}_r$ for fixed $m$ increases quickly as $r$ increases. The need for a larger value of $r$ is evident if $f^*(0)$ is close to unboundedness owing to large nonzero derivatives of $f^*$ of all positive even orders at zero. Hence, for an ARFIMA(1, $d$, 0) process with AR parameter $\phi \to 1$, it is preferable to choose $r = 2$ rather than $r = 1$ for the estimator $\hat{d}_r$. On the other hand, an ARFIMA(0, $d$, 1) process with MA parameter $\theta > 0$ does not have the problem of unboundedness at $f^*(0)$, indicating that $\hat{d}_1$ is preferred. It is believed that, with the optimal bandwidth and a proper selection of $r$, the BRLP model can be improved and can be an efficient estimator of the long-memory parameter.

Monte Carlo Experiment
In this section, we compare the finite-sample behaviour of the BRLP estimators $\hat{d}_1$ and $\hat{d}_2$ using the plug-in MSE-optimal choice of bandwidth and the bandwidth obtained by the proposed method outlined in Section 3. We consider stationary Gaussian ARFIMA(1, $d$, 1) processes with AR parameter $\phi$ and MA parameter $\theta$. The time series generated take the form in Eq. (4). Without loss of generality, the series is normalized to zero mean.
where $B$ is the backshift operator and $\varepsilon_t$ is an i.i.d. standard normal random variable.
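A minimal sketch of one way to generate such series, assuming the common parameterization $(1 - \phi B)(1 - B)^d X_t = (1 + \theta B)\varepsilon_t$. The sign convention of the MA term, the truncation of the fractional filter at the start of the sample and the burn-in length are all assumptions here, and the function names are hypothetical.

```python
import random


def frac_weights(d, k_max):
    """MA(infinity) weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, k_max + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


def simulate_arfima(n, d, phi=0.0, theta=0.0, burn=500, seed=None):
    """Simulate an ARFIMA(1, d, 1) series of length n, centred at zero."""
    rng = random.Random(seed)
    total = n + burn
    eps = [rng.gauss(0.0, 1.0) for _ in range(total)]
    # MA part: u_t = eps_t + theta * eps_{t-1}
    u = [eps[t] + (theta * eps[t - 1] if t > 0 else 0.0) for t in range(total)]
    # Fractional integration, truncated at the start of the generated sample
    psi = frac_weights(d, total - 1)
    w = [sum(psi[k] * u[t - k] for k in range(t + 1)) for t in range(total)]
    # AR part: x_t = phi * x_{t-1} + w_t
    x, prev = [], 0.0
    for t in range(total):
        prev = phi * prev + w[t]
        x.append(prev)
    x = x[burn:]
    mean = sum(x) / n
    return [v - mean for v in x]
```

The quadratic cost of the truncated filter is acceptable for the sample sizes considered here; exact simulation methods would be preferable for larger $n$.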
Focusing on the scenario of long memory with a mild short-range effect, we consider processes with the combinations of parameters $d = 0, .2, .4$; $\phi = 0, .1, .2, .3, .4$; and $\theta = 0, .1, .2, .3, .4$. We do not examine the cases with negative parameters, as these cases are of low empirical relevance. To examine whether the performance of the proposed bandwidth selection is consistent, we run the simulation with sample sizes $n$ = 512, 1000 and 2000.
We generate 1000 realizations for each combination of parameters, and the procedure is repeated for 100 replicates. To evaluate the performance of the bandwidth selection method in the BRLP estimators, the average root mean squared error (RMSE) over these 100 replicates, each with 1000 realizations, is obtained for each of the processes considered in this study. These results are compared to the average of the minimum RMSE, found by tracking the BRLP estimates over the range of bandwidths starting from $m = 10$. We report in Figures 3 and 4 the results for ARFIMA(1, $d$, 0), $\phi = 0, .1, .2, .3, .4$, and ARFIMA(0, $d$, 1), $\theta = 0, .1, .2, .3, .4$, respectively, with $d = .2, .4$ and $n = 512, 2000$. It can be seen that for both ARFIMA(1, $d$, 0) and ARFIMA(0, $d$, 1) processes, the proposed bandwidth produces average RMSEs that are closer to the minimum RMSE over the bandwidths examined. Figure 5 reports that the proposed method consistently suggests a bandwidth that is closer to the one that gives the minimum RMSE in the Monte Carlo simulations, for which $\hat{d}_2$ is preferred for the ARFIMA(1, $d$, 0) processes, especially when $\phi \geq .2$. The plug-in MSE-optimum bandwidth (AG method) is rather insensitive to the values of $\phi$ (and $\theta$), and the bandwidths suggested are relatively small. Indeed, the average RMSEs due to the AG method in Figure 3 are for $\hat{d}_1$, as this method reports $\hat{d}_1$ over $\hat{d}_2$ owing to the small bandwidth. In other words, the selection of estimator does not follow the choice that minimizes the Monte Carlo RMSE. Interestingly, even in the case without a long-memory effect, the proposed method suggests bandwidths that are closer to the one that gives the minimum RMSE in the Monte Carlo simulations, hence giving a smaller average RMSE than the plug-in AG method (see Table 1). These results demonstrate that, with a mild short-range component, the long-memory parameter should be estimated using a larger bandwidth, which takes in almost all signals except the noise.
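The minimum-RMSE benchmark used in this comparison can be sketched as follows, assuming the estimates from one replicate are stored with one row per realization and one column per candidate bandwidth. The function names and the data layout are hypothetical.

```python
import math


def rmse_by_bandwidth(estimates, d_true):
    """RMSE of the estimates of d at each bandwidth.

    estimates[i][k] is the estimate from realization i at the k-th bandwidth.
    """
    n_real = len(estimates)
    n_bw = len(estimates[0])
    return [math.sqrt(sum((row[k] - d_true) ** 2 for row in estimates) / n_real)
            for k in range(n_bw)]


def min_rmse(bandwidths, estimates, d_true):
    """Bandwidth with the smallest RMSE, together with that RMSE."""
    rmse = rmse_by_bandwidth(estimates, d_true)
    k = min(range(len(rmse)), key=lambda i: rmse[i])
    return bandwidths[k], rmse[k]
```

Averaging the second element of `min_rmse` over the replicates gives the benchmark against which the RMSE at the selected bandwidth is compared.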

Empirical Examples
This section applies the proposed bandwidth selection strategy to two real time series, namely the Nile River minimum water levels for the years 622 through 1284 and the de-seasonalized volatility of the 5-minute returns of the Kuala Lumpur Composite Index (FBMKLCI) from 30th April 2013 to 31st December 2013.
The first time series consists of 663 observations and it has been widely discussed in the long memory literature.
The second series is an example of high-frequency data on the returns of financial assets, which has spurred a large amount of research on modeling and predicting realized volatility. The two series are displayed in Figures 6 and 8. Based on the plots of the average modified periodogram, ACF and PACF in these figures, we can conclude that both time series have long memory with a very weak short-memory component. In each series, the long-memory parameter is estimated using the BRLP estimator, and the performance of the proposed bandwidth is checked against that of the plug-in MSE-optimal bandwidth. The local bootstrap-based bandwidth is used as a means of substantiation.
For the Nile River data, the procedure in Section 3 suggests an optimal bandwidth $m = 301$, giving $\hat{d}_r = .3785$. The bandwidth estimate of the MSE-optimal choice is $m = 111$, and the corresponding long-memory parameter estimate is $\hat{d}_r = .4578$. Taking the idea of approximating the MSE by a bootstrap MSE, a bandwidth estimate from the local bootstrap of Arteche & Orbe (2009) is used for reference. To focus on the neighbourhood of the bandwidth estimates of the proposed method and the MSE-optimal choice, the range of bandwidths considered for optimization is set at $100 \leq m \leq 320$, and the resampling width is 5. The optimum bandwidth in terms of MSE is $m = 306$, leading to $\hat{d}_r = .3795$ with a bootstrap MSE of .0027. The local bootstrap-based bandwidth is very close to the proposed bandwidth, suggesting that a larger bandwidth gives a better long-memory parameter estimate in terms of MSE. To further examine the performance of these procedures, each pair of estimates is used to compute the spectral density in Eq. (1) following the approach of Delgado & Robinson (1996). The spectral density estimates are compared to the plot of the periodogram shown in Figure 7. As the spectral density estimate due to the proposed bandwidth is closer to the periodogram, this procedure of bandwidth search for the BRLP estimator seems to provide the more plausible approximation.
For the FBMKLCI data, the 5-minute returns are computed from the index for $t = 2, \ldots, n$, and the series of log-squared returns is taken as a proxy for the volatility of FBMKLCI. Following the spirit of Deo et al. (2006), the seasonality of the log-squared returns is expected to be periodic with a period of 72. The volatility series is thus de-seasonalized, and the graphical analysis of the series is depicted in Figure 8. The results of the long-memory parameter estimation corresponding to the bandwidth estimates of the proposed method and the MSE-optimal choice are shown in Table 2.
As the difference between the bandwidths suggested by the proposed method and the MSE-optimal choice is large, we may not be able to obtain a satisfactory local bootstrap-based bandwidth from the interval that spans these numbers. The extended residuals in the local bootstrap procedure show a marked structure, and hence a narrower search interval is deemed necessary. To examine the performance of the bandwidth choices, we execute the local bootstrap twice, once targeting the search interval $m \in [1300, 1800]$ and once $m \in [4200, 4700]$, both using a resampling width of 5. The extended residuals in both cases showed stable behaviour (see Figure 9). Based on the bootstrap MSE in Table 2, it can be concluded that the proposed bandwidth gives a more reasonable approximation, whilst the MSE-optimal bandwidth seems too low. The performance of these procedures is further gauged by their spectral density estimates, compared to the periodogram of the de-seasonalized volatility of FBMKLCI (see Figure 10). Similar to the Nile River data, the spectral density estimate with the proposed bandwidth is closer to the periodogram. These examples confirm that, in a long-memory time series with a weak short-memory component, the MSE-optimal choice of bandwidth may be too low. Alternatively, an adequate bandwidth for the BRLP estimator can be obtained by including all the Fourier frequencies except those whose spectra are low enough to be treated as noise. In both examples, the proposed method of bandwidth selection gives a better long-memory parameter estimate in terms of bootstrap MSE.
Table 2. Local bootstrap results: (a) $m = 4523$, $\hat{d}_r = .177$, bootstrap MSE $= 1.97 \times 10^{-4}$; (b) $m = 1672$, $\hat{d}_r = .249$, bootstrap MSE $= 4.26 \times 10^{-4}$.

Conclusion
The BRLP estimator reduces the bias of the GPH estimator, but its performance depends largely on the bandwidth $m$ and the order $r$. Working with long-memory processes with a weak short-range component, this paper proposes to take as many Fourier frequencies as possible in the BRLP regression model, excluding the high frequencies whose low spectra are associated with noise. The Monte Carlo simulation results confirm the theoretical analyses, which suggest that for a mild short-range ARFIMA(1, $d$, 0) process with a large sample size, $\hat{d}_2$ with the proposed bandwidth gives a good estimate of the long-memory parameter, whereas for a mild short-range ARFIMA(0, $d$, 1) process, $\hat{d}_1$ with the proposed bandwidth is preferred. The results are quite consistent across the degrees of long memory and the sample sizes. The advantage of the proposed method of bandwidth selection for long-memory series with a very weak short-memory component is demonstrated in the empirical examples, in which the proposed bandwidth performs better in terms of MSE and gives an estimate closer to the spectral density. It is believed that accuracy in long-memory parameter estimation can greatly improve the analysis of economic time series as it pertains to modeling and forecasting.
Note. $C = 0.577216\ldots$ is the Euler constant, and $[\cdot]$ denotes the largest integer part.

Figure 3. Average RMSE of the BRLP estimators with the plug-in MSE-optimal bandwidth (AG) and the proposed bandwidth selection method compared to the minimum RMSE for ARFIMA(1, $d$, 0) processes

Figure 6. Characteristics of Nile River minimum water levels during years 622 through 1284

Figure 8. Characteristics of the de-seasonalized volatility of the 5-minute returns of Kuala Lumpur Composite Index (FBMKLCI) from 30th April 2013 to 31st December 2013

Figure 9. Extended residuals of the local bootstrap for the bandwidth search around the proposed bandwidth and the MSE-optimal bandwidth choice for the de-seasonalized volatility of FBMKLCI