A Family of Stochastic Unit GARCH Models

A class of asymmetric GARCH models is presented. Under the assumption of a symmetric conditional distribution for the innovations, it shares the same unconditional variance and volatility forecast formula as the standard GARCH(P,Q) model. I use three models of this class to assess their ability to forecast S&P 500 market volatility and to support better decisions in risk management and investment. They are then compared with competing models (GARCH, EGARCH, GJR). For the in-sample evaluation, the best model is the Stochastic Unit GARCH (SUGARCH) model in which leverage effects are introduced through the GARCH parameter (i.e. $\beta_1$). For the out-of-sample evaluation (QLIKE loss function), it is better to use the SUGARCH model in which the asymmetry enters through the ARCH parameter (i.e. $\alpha_1$).


Introduction
Understanding how market volatility evolves is challenging for financial investors. This information can be used for many purposes, among which are risk management, trading strategies and option pricing. A common stylized fact of financial time series is that large absolute returns are more likely to be followed by large absolute returns. The same remark applies to small returns. This stylized fact, known as volatility clustering, motivated the work of Engle (1982), who proposed the ARCH(P) model, in which the conditional variance depends linearly on lagged squared innovations. In financial applications, a good fit requires a large integer P and therefore many parameters to estimate. To achieve a parsimonious parametrization, Bollerslev (1986) introduced the Generalized ARCH model, denoted by GARCH(P,Q). This model fits asset prices sufficiently well even with small integers P and Q. During the 1990s, another extension appeared that accounts for the fact that negative and positive return innovations affect future volatility differently. Namely, the empirical evidence shows that the impact of negative returns is larger than that of positive ones (leverage effects). There are nowadays many GARCH models that take this asymmetric effect into account; see Nelson (1991) and Glosten, Jagannathan, and Runkle (1993) (GJR for short). Brownlees et al. (2011) also provide a good reference on this issue through a comparative forecasting study.
A third extension is to consider several regimes of univariate GARCH models instead of one, in order to generate enough skewness and kurtosis to match those of financial asset returns. The regimes may be independent or chosen through a Markov chain, and the corresponding models are known as Markov regime switching GARCH models. Some papers related to this issue are Klaassen (2002), Marcucci (2005) and references therein. Switching GARCH models aim to capture the fact that volatility shocks are not persistent inside a regime (low or high volatility). Also, in these models, a small shock may be followed by a big shock and, conversely, a big shock may be followed by a small one. However, it is known that models with many parameters may suffer from convergence problems in the estimation process, or from a lack of robustness in the out-of-sample evaluation because of over-fitting. For example, Marcucci (2005) demonstrates that Markov regime switching GARCH models do not dominate single-regime GARCH models with respect to VaR-based loss functions. He finds the same result for volatility forecast accuracy, for which no model clearly outperforms the others when both short and long horizons are considered. Christoffersen and Jacobs (2004), for option pricing purposes, consider several single-regime GARCH models. They find that the leverage GARCH model is not dominated by other asymmetric models that have more parameters in their formulation. All these results motivate us to provide a mathematical model that is based on a single asymmetric GARCH framework. The first contribution of the paper is to propose a stochastic unit GARCH (SUGARCH hereafter) model, defined as a standard GARCH model in which some coefficients are multiplied by a predictable stochastic factor having an expected value of one. In other words, the model may be seen as a GARCH model in "mean". In contrast to other asymmetric models, the SUGARCH(P,Q,O) class cannot include an integer greater than one for its third parameter O. Otherwise, the GARCH model in "mean" property would be lost, and this property is the main difference with respect to other asymmetric models. Therefore, the class is always SUGARCH(P,Q,1) and may be denoted simply by SUGARCH(P,Q), where P and Q are strictly positive integers. Consequently, the model may be useful for investors and market participants who want a model that combines complexity (time-varying parameters) and simplicity (standard GARCH) for investment or risk management purposes. The second contribution of the paper is empirical. It is found that asymmetric models perform well in-sample if leverage effects are introduced on the GARCH parameter β. Out-of-sample, however, the relative performance of this choice is weaker, and it is preferable to introduce leverage effects on the ARCH parameter α, as is usually done in the literature.
The remainder of this paper is organized in four sections. Section 2 presents the SUGARCH(P,Q) class and its competing asymmetric models. Section 3 describes the data and the methodology used in the empirical application. Section 4 presents the empirical results and the last section concludes.

The SUGARCH Class
The idea of the SUGARCH class is to retain some properties of the standard GARCH model while taking leverage effects into account. Specifically, I define it as a standard GARCH model in which some coefficients are multiplied by a predictable factor, say $v_t$, such that $E(v_t) = 1$. Next, in order to share some properties with GARCH models, an additional constraint is imposed on the conditional distribution of the innovations, which must belong to the class of symmetric distributions. This condition is not too restrictive, since some symmetric distributions have more kurtosis than the normal distribution (e.g. the Student's t distribution).
In this study, I am interested in forecasting conditional volatility over short horizons using daily data. The conditional mean is therefore assumed to be constant, as in Klaassen (2002) or Marcucci (2005). Let $S_t$ and $r_t$ be respectively the security price and the security logarithmic return, and $\mu$ the conditional mean. The SUGARCH(P,Q) class is then defined by Eqs. (1a)-(1c): the return equation (1a), $r_t = \mu + \varepsilon_t$, the conditional variance equation (1b) and the innovation equation (1c). Each coefficient of (1b) is either constant or stochastic; in the latter case, its expression is given by Eq. (2), in which the coefficient is multiplied by the predictable factor $v_t$. This yields $2^{P+Q+1}-1$ asymmetric models (Note 1). Hereafter, I work only with the general SUGARCH model in which every parameter is stochastic, since the other special cases may be obtained with the same methodology. In this case, the conditional variance (1b) may be rewritten as in Eq. (3), that is, as a standard GARCH conditional variance plus a second term generated by the stochastic factor. To obtain an unconditional variance equal to that of the standard GARCH model, the expected value of the second term of Eq. (3) must be zero. For this, I assume the distribution of $\varepsilon_t$ to be symmetric with mean 0, which is a sufficient condition. In the appendix, it is then shown that the unconditional variance is $E(\varepsilon_t^2) = \alpha_0 / (1 - \sum_{i=1}^{P} \alpha_i - \sum_{j=1}^{Q} \beta_j)$ (4). Since the conditional volatility must always be positive, and to obtain a covariance stationary model, the constraints (5a) and (5b) are imposed, where $N$ is the number of return observations.
Eq. (5a) gives the same constraints as the standard GARCH model. Eq. (5b) handles the positivity of the stochastic factor $v_t$, which ensures the positivity of the conditional variance, see (2). A general bound could be taken for the innovations and hence for the leverage parameter. Here, I let the bound depend on the data. The idea is to allow a wide range for the leverage parameter, since the leverage effects are introduced through it. In practice, since investors generally work with high-frequency (intraday or daily) observations, the conditional mean $\mu$ in Eq. (1a) is often small, and so the innovations may be approximated by asset returns in Eq. (5b) (Note 2). The leverage parameter is also expected to be positive, reflecting the fact that negative shocks affect future volatility more than positive shocks.
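The unit-mean property discussed above can be checked numerically. The following Python sketch simulates a SUGARCH(1,1)-type recursion under an assumed form of the factor, $v_t = 1 - \theta \varepsilon_{t-1}$; this functional form, the name `theta` and all parameter values are illustrative assumptions and are not taken from Eqs. (1)-(5). With symmetric shocks, the sample mean of $v_t$ should be close to one and the sample variance close to the standard GARCH(1,1) value $\alpha_0/(1-\alpha_1-\beta_1)$.

```python
import numpy as np


def simulate_sugarch(n, alpha0, alpha1, beta1, theta, on="arch", seed=0):
    """Simulate a SUGARCH(1,1)-style process (illustrative form only).

    Assumption: the predictable factor is v_t = 1 - theta * eps_{t-1},
    which has mean one when the innovations have mean zero.
    `on` selects the coefficient multiplied by v_t:
    "const" (alpha0), "arch" (alpha1) or "garch" (beta1).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)                # symmetric standardized shocks
    uncond = alpha0 / (1.0 - alpha1 - beta1)  # standard GARCH(1,1) variance
    eps = np.empty(n)
    sig2 = np.empty(n)
    v = np.ones(n)
    sig2[0] = uncond
    eps[0] = np.sqrt(sig2[0]) * z[0]
    for t in range(1, n):
        v[t] = 1.0 - theta * eps[t - 1]       # known at time t-1 (predictable)
        a0 = alpha0 * (v[t] if on == "const" else 1.0)
        a1 = alpha1 * (v[t] if on == "arch" else 1.0)
        b1 = beta1 * (v[t] if on == "garch" else 1.0)
        sig2[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * sig2[t - 1]
        eps[t] = np.sqrt(sig2[t]) * z[t]
    return eps, sig2, v, uncond


# Hypothetical parameter values chosen only for the illustration.
eps, sig2, v, uncond = simulate_sugarch(
    100_000, alpha0=0.05, alpha1=0.05, beta1=0.90, theta=0.05, on="arch", seed=1
)
print("min v_t:", v.min(), " mean v_t:", v.mean())
print("sample variance:", eps.var(), " GARCH value:", uncond)
```

With a positive `theta`, a negative lagged innovation raises $v_t$ above one and therefore raises the next conditional variance, which is the leverage mechanism described in the text. The positivity of `v` has to be monitored on the data at hand, which is the role the text assigns to the data-dependent bound in Eq. (5b).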
On the other hand, when a big shock appears in the innovations, the volatility cannot persist for a long time. This is due to the multiplicative factor $v_t$, which alternately allows large and small volatility movements in a symmetric way, since $(\varepsilon_t)$ behaves as a fair game. Namely, we may have a small shock in a period of high volatility or a big shock in a period of low volatility. This feature is shared by switching GARCH models, which allow the volatility process to be at different levels. To see the link formally, note that with two regimes, as is often the case in financial applications, the standard GARCH coefficients take two different values. Here, for more flexibility, the SUGARCH class allows stochastic coefficients through a factor $v_t$ valued in R (Note 3). Even though the coefficients are random, the framework remains similar to the standard GARCH model, which is an interesting result. We have seen that both models share the same unconditional variance. The difference appears only locally, where the stochastic factor $v_t$ generates asymmetry effects and extreme movements in the conditional variance. This is the main feature that differentiates our model from other asymmetric GARCH models, which also have time-varying parameters but with an expected value different from 1.
As a consequence of these oscillations, the kurtosis of the distribution increases. Specifically, if the sixth moment of the asset return exists, a closed form for the kurtosis can be derived; see Eq. (6) and the appendix. For forecasting, I use the simplest class, SUGARCH(1,1), corresponding to P = Q = 1. In this case, a closed formula exists for multi-step-ahead volatility forecasts at any horizon. Its form is similar to that of the standard GARCH(1,1) model, and the difference appears only in the initial condition; see Eq. (7), which holds for any integer horizon. In this study, I consider only three of the seven asymmetric models of the SUGARCH(1,1) class, given by the conditional volatilities (8a), (8b) and (8c). The four remaining asymmetric formulations use at least two stochastic parameters, and I find that they do not produce any significant difference with respect to formulations (8a), (8b), (8c). Another reason for choosing these three conditional volatilities is that they introduce leverage effects respectively on the constant $\alpha_0$, the ARCH parameter $\alpha_1$ and the GARCH parameter $\beta_1$ of the standard GARCH model. They therefore make it possible to see which is the best way to capture asymmetric shocks in financial time series. The intuition behind these three formulations is to model the negative correlation between past shocks (or returns) and future volatility. To the best of my knowledge, this question has not been studied in the literature, and authors generally use a formulation similar to (8b), i.e. asymmetry introduced through the ARCH parameter.
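The multi-step-ahead forecast described above can be sketched as follows in Python. The sketch assumes the standard GARCH(1,1) mean-reverting form, which the text says the SUGARCH(1,1) forecast shares; Eq. (7) itself is not reproduced here, and the function name and arguments are hypothetical. The asymmetry enters only through the one-step-ahead variance passed as the initial condition.

```python
def garch11_forecast(sig2_next, alpha0, alpha1, beta1, h):
    """h-step-ahead variance forecast for a GARCH(1,1)-type recursion.

    sig2_next is the one-step-ahead conditional variance (the initial
    condition). For a SUGARCH model it would already include the
    stochastic factor, the only place where asymmetry enters.
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    persistence = alpha1 + beta1
    if not 0.0 <= persistence < 1.0:
        raise ValueError("covariance stationarity requires alpha1 + beta1 < 1")
    uncond = alpha0 / (1.0 - persistence)   # unconditional variance
    # The forecast reverts geometrically from the initial condition
    # to the unconditional variance at rate alpha1 + beta1.
    return uncond + persistence ** (h - 1) * (sig2_next - uncond)


# Horizons used in the empirical part: 1, 2, 5 and 10 days.
forecasts = [garch11_forecast(2.0, 0.05, 0.05, 0.90, h) for h in (1, 2, 5, 10)]
```

Because the persistence is below one, the forecasts move monotonically from the initial condition toward the unconditional variance as the horizon grows.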
In the next section, I present two competing models (EGARCH and GJR) that belong to the class of asymmetric GARCH models. Additionally, I include the symmetric GARCH model, which may give good results in the out-of-sample evaluation of volatility forecasts. Since the true conditional variance is not observable, squared returns are used as a conditionally unbiased volatility proxy. The advantage of this proxy is that it ensures the correct ranking of predictive models in terms of the QLIKE loss function, see Eq. (12). Other loss functions, such as the Mean Absolute Error or the Mean Square Error on standard deviations, may introduce biases; see Patton (2011) for more information.

Competing Models
The EGARCH model: The Exponential GARCH (EGARCH) model was proposed by Nelson (1991). As its name indicates, the variable of interest is the logarithm of the conditional variance. Its defining equation uses the innovation term defined in Eq. (1c), and the asymmetry between negative and positive shocks is modeled by its leverage parameters. Note that, for forecasting purposes, the conditional variance of the EGARCH model depends on the distribution of the standardized innovation. Assuming that it has a generic symmetric density f(0,1), I consider in this study two normalized distributions. The first is the normal distribution, with density $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. The second is the Student's t distribution, which has more kurtosis than the normal. In its unit-variance form with $\nu > 2$ degrees of freedom, its density is $f(x) = \frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\sqrt{\pi(\nu-2)}} \left(1 + \frac{x^2}{\nu-2}\right)^{-(\nu+1)/2}$.
The multi-step-ahead volatility is derived recursively, with the initial condition given by (10). All parameters are estimated by the Maximum Likelihood method.
The GJR GARCH model: This model was proposed by Glosten, Jagannathan and Runkle (1993), GJR for short. The conditional variance of the asset return is defined by a GARCH recursion augmented with lagged squared innovations multiplied by an indicator that equals 1 if the lagged innovation is negative and 0 otherwise. Some constraints are needed to ensure strict positivity of the volatility, in particular for the simplest model, GJR GARCH(1,1,1), which is the one used here.
The GARCH model: This model was proposed by Bollerslev (1986) as an extension of the ARCH model of Engle (1982). It is a special case of both the SUGARCH model (set $v_t = 1$ for any $t$) and the GJR-GARCH model. Its unconditional variance is given by Eq. (4), and the multi-step-ahead formula may be obtained from Eqs. (7) and (8a) by setting the leverage parameter to zero. Note that the literature on asymmetric GARCH models allows more than one parameter for the leverage effects without significantly affecting the structure of the model. This is not the case for the SUGARCH class, where the parameter O is restricted to be one. The reason is that, if the stochastic factor $v_t$ contains more parameters, the explicit formula for the unconditional variance is lost for some formulations. In this case, the second term of Eq. (3) may involve expectations of cross terms indexed by $j < i$, and those terms will be different from zero. The property behind the acronym SUGARCH would therefore be lost. On the other hand, if a symmetric distribution for the innovations is considered too strong a requirement for financial returns, we can avoid the distributional assumptions and estimate the model by the Quasi Maximum Likelihood method.
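The indicator mechanism of the GJR model can be illustrated with a one-step variance update in Python. This is a minimal sketch of the textbook GJR-GARCH(1,1,1) recursion; the name `gamma` for the leverage coefficient and the numerical values are assumptions made for the example, not the paper's notation or estimates.

```python
def gjr_update(eps_prev, sig2_prev, alpha0, alpha1, gamma, beta1):
    """One-step conditional variance of a GJR-GARCH(1,1,1) model.

    The indicator equals 1 when the previous innovation is negative
    and 0 otherwise, so negative shocks get weight alpha1 + gamma.
    """
    indicator = 1.0 if eps_prev < 0 else 0.0
    return alpha0 + (alpha1 + gamma * indicator) * eps_prev ** 2 + beta1 * sig2_prev


# Same shock size, opposite signs (hypothetical parameter values).
after_bad_news = gjr_update(-2.0, 1.0, 0.05, 0.03, 0.08, 0.90)
after_good_news = gjr_update(2.0, 1.0, 0.05, 0.03, 0.08, 0.90)
# With gamma = 0 the update reduces to the symmetric GARCH(1,1) case.
symmetric = gjr_update(-2.0, 1.0, 0.05, 0.03, 0.0, 0.90)
```

With a positive `gamma`, the variance after a negative shock exceeds the variance after a positive shock of the same size, which is the leverage effect the competing models are meant to capture.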

Data and Methodology
I consider the S&P 500 daily time series, adjusted for dividends, to evaluate the performance of the different models presented above. The sample period runs from January 2, 2002 to December 31, 2010, corresponding to N = 2267 daily observations. Table 1 gives the descriptive statistics of the index returns $r_t = \ln S_t - \ln S_{t-1}$, where $S_t$ represents the spot price at time $t$. The table shows that the mean return of the S&P 500 index is positive and small. The standard deviation is also small (1.38%). The maximum (minimum) return is 10.957% (-9.469%). Extreme movements appear more frequently than under normality: the null hypothesis of a normal distribution for the unconditional return is strongly rejected even at the 1% significance level. The same holds for the null hypothesis of no serial correlation and for the null of no ARCH effects; both are rejected with p-values close to zero. These results from Table 1 suggest using GARCH models to take into account the excess kurtosis and the presence of heteroskedasticity. In the previous section, I reviewed some of them. To produce forecasts, I estimate the parameters of each model by the Maximum Likelihood method, where the conditional distribution of the innovations is either normal or Student's t. Then, the future volatilities are forecast and some quantiles (Value-at-Risk) are also determined. The time horizon $h$ belongs to {1, 2, 5, 10}, corresponding to one day, two days, one week and two weeks, respectively. The literature usually compares the relative performance of volatility models through a statistical loss function or an economic loss function. Only the former is considered here, with the quasi-likelihood (QLIKE) loss function of Eq. (12) as the criterion, where $T$ and $N$ are the lengths of the in-sample data and of the total sample, respectively. The QLIKE function is robust in the sense that it ranks the models correctly when an unbiased proxy of the unknown conditional variance is used; see for example Patton (2011).
The in-sample data spans the period from January 2, 2002 to October 2, 2008, corresponding to T = 1699 (about 0.75 N), and the remaining 25% of the data is used as the out-of-sample data. The parameter $h$ gives the forecast horizon used to compare models. In this study, I focus only on the QLIKE loss metric, for reasons given by Brownlees et al. (2011). The authors note that QLIKE may be rewritten, without loss of generality, as a function of i.i.d. terms. Another reason is that QLIKE heavily penalizes small volatility forecasts (close to zero).
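The penalty on small forecasts can be illustrated with a short Python sketch. It uses one common form of the quasi-likelihood loss, $\log(\hat{\sigma}^2) + r^2/\hat{\sigma}^2$ averaged over the evaluation sample; the normalization of Eq. (12) may differ from this form, and the function name is hypothetical.

```python
import numpy as np


def qlike(proxy, forecast):
    """Average QLIKE loss, log(forecast) + proxy / forecast.

    proxy    : volatility proxy, e.g. squared returns (may contain zeros)
    forecast : variance forecasts, which must be strictly positive
    """
    proxy = np.asarray(proxy, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if np.any(forecast <= 0):
        raise ValueError("variance forecasts must be strictly positive")
    return float(np.mean(np.log(forecast) + proxy / forecast))


# A proxy of 1 against a forecast that is exact, halved, or doubled.
exact = qlike([1.0], [1.0])
too_low = qlike([1.0], [0.5])
too_high = qlike([1.0], [2.0])
```

The loss is minimized when the forecast equals the proxy, and halving the forecast costs more than doubling it, so under-prediction of the variance is penalized more heavily than over-prediction by the same factor.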
Even if metric criteria are important, it is useful to have statistical tests that assess whether the difference between the loss functions of two competing models is significant. For this, I consider the test of equal predictive ability (EPA) of Diebold and Mariano (1995), which is based on the loss differential series $(d_t)$ between the two models. If the latter is covariance stationary and short memory, the sample mean loss differential is asymptotically normal and centered at the population mean loss differential, see Eq. (13). The variance is estimated by a weighted sum of sample autocovariances of the series $(d_t)$, in which the estimate of the k-th order autocovariance is weighted by the lag window at lag $k$, up to the truncation lag $q$. I follow Marcucci (2005) [...]
After assessing the loss functions of competing models and analyzing their statistical significance, I evaluate volatility forecasting performance in a financial risk management setting. For this, I calculate the Value-at-Risk (VaR), which is the monetary loss in a portfolio that is expected to occur over a pre-determined horizon ($h$) and with a pre-determined degree of confidence. It may also be seen as a quantile of the conditional distribution of the portfolio return [...]. The last equality is explained by the fact that [...]. The best model with respect to the loss function (14a) is the one that minimizes it; see Christoffersen (1998) for some statistical tests based on the coverage probability. The second loss function was proposed by Koenker and Bassett (1978), hereafter KB. It penalizes more heavily the observations for which there is a violation of the VaR constraint. I also evaluate the performance of the competing models with respect to investors having short positions (Note 4).
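The equal predictive ability test described above can be sketched in Python as follows. The sketch assumes a Bartlett lag window for the long-run variance and a two-sided normal p-value; the paper follows Marcucci (2005) for its own choices, which may differ, and the function name and simulated losses are hypothetical.

```python
import math

import numpy as np


def diebold_mariano(loss_i, loss_j, q):
    """Diebold-Mariano statistic for the loss differential d_t = L_i - L_j.

    The long-run variance is a weighted sum of sample autocovariances
    up to the truncation lag q, with Bartlett weights 1 - k / (q + 1)
    (an assumed lag window).
    """
    d = np.asarray(loss_i, dtype=float) - np.asarray(loss_j, dtype=float)
    n = d.size
    d_bar = d.mean()
    dc = d - d_bar
    lrv = np.dot(dc, dc) / n                      # autocovariance at lag 0
    for k in range(1, q + 1):
        gamma_k = np.dot(dc[k:], dc[:-k]) / n     # autocovariance at lag k
        lrv += 2.0 * (1.0 - k / (q + 1.0)) * gamma_k
    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # two-sided N(0,1)
    return stat, p_value


# Simulated losses in which model i is better on average (illustration only).
rng = np.random.default_rng(0)
loss_i = rng.standard_normal(500) ** 2
loss_j = loss_i + 0.5 + 0.1 * rng.standard_normal(500)
stat, p_value = diebold_mariano(loss_i, loss_j, q=4)
```

A negative statistic indicates that the first model has the smaller average loss, which is how the sign of the DM values is read in the empirical section.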

Empirical Results
The results are based on the S&P 500 daily data adjusted for dividends. The data is extracted from the Yahoo Finance website (Note 5). I recall that all parameters are estimated by the Maximum Likelihood method, with a Gaussian and a Student's t distribution for the standardized innovations, and that the in-sample data ranges from January 2, 2002 to October 2, 2008, corresponding to 1699 daily observations. The labels ASUG, BSUG and CSUG in Table 2 correspond to the conditional variances given by Eqs. (8a), (8b), (8c) respectively.
As expected, the estimated value of the leverage parameter is such that negative shocks affect future volatility more than positive ones. Accordingly, it is positive for the SUGARCH and GJR models and negative for the EGARCH model. Also, the persistence of shocks on volatility ($\alpha_1 + \beta_1$) is high (> 0.990) for all models, as is usual for financial time series. Introducing leverage effects on the constant of the GARCH model (ASUG) or on the ARCH parameter (BSUG) does not produce a significant difference in the estimated parameters with respect to a standard GARCH model. However, if the asymmetry between negative and positive shocks is modeled through the GARCH parameter (CSUG model), the difference becomes clear, since $\hat{\beta}_1$ is around 0.96 (see Figure 1 for illustration). Figure 1 also shows the need for time-varying parameters in the standard GARCH model, since its parameters oscillate widely over time. The SUGARCH class addresses this problem through its stochastic parameters. The Student's t distribution for the standardized innovations gives a better fit than the normal distribution. The three best models for the LLF criterion are, in order, CSUG_T, EGARCH_T and GJR_T. The ranking is the same for the AIC and BIC criteria. This point highlights the finding that normalized financial returns [...] (ASUG, BSUG model) improves accuracy in the in-sample fitting. Since the model structure is similar to that of other (A)GARCH models, in the sense that current volatility depends on past volatilities (GARCH parameters) and past innovations (ARCH parameters), we may expect the same technique to work for those models as well and to better approximate financial data. The next step is to examine the out-of-sample performance of the different models, which is the part that most interests investors and market participants.
For this purpose, I compare the relative performance of the different volatility models in three ways. The first uses a metric loss function, the second is based on directional accuracy tests, and the third focuses on risk management. The out-of-sample data ranges from October 3, 2008 to December 31, 2010 and represents twenty-six months of data (567 observations). [...], the best model is BSUG_T. Another point is that the assumption of a normal distribution for the innovations generates satisfactory results, specifically for daily volatility forecasts. However, for longer horizons, the minimal values of the QLIKE loss function are obtained with a Student's t distribution. Also, a good in-sample performance does not imply a good out-of-sample performance. We saw previously that CSUG_T was the best model in-sample, whereas it gives unsatisfactory results in the out-of-sample evaluation when multi-step-ahead volatility forecasts are considered. Overall, the best model is now BSUG_T. The Diebold and Mariano test, see Eq. (13), is now adopted to further examine the statistical significance of the difference between two competing models $i$ and $j$. The DM test statistics across all models and forecast horizons are available. Table 5 presents the results obtained with BSUG_T and GJR_T taken as benchmarks, where the forecast horizons are given respectively by $h = 5, 10$ (Note 6). As expected, the Diebold and Mariano (DM) test confirms the results obtained from the previous table. For the benchmark model GJR_T, all DM statistic values are negative, showing that its loss function is the smallest for $h = 1$. On the other hand, the table shows that the null hypothesis of equal predictive ability is rejected for the following competing models: ASUG, BSUG, GARCH. For the remaining models, the difference between loss functions is not significant at the 5% level. When BSUG_T is the benchmark, similar results are obtained. Finally, none of the best models for a given horizon gives a significant difference with respect to the EGARCH model. So even if the latter does not outperform the others, it gives satisfactory results.
Finally, I compare the performance of the different models with respect to the two loss functions defined in (14a), (14b). The coverage rate of the VaR is 0.01, and the distribution of the standardized innovations is assumed to be either normal or Student's t. A general finding is that all the models have difficulty producing good realized VaR forecasts when the horizon is one week or two weeks. If the horizon is 1 or 2 days, however, the results are satisfactory. For risk management purposes, I analyze only one side of the conditional return distribution, since if an investor takes a long (short) position, only extreme negative (positive) returns matter. Table 7 shows the results for a short position. For daily VaR predictions, three models are jointly best for the PF criterion: ASUG_N, BSUG_N and GARCH_N. For the same horizon, the best accuracy for the KB loss function is obtained from CSUG_N. For the horizon $h = 2$, CSUG_T is also the model that minimizes the KB loss function. For the PF loss criterion, the best models form a group given by ASUG_T, BSUG_T, CSUG_T and GARCH_T. Overall, the BSUG and CSUG models give satisfactory results with respect to VaR-based loss functions.
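The two VaR-based criteria can be sketched in Python as follows. The sketch assumes that PF is the empirical proportion of VaR violations and that the KB criterion is the quantile "check" loss of Koenker and Bassett (1978) averaged over the sample; the exact forms of (14a) and (14b) are not reproduced here, and the function name and toy numbers are hypothetical. It is written for a long position, where the VaR is a lower quantile of the return distribution.

```python
import numpy as np


def var_losses(returns, var_forecasts, coverage=0.01):
    """Proportion of failures (PF) and Koenker-Bassett (KB) loss.

    A violation occurs when the realized return falls below the VaR
    forecast. In the KB loss, violations are weighted by (1 - coverage)
    and non-violations by coverage, times the distance to the VaR.
    """
    r = np.asarray(returns, dtype=float)
    q = np.asarray(var_forecasts, dtype=float)
    hit = r < q                                   # VaR violations
    pf = hit.mean()
    kb = np.mean((coverage - hit) * (r - q))      # check-function loss
    return float(pf), float(kb)


# Toy example with a 25% coverage rate so the arithmetic is easy to follow.
pf, kb = var_losses([-3.0, 0.0, 1.0, 2.0], [-2.0, -2.0, -2.0, -2.0], coverage=0.25)
```

For a short position, the same function can be applied to the negated returns and the negated upper-tail VaR forecasts, since a violation then corresponds to a return above the upper quantile. A PF close to the coverage rate indicates a well-calibrated VaR, while the KB loss also accounts for the size of the violations.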

Conclusion
This paper has sought to re-examine the volatility forecasting literature by improving the standard GARCH model. The latter is extended by introducing asymmetry between negative and positive shocks. This extension, in contrast to other AGARCH models, does not significantly change the structure of the standard GARCH model. I also analyze which is the best way to capture leverage effects in financial time series. The findings are summarized as follows. For in-sample fitting, the best model comes from the SUGARCH class, and it is the one obtained by modifying the GARCH parameter $\beta_1$ instead of the ARCH parameter $\alpha_1$ for asymmetric effects. Consequently, the GARCH parameter is more flexible than the ARCH parameter and is more suitable for financial asset prices. The CSUG model is also the only one inside the SUGARCH class that gives estimates significantly different from those of the GARCH model.
For the out-of-sample evaluation, good results usually come from the SUGARCH class. For example, with the QLIKE loss function as the criterion, it is found that GJR is the best model for the daily horizon, but for longer horizons ($h \geq 2$) it is better to work with the BSUG model. These findings are confirmed by the second criterion, the statistical test of Diebold and Mariano. This test additionally shows that none of the best models, in terms of loss ranking, gives a significant difference with respect to the EGARCH model. [...] However, for short positions it is preferable to work with the SUGARCH class (ASUG, BSUG, CSUG). The second measure (KB loss function) integrates both the number and the size of VaR violations, so it is more relevant than the coverage probability. For this criterion, the best models belong to the SUGARCH class (CSUG or BSUG model), independently of the investor's position (long or short). These good results from the SUGARCH class may be explained by what it inherits from the standard GARCH model: both share almost the same formula for forecasting volatility, and the difference appears only in the initial condition, which integrates asymmetry. Another explanation is that the SUGARCH class has some similarities with mixture or Markov regime switching GARCH models, and the literature has demonstrated that those models may give interesting results, especially when economic changes occur over the period of study, as in our case (subprime crisis).

Notes
Note 1. Since each parameter may take two forms, the cardinal is $2^{P+Q+1}-1$. The minus 1 corresponds to the standard GARCH model, where there is no stochastic parameter and so no asymmetry.
Note 2. In the implementation, I use [...]

[...] for any integers $i, j$. [...] is an increasing function of [...].


[...] still have a heavy-tailed distribution, however with less kurtosis than the unconditional return ([...]). If QLIKE is considered as a measure, the normal distribution for the standardized innovations is better than the Student's t distribution. The three best models are given respectively by GJR_N, CSUG_N, GJR_T. Overall, I note that introducing asymmetry through [...] better to work with the BSUG model.

Finally, I investigate the performance of the different models with respect to loss functions based on Value-at-Risk predictions. The first criterion is based on the coverage probability, or the number of VaR violations. The results obtained depend on the investor's position. If long positions are considered, the CSUG_T and EGARCH_T models give respectively the [...]

Table 1. Descriptive statistics of S&P 500 returns

[...] (DM hereafter). Let $d_t$ be the loss differential between the two competing [...]

Table 3. In-sample diagnostic for the different models
Note: This table presents the log-likelihood function (LLF), the Akaike information criterion (AIC) and the Schwarz criterion (BIC). QLIKE is the loss function defined in Eq. (12). Numbers in boldface indicate the best values.

Table 4. Out-of-sample evaluation of volatility forecasts for h = 1, 2, 5, 10
Note: This table presents the quasi-likelihood (QLIKE) loss function, see Eq. (12). Numbers in boldface give the minimal (best) value for each group.

Table 5. Diebold-Mariano test with BSUG_T and GJR_T as benchmarks

[...] obtained. It performs well in terms of volatility forecasts with respect to other models for the horizons [...] with respect to the GJR, GARCH_N, ASUG_N, BSUG_N and CSUG models. I reach the same conclusion for the other horizons [...]

Table 6

[...] by the GJR_T and EGARCH_T models. The latter becomes the best model for the two-day horizon, followed by the BSUG_T model. Since the PF loss function does not take into account the magnitude of VaR violations (same weight), I add the Koenker and Bassett (KB) loss function to remedy this disadvantage. In this case, VaR violations (no VaR violations) are weighted by [...]

Table 7. Out-of-sample evaluation: 99% VaR, short position
Note: This table presents the percentage proportion of failures (PF) and the Koenker and Bassett (KB) loss function for the 99% VaR failure processes at one and two steps ahead. Numbers in boldface give the best value.
Note 3. I also tried other formulations of SUGARCH models that are close to regime switching models. Namely, I define [...], so giving two values for $v_t$. The corresponding models also give the same unconditional variance as the standard GARCH model and have a closed formula for the kurtosis.
Note 5. For robustness, the same treatment is also applied to the CAC 40 index, with similar conclusions.
Note 6. Due to space constraints, not all results have been included. The other ones can be downloaded from http://sites.google.com/site/makonte/

It remains to develop the last equality and then to use the following points coming from [...] to obtain (6). The kurtosis is then deduced by the following formula [...]