An Overview of the Determinants of Financial Volatility: An Explanation of Measuring Techniques

The majority of asset pricing theories relate expected returns on assets to their conditional variances and covariances. Since conditional variances and covariances are not observable, researchers have to estimate conditional second moments using models. An important concern is the accuracy of these models and how researchers may estimate them more accurately. In this paper, various measures of volatility are examined, ranging from time invariant to time variant measures. In the former case, one of the simplest measures examined is the standard deviation. A weakness of this measure is its assumption that volatility is constant, under which the standard deviation of returns increases with the square root of the length of the period. Empirical evidence, however, shows that the behavior of asset returns in the real world changes randomly over time. This leads to an examination of time variant models for measuring volatility.


Introduction
Volatility is a fundamentally important concept in the discipline of finance. Several reasons have been advanced as to why volatility is an important issue in itself. Firstly, when asset prices fluctuate sharply over time intervals as short as one day or less, investors may find it difficult to accept that the explanation for these changes lies in information about fundamental economic factors. This may lead to an erosion of confidence in capital markets and a reduced flow of capital into equity markets. Secondly, for individual firms, the volatility of the firm is an important factor in determining the probability of bankruptcy. The higher the volatility for a given capital structure, the higher the probability of default. Thirdly, volatility is an important factor in determining the bid-ask spread. The higher the volatility of the stock, the wider the spread between the bid and ask prices of the market maker. The volatility of the stock thus affects the market's liquidity. Fourthly, hedging techniques such as portfolio insurance are affected by the volatility level, with the price of insurance increasing with volatility. Fifthly, economic and financial theory suggests that consumers are risk averse. Increased risk associated with a given economic activity should, therefore, see a reduced level of participation in that activity, which will have adverse consequences for investment. Finally, increased volatility over time may induce regulatory agencies and providers of capital to force firms to allocate a larger percentage of available capital to cash equivalent investments, to the potential detriment of allocational efficiency. In light of the above, it appears justifiable to discuss volatility in some depth.
The objective of this paper is to provide an overview of volatility. To date, there exist several reviews on specific models or issues regarding volatility, such as those of Bollerslev, Chou and Kroner (1992), Kupiec (1993) and Bera and Higgins (1993), but there appears to be no coverage of this controversial topic from a broader perspective, providing an insight on volatility to a more general audience of financial economists.
In this paper, the section 'What is Volatility?' begins by providing a background discussion on the topic of volatility, while the next section 'What Influences Volatility?' examines the factors that influence volatility. The following section 'Determinants of Volatility Changes' provides an overview of the models used to measure volatility, and points to some of their inherent weaknesses. The final section summarizes the paper and draws together the conclusions.

What is Volatility?
Volatility can be described broadly as anything that is changeable or variable. Volatility can be defined as the changeableness of the variable under consideration; the more the variable fluctuates over a period of time, the more volatile the variable is said to be. Volatility is associated with unpredictability, uncertainty and risk. To the general public, the term is synonymous with risk, hence high volatility is thought of as a symptom of market disruption whereby securities are not being priced fairly and the capital market is not functioning as well as it should. Volatility is of enormous importance to everyone involved in the financial markets, where it is thought of more in terms of unpredictability. In this context, volatility is often used to describe dispersion from an expected value, price or model. The deviation of prices from theoretical asset pricing model values, and the variability of traded prices from their sample mean, are two examples. As discussed earlier, substantial changes in the volatility of financial market returns are capable of having significant negative effects on risk averse investors. In addition, such changes can also impact on consumption patterns, corporate capital investment decisions, leverage decisions and other business cycle and macroeconomic variables. Traditionally, volatility is viewed as synonymous with variance risk. The trade-off between risk and expected return is the foundation upon which much of modern finance theory, such as Capital Asset Pricing Models, Arbitrage Pricing Models and portfolio theory, is based. Modern option pricing theory, beginning with Black and Scholes (1973), places volatility in a central role in determining the fair value of an option. In the Black-Scholes option pricing formula, the returns volatility of the underlying asset is an important parameter, and its importance is magnified by the fact that it is the only variable that is not directly observable. Although realised volatility can be computed from historical data, an option's theoretical value depends on the volatility that will be experienced in the future over its entire lifetime. Despite recent scrutiny, relatively little is yet known about the causes of volatility in various financial and commodity markets. Schwert (1990) shows that an increase in stock market volatility (as measured by percentage change in prices or rates of return) (Note 1) brings an increased chance of large stock price changes of either sign. Most of the markets' highest returns, for example, occurred during the Great Depression from 1929 to 1939. The most commonly used measure of stock return volatility is the standard deviation. A plot of the volatility of monthly returns (measured by the standard deviation) of the listed stocks on the New York Stock Exchange shows that the highest periods of volatility this century were recorded from 1929 to 1939 and for October 1987. Shiller (1989) concluded that stock market volatility is excessively high relative to fundamental values. This issue will be explored further in Chapter three. Shiller's work is supported by Spiro (1990), while Schwert (1991) was strongly critical, claiming Shiller's status has little to do with the scientific merits of his work. In terms of percentage stock price movements, over the past few years volatility has increased dramatically for short intervals, with periods of high volatility such as 1987 for equities, 1992 for foreign exchange and 1994 for bonds. Kupiec (1991) shows that over this period stock return volatility increased in many OECD countries. He argues that this higher average volatility recorded in the 1980s was caused by short periods of abnormally high return volatility which raised measures of average volatility for the decade.

What Influences Volatility?
One way of examining the influences on volatility is to calculate volatility over several different frequencies. Historical data shows that some volatility clusters are short-lived, lasting only a few hours, while others last a decade. The primary source of changes in market prices is the arrival of news about an asset's fundamental value. If the news arrives in rapid succession, and if the data is of sufficiently high frequency to pick up the arrival of news, then returns will exhibit a volatility cluster. At higher frequencies, the most likely sources of volatility are the pressures and turbulence induced through trading, often called noise. At lower frequencies, macroeconomic and institutional changes are the most likely influences. The high volatility of the 1930s, for example, is attributed to macroeconomic events. In general, the frequency of data dictates which types of volatility clusters can be seen and therefore measured. Low frequency data allows only low frequency or macroeconomic fluctuations to be seen, while higher frequency data reveals more about the volatility properties. Nelson (1996) lists several factors associated with market volatility changes. The more important of these are: (1) Positive serial correlation in volatility. This means that 'large changes tend to be followed by large changes of either sign and small changes tend to be followed by small changes' (see Mandelbrot, 1963); (2) Fama (1965) and French and Roll (1986) show that trading and nontrading days contribute to market volatility. In particular, stock market volatility tends to be higher on Mondays than on other days of the week, reflecting movements of stock prices on a Monday based on information arriving over a 72-hour period, while on other trading days, price movements reflect information arriving over a 24-hour period; (3) Leverage effects provide a partial explanation for market volatility changes. When firms' stock prices fall, they become more leveraged and the volatility of their returns typically rises. Black (1976) argued, however, that the measured effect of stock price changes on volatility is too large to be explained solely by leverage changes; (4) During recessions and financial crises stock market volatility tends to be high. For instance, stock market volatility hit a historic high during the 1930s Great Depression (see Schwert, 1988 and Officer, 1973); and (5) High nominal interest rates have been shown to be associated with high market volatility (see Fama and Schwert, 1977; Christie, 1982 and Glosten et al., 1989).

Long-term and Short-term Factors Influencing Volatility
Bearing in mind the above discussion, it appears intuitive to separate the factors that influence volatility into long-term and short-term factors. Amongst the long-term influences on volatility is that of corporate leverage (i.e. debt/equity ratios). Christie (1982) and Black (1976) identified volatility peaks with stock market price declines, their explanation for this being based on the effects that corporate leverage has on volatility. The argument is essentially circular, because stock price declines will increase financial risk, since the debt/equity ratio has increased. This increase in financial risk will, in turn, increase the expected return to equity, which decreases current stock prices. Schwert (1989) presents evidence of a positive correlation between corporate leverage and volatility. Similar results have been obtained by Black (1976), Christie (1982) and French, Schwert and Stambaugh (1987). Amongst them, French, Schwert and Stambaugh (1987), Officer (1973) and Schwert (1989) have identified an association between volatility peaks and recessions. Major volatility peaks have been associated with the 1930s Great Depression, the OPEC oil crisis of 1974, and the stock market crash of October 1987. Schwert (1989) also showed that the volatility of industrial production is highest during financial recessions.
Factors which influence volatility over the short-term include trading volume, contrarian trade (Note 2), and the introduction of futures and options (Note 3). Perhaps the most commented upon of these factors is the association between trading volume and volatility. Market folklore suggests that trading volume is positively associated with volatility. Karpoff (1987) reviewed the theory and evidence on this relationship and concluded that there is strong support for a positive relationship. French and Roll (1986) found that the volatility on the NYSE during trading hours is far greater than during weekend non-trading hours and that this is due to the arrival of private rather than public information. By definition, private information can only affect prices through trading, whereas public information can affect prices at any time.

Statistical Measures
A common measure of stock market volatility is the standard deviation of returns. Estimates of sample standard deviation from daily returns serve as a useful measure for characterising the evolution of volatility. This statistic measures the dispersion of returns.
The standard deviation $\sigma$ of returns $R_t$ from a sample of $T$ observations is the square root of the average squared deviation of returns from the average return in the sample,

$$\sigma = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(R_t - \bar{R}\right)^2},$$

where $\bar{R}$ is the sample average return, $\bar{R} = \sum_{t=1}^{T} R_t / T$. The standard deviation is a simple but useful measure of volatility because it summarises the probability of seeing extreme values of return. When the sample standard deviation is large, the chance of a large positive or negative return is large. Several studies have used a modification of the standard deviation to estimate volatility. Hooper and Kohlhagen (1978), Cushman (1983), Aschheim et al. (1993) and Daly (1997a) all use a four quarter moving standard deviation of exchange rates as a proxy for exchange rate volatility. A measure of volatility which focuses on the uncertainty aspect of volatility is the Root Mean Square Percentage Error (RMSPE). This is a simple and well known measure of prediction errors, and can be represented as follows:

[equation lost in the source]

where $E_t$ represents the actual variable in period $t$. We can also use the standard deviation as a predictor of $E_t$, by denoting the mean as $\bar{E}$ and substituting it for the predicted value of $E_t$ in the RMSPE expression, thus:

[equation lost in the source]

The standard deviation method is, however, scale dependent. A closely related method which is not scale dependent is the percentage coefficient of variation:

$$CV = 100 \times \frac{\sigma}{\bar{E}}.$$

The above models of uncertainty can be regarded as assuming $\sigma_t$ to be constant for all time $t$. In the literature, these measures of volatility are referred to as time invariant measures. Since the actual variance of stock returns is widely acknowledged to be time varying, the usefulness of time invariant measures in measuring risk has been questioned. It is, therefore, more appropriate that $\sigma_t$ should be seen as varying over time and dependent on past values $\sigma_{t-1}$, $\sigma_{t-2}$, etc. This has led to increasing attempts to develop more acceptable measures such as time varying measures of volatility.
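The time invariant measures described above can be sketched in a few lines of Python. This is an illustration only: the function names and the quarterly exchange-rate figures are hypothetical, and the sample standard deviation is assumed to use the T-1 divisor.

```python
import math

def sample_std(values):
    """Sample standard deviation (T-1 divisor, an assumption of this sketch)."""
    t = len(values)
    mean = sum(values) / t
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (t - 1))

def moving_std(series, window=4):
    """Trailing moving standard deviation, e.g. over four quarters."""
    return [sample_std(series[i - window + 1 : i + 1])
            for i in range(window - 1, len(series))]

def pct_coefficient_of_variation(series):
    """Standard deviation as a percentage of the mean (scale free)."""
    mean = sum(series) / len(series)
    return 100.0 * sample_std(series) / mean

# Hypothetical quarterly exchange-rate levels, for illustration only.
rates = [1.50, 1.52, 1.47, 1.55, 1.60, 1.58, 1.49, 1.53]
scaled = [100.0 * r for r in rates]   # same series quoted in different units

print(moving_std(rates))                                   # one value per 4-quarter window
print(sample_std(rates), sample_std(scaled))               # scale dependent
print(pct_coefficient_of_variation(rates),
      pct_coefficient_of_variation(scaled))                # scale free
```

Rescaling the series multiplies the standard deviation by the same factor but leaves the percentage coefficient of variation unchanged, which is the sense in which the latter is not scale dependent.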

Modeling Volatility
According to Engle (1993), financial market volatility is predictable. In principle this claim may only be justified when ARCH effects are present. The implication of this observation for risk averse investors is that they can adjust their portfolios by reducing their commitments to assets whose volatilities are predicted to increase, thereby reducing their exposure to risk. Predicting volatility is really just a prediction of variance, a prediction that the potential size of a price move is small or great. Volatility forecasting is an imprecise activity, just like predicting rain. You can be correct in predicting the probability of rain, but still have no rain.
In modeling volatility, time series statistics are used to find the best forecast of volatility. By using time series statistics it is possible to determine whether recent information is more important than old information and how fast information decays. We can determine whether volatility is equally sensitive to market up moves as it is to down moves, and whether the size of past returns is proportional to the magnitude of volatility experienced today.

ARCH Models
The most important development in modelling volatility changes was the autoregressive conditional heteroskedasticity or ARCH model, introduced by Engle (1982). The growth rate of the ARCH literature has been truly spectacular over the last decade, so much so as to evoke the following comment from Bera and Higgins (1993): 'The numerous applications of ARCH models defies observed trends in scientific advancement. Usually applications lag theoretical developments, but Engle's ARCH model has been applied to numerous economic and financial data series of many countries, while it has seen relatively fewer theoretical advancements.' This growth has stemmed primarily from the versatility of ARCH models in capturing some important stylised facts of many economic and financial data. These include the facts that unconditional distributions have thick tails, that variances change over time, and that large (small) changes tend to be followed by large (small) changes of either sign. The ARCH model has been applied to test several asset pricing models such as the Capital Asset Pricing Model (CAPM) and the Arbitrage Pricing Model (APT) in order to capture the time varying systematic risk process of these models. In the CAPM, there is a direct association between variance and risk as well as a fundamental tradeoff relationship between risk and return. The ARCH-M model developed by Engle, Lilien and Robins (1987) (discussed below in the ARCH-M section) provides a useful tool for estimating this linear relationship.
ARCH models have been used to examine how information flows across countries, markets and assets, and to develop optimal hedging strategies. In macroeconomics, ARCH techniques have been used to model the relationship between the time-varying conditional variance and the risk premia in the term structure of interest rates. In modeling exchange rate dynamics, international portfolio management depends on expected exchange rate movement through time. Additionally, the impact of exchange rate movements on different macroeconomic variables requires an understanding of exchange rate dynamics. In the absence of any structural model that captures these dynamics, the linear GARCH (p, q) model (see the GARCH section below) has been widely used for modeling exchange rate dynamics. ARCH models have also been used to measure inflation uncertainty, to study the effects of central bank intervention, and to characterise the relationship between the macro economy and the stock market. Before we describe the original ARCH model of Engle (1982) and subsequent extensions to the basic ARCH model, we first outline the steps involved in measuring the volatility of asset returns. Engle (1982, 1983) found in analysing results from models of inflation that large and small forecast errors appeared to occur in clusters. This suggested a form of heteroscedasticity in which the variance of the forecast error depends on the size of the preceding disturbance. The ARCH models discussed below all exploit a common statistical characteristic referred to as conditional variance. The conditional mean uses information from the previous period and is in general a random variable, depending on the information set $F_{t-1}$, and is given by:

$$m_t = E\left[y_t \mid F_{t-1}\right], \tag{1.5}$$

where $y_t$ is the rate of return of a particular stock or market portfolio from time $t-1$ to $t$, $F_{t-1}$ is the past information set containing the realised values of all relevant variables up to time $t-1$, and $E$ is the mathematical expectations operator. Since investors know the information in $F_{t-1}$ when making their investment decision at time $t$, the relevant expected return and volatility to the investors are in turn given by the conditional expected value of $y_t$, represented in (1.5) above, and the conditional variance of $y_t$ given $F_{t-1}$, represented by equation (1.6) below:

$$\sigma_t^2 = \mathrm{Var}\left[y_t \mid F_{t-1}\right] = E\left[(y_t - m_t)^2 \mid F_{t-1}\right]. \tag{1.6}$$

Since volatility measures the variability of returns, investors will forecast more accurately by using the conditional variance, $\sigma_t^2$, since it depends on the information set $F_{t-1}$.
To analyse the returns $y_t$ on an asset received in period $t$, we need to follow three basic steps: (1) specify $m_t$; (2) specify $\sigma_t^2$; and (3) specify the density function of $\varepsilon_t$, the deviation of $y_t$ from $m_t$. In financial markets, $m_t$ is generally designated as the risk premium, or the expected return, which is frequently set at zero, at least for high-frequency data. Regarding the specification of the density function, the characteristics of stock returns, which tend to exhibit nonnormal unconditional sampling distributions, will be examined at various points, particularly when we investigate the application of ARCH to stock return data. Interestingly, according to Engle and Gonzalez-Rivera (1991), the assumption that the conditional density is normally distributed usually does not appreciably affect the estimates even if it is false. In presenting the various models examined below we will focus mostly on Step 2. In particular, we examine how the conditional variances depend on past information. Equation (1.6) is also used to determine time varying risk premiums and to forecast volatility. There will be little discussion on the estimation procedures, since these have been dealt with elsewhere (see Engle 1982; Bollerslev 1986 and Hamilton 1994). Suffice to say that maximum-likelihood estimation is generally recommended and used, and that the likelihood function typically assumes that the conditional density is Gaussian.
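As a sketch of what the Gaussian assumption implies, the log-likelihood that is maximised can be written as below. The parameter vector $\theta$ is notation introduced here for illustration (it collects whatever parameters enter $m_t$ and $\sigma_t^2$); the expression is the standard conditional normal log-likelihood rather than a formula taken from the sources cited above.

```latex
\ln L(\theta) = -\frac{1}{2}\sum_{t=1}^{T}
  \left[ \ln(2\pi) + \ln \sigma_t^2(\theta)
       + \frac{\bigl(y_t - m_t(\theta)\bigr)^2}{\sigma_t^2(\theta)} \right]
```

Each observation is penalised both for a large squared deviation relative to its conditional variance and for a large conditional variance itself, so the three specification steps listed above fully determine the objective function.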

The Linear ARCH(p) Model
ARCH can be defined in terms of the distribution of the errors of a dynamic linear regression model:

$$y_t = m_t + \varepsilon_t, \qquad \varepsilon_t = \sigma_t Z_t,$$

where $Z_t \sim$ i.i.d. with $E(Z_t) = 0$, $E(Z_t^2) = 1$. By definition, $\varepsilon_t$ is serially uncorrelated with mean zero, but the conditional variance of $\varepsilon_t$ equals $\sigma_t^2$, which may be changing through time. Engle (1982) then chose a functional form for $\sigma_t^2(\cdot)$:

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2, \tag{1.10}$$

where $\omega$ and $\{\alpha_i\}$, $i = 1, \ldots, p$, are nonnegative constants. This is necessary to keep $\sigma_t^2$ nonnegative. The distinguishing feature of this model is not that the conditional variance $\sigma_t^2$ is a function of the conditioning set $F_{t-1}$, but rather it is the particular functional form that is specified. Episodes of volatility are generally characterised as the clustering of large shocks to the dependent variable. The conditional variance function is formulated to mimic this phenomenon. In the regression model, a large shock is represented by a large deviation of $y_t$ from its conditional mean $m_t$, or equivalently, a large positive or negative value of $\varepsilon_t$. In the ARCH regression model, the variance of the current error $\varepsilon_t$, conditional on the realised errors $\varepsilon_{t-i}$, is an increasing function of the magnitude of the lagged errors irrespective of their sign. Hence large errors of either sign tend to be followed by a large error of either sign, and similarly, small errors of either sign tend to be followed by small errors of either sign. The order of the lag $p$ determines the length of time for which a shock persists in conditioning the variance of subsequent errors. The effect of a return shock $i$ periods ago ($i \le p$) on current volatility is governed by the parameters $\alpha_i$ in equation (1.10). Normally, the older the news, the less effect it has on current volatility. Investors choose different values of $p$, depending on how fast they think volatility is changing. Engle (1982) parameterised the conditional variance as (1.10) where the weights $\alpha_i$ decline linearly. The model in (1.10) is referred to as ARCH(p), in which the conditional variance is simply a weighted average of past squared forecast errors. This is the volatility estimate used by the vast majority of market participants.
The appeal of (1.10) lies in the way it captures the positive serial correlation in $\varepsilon_t^2$: a high value of $\varepsilon_t^2$ increases $\sigma_{t+1}^2$, which in turn increases the expectation of $\varepsilon_{t+1}^2$, and so on. In other words, a large (small) value of $\varepsilon_t^2$ tends to be followed by a large (small) value of $\varepsilon_{t+1}^2$. The advantage of the ARCH formulation is that these parameters can be estimated from historical data and used to forecast future patterns in volatility. The above equation (1.10) can be written as:

$$\varepsilon_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \left(\varepsilon_t^2 - \sigma_t^2\right),$$

where the term in brackets is unforecastable and is considered to be the innovation in the autoregression for $\varepsilon_t^2$.
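The serial correlation in squared errors described above can be illustrated with a small simulation. The sketch below generates an ARCH(p) series with Gaussian $Z_t$ and computes $T R^2$ from a regression of the squared error on a constant and its first lag, which is the statistic used in Engle's (1982) LM test for a single lag. The function names, parameter values and seeds are hypothetical, and including a constant in the auxiliary regression is an assumption of this sketch.

```python
import math
import random

def simulate_arch(omega, alphas, n, seed=0, burn=200):
    """Simulate eps_t = sigma_t * Z_t, sigma_t^2 = omega + sum_i alpha_i * eps_{t-i}^2."""
    rng = random.Random(seed)
    eps = [0.0] * len(alphas)                 # pre-sample shocks set to zero
    for _ in range(n + burn):
        sigma2 = omega + sum(a * eps[-i - 1] ** 2 for i, a in enumerate(alphas))
        eps.append(math.sqrt(sigma2) * rng.gauss(0.0, 1.0))
    return eps[-n:]                           # drop the burn-in draws

def lm_stat_one_lag(eps):
    """T * R^2 from regressing eps_t^2 on a constant and eps_{t-1}^2."""
    y = [e * e for e in eps[1:]]
    x = [e * e for e in eps[:-1]]
    t = len(y)
    my, mx = sum(y) / t, sum(x) / t
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return t * sxy * sxy / (sxx * syy)        # R^2 equals the squared correlation here

arch_series = simulate_arch(0.2, [0.5], 2000, seed=1)   # ARCH(1) with alpha_1 = 0.5
iid_series = simulate_arch(0.2, [0.0], 2000, seed=1)    # no ARCH effect
print(lm_stat_one_lag(arch_series))
print(lm_stat_one_lag(iid_series))
```

Under the null hypothesis of no ARCH effect, the statistic is asymptotically chi-squared with one degree of freedom, so values above roughly 3.84 reject at the 5% level.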
It is a simple procedure to test whether the residuals $\varepsilon_t$ from a regression model exhibit time-varying heteroscedasticity. Engle (1982) derived a test based on the Lagrange Multiplier (LM) principle. An LM test for $\alpha_1 = \cdots = \alpha_p = 0$ can be calculated as $TR^2$ from the regression of $\varepsilon_t^2$ on $\varepsilon_{t-1}^2, \ldots, \varepsilon_{t-p}^2$, where $T$ denotes the sample size. (Note 4)

GARCH
Bollerslev (1986) extended the ARCH model into GARCH, i.e. Generalised ARCH. The innovation here is that GARCH allows past conditional variances to enter equations (1.9) and (1.10). The intention of GARCH is that it can parsimoniously represent a higher order ARCH process. The GARCH (p, q) can be represented as follows:

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \tag{1.12}$$

where $\omega$, $\alpha_i$, $i = 1, \ldots, p$, and $\beta_j$, $j = 1, \ldots, q$, are nonnegative constants. As indicated in (1.12), GARCH models explain variance by two distributed lags, one on past squared residuals to capture high frequency effects, and the second on lagged values of the variance itself, to capture longer term influences. An appealing feature of the GARCH (p, q) model concerns the time series dependence in $\varepsilon_t^2$.
Equation (1.12) can be written as:

$$\varepsilon_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j}^2 - \sum_{j=1}^{q} \beta_j \nu_{t-j} + \nu_t, \qquad \nu_t = \varepsilon_t^2 - \sigma_t^2. \tag{1.13}$$

Equation (1.13) shows that $\varepsilon_t^2$ follows an autoregressive moving average (ARMA) process. A systematic approach to estimation is maximum likelihood. This involves postulating a well defined objective function and then maximising it with respect to the unknown parameters. As the objective function is not quadratic, iterative algorithms are required. Various algorithms have been used, but estimation of the GARCH (1, 1) is rather well behaved. This is the simplest of the GARCH models and can be expressed as:

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2. \tag{1.14}$$

The GARCH (1, 1) in equation (1.14) embodies a very intuitive forecasting strategy: the variance expected at any given date is a combination of a long run variance and the variance expected for the last period, adjusted to take into account the size of last period's observed shock. In the GARCH (1, 1) model, the effect of a return shock on current volatility declines geometrically over time. Despite the success of both ARCH and GARCH models (see the survey by Bollerslev et al. (1992)), these models cannot capture some important features of financial and economic data. The most interesting feature not addressed by these models is the leverage or asymmetric effect (discussed below) discovered by Black (1976), and confirmed by Nelson (1990) and Schwert (1990), among others. Mandelbrot (1963) and Fama (1965) were amongst the first to observe that unconditional price or stock returns tend to have fatter tails than a normal distribution exhibits, in the form of skewness (Note 5) but more pronounced in the form of excess kurtosis. For example, in the GARCH (p, q) model examined above, the unconditional distribution of the $\varepsilon_t$'s has fatter tails than the normal distribution. Baillie and DeGennaro (1990) and de Jong et al. (1990) assumed conditionally t-distributed errors (the conditional t-distribution allows for heavier tails than the normal distribution) in a GARCH (1, 1) model for the conditional variance, and found that failure to model the fat-tailed properties leads to spurious results in terms of the estimated risk-return tradeoff. One solution to the kurtosis problem is the adoption of conditional distributions with fatter tails than the normal distribution. Nelson's (1991) EGARCH (discussed below) represents one of the more successful attempts to model excess conditional kurtosis in stock return indices based on a generalised exponential distribution.
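The forecasting strategy embodied in the GARCH (1, 1) can be sketched as a simple recursion. The code below is illustrative only: the function names and parameter values are hypothetical, the recursion is started at the long run variance $\omega / (1 - \alpha - \beta)$ (an assumption of this sketch, valid when $\alpha + \beta < 1$), and the parameters are taken as given rather than estimated.

```python
def garch11_filter(eps, omega, alpha, beta):
    """Conditional variances: sigma_t^2 = omega + alpha*eps_{t-1}^2 + beta*sigma_{t-1}^2."""
    long_run = omega / (1.0 - alpha - beta)   # long run variance, needs alpha + beta < 1
    sigma2 = [long_run]                       # starting value (assumption of this sketch)
    for e in eps[:-1]:
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2

def garch11_forecast(sigma2_next, omega, alpha, beta, horizon):
    """Multi-step variance forecasts; the gap to the long run level shrinks by
    a factor (alpha + beta) at each further step."""
    long_run = omega / (1.0 - alpha - beta)
    return [long_run + (alpha + beta) ** k * (sigma2_next - long_run)
            for k in range(horizon)]

# Hypothetical parameter values and shocks, for illustration only.
omega, alpha, beta = 0.1, 0.1, 0.8
print(garch11_filter([1.0, 0.0, 0.0], omega, alpha, beta))
print(garch11_forecast(2.0, omega, alpha, beta, 3))
# Weight of a shock k periods ago in today's variance: alpha * beta**(k-1).
print([alpha * beta ** (k - 1) for k in range(1, 6)])
```

The last line makes the geometric decline explicit: in this parameterisation a shock's weight falls by the factor $\beta$ with each period that passes.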

EGARCH
In the case of the GARCH models discussed above, the residuals enter the conditional variance only after being squared. However, it is possible that up and down moves do not have the same predictability for future volatility. Nelson (1991) was the first investigator to model leverage effects (i.e. where the down moves are more influential for predicting volatility than up moves), by introducing the exponential ARCH or EGARCH model, which can be represented as follows:

$$\ln \sigma_t^2 = \omega + \sum_{j=1}^{q} \beta_j \ln \sigma_{t-j}^2 + \sum_{i=1}^{p} \left[ \alpha_i \left( \left|\frac{\varepsilon_{t-i}}{\sigma_{t-i}}\right| - E\left|\frac{\varepsilon_{t-i}}{\sigma_{t-i}}\right| \right) + \gamma_i \frac{\varepsilon_{t-i}}{\sigma_{t-i}} \right]. \tag{1.15}$$

The EGARCH model was largely motivated by Black's (1976) empirical observation that stock volatility tends to rise following negative returns and to drop following positive returns. The EGARCH model exploits this empirical regularity by making the conditional variance estimate a function of both the size and the sign of lagged residuals. Unlike the linear GARCH (p, q) model, the EGARCH model places no restrictions on the parameters $\alpha_i$ and $\beta_j$ to ensure nonnegativity of the conditional variances. Equation (1.15) allows positive and negative values of $\varepsilon_t$ to have different impacts on volatility. The EGARCH model is asymmetric because the level $\varepsilon_{t-i}/\sigma_{t-i}$ is included with coefficient $\gamma_i$. Since this coefficient is typically negative, positive return shocks generate less volatility than negative return shocks, all else being equal. In summary, the EGARCH model differs from the standard GARCH model in two main respects: firstly, the EGARCH model allows good news and bad news to have a different impact on volatility, while the standard GARCH does not, and secondly, the EGARCH model allows big news to have a greater impact on volatility than the standard GARCH model.
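The asymmetry can be made concrete with a one-lag sketch. The code below assumes an EGARCH(1,1) of the form $\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2 + \alpha(|z_{t-1}| - E|z_{t-1}|) + \gamma z_{t-1}$ with $z = \varepsilon/\sigma$ and Gaussian $Z_t$, so that $E|z| = \sqrt{2/\pi}$. The function name and all parameter values are hypothetical.

```python
import math

def egarch11_log_variance(prev_log_var, prev_eps, omega, beta, alpha, gamma):
    """One step of ln(sigma_t^2) for an EGARCH(1,1) with Gaussian Z_t (a sketch)."""
    z = prev_eps / math.exp(0.5 * prev_log_var)   # standardised shock eps / sigma
    expected_abs_z = math.sqrt(2.0 / math.pi)     # E|Z| under normality
    return (omega + beta * prev_log_var
            + alpha * (abs(z) - expected_abs_z)   # size effect
            + gamma * z)                          # sign (leverage) effect

# Hypothetical parameters with gamma < 0, for illustration only.
params = dict(omega=-0.1, beta=0.95, alpha=0.2, gamma=-0.1)
after_bad_news = egarch11_log_variance(0.0, -1.0, **params)
after_good_news = egarch11_log_variance(0.0, 1.0, **params)

# Variances are exp(log variance), so they are positive for any parameter values.
print(math.exp(after_bad_news), math.exp(after_good_news))
```

With a negative $\gamma$, a negative shock of a given size raises the log variance by $-2\gamma$ times the standardised shock size more than a positive shock of the same size, and no sign restrictions on the parameters are needed to keep the variance positive.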

The Taylor-Schwert Model
Davidian and Carroll (1987) argue that scale estimates based on absolute residuals are more robust to the presence of thick-tailed residuals than scale estimates based on squared residuals. Schwert (1989) applied the Davidian and Carroll intuition to ARCH models, conjecturing that estimating $\sigma_t^2$ with the square of a distributed lag of absolute residuals (as opposed to estimating it with a distributed lag of squared residuals, as in GARCH) would be more robust to innovations $\varepsilon_t^x$ in a variable $x$ with thick-tailed distributions. Taylor (1986) proposed a similar method. Generally, since thick-tailed standardised residuals are the norm in empirical applications of ARCH (see Bollerslev et al., 1992), the use of absolute residuals as opposed to squared residuals in estimating time varying volatilities in asset returns is more appropriate.
Schwert's (1989) model produces monthly volatility estimates from monthly return data. The measure is more robust than the standard deviation measure because of its insensitivity to extreme values, the measure being based on absolute deviations of returns from their conditional mean. Schwert's method for estimating volatility first regresses monthly returns on 12 monthly dummy variables and 12 lagged return values, and then regresses the absolute residuals on the monthly dummies and 12 of their own lags:

$$\left|\hat{\varepsilon}_t^x\right| = \sum_{m=1}^{12} \delta_m SD_m + \rho(H) \left|\hat{\varepsilon}_t^x\right| + u_t, \tag{1.16}$$

where $\rho(H) = \rho_1 H + \cdots + \rho_{12} H^{12}$ is a 12th order polynomial in the lag operator $H$, the $SD_m$'s are monthly dummy variables to capture seasonal variations in the means and standard deviations of the variables, and the $\left|\hat{\varepsilon}_t^x\right|$ are obtained as the absolute values of the residuals from the following equation:

$$x_t = \sum_{m=1}^{12} a_m SD_m + \phi(H)\, x_t + \varepsilon_t^x, \tag{1.17}$$

where $\phi(H) = \phi_1 H + \cdots + \phi_{12} H^{12}$ is another 12th order polynomial in the lag operator $H$. The measure of conditional volatility in equation (1.16) represents a generalisation of the 12-month rolling standard deviation estimator used by Officer (1973), Fama (1976) and Merton (1980) to measure stock market volatility, because it allows the conditional mean to vary over time in equation (1.17) while also allowing different weights to apply to the lagged absolute unpredicted changes in stock market returns in equation (1.16). This measure has been used by Schwert (1989) to examine the relationship between stock market volatility and underlying economic volatility, and more recently by Koutoulas and Kryzanowski (1996) to examine the role of conditional macroeconomic factors in an arbitrage pricing model, and in Chapters four and five of this thesis, where the Schwert (1989) measure is used to examine the relationship between stock market volatility and several financial and macroeconomic variables for Australia. Higgins and Bera (1992) proposed a nonlinear ARCH (NARCH) model in which the squared residuals enter the conditional variance raised to a power. The chief appeal of NARCH is that it is more robust to conditionally thick-tailed $\varepsilon_t^x$'s: NARCH limits the influence of large residuals in essentially the same way as the estimators employed in the robust statistics literature (e.g. Davidian and Carroll, 1987).
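The two regressions in Schwert's procedure can be sketched with ordinary least squares in plain Python. This is a minimal illustration under stated assumptions, not Schwert's own code: the helper names (`ols`, `design`, `schwert_volatility`) are hypothetical, the first observation is assumed to fall in month 0 when assigning dummies, and the input is simulated data.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan with pivoting)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def design(series, n_lags, period=12):
    """Rows of [monthly dummies, lagged values] for t = n_lags .. len(series) - 1."""
    rows = []
    for t in range(n_lags, len(series)):
        dummies = [1.0 if t % period == m else 0.0 for m in range(period)]
        lags = [series[t - i] for i in range(1, n_lags + 1)]
        rows.append(dummies + lags)
    return rows

def schwert_volatility(returns, n_lags=12):
    # Step 1 (mean equation): returns on monthly dummies and lagged returns.
    X1, y1 = design(returns, n_lags), returns[n_lags:]
    b1 = ols(X1, y1)
    abs_resid = [abs(y - sum(b * x for b, x in zip(b1, row)))
                 for row, y in zip(X1, y1)]
    # Step 2 (volatility equation): |residuals| on dummies and lagged |residuals|.
    X2 = design(abs_resid, n_lags)
    b2 = ols(X2, abs_resid[n_lags:])
    # Fitted values are the conditional volatility estimates.
    return [sum(b * x for b, x in zip(b2, row)) for row in X2]

rng = random.Random(0)
returns = [rng.gauss(0.01, 0.05) for _ in range(240)]   # simulated monthly returns
vol = schwert_volatility(returns)
print(len(vol), vol[:3])
```

Each regression uses up 12 observations for lags, so 240 months of returns yield 216 volatility estimates.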

ARCH-M
Assets with high expected risk must offer a high rate of return to induce investors to hold them. In this case, increases in conditional variance should be associated with increases in the conditional mean. Merton (1980) derives an equation that relates the expected return on the market linearly to the conditional variance of the market:

$$m_t = \lambda \sigma_t^2, \tag{1.18}$$

where $\lambda$ can be interpreted as the coefficient of relative risk aversion of an agent, and $m_t$ is interpreted as a time-varying risk premium, that is, the increase in the expected rate of return due to an increase in the variance of the return. Engle, Lilien and Robins (1987) propose the ARCH-M or GARCH-M, which incorporates an equation like (1.18) in which the conditional mean is an explicit function of the conditional variance:

$$y_t = g\left(x_{t-1}, \sigma_t^2; b\right) + \varepsilon_t, \tag{1.19}$$

where $x_{t-1}$ is a vector of variables in the information set $F_{t-1}$ and $b$ is a parameter vector. In (1.19), an increase in the conditional variance will be associated with an increase or a decrease in the conditional mean of $y_t$ depending on the sign of the partial derivative of $g(x_{t-1}, \sigma_t^2; b)$ with respect to $\sigma_t^2$. ARCH-M models have been frequently used in finance, where many theories involve an explicit tradeoff between risk and expected return. ARCH-M is suited to modelling this relationship because the variance may frequently be time-varying.

MARCH
Friedman and Laibson (1989) argue that stock movements have "ordinary" and "extraordinary" components. This motivated their modified ARCH (MARCH) model, which bounds $g(\cdot)$ in (1.20) to keep the "extraordinary" component from being too influential in determining $\sigma_t^2$. The MARCH model can be represented as:

$$\sigma_t^2 = \omega + \alpha\,g(\varepsilon_{t-1}^2) + \beta\,\sigma_{t-1}^2, \qquad g(x) = \begin{cases} \sin(\theta x), & \theta x < \pi/2 \\ 1, & \theta x \ge \pi/2 \end{cases} \tag{1.20}$$

This model can be understood intuitively if we assume that $\varepsilon_t$ has occasional large outliers. In that case, least-squares based procedures such as GARCH will not estimate $\sigma_t^2$ efficiently. In accord with Friedman and Laibson's (1989) intuition, the thicker tailed the conditional distribution of $\varepsilon_t$, the less weight should be given to 'large' observations. The point of the MARCH model is to enable the ARCH mechanism to focus on the ordinary component of stock returns by de-emphasising the extraordinary events. In essence, the estimated MARCH model distinguishes the extraordinary movements and removes most of their impact, and then analyses the persistence in volatility of the remaining ordinary component. Results from the estimated MARCH model show that extremely high volatility levels, such as those of the October 1987 crash, decay quickly, while only marginally high volatility levels decay much more slowly.
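The effect of bounding the news term can be seen in a short sketch. It assumes the form in which the MARCH model is commonly written, with a sine-based transformation that is capped at one; the function names and parameter values below are hypothetical and are not taken from Friedman and Laibson's estimates.

```python
import math

def march_g(x, theta):
    """Bounded transformation of a squared residual x >= 0 (assumed sine form)."""
    return math.sin(theta * x) if theta * x < math.pi / 2 else 1.0

def march_variance_path(residuals, omega, alpha, beta, theta, sigma2_0):
    """Conditional variances implied by a sequence of residuals."""
    sigma2 = [sigma2_0]
    for e in residuals:
        # The squared residual enters only through the bounded g(.),
        # so an extreme shock cannot add more than alpha to the variance.
        sigma2.append(omega + alpha * march_g(e * e, theta) + beta * sigma2[-1])
    return sigma2
```

Under these assumptions a shock of 10 and a shock of 100 have the same effect on next period's variance once both exceed the bound, which is how the model de-emphasises extraordinary events.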

Multivariate ARCH Models
Since economic variables are interrelated, many issues in asset pricing and portfolio decisions can only be meaningfully examined in a multivariate context. Estimating some financial coefficients, such as systematic risk (beta coefficients) and hedge ratios, requires the sample values of the covariance between the relevant variables. A general definition of multivariate ARCH can be represented as follows:

$$\varepsilon_t = \Omega_t^{1/2} z_t$$

where $z_t$ is i.i.d. with $E(z_t) = 0$ and $\mathrm{var}(z_t) = I$ (the identity matrix), $\{\varepsilon_t\}$ denotes an $N \times 1$ vector stochastic process, and $\Omega_t$ is the time-varying $N \times N$ covariance matrix, which is positive definite and measurable with respect to the time $t-1$ information set. The simplest multivariate ARCH model is that presented by Engle et al. (1990), in which certain linear combinations of the observable $X_t$'s drive the conditional covariance matrix. Inference in the multivariate ARCH model is similar to the univariate model. A complete description of the properties and parameterisation of multivariate ARCH is given in Higgins and Bera (1993).
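As a small illustration of why the conditional covariance matrix matters for beta coefficients, the sketch below computes a time-t beta from a 2 x 2 conditional covariance matrix after checking that the matrix is positive definite. The function name and the ordering convention (asset first, market second) are assumptions made for this example.

```python
import numpy as np

def conditional_beta(omega_t):
    """Beta of asset 0 on asset 1 from a 2x2 conditional covariance matrix."""
    omega_t = np.asarray(omega_t, dtype=float)
    if omega_t.shape != (2, 2) or not np.allclose(omega_t, omega_t.T):
        raise ValueError("expected a symmetric 2x2 covariance matrix")
    # A valid covariance matrix must have strictly positive eigenvalues.
    if np.linalg.eigvalsh(omega_t).min() <= 0:
        raise ValueError("covariance matrix is not positive definite")
    # beta_t = cov_t(asset, market) / var_t(market)
    return omega_t[0, 1] / omega_t[1, 1]
```

Because both the covariance and the market variance change with the information set, the beta obtained this way is itself time-varying.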
Several studies (e.g. Domowitz and Hakkio (1985), Diebold and Pauly (1988a) and Baillie and Bollerslev (1990)) have found weak results in the foreign exchange market when using univariate ARCH-M models to estimate time-varying risk premia. Some have speculated that these results are due to the poor proxies for risk provided by the conditional variances. In particular, the above authors suggest that the premium might be better approximated by a function of the time-varying cross-currency conditional covariances and not just the own conditional variance. Lee (1988) found support for this hypothesis, in that the conditional covariance between the German mark and the yen/US dollar spot rates (modelled by a bivariate ARCH(12) model) helps explain the weekly movements in the yen/US dollar rate.
Multivariate ARCH models have also addressed various policy issues related to the foreign exchange market.
Recent studies by Diebold and Pauly (1988b) and Bollerslev (1986) examine the effect of the creation of the European Monetary System (EMS) on short-run exchange rate volatility. In all these cases it was found that the conditional variances and covariances among the different European rates increased after those currencies joined the EMS. More recently, several papers have investigated the effect of central bank intervention on foreign exchange rate dynamics in the context of a multivariate GARCH model. For example, Connolly and Taylor (1990), Mundaca (1990) and Humpage and Osterberg (1990) come to the conclusion that a positive correlation between current intervention and exchange rate volatility exists in all cases.

Sources of ARCH
An important characteristic of the ARCH process which makes it suitable for modelling financial and economic series is that the errors are assumed to be serially uncorrelated with a constant unconditional variance, that is, $E(\varepsilon_t \varepsilon_s) = 0$ for $s \neq t$ and $\mathrm{var}(\varepsilon_t) = \sigma^2$, where $\sigma^2$ is a constant. Since the efficient market hypothesis asserts that past rates of return cannot be used to improve the prediction of future rates of return, it is important to confirm that the presence of ARCH does not represent a violation of market efficiency. This lack of serial correlation does not, however, imply that the $\varepsilon_t$'s are independent. A major contribution of the ARCH literature is the finding that apparent changes in the volatility of economic time series may be predictable and result from a specific type of nonlinear dependence rather than exogenous structural change in the variance. Using U.S. daily stock return data, Lamoureux and Lastrapes (1990) provide empirical support for the hypothesis that ARCH is a manifestation of the time dependence in the rate of information arrival to the market. They assume that $I_t$, the number of times new information comes to the market in period $t$, is serially correlated. Since $I_t$ is not observable, Lamoureux and Lastrapes used daily trading volume, $V_t$, as a proxy for the daily information that flows into the market. When $V_t$ was included as an extra variable in the GARCH(1, 1) model (1.14), its coefficient was highly significant for all of the 20 stocks they considered. Furthermore, including $V_t$ in (1.14) made the ARCH effects (coefficients $\alpha_1$ and $\beta_1$) negligible for most of the stocks. In summary, an important property of speculative prices is serial correlation in the second moments, and the search for the causes of this serial correlation has only recently begun. One possible explanation for the prominence of ARCH effects is the presence of a serially correlated news arrival process, as evidenced by Lamoureux and Lastrapes (1990), Diebold and Nerlove (1989) and Gallant, Hsieh and Tauchen (1991). This empirical work supports the view that ARCH in daily stock returns is an outcome of the time dependence in the news that flows into the market.
Closely related to the search for ARCH effects is the question of how persistent shocks to the volatility process are: once volatility increases, for how long does it remain high? Persistence of volatility has implications for the values of the parameters of the various models presented above. In the linear GARCH(p, q) model in (1.12), shocks persist indefinitely when the ARCH and GARCH coefficients sum to one ($\alpha_1 + \beta_1 = 1$ in the GARCH(1, 1) case). Engle and Bollerslev (1986) refer to this class of models as integrated in variance, or IGARCH. In an IGARCH process, a current shock persists indefinitely in conditioning the future variances. The regular occurrence of very large persistence in financial time series data is not theoretically justified. However, Kearns and Pagan (1993) feel safe in concluding that there is persistence of shocks in volatility, and that this persistence is as true of small shocks as it is of large ones. Moreover, they claim that there is no evidence that persistence is due to structural change, since persistence has remained constant over long periods of time.
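The notion of persistence can be made concrete for the GARCH(1, 1) case. In the sketch below, which uses hypothetical function names and illustrative parameter values, multi-step variance forecasts revert to the unconditional variance at the rate `alpha + beta`, and the half-life of a shock is the number of periods needed for half of its effect on the forecast to disappear. When `alpha + beta` equals one (the IGARCH case) the half-life is infinite.

```python
import math

def shock_half_life(alpha, beta):
    """Periods needed for half of a variance shock to decay in GARCH(1,1)."""
    persistence = alpha + beta
    if persistence >= 1.0:
        return math.inf  # IGARCH: the shock never dies out
    if persistence <= 0.0:
        return 0.0
    return math.log(0.5) / math.log(persistence)

def variance_forecasts(sigma2_next, omega, alpha, beta, horizon):
    """Forecasts of the conditional variance for steps 1..horizon.

    sigma2_next is the one-step-ahead conditional variance; each further
    step applies E[sigma2_{t+k+1}] = omega + (alpha + beta) * E[sigma2_{t+k}].
    """
    persistence = alpha + beta
    forecasts = [sigma2_next]
    for _ in range(horizon - 1):
        forecasts.append(omega + persistence * forecasts[-1])
    return forecasts
```

For example, with `alpha = 0.1` and `beta = 0.8` the half-life is about 6.6 periods, whereas with `alpha = beta = 0.5` and `omega = 0` every forecast equals the current conditional variance.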

Comparing Alternative Volatility Models
In comparing alternative ARCH models, a number of approaches can be taken. Nelson and Foster's (1994) research on ARCH models concentrates on refining techniques which approximate the measurement accuracy of ARCH conditional variance estimates and on comparing the efficiency achieved by different ARCH models. By deriving the asymptotic distribution of the measurement error, Nelson and Foster (1994) are able to characterise the relative importance of different kinds of misspecification. In particular, they show that misspecifying conditional means adds only trivially to measurement error, while other factors (for example capturing the 'leverage effect', accommodating thick-tailed residuals, and correctly modelling the variability of the conditional variance process) are potentially much more important.
Another approach involves an examination of how well different models of heteroscedasticity measure the value of $\sigma_t^2$. Pagan and Schwert (1990) fitted several models to monthly U.S. stock returns from 1834 to 1925. They found that the EGARCH model of Nelson performed best overall in both in-sample and out-of-sample cases. A related approach involves calculating various specification tests of the fitted model. Higgins and Bera (1992) used the LM test, whilst Nelson (1991) used moments tests and analysis of outliers. Yet another approach, by Engle and Mustafa (1992), assesses the usefulness of a given specification of the conditional variance using the observed prices of stock options. Options give investors the right to buy or sell the security at some date in the future at a price agreed upon today. The value of the option increases with the perceived variability of the security. By assuming that stock returns can be approximated by a normal distribution with constant variance, the Black and Scholes (1973) formula, which relates the price of the option to investors' perception of the variance of the stock price, can be used to value the option. These option prices can then be used to construct the market's perception of conditional variance, which can be compared to the series implied by a given time series model. The results of such a comparison suggest that GARCH(1, 1) and EGARCH(1, 1) models can improve on the market's assessment of the conditional variance of stock prices.
ARCH models have been criticised for being ad hoc, i.e., while they have been successful in empirical applications, they are statistical models, not economic models [Campbell and Hentschel (1993) and Andersen (1992)]. However, these comments are based partly, perhaps wholly, on the observation that there is a wide variety of ARCH models from which to choose. How do researchers choose between them? Nelson and Foster (1994) suggest that the robustness results of Davidian and Carroll (1987) hold in the ARCH context. In particular, EGARCH and Taylor-Schwert are more robust than GARCH to conditionally thick-tailed $\varepsilon_t$'s (i.e. scale estimates based on absolute residuals are more robust to the presence of thick-tailed residuals than scale estimates based on squared residuals). This indicates a preference for designing ARCH models to be robust to thick-tailed $\varepsilon_t$'s, since conditional leptokurtosis seems to be the rule in financial applications of ARCH.

Forecasting Volatility
The measurement of ex ante financial asset return volatility is extremely important for all participants in the markets. A commonly used approach to forecasting volatility involves using the market price of stock options. This method uses the market prices of contingent claims to work out the implied volatility. Here, a theoretical pricing function is used to relate the price of a contingent claim to the volatility of the underlying asset, and the function is then solved for the volatility implied by the market price of the claim. Latane and Rendleman (1976) suggested computing volatility from the Black-Scholes option pricing formula. Implied volatility can be computed from this formula by calculating the implied standard deviation (ISD), assuming that the volatility remains constant over the life of the option. The problem with this approach is that it requires the true volatility to be constant, which is more plausible when the term of the option is short enough for returns to be approximated by a normal distribution with constant variance. If this is not the case, then the interpretation of the volatility calculated by the ISD is unclear.
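The calculation of an implied standard deviation can be sketched as follows. The example assumes a European call on a non-dividend-paying stock priced by the standard Black-Scholes formula and inverts the formula by bisection; the function names, the search bounds and the tolerance are choices made for this illustration only.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Volatility that reproduces the observed call price, found by bisection."""
    if not bs_call(S, K, T, r, lo) <= price <= bs_call(S, K, T, r, hi):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    # The call price is increasing in sigma, so bisection converges.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Pricing a call at a volatility of 0.25 and passing the result to `implied_vol` recovers 0.25 to within the tolerance. With market prices, the recovered figure is only as meaningful as the constant-volatility assumption discussed above.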

The Black-Scholes Model
Theoretical and empirical research on security prices since the 1950s has largely supported the 'efficient markets' or 'random walk' model. In an efficient market, asset price movements can be described by an equation like

$$r_t = \frac{S_t - S_{t-1}}{S_{t-1}} = \mu_t + \varepsilon_t \tag{1.22}$$

The return at time $t$, $r_t$, is the percentage change in the asset price $S$ over the period from $t-1$ to $t$. This is equal to $\mu_t$, a non-random mean return for period $t$, plus a zero mean disturbance term $\varepsilon_t$ that is independent of all past and future $\varepsilon$'s. It is the lack of serial correlation in the random $\varepsilon$'s that is the defining characteristic of efficient market pricing; past price movements give no information about the sign of the random component of return in period $t$. If $S$ follows a random walk, the expected value of the return is zero and the variance of the random component is constant over time. Thus, $\mu_t$ would have to be zero and the variance of the $\varepsilon$'s would be the same for all dates.
In deriving the option pricing formula, Black and Scholes (1972) needed to model stock price movements over very short intervals of time. The formulation they adopted is an extension of the random walk model to continuous time. The result is the lognormal diffusion model shown in (1.23):

$$\frac{dS}{S} = \mu\,dt + \sigma\,dz \tag{1.23}$$

where $dS$ is the asset price change over an infinitesimal time interval $dt$, $\mu$ is the mean return at an annual rate, $dz$ is a time independent random disturbance with mean 0 and variance $1 \cdot dt$ (a stochastic process known as Brownian motion), and $\sigma$ is the volatility, i.e. the standard deviation of the annual return. This model produces (continuously compounded) returns that follow a normal distribution and asset prices that have a lognormal distribution (i.e. the logarithm of $S$ has a normal distribution). This implies that the cumulative continuously compounded return over a finite holding period of length $T$ has expected value $(\mu - \sigma^2/2)T$ (approximately $\mu T$ when $\sigma^2$ is small), variance $\sigma^2 T$ and standard deviation $\sigma\sqrt{T}$.
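The way the moments of the holding-period return scale with the horizon can be checked with a few lines of code. The sketch below uses the mean and standard deviation of the continuously compounded return given in (1.24); the function name and the parameter values are hypothetical.

```python
import math

def horizon_moments(mu, sigma, T):
    """Mean and standard deviation of the log return over T years
    under the constant-volatility lognormal diffusion."""
    mean = (mu - 0.5 * sigma ** 2) * T
    std = sigma * math.sqrt(T)
    return mean, std
```

With an annual mean of 10 per cent and an annual volatility of 20 per cent, a three-month horizon (`T = 0.25`) gives a standard deviation of 10 per cent: quartering the horizon halves the standard deviation, whereas the mean falls in proportion to the horizon.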
An interesting feature of this asset price process is that, with a constant volatility, the standard deviation, $\sigma\sqrt{T}$, of total returns over a holding period increases with the square root of the length of the period. This model, and subsequent extensions of it, have become the standard way to model asset price behavior. Empirical evidence shows, however, that the behavior of asset returns in the real world differs fundamentally from (1.22) and (1.23). The following are the main differences. First, volatility changes randomly over time. Second, prices in actual security markets are not perfectly uncorrelated over time. Positive (negative) correlation between consecutive price changes lowers (raises) measured volatility relative to the true value that should be used for $\sigma$. Third, observed price changes deviate consistently from lognormality. There are more very large changes and (consequently) more very small ones than a lognormal distribution calls for. The commonly used term for this is 'fat tails': there is more weight in the tails of the actual returns distribution than in a lognormal distribution with the same variance.
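A simple diagnostic for the fat tails described above is the sample kurtosis of returns, which equals 3 for a normal distribution; the sample skewness indicates asymmetry in the same way. The sketch below computes both from a list of returns using population (divide-by-n) moments; the function name is hypothetical and no small-sample correction is applied.

```python
def skewness_kurtosis(values):
    """Sample skewness and kurtosis based on central moments (divide by n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    # Skewness = m3 / m2^(3/2); kurtosis = m4 / m2^2 (3 for a normal distribution).
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

A kurtosis estimate well above 3 for a series of log returns is consistent with there being more weight in the tails than the lognormal diffusion model allows.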
The above discussion brings out one of the internal contradictions that are pervasive in applying theoretical derivative pricing models in practice. Volatility is known to be time-varying and stochastic, so a variety of methods to forecast it and to manage volatility risk are in use. Nevertheless, options are almost always priced simply by computing a point forecast for the unknown volatility and putting it into a constant volatility pricing model like Black-Scholes.

Computing Historical Volatility
Black and Scholes (1972) derived their option valuation equation under the assumption that stock returns followed a logarithmic diffusion process in continuous time with constant drift and volatility parameters as shown in equation (1.23).
Starting from an initial value $S_0$, the return, $R$, over the non-infinitesimal period from 0 to $T$ is given by $R = \ln(S_T/S_0)$, and $R$ has a normal distribution with

$$\text{Mean} = (\mu - \sigma^2/2)\,T, \qquad \text{Standard deviation} = \sigma\sqrt{T} \tag{1.24}$$

When an asset's price follows the constant volatility lognormal diffusion model of equation (1.23), $\sigma$ can be estimated from historical data. The difficulty is that actual prices do not follow (1.23) exactly, so that price behavior may change over time and differ over intervals of different lengths. However, the ways in which (1.23) fails in practice are not established and regular enough for an alternative model to have become widely accepted. It is common, therefore, to compute volatility using historical price data as if (1.23) were correct, but to adjust the estimation methodology, or the volatility number it produces, in order to offset known or suspected problems.
The resulting point estimate for $\sigma$ then becomes the volatility input to the Black-Scholes model or another fixed volatility equation. Black-Scholes is familiar and easier to manipulate than any valuation model that adjusts for random volatility formally.
Estimating historical volatility and projecting it forward is a very common approach to volatility forecasting in practice. Consider a set of historical prices $\{S_0, S_1, \ldots, S_T\}$ for some underlying asset that follows the process defined in equation (1.23). The first step is to compute the log price relatives, i.e., the percentage price changes expressed as continuously compounded rates, $R_t = \ln(S_t/S_{t-1})$, for $t$ from 1 to $T$. The estimate of the (constant) mean of $R_t$ is the simple average

$$\bar{R} = \frac{1}{T}\sum_{t=1}^{T} R_t \tag{1.25}$$

The variance of the $R_t$ is given by

$$v^2 = \frac{1}{T-1}\sum_{t=1}^{T} (R_t - \bar{R})^2 \tag{1.26}$$

Annualizing the variance by multiplying by $N$, the number of price observations in a year, and taking the square root yields the volatility,

$$\sigma = \sqrt{N v^2} \tag{1.27}$$

Given that the constant parameter diffusion model of (1.23) is correct, the above procedure gives the best estimate of volatility that can be obtained from the available price data.
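The historical volatility calculation described above can be sketched as follows. The function name `historical_volatility` and the argument `periods_per_year` (the N of the text, e.g. 252 for daily or 12 for monthly data) are hypothetical, and the sample variance is assumed to use the T - 1 divisor.

```python
import math

def historical_volatility(prices, periods_per_year):
    """Annualized volatility from a list of prices S_0, ..., S_T."""
    # Log price relatives R_t = ln(S_t / S_{t-1}).
    log_rel = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    T = len(log_rel)
    if T < 2:
        raise ValueError("need at least three prices")
    mean = sum(log_rel) / T
    # Sample variance of the log price relatives.
    variance = sum((x - mean) ** 2 for x in log_rel) / (T - 1)
    # Annualize the variance and take the square root.
    return math.sqrt(periods_per_year * variance)
```

A price series that grows at a constant rate has zero historical volatility under this measure, since every log price relative equals the mean.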

Estimating Volatility in Practice
In reality, several problems exist in attempting to measure volatility. Volatility clearly changes with time. The value of using all available data is severely limited by the fact that prices and returns for many securities appear to have some serial correlation and other distortions at both short and long intervals. Positive autocorrelation in returns will reduce estimated volatility. To limit the effect of serial correlation at high frequencies, researchers can estimate using fewer data points, but this may increase sampling error. Another way in which actual security returns differ from equation (1.24) is the well-documented problem of 'fat tails'. Equities and many other securities exhibit more large price changes than is consistent with the lognormal diffusion model. Once it is recognised that securities prices do not come from a constant volatility lognormal diffusion process, computing historical volatility as shown in equations (1.25-1.27) is no longer theoretically optimal. But many academic researchers and practitioners typically ignore these problems and calculate historical volatility estimates by the most basic method. The most common method of producing volatility forecasts from historical data is simply to select a sampling interval and the number of past prices to include in the calculation and then to apply equations (1.25-1.27).
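The effect of serial correlation on measured volatility can be illustrated with a variance ratio. The sketch assumes, purely for illustration, that one-period returns follow a first-order autoregressive process with coefficient `rho`; the function name is hypothetical.

```python
def variance_ratio(rho, k):
    """Var(k-period return) / (k * Var(one-period return)) for AR(1) returns.

    Uses Var(sum of k returns) = sigma^2 * [k + 2 * sum_{j=1}^{k-1} (k - j) * rho^j].
    """
    return 1.0 + 2.0 * sum((1.0 - j / k) * rho ** j for j in range(1, k))
```

A ratio above one means that volatility estimated from one-period returns and scaled by the square root of time understates the volatility of k-period returns. For example, `variance_ratio(0.2, 2)` is 1.2, so with positive autocorrelation the high-frequency estimate is too low, while a negative `rho` gives a ratio below one.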
Above we examined several extensions to the basic ARCH model of Engle. The complexity of these models is in part due to the training of financial economists and practitioners in classical statistics. While this training greatly influences the kinds of models and procedures used to tackle quantitative problems in their specialist fields, it may be argued that the complexities involved in measuring volatility warrant equally sophisticated models. However, on this point Figlewski (1996) has argued that the classical statistics view of the world does not accurately represent the nature of the underlying structure of a financial market. In particular, those trained in classical statistics tend to build models that are too complex and expect too much from them. The classical framework treats estimation and forecasting as very similar operations, so that goodness of fit statistics, which tell us how closely a model fits the data that was used in estimating it, are taken as a guide to forecasting performance. However, it is here that the classical statistics framework fundamentally misrepresents the nature of a financial market and leads those who adopt it to expect much better forecasting performance than can be achieved in practice.

Figlewski (1996) illustrates this with an example. Consider the following estimation and prediction problem. Suppose we wanted to predict the movement of a whale, based on observing it over a period of time. The whale, being a large animal, should move with a fairly predictable pattern over the short run, which can be found simply by extrapolation. However, we do not think of a whale as following a fixed and immutable pattern, or one that we could ever hope to understand completely. The whale's behavior as a complex living organism must remain partially unpredictable no matter how much past data we may have. In this case, we are not looking at a fixed structure with constant but unknown parameters, but rather at a system that evolves over time, and perhaps alters its behavior rapidly on occasion. Because its evolution is partly stochastic, no amount of past data will allow us to know the exact structure of the system now or in the future. Prediction is possible only because the system evolves slowly and therefore our accumulated information from observing it decays only slowly.

In this case, there may be an enormous difference between how well a model fits in-sample and how well it can forecast out-of-sample, and classical goodness of fit statistics may give little guidance about the latter. Also, having a large data sample for estimation does not guarantee that accurate parameter values can be computed. Indeed, expanding the estimation data set by adding observations from the distant past can easily make the estimates of the current state of the system worse rather than better. In summary, therefore, the more detailed and elaborate a model is, the better the fit one is generally able to obtain in-sample, but the faster the model tends to go off track when it is taken out-of-sample. Forecasting is a very different operation from in-sample estimation. In particular, financial markets behave very much like a whale. Nonetheless, it may be said that historical volatility computed over recent periods provides the most accurate forecast for both long and short-run horizons.

Summary and Conclusions
In conclusion, the task of any model is to describe the typical historical pattern of volatility and use this to forecast future episodes. In doing so, we noted that the researcher is drawn toward constructing very complex models, which typically fit better in-sample than out-of-sample. Despite their sophisticated composition, most volatility forecasting models continue to fail to convince investors of their designers' claims. Thousands of academics have devoted their entire careers to publishing models that supposedly are able to forecast volatility. Some authors have published well over 40 papers on this very topic (see Andersen, Bollerslev, Christoffersen, and Diebold (2005)), and yet none seems to deliver any improvement over the simple standard deviation. Andersen and Bollerslev (1997) attack the work of leading researchers such as Cumby, Figlewski, Hasbrouck and Jorion, among many others, arguing that they do not know how to implement their models correctly. Besides this controversy between believers in volatility forecasting models and the large majority of skeptics, there is a contentious battle among those same believers, each claiming that his model is superior to the rest. In 2007, before the global financial crisis, Andersen and Bondarenko once again surprised the academic community by claiming not only that their volatility forecasting model was superior, but also that they had mathematically demonstrated that future research was futile, since no future volatility forecasting model could beat theirs.

Notes
Note 1. The rate of return is the change in price plus the dividend received by stockholders during the period, all divided by the price of the investment at the beginning of the period.
Note 2. Contrarian traders go against the market by buying in bear markets and selling in bull markets. The relevant issue is whether contrarian activities lead to an increase in volatility. One way to examine the issue is to ask what would happen to stock prices if contrarian traders did not enter the market. It is conceivable that market rises and declines would be larger than they are, as contrarian traders' actions prevent prices from spiraling in one direction. Schwert (1990a) shows that the number of extreme price reversals has not been very high over recent years.
Note 3. Evidence from the U.S. on the relationship between the introduction of options and volatility indicates a significant fall in stock return volatility after options were introduced. See Conrad (1989), Damodaran and Lim (1991) and Skinner (1989). Precise reasons for the decline are unknown, but the arrival of private information, an increase in information processing and reduced transaction costs are likely candidates. Available evidence on the effects of the introduction of index futures on market volatility suggests that stock market volatility does not increase. See Gerety and Mulherin (1991) and Hodgson and Nicholls (1991).
A distribution whose kurtosis exceeds 3 has more mass in the tails than a Gaussian distribution with the same variance.

Note 4. See Hamilton (1994), p. 664.
Note 5. The skewness of a variable $Y_t$ with mean $\mu$ is represented by $E(Y_t - \mu)^3 / [\mathrm{Var}(Y_t)]^{3/2}$. A variable with a negative skew is more likely to be far below the mean than it is to be far above the mean. The kurtosis is $E(Y_t - \mu)^4 / [\mathrm{Var}(Y_t)]^2$.
Assume we are required to estimate $y_t$, the rate of return on a particular stock, in a linear regression model. Engle's (1982) insight was to set the conditional variance of a series of errors, the $\varepsilon_t$'s, as a function of lagged errors, time, parameters, and predetermined variables.