The Statistical Difference of Chinese Stock Market Risk before and after the Stock Index Futures Based on VaR Method

This paper examines the value at risk (VaR) of daily stock market returns before and after the introduction of stock index futures trading in China from a statistical perspective. The VaRs are estimated with the peaks over threshold (POT) method, which fits the tails of the return distributions well. The key empirical result is that, at the same significance levels, the VaRs of daily returns before the introduction of stock index futures are greater than those after it. This suggests that the market risk of the Chinese stock market decreased after the introduction of stock index futures.


Introduction
The link between stock index futures and stock market risk has interested researchers for a long time. When stock index futures are first introduced in a market, both investors and regulators are commonly concerned about the risk of the underlying stock index.
Stock index futures were launched at the China Financial Futures Exchange (CFFEX) and have been traded since April 16, 2010. The contracts are agreements to buy or sell the CSI 300, the underlying index of the first mainland stock index futures, at a preset value on an agreed date, and they are designed to allow investors to bet on and profit from both gains and declines in the market.
There are two stock exchanges in China, the Shanghai Stock Exchange and the Shenzhen Stock Exchange. The CSI 300, jointly issued by the Shanghai and Shenzhen stock exchanges on April 8, 2005, is compiled from a sample of 300 constituent stocks selected for their large size, good liquidity and representativeness in the China A-share market. Accounting for approximately 70% of the Shanghai and Shenzhen A-share market capitalization, the CSI 300 can reflect the overall trend of the Chinese stock market and is well diversified. Based on the preceding theoretical analysis, this study assumes that the risk of the CSI 300 will be better offset after the introduction of the index futures.
There are many studies on the risk of the underlying stock index after the introduction of stock index futures from the volatility perspective, but no consensus has been reached. Some studies find that the volatility of the underlying index increases after the introduction of stock index futures. However, many other studies suggest that the volatility of the spot stock market decreases after the introduction of stock index futures.
This paper is organized as follows. Section 2 reviews the literature. Section 3 discusses the econometric methodology. Section 4 briefly describes the data. Empirical results are presented in Section 5. Section 6 provides conclusions and a discussion of the study.

Literature Review
Damodaran (1990) reported a marginal increase in the variances of S&P 500 stocks after trading in S&P 500 index futures began. Lee and Ohk (1992) found increased volatility after the Nikkei 225 futures contract was introduced by the SIMEX. For Nikkei stocks, Eric C. Chang et al. (1999) found that stock index futures trading had increased spot portfolio volatility when some trading restrictions were absent.
However, many other studies suggested that the volatility of the spot stock market had decreased after the introduction of stock index futures. By examining the time series properties of returns in 25 markets around the world before and after the introduction of stock index futures, H. Gulen and S. Mayhew (2000) found that, except in the United States and Japan, volatility decreased or stayed roughly the same in most of the other countries in the sample, with statistically significant decreases in many cases. Thenmozhi (2002) showed that the introduction of stock index futures had reduced the volatility of spot index returns because of increased information flow. Raju and Karande (2003) also reported a decline in the volatility of the S&P CNX Nifty after the introduction of index futures. Drimbetas et al. (2007) suggested that the conditional volatility of the FTSE/ASE 20 index in the Greek stock market had fallen after the introduction of futures and options on the FTSE/ASE 20 index. S. Bandivadekar and S. Ghosh (2003) found that the volatility of both the BSE Sensex and the S&P CNX Nifty had declined in the period after index futures were introduced.
Besides the studies above, some other studies find no change in the volatility of the underlying stock index after the introduction of stock index futures. For instance, Freris (1990) studied the performance of the Hang Seng index in Hong Kong and found that the introduction of Hang Seng index futures had no significant influence on the volatility of the underlying stock index.
The previous literature on the effects of stock index futures trading has focused primarily on developed markets. Less work has been done on emerging markets such as Turkey (A. Kasman et al., 2008), Malaysia (W. C. Pok et al., 2004) and India (Raju M T et al., 2003). The Chinese stock market, a young market that has developed rapidly in recent years, has attracted many researchers. It can be regarded as representative of the emerging markets, so conclusions from this study may provide useful information for interpreting other, similar emerging stock markets. In addition, previous research has mainly focused on the impact of stock index futures on spot market volatility. Although the variance and other simple statistics of stock index volatility can roughly describe market risk, these statistics imply the unrealistic assumption that different volatilities of the stock index have the same probability distribution. Furthermore, although the variance or conditional variance (the GARCH model) widely used to measure market risk is easy to grasp, it does not put sufficient emphasis on extreme losses. VaR typically deals with the low-probability events in the tails of the asset return distribution (Z.R. Wang et al., 2010). In this paper, we employ VaR to measure stock market risk before and after the introduction of stock index futures from a statistical perspective. As extreme value theory (EVT) is a powerful and robust framework for studying the tail behavior of a distribution (R. Gencay et al., 2001) and can give better estimates and forecasts of risk (Z.R. Wang et al., 2010), the VaRs in this study are estimated on the basis of EVT.

Value at Risk
The attraction of VaR is its simplicity. In a single statistic, VaR estimates the potential loss to which a financial institution is exposed at a given level of confidence over a specified period. Since its inception in the early 1990s, VaR has become one of the most important methods for measuring the risk of an asset or portfolio. More and more financial regulators and institutions use it to manage risk, such as the group of 10 banks, the group of 30 banks, the Bank for International Settlements, the European Union and the Basel Committee.
The VaR($\alpha$) of a return $r$ can be defined by the equation $P(r \le \mathrm{VaR}(\alpha)) = \alpha$, where $\alpha$, the significance level, is usually a small percentage. When other conditions are held constant, the absolute value of VaR($\alpha$) becomes larger as $\alpha$ becomes smaller. According to the equation, VaR($\alpha$) depends mainly on the significance level, the time interval and the distribution of the portfolio return. Different significance levels and models lead to different VaR estimates.
The historical method, the variance-covariance method and Monte Carlo simulation, which are commonly applied to estimate VaR, give no insight into the possible losses that can occur in the tail. The value of VaR, however, is closely related to the tail of the distribution. An effective approach that focuses on the tail of the distribution is extreme value theory.
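As a minimal sketch of the definition above, the historical method reads VaR($\alpha$) directly off the ordered sample as the empirical $\alpha$-quantile. The function name `empirical_var` and the return values below are illustrative assumptions, not part of the data or code of this study.

```python
import math

def empirical_var(returns, alpha):
    """Empirical alpha-quantile of returns: roughly a fraction alpha
    of the sample lies at or below the returned value."""
    ordered = sorted(returns)
    # Position of the order statistic that leaves about alpha of the sample below it.
    idx = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[idx]

# Hypothetical daily returns (illustrative only, not CSI 300 data).
sample = [-0.05, -0.03, -0.02, -0.01, 0.0, 0.005, 0.01, 0.015, 0.02, 0.03]
var_10 = empirical_var(sample, 0.10)
var_20 = empirical_var(sample, 0.20)
```

With these inputs the 10% VaR is larger in absolute value than the 20% VaR, which illustrates that a smaller significance level moves VaR further into the tail. Because the estimate is a single order statistic, it carries little information about losses beyond the observed sample.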

Extreme Value Theory
Extreme value theory has been developing rapidly (more details can be found in Reiss and Thomas (2007) and Embrechts et al. (1997)), and many notable applications can be found in finance research (see Lan-Chih Ho et al., 1997; V. Fernandez, 2007; A. Cifter, 2011; R. Gencay et al., 2001 and other references). R. Gencay et al. (2003) suggested that extreme value theory is a powerful and yet fairly robust framework for studying the tail behavior of a distribution. In this study, VaR is estimated on the basis of extreme value theory.
According to Lan-Chih Ho et al. (1997), if the returns $r_1, \dots, r_n$ are drawn independently from the same distribution, then the distribution function of the sample maximum $X = \max(r_1, \dots, r_n)$ is
$$F_X(r) = [F_R(r)]^n, \qquad (1)$$
where $F_R$ is the population distribution function of the return. From Eq. (1) we can see that for any $r$ such that $F_R(r) < 1$, the limit distribution has $F_X(r) \to 0$ as $n \to \infty$, so Eq. (1) must be normalized to be interesting for large $n$.
The required normalization transforms the maximum with both a location parameter $\mu_n$ and a scale parameter $\sigma_n$:
$$x = \frac{X - \mu_n}{\sigma_n}, \qquad (2)$$
where both parameters are positive. If $n$ is large, or if $F_R$ is not known exactly, it is preferable to work with asymptotic distributions rather than Eq. (1). The limiting distribution of the normalized maximum is the generalized extreme value (GEV) distribution
$$F(x) = \begin{cases} \exp\left[-(1 + kx)^{-1/k}\right], & k \ne 0,\ 1 + kx > 0, \\ \exp\left[-\exp(-x)\right], & k = 0, \end{cases} \qquad (3)$$
where the GEV generalizes the three types of limiting distribution proposed by Gnedenko (1943) and $k$ is the shape parameter. When $k = 0$, the GEV is the Gumbel distribution; when $k > 0$, the Fréchet distribution; when $k < 0$, the Weibull distribution. The GEV has been widely used to estimate extreme values (see Ruey S. Tsay, 2009 for more details about the GEV).
The parameters of $F(x)$ can be estimated by the maximum likelihood method with MATLAB. Once the distribution function is known, it is easy to estimate VaR at a given significance level.
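The three limiting types can be illustrated with a short sketch of the standardized GEV distribution function, assuming the sign convention stated above (positive shape for Fréchet, negative for Weibull, zero for Gumbel). The function name `gev_cdf` is hypothetical, and the study itself reports using MATLAB. Some software uses the opposite sign for the shape parameter, so the convention should be checked before estimates are compared.

```python
import math

def gev_cdf(x, k):
    """Standardized GEV distribution function with shape k
    (k > 0 Frechet, k < 0 Weibull, k = 0 Gumbel)."""
    if k == 0:
        return math.exp(-math.exp(-x))
    t = 1.0 + k * x
    if t <= 0:
        # Outside the support: below the lower end point when k > 0,
        # above the upper end point when k < 0.
        return 0.0 if k > 0 else 1.0
    return math.exp(-t ** (-1.0 / k))

# As k approaches 0 the GEV value approaches the Gumbel value.
gumbel_value = gev_cdf(1.0, 0)
near_gumbel_value = gev_cdf(1.0, 1e-6)
```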
According to V. Marimoutou et al. (2009), extremes may be modeled in two different ways: by modeling the maximum of a collection of random variables, or by modeling the largest values over some high threshold. The latter, known as the 'peaks over threshold (POT)' model, is a more modern approach to extreme events. POT models are generally considered more appropriate for practical applications because they use the limited data more efficiently: all observations above the threshold are used to estimate the parameters of the tail. The POT method based on the GEV distribution will be used to estimate VaR in this paper.
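The POT method focuses on the conditional distribution of the returns that exceed a threshold. Given a threshold $\eta$ and the condition $r > \eta$, the conditional probability of $r \le y + \eta = x$ is
$$P(r \le y + \eta \mid r > \eta) = \frac{F(y + \eta) - F(\eta)}{1 - F(\eta)} \approx 1 - \left[1 + \frac{k y}{\sigma + k(\eta - \mu)}\right]^{-1/k}, \qquad (4)$$
where $y$ is the amount by which $r$ exceeds $\eta$, $\sigma$ is the scale parameter and $\mu$ is the location parameter of the GEV distribution function, and $k$ is the shape parameter. Balkema et al. (1974) and Pickands (1975) showed that the distribution function of the excess can be well approximated by the generalized Pareto distribution (GPD) for a sufficiently high threshold $\eta$. The GPD is
$$G_{k,\psi(\eta)}(y) = \begin{cases} 1 - \left[1 + \dfrac{k y}{\psi(\eta)}\right]^{-1/k}, & k \ne 0, \\ 1 - \exp\left[-\dfrac{y}{\psi(\eta)}\right], & k = 0, \end{cases} \qquad (5)$$
where $k$ is the shape parameter and $\psi(\eta)$ is the scale parameter of the GPD. According to Ruey S. Tsay (2009), when $k \to 0$, the GPD becomes an exponential distribution. An empirical estimate of $F(\eta)$ is $\hat{F}(\eta) = (N - N_\eta)/N$, where $N$ is the sample size and $N_\eta$ is the number of exceedances. Then
$$\hat{F}(x) = 1 - \frac{N_\eta}{N}\left[1 + \hat{k}\,\frac{x - \hat{\eta}}{\hat{\psi}}\right]^{-1/\hat{k}}, \qquad (6)$$
where $\hat{k}$ and $\hat{\eta}$ are the estimates of the shape parameter and the threshold respectively, and $\hat{\psi}$ is the estimate of the scale parameter. Given a sufficiently large probability $p$, the estimate $\hat{y}$ of $y$ fulfills the following equation:
$$p = 1 - \frac{N_\eta}{N}\left[1 + \hat{k}\,\frac{\hat{y}}{\hat{\psi}}\right]^{-1/\hat{k}}. \qquad (7)$$
For $\hat{y} = \mathrm{VaR} - \hat{\eta}$, VaR can be obtained from Eq. (7) by solving for
$$\mathrm{VaR} = \hat{\eta} + \frac{\hat{\psi}}{\hat{k}}\left\{\left[\frac{N}{N_\eta}(1 - p)\right]^{-\hat{k}} - 1\right\}. \qquad (8)$$
One of the crucial steps in estimating VaR with the POT method is to choose an appropriate $\eta$. The Hill plot (1975, 2005) and the mean excess function plot (2001) are two tools for determining $\eta$, and both will be used in this paper.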
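The following sketch evaluates the GPD distribution function for an excess $y$ and checks numerically that it approaches the exponential distribution as the shape parameter tends to zero. It assumes the convention used in this paper, in which a positive shape parameter corresponds to a fat tail; the function name `gpd_cdf` and the numerical inputs are illustrative assumptions.

```python
import math

def gpd_cdf(y, k, psi):
    """GPD distribution function of an excess y with shape k and
    scale psi > 0 (k > 0 fat-tailed, k < 0 bounded tail)."""
    if y < 0:
        return 0.0
    if k == 0:
        return 1.0 - math.exp(-y / psi)      # exponential case
    t = 1.0 + k * y / psi
    if t <= 0:
        return 1.0                           # beyond the upper end point (k < 0)
    return 1.0 - t ** (-1.0 / k)

# Hypothetical excess of 2% with scale 1.5%: nearly identical for k = 0 and tiny k.
exp_value = gpd_cdf(0.02, 0, 0.015)
near_exp_value = gpd_cdf(0.02, 1e-6, 0.015)

# Tail probability of a 10% excess: larger for a positive shape parameter.
fat_tail = 1.0 - gpd_cdf(0.10, 0.3, 0.015)
thin_tail = 1.0 - gpd_cdf(0.10, -0.3, 0.015)
```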

Data Analyzed
We collect the daily closing prices of the CSI 300 from www.finance.cn.yahoo.com. The daily logarithmic return is calculated as $r_t = \ln p_t - \ln p_{t-1}$, where $r_t$ denotes the daily return and $p_t$ is the closing price of the CSI 300 at time $t$. Fig. 2 shows the corresponding returns of the CSI 300 in each period. Table 1 reports the summary statistics of returns during the whole period, in period 1 and in period 2. The summary statistics for daily returns include the mean, median, maximum, minimum, standard deviation, variance, kurtosis, skewness, Jarque-Bera statistic, ADF statistic (1 lag) and Ljung-Box Q-statistic.
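The return calculation can be sketched as follows. The closing prices are hypothetical values chosen for illustration, and `log_returns` is an assumed helper name.

```python
import math

def log_returns(prices):
    """Daily logarithmic returns r_t = ln(p_t) - ln(p_{t-1})."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical closing prices (illustrative only, not CSI 300 quotes).
closes = [3000.0, 3030.0, 2999.7, 3014.7]
returns = log_returns(closes)
# Losses are returns multiplied by -1, so that large losses sit in the right tail.
losses = [-r for r in returns]
```

A convenient property of logarithmic returns is that they add up over time: the sum of the daily returns equals the logarithmic return over the whole span.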
The standard deviations of returns in both period 1 and period 2 are more than 1.6%. The return distributions in period 1 and period 2 are negatively skewed, suggesting that the distributions are not normal. The kurtosis values are smaller than that of the normal distribution, so the return distributions are not sharply peaked. The p-value of the Jarque-Bera statistic is very small, which indicates that the distributions of daily returns differ markedly from the normal distribution. The p-value of the Ljung-Box Q-statistic is large, even at 20 lags, so the null hypothesis that there is no serial autocorrelation in returns cannot be rejected. Table 1 thus indicates that returns in each period can be treated as independent but do not follow a normal distribution.
Note: The figures in parentheses are p-values.
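The normality check can be sketched with the Jarque-Bera statistic, which combines sample skewness and kurtosis. Under the null hypothesis of normality the statistic is asymptotically chi-square distributed with two degrees of freedom, whose survival function is $\exp(-JB/2)$. The function name and the input sample are illustrative assumptions and do not reproduce Table 1.

```python
import math

def jarque_bera(returns):
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2            # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return skew, kurt, jb, p_value

# Hypothetical symmetric sample (illustrative only).
skew, kurt, jb, p_value = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```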
To examine the homogeneity of variances in the two periods, statistical tests should be used. If the two populations are normally distributed, Bartlett's test can be used to test for equal variances. Its null hypothesis is that the two samples come from normal distributions with the same variance, against the alternative that they come from normal distributions with different variances. When the sample distributions are not normal, however, Levene's test is an effective alternative. The null hypothesis of Levene's test is that the two samples come from non-normal distributions with the same variance, and the alternative is that they come from non-normal distributions with different variances. As the samples are not from normal distributions, Levene's test is used to test the homogeneity of variance for the periods before and after futures. The variances of returns in period 1 and period 2 are 0.000333 and 0.000261 respectively (see Table 1), but the p-value of Levene's test statistic (see Table 2) is large enough that the null hypothesis that the variances of the return or loss distributions in the two periods are the same cannot be rejected. Variance, which describes how far the returns lie from the mean, therefore does not reveal any change in market risk before and after the introduction of stock index futures. The VaR approach, which focuses on the potential loss over a given time interval at a given significance level, is used to measure market risk in this study.
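A minimal sketch of Levene's test for two samples is given below. It assumes the original mean-centred form of the test (the text does not say whether deviations from the mean or from the median were used) and replaces the exact F(1, N-2) p-value with a large-sample chi-square(1) approximation, so it is an illustration, not the procedure behind Table 2. The function name and inputs are hypothetical.

```python
import math

def levene_two_sample(x, y):
    """Levene's W for two samples, based on absolute deviations
    from each sample mean, with an approximate p-value."""
    groups = [x, y]
    z = []
    for g in groups:
        m = sum(g) / len(g)
        z.append([abs(v - m) for v in g])
    n_total = sum(len(g) for g in z)
    n_groups = len(z)
    z_means = [sum(g) / len(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (zm - z_grand) ** 2 for g, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for g, zm in zip(z, z_means) for v in g)
    w = (n_total - n_groups) / (n_groups - 1) * between / within
    # Large-sample approximation: F(1, N - 2) is close to chi-square(1).
    p_approx = math.erfc(math.sqrt(w / 2.0))
    return w, p_approx

# Hypothetical samples: equal spread, then clearly different spread.
w_same, p_same = levene_two_sample([1, 2, 3, 4], [11, 12, 13, 14])
w_diff, p_diff = levene_two_sample([1, 2, 3, 4], [10, 20, 30, 40])
```

A large p-value, as in the first call, means that equality of variances cannot be rejected, which is the situation reported for the two periods.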

Empirical Findings
VaR based on extreme value theory is concerned only with the extreme values of losses. To simplify the calculation, we define losses as negative returns (returns multiplied by -1). Extreme value theory works with the right tail of the distribution; hence, we work with the negative return distribution, where the right tail corresponds to losses. An investigation of the loss distribution through a Q-Q plot is needed. A Q-Q plot conveys information about the underlying distribution of a sample: the sample quantiles are plotted against the quantiles of the assumed distribution. If the Q-Q plot is approximately linear, the sample is consistent with the assumed distribution. In extreme value theory and its applications, the Q-Q plot is typically drawn against the exponential distribution (i.e., a distribution with a medium-sized tail) to measure the fat-tailedness of a distribution. If the data follow an exponential distribution, the points on the graph lie along a straight line. If there is a concave presence, this would indicate a fat-tailed distribution relative to the exponential distribution, whereas a convex departure is an indication of a short-tailed distribution.
To determine the sign of the shape parameter, Q-Q plots of losses against the exponential distribution (displayed in Fig. 3) are needed. The exponential distribution has a medium-sized tail; see also Eq. (5): when $k \to 0$, the GPD coincides with the exponential distribution. The sample quantiles are on the vertical axis, and the quantiles of the exponential distribution, fitted by the maximum likelihood method, are on the horizontal axis.
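The points of such a Q-Q plot can be computed as in the following sketch. It assumes that only positive losses are compared with the exponential distribution, that the exponential scale is estimated by maximum likelihood (the sample mean), and that the plotting positions are $(i - 0.5)/n$; these choices and the function name are assumptions, since the text does not specify them.

```python
import math

def exponential_qq_points(losses):
    """(theoretical, sample) quantile pairs for a Q-Q plot of positive
    losses against an exponential distribution fitted by ML."""
    sample = sorted(v for v in losses if v > 0)
    n = len(sample)
    scale = sum(sample) / n                  # ML estimate of the exponential mean
    points = []
    for i, q in enumerate(sample, start=1):
        p = (i - 0.5) / n                    # plotting position (assumed choice)
        points.append((-scale * math.log(1.0 - p), q))
    return points

# Hypothetical losses (illustrative only); the negative value is a gain and is dropped.
pts = exponential_qq_points([0.01, 0.02, 0.005, -0.01, 0.03])
```

Plotting the first coordinate on the horizontal axis and the second on the vertical axis reproduces the layout described above.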
The key point in estimating the parameters of the GPD is to determine an appropriate threshold. A low threshold increases the number of observations (exceedances) but can also introduce observations from the center of the distribution, so that the estimation becomes biased. On the other hand, a high threshold reduces the number of observations. A combination of tools should be used to determine a threshold. As presented in 2.2, the Hill plot and the mean excess function are the two most common tools for this purpose. The Hill plot was proposed by Hill (1975).
The estimator of the shape parameter $k$ is constructed as follows:
$$\hat{k}_{q,N} = \frac{1}{q}\sum_{i=1}^{q} \ln r_{i,N} - \ln r_{q,N},$$
where $r$, as mentioned above, denotes the negative return, and $r_{1,N} \ge r_{2,N} \ge \dots \ge r_{N,N}$ are arranged in descending order. In a Hill plot, the order statistic $q$ on the x axis is plotted against the estimate of $k$ on the y axis. The threshold can be selected where the plot becomes stable. More details about the Hill plot can be found in Embrechts et al. (1997).
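The values behind a Hill plot can be sketched as below. The sketch assumes the form of the estimator given in Embrechts et al. (1997), that is, the average of the logarithms of the $q$ largest losses minus the logarithm of the $q$-th largest loss, and it uses only positive losses because the logarithm is otherwise undefined. The function name and inputs are hypothetical.

```python
import math

def hill_estimates(losses, q_max):
    """Hill estimates of the shape parameter for q = 1..q_max, using
    the positive losses sorted in descending order."""
    ordered = sorted((v for v in losses if v > 0), reverse=True)
    logs = [math.log(v) for v in ordered]
    estimates = []
    for q in range(1, min(q_max, len(logs)) + 1):
        # (1/q) * sum of ln r_(i) for i = 1..q, minus ln r_(q)
        estimates.append((q, sum(logs[:q]) / q - logs[q - 1]))
    return estimates

# Hypothetical losses chosen so that their logarithms are 2, 1 and 0.
hill = hill_estimates([math.exp(2.0), math.exp(1.0), 1.0], q_max=3)
```

Each pair holds the order statistic and the corresponding estimate; a threshold candidate is the loss at the order statistic where successive estimates stop changing much.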
In period 1, the estimate of the shape parameter remains fairly stable in the interval [1.09%, 1.47%], so 1.09% is a candidate threshold for period 1. Whether 1.09% is an appropriate threshold can be validated with the Q-Q plot against the GPD. In period 2, a good choice for the threshold seems to lie in the interval [0.99%, 1.30%]. The Q-Q plot can also be used to evaluate these candidate values for the threshold.
Another tool is the sample mean excess function (MEF), which is defined as follows:
$$e(\eta) = \frac{1}{N_\eta}\sum_{i=1}^{N_\eta}\left(r_{i,N} - \eta\right),$$
where $N_\eta$ is the number of exceedances, $r_{i,N}$ is the return that exceeds the threshold and $i$ is the order statistic. As in the Hill plot, $r_{i,N}$ is arranged in descending order.
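A minimal sketch of the sample mean excess function follows; the function name and the losses are hypothetical. For a GPD tail the mean excess is linear in the threshold, with an upward slope when the shape parameter is positive, which is why an approximately linear stretch of the MEF plot is used to locate a threshold.

```python
def mean_excess(losses, threshold):
    """Average amount by which losses exceed the threshold,
    or None when no loss exceeds it."""
    excesses = [v - threshold for v in losses if v > threshold]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)

# Hypothetical losses (illustrative only).
example_losses = [0.01, 0.02, 0.03, 0.04]
mef_value = mean_excess(example_losses, 0.015)
```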
The MEF is the average of the excess over a threshold and can be used to determine an appropriate threshold. The examination of the Hill plot and the mean excess function indicates that the appropriate threshold for period 1 is approximately 1.09%, which lies in the intersection of [1.09%, 1.47%] and [0.83%, 1.1%]. For the threshold 1.09%, the number of exceedances is 51, which constitutes 20.9% of the sample size of 244. Fig. 5 is the Q-Q plot of losses in period 1 with respect to the GPD. It indicates that the GPD with the threshold 1.09% describes the distribution of losses in period 1 well. The estimate of the shape parameter is -0.0949. Its sign, as expected, is negative. The estimate of the scale parameter is 0.0167. A percentile value in the tail can be estimated simply by substituting the estimated parameters of the generalized Pareto distribution into Eq. (8). All the parameters are estimated with MATLAB.
The threshold for losses in period 2 cannot be determined as easily. Fig. 4 suggests that an appropriate threshold should lie in [0.99%, 1.25%]. It is impossible to compare all the potential thresholds by Q-Q plot alone, because the number of choices is infinite. We can only obtain an approximate value of the appropriate threshold.
With an interval of about 0.1%, the candidate thresholds considered in this paper are 0.99%, 1.10%, 1.20% and 1.25%. The fit of these choices can be evaluated with the following Q-Q plots of the sample against the GPD with different thresholds. Fig. 6 indicates that losses in period 2 converge to the GPD with the threshold 1.25%. The number of exceedances is 41, which constitutes 17.1% of the sample size of 241. The corresponding shape parameter estimate is 0.0603. With the expected positive sign, the estimate of the shape parameter again confirms that losses in period 2 are fat-tailed. The estimate of the scale parameter is 0.0118. VaRs at different significance levels can also be obtained by substituting the estimates of the GPD parameters into Eq. (8).
(a) Q-Q plot of losses in period 2 against the GPD with the threshold 0.99%. The corresponding shape parameter estimate is 0.1327 and the scale parameter estimate is 0.0101.
(b) Q-Q plot of losses in period 2 against the GPD with the threshold 1.10%. The shape parameter estimate is 0.0902 and the scale parameter estimate is 0.0111.
(c) Q-Q plot of losses in period 2 against the GPD with the threshold 1.20%. The shape parameter estimate is -0.0112 and the scale parameter estimate is 0.0133.
(d) Q-Q plot of losses in period 2 against the GPD with the threshold 1.25%. The shape parameter estimate is 0.0603 and the scale parameter estimate is 0.0118.
From Eq. (8), the VaRs at different significance levels in the two periods, obtained with the estimated parameters, are displayed in Table 3. At the various significance levels, VaRs in period 1 are greater than those in period 2. Fig. 7 displays the excess of VaRs at different significance levels. Since quantiles higher than the 0.95th are of more concern in risk management applications, VaRs at the 7.5% and 10% significance levels have not been included in the figure.
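To make the substitution into Eq. (8) concrete, the sketch below evaluates the POT quantile with the thresholds, GPD estimates and exceedance counts reported above for the two periods. It assumes that Eq. (8) has the standard form $\mathrm{VaR} = \hat{\eta} + (\hat{\psi}/\hat{k})\{[(N/N_\eta)\,\alpha]^{-\hat{k}} - 1\}$, with $\alpha = 1 - p$ the significance level. The function name `pot_var` is hypothetical, and the printed values are not claimed to reproduce Table 3 exactly.

```python
def pot_var(alpha, threshold, shape, scale, n_obs, n_exceed):
    """VaR of losses at significance level alpha from the POT quantile
    formula (assumed form of Eq. (8), with p = 1 - alpha)."""
    ratio = n_obs / n_exceed * alpha
    return threshold + scale / shape * (ratio ** (-shape) - 1.0)

# Threshold, GPD estimates and counts as reported in the text.
period1 = dict(threshold=0.0109, shape=-0.0949, scale=0.0167, n_obs=244, n_exceed=51)
period2 = dict(threshold=0.0125, shape=0.0603, scale=0.0118, n_obs=241, n_exceed=41)

for alpha in (0.01, 0.025, 0.05):
    v1 = pot_var(alpha, **period1)
    v2 = pot_var(alpha, **period2)
    print(f"alpha={alpha:.3f}  period 1: {v1:.4f}  period 2: {v2:.4f}")
```

Under these assumptions the period 1 values exceed the period 2 values at the 1%, 2.5% and 5% levels, which agrees with the ordering described in the text.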
The expected value of the VaR violation ratio is the corresponding tail size (R. Gencay, 2003). For example, the expected VaR violation ratio at the 5% tail is 5%. The calculated value is the number of losses that exceed the VaR as a proportion of the sample size. A calculated value greater than the expected value indicates underestimation of the risk, while a value less than the expected value indicates overestimation. In period 1 (see Fig. 7(a)), the sample size is 244 observations. At the 1%, 2.5% and 5% significance levels, the calculated violation ratios are 1.23%, 2.05% and 4.92% respectively. The calculated values are approximately equal to the expected values, which indicates that the right tail of the loss distribution in period 1 is well fitted by the POT model. In period 2 (see Fig. 7(b)), at the 1%, 2.5% and 5% significance levels, the calculated violation ratios are 1.24%, 2.07% and 4.56% respectively. As in period 1, the calculated values are very close to the expected values, so the POT model also fits the right tail of the loss distribution in period 2 well.
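The violation ratio can be computed by counting the losses that exceed the estimated VaR. The sketch below uses hypothetical losses and a hypothetical VaR. It also checks an arithmetic observation, which is an inference and not a statement taken from the tables: the reported ratios are consistent with 3, 5 and 12 violations out of 244 observations in period 1 and with 3, 5 and 11 out of 241 in period 2.

```python
def violation_ratio(losses, var):
    """Share of observed losses that exceed the estimated VaR."""
    return sum(1 for v in losses if v > var) / len(losses)

# Hypothetical losses 0.1%, 0.2%, ..., 10% and a hypothetical VaR of 9.55%.
example_losses = [0.001 * i for i in range(1, 101)]
ratio = violation_ratio(example_losses, 0.0955)

# Ratios (in percent) implied by assumed violation counts in each period.
implied_period1 = [round(100 * c / 244, 2) for c in (3, 5, 12)]
implied_period2 = [round(100 * c / 241, 2) for c in (3, 5, 11)]
```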
From the above discussion, the POT model depicts the right tail of the loss distribution in each period well, so the VaRs estimated at different significance levels measure the market risk before and after the introduction of stock index futures with reasonable accuracy. The VaRs estimated in period 1 are greater than those in period 2. We can conclude that the market risk of the CSI 300 decreased after the introduction of stock index futures.

Conclusion and Discussion
The effect of index futures on the risk of the underlying stock market has been a common concern in the field of finance in recent years. In this paper, no evidence of autocorrelation in the daily returns of the CSI 300 has been found, and daily losses in each period are treated as independent. The statistical tests conducted in this paper indicate that the return distributions are neither normal nor sharply peaked. From a statistical point of view, the null hypothesis that the variances of the return populations in the two periods are the same cannot be rejected. Unlike most studies of other stock markets, we find that returns of the CSI 300 are thin-tailed on the right before the introduction of stock index futures and fat-tailed on the right after it, relative to the exponential distribution.
Using daily returns data from April 16, 2009 to April 15, 2011, we estimate the VaRs in the two periods with the POT method based on the generalized Pareto distribution. Both the Q-Q plots and the tests support the view that the POT model fits the tail of the CSI 300 loss distribution well. An empirical finding is that the VaRs of period 1 are greater than those of period 2, which indicates that the market risk of the CSI 300 decreased after the introduction of stock index futures. This may be explained by the following possible reasons. Stock index futures, with their shorting, hedging and arbitrage mechanisms, provide investors with an alternative way to profit. The shift of investors from the stock market to the futures market can ease stock index fluctuations. The findings of A. Antoniou et al. (2005) support the view that futures markets help stabilize the underlying spot markets by reducing the impact of feedback traders and thus attracting more rational investors, who make the markets more informationally efficient and provide investors with superior ways of managing risk.
Stock market risk is complicated to study. It may be influenced by many factors, including recent or anticipated market conditions. The introduction of stock index futures is one of the exogenous events affecting market risk. By employing VaR, this paper studies the difference in stock market risk before and after the introduction of stock index futures from a statistical perspective. However, more work should be done to confirm the conclusions of this paper.

Figure 5. Q-Q plot of loss in period 1 against generalized Pareto distribution (GPD)

Figure 6. Q-Q plot of losses in period 2 against generalized Pareto distribution (GPD). All parameters of GPD are estimated with MATLAB.

Figure 7. Excess of VaRs generated by generalized Pareto distribution: CSI 300

The CSI 300 in the two periods is displayed in Fig. 1. From April 27, 2009 to August 3, 2009, about 3 months, the index rose dramatically from 2513.29 and reached a new peak at 3787.03, then dropped to a local trough at 2830.27 on August 31, 2009. The index rose to 3668.83 on December 7, 2009. Another depression lasting almost 8 months followed, and the index dropped to 2512.65 on July 5, 2010. Then the index recovered and reached its peak in period 2 at 3548.57. The daily logarithmic return on the CSI 300 is calculated from the daily closing price.
The sample period runs from April 16, 2009 to April 15, 2011. Stock index futures were launched in China on April 16, 2010, which divides the whole time series into two phases. Period 1 runs from April 16, 2009 to April 15, 2010, before the introduction of stock index futures, and period 2 runs from April 16, 2010 to April 15, 2011, after the introduction of stock index futures.

Table 1. Summary statistics of returns (columns: returns during the whole period, returns in period 1, returns in period 2)

Table 2. Inference about variances of return distributions in two periods

Table 3. VaRs in each period at different significance levels