Empirical Value at Risk for Weakly Dependent Random Variables

In this work, we study the empirical estimator of the Value at Risk (VaR for short) for weakly dependent observations. Our approach rests on controlling the oscillation of the empirical process under a moment-inequality hypothesis. We provide general conditions which ensure the convergence of the empirical estimator of the VaR, and we prove a central limit theorem (CLT) for the difference between the estimator and the true VaR. We perform simulations for different sequences to illustrate our results. Finally, we apply the results to various sequences under mixing or covariance assumptions.


Introduction
The Value at Risk (VaR) is a method to evaluate financial risks. It summarizes the risk of loss in a single number, aggregating market risk across several classes of financial assets (stocks, bonds, etc.).
The VaR is a probabilistic measure of the possible loss over a given horizon. It represents a level of loss, for a financial position or a portfolio, which will be exceeded during a given period only with a chosen, typically small, probability.
The VaR is thus neither the loss one can expect nor the maximum loss one may suffer, but a level of loss which will be exceeded only with a fixed, typically small, probability.
Definition 1 (P&L and loss function) Let P_t be the value of a portfolio of assets at time t. The variation of the value of this portfolio over the interval [t, t + T], ΔP_t := P_{t+T} − P_t, is called the profit-and-loss (P&L) function, and X_t := −ΔP_t is called the loss function.
In practice T is fixed (e.g. one day or one week), so that ΔP_t := P_{t+1} − P_t.
Definition 2 (Value at Risk) The Value at Risk VaR(q) of a portfolio of assets over the period [t, t + 1] at the confidence level q ∈ (0, 1) is the smallest number x such that the probability that the loss X_t exceeds x is no larger than 1 − q. Formally,

VaR(q) := inf {x : P(X_t > x) ≤ 1 − q}, or equivalently VaR(q) := F_t^{-1}(q) = inf {x : F_t(x) ≥ q} := ξ, (1.1)

where F_t(x) = P(X_t ≤ x), x ∈ R, is the distribution function of X_t and F_t^{-1} its quantile function. Definition (1.1) clearly shows that knowledge of the distribution function (df, for short) of the r.v. X determines VaR(q). The function F is often assumed to be normal. However, many financial practitioners use historical distributions, which are far from normally distributed (see e.g. Cont, 2001). Moreover, historical data generally have an intertemporally dependent structure. Indeed, the assumption that the variables (X_i)_{1≤i≤n} (denoting the variations (−ΔP_i)_{1≤i≤n} in the value of a portfolio over the n periods) are i.i.d. is rarely satisfied in practice. Hence the need to take into account a possible dependence structure, or memory effect, in the observations. In order to model and measure this memory in the data, we consider two cases: correlations or mixing coefficients.
The main objective of this paper is therefore to provide tools to tackle the estimation of the VaR when F admits no parameterization or when the data exhibit some weak dependence. To do so, we use the empirical distribution function F_n(x) = (1/n) Σ_{i=1}^n I(X_i ≤ x), where x ∈ R and I is the indicator function, for a stationary sequence of dependent real-valued random variables (X_i)_{1≤i≤n}, to estimate the VaR.
We note that if we order the random variables as X_{n,1} ≤ X_{n,2} ≤ ... ≤ X_{n,n}, then the empirical estimator can be written as VaR_e(q) = X_{n,[nq]+1}, where [a] is the integer part of a.
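A minimal Python sketch of the empirical VaR as the order statistic X_{n,[nq]+1} (our reading of the formula above, with [nq] the integer part of nq):

```python
import math

def empirical_var(sample, q):
    """Empirical VaR: the ([nq]+1)-th order statistic of the losses."""
    ordered = sorted(sample)          # X_{n,1} <= ... <= X_{n,n}
    s = math.floor(len(sample) * q)   # [nq], the integer part of nq
    return ordered[s]                 # 0-based index s gives X_{n,[nq]+1}

# Usage: for losses 1, ..., 100 and q = 0.5, [nq] = 50, so the estimator
# picks the 51st order statistic.
losses = list(range(1, 101))
print(empirical_var(losses, 0.5))   # → 51
```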
Next, let us recall the definitions of some mixing coefficients, which quantify the dependence between σ-algebras and hence between variables.
Let (Ω, K, P) be a probability space and let A, B be two sub-σ-algebras of K. We define:

1) the α-mixing coefficient by α(A, B) = sup { |P(A ∩ B) − P(A)P(B)| : A ∈ A, B ∈ B };

2) the ρ-mixing coefficient by ρ(A, B) = sup { |corr(f, g)| : f ∈ L²(A), g ∈ L²(B) }, where corr(f, g) = Cov(f, g) / (‖f − E f‖₂ ‖g − E g‖₂);

3) the ϕ-mixing coefficient by ϕ(A, B) = sup { |P(B | A) − P(B)| : A ∈ A, P(A) > 0, B ∈ B }.

Finally, we say that a stationary sequence (X_i)_{i∈Z} is strong mixing, or α-mixing, if α(n) := α(σ(X_i, i ≤ 0), σ(X_i, i ≥ n)) → 0 as n → ∞.

The paper is organized as follows. Section 2 is devoted to the oscillation of the empirical process, defined for each f_x ∈ F by Z_n(f_x) = (1/√n) Σ_{i=1}^n (f_x(X_i) − E f_x(X_i)), where F is the set of characteristic functions of intervals of the form (−∞, x) for x ∈ R. We study the mean of the modulus of continuity of the empirical process, E sup |Z_n(f_x) − Z_n(f_y)|, the supremum being taken over pairs f_x, f_y ∈ F with ‖f_x − f_y‖_v ≤ δ. Our method is inspired by the work of Ben Hariz (2005), who studied the stochastic equicontinuity of empirical processes indexed by a family of functions.
In Section 3, which is the main part of this work, we prove the consistency of the empirical estimator as well as a central limit theorem for the VaR, i.e. the convergence in law of √n (VaR_e(q) − VaR(q)) to a centered normal distribution, whose asymptotic variance involves σ²_∞(ξ), assumed to satisfy 0 < σ²_∞(ξ) < ∞. In Section 4, several applications are discussed. Finally, Section 5 is devoted to simulations which illustrate the results.
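Written out in the standard notation for sample quantiles under weak dependence, the announced limit takes the following form; the long-run variance expression below is the classical one for the indicator sequence I(X_i ≤ ξ) and is given here as an illustrative reconstruction (τ²_∞ matches the limit N(0, τ²_∞) used in the simulation section):

```latex
% Standard form of the CLT for the empirical quantile under weak
% dependence, with xi = VaR(q) and f the density of F.
\[
  \sqrt{n}\,\bigl(\mathrm{VaR}_e(q) - \mathrm{VaR}(q)\bigr)
  \xrightarrow{\;d\;}
  \mathcal{N}\bigl(0,\; \tau_\infty^2\bigr),
  \qquad
  \tau_\infty^2 = \frac{\sigma_\infty^2(\xi)}{f^2(\mathrm{VaR}(q))},
\]
\[
  \sigma_\infty^2(\xi)
  = \operatorname{Var}\bigl(\mathbf{1}_{\{X_0 \le \xi\}}\bigr)
  + 2\sum_{k\ge 1} \operatorname{Cov}\bigl(\mathbf{1}_{\{X_0 \le \xi\}},
                                           \mathbf{1}_{\{X_k \le \xi\}}\bigr).
\]
```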

Oscillation of the Empirical Process
First, let us introduce the following assumptions:

H(X): (X_i)_{1≤i≤n} is a stationary sequence of real-valued random variables with common distribution function F.

H(p, X): for all real numbers 2 ≤ v < p < r ≤ ∞ and for any ε > 0, there exists a positive constant C such that a moment inequality of order p holds uniformly over f ∈ F for the partial sums of f(X_i) − E f(X_i).

H(F): 0 < a_n → 0 as n → ∞, and F has a density function f which is continuous with 0 < f(ξ).

In the proofs, C denotes a constant whose value may change from one line to another. We now focus on the modulus of continuity of the empirical process of (X_i)_{1≤i≤n}.
Theorem 1 Under conditions H(X) and H(p, X), the bound (2.1) on the modulus of continuity holds.

Proof of Theorem 1. Let q_0, k and q_1 ∈ N be such that q_0 ≤ k ≤ q_1. For 1 ≤ i ≤ 2^{v q_0} we define sets E_i, which form a partition of F. For δ ∼ 2^{−q_0}, i.e. q_0 ∼ −ln δ / ln 2, we define the refined partition (F_{i,j}). Let now Λ = {(i, j) : F_{i,j} ≠ ∅}. For every pair (i, j) ∈ Λ, we fix an element of F_{i,j} and denote it (φ_{i,j}, ψ_{i,j}).
Let f_x, f_y be a pair satisfying ‖f_x − f_y‖_v ≤ δ; then f_x, f_y ∈ F_{i,j} for some (i, j) ∈ Λ. We decompose the increment Z_n(f_x) − Z_n(f_y) accordingly and take expectations, which leaves two terms E_1 and E_2 to control. Putting ‖Z_n(f)‖_F = sup_{f∈F} |Z_n(f)|, we use the following inequality due to Pisier: for all random variables Z_1, Z_2, ..., Z_N, E max_{i≤N} |Z_i| ≤ N^{1/p} max_{i≤N} (E|Z_i|^p)^{1/p}.

Control of E_1: for f ∈ F we use the chaining decomposition f = π_{q_0}(f) + Σ_{k=q_0+1}^{q_1} (π_k(f) − π_{k−1}(f)), where π_k(f) takes values in a set of brackets of ‖·‖_v-norm at most 2^{−k} covering F, whose minimal cardinality (the bracketing number) satisfies N(k) ≤ 2 · 2^{vk}. Using Pisier's inequality and applying H(p, X) to h = g − π_{k−1}(g), we bound each level of the chain; the boundary term E_{1,q_1+1} is treated similarly. Since ‖Δ^x_{q_0}(i)‖_v ≤ 2^{−q_0} and ‖φ_{i,j} − ψ_{i,j}‖_v ≤ δ, Pisier's inequality and H(p, X) likewise control E_2, which gives (2.4). Thus, from (2.3) and (2.4) we conclude the announced bound. Since δ ∼ 2^{−q_0}, we take q_1 so that the two contributions balance; as q_0 < q_1 must hold, this imposes a lower bound on δ in terms of n.

Proof of Remark 1. The proof of the first point of Remark 1 follows the same steps as the proof of Theorem 1 up to inequality (2.2), which takes a simpler form in the case r = p, and the conclusion follows.

Limit Theorems for the Empirical VaR
In this section we apply the results of the previous section on the oscillation of the empirical process to deduce asymptotic results for VaR(q).
The proof of Theorem 2 is based on the two following lemmas.

Lemma 1 Under conditions H(X), H(F) and H(p, X) with ε ≤ p/2 − 1, for every a_n > 0 the probability that VaR_e(q) deviates from ξ by more than a_n is controlled.

Proof of Lemma 1. Let s = [nq] + 1. Following Sen (1972), the event {VaR_e(q) ≤ ξ − a_n} is rewritten in terms of the empirical distribution function. Using H(F) and a first-order Taylor expansion of F(ξ − a_n), then Markov's inequality together with H(p, X), one bounds the resulting probability; consequently, for 0 < a_n and ε ≤ p/2 − 1, the first term is controlled, which gives (3.1). For the second term, corresponding to {VaR_e(q) > ξ + a_n}, the same scheme applies: H(F) with a first-order Taylor expansion of F(ξ + a_n), then Markov's inequality and H(p, X), give the analogous bound (3.2). Thus, from (3.1) and (3.2) we conclude, for 0 < a_n and ε ≤ p/2 − 1, the statement of the lemma.

The following lemma studies the proximity between the normalized difference and the empirical process evaluated at ξ.

Lemma 2 Under conditions H(X), H(F) and H(p, X) with ε < p/2 − 1, the two quantities are asymptotically close.

Proof of Lemma 2. Let 0 < a_n and 0 < b_n. Since ε ≤ p/2 − 1 and 0 < a_n, Lemma 1 applies. If H(F) is verified, then F is locally Lipschitz, so for |y − ξ| ≤ a_n the increment F(y) − F(ξ) is of order a_n. In addition, by Markov's inequality and Theorem 1, the oscillation term is negligible. Consequently, by the definition of Z_n(f_x), we obtain the claim.

Proof of Theorem 2. By Lemmas 1 and 2 with a_n ≍ n^{−1/2} and b_n chosen accordingly, and again by Lemma 1 for a_n ≍ n^{−1/2}, relations (3.3), (3.4), (3.5) and (3.6) hold; Slutsky's theorem (Cramér, 1946, p. 254) then yields the conclusion.

Applications
In this section we apply the previous results to different classes of sequences. Using the findings of Hu (2003, p. 1124) and Peligrad (1985, Theorem 2.1, p. 1305), we treat the ϕ-mixing case. Making use of the results of Utev and Peligrad (2003, Theorems 2.1 and 2.2), we treat the ρ-mixing case, and the α-mixing case by means of the results in Shao and Yu (1996, Theorem 4.1) and Rio (1997, Theorem 7.2). We also consider nonlinear functionals of Gaussian sequences, to which we apply the results of Ben Hariz (2011) and Breuer and Major (1983).
Finally, we compare our results with those in the existing literature.

ϕ−mixing Process
Corollary 1 Under condition H(X), if the ϕ-mixing coefficient satisfies the summability condition required below, and if H(F) is verified, then for a_n ≍ n^{−1/2} the consistency and the CLT hold.
Proof of Corollary 1. When (X_i)_{i≥1} are identically distributed, a lemma of Hu (2003, p. 1124) shows that, under a summability condition on ϕ, there exists a positive constant K = K(p, ϕ(·)) such that for all n ≥ 1 and any f the required moment bound holds. Then H(p, X) is satisfied with ε = 0, v = 2 and p = r. We now apply Theorem 1. If H(F) is verified and a_n ≍ n^{−1/2}, then Lemma 1 with p > 2 gives the consistency. For the CLT we apply a result of Peligrad (1985, Theorem 2.1, p. 1305); the condition (L) therein can be checked under our assumptions, which completes the proof.

ρ−mixing Process
For a stationary sequence (X_i)_{i∈Z}, we define the ρ-mixing coefficient ρ(n) accordingly. We apply results of Utev and Peligrad (2003, Theorems 2.1 and 2.2) to prove the following:

Corollary 2 Under condition H(X), we assume H(ρ): there exist a real number 0 ≤ η < 1 and an integer N such that the mixing coefficient at lag N is at most η. If in addition the sequence (X_i)_{i≥1} is strongly mixing and H(F) is verified, then the consistency and the CLT hold.
Proof of Corollary 2. Assuming that condition H(ρ) is satisfied and the random variables are identically distributed, Utev and Peligrad (2003, Theorem 2.1) give, for any p > 2, a positive constant such that the moment inequality holds. We then apply Theorem 1 under H(p, X) with ε = 0, v = 2 and p = r. If H(F) is verified and a_n ≍ n^{−1/2}, then Lemma 1 with p > 2 gives the consistency. For the CLT we apply Utev and Peligrad (2003, Theorem 2.2, p. 105); their condition (2.5) can be checked under our assumptions.

α−mixing Process
Corollary 3 Under conditions H(X) and H(F), if the α-mixing coefficient satisfies a polynomial decay condition, then for a_n ≍ n^{−1/2} the consistency and the CLT hold.
Proof of Corollary 3. When (X_i)_{i≥1} are identically distributed, Shao and Yu (1996, Theorem 4.1) give, under a polynomial decay of α(n), a moment bound for some real numbers 2 ≤ v < p < r; Lemma 1 then yields the consistency. To determine the decay exponent θ which allows the application of Theorem 1, we need v < p < r together with the arithmetic constraint relating p, r and θ. If in addition 0 < σ²_∞ < ∞, then Rio (1997, Theorem 7.2) provides the CLT for the empirical process at ξ. Finally, applying Theorem 2 with a_n ≍ n^{−1/2}, we obtain the result.

Nonlinear Functional of Gaussian Sequences
Corollary 4 Let X_i = G(Z_i), where G is a measurable function and (Z_i) is a stationary Gaussian sequence with zero mean and covariance function r(i).
Proof of Corollary 4. The proof of this corollary is a consequence of the following results.

Lemma 3 (Ben Hariz, 2011) Let p be an even integer and assume that Σ_{i=0}^∞ |r(i)| < ∞. Then there exists a constant K = K(p, r) such that for all n > 0 the required moment bound holds.

We apply Lemma 3 and then Theorem 1. For the central limit theorem we apply the following result due to Breuer and Major (1983) (see also Csörgő and Mielniczuk, 1996, for a functional extension).
Lemma 4 Let (Z_i) be a stationary Gaussian sequence with a covariance function satisfying the summability condition of Breuer and Major; then the normalized partial sums of G(Z_i) satisfy a central limit theorem.

Comparison with the Existing Results of the Literature
• Sen (1972) proved that, for a ϕ-mixing sequence of random variables, the result holds if Σ_{i=0}^∞ ϕ^{1/2}(i) < ∞, which is stronger than our condition. Indeed, Σ_{i=0}^∞ ϕ^{1/2}(i) < ∞ requires an algebraic decay of the mixing coefficient ϕ(i), whereas Σ_{i=0}^∞ ϕ^{1/p}(2^i) < ∞ requires only a logarithmic decay.
• In 2005, Chen and Tang studied the nonparametric estimation of the Value at Risk (VaR) for a geometrically α-mixing sequence of random variables, that is, α(k) ≤ c ρ^k for k ≥ 1, c > 0 and ρ ∈ (0, 1).
Using a kernel estimator of the VaR, obtained by inverting the smoothed distribution function built from a kernel density K, they established the asymptotic behavior of the resulting estimator.
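Chen and Tang's estimator inverts a kernel-smoothed distribution function. The following is a minimal sketch of that idea, not their exact implementation; the Gaussian integrated kernel and the bandwidth n^(−1/5) are illustrative choices of ours:

```python
# Sketch of a kernel-smoothed VaR estimator: the empirical df is replaced
# by F_hat(x) = (1/n) * sum_i G((x - X_i)/h), with G the integrated kernel
# (here the standard normal cdf), and VaR_hat(q) solves F_hat(x) = q by
# bisection, which is valid because F_hat is increasing in x.
import math
import random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_var(sample, q, h=None):
    n = len(sample)
    if h is None:
        h = n ** (-0.2)                      # illustrative bandwidth rule
    f_hat = lambda x: sum(norm_cdf((x - xi) / h) for xi in sample) / n
    lo, hi = min(sample) - 10 * h, max(sample) + 10 * h
    for _ in range(80):                      # bisection on F_hat(x) = q
        mid = 0.5 * (lo + hi)
        if f_hat(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(1)
losses = [random.gauss(0.0, 1.0) for _ in range(5000)]
print(kernel_var(losses, 0.95))   # close to the Gaussian VaR(0.95) = 1.6449
```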
• Lahiri and Sun (2009) showed that, for an α-mixing sequence of random variables with α(n) ≤ d n^{−θ} and θ > 12, the empirical VaR_e(q) satisfies a Berry-Esseen-type bound: for a constant C > 0 and all n ≥ 1, the distance between the distribution of the normalized estimator and the standard normal distribution Φ is controlled. In particular, the CLT follows as n → ∞.
Observe that for the CLT to hold for strongly mixing sequences, we only need α(k) ≤ C k^{−θ} with θ > 1 + √2.
Remark 2 Our results also apply to discretely observed stochastic differential equations and stochastic volatility models. Indeed, Genon-Catalot et al. (2000) showed that, under some conditions, these models as well as their discrete versions are geometrically α- or ρ-mixing. Therefore the main hypothesis H(p, X) is fulfilled for any p ≥ 2. Regarding GARCH models, which are also widely used in financial modeling, we mention that Davis et al. (1999) showed that, under conditions on the moments of the innovations and on the Lyapunov exponent associated with the sequence, the square of a GARCH sequence is geometrically α-mixing. Hence our results also apply to GARCH models.

Simulation Studies
In this section we present numerical studies which illustrate the conditions under which VaR_e(q) converges to VaR(q). In these simulations we choose correlated Gaussian and Pareto sequences. In both cases we compare VaR(q), with q = 0.95, to its empirical estimate. For each set of parameters we run M = 10000 Monte Carlo replications and compute the mean absolute error MAE(n) between VaR_e(q) and VaR(q); we also give a 95% confidence interval for VaR(q). We consider three different models: first a correlated Gaussian sequence, then a correlated sequence with Pareto marginal distributions, and finally a stochastic volatility model.
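The Monte Carlo protocol can be sketched as follows, using an i.i.d. N(0, 1) sample for illustration (M is reduced from 10000 for speed; the true VaR(0.95) = 1.6449 of the standard normal serves as reference):

```python
# Sketch of the Monte Carlo protocol: for M replications of a sample of
# size n, compute the empirical VaR and average the absolute error against
# the true VaR(q). The i.i.d. Gaussian generator is for illustration only.
import math
import random

def empirical_var(sample, q):
    ordered = sorted(sample)
    return ordered[math.floor(len(sample) * q)]   # X_{n,[nq]+1}

def mae(n, q=0.95, true_var=1.6449, M=300, seed=0):
    rng = random.Random(seed)
    errs = []
    for _ in range(M):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errs.append(abs(empirical_var(sample, q) - true_var))
    return sum(errs) / M

print(mae(200), mae(1600))   # the MAE shrinks roughly like 1/sqrt(n)
```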

Case 1: Dependent Gaussian Process
Let (X_i)_{0≤i≤n} be a Gaussian sequence with zero mean, unit variance and correlation function r(i) = (1 + |i|)^{−α}, where α > 0. The parameter α tunes the strength of the dependence. In particular, α = ∞ corresponds to an i.i.d. sequence, whereas α = 0 (r(i) = 1) gives a perfectly correlated sequence.
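A Gaussian sequence with correlation (1 + |i|)^{−α} can be simulated, for moderate n, by a Cholesky factorization of its Toeplitz covariance matrix; the generation method is our choice, as the paper does not specify one:

```python
# Sketch: simulate a zero-mean, unit-variance Gaussian sequence with
# correlation r(i) = (1 + |i|)^(-alpha) via Cholesky factorization of the
# Toeplitz covariance matrix. (1 + t)^(-alpha) is convex and decreasing,
# hence a valid covariance function by Polya's criterion.
import numpy as np

def correlated_gaussian(n, alpha, rng):
    idx = np.arange(n)
    cov = (1.0 + np.abs(idx[:, None] - idx[None, :])) ** (-alpha)
    L = np.linalg.cholesky(cov)
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(0)
x = correlated_gaussian(2000, alpha=3.0, rng=rng)
q95 = np.sort(x)[int(np.floor(2000 * 0.95))]   # empirical VaR(0.95)
# For alpha > 0 this should be close to the Gaussian VaR(0.95) = 1.6449.
```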
We study the process T_n = √n (VaR_e(q) − VaR(q)), whose limiting variance is τ²_∞ = σ²_∞ / f²(VaR(q)). Here we recall that VaR(0.95) = 1.6449. In Figure 1 we plot the mean absolute error with a 95% confidence interval as a function of n for different values of α, with q = 0.95. Clearly MAE(n) goes to zero for large n, for any α > 0; the simulations thus show that VaR_e(q) is consistent whenever the correlation parameter α > 0. For α > 1, in Figure 2 we plot √n MAE(n) against n and see that it converges to a constant. In Figure 3 we see that MAE(n), as a function of α for different values of n with q = 0.95, tends to zero for large n. In Figure 4 we compare the histogram of T_n for α = 3 and n = 800 with the density of the Gaussian distribution N(0, τ²_∞). Clearly, for α > 1 the histogram of T_n is close to the normal density, confirming our result (5.1).

Case 2: Dependent Pareto Process
We now consider VaR_e(q) for a correlated Pareto sequence (X_i)_{0≤i≤n}. Recall that the Pareto distribution function is defined, for β > 0, by F(x) = 1 − x^{−β}, x ≥ 1. To construct a correlated Pareto sequence we set X_i = F^{-1}(Φ(Y_i)) = (1 − Φ(Y_i))^{−1/β}, where Φ is the distribution function of the Gaussian law N(0, 1) and (Y_i)_{0≤i≤n} is a correlated Gaussian sequence defined as in the previous example. As in the first case, we study the process T_n to illustrate the central limit theorem (see (5.1)). Here VaR(0.95) = 2.7144 for β = 3.
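The Gaussian-copula construction above can be sketched as follows; the explicit inverse F^{-1}(u) = (1 − u)^{−1/β} follows from the stated Pareto df:

```python
# Sketch: push a standard Gaussian Y through its own cdf Phi, then through
# the Pareto quantile function F^{-1}(u) = (1 - u)^(-1/beta). Dependence
# among the Y_i only changes the joint law; the marginal stays Pareto.
import math
import random

def pareto_from_gaussian(y, beta):
    u = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))   # Phi(y)
    return (1.0 - u) ** (-1.0 / beta)                # F^{-1}(u)

# Marginal check with i.i.d. Gaussians: VaR(0.95) = 0.05^(-1/3) = 2.7144.
random.seed(2)
xs = sorted(pareto_from_gaussian(random.gauss(0, 1), beta=3.0)
            for _ in range(20000))
print(xs[int(20000 * 0.95)])   # close to 2.7144
```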
In Figure 5 we plot MAE(n) with a 95% confidence interval as a function of n for different values of α, with q = 0.95. Clearly the MAE goes to zero for large n, for any α > 0; the simulations thus show that VaR_e(q) is consistent whenever the correlation parameter α > 0. For α > 1, in Figure 6 we plot √n MAE(n) against n and see that it converges to a constant. In Figure 7 we see that MAE(n), as a function of α for different values of n with q = 0.95, tends to zero for large n. In Figure 8 we compare the histogram of T_n for α = 3 and n = 800 with the density of the limiting distribution. Here again, for α > 1 the CLT is satisfied.

Case 3: Stochastic Volatility Models
We now estimate VaR(q) for a correlated sequence (X_i)_{0≤i≤n} with stochastic volatility, X_i = σ_i ε_i, where (ε_i)_{0≤i≤n} is an i.i.d. Gaussian N(0, 1) sequence and (σ_i)_{0≤i≤n} is a correlated Gaussian or Pareto sequence.
As in the first case, we study the process T_n to illustrate (5.1); here VaR(0.95) ≈ 1.5949 for the Gaussian volatility sequence and VaR(0.95) ≈ 2.4615 for the Pareto volatility sequence with β = 3. In Figure 9 we compare the histogram of T_n, for α = 3 and n = 800, with the density of the Gaussian distribution N(0, τ²_∞), in the two cases (Gaussian and Pareto distributions for σ_i). Here again, for α > 1 the CLT is satisfied.
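A minimal sketch of the stochastic volatility construction. Since the paper does not spell out how the Gaussian volatility is kept nonnegative, we take absolute values; this positivity device is our assumption, as is the i.i.d. placeholder volatility sequence:

```python
# Sketch of the stochastic volatility model X_i = sigma_i * eps_i with
# i.i.d. N(0,1) innovations eps_i. The volatility is a Gaussian sequence
# taken in absolute value so that sigma_i >= 0 (our assumption, not
# stated in the paper).
import random

def sv_sequence(sigmas, seed=3):
    rng = random.Random(seed)
    return [abs(s) * rng.gauss(0.0, 1.0) for s in sigmas]

# Usage with a placeholder i.i.d. volatility sequence; a dependent
# Gaussian or Pareto sequence would be plugged in here instead.
rng = random.Random(4)
sigmas = [rng.gauss(0.0, 1.0) for _ in range(1000)]
x = sv_sequence(sigmas)
```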

Figure 1. The mean absolute error MAE(n), with 95% confidence intervals, for a correlated Gaussian sequence with correlation function r(i) = (1 + |i|)^{−α}, plotted against the sequence length n for different values of the dependence parameter α

Figure 4. Comparison of the histogram of T_n, for a Gaussian sequence with α = 3 and n = 800, with the density of the Gaussian distribution N(0, τ²_∞)