Estimation of Generalized Pareto Distribution from Censored Flood Samples using Partial L-moments

The research is financed by the Ministry of Science, Technology and Innovation (MOSTI) Malaysia, under Vot 79346.

Abstract
The use of partial L-moments (PL-moments) for estimating hydrological extremes from censored data is compared to that of simple L-moments. Expressions for parameter estimation are derived to fit the generalized Pareto (GP) distribution using the PL-moments approach. A Monte Carlo analysis examined the sampling properties of PL-moments, and the results showed that, for flood samples censored at levels up to 0.2, they are similar to those of simple L-moments. Finally, both PL-moments and L-moments are used to fit the GP distribution to a series of 32 annual maximum flows of the River Golok in Kelantan, and it is found that PL-moments produce a better fit to the larger flow values than simple L-moments.


Introduction
In hydrology, the analysis of annual maximum series of environmental events such as floods aims to predict the magnitude of floods with relatively large return periods, such as 100 years. Wang (1990) pointed out that, because small floods are of little relevance to larger ones, including data on small floods when estimating high return period floods can sometimes have only nuisance value. Cunnane (1987) stated that these smaller values have only a nuisance value in the context of upper quantile estimation and also in model form testing and verification. He also suggested that it might be advantageous to intentionally censor (or eliminate) low-value observations and that, in such cases, a censored sample should be used.
A censored sample is obtained by censoring the low-value observations in a complete sample that lie below a certain measurement threshold level. In water quality analysis, censored values refer to values that are less than detection or measurement limits. These censored values may actually have been zero, or they may have been between zero and the measurement threshold, yet are recorded as zero (Kroll & Stedinger, 1996). Censoring is of two kinds, type I and type II. In type I censoring, all data below a fixed threshold value are censored, and the number of censored values is a random variable. Type II censoring is the reverse: a fixed number of data points is always censored, and the censoring threshold is a random variable (Schneider, 1986). The present study focuses on type II censoring.
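The distinction between the two censoring types can be illustrated with a minimal Python sketch. The function names and the flow values are hypothetical and serve only as an illustration; in particular, reporting the largest censored value as the type II threshold is an assumption of this sketch.

```python
def censor_type1(sample, threshold):
    """Type I: the threshold is fixed; the number of censored values is random."""
    xs = sorted(sample)
    kept = [x for x in xs if x >= threshold]  # values below the threshold are censored
    return kept, len(xs) - len(kept)


def censor_type2(sample, n0):
    """Type II: the number of censored values n0 is fixed; the threshold is random."""
    xs = sorted(sample)
    # Assumption: the implied threshold is taken as the largest censored value.
    threshold = xs[n0 - 1] if n0 > 0 else None
    return xs[n0:], threshold


# Made-up flows, for illustration only.
flows = [120.0, 35.0, 410.0, 80.0, 260.0]
kept1, n_censored = censor_type1(flows, 100.0)
kept2, implied_threshold = censor_type2(flows, 2)
```

For the same data, type I returns a count that depends on the sample, whereas type II returns a threshold that depends on the sample.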
Since probability weighted moments (PWMs) were first introduced by Greenwood et al. (1979), they have been widely applied in many fields. PWMs have theoretical advantages over conventional moments: they can characterize a wider range of distributions and are robust to the presence of outliers in the sample. Experience also shows that PWMs are less subject to bias in estimation (Hosking and Wallis, 1997). PWMs have been extended to partial PWMs (PPWMs) and used in distribution parameter estimation and quantile function derivation (Deng & Pandey, 2009). As linear combinations of PWMs, partial L-moments were proposed by Wang (1990) to deal with censored samples.
In analyzing censored data of hydrological extremes, Wang (1990, 1996) used PL-moments to estimate the GEV distribution. Censored samples yielded high quantile estimates that were almost as efficient as those obtained from uncensored samples. Moisello (2007) used PPWMs and compared them with PWMs in estimating quantile functions. The results showed that PPWMs could constitute a valid tool. Bhattarai (2004) followed the same approach as Wang (1990) in investigating the sampling properties of the GEV distribution. It was found that, in some situations, PL-moments can produce a better fit to the larger flow values than simple L-moments.
The purposes of this study are to derive the expressions of PL-moments from uncensored flood samples and to fit the parameters of the GP distribution using the PL-moments approach. This study extends the work of Bhattarai (2004) to the GP distribution and uses simulated data to evaluate the sampling properties of simple L-moments and PL-moments.

Parameters estimation of GP distribution using PL-moments
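The PWMs of a random variable x with cumulative distribution function F(x) were formally defined by Greenwood et al. (1979) as

$$M_{p,r,s} = E\left[x^{p} F^{r} (1-F)^{s}\right] = \int_{0}^{1} [x(F)]^{p} F^{r} (1-F)^{s}\, dF \tag{1}$$

where p, r and s are real numbers. When p = 1 and s = 0, the moments become

$$\beta_r = M_{1,r,0} = \int_{0}^{1} x(F)\, F^{r}\, dF \tag{2}$$

For an ordered sample x(1) ≤ x(2) ≤ ... ≤ x(n), the following statistic is defined by Wang (1990) as an unbiased estimator of β_r:

$$b_r = \frac{1}{n} \sum_{i=1}^{n} \frac{(i-1)(i-2)\cdots(i-r)}{(n-1)(n-2)\cdots(n-r)}\, x_{(i)} \tag{3}$$

PWMs so defined can only be used for a complete sample. L-moments are linear functions of PWMs. The concept of PWMs, however, can be easily extended so as to be applied to a censored sample. The partial PWMs can be defined as

$$M'_{p,r,s} = \int_{F_0}^{1} [x(F)]^{p} F^{r} (1-F)^{s}\, dF \tag{4}$$

where F0 = F(x0), x0 being the censoring threshold. Partial L-moments are variants of L-moments and are also analogous to the PPWMs. When p = 1 and s = 0, the PPWMs become

$$\beta'_r = \int_{F_0}^{1} x(F)\, F^{r}\, dF \tag{5}$$

The unbiased estimator of β′_r is given by Wang (1990) as

$$b'_r = \frac{1}{n} \sum_{i=1}^{n} \frac{(i-1)(i-2)\cdots(i-r)}{(n-1)(n-2)\cdots(n-r)}\, x^{*}_{(i)} \tag{6}$$

where

$$x^{*}_{(i)} = \begin{cases} 0, & x_{(i)} \le x_0 \\ x_{(i)}, & x_{(i)} > x_0 \end{cases} \tag{7}$$

The relationship between L-moments and PWMs is given by Hosking (1990) as

$$\lambda_1 = \beta_0 \tag{8}$$

$$\lambda_2 = 2\beta_1 - \beta_0 \tag{9}$$

$$\lambda_3 = 6\beta_2 - 6\beta_1 + \beta_0 \tag{10}$$

$$\lambda_4 = 20\beta_3 - 30\beta_2 + 12\beta_1 - \beta_0 \tag{11}$$

where β_0, β_1, β_2 and β_3 are the PWMs and λ_1, λ_2, λ_3 and λ_4 are the first four L-moments. Similar linear relations can be established between PPWMs and the PL-moments. The level of censoring, F0, which is selected a priori, determines the number of sample data points to be censored as

$$n_0 = n F_0 \tag{12}$$

where n is the length of the uncensored sample and n0 is the number of values in the sample that do not exceed x0 (the censored data points).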
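The sample estimators described above can be sketched in Python as follows. The sketch assumes the standard unbiased PWM estimator, b_r = (1/n) Σ [(i−1)(i−2)...(i−r)] / [(n−1)(n−2)...(n−r)] x(i), and obtains the partial version by letting values that do not exceed x0 contribute zero. The function names and the four-point sample are hypothetical.

```python
def sample_pwm(sample, r, x0=None):
    """Unbiased estimate of the r-th PWM (requires r < n).

    If a censoring threshold x0 is given, values not exceeding x0
    contribute zero, which gives the partial PWM estimate instead.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        if x0 is not None and x <= x0:
            continue  # censored observation: treated as zero
        weight = 1.0
        for j in range(1, r + 1):
            weight *= (i - j) / (n - j)
        total += weight * x
    return total / n


def l_moments(b0, b1, b2, b3):
    """First four L-moments as linear combinations of the PWMs."""
    return (
        b0,
        2 * b1 - b0,
        6 * b2 - 6 * b1 + b0,
        20 * b3 - 30 * b2 + 12 * b1 - b0,
    )


# Made-up sample, for illustration only.
data = [1.0, 2.0, 3.0, 4.0]
b = [sample_pwm(data, r) for r in range(4)]
lam1, lam2, lam3, lam4 = l_moments(*b)
b0_partial = sample_pwm(data, 0, x0=2.0)
b1_partial = sample_pwm(data, 1, x0=2.0)
```

Passing the same linear combinations the partial estimates instead of the complete-sample ones gives the corresponding sample PL-moments.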

Derivation of parameters estimation of GP distribution using PL-moments
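The generalized Pareto distribution is a special case of the Wakeby distribution. The reparameterized form of the Wakeby distribution is

$$x(F) = \zeta + \frac{\alpha}{\beta}\left[1 - (1-F)^{\beta}\right] - \frac{\gamma}{\delta}\left[1 - (1-F)^{-\delta}\right] \tag{13}$$

If γ = 0 and β = k in equation (13), the Pareto distribution results as

$$x(F) = \zeta + \frac{\alpha}{k}\left[1 - (1-F)^{k}\right] \tag{14}$$

The distribution function F = F(x) is explicitly written

$$F(x) = 1 - \left[1 - \frac{k(x-\zeta)}{\alpha}\right]^{1/k} \tag{15}$$

The PPWMs of the GP distribution are given by Moisello (2007) in equation (16) and the equations that follow it, where F0 = F(x0), x0 being the censoring threshold. Further relations, up to equation (20), are obtained from equations (16) and (17). When F0 is known, one can replace β′_r by b′_r and estimate the parameters ζ, α and k as the solutions of equations (16) to (20). The exact solution of equation (20) requires iterative methods, which are cumbersome. The following simple method proposed by Wang (1990) can be used instead.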
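As a check on the quantities discussed above, the following Python sketch evaluates the GP quantile function x(F) = ζ + (α/k)[1 − (1 − F)^k], its inverse, and the partial PWMs. It assumes the partial PWM is the integral of x(F) F^r over F0 ≤ F ≤ 1; the closed form is derived here by direct integration and may be arranged differently from the expressions of Moisello (2007). All function names are hypothetical.

```python
import math


def gp_quantile(F, zeta, alpha, k):
    """GP quantile function; the k = 0 limit is the exponential distribution."""
    if abs(k) < 1e-12:
        return zeta - alpha * math.log(1.0 - F)
    return zeta + (alpha / k) * (1.0 - (1.0 - F) ** k)


def gp_cdf(x, zeta, alpha, k):
    """GP distribution function, valid for x inside the support."""
    if abs(k) < 1e-12:
        return 1.0 - math.exp(-(x - zeta) / alpha)
    return 1.0 - (1.0 - k * (x - zeta) / alpha) ** (1.0 / k)


def gp_ppwm_closed(r, F0, zeta, alpha, k):
    """Integral of x(F) * F**r over [F0, 1], for k != 0 and k > -1."""
    u0 = 1.0 - F0
    A = (1.0 - F0 ** (r + 1)) / (r + 1)  # integral of F**r
    # Integral of (1-F)**k * F**r, via the binomial expansion of (1-u)**r.
    I = sum(
        math.comb(r, j) * (-1) ** j * u0 ** (k + j + 1) / (k + j + 1)
        for j in range(r + 1)
    )
    return (zeta + alpha / k) * A - (alpha / k) * I


def gp_ppwm_numeric(r, F0, zeta, alpha, k, steps=20000):
    """Midpoint-rule check of the same integral."""
    h = (1.0 - F0) / steps
    total = 0.0
    for i in range(steps):
        F = F0 + (i + 0.5) * h
        total += gp_quantile(F, zeta, alpha, k) * F ** r
    return total * h
```

Agreement between the closed form and the numerical integral for a few values of r gives some confidence in the algebra before it is used for estimation.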
Let z equal the right-hand side of equation (20); this defines equation (21). When z is plotted against k for fixed F0, the curve is very smooth. The exact location of the curve changes with the value of F0. The curve can be accurately approximated by a quadratic function of the form

$$z = a_0 + a_1 k + a_2 k^{2} \tag{22}$$

For fixed F0, three z values can be calculated by equation (21) corresponding to three chosen k values, e.g. k = −0.4, −0.1 and +0.4 as the limiting form of equation (21). Substituting these three values into equation (22) produces a set of linear equations. One can then find the solutions for a0, a1 and a2 corresponding to that F0.
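The three-point quadratic fit described above can be sketched as follows. Because the exact form of z is not reproduced here, the sketch assumes a hypothetical ratio z(k) = (I2/A2 − I0/A0) / (I1/A1 − I0/A0), where A_r is the integral of F^r and I_r the integral of (1 − F)^k F^r over F0 ≤ F ≤ 1; this ratio depends only on k and F0, but it is not necessarily the ratio used by the authors. The three linear equations are solved by divided differences.

```python
import math


def z_of_k(k, F0):
    """Hypothetical z(k): a ratio of GP partial-PWM terms free of zeta and alpha.

    Undefined (0/0) at exactly k = 0, where a limiting form would be needed.
    """
    u0 = 1.0 - F0

    def ratio(r):
        A = (1.0 - F0 ** (r + 1)) / (r + 1)
        I = sum(
            math.comb(r, j) * (-1) ** j * u0 ** (k + j + 1) / (k + j + 1)
            for j in range(r + 1)
        )
        return I / A

    return (ratio(2) - ratio(0)) / (ratio(1) - ratio(0))


def fit_quadratic(ks, zs):
    """Coefficients (a0, a1, a2) of the quadratic through three (k, z) points."""
    (k1, k2, k3), (z1, z2, z3) = ks, zs
    d12 = (z2 - z1) / (k2 - k1)
    d23 = (z3 - z2) / (k3 - k2)
    a2 = (d23 - d12) / (k3 - k1)
    a1 = d12 - a2 * (k1 + k2)
    a0 = z1 - a1 * k1 - a2 * k1 ** 2
    return a0, a1, a2


# Fit at the three k values quoted in the text, here for F0 = 0.2.
nodes = (-0.4, -0.1, 0.4)
a0, a1, a2 = fit_quadratic(nodes, [z_of_k(k, 0.2) for k in nodes])
```

The coefficients have to be recomputed for every censoring level, since the curve shifts with F0.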
When z is replaced by its sample estimate and substituted in equation (22), one can find the estimate for k. The other two parameters, α and ζ, can then be estimated successively using the relationships in equations (16), (17) and (19).
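The successive estimation can be sketched as follows, under stated assumptions. The sketch assumes that the partial PWM of the GP distribution satisfies β′_r / A_r = ζ + (α/k)(1 − I_r/A_r), with A_r and I_r the integrals of F^r and (1 − F)^k F^r over F0 ≤ F ≤ 1, and that z is the ratio (β′_2/A_2 − β′_0/A_0) / (β′_1/A_1 − β′_0/A_0). These relations follow from direct integration but are not necessarily the form used by the authors. Choosing the quadratic root closest to k = 0 is also an assumption of this sketch, and all names are hypothetical.

```python
import math


def A(r, F0):
    """Integral of F**r over [F0, 1]."""
    return (1.0 - F0 ** (r + 1)) / (r + 1)


def ratio(r, k, F0):
    """I_r / A_r, with I_r the integral of (1-F)**k * F**r over [F0, 1]."""
    u0 = 1.0 - F0
    I = sum(
        math.comb(r, j) * (-1) ** j * u0 ** (k + j + 1) / (k + j + 1)
        for j in range(r + 1)
    )
    return I / A(r, F0)


def solve_k(z_hat, a0, a1, a2):
    """Root of a0 + a1*k + a2*k**2 = z_hat closest to k = 0 (assumes a2 != 0)."""
    disc = a1 * a1 - 4.0 * a2 * (a0 - z_hat)
    if disc < 0.0:
        raise ValueError("z_hat is outside the range of the fitted quadratic")
    roots = [(-a1 + s * math.sqrt(disc)) / (2.0 * a2) for s in (1.0, -1.0)]
    return min(roots, key=abs)


def estimate_gp(b0, b1, b2, F0, coeffs):
    """Estimate (zeta, alpha, k) from three partial PWMs; requires k != 0."""
    c0, c1, c2 = b0 / A(0, F0), b1 / A(1, F0), b2 / A(2, F0)
    z_hat = (c2 - c0) / (c1 - c0)
    k = solve_k(z_hat, *coeffs)
    alpha = -k * (c1 - c0) / (ratio(1, k, F0) - ratio(0, k, F0))
    zeta = c0 - (alpha / k) * (1.0 - ratio(0, k, F0))
    return zeta, alpha, k
```

In use, b0, b1 and b2 would be the sample partial PWMs and coeffs the quadratic coefficients fitted for the chosen F0; because the quadratic is only an approximation, the estimates carry a small approximation error even for exact inputs.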

Simulation study
Monte Carlo simulation was used to investigate the sampling properties of L-moments and PL-moments in estimating the parameters of the GP distribution from censored flood samples. For this purpose, simulations were performed for sample sizes n of 20, 30, 40 and 50 and a parent distribution with ζ = 0.0, α = 1.0 and k varying from −0.4 to +0.4. Different levels of censoring threshold are considered, namely F0 = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5 and 0.6. When F0 = 0.0, the PL-moments reduce to the ordinary L-moments. The number of replications M used in the simulations for each case is 10 000.
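The sampling step of such an experiment can be sketched in Python as follows. The sketch assumes inverse-transform sampling from the GP quantile function x(F) = ζ + (α/k)[1 − (1 − F)^k] and takes the number of censored points as n F0 rounded to the nearest integer. The random number generator, the seed and all function names are assumptions of this sketch and are not taken from the study.

```python
import math
import random

SAMPLE_SIZES = (20, 30, 40, 50)
CENSORING_LEVELS = (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6)


def gp_quantile(F, zeta, alpha, k):
    """GP quantile function; the k = 0 limit is the exponential distribution."""
    if abs(k) < 1e-12:
        return zeta - alpha * math.log(1.0 - F)
    return zeta + (alpha / k) * (1.0 - (1.0 - F) ** k)


def simulate_samples(n, zeta, alpha, k, replications, seed=0):
    """Draw `replications` ordered GP samples of size n by inverse transform."""
    rng = random.Random(seed)
    return [
        sorted(gp_quantile(rng.random(), zeta, alpha, k) for _ in range(n))
        for _ in range(replications)
    ]


def censored_count(n, F0):
    """Number of lowest observations censored at level F0 (rounded n * F0)."""
    return round(n * F0)


# A small illustrative run; the study itself uses 10 000 replications per case.
samples = simulate_samples(n=20, zeta=0.0, alpha=1.0, k=0.2, replications=5, seed=1)
n0 = censored_count(20, 0.2)
uncensored_part = [s[n0:] for s in samples]
```

For the sample sizes and censoring levels listed above, n F0 is always a whole number, so the rounding only guards against floating-point error.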
Two commonly used performance measures in such cases are bias and efficiency, evaluated here for four quantiles with return periods of 50, 100, 200 and 500 years, i.e. x(F = 0.980), x(F = 0.990), x(F = 0.995) and x(F = 0.998). Efficiency is defined as the ratio of the mean square error of the L-moments estimator to that of the PL-moments estimator:

$$\varphi = \frac{\text{mean square error of estimator using L-moments}}{\text{mean square error of estimator using PL-moments}} \tag{26}$$
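These performance measures can be sketched in Python as follows. The sketch assumes that bias is the plain mean error of the quantile estimates (the study may normalize it differently) and that a return period T corresponds to the non-exceedance probability F = 1 − 1/T. The function names and the numbers in the example are hypothetical.

```python
def nonexceedance(T):
    """Non-exceedance probability for a return period of T years."""
    return 1.0 - 1.0 / T


def bias(estimates, true_value):
    """Mean error of the estimates (assumed, unnormalized definition)."""
    return sum(estimates) / len(estimates) - true_value


def mse(estimates, true_value):
    """Mean square error of the estimates."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def efficiency(l_estimates, pl_estimates, true_value):
    """phi = MSE(L-moments) / MSE(PL-moments); phi > 1 favours PL-moments."""
    return mse(l_estimates, true_value) / mse(pl_estimates, true_value)


# Made-up quantile estimates, for illustration only.
phi = efficiency([1.0, 3.0], [1.5, 2.5], true_value=2.0)
probabilities = [nonexceedance(T) for T in (50, 100, 200, 500)]
```

With this definition, values of φ greater than one indicate that the PL-moments estimator has the smaller mean square error.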

Monte Carlo simulation study
To evaluate the performance of PL-moments (with data censored at different censoring thresholds F0) and simple L-moments (with uncensored data), two sampling properties were used, namely the bias and efficiency of the different quantile estimators x(F).
The results indicate that the bias of quantile estimates obtained from PL-moments with censoring thresholds F0 ≤ 0.2, for sample sizes n = 20 to 50, is very similar to that of simple L-moments. For censoring thresholds F0 > 0.2, the bias becomes increasingly negative as F0 increases. Figure 1(a) and 1(b) show the bias of the quantile estimator x(F = 0.998) at different censoring thresholds F0.
The bias for simple L-moments (i.e. F0 = 0.0) and PL-moments with F0 ≤ 0.2 lies between −1 and +1, but for F0 > 0.2 the negative bias increases substantially. Figure 1(c) and 1(d) indicate that the above results also hold for the other estimators x(F = 0.980), x(F = 0.990) and x(F = 0.995). It can be concluded that the method of PL-moments with censoring levels up to 0.2 is almost as unbiased as the method of simple L-moments.
Table 1 gives the bias of the quantile estimators for GP shape parameters k = −0.1 and k = +0.1, for various censoring thresholds F0 and sample sizes (n = 20 to 50). For both shape parameters, as highlighted, the bias of the quantile estimators x(F = 0.995) and x(F = 0.998) from the method of PL-moments is smaller and closer to zero than that from simple L-moments. This holds for PL-moments with censoring threshold F0 ≤ 0.2. It indicates that under certain conditions, such as prediction at higher return periods, the method of PL-moments can perform better than simple L-moments.
Bias and efficiency of the quantile estimator x(F = 0.995) from simple L-moments and PL-moments, at different censoring levels F0 and sample sizes n, were plotted against the GP shape parameter k.
It is observed that, for k > −0.1, the bias is almost unaffected by increases in the GP shape parameter k. As Figure 2(a) and 2(b) show, bias increases only slightly for k > −0.1. The results also show that the bias from PL-moments with censoring threshold F0 = 0.1 is closest to that of simple L-moments, followed by PL-moments at F0 = 0.2, 0.3 and 0.4. In addition, the values of absolute bias for k > −0.1 and F0 ≤ 0.2 are closest to zero and are even lower than those from simple L-moments.
The efficiency of the quantile estimators x(F) is plotted against the GP shape parameter k for different censoring thresholds F0 and sample sizes n. PL-moments at F0 = 0.2 always lead to higher efficiency for k > −0.1 but lower efficiency for k ≤ −0.1 compared to F0 = 0.1, as shown in Figure 2(c) and 2(d). Overall, PL-moments at both F0 = 0.1 and F0 = 0.2 are always more efficient than L-moments for k > −0.1 and less efficient for k ≤ −0.1.

Data analysis
To illustrate the application of the GP distribution using the PL-moments approach to the analysis of censored flood samples, a set of annual maximum flows for station 6019411 on the Golok River, located in Kelantan, Malaysia, is presented in this study. The data consist of 32 annual maxima from 1977 to 2008 and are listed in Table 2. The flood data were obtained from the Department of Irrigation and Drainage, Ministry of Natural Resources and Environment, Malaysia.
The parameter estimates for the data set, using simple L-moments and PL-moments at censoring threshold F0 = 0.2, are given in Table 3. Observed and computed frequency curves for the data set are plotted in Figure 3. The observed data values are plotted against the corresponding EV1 reduced variate.
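The EV1 (Gumbel) reduced variate used for such a frequency plot is y = −ln(−ln F). The sketch below computes it for an ordered sample of size n. The plotting position is not stated in the text; the Gringorten formula F_i = (i − 0.44) / (n + 0.12) is assumed here purely for illustration, and the function names are hypothetical.

```python
import math


def ev1_variate(F):
    """EV1 (Gumbel) reduced variate for non-exceedance probability F."""
    return -math.log(-math.log(F))


def ev1_plotting_variates(n, a=0.44):
    """Reduced variates for ranks 1..n of an ascending sample.

    Assumption: plotting position F_i = (i - a) / (n + 1 - 2a), which is
    the Gringorten formula for a = 0.44.
    """
    return [ev1_variate((i - a) / (n + 1 - 2 * a)) for i in range(1, n + 1)]


# Horizontal-axis positions for a record of 32 annual maxima.
y = ev1_plotting_variates(32)
```

Plotting the ordered flows against y, and the fitted quantile curves against the same variate, gives a frequency plot of the kind described above.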
From this fitted plot, it is generally observed that the frequency curve obtained by PL-moments with censoring threshold F0 = 0.2 lies much closer to the observed data than that obtained by simple L-moments.

Summary and conclusions
Partial L-moments (PL-moments) are an extension of L-moments, analogous to the partial probability weighted moments, for use with censored samples. A Monte Carlo simulation study was performed to investigate the sampling properties of PL-moments for various sample sizes n, different values of the GP shape parameter k, different censoring thresholds F0 and different quantile estimators x(F). The results reveal that the bias from the method of PL-moments with censoring levels F0 ≤ 0.2 is similar to, and in some cases lower than, that of simple L-moments (F0 = 0.0). Values of absolute bias for GP shape parameter k > −0.1 are closest to zero and, for certain F0, are even smaller than those of simple L-moments.
Similarly, PL-moments with censoring levels F0 ≤ 0.2 always lead to better efficiency than L-moments for GP shape parameter k > −0.1.
An application to the annual maximum flow series of the River Golok in Kelantan, Malaysia, comprising 32 observations, was carried out using the methods of simple L-moments and PL-moments. The results show that PL-moments are quite effective in fitting the GP distribution to these flood data and, in some cases, perform even better than simple L-moments.

Here ζ, α and k are the location, scale and shape parameters of the GP distribution, respectively. The variable x takes values in the range ζ ≤ x < ∞ for k < 0 and ζ ≤ x < ζ + α/k for k > 0. The special case k = 0 yields the exponential distribution, whereas the special case k = 1 yields the uniform distribution on [ζ, ζ + α]. Pareto distributions are obtained when k < 0.

Table 1. Bias on quantile estimators for GP shape parameters k = −0.1 and k = +0.1, for different levels of censoring F0 and different sample sizes n

Table 2. Annual maximum flow series in m³/s for station 6019411, Golok River, Kelantan, Malaysia, for the years 1977 to 2008