Scale Parameter Estimation of the Laplace Model Using Different Asymmetric Loss Functions

In the last few decades there has been growing interest in constructing flexible parametric classes of probability distributions, with the Bayesian approach receiving more attention than the classical one. In the present study, a Bayesian analysis of the Laplace model is carried out using the Inverted Gamma, Inverted Chi-Squared, Levy and Gumbel Type-II informative priors. The properties of the posterior distribution, credible intervals, the highest posterior density region (HPDR) and the Bayes factor are discussed. Bayes estimators are derived under the squared error loss function (SELF), the precautionary loss function, the weighted squared error loss function and the modified (quadratic) squared error loss function. Hyperparameters are determined through the empirical Bayes method. The estimates are compared through their posterior risks (PRs) under these loss functions, and the priors and loss functions are compared using a real-life data set.


Introduction
Very few real-world events that we need to study statistically are symmetrical, so the popular normal model is not a useful model for every phenomenon; at times it is a poor description of observed data. Skewed models, which exhibit varying degrees of asymmetry, are a necessary component of the modeler's tool kit. A variety of forms of the Laplace distribution have been introduced and applied in several areas of real-world problems, and the Laplace distribution is gaining popularity as an alternative to the normal model owing to the simplicity of its characteristic function. In this article the Laplace model is considered as a lifetime model using complete and censored data, because in many experiments we cannot continue observation until the last item fails owing to cost or time constraints; in such cases censoring is unavoidable.
The Laplace distribution has been considered before in the literature. Kappenman (1975) obtained conditional confidence intervals for the parameters of a double exponential distribution by finding the conditional distributions of the pivotal quantities for the location and scale parameters. Kappenman (1977) proposed a procedure for obtaining a lower probability tolerance interval for a proportion of a population for the two unknown parameters of the Laplace (double exponential) distribution. Balakrishnan and Chandramouleeswaran (1996) presented an estimator of the reliability function based on the best linear unbiased estimators (BLUEs) of the location and scale parameters of the Laplace distribution under Type-II censored samples, and showed that the obtained estimator is almost unbiased across varying levels of reliability. Childs and Balakrishnan (2000) considered progressively Type-II right censored samples for the analysis of the Laplace distribution. Childs and Balakrishnan (1997) obtained the maximum likelihood estimators of the parameters of the Laplace distribution under general Type-II censored samples, which are simple linear functions of the order statistics; they also examined the asymptotic variances through the Fisher information matrix. Nadarajah (2009) utilized Laplace random variables with an application to price indices. Kozubowski and Nadarajah (2010), motivated by the recent popularity of the Laplace distribution, provided a comprehensive review of the known Laplace distributions along with their properties and applications. Nadarajah (2010) obtained two posterior distributions for the mean of the Laplace distribution by deriving the distributions of the product XY and the ratio X/Y when X and Y are independent Student's t and Laplace random variables.
The rest of the article is organized as follows. The Laplace likelihood, the posterior distributions under different informative priors and their properties are defined in Section 2 for complete and censored data. The empirical Bayes method and the predictive distributions (prior and posterior) are provided in Section 3. A simulation study of the properties defined in Section 2 is conducted using simulated and real data in Section 4. Section 5 contains the discussion and derivation of Bayesian interval estimation using real data. Bayesian hypothesis testing is discussed in Section 6, while Bayes estimates and their respective posterior risks are evaluated under different loss functions in Section 7 for real and simulated data. A model comparison framework is presented in Section 8. Some concluding remarks and a proposal for further research are given in the last section.

Posterior Distribution and Likelihood Function
The posterior distribution summarizes the available probabilistic information on the parameters, combining the prior distribution with the sample information contained in the likelihood function. The likelihood principle suggests that inference about the parameter should depend on the data only through the posterior distribution, and the Bayesian's job is to assist the investigator in extracting features of interest from it. In this section the Laplace distribution is used as the sampling distribution and is combined with the informative priors to derive the posterior distribution. A random variable x is said to follow a Laplace distribution with scale parameter λ and location parameter zero if it has the density f(x|λ) = (1/(2λ)) exp(−|x|/λ), −∞ < x < ∞, λ > 0. The likelihood function for a random sample x_1, x_2, ..., x_n taken from the Laplace distribution is L(λ|x) ∝ λ^(−n) exp(−(1/λ) Σ_{i=1}^n |x_i|). (1) The likelihood function for censored data, as described by Mendenhall and Hader (1958), is L(λ|x) ∝ λ^(−r) exp(−(1/λ)(Σ_{i=1}^r |x_i| + (n − r)T)), where T is the censoring time, r is the number of uncensored (failed) observations and the remaining n − r observations are censored.
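As an illustrative sketch of these likelihoods (assuming the zero-location density f(x|λ) = exp(−|x|/λ)/(2λ) and a type-I censoring scheme in which each censored item contributes a factor exp(−T/λ); the function names are ours, not the paper's):

```python
import math

def laplace_loglik(data, lam):
    """Log-likelihood of a complete zero-location Laplace(lam) sample."""
    n = len(data)
    s = sum(abs(x) for x in data)            # sufficient statistic: sum |x_i|
    return -n * math.log(2.0 * lam) - s / lam

def laplace_loglik_censored(failures, n, T, lam):
    """Type-I censoring at time T: r = len(failures) observed failures,
    n - r items censored, each contributing exp(-T/lam)."""
    r = len(failures)
    s = sum(abs(x) for x in failures)
    return -r * math.log(2.0 * lam) - s / lam - (n - r) * T / lam
```

The sufficient statistics (n, Σ|x_i|) for complete data, and (r, Σ|x_i| + (n − r)T) for censored data, are all the posterior derivations below require.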

Posterior Distribution Using Inverted Gamma Prior
Informative priors are those that deliberately incorporate information that researchers have at hand. This is a reasonable and reasoned approach, since previous scientific knowledge should play a role in statistical inference; the caveat is that a researcher could deliberately manipulate the prior to obtain a desired posterior result, so the prior must be specified honestly. An informative prior provides more information than a non-informative prior, and the resulting analysis is therefore more precise and informative than the classical approach.
The Inverted Gamma prior for λ, with hyperparameters a and b, is defined by the density p(λ) ∝ λ^(−(a+1)) exp(−b/λ), λ > 0. (2) The posterior distribution of λ for the given data (x = x_1, x_2, ..., x_n), obtained from equations (1) and (2), is proportional to λ^(−(a+n+1)) exp(−(b + Σ_{i=1}^n |x_i|)/λ), which is the density of an Inverted Gamma distribution. So λ|x ∼ G⁻¹(α, β) with α = a + n and β = b + Σ_{i=1}^n |x_i|, where G⁻¹ denotes the Inverted Gamma distribution. For censored data the posterior is again Inverted Gamma, with n replaced by r and β = b + Σ_{i=1}^r |x_i| + (n − r)T.

Posterior Distribution Using the Inverted Chi-squared (ICS) Prior
The Inverted Chi-Squared (ICS) informative prior with hyperparameters a and b is defined (in the scaled form assumed here) by the density p(λ) ∝ λ^(−(a/2+1)) exp(−b/(2λ)), λ > 0. (3) We obtain the posterior distribution of λ for the given data (x = x_1, x_2, ..., x_n) from equations (1) and (3); it is again similar to an Inverted Gamma distribution, so λ|x ∼ G⁻¹(α, β) with α = a/2 + n and β = b/2 + Σ_{i=1}^n |x_i|. For censored data the posterior has the same form with n replaced by r and β = b/2 + Σ_{i=1}^r |x_i| + (n − r)T.

Posterior Distribution Using Levy Prior
The Levy prior for λ with hyperparameter b is defined by the density p(λ) ∝ λ^(−3/2) exp(−b/(2λ)), λ > 0. (4) The posterior distribution of λ for the given data (x = x_1, x_2, ..., x_n), from equations (1) and (4), is again the density of an Inverted Gamma distribution: λ|x ∼ G⁻¹(α, β) with α = n + 0.5 and β = 0.5b + Σ_{i=1}^n |x_i|, where G⁻¹ denotes the Inverted Gamma distribution. For censored data the posterior of λ, λ > 0, is also Inverted Gamma: λ|x ∼ G⁻¹(r + 0.5, 0.5b + Σ_{i=1}^r |x_i| + (n − r)T).

Posterior Distribution Using Gumbel Type-II (GTII) Prior
The Gumbel Type-II (GTII) informative prior with hyperparameter b is defined (with shape parameter equal to one, as assumed here) by the density p(λ) ∝ λ^(−2) exp(−b/λ), λ > 0. (5) We obtain the posterior distribution of λ for the given data (x = x_1, x_2, ..., x_n) from equations (1) and (5): λ|x ∼ G⁻¹(n + 1, b + Σ_{i=1}^n |x_i|). For censored data the posterior is G⁻¹(r + 1, b + Σ_{i=1}^r |x_i| + (n − r)T).

Properties of Posterior Distribution Assuming Different Priors
Since each posterior distribution obtained above is an Inverted Gamma distribution, we give the general forms of the properties of the Inverted Gamma distribution. For λ|x ∼ G⁻¹(α, β): Mean = β/(α − 1) for α > 1, Mode = β/(α + 1), and Variance = β²/((α − 1)²(α − 2)) for α > 2, where α and β are the respective posterior distribution parameters.
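These standard Inverted Gamma formulas can be packaged in a small helper; the skewness and excess kurtosis expressions are the usual ones for IG(α, β) and are included here as standard results, not quoted from the paper:

```python
import math

def ig_properties(alpha, beta):
    """Standard moments of the Inverted-Gamma(alpha, beta) posterior,
    returned only where they exist."""
    props = {"mode": beta / (alpha + 1)}
    if alpha > 1:
        props["mean"] = beta / (alpha - 1)
    if alpha > 2:
        props["variance"] = beta ** 2 / ((alpha - 1) ** 2 * (alpha - 2))
    if alpha > 3:
        props["skewness"] = 4.0 * math.sqrt(alpha - 2) / (alpha - 3)
    if alpha > 4:
        props["excess_kurtosis"] = (30.0 * alpha - 66.0) / ((alpha - 3) * (alpha - 4))
    return props
```

For example, a posterior G⁻¹(3, 4) has mode 1 and mean 2, illustrating why the mode tracks the true scale more closely than the mean in the simulation tables.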

Elicitation of Hyperparameters through Empirical Bayes
Empirical Bayes procedures utilize past data to bypass the need to identify a completely unknown and unspecified prior distribution having a frequency interpretation. Grabski and Sarhan (1996) and Sarhan (2003) used empirical Bayes estimation in the case of exponential reliability, while Ahn et al. (2006) considered this procedure for hazard rate estimation of a mixture model with censored lifetimes. The empirical Bayes approach may be viewed as a two-stage estimation procedure in which the hyperparameters are first estimated from the marginal distribution and the parameter is then estimated using the 'pseudo-prior' in which the hyperparameters are replaced by the first-stage estimates. Robbins (1955) sought a representation of the desired Bayes rule in terms of the marginal distribution of the data (the prior predictive distribution) and then used the data to estimate it rather than the prior distribution. For an excellent discussion of empirical Bayes see Kass and Steffey (1989) and Bansal (2007). We use the Inverted Gamma, Levy, Gumbel Type-II and Inverted Chi-Squared informative priors as the basis of the prior predictive distributions used in the empirical Bayes procedure. The following are the prior predictive equations used for the empirical Bayes procedure. Let Y be a random variable from the Laplace distribution with unknown parameter λ.
The prior predictive distribution is obtained from p(y) = ∫₀^∞ f(y|λ) p(λ) dλ.

Prior Predictive Distribution Using Inverted Gamma Prior
The prior predictive distribution using the Inverted Gamma prior is given below; we use this prior predictive distribution for the elicitation of the hyperparameters a and b.

Prior Predictive Distribution Using Inverted-chi Prior
The prior predictive distribution using the Inverted Chi-Squared prior is given below; we use this prior predictive distribution for the elicitation of the hyperparameters a and b.

Prior Predictive Distribution When Prior is Levy Distribution
The prior predictive distribution using Levy prior is:

Prior Predictive Distribution When Prior is Gumbel Type-II Distribution
The prior predictive distribution using Gumbel Type-II prior is:

Predictive Distribution
Often it is necessary to predict future observations based on the inferences about the parameters drawn from observations already made. The posterior predictive distribution is obtained by integrating the sampling density of the new observation, which is conditionally independent of the "learning sample" given the parameters, against the posterior distribution.
Here y = x_{n+1} is the future observation, given the sample information x = x_1, x_2, ..., x_n from the model with unknown parameter λ.
Predictive Interval can be obtained as:

Predictive Distribution and Predictive Intervals Using Inverted-Gamma Prior
The posterior predictive distribution for a future random variable y given the data x = x_1, x_2, ..., x_n is: And the predictive interval is: For censored data: And the predictive interval is: where a and b are hyperparameters determined by the empirical Bayes method.
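Assuming a zero-location Laplace sampling density and an Inverted Gamma(α, β) posterior, the predictive integral has the closed form p(y|x) = αβ^α / (2(β + |y|)^(α+1)); the sketch below evaluates it (this form is our derivation under those assumptions, not quoted from the paper):

```python
def predictive_density(y, alpha, beta):
    """Closed-form posterior predictive density under an IG(alpha, beta)
    posterior for the Laplace scale:
    p(y|x) = alpha * beta**alpha / (2 * (beta + |y|)**(alpha + 1))."""
    return alpha * beta ** alpha / (2.0 * (beta + abs(y)) ** (alpha + 1))
```

The density is symmetric in y and integrates to one, so equal-tail predictive intervals follow by inverting its CDF on the positive half-line.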

Predictive Distribution and Predictive Intervals Using Inverted-Chi Prior
The posterior predictive distribution for a random variable y given the data x = x_1, x_2, ..., x_n is: The equations of the predictive intervals are: For censored data the equations of the predictive intervals are: where a and b are hyperparameters elicited by the empirical Bayes procedure.

Predictive Distribution and Predictive Intervals Using Levy Prior
The posterior predictive distribution for a future random variable y given the data x = x_1, x_2, ..., x_n is: And the predictive interval is: For censored data: And the predictive interval is: <Table 1>

Predictive Distribution and Predictive Intervals Using Gumbel Type-II Prior
The posterior predictive distribution for a random variable y given the data x = x_1, x_2, ..., x_n is: The equations of the predictive intervals are: For censored data the equations of the predictive intervals are:

Simulation Study
For the simulation study, data from the Laplace model are generated using the inverse transformation method. Results for the different properties discussed above are given in the appendix. From Tables 1-8 in the appendix, one can observe that the mode is closer to the true parameter value than the mean, and that the variance increases with the parameter value; for censored data in particular the variance exceeds that for complete data, because censoring uses less information than the complete data set. Similar conclusions can be drawn from the skewness and kurtosis tables. A common feature of Tables 1-11 is that, as the sample size increases, the estimates approach the true parameter values while the variances, skewness and kurtosis decrease. Comparing the priors, the GTII prior gives better results in terms of posterior risk, skewness and kurtosis than the other informative priors.
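The inverse transformation mentioned above can be sketched as follows, using the Laplace quantile function x = −λ sgn(u − 0.5) ln(1 − 2|u − 0.5|) for u ~ Uniform(0, 1) (the function name is ours):

```python
import math
import random

def laplace_sample(lam, n, seed=None):
    """Draw n values from a zero-location Laplace(lam) by the inverse
    transform x = -lam * sign(u - 0.5) * ln(1 - 2|u - 0.5|), u ~ U(0, 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:          # avoid log(0) at the left endpoint
            u = rng.random()
        out.append(-lam * math.copysign(1.0, u - 0.5)
                   * math.log(1.0 - 2.0 * abs(u - 0.5)))
    return out
```

A quick sanity check: the sample mean should be near 0 and the mean absolute value near λ, since E|X| = λ for this parameterization.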

Real Data Findings
Childs and Balakrishnan (1996) considered the data on the differences in flood stage over 33 years for two stations on the Fox River in Wisconsin. The following table shows the posterior distribution properties for this real data set under the IG and ICS informative priors.
The GT-II prior performs better than the IG, ICS and Levy priors in terms of variance, mode, skewness and kurtosis. A graphical presentation follows: <Figure 1-4> The figures confirm similar behavior in shape, with the Gumbel Type-II prior differing from the others. Figure 1(i) is for the complete data set using the informative priors, and Figure 1(ii) compares complete and censored data using the IG prior, where IG(c) denotes the censored data. One can observe that information is lost when using censored data, but in applications censoring is unavoidable.

Bayesian Interval Estimation
In Bayesian analysis the credible interval is the counterpart of the classical confidence interval; when the hyperparameter values are set to zero it coincides with the classical interval. The difference between a credible interval and the highest posterior density region (HPDR) is that the equal-tail credible interval is directly comparable with the classical interval, while the HPDR is the shortest region with the stated posterior probability and is unique for a unimodal posterior.

Credible Interval Assuming Informative Priors
Since λ|x ∼ G⁻¹(A, B), we have 2B/λ ∼ χ²(2A), where A and B are the parameters of the respective posterior distribution. Thus a (1 − α)100% credible interval for λ is (2B/χ²_{1−α/2}(2A), 2B/χ²_{α/2}(2A)), where χ²_γ(2A) denotes the γ-quantile of the chi-squared distribution with 2A degrees of freedom (see Abu-Taleb et al. (2007) and Saleem and Aslam (2009)).
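A sketch of this interval in code, assuming λ|x ∼ G⁻¹(A, B) so that P(λ ≤ t) = 1 − P(A, B/t), with P the regularized lower incomplete gamma function (implemented here from scratch via the usual series and continued-fraction expansions, then inverted by bisection; all names are ours):

```python
import math

def _gammp(a, x):
    """Regularized lower incomplete gamma P(a, x): series for x < a + 1,
    Lentz continued fraction for Q(a, x) otherwise."""
    if x <= 0.0:
        return 0.0
    if x < a + 1.0:
        term = 1.0 / a
        total = term
        k = a
        for _ in range(500):
            k += 1.0
            term *= x / k
            total += term
            if abs(term) < abs(total) * 1e-14:
                break
        return total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    b = x + 1.0 - a
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < 1e-300:
            d = 1e-300
        c = b + an / c
        if abs(c) < 1e-300:
            c = 1e-300
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return 1.0 - math.exp(-x + a * math.log(x) - math.lgamma(a)) * h

def credible_interval(A, B, level=0.95):
    """Equal-tail credible interval for lambda|x ~ Inverted-Gamma(A, B),
    using P(lambda <= t) = 1 - P(A, B/t) and inverting by bisection."""
    def cdf(t):
        return 1.0 - _gammp(A, B / t)
    def quantile(p):
        lo, hi = 1e-12, 1.0
        while cdf(hi) < p:
            hi *= 2.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if cdf(mid) < p:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    g = 1.0 - level
    return quantile(g / 2.0), quantile(1.0 - g / 2.0)
```

For A = B = 1 the CDF reduces to exp(−1/t), so the 95% interval endpoints can be checked analytically as −1/ln(0.025) and −1/ln(0.975).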

Highest Posterior Density Region (HPDR)
The HPD interval is defined on the posterior density such that the posterior density at every point inside the interval is greater than the posterior density at every point outside it. An interval (λ_1, λ_2) is a (1 − α)100% HPD interval for λ if it satisfies the following two conditions simultaneously: ∫_{λ_1}^{λ_2} p(λ|x) dλ = 1 − α and p(λ_1|x) = p(λ_2|x). (6)
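Since these two conditions rarely admit a closed form for the Inverted Gamma posteriors here, a Monte Carlo approximation is a practical sketch: draw λ = β/g with g ~ Gamma(α, 1), which gives λ ~ G⁻¹(α, β), and take the shortest interval covering the desired fraction of the sorted draws. This is an illustrative device of ours, not the paper's procedure:

```python
import random

def hpd_interval(alpha, beta, level=0.95, n=50_000, seed=1):
    """Monte Carlo HPD interval for lambda|x ~ IG(alpha, beta): sample
    lambda = beta / g, g ~ Gamma(alpha, 1), then take the shortest
    interval covering a fraction `level` of the sorted draws."""
    rng = random.Random(seed)
    draws = sorted(beta / rng.gammavariate(alpha, 1.0) for _ in range(n))
    k = int(level * n)
    best = min(range(n - k), key=lambda i: draws[i + k] - draws[i])
    return draws[best], draws[best + k]
```

By construction the returned interval contains the posterior mode β/(α + 1) for any reasonable sample size, and its width is at most that of the equal-tail credible interval.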

HPDR for Inverted Chi-squared Prior
Solving the two conditions in equation (6) for the Inverted Chi-Squared prior, the HPDR equations involve the posterior G⁻¹(0.5a + n, 0.5b + Σ_{i=1}^n |x_i|) and must be solved numerically.

HPDR for Levy Prior
Solving the two conditions given in equation (6) for the Levy prior, the HPDR equations involve the posterior G⁻¹(n + 0.5, 0.5b + Σ_{i=1}^n |x_i|); for censored data the corresponding incomplete gamma function is Γ(r + 0.5, (0.5b + Σ_{i=1}^r |x_i| + (n − r)T)/λ), and the equations must be solved numerically.

HPDR for Gumbel Type-II Prior
The HPDR equations for the Gumbel Type-II prior follow in the same way, with posterior parameters (n + 1, b + Σ_{i=1}^n |x_i|), and must be solved numerically.

For censored data the same equations apply with r in place of n. For the real data set, the HPD and credible intervals at different levels of significance are evaluated in the following table for the different informative priors.

<Table 2>
The HPD intervals have smaller bands than the credible intervals. The ICS and GTII priors give wider credible and HPD intervals than the IG and Levy priors.

Bayesian Hypotheses Testing and Bayes Factor
Bayesian hypothesis testing is less formal than its non-Bayesian counterparts; Bayesian researchers typically summarize the posterior distribution without applying a rigid decision process. If a formal process is desired, Bayesian decision theory is the natural route, since a probability distribution over the parameter space allows expected utility calculations based on the costs and benefits of different outcomes. Considerable effort has nevertheless gone into mapping Bayesian statistical models onto the null hypothesis testing framework, with mixed results at best. This section presents Bayesian hypothesis testing for the parameter λ under the posterior distributions obtained using the informative priors. Posterior probabilities are defined for the hypotheses H_1: λ ≥ λ_1 versus H_2: λ < λ_1 as P(H_1|x) = ∫_{λ_1}^∞ p(λ|x) dλ, where p(λ|x) is the posterior distribution of λ given x, and P(H_2|x) = 1 − P(H_1|x).
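A sketch of computing these posterior probabilities by Monte Carlo, again using the fact that λ = β/g with g ~ Gamma(α, 1) has the G⁻¹(α, β) posterior; the returned odds equal the Bayes factor only under the equal-prior-odds assumption made here for illustration:

```python
import random

def posterior_test(alpha, beta, lam1, n=100_000, seed=1):
    """Monte Carlo P(H1: lambda >= lam1 | x) for lambda|x ~ IG(alpha, beta),
    sampling lambda = beta / g with g ~ Gamma(alpha, 1).  Also returns the
    posterior odds p1/p2, which equals the Bayes factor only when the prior
    odds are 1 (an assumption made here for illustration)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n)
               if beta / rng.gammavariate(alpha, 1.0) >= lam1)
    p1 = hits / n
    p2 = 1.0 - p1
    return p1, (p1 / p2 if p2 > 0.0 else float("inf"))
```

For α = β = 1 and λ_1 = 1 the exact answer is P(H_1|x) = 1 − e⁻¹ ≈ 0.632, a convenient check on the sampler.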
Bayes factors are the dominant method of Bayesian model testing and are Bayesian analogues of likelihood ratio tests. The basic intuition is that prior and posterior information are combined in a ratio that provides evidence in favor of one model specification versus another: the Bayes factor is the ratio of the posterior odds of H_1 against H_2 to the corresponding prior odds. Bayes factors are very flexible, allowing multiple hypotheses to be compared simultaneously and not requiring nested models, although the compared models should obviously have the same dependent variable.
The Bayes factor can be interpreted as the odds for H_1 against H_2 given by the data. While the Bayesian approach typically avoids arbitrary decision thresholds, Jeffreys (1961) gives the following typology for comparing H_1 versus H_2.
Here B is used for Bayes factor.

<Table 3>
For parameter values smaller than the Bayes estimate we do not get strong support for H_1, but moving further away from it the support becomes strong. For testing λ ≤ 6.0 there is decisive evidence against H_1 for all priors, and for λ ≤ 7.5 strong evidence against H_1 for the informative priors. Minimal evidence against H_1 occurs for λ ≤ 9.0, and for λ ≤ 11.0, H_1 is strongly supported, especially under the Inverted Gamma prior. Similarly for censored data, testing λ ≤ 15.0 gives minimal evidence against H_1 for all priors, and λ ≤ 16.5 gives minimal evidence against H_1 under the informative priors. Strong evidence against H_1 occurs for λ ≤ 18.0, and for λ ≤ 19.5, H_1 is strongly supported, especially under the Gumbel prior. The values used for complete data cannot be reused here: because of the involvement of the censoring time we are using less information about the data, so those values would give decisive evidence.

Bayes Estimators Under Different Loss Functions
This section focuses on the derivation of the Bayes estimators (BEs) under different loss functions and their respective posterior risks (PRs); the results are also compared across the informative priors. A Bayes decision is a decision d that minimizes the risk function; if the decision is the choice of an estimator, then the Bayes decision is the Bayes estimator. We use four loss functions: the squared error loss function (SELF), the weighted squared error loss function (WSELF), the precautionary loss function and the modified (quadratic) squared error loss function (M/QSELF).
The squared error loss function (SELF) was proposed by Legendre (1805) and Gauss (1810) in developing least squares theory. Later it was used in estimation problems where unbiased estimators of a parameter were evaluated in terms of the risk function, which then reduces to the variance of the estimator. SELF is a convex loss function and therefore restricts the class of estimators by excluding randomized estimators. The difficulty with an unbounded loss function such as SELF is that Bayes estimates may change enormously when the observation of the random variable changes infinitesimally, so the investigator has to be absolutely precise about his probability statements; furthermore, in real-life situations it is usually impossible to lose an infinite amount of money. The extensive form of analysis provides the Bayes estimate under SELF as E(λ|x). Note that squared error is not the only loss function for which the posterior mean is the Bayes estimate: for the natural exponential family f(x|λ) = a(λ)b(x) exp(λx), the Bayes estimate under the entropy loss function is also the posterior mean. The Bayes estimate under the weighted SELF may not exist if the weight function w(λ) increases too quickly to infinity. Norstrom (1996) introduced an alternative asymmetric precautionary loss function and presented a general class of precautionary loss functions containing it as a special case. These loss functions approach infinity near the origin to prevent underestimation, thus giving conservative estimators, which matters especially when underestimation may lead to serious consequences. In risk analysis both the potentiality of an undesired event and its consequences are investigated; this potentiality is usually measured by a probability or a failure rate, and the Bayes approach is widely applied to estimate this failure rate. When dealing with disastrous consequences, it can be worse to underestimate the potentiality of an event than to overestimate it. This is
important when the risk level is the basis of a risk-reducing initiative, whether by reducing the potentiality or the consequences: an erroneously low estimated risk level can lead to the absence of the necessary initiative. It is unreasonable to use a loss function that allows one to estimate a failure probability of zero; a loss function that is finite at the origin permits such an estimate, and in a risk analysis estimating a failure probability of zero simply means that no risk is anticipated. Hence a precautionary loss function is used. Optimal policy selection has also traditionally been discussed in relation to symmetric, often quadratic, loss functions, so by using non-symmetric loss functions one is able to deal with cases where it is more damaging to miss the target on one side than on the other.
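For an Inverted Gamma posterior the four Bayes estimators and posterior risks have closed forms: β/(α − 1) under SELF, β/α under WSELF (weight 1/λ), √E(λ²|x) under the precautionary loss, and β/(α + 1) under M/QSELF. These are standard results for the usual textbook definitions of these losses, assumed here since the paper's own definitions are not reproduced; a sketch:

```python
import math

def bayes_estimates(alpha, beta):
    """(estimator, posterior risk) pairs for lambda|x ~ IG(alpha, beta)
    under four loss functions; requires alpha > 2."""
    mean = beta / (alpha - 1)                        # E(lambda | x)
    m2 = beta ** 2 / ((alpha - 1) * (alpha - 2))     # E(lambda^2 | x)
    return {
        "SELF":    (mean, m2 - mean ** 2),                     # posterior mean / variance
        "WSELF":   (beta / alpha, mean - beta / alpha),        # weight w(lambda) = 1/lambda
        "PLF":     (math.sqrt(m2), 2.0 * (math.sqrt(m2) - mean)),
        "M/QSELF": (beta / (alpha + 1), 1.0 / (alpha + 1)),
    }
```

Note that the M/QSELF risk 1/(α + 1) is free of β, which is consistent with the tables finding the smallest posterior risks under this loss.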

Bayes Estimate and Posterior Risk Under Different Priors
The following tables give the comparison of the Bayes estimates and posterior risks under the different priors.

Simulation of Bayes Estimates (BE) and Posterior Risk (PR)
Simulation is a flexible methodology for analyzing the behavior of a proposed business activity, new product, manufacturing line or plant expansion (analysts call this the 'system' under study). In a simulation one generates a sample of random data that mimics a real problem and summarizes that sample in the same way. It is one of the most widely used quantitative methods because it is flexible and can yield many useful results; Monte Carlo simulation and the bootstrap are common approaches. Here the simulation was carried out in Minitab. The following tables contain the Bayes estimates and their posterior risks.
It is immediate from Tables 17-24 in the appendix that the posterior risk comes down as the sample size increases, and that it increases with the parameter value. For censored data the posterior risk is greater than for complete data because of the contribution of the censoring time. As for the choice of loss function, one can observe that the modified squared error loss function has a smaller posterior risk than the other three loss functions.

Real Data Bayes Estimates (BE) and Posterior Risk (PR)
The following tables show the Bayes estimates (BEs) and their posterior risks (PRs, in brackets) for the real data. <Table 5> The modified squared error loss function has a smaller posterior risk than the other three loss functions when the Gumbel Type-II prior is used.

Model Comparison
The comparison of model performance is based on the generated posterior predictive distributions. The criterion used to compare them employs the logarithmic score as a utility function in a statistical decision framework; this was proposed by Bernardo (1979) and used, for example, by Walker and Gutiérrez-Peña (1999) and Martín and Pérez (2009) in a similar context. In situations where the uncertainty is contained in the value of a future observation y = x_{n+1}, the logarithmic score log(p_k(y|x)) is used, where p_k(y|x) denotes the posterior predictive density under model M_k. The posterior predictive expected utility is then Ū_k = ∫ log(p_k(y|x)) p_k(y|x) dy. The optimal solution to the decision problem of choosing among the competing models M_0, M_1, ..., M_l is the model M_{k*} such that Ū_{k*} = max_{k∈{0,1,...,l}} Ū_k. From a practical viewpoint, Ū_k can be estimated by (1/m) Σ_{j=1}^m log(p_k(y_j|x)), where y_1, y_2, ..., y_m are an independent and identically distributed random sample from p_k(y|x).
In order to illustrate the applicability of the proposed approach, the posterior distributions under the Inverted Gamma, Inverted Chi-Squared, Levy and Gumbel Type-II priors are compared using a random sample of size 10 generated from a Laplace distribution with mean 0 and scale parameter 4 in Minitab v12 (0.97289, 4.02645, 5.07323, 0.75942, 2.09396, 0.98549, 2.22772, 5.06832, 0.40607, 3.74112). We obtain Ū_IG = −2.734129, Ū_ICS = −2.732629, Ū_Levy = −2.721256 and Ū_GTII = −2.714222, so the Gumbel Type-II prior is best.
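The Monte Carlo estimate of Ū_k can be sketched by drawing λ from the posterior, drawing y ~ Laplace(0, λ), and averaging the log predictive density, assuming the closed-form predictive p(y|x) = αβ^α/(2(β + |y|)^(α+1)) implied by an Inverted Gamma(α, β) posterior (a derived form and name of ours, not quoted from the paper):

```python
import math
import random

def log_score(alpha, beta, m=20_000, seed=1):
    """Monte Carlo estimate of U = E[log p(y|x)]: draw lambda = beta/g,
    g ~ Gamma(alpha, 1), then y ~ Laplace(0, lambda), and average the
    assumed predictive log-density
    log[alpha * beta^alpha / (2 (beta + |y|)^(alpha + 1))]."""
    rng = random.Random(seed)
    const = math.log(alpha) + alpha * math.log(beta) - math.log(2.0)
    total = 0.0
    for _ in range(m):
        lam = beta / rng.gammavariate(alpha, 1.0)
        u = rng.random()
        while u == 0.0:
            u = rng.random()
        y = -lam * math.copysign(1.0, u - 0.5) * math.log(1.0 - 2.0 * abs(u - 0.5))
        total += const - (alpha + 1.0) * math.log(beta + abs(y))
    return total / m
```

The prior yielding the largest (least negative) estimated Ū would be preferred, matching the criterion applied to the four priors above.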

Conclusion and Suggestions
We considered the Bayesian analysis of the Laplace model using complete and censored data under informative priors. Based on the properties of the posterior distribution, hypothesis testing, and the HPD and credible intervals, we conclude that the Gumbel Type-II prior gives more precise results than the other priors. As for the choice of loss function, the evidence from the properties discussed above shows that the modified squared error loss function has the smallest posterior risk, and the model comparison method also suggests that the Gumbel Type-II prior is best. A common feature is that the posterior risk comes down as the sample size increases. Note also that results for complete data cannot be compared with those for censored data, since censored data carry less information. In future this work can be extended to the truncated Laplace model and to the case where the location parameter is also unknown.


Figure 1 .
Figure 1. (i) For the complete data set; (ii) using the IG prior, where IG(c) denotes the censored data

Table 1 .
Properties of posterior distribution using real data set

Table 2 .
HDR and CI using complete real data set; Parentheses contain credible intervals, Curly braces HPDR and brackets classical intervals

Table 3 .
Posterior probabilities under null and alternative hypotheses, Bayes factors using IG and ICS for real data

Table 4 .
Bayes estimators and respective posterior risk under different loss function for complete data set

Table 5 .
BEs and PRs using IC and ICS priors under different LFs

Table 7 .
Posterior distribution properties via ICSP for complete data

Table 9 .
Posterior distribution properties via ICSP for censored data

Table 12 .
Posterior distribution properties via LP for censored data

Table 14 .
Skewness and excess kurtosis of posterior distribution assuming different prior for complete data

Table 15 .
Skewness and excess kurtosis of posterior distribution assuming IG for censored data

Table 16 .
Skewness and excess kurtosis of posterior distribution assuming LP for censored data

Table 17 .
BEs and PRs using IGP under L 1

Table 18 .
BEs and PRs using IGP under L 2

Table 19 .
BEs and PRs using IGP under L 3

Table 20 .
BEs and PRs using IGP under L 4

Table 21 .
BEs and PRs using LP under L 1

Table 22 .
BEs and PRs using LP under L 2

Table 23 .
BEs and PRs using LP under L 3

Table 24 .
BEs and PRs using LP under L 4