The Bayes Premium in an Aggregate Loss Poisson-Lindley Model with Structure Function STSP

Many premium calculation problems in actuarial science take the number of claims, denoted as K, as the risk variable. Traditionally, this random variable is modelled by the Poisson distribution. However, it is well known that automobile insurance portfolios are characterized by zero-inflation (a high percentage of zero values in the sample) and overdispersion (the variance is greater than the mean), and the Poisson distribution does not properly reflect the latter phenomenon. In this paper we determine the Bayes premium considering that K follows a Poisson-Lindley distribution, with parameter $\theta_1$ in (0, 1), which is a potential alternative to describe these situations. As the structure function for $\theta_1$ we elicit the standardized two-sided power (STSP) distribution, which is a reasonable alternative to the usual beta distribution. In addition, an aggregate loss model is considered with primary distribution given by the Poisson-Lindley distribution. A Bayesian analysis is developed to obtain the Bayes premium. The conclusion is that the STSP is not an adequate alternative in the problem in question because it is more informative and less dispersed than the Beta distribution.


Introduction
An important issue in insurance theory is that of premium calculation. In particular, determining the Bayes premium is the natural aim when prior information and claim experience are considered (see Klugman, 1992).
In many practical problems of actuarial science, the claim experience is described by a frequency distribution for the number of claims, K, considered as the risk magnitude. Traditionally, this random variable is modelled by the Poisson distribution. However, it is well known that automobile insurance portfolios are characterized by zero-inflation (a high percentage of zero values in the sample) and overdispersion (the variance is greater than the mean), and the Poisson distribution does not properly reflect the latter phenomenon. For this reason, as Nikoloulopoulos and Karlis (2008) point out, overdispersed models (relative to the simple Poisson) are potential alternatives to describe these situations. In particular, mixed Poisson distributions are widely considered for the random variable K. For a detailed review of models based on mixed Poisson distributions see Cohen (1966), Willmot (1986), Grandell (1997), Nadarajah and Kotz (2006a, 2006b) or Antzoulakos and Chadjiconstantinidis (2004), among others. Among the advantages of the distributions obtained by mixing are that they are overdispersed and that they assign high probability to k = 0, which is appropriate because the mode of the number of claims is often at this value. Sometimes this situation is handled with zero-inflated distributions (see Angers & Biswas, 2003; Yip & Yau, 2005; Boucher et al., 2007), which are not considered here to avoid introducing another parameter in the problem.
One such mixed distribution is the Poisson-Lindley distribution, with parameter $\theta_1 \in (0, 1)$, proposed by Sankaran (1971). It has been widely studied by Ghitany et al. (2008) and Ghitany and Al-Mutairi (2009). An extension of this distribution, which includes an additional parameter, is suggested in Mahmoudi and Zakerzadeh (2010). Some recent works have pointed out the usefulness of this one-parameter distribution in the Bayes premium calculation problem (see Hernández-Bastida et al., 2011 or Martel-Escobar et al., 2012).
In other practical situations the aggregate loss model is considered. In actuarial risk theory, the collective risk model, hereafter crm, is described by a frequency distribution for the number of claims K and a sequence of independent and identically distributed random variables representing the sizes of the single claims $X_i$. Frequency K and severities $X_i$ are assumed independent. Note that the independence assumed here is conditional on the distribution parameters. There is an extensive body of literature on modelling the risk process, see e.g. Klugman et al. (2004) or McNeil et al. (2005). Estimation of the annual loss distribution by modelling the frequency and severity of losses is a well-known actuarial technique. It is also used for modelling solvency requirements in the insurance industry; see e.g. Sandström (2006) or Wüthrich (2006). The aggregate loss S is then the sum of the individual claim sizes, i.e. $S = \sum_{i=1}^{K} X_i$ for K > 0, and S = 0 for K = 0. The present paper aims to develop a Bayesian analysis of the PL model for the number of claims. The structure function for the parameter $\theta_1$ is given by the standardized two-sided power (STSP) distribution, a reasonable alternative to the beta distribution.
The analysis includes two steps. The first step, developed in Section 2, considers the number of claims as the risk variable. In that section the model is presented, the premiums are determined and an application to real data is provided. In the second step, developed in Section 3, an aggregate loss model is considered and the Bayes premiums are also obtained. Both sections show the results of the comparison with the beta distribution, since it is the usual distribution in this problem. A final section draws the main conclusions.

The Claims Number Model
In this section the number of claims, denoted as K, is considered as the risk variable. We determine the Bayes premium when the STSP is considered as structure function, and compare it with the Bayes premium when the structure function is the usual beta distribution. An application to real data is also provided.

The Model
Let K be the random variable number of claims, taking values in {0, 1, . . .}, which is assumed to follow a Poisson-Lindley distribution with probability mass function (pmf) given by $f_{PL}(k; \theta_1) = \theta_1^2 (1 - \theta_1)^k \left[ 2 - \theta_1 + k(1 - \theta_1) \right]$, for k = 0, 1, . . ., and $0 < \theta_1 < 1$ (hereafter the PL model).
It is well known that its moment generating function is given by $M_{PL}(t; \theta_1) = \frac{\theta_1^2 (2 - \theta_1 - e^t + \theta_1 e^t)}{(1 - e^t + e^t \theta_1)^2}$. The first two moments follow by differentiating this function at t = 0; in particular, the mean is $E[K \mid \theta_1] = (2 - \theta_1)(1 - \theta_1)/\theta_1$. The distribution is obtained as a mixed Poisson distribution with mixing function given by the Lindley distribution (see Lindley, 1958), whose density function is $f(x \mid \theta_1) = \frac{\theta_1^2}{\theta_1 + 1} (x + 1) e^{-x \theta_1}$, for x > 0 and $\theta_1 > 0$. Compared with the Poisson distribution, the PL distribution presents three qualities that usually appear in claim data sets. First, since it is a mixed Poisson distribution, it is overdispersed. Second, the probability of observing a zero value is higher than under a Poisson distribution with the same mean, the so-called zero-inflated phenomenon (see Karlis & Xekalaki, 2005 and references therein). Third, for mean values under 2.4, which includes practically all real data cases, it is straightforward to prove the so-called one-deflated phenomenon, i.e., $f_{PL}(1; \theta_1) < f_P(1)$, where $f_P$ is the probability mass function of the Poisson distribution with the same mean.
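The three qualities just listed can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the pmf implied by the moment generating function above, $\theta_1^2 (1-\theta_1)^k [2 - \theta_1 + k(1-\theta_1)]$, and the value $\theta_1 = 0.8$ is chosen only for illustration. The function names are hypothetical.

```python
import math


def pl_pmf(k: int, theta1: float) -> float:
    """Poisson-Lindley pmf in the (0, 1) parametrization (assumed form)."""
    q = 1.0 - theta1
    return theta1 ** 2 * q ** k * (2.0 - theta1 + k * q)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson pmf with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


theta1 = 0.8          # illustrative value, not taken from the paper
ks = range(200)       # the tail beyond k = 200 is negligible here

total = sum(pl_pmf(k, theta1) for k in ks)
mean = sum(k * pl_pmf(k, theta1) for k in ks)
var = sum(k * k * pl_pmf(k, theta1) for k in ks) - mean ** 2

print("total mass     :", total)
print("mean, variance :", mean, var)            # variance above mean
print("P(K=0): PL vs Poisson:", pl_pmf(0, theta1), poisson_pmf(0, mean))
print("P(K=1): PL vs Poisson:", pl_pmf(1, theta1), poisson_pmf(1, mean))
```

For this value of $\theta_1$ the mean is 0.3, so the comparison is made against a Poisson distribution with the same mean, as in the text.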
From a Bayesian point of view, the parameters of interest in the problem can be estimated using our prior knowledge about them.
The beta distribution $B(\theta_1; \alpha, \beta) \propto \theta_1^{\alpha - 1} (1 - \theta_1)^{\beta - 1}$, $0 < \theta_1 < 1$, with $\alpha > 0$ and $\beta > 0$, is a common prior distribution for the parameter $\theta_1$. For the application of this model to premium calculation and operational risk see Hernández-Bastida et al. (2011). It is known that in the PL model the marginal distribution of K, if the $B(\alpha, \beta)$ prior is considered, is given by $m(k \mid B) = \frac{B(\alpha + 2, \beta + k) + (k + 1)\, B(\alpha + 2, \beta + k + 1)}{B(\alpha, \beta)}$, where $B(\cdot, \cdot)$ denotes the beta function. Observe that this marginal distribution is a mixed Poisson distribution with mixing function given by $\frac{\Gamma(\alpha + 2)}{B(\alpha, \beta)} (\lambda + 1)\, U(\alpha + 2, -\beta + 2, \lambda)$, where U is the hypergeometric U function. For the calculation of the moments of the marginal distribution it is useful to observe that they can be written as $E_{m(k|B)}[K^r] = E_B\left[ E_{PL}[K^r \mid \theta_1] \right]$. The same holds for the rest of the marginal distributions.
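A meaningful alternative to the beta distribution is proposed in Van Dorp and Kotz (2002a). Hence, the STSP distribution is proposed as a choice of prior distribution for $\theta_1$. Its pdf is given by
$$STSP(\theta_1; a, b) = \begin{cases} b \left( \dfrac{\theta_1}{a} \right)^{b-1}, & 0 < \theta_1 \le a, \\ b \left( \dfrac{1 - \theta_1}{1 - a} \right)^{b-1}, & a \le \theta_1 < 1. \end{cases} \qquad (3)$$
The prior mean and variance for $\theta_1$ are given by $E_{STSP}[\theta_1] = \frac{(b - 1)a + 1}{b + 1}$ and $Var_{STSP}[\theta_1] = \frac{b - 2(b - 1)a(1 - a)}{(b + 2)(b + 1)^2}$. When b = 2 the Triangular distribution, denoted as $T(\theta_1, a)$ (see Johnson, 1997), is obtained. If b = 1 the Uniform distribution in [0, 1] is derived (hereafter the U distribution), and if a = 1 the power function distribution follows (hereafter, the P distribution).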
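As an illustration of the STSP density and its first two moments, the following minimal sketch evaluates the two-sided power form numerically. It assumes the usual Van Dorp and Kotz parametrization (mode a, shape b) and compares a midpoint-rule integration with the closed-form mean and variance; the helper names and the values a = 0.8, b = 8 are illustrative assumptions.

```python
def stsp_pdf(x: float, a: float, b: float) -> float:
    """STSP density on (0, 1) with mode a in (0, 1] and shape b > 0."""
    if not 0.0 < x < 1.0:
        return 0.0
    if x <= a:
        return b * (x / a) ** (b - 1.0)
    return b * ((1.0 - x) / (1.0 - a)) ** (b - 1.0)


def midpoint(f, n: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


a, b = 0.8, 8.0  # illustrative hyper-parameters

mass = midpoint(lambda x: stsp_pdf(x, a, b))
mean = midpoint(lambda x: x * stsp_pdf(x, a, b))
var = midpoint(lambda x: (x - mean) ** 2 * stsp_pdf(x, a, b))

# Closed-form moments of the STSP(a, b) distribution.
mean_formula = ((b - 1.0) * a + 1.0) / (b + 1.0)
var_formula = (b - 2.0 * (b - 1.0) * a * (1.0 - a)) / ((b + 2.0) * (b + 1.0) ** 2)

print("mass:", mass)
print("mean:", mean, "formula:", mean_formula)
print("var :", var, "formula:", var_formula)
```

Setting b = 2, b = 1 or a = 1 in `stsp_pdf` gives the Triangular, Uniform and power function special cases mentioned above.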
This paper focuses on the STSP distribution as an alternative prior distribution to the beta distribution for $\theta_1$.
To make this comparison operative, we establish certain aspects considered essential to the distribution modelling the prior information. The first aspect is unimodality with the value of the mode at a, which completely determines the T distribution. To frame the comparison we are interested in, the second aspect is to consider unimodal distributions with the same mean. Given a unimodal STSP(a, b) distribution with b > 1, the unimodal $B(\alpha, \beta)$ distribution with the same mode and mean is obtained by taking
$$\alpha = ab - a + 1, \qquad \beta = a + b - ab. \qquad (4)$$
In the sequel we assume that the parameters of the B distribution are determined from the expressions in (4).
In accordance with our aim of comparing the three models derived from the three prior distributions considered for the parameter, T, STSP and B, we compare the amount of information contained in each of these distributions, taking this quantity to be the discrepancy of each of them with respect to the uniform distribution, defined as (see Shannon, 1948; Kullback, 1959 or Kullback & Leibler, 1951) $D_{KL}[f : U] = \int_0^1 f(\theta_1) \log \frac{f(\theta_1)}{U(\theta_1)} \, d\theta_1$. If the uniform distribution is considered to be the least informative probabilistic model, then the greater the discrepancy $D_{KL}[f : U]$, the greater the information contained in the f distribution.
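The divergence between the B and the U distributions is obtained with the following expression (see Soofi & Retzer, 2002): $D_{KL}[B : U] = -\log B(\alpha, \beta) + (\alpha - 1)\psi(\alpha) + (\beta - 1)\psi(\beta) - (\alpha + \beta - 2)\psi(\alpha + \beta)$, where $B(\alpha, \beta)$ is the beta function and $\psi$ is the digamma function. From the comparison between the STSP and the B with the same mode and mean, it follows that whatever the value of the mode a, and whatever the value b > 1, the B has less information and more dispersion than the STSP distribution.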
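The comparison of information and dispersion can be illustrated numerically for one pair of hyper-parameters. The sketch below is only a spot check under stated assumptions: it matches the beta parameters to the STSP mode and mean via $\alpha = ab - a + 1$, $\beta = a + b - ab$, uses the closed form $\log b - 1 + 1/b$ for the STSP divergence, and evaluates the beta divergence by a midpoint rule rather than through the digamma expression. The function names and the choice a = 0.8, b = 8 are hypothetical.

```python
import math


def beta_from_stsp(a: float, b: float) -> tuple[float, float]:
    """Beta parameters with the same mode and mean as STSP(a, b), b > 1."""
    return a * b - a + 1.0, a + b - a * b


def kl_stsp_uniform(b: float) -> float:
    """Divergence of STSP(a, b) from the uniform; it does not depend on a."""
    return math.log(b) - 1.0 + 1.0 / b


def kl_beta_uniform(alpha: float, beta: float, n: int = 200_000) -> float:
    """Divergence of Beta(alpha, beta) from the uniform, by midpoint rule."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        log_pdf = log_norm + (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x)
        total += math.exp(log_pdf) * log_pdf
    return h * total


def var_stsp(a: float, b: float) -> float:
    return (b - 2.0 * (b - 1.0) * a * (1.0 - a)) / ((b + 2.0) * (b + 1.0) ** 2)


def var_beta(alpha: float, beta: float) -> float:
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))


a, b = 0.8, 8.0  # illustrative values
alpha, beta = beta_from_stsp(a, b)

print("matched beta parameters:", alpha, beta)
print("D_KL[STSP:U] =", kl_stsp_uniform(b))
print("D_KL[B:U]    =", kl_beta_uniform(alpha, beta))
print("Var STSP, Var B:", var_stsp(a, b), var_beta(alpha, beta))
```

A single pair (a, b) does not prove the general statement; it only shows the direction of both inequalities for this case.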
The following notation will be useful in the sequel. For the sake of simplicity, the acronyms of the distributions are used to denote the corresponding density functions.
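Let $\varepsilon_1$ and $\varepsilon_2$ be integers and let k be a positive integer. We introduce a shorthand for the corresponding expected value with respect to the STSP distribution. The calculation of this expected value is tedious, but after some algebra a closed form is obtained (see the Appendix for details).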
In the PL model the marginal distribution of K, if the STSP(a, b) prior is considered, is given by (5). For k = 0 the expression simplifies notably. For the beta and STSP distributions we determine the hyper-parameters from the expected value of the marginal distribution and the value of this distribution at k = 0. Setting $E_{m(k|\cdots)}[K] = \bar{k}$, where $\bar{k}$ represents the sample mean of the data, and $m(0|\cdots) = f_0$, where $f_0$ is the relative frequency of the value k = 0, we get two systems of two equations. The hyper-parameters are obtained by solving these systems. In the particular case of the T distribution, either of the two equations can be used, since only one parameter is needed.
In real data sets for the problem considered, it is common to find that the frequency of the zero value is extremely high, clearly above 50%. Hence, it is worth paying special attention to how the marginal distribution behaves at k = 0.
It is not difficult to verify, for instance with the Mathematica package, that m(0|B) and m(0|STSP), as functions of the hyper-parameters a and b, behave similarly.
For fixed a ≤ 0.54, these functions decrease monotonically in b, attaining their maximum value at b = 1. This maximum is always under 0.45.
For fixed a > 0.54, these functions increase monotonically in b, and their supremum is approached as b increases indefinitely. It is straightforward to show that $\lim_{b \to \infty} m(0|B) = \lim_{b \to \infty} m(0|STSP) = a^2 (2 - a)$, and the limit function only takes values over 0.5 when a is greater than 0.6. In summary, for the problems in question the range of relevant values for a is given by the interval (0.6, 1).
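The behaviour of the marginal at zero can be explored with a short numerical sketch. It assumes that m(0|·) is the prior expectation of the PL probability at zero, $\theta_1^2 (2 - \theta_1)$, uses closed-form beta moments for the B prior (with parameters matched by mode and mean) and a midpoint rule for the STSP prior. The function names and the grid of b values are illustrative assumptions.

```python
def stsp_pdf(x: float, a: float, b: float) -> float:
    """STSP density on (0, 1) with mode a in (0, 1] and shape b > 0."""
    if x <= a:
        return b * (x / a) ** (b - 1.0)
    return b * ((1.0 - x) / (1.0 - a)) ** (b - 1.0)


def m0_stsp(a: float, b: float, n: int = 200_000) -> float:
    """m(0|STSP) = E[theta^2 (2 - theta)] by midpoint rule on (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x * x * (2.0 - x) * stsp_pdf(x, a, b)
    return h * total


def m0_beta(a: float, b: float) -> float:
    """m(0|B) for the beta prior matched to STSP(a, b) by mode and mean."""
    alpha, beta = a * b - a + 1.0, a + b - a * b
    s = alpha + beta
    e2 = alpha * (alpha + 1.0) / (s * (s + 1.0))   # E[theta^2]
    e3 = e2 * (alpha + 2.0) / (s + 2.0)            # E[theta^3]
    return 2.0 * e2 - e3


a = 0.8
limit = a * a * (2.0 - a)  # common limit as b grows
for b in (2.0, 20.0, 200.0):
    print(b, m0_beta(a, b), m0_stsp(a, b), limit)
```

For a = 0.8 both functions increase with b and approach the common limit $a^2(2 - a) = 0.768$, in line with the description above.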
It is straightforward to prove that m(0|T) varies in the interval (0.234, 0.6) and, furthermore, that $E_{m(k|T)}[K]$ ranges in the interval (1.67, 34.8). Accordingly, we will not consider the T distribution as structure function for data sets which present an observed frequency at k = 0 greater than 0.6 or a sample mean smaller than 1.67.
For the comparison between the m(k|B) and m(k|STSP) distributions we have considered the difference function defined as Dmarg(k, a, b) = m(k|B) − m(k|STSP), and a complete numerical analysis for fixed values of k has been carried out. In each case the supremum and the infimum of the difference function in (a, b) have been determined. It is obtained that:
- When k < 3, the difference function takes positive and negative values.
- When k ≥ 3, the difference function always takes positive values.
- In any case the absolute value of the difference, |Dmarg(k, a, b)|, takes small values. The highest one is 0.033 when k = 1, and when k ≥ 3 it is always under 0.0058.
In conclusion, the marginal distributions do not clearly distinguish between the B and STSP models.

The Premiums
In this section we examine the collective and Bayes premiums for the prior distributions specified above. The collective premium is defined as the expected value of the True Individual premium with respect to the corresponding prior distribution, and it is equal to the expected value of the marginal distribution of K. Observe that the collective premium is the appropriate premium when claim experience is not available. It is straightforward to prove that the Bayes premium, which is the expected value of the True Individual premium with respect to the corresponding posterior distribution, is equal to the expected value of the predictive distribution of K. Furthermore, it is the appropriate premium when claim experience is available, being the best estimate of the True Individual premium.
In the PL model, the True Individual premium is given by $E[K \mid \theta_1] = (2 - \theta_1)(1 - \theta_1)/\theta_1$. Hence, if the $B(\alpha, \beta)$ or the STSP(a, b) are considered as prior distributions, the corresponding collective premiums are, by direct calculation, $CP(B) = E_B\left[ (2 - \theta_1)(1 - \theta_1)/\theta_1 \right]$ and $CP(STSP) = E_{STSP}\left[ (2 - \theta_1)(1 - \theta_1)/\theta_1 \right]$. When b = 1 the CP(U) does not exist, and for a = 1 it follows that $CP(P) = \frac{b + 3}{b^2 - 1}$. In the comparison between the collective premiums for the structure functions B and STSP with the same mean and mode, the first distribution produces greater values for the collective premium, i.e.,

CP(B) ≥ CP(STSP).

This statement follows by observing that it is equivalent to an inequality whose left-hand term can be proved to achieve a minimum at 0.2527.
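The collective premiums can be evaluated with a few lines of code. The sketch below assumes the True Individual premium $(2 - \theta_1)(1 - \theta_1)/\theta_1 = 2/\theta_1 - 3 + \theta_1$, so that each collective premium only requires the prior expectations of $1/\theta_1$ and $\theta_1$. For the beta prior the inverse moment is $(\alpha + \beta - 1)/(\alpha - 1)$; for the STSP prior the left branch integrates exactly to b/(b − 1) and the right branch is integrated numerically. Function names and the test values are illustrative assumptions.

```python
def cp_beta(a: float, b: float) -> float:
    """Collective premium under the beta prior matched to STSP(a, b), b > 1."""
    alpha, beta = a * b - a + 1.0, a + b - a * b
    inv_mean = (alpha + beta - 1.0) / (alpha - 1.0)   # E[1/theta], needs alpha > 1
    mean = alpha / (alpha + beta)
    return 2.0 * inv_mean - 3.0 + mean


def cp_stsp(a: float, b: float, n: int = 100_000) -> float:
    """Collective premium under the STSP(a, b) prior, b > 1."""
    inv_mean = b / (b - 1.0)          # exact contribution of the branch (0, a]
    if a < 1.0:
        h = (1.0 - a) / n             # midpoint rule on the branch [a, 1)
        right = 0.0
        for i in range(n):
            x = a + (i + 0.5) * h
            right += b * ((1.0 - x) / (1.0 - a)) ** (b - 1.0) / x
        inv_mean += h * right
    mean = ((b - 1.0) * a + 1.0) / (b + 1.0)
    return 2.0 * inv_mean - 3.0 + mean


for a, b in [(0.6, 2.0), (0.8, 5.0), (0.9, 10.0)]:
    print(a, b, cp_beta(a, b), cp_stsp(a, b))

# For a = 1 both priors reduce to the power function distribution.
print(cp_stsp(1.0, 8.0), (8.0 + 3.0) / (8.0 ** 2 - 1.0))
```

The loop prints both premiums for a few pairs (a, b) so that the inequality CP(B) ≥ CP(STSP) can be inspected case by case.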
The Bayes premium is determined and compared for the different prior distributions. By direct calculation, closed-form expressions are obtained for the B prior and, using (17), for the STSP prior. For the comparison of the Bayes premiums we carried out a complete numerical analysis of the difference function Dbp(k, a, b) = BP(k|B) − BP(k|STSP) for integer values of k, a ≥ 0.6 and b > 1. The conclusions are clear: when the experience is claim-free, i.e., k = 0, the difference function takes both positive and negative values, although the negative ones predominate.
However, when any claims are declared, i.e., k ≥ 1, the difference function is always positive. That means that, compared with the STSP distribution, the B distribution penalizes the declaration of additional claims more severely.
Table 1 shows the highest values observed for the function [Dbp(k, a, b) / BP(k|STSP)] × 100, which indicates the level of additional penalization implied by the B as structure function.

Table 1. The highest observed values in b for the function [Dbp(k, a, b) / BP(k|STSP)] × 100, for the indicated values of k and a

            a
k      0.6     0.7     0.8     0.9
1    17.32   23.20   27.27   27.93
2    36.67   42.54   45.10   42.78
3    51.66   56.47   56.99   52.28
4    63.58   66.95   65.62   58.99
5    73.41   75.21   72.23   64.12

It is clear that the differences between the premiums are remarkable, and the conclusion in the model for the number of claims is clear: if the essential aspects of the prior information are unimodality, the value of the mode and the mean value, then the B distribution is a better choice than the STSP distribution. The B distribution is less informative than the STSP distribution and also produces higher values for the premiums, so it is more conservative from the point of view of the insurance firm.
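The Bayes premium comparison can be reproduced approximately by numerical integration. The following minimal sketch assumes a single observed period with k claims, so that the posterior is proportional to the PL pmf times the prior, and computes the Bayes premium as the posterior mean of $(2 - \theta_1)(1 - \theta_1)/\theta_1$. All function names are hypothetical, and the code is not the closed-form expression used in the paper.

```python
import math


def pl_pmf(k: int, t: float) -> float:
    """Poisson-Lindley pmf in the (0, 1) parametrization (assumed form)."""
    return t * t * (1.0 - t) ** k * (2.0 - t + k * (1.0 - t))


def premium(t: float) -> float:
    """True Individual premium E[K | theta_1 = t]."""
    return (2.0 - t) * (1.0 - t) / t


def stsp_pdf(x: float, a: float, b: float) -> float:
    if x <= a:
        return b * (x / a) ** (b - 1.0)
    return b * ((1.0 - x) / (1.0 - a)) ** (b - 1.0)


def beta_pdf(x: float, alpha: float, beta: float) -> float:
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x))


def bayes_premium(k: int, prior_pdf, n: int = 200_000) -> float:
    """Posterior mean of the individual premium after observing k claims."""
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = pl_pmf(k, t) * prior_pdf(t)   # unnormalized posterior weight
        num += premium(t) * w
        den += w
    return num / den


a, b = 0.8, 5.0                                # illustrative hyper-parameters
alpha, beta = a * b - a + 1.0, a + b - a * b   # beta matched by mode and mean
for k in range(4):
    bp_b = bayes_premium(k, lambda t: beta_pdf(t, alpha, beta))
    bp_s = bayes_premium(k, lambda t: stsp_pdf(t, a, b))
    print(k, bp_b, bp_s, 100.0 * (bp_b - bp_s) / bp_s)
```

The last printed column is the relative difference in percent, i.e. the quantity whose maxima over b are reported in Table 1; a single value of b will in general not reproduce those maxima.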

An Application to Real Data
To address the issues described previously, we apply the analysis to a real data set which is well known in studies of automobile insurance claim calculation. These data, taken from Klugman et al. (2004), concern the number of automobile liability policies in Germany during the years 1960-61. The sample mean is 0.1442 and the sample variance is 0.1639. Accordingly, the sample dispersion index is 1.136. The Negative Binomial distribution (hereafter, NB), which is another mixed Poisson distribution allowing for overdispersion, and the Poisson distribution are compared with the Poisson-Lindley model on this data set. Table 2 shows the fits from these distributions. For comparative and illustrative purposes, the usual measures, such as the p-value, the negative log-likelihood, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), are used to compare the models. As is known, the model with the minimum BIC value is preferred. Table 2 shows that the PL model performs very well relative to the Poisson distribution and provides a fit as good as that of the two-parameter Negative Binomial model. Based on the AIC and the BIC, the PL is the preferred model. Furthermore, taking into account Ockham's razor principle, it is simpler than the Negative Binomial and is therefore preferable to that more complex model.
The next step is to choose the prior distributions. As the sample mean is under 1.67, the Triangular distribution is not considered here as a possible structure function.
For the B and STSP distributions we solve the systems of equations proposed in the previous section, and we obtain a pair of values which lead us to the B(8, 1) or the STSP(1, 8), which are the same distribution. Figure 1 shows the prior, marginal and posterior distributions for a B(8, 1). In this case, the collective premium, which is given by the expected value of the marginal distribution, is equal to 0.1746. The Bayes premium and the variation rates, denoted as TV and calculated as [BP(k|...) − BP(k−1|...)] / BP(k−1|...) × 100, are in Table 3.
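The value 0.1746 can be checked exactly. With the B(8, 1) prior the density is $8\theta_1^7$, so every integral involved is a beta integral with integer exponents and hence a rational number. The sketch below assumes a single observed period with k claims and the PL pmf in the (0, 1) parametrization; the function names are hypothetical and the variation rate is taken relative to BP(k − 1).

```python
from fractions import Fraction
from math import factorial


def beta_integral(p: int, q: int) -> Fraction:
    """Exact integral of t^p (1 - t)^q over (0, 1) for integers p, q >= 0."""
    return Fraction(factorial(p) * factorial(q), factorial(p + q + 1))


def marginal(k: int) -> Fraction:
    """m(k | B(8, 1)): integral of 8 t^7 * t^2 (1-t)^k [k + 2 - (k + 1) t]."""
    return 8 * ((k + 2) * beta_integral(9, k) - (k + 1) * beta_integral(10, k))


def bayes_premium(k: int) -> Fraction:
    """Posterior mean of (2 - t)(1 - t)/t after observing k claims."""
    # (2 - t) [k + 2 - (k + 1) t] = 2(k + 2) - (3k + 4) t + (k + 1) t^2
    num = 8 * (2 * (k + 2) * beta_integral(8, k + 1)
               - (3 * k + 4) * beta_integral(9, k + 1)
               + (k + 1) * beta_integral(10, k + 1))
    return num / marginal(k)


# Collective premium: integral of 8 t^7 (2 - t)(1 - t)/t = 8 t^6 (2 - 3t + t^2).
collective = 8 * (2 * beta_integral(6, 0) - 3 * beta_integral(7, 0) + beta_integral(8, 0))
print("collective premium:", collective, float(collective))   # 11/63, about 0.1746

previous = None
for k in range(4):
    bp = bayes_premium(k)
    rate = None if previous is None else float(100 * (bp - previous) / previous)
    print(k, float(bp), rate)
    previous = bp
```

The exact collective premium is 11/63, which agrees with the closed form (b + 3)/(b² − 1) at b = 8 and rounds to the value quoted above.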

The Collective Risk Model
This section develops an aggregate loss model and determines the Bayes premium. An STSP distribution is considered as an alternative to the usual B distribution as structure function for the parameter of the primary distribution, whereas a gamma distribution is considered for the parameter of the secondary one.

The Model
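Let $X_i$ be the random variable size of a single claim, assumed to follow an Exponential distribution with parameter $\theta_2 > 0$, $f(x \mid \theta_2) = \theta_2 e^{-\theta_2 x}$ for x > 0, whose expected value and variance are $E[X_i \mid \theta_2] = 1/\theta_2$ and $Var[X_i \mid \theta_2] = 1/\theta_2^2$. If the primary distribution is a Poisson-Lindley and the secondary is an Exponential (the crmPLE), it is verified that:
(i) The probability density function of the random variable aggregate claim, when s > 0, is given by $f_{PLE}(s \mid \theta_1, \theta_2) = \theta_1^2 (1 - \theta_1)\, \theta_2\, e^{-s \theta_1 \theta_2} \left[ 3 - 2\theta_1 + (1 - \theta_1)^2 \theta_2 s \right]$, while $\Pr(S = 0) = \theta_1^2 (2 - \theta_1)$, with the usual discontinuity of the crm appearing at s = 0.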
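(ii) The moment generating function of S is given by $M_{PLE}(t; \theta_1, \theta_2) = \frac{\theta_1^2 (\theta_2 - t) \left[ \theta_2 - (2 - \theta_1) t \right]}{(\theta_1 \theta_2 - t)^2}$, and the mean and the variance are $\frac{(2 - \theta_1)(1 - \theta_1)}{\theta_1 \theta_2}$ and $\frac{(1 - \theta_1)(\theta_1^3 - 3\theta_1^2 + 2\theta_1 + 2)}{\theta_1^2 \theta_2^2}$, respectively.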
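The structure of the aggregate claim distribution can be illustrated with a short numerical check. The sketch assumes exponential severities with rate $\theta_2$, so that the sum of k claims is gamma distributed with shape k and rate $\theta_2$, and the PL pmf in the (0, 1) parametrization. It compares the density obtained by summing the gamma mixture term by term with the closed form that results from summing the series, $\theta_1^2 (1 - \theta_1) \theta_2 e^{-s\theta_1\theta_2} [3 - 2\theta_1 + (1 - \theta_1)^2 \theta_2 s]$. The parameter values and function names are illustrative assumptions.

```python
import math


def pl_pmf(k: int, t1: float) -> float:
    """Poisson-Lindley pmf in the (0, 1) parametrization (assumed form)."""
    return t1 * t1 * (1.0 - t1) ** k * (2.0 - t1 + k * (1.0 - t1))


def aggregate_pdf_series(s: float, t1: float, t2: float, kmax: int = 200) -> float:
    """Density of S at s > 0 as a PL mixture of Gamma(k, rate t2) densities."""
    total = 0.0
    for k in range(1, kmax + 1):
        log_gamma_pdf = (k * math.log(t2) + (k - 1) * math.log(s)
                         - t2 * s - math.lgamma(k))
        total += pl_pmf(k, t1) * math.exp(log_gamma_pdf)
    return total


def aggregate_pdf_closed(s: float, t1: float, t2: float) -> float:
    """Closed form obtained by summing the mixture series analytically."""
    q = 1.0 - t1
    return t1 * t1 * q * t2 * math.exp(-s * t1 * t2) * (3.0 - 2.0 * t1 + q * q * t2 * s)


t1, t2 = 0.8, 1.5   # illustrative parameter values
for s in (0.1, 0.5, 2.0):
    print(s, aggregate_pdf_series(s, t1, t2), aggregate_pdf_closed(s, t1, t2))

# Total probability: atom at s = 0 plus the continuous part on (0, 100).
n, upper = 200_000, 100.0
h = upper / n
continuous = h * sum(aggregate_pdf_closed((i + 0.5) * h, t1, t2) for i in range(n))
print("P(S = 0) + continuous mass:", pl_pmf(0, t1) + continuous)
```

The atom at zero equals the PL probability of no claims, which is why the marginal of S at s = 0 reduces to the marginal of K at k = 0.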
The natural choice of a prior pdf for $\theta_2$ is the Gamma density $G(\theta_2; c, d) \propto \theta_2^{c - 1} e^{-d \theta_2}$, $\theta_2 > 0$. We consider values of c such that c > 1, a necessary and sufficient condition for the existence of the inverse moment. The prior mean and variance for $\theta_2$ are given by $E_G[\theta_2] = c/d$ and $Var_G[\theta_2] = c/d^2$. The corresponding prior mode for the parameter $\theta_2$ is $Mo_G[\theta_2] = \frac{c - 1}{d}$. The following notation and formulas are used throughout this paper.
Let $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ be integers and let k be a positive integer. Let s be a positive real number. Where there is no possibility of misunderstanding, and for the sake of simplicity of notation, the hyper-parameters will be omitted. We denote $E_{STSP \cdot G}\left[ \theta_1^{\varepsilon_1} (1 - \theta_1)^{\varepsilon_2} \theta_2^{\varepsilon_3} e^{-s \theta_1 \theta_2} \right]$ as $J(\varepsilon_1, \varepsilon_2, \varepsilon_3 \mid s)$. This expected value is calculated in detail in the Appendix.
This expected value is obtained as a linear combination of hypergeometric functions (see the Appendix for details).
In addition, in the crmPLE the marginal distribution of S, if independent STSP(a, b) and G(c, d) priors are considered, is given by (14) for s > 0 and, when s = 0, it follows that m(0|STSP · G) = m(0|STSP).
As indicated in Section 2, the real data sets in the problems in question present a high percentage of values at s = 0. Accordingly, it is interesting to analyze the values of the marginal distributions at s = 0. This analysis reduces to that made in the previous section.
From the comparison between the marginal distributions m(s|B · G) and m(s|STSP · G) when s > 0, the numerical analysis shows that inequalities hold in both directions.
By assuming b = 1 in expressions (14), the marginal distribution for the prior U · G distribution is obtained, and it does not depend on the hyper-parameters a and b.
Regarding the specification of the hyper-parameters, different sources of information can be considered in order to choose them on a reasonable basis: unimodality and the value of the mode, which are usually strongly intuitive aspects; the separate consideration of the PL and E models whose composition leads to the crm model; the value of m(s| · · · ) at s = 0; and, finally, some moments of the marginal distribution m(s| · · · ). For the calculation of the moments of the marginal distribution it is useful to know that $E_{m(s|B \cdot G)}[S^r] = E_{B \cdot G}\left[ E[S^r \mid \theta_1, \theta_2] \right]$, which is also true for the other marginal distributions.

The Premiums
In this section we examine the Risk Net premium, which is the expected value of the likelihood, the collective premium and the Bayes premium.
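In the crmPLE model, the Risk Net premium is $E[S \mid \theta_1, \theta_2] = \frac{(2 - \theta_1)(1 - \theta_1)}{\theta_1 \theta_2}$. Then the collective premiums, if independence between $\theta_1$ and $\theta_2$ is assumed, are given by $CP(B \cdot G) = CP(B)\, \frac{d}{c - 1}$ and $CP(STSP \cdot G) = CP(STSP)\, \frac{d}{c - 1}$. The Bayes premium for the B · G prior distribution is obtained by direct calculation for s > 0; assuming s = 0 on the right side of that expression we obtain the Bayes premium at the point of discontinuity. The Bayes premium for the STSP · G prior distribution is obtained in the same way for s > 0. Making b = 1, the Bayes premium for the prior U · G distribution is obtained.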
In the comparison for the Bayes premium we consider, separately, the cases s = 0 and s > 0.
When s = 0, the comparison between the Bayes premiums reduces to the comparison between BP(0|B) and BP(0|STSP) made in the previous section.
For s > 0, we have carried out an extensive numerical analysis for the values a = 0.6, 0.7, 0.8, 0.9 and b = 2, 7, 10, 25. Several gamma distributions have been considered. The conclusions are similar in all cases: both positive and negative values are obtained for the variation rates, defined as the quotient [BP(s|B · G) − BP(s|STSP · G)] / BP(s|STSP · G) × 100, with the hyper-parameters of the beta distribution given by (4). The variation rates indicate that neither premium is systematically greater or smaller than the other. Furthermore, it is sometimes possible to find dramatic differences between the premiums. As an illustration, Table 4 shows the variation rates for the Bayes premium for the G(4, 6).

Discussion
The aim of this paper is to determine the Bayes and collective premiums in an aggregate loss Poisson-Lindley model. A natural choice for the structure function is the Beta distribution. In this paper, taking unimodality, the value of the mode and the mean value as the essential aspects of the prior information, we study the possibility of considering the STSP distribution as an alternative structure function. The conclusion is that the STSP is not an adequate alternative in the problem in question because it is more informative and less dispersed than the Beta distribution. When the marginal distributions are compared, it is not easy to distinguish it from the Beta distribution. However, the premium values obtained are markedly different.
Given a unimodal STSP(a, b) distribution, b > 1, a unimodal $B(\alpha, \beta)$ distribution with the same mode and mean as the STSP(a, b) is obtained by considering $\alpha = ab - a + 1$; $\beta = a + b - ab$.
In the divergence between the B and the U distributions, $\psi(z)$ is the PolyGamma function (see http://functions.wolfram.com). The discrepancy of the STSP with respect to the U is given by $D_{KL}[STSP : U] = \log b - 1 + b^{-1}$, with a minimum at b = 1, whose value is 0, and which strictly increases if b > 1 (see Van Dorp & Kotz, 2002b, for details). Hence, if 1 < b < 2 it is verified that $D_{KL}[STSP : U] < D_{KL}[T : U] = 0.19314718$, and when b > 2 the contrary occurs.

Figure 1. Prior, marginal and posterior distributions

When b = 1 the collective premium does not exist. The comparison of the collective premiums is deduced directly from the inequality in the model for the number of claims, specifically CP(B · G) ≥ CP(STSP · G).

Table 2. Fitting of automobile claim data
a Expected frequencies have been combined for the calculation of $\chi^2$.

Table 4. Variation rates for Bayes premiums in the crmPLE model for the indicated values of a, b and s, when the hyper-parameters (c, d) = (4, 6)