The McDonald Generalized Beta-Binomial Distribution: A New Binomial Mixture Distribution and Simulation Based Comparison with Its Nested Distributions in Handling Overdispersion

Binomial outcome data are widely encountered in real world applications. The Binomial distribution often fails to model such data because the variance of the observed outcomes exceeds the nominal Binomial variance, a phenomenon known as overdispersion. One way of handling overdispersion is to model the success probability of the Binomial distribution using a continuous distribution defined on the standard unit interval. The resulting general class of univariate discrete distributions is known as the class of Binomial mixture distributions. The Beta-Binomial (BB) distribution is a prominent member of this class, and the Kumaraswamy-Binomial (KB) distribution is another recent member. In this paper we take McDonald's Generalized Beta distribution of the first kind as the mixing distribution and introduce a new Binomial mixture distribution called the McDonald Generalized Beta-Binomial (McGBB) distribution. Some theoretical properties of McGBB are discussed. The parameters of the McGBB distribution are estimated by maximum likelihood. A real world dataset is modeled using the new McGBB mixture distribution, and it is shown that this model gives a better fit than its nested models. Finally, an extended simulation study is presented to compare the McGBB distribution with its nested distributions in handling overdispersed binomial outcome data.


Introduction
It is well known that the discrete random variable Y, the number of successes in n binary trials, is generally modeled using the traditional Binomial distribution with parameters n and p if the binary trials are identical and independent. The success probability p (0 ≤ p ≤ 1) is usually assumed to be constant from trial to trial. The mean and the variance of a Binomially distributed random variable are then given by E(Y) = np and Var(Y) = np(1 − p). However, in most empirical situations, it has been observed that the actual variance of Y is greater than the assumed Binomial variance. This phenomenon is generally known as overdispersion; "extra-Binomial variation" and "Binomial heterogeneity" are other commonly used terms for it. One possible reason for overdispersion is that the success probability p does not remain constant from trial to trial but itself varies as a random variable. This leads to treating the success probability as a continuous random variable P bounded between 0 and 1. The resulting class of distributions is known as the class of Binomial mixture distributions. A Binomial mixture distribution can be symbolically denoted as Bin(n, P) ∧ F_P(p), where Bin(n, P) represents the Binomial distribution, F_P(p) is the distribution function of the mixing distribution of the random variable P, and f_P(p) is the corresponding mixing density. If the mean and the variance of the mixing distribution are denoted by E(P) = π and Var(P) = σ², respectively, then it can be shown that the mean and the variance of the Binomial mixture random variable Y are E(Y) = nπ and Var(Y) = nπ(1 − π) + n(n − 1)σ², respectively. The additional component n(n − 1)σ² in the variance of Y models the overdispersion.
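The mean and variance of a Binomial mixture quoted above follow from conditioning on P. The short derivation below is only a sketch; it uses the quantities π = E(P) and σ² = Var(P) together with the standard laws of total expectation and total variance, and assumes Y | P = p ∼ Bin(n, p).

```latex
\begin{align*}
E(Y) &= E\big[E(Y \mid P)\big] = E(nP) = n\pi, \\
\mathrm{Var}(Y) &= E\big[\mathrm{Var}(Y \mid P)\big] + \mathrm{Var}\big[E(Y \mid P)\big] \\
 &= E\big[nP(1-P)\big] + \mathrm{Var}(nP) \\
 &= n\big[\pi - (\sigma^2 + \pi^2)\big] + n^2\sigma^2 \\
 &= n\pi(1-\pi) + n(n-1)\sigma^2 .
\end{align*}
```

The second term vanishes when σ² = 0, that is, when P is degenerate at π, which recovers the ordinary Binomial variance.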

The aim of this paper is to propose an alternative Generalized Beta distribution, which has three additional parameters, as the mixing distribution to model the Binomial success probability. McDonald's Generalized Beta distribution of the first kind (McDonald, 1984, 1995) is considered in our work. This distribution has been shown in the literature to be more flexible than the Beta and Kumaraswamy distributions (see, for example, Alexander, Cordeiro, Ortega, & Sarabia, 2012). We call the distribution obtained by mixing the success probability of the Binomial distribution over McDonald's Generalized Beta distribution of the first kind the McDonald Generalized Beta-Binomial (McGBB) distribution. It can also be shown that the new mixture distribution includes both the Beta-Binomial distribution and the Kumaraswamy-Binomial distribution as nested distributions.
The paper is organized as follows. In section 2, we present a brief review of the Beta, Kumaraswamy, McDonald's Generalized Beta of the first kind, Beta-Binomial and Kumaraswamy-Binomial distributions. The McDonald Generalized Beta-Binomial distribution is developed in section 3 by deriving its probability mass function and moments. Section 4 demonstrates that both the BB distribution and the KB distribution are nested distributions of the McGBB distribution. In section 5, we discuss the parameter estimation of the McGBB distribution by means of maximum likelihood estimation. An empirical overdispersed binomial dataset is analyzed with the McGBB distribution and its nested distributions, and a comparison study is carried out, in section 6. A simulation study is then presented in section 7 to compare the performance of the McGBB model with the BB and KB models in handling overdispersed binomial data. Finally, section 8 provides some concluding remarks.

Review on Key Ingredients
In this section we briefly outline the three mixing distributions of the random variable P, and the two Binomial mixture distributions, Beta-Binomial distribution and Kumaraswamy-Binomial distribution.

Beta Distribution
Let P be a random variable following a Beta distribution with two shape parameters a and b, denoted by Beta(a, b). The probability density function of P is then given by
\[ f_P(p) = \frac{p^{a-1}(1-p)^{b-1}}{B(a, b)}; \quad 0 < p < 1 \text{ and } a, b > 0, \tag{1} \]
where B(a, b) denotes the beta function.

Kumaraswamy Distribution
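Let P be a random variable following a Kumaraswamy distribution (Kumaraswamy, 1980) with two shape parameters ζ and θ, denoted by Kumaraswamy(ζ, θ). The probability density function of P is then given by
\[ f_P(p) = \zeta\theta\, p^{\zeta-1}\left(1-p^{\zeta}\right)^{\theta-1}; \quad 0 < p < 1 \text{ and } \zeta, \theta > 0. \tag{2} \]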

McDonald's Generalized Beta Distribution of the First Kind
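Let P be a random variable following McDonald's Generalized Beta distribution of the first kind (McDonald, 1984, 1995) with three shape parameters α, β and γ, denoted by GB1(α, β, γ). The probability density function of P is then given by
\[ f_P(p) = \frac{\gamma}{B(\alpha, \beta)}\, p^{\alpha\gamma-1}\left(1-p^{\gamma}\right)^{\beta-1}; \quad 0 < p < 1 \text{ and } \alpha, \beta, \gamma > 0. \tag{3} \]
The r-th moment of McDonald's Generalized Beta distribution of the first kind is given by
\[ E\left(P^{r}\right) = \frac{B\left(\alpha + \frac{r}{\gamma}, \beta\right)}{B(\alpha, \beta)}; \quad r = 1, 2, \dots \tag{4} \]
It is easy to show that the GB1 distribution reduces to the Beta distribution when γ = 1, and to the Kumaraswamy distribution when α = 1.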
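The two reductions and the moment formula can be checked numerically. The sketch below assumes the usual GB1 parameterization f(p) = γ p^(αγ−1) (1 − p^γ)^(β−1) / B(α, β) and the moment E(P^r) = B(α + r/γ, β) / B(α, β); the function names and the parameter values are illustrative only and are not taken from the paper.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gb1_pdf(p, alpha, beta, gamma):
    """GB1 density on (0, 1) under the parameterization assumed above."""
    kernel = p ** (alpha * gamma - 1.0) * (1.0 - p ** gamma) ** (beta - 1.0)
    return gamma * kernel / math.exp(log_beta(alpha, beta))


def gb1_moment(r, alpha, beta, gamma):
    """Closed-form r-th moment B(alpha + r/gamma, beta) / B(alpha, beta)."""
    return math.exp(log_beta(alpha + r / gamma, beta) - log_beta(alpha, beta))


def midpoint_integral(f, steps=20000):
    """Midpoint-rule integral of f over (0, 1); used only as a sanity check."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))


def beta_pdf(p, a, b):
    """Beta(a, b) density, the gamma = 1 special case."""
    return p ** (a - 1.0) * (1.0 - p) ** (b - 1.0) / math.exp(log_beta(a, b))


def kumaraswamy_pdf(p, zeta, theta):
    """Kumaraswamy(zeta, theta) density, the alpha = 1 special case."""
    return zeta * theta * p ** (zeta - 1.0) * (1.0 - p ** zeta) ** (theta - 1.0)
```

With α = 1 the GB1 density matches the Kumaraswamy density with ζ = γ and θ = β, which is why the Kumaraswamy-Binomial distribution obtained later carries the parameters β and γ.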

Beta-Binomial Distribution and Kumaraswamy-Binomial Distribution
The Beta-Binomial (BB) distribution is obtained by mixing the Binomial probability of success P over the Beta distribution defined in section 2.1. Suppose Y|p ∼ Bin(n, P) and P ∼ Beta(a, b). The probability mass function (PMF) of the BB distribution is then given by
\[ P_Y(y) = \binom{n}{y}\frac{B(y + a,\, n - y + b)}{B(a, b)}; \quad y = 0, 1, \dots, n \text{ and } a, b > 0. \tag{5} \]
Likewise, the Kumaraswamy-Binomial (KB) distribution (Li et al., 2011) is obtained by mixing the Binomial probability of success P over the Kumaraswamy distribution defined in section 2.2. Suppose Y|p ∼ Bin(n, P) and P ∼ Kumaraswamy(ζ, θ). The PMF of the KB distribution is then given by
\[ P_Y(y) = \zeta\theta\binom{n}{y}\sum_{i=0}^{\infty}(-1)^{i}\binom{\theta-1}{i} B(y + \zeta + \zeta i,\, n - y + 1); \quad y = 0, 1, \dots, n \text{ and } \zeta, \theta > 0. \tag{6} \]

The McDonald Generalized Beta-Binomial Distribution (McGBB)
In this section we define the McDonald Generalized Beta-Binomial distribution and derive some of its basic properties. We begin with the general construction of Binomial mixture distributions.
Generally, a Binomial mixture distribution is obtained through an integration approach. Conditional on p, suppose Y follows a Binomial distribution, denoted by Y|p ∼ Bin(n, P). The unconditional probability mass function of Y can be obtained by evaluating the well-known integral
\[ P_Y(y) = \int_{0}^{1}\binom{n}{y} p^{y}(1-p)^{n-y} f_P(p; \Theta)\, dp \tag{7} \]
for y = 0, 1, . . ., n, where Θ denotes the parameters of the mixing distribution.
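Proof. (i) Let Y|p ∼ Bin(n, P) and P ∼ GB1(α, β, γ). Then the unconditional PMF of Y can be obtained by using Equation (7) as below,
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\int_{0}^{1} p^{y+\alpha\gamma-1}(1-p)^{n-y}\left(1-p^{\gamma}\right)^{\beta-1} dp. \]
By applying the binomial series representation of (1 − p^γ)^{β−1} to the above, we get
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\int_{0}^{1} p^{y+\alpha\gamma-1}(1-p)^{n-y}\sum_{i=0}^{\infty}(-1)^{i}\binom{\beta-1}{i} p^{\gamma i}\, dp. \]
Since the binomial series is a power series, it can be integrated term by term and hence we have
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\sum_{i=0}^{\infty}(-1)^{i}\binom{\beta-1}{i}\int_{0}^{1} p^{y+\alpha\gamma+\gamma i-1}(1-p)^{n-y}\, dp. \]
Since Re(y + αγ + γi) > 0 and Re(n − y + 1) > 0 [where Re(.) represents the real part of the number], the inner integral can be represented using a Beta function as
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\sum_{i=0}^{\infty}(-1)^{i}\binom{\beta-1}{i} B(y + \alpha\gamma + \gamma i,\, n - y + 1), \]
which is the PMF given in Equation (8). Although this is a valid PMF, since it is obtained by means of the well-known stochastic compounding formula, an infinite series occurs inside it. It is therefore of interest to know whether the infinite series can be replaced by a finite series. The second part of the theorem rearranges the PMF accordingly.
(ii) To obtain the rearranged probability mass function of McGBB(n, α, β, γ), let us begin with
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\int_{0}^{1} p^{y+\alpha\gamma-1}(1-p)^{n-y}\left(1-p^{\gamma}\right)^{\beta-1} dp. \]
Now consider the binomial series expansion of (1 − p)^{n−y} in the above integral. Since n − y is a non-negative integer, this series terminates at n − y and can be written in the form \(\sum_{j=0}^{n-y}(-1)^{j}\binom{n-y}{j}p^{j}\). Thus,
\[ P_Y(y) = \binom{n}{y}\frac{\gamma}{B(\alpha, \beta)}\sum_{j=0}^{n-y}(-1)^{j}\binom{n-y}{j}\int_{0}^{1} p^{y+j+\alpha\gamma-1}\left(1-p^{\gamma}\right)^{\beta-1} dp. \]
Again, since Re(y/γ + α + j/γ) > 0 and Re(β) > 0, the inner integral can be represented using a Beta function (substitute u = p^γ) and hence,
\[ \int_{0}^{1} p^{y+j+\alpha\gamma-1}\left(1-p^{\gamma}\right)^{\beta-1} dp = \frac{1}{\gamma} B\left(\frac{y}{\gamma} + \alpha + \frac{j}{\gamma},\, \beta\right). \tag{11} \]
Now by inserting Equation (11) in the preceding expression we have
\[ P_Y(y) = \binom{n}{y}\frac{1}{B(\alpha, \beta)}\sum_{j=0}^{n-y}(-1)^{j}\binom{n-y}{j} B\left(\frac{y}{\gamma} + \alpha + \frac{j}{\gamma},\, \beta\right), \]
which is the PMF given in Equation (9).
(iii) To obtain the r-th moment about zero, the mean and the variance of McGBB(n, α, β, γ), apply the following well-known identities of probability theory,
\[ E(Y) = E_P\left[E(Y \mid P)\right] \quad \text{and} \quad \mathrm{Var}(Y) = E_P\left[\mathrm{Var}(Y \mid P)\right] + \mathrm{Var}_P\left[E(Y \mid P)\right]. \]
Since Y|p ∼ Bin(n, P) it follows that
\[ E(Y) = nE(P) \]
and
\[ \mathrm{Var}(Y) = nE(P)\left[1 - E(P)\right] + n(n-1)\mathrm{Var}(P). \]
Now, by substituting the moments of P ∼ GB1(α, β, γ) given in Equation (4) in the above two identities, the moments of McGBB(n, α, β, γ) stated in Equation (10) can be derived.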
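The finite-series form of the PMF is straightforward to evaluate. The following is a minimal sketch, assuming the finite-series PMF has the form C(n, y) / B(α, β) · Σ_j (−1)^j C(n − y, j) B(y/γ + α + j/γ, β) and that E(P^r) = B(α + r/γ, β) / B(α, β); the function names are illustrative and not from the paper.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mcgbb_pmf(y, n, alpha, beta, gamma):
    """McGBB probability P(Y = y) from the finite alternating series."""
    series = sum(
        (-1) ** j
        * math.comb(n - y, j)
        * math.exp(log_beta(y / gamma + alpha + j / gamma, beta))
        for j in range(n - y + 1)
    )
    return math.comb(n, y) * series / math.exp(log_beta(alpha, beta))


def mcgbb_mean_var(n, alpha, beta, gamma):
    """Mean and variance via E(Y) = n*pi, Var(Y) = n*pi*(1-pi) + n*(n-1)*sigma2."""
    pi = math.exp(log_beta(alpha + 1.0 / gamma, beta) - log_beta(alpha, beta))
    second = math.exp(log_beta(alpha + 2.0 / gamma, beta) - log_beta(alpha, beta))
    sigma2 = second - pi * pi
    return n * pi, n * pi * (1.0 - pi) + n * (n - 1) * sigma2


def bb_pmf(y, n, a, b):
    """Beta-Binomial probability, used to check the gamma = 1 special case."""
    return math.comb(n, y) * math.exp(log_beta(y + a, n - y + b) - log_beta(a, b))
```

Because the series alternates in sign, cancellation error can grow for large n; for the small numbers of trials considered in this paper (n ≤ 10) double precision is more than adequate.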

Nested Distributions of McGBB
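In this section we show that the BB distribution and the KB distribution are nested within the McGBB distribution under specific parameter settings.
Proof. For γ = 1, the PMF of McGBB in Equation (8), written in its integral form, becomes
\[ P_Y(y) = \binom{n}{y}\frac{1}{B(\alpha, \beta)}\int_{0}^{1} p^{y+\alpha-1}(1-p)^{n-y+\beta-1}\, dp. \]
Since Re(y + α) > 0 and Re(n − y + β) > 0, we have
\[ P_Y(y) = \binom{n}{y}\frac{B(y + \alpha,\, n - y + \beta)}{B(\alpha, \beta)}, \]
which is the PMF of the Beta-Binomial distribution.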
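If Y ∼ McGBB(n, α, β, γ), then by setting α = 1, we obtain the Kumaraswamy-Binomial distribution with parameters n, β and γ.
Proof. For α = 1, the PMF of McGBB(n, α, β, γ) in Equation (8) becomes, since B(1, β) = 1/β,
\[ P_Y(y) = \gamma\beta\binom{n}{y}\sum_{i=0}^{\infty}(-1)^{i}\binom{\beta-1}{i} B(y + \gamma + \gamma i,\, n - y + 1), \]
which has the form of Equation (6) with ζ = γ and θ = β.

The log-likelihood function of the McGBB distribution can be written in terms of the PMF in either Equation (8) or Equation (9). However, the log-likelihood function based on the PMF in Equation (8) requires many iterations because of the infinite series, which costs unnecessary computing time. This is one of the reasons that motivated us to rearrange the PMF of the McGBB distribution from Equation (8) to Equation (9). Thus, for observations y_1, . . ., y_N, the log-likelihood function for Θ = (α, β, γ)^T can be defined as follows,
\[ \ell(\Theta) = \sum_{k=1}^{N}\left[\log\binom{n}{y_k} - \log B(\alpha, \beta) + \log\sum_{j=0}^{n-y_k}(-1)^{j}\binom{n-y_k}{j} B\left(\frac{y_k}{\gamma} + \alpha + \frac{j}{\gamma},\, \beta\right)\right]. \]
The maximum likelihood estimates \(\hat{\Theta} = (\hat{\alpha}, \hat{\beta}, \hat{\gamma})^{T}\) can be obtained either by directly maximizing the above log-likelihood function with respect to Θ or by solving the three simultaneous equations obtained by setting U(Θ) = 0. The score function U(Θ) is defined as the gradient of ℓ(Θ), whose components are the partial derivatives of ℓ(Θ) with respect to α, β and γ. In particular, the optimization method "Simplex algorithm for minimization" (Nelder, 1965) is used in our study to minimize the user-defined negative log-likelihood function with respect to Θ.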
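As an illustration of the estimation step, the sketch below writes the negative log-likelihood from the finite-series PMF, assumed here to be C(n, y) / B(α, β) · Σ_j (−1)^j C(n − y, j) B(y/γ + α + j/γ, β). The data are summarized as frequency counts for y = 0, . . ., n; the counts shown are made up for the example and are not the alcohol consumption data. The function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mcgbb_pmf(y, n, alpha, beta, gamma):
    """McGBB probability P(Y = y) from the finite alternating series."""
    series = sum(
        (-1) ** j
        * math.comb(n - y, j)
        * math.exp(log_beta(y / gamma + alpha + j / gamma, beta))
        for j in range(n - y + 1)
    )
    return math.comb(n, y) * series / math.exp(log_beta(alpha, beta))


def neg_log_likelihood(theta, counts):
    """Negative log-likelihood for theta = (alpha, beta, gamma).

    counts[y] is the number of observations equal to y, for y = 0..n.
    Returns +inf outside the parameter space so that a derivative-free
    minimizer simply moves away from invalid points.
    """
    alpha, beta, gamma = theta
    if min(alpha, beta, gamma) <= 0.0:
        return math.inf
    n = len(counts) - 1
    total = 0.0
    for y, freq in enumerate(counts):
        if freq == 0:
            continue
        prob = mcgbb_pmf(y, n, alpha, beta, gamma)
        if prob <= 0.0:  # guard against round-off in the alternating sum
            return math.inf
        total -= freq * math.log(prob)
    return total


# Hypothetical frequency table for n = 7 trials (illustration only).
example_counts = [10, 8, 6, 5, 5, 6, 8, 12]
```

This function can be handed to any Nelder-Mead implementation, for instance `scipy.optimize.minimize(neg_log_likelihood, x0, args=(example_counts,), method="Nelder-Mead")` in SciPy; the paper's own computations were carried out in R.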

Applications of McGBB in Handling Overdispersion
This section demonstrates the superiority of the McGBB distribution over its nested binomial mixture distributions BB and KB in handling overdispersed binomial outcome data. We compare the goodness of fit and the Analysis of Deviance (ANODEV) results of the three Binomial mixture models in modeling a real world dataset that exhibits overdispersion relative to the Binomial distribution. The data for this section are taken from Alanko and Lemmens (1996) and have previously been used by Rodríguez-Avi et al. (2007) and Li et al. (2011) for similar purposes.

Data Description
The numbers of alcohol consumption days in two reference weeks were separately self-reported by a randomly selected sample of 399 respondents in the Netherlands in 1983. The number of days Y on which an individual consumes alcohol, out of n = 7 days in a reference week, can be treated as a Binomial variable. However, the Binomial success probability p, the probability that an individual consumes alcohol on a randomly chosen day in a reference week, cannot be treated as constant in this setup, since the inclination to drink and the drinking behavior vary from person to person. This leads to analyzing the data using a Binomial mixture distribution, modeling the random variable P with a continuous distribution bounded in the standard unit interval. Alanko and Lemmens (1996) modeled these data using the Beta-Binomial distribution, and Li et al. (2011) used the Kumaraswamy-Binomial distribution.

Modeling Results and Discussions
We model the alcohol consumption data by means of the McGBB distribution, obtaining the maximum likelihood estimates Θ̂ = (α̂, β̂, γ̂)^T as described in section 5. The MLEs â and b̂ of the BB model are taken as starting values for α and β, and the initial value of γ is taken as 1 in the numerical iterative procedures.

Simulation Study
In this section, we present a Monte Carlo Simulation study conducted to investigate the performance of McGBB distribution with its nested BB and KB distributions in modeling simulated overdispersed binomial outcome data under varying degrees of overdispersion.

Generation of Overdispersed Binomial Variates
In general, random generation from the Beta-Binomial distribution is used as a standard method to simulate overdispersed Binomial variables (see, for example, Ennis & Bi, 1998). However, since the present study compares three Binomial mixture distributions, including the Beta-Binomial distribution, in handling overdispersed Binomial data, the results could be biased towards the Beta-Binomial distribution. Therefore an alternative algorithm proposed by Ahn and Chen (1995) is used to simulate overdispersed Binomial variables. The algorithm developed by Ahn and Chen (1995) generates overdispersed Binomial variables with specified mean and variance from an underlying multivariate normal distribution; it is simplified here using an equal correlation structure and briefly outlined in subsection 7.1.1.

Representation of Overdispersed Binomial Variables
Let the overdispersed binomial random variable Y_i be the sum of the correlated binary variables X_i1, X_i2, . . ., X_in, where n is the number of trials and i = 1, 2, . . ., N. Suppose E(X_ij) = π, Var(X_ij) = π(1 − π) and corr(X_ij, X_ik) = ρ for j ≠ k. Then the mean and the variance of Y_i are given by
\[ E(Y_i) = n\pi \quad \text{and} \quad \mathrm{Var}(Y_i) = n\pi(1-\pi)\left[1 + (n-1)\rho\right]. \tag{12} \]
The parameter ρ is the overdispersion parameter (also the intracluster correlation coefficient) of the overdispersed Binomial random variable. When ρ → 0 this distribution reduces to the Binomial distribution, and ρ → 1 results in severe overdispersion. The means and the variances of the three Binomial mixture distributions under consideration can be represented in the same form as Equation (12); however, the expressions for π and ρ depend on the mixing distribution. Such a representation of the mean and the variance of the McGBB distribution is given in Equation (10).
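To see how π and ρ depend on the mixing distribution, one can equate the variance Var(Y_i) = nπ(1 − π)[1 + (n − 1)ρ] with the general mixture variance Var(Y) = nπ(1 − π) + n(n − 1)σ² from the Introduction. The identification below is a sketch; the GB1 expressions assume the moment formula E(P^r) = B(α + r/γ, β) / B(α, β).

```latex
\begin{align*}
n\pi(1-\pi)\,\bigl[1+(n-1)\rho\bigr] &= n\pi(1-\pi) + n(n-1)\sigma^2
 \;\Longrightarrow\; \rho = \frac{\sigma^2}{\pi(1-\pi)}, \\
\pi &= \frac{B(\alpha + 1/\gamma,\, \beta)}{B(\alpha, \beta)}, \qquad
\sigma^2 = \frac{B(\alpha + 2/\gamma,\, \beta)}{B(\alpha, \beta)} - \pi^2 .
\end{align*}
```

Under this reading, ρ is the variance of the mixing distribution relative to its maximum possible value π(1 − π), so it lies between 0 and 1.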

Algorithm to Generate Overdispersed Binomial Random Variable
Step 1: For given n, π and ρ, solve the following equation for δ,
\[ \Phi\left[z(\pi), z(\pi), \delta\right] = \pi(1-\pi)\rho + \pi^{2}, \]
where Φ[z(π), z(π), δ] is the cumulative distribution function of the standard bivariate normal random variable with correlation coefficient δ, and z(π) denotes the π-th quantile of the standard normal distribution.
Step 2: Generate n-dimensional multivariate normal random variables Z_i = (Z_i1, Z_i2, . . ., Z_in)^T with mean 0 and constant correlation matrix Σ_i for i = 1, 2, . . ., N, where the elements (Σ_i)_lm are δ for l ≠ m.
Step 3: For each j = 1, 2, . . ., n define
\[ X_{ij} = \begin{cases} 1, & \text{if } Z_{ij} \le z(\pi), \\ 0, & \text{otherwise.} \end{cases} \]
Then it can be shown that the random variable \(Y_i = \sum_{j=1}^{n} X_{ij}\) is overdispersed relative to the Binomial distribution.
This follows from the fact that E(X_ij) = P(Z_ij ≤ z(π)) = π and E(X_ij X_ik) = Φ[z(π), z(π), δ] = π(1 − π)ρ + π², so that corr(X_ij, X_ik) = ρ for j ≠ k. Thus the correlated binary variables generated by this algorithm sum to an overdispersed Binomial random variable of the kind characterized in Equation (12).
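The three steps can be sketched in a few lines of standard-library Python. This is not the authors' R implementation. Two simplifications are assumed and should be read as such: the equicorrelated normal vector is built from a shared factor, Z_ij = √δ·W_i + √(1 − δ)·E_ij, which requires 0 ≤ δ < 1, and the bivariate normal probability in Step 1 is computed by one-dimensional midpoint integration over that shared factor rather than by a library routine. Step 1 is taken to be Φ[z(π), z(π), δ] = π(1 − π)ρ + π². All function names are illustrative.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_success_prob(pi, delta, grid=2000):
    """P(Z1 <= z(pi), Z2 <= z(pi)) for standard normals with correlation delta.

    Conditional on the shared factor W the two events are independent, so the
    probability is E_W[ Phi((z - sqrt(delta) W) / sqrt(1 - delta))^2 ].
    """
    z = STD_NORMAL.inv_cdf(pi)
    a, b = math.sqrt(delta), math.sqrt(1.0 - delta)
    lo, hi = -8.0, 8.0
    h = (hi - lo) / grid
    total = 0.0
    for k in range(grid):
        w = lo + (k + 0.5) * h
        total += STD_NORMAL.pdf(w) * STD_NORMAL.cdf((z - a * w) / b) ** 2
    return total * h


def solve_delta(pi, rho, tol=1e-10):
    """Step 1: bisection for delta in Phi[z, z, delta] = pi(1-pi)rho + pi^2."""
    target = pi * (1.0 - pi) * rho + pi * pi
    lo, hi = 0.0, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if joint_success_prob(pi, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def generate_overdispersed(n, pi, rho, N, seed=None):
    """Steps 2 and 3: N overdispersed binomial counts, each out of n trials."""
    rng = random.Random(seed)
    z = STD_NORMAL.inv_cdf(pi)
    delta = solve_delta(pi, rho)
    a, b = math.sqrt(delta), math.sqrt(1.0 - delta)
    counts = []
    for _ in range(N):
        w = rng.gauss(0.0, 1.0)  # shared factor inducing the equal correlation
        successes = sum(
            1 for _j in range(n) if a * w + b * rng.gauss(0.0, 1.0) <= z
        )
        counts.append(successes)
    return counts
```

For π = 0.5 the quantile z(π) is zero and the bivariate normal probability has the closed form 1/4 + arcsin(δ)/(2π), which gives a convenient check on the Step 1 solver.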

Simulation Design
In a simulation study like this, determining the parameter values used to generate the required data is a challenging task. In the alcohol consumption data presented in the previous section, the McGBB estimate of π for both week 1 and week 2 is around 0.5, while the corresponding estimate of the overdispersion parameter ρ for both weeks is around 0.4. Also, the number of trials in these data is 7 and the total number of observations is 399. Even though we could generate overdispersed data for different ρ values while keeping the other parameters π, n and N fixed at the values in these data, in the present study we have conducted an extended simulation to understand the complete scope of the problem. Consequently, three π values are chosen as 0.1 (extreme), 0.5 (middle) and 0.75 (moderate) [the range of success probabilities from 0 to 0.5 is adequate, since the other half of the possible π values can be covered by modeling the failure probability; without loss of generality, we include π = 0.75 as a moderate value instead of π = 0.25]; two n values are chosen as 5 (lower) and 10 (greater); and two N values are selected as 20 (small number of observations) and 500 (large number of observations). Further, 10 values of the overdispersion parameter ρ are picked from 0.05 to 0.9 in increasing order, since the main objective of this simulation study is to understand the behavior and compare the performance of the three Binomial mixture distributions discussed above for different degrees of overdispersion. Thus, we run a total of 3 × 2 × 2 × 10 = 120 factorial combinations of the four factors π = {0.1, 0.5, 0.75}, n = {5, 10}, N = {20, 500} and ρ = {0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}. The value ρ = 0.05 is chosen to observe the performance of the mixture models when there is almost no overdispersion.
For each of the 120 parameter combinations, we then simulate 1000 overdispersed Binomial datasets (a total of 120,000 datasets) using the algorithm described above. Starting from the data generation, the entire simulation study (including the model fitting, evaluation and comparison presented below) is programmed and performed using the open source statistical software R and RStudio, an integrated development environment for R.
Simulated data of this type may arise in many real world applications. Examples include surveys of consumption of a product or service over a short time frame, like the one described in section 6, and other types of behavior reporting over a short retrospective time period, such as the consumer purchasing behavior investigated by Chatfield and Goodhardt (1970). Other examples include plant disease incidence data, modeled using the Beta-Binomial distribution by Hughes and Madden (1993).
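The factorial design can be enumerated directly, which is a convenient way to drive such a simulation and to confirm the count of parameter combinations. The sketch below uses the factor levels stated in the text and the 1000 replicates per combination used in the study; the variable names are illustrative.

```python
from itertools import product

PI_VALUES = (0.1, 0.5, 0.75)   # success probabilities
TRIAL_COUNTS = (5, 10)         # n, number of trials
SAMPLE_SIZES = (20, 500)       # N, number of observations
RHO_VALUES = (0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
DATASETS_PER_COMBINATION = 1000

# One dictionary per cell of the 3 x 2 x 2 x 10 factorial design.
design = [
    {"pi": pi, "n": n, "N": N, "rho": rho}
    for pi, n, N, rho in product(PI_VALUES, TRIAL_COUNTS, SAMPLE_SIZES, RHO_VALUES)
]

total_datasets = len(design) * DATASETS_PER_COMBINATION
```

Iterating over `design` in a fixed order, with a seed derived from the cell index, is one simple way to make every simulated dataset reproducible.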

Model Evaluation Procedures
For each of the 120,000 datasets, the maximum likelihood estimates of the three Binomial mixture distributions (McGBB, BB and KB) are obtained along with the log-likelihood values and the AIC values of the model fit. Then, by comparing the observed frequencies with the expected frequencies obtained from each of the three estimated Binomial mixture models, the Chi-Square goodness of fit test statistics, the numbers of degrees of freedom and the associated p-values for testing that the data are consistent with the distribution specified under the null hypothesis are also obtained. The model evaluations are carried out in two procedures as stated below.
Procedure 1: The three Binomial mixture models are evaluated individually for each of the 120 parameter combinations. This is done graphically by comparing boxplots of the calculated AIC values and, for an inferential discussion, by reporting and evaluating the percentage of p-values that significantly reject the Binomial mixture model under consideration at the 5% significance level. These measures are also used to compare across the models for each parameter combination.
Procedure 2: Pairwise comparisons of the McGBB model with its nested BB and KB models are carried out for each of the 120,000 simulated datasets. Here, we perform Analysis of Deviance (ANODEV) by calculating the deviance difference from the log-likelihood values to determine whether the complex McGBB model, with its additional parameter, provides a significantly better fit than its nested models. Again, the percentage of p-values that significantly reject the simpler model in favor of the complex McGBB model at the 5% significance level is reported for each of the 120 parameter combinations.
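The quantities used in the two procedures are simple functions of the maximized log-likelihoods. The sketch below assumes the usual definitions, AIC = 2k − 2ℓ and a deviance difference 2(ℓ_full − ℓ_nested) referred to a chi-square distribution with one degree of freedom (one additional parameter in McGBB); that reference distribution is an assumption about how the ANODEV p-values are computed, not a detail stated in the text. The function names are hypothetical.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2*loglik (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def anodev_one_df(log_lik_nested, log_lik_full):
    """Deviance difference and its p-value under a chi-square with 1 df.

    For one degree of freedom, P(chi2 > d) = erfc(sqrt(d / 2)), so no
    statistics library is needed.
    """
    deviance_diff = max(0.0, 2.0 * (log_lik_full - log_lik_nested))
    p_value = math.erfc(math.sqrt(deviance_diff / 2.0))
    return deviance_diff, p_value


def rejection_percentage(p_values, level=0.05):
    """Percentage of datasets whose p-value falls below the given level."""
    rejected = sum(1 for p in p_values if p < level)
    return 100.0 * rejected / len(p_values)
```

Applied to the 1000 p-values of one parameter combination, `rejection_percentage` gives the kind of entry reported in the rejection-percentage tables.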

Results of the Simulation Study
First we compare the individual evaluations of the three Binomial mixture models by means of the methods stated above in Procedure 1. The percentages of significant p-values that lead to rejecting the model under consideration at the 5% significance level, out of the 1000 simulated datasets for each of the 120 parameter combinations, are presented in Table 3.
It can be seen from these numbers that, regardless of the π and n values, the rejection percentage of all three Binomial mixture models increases with the number of observations at the two extremes of the degree of overdispersion [ρ = 0.05, 0.1 and ρ = 0.9]. This large increase in the rejection percentage of all three Binomial mixture models with N also continues for extreme π values from ρ = 0.2 to ρ = 0.5. The same continues for the KB models even for ρ = 0.6, 0.7 and 0.8. Moreover, the KB model fails to fit any of the 1000 simulated datasets for high overdispersion, extreme probability and a large number of observations, irrespective of the number of trials, which is a major drawback of the KB model. For the BB model and the new McGBB model, the rejection percentage does not increase considerably with N for middle and moderate π values from ρ = 0.2 to ρ = 0.7, regardless of the n values.
The comparison across all three Binomial mixture models for each parameter combination suggests that there are no large differences in the rejection percentages between the three models for most parameter combinations, except in the case of high overdispersion, in which the KB model fails to fit a very high number of simulated datasets compared to the BB and McGBB models. Further, for low overdispersion values, at some particular parameter combinations both the BB and KB models give a very serious rejection percentage (for example, more than 20% of the simulated datasets are rejected by the BB models at {π = 0.1, n = 10, N = 500, ρ ≤ 0.1}, whereas the newly proposed McGBB models have a lower rejection percentage). Even though there are a few configurations in which the McGBB model gives a slightly higher rejection percentage than the BB and KB models, this is not a serious issue, as BB and KB are nested distributions of the McGBB distribution. Besides, it should be noted that there may be a few simulated datasets that cannot be modeled by any of the three Binomial mixture models considered herein. By its nature, a low probability of success can produce binomial outcome datasets that contain an excessive number of zeros, known as zero-inflated data. The relatively high rejection percentage (>10%) of all three Binomial mixture models at {π = 0.1, n = 10, N = 500, ρ ≤ 0.2} is due to this zero-inflation feature.
which is the PMF of Kumaraswamy-Binomial distribution. Probability Mass Function plots of McGBB for some arbitrary parameter values are shown in Figure 1. Here, a graphical comparison of McGBB with its nested mixture distributions, BB and KB, is illustrated by fixing the common parameters and allowing the additional parameter to vary. These comparison plots indicate that the additional parameter in our new Binomial mixture distribution has more impact on the shape of the PMF compared to that of its nested distributions.
and middle π values irrespective of n values, a noticeably high percentage of simulated datasets favor the complex McGBB model over the simpler BB model for low overdispersion values (ρ ≤ 0.1). Again, for high overdispersion values (ρ ≥ 0.6), a small percentage of simulated datasets favor the complex McGBB model over the simpler BB model when N is large. Finally, it can be seen from Table 5 that the ANODEV comparisons of McGBB vs KB also yield many parameter combinations with zero percentage of significant p-values, indicating that KB is a minimum adequate model relative to the McGBB model at those parameter settings. Nevertheless, for high overdispersion values (ρ ≥ 0.6), the simpler KB models are extensively rejected in favor of the complex McGBB models for large N values. For example, at the parameter combination {π = 0.5, n = 10, N = 500, ρ = 0.9}, the simpler KB model is rejected in favor of the complex McGBB model for 87.6% of the simulated datasets, which is exceptionally high. Our overall finding from the simulation study is that, although the BB model and the KB model are minimum adequate models compared to the complex McGBB model when N is small, regardless of the other parameter settings, this is not true in general. In particular, when N is large there is a considerable number of simulated datasets that cannot be modeled by either the BB or the KB model but that the proposed McGBB possibly models well. Note that for many simulated datasets the McGBB model outperforms the BB model for a very low degree of overdispersion (ρ ≤ 0.1) when N = 500 and π = 0.1 and 0.5, and the McGBB model outperforms the KB model for a relatively high degree of overdispersion (ρ ≥ 0.6) when N = 500 for all π values considered, and also when ρ ≤ 0.1, N = 500, π = 0.1 and n = 10.

Concluding Remarks
In the present study, we propose a new three-parameter Binomial mixture distribution, namely the McDonald Generalized Beta-Binomial (McGBB) distribution, by mixing the success probability of the Binomial distribution over McDonald's Generalized Beta distribution of the first kind. We present two different arrangements of the probability mass function of the McGBB distribution, one containing an infinite series and the other a finite series. The central moments of the McGBB distribution are also obtained, along with its mean and variance. The additional parameter in the McGBB distribution accommodates a wide range of shapes in addition to those accommodated by its nested Binomial mixture distributions. The parameters of the McGBB distribution are estimated by maximum likelihood. The main objective of our study is to compare the new McGBB mixture distribution with its nested distributions in handling Binomial overdispersion. This objective is achieved by means of a real dataset and an extended simulation study. The results for the real data show that the new McGBB mixture model provides a better fit, and a good improvement in the goodness of fit tests, than the BB and KB models. From the results of the simulation study, it is also evident that the McGBB model is superior to its nested models for some parameter combinations. Hence the proposed McGBB distribution has great potential for handling Binomial overdispersion.

Figure A2. Boxplots of the distribution of AIC values when ρ = 0.9

Table 1. [...] Rodríguez-Avi et al. (2007) modeling results of alcohol consumption data

As can be seen in Table 1, the p-values in the Chi-Square goodness of fit tests for both the Beta-Binomial (0.086 for week 1 and 0.082 for week 2) and the Kumaraswamy-Binomial (0.0757 for week 1 and 0.0792 for week 2) models are noticeably small. Further, there are considerably large discrepancies between the expected frequencies obtained from both the Beta-Binomial and the Kumaraswamy-Binomial models and the actual observed frequencies. Also, Li et al. (2011) noted that the Beta-Binomial distribution and the Kumaraswamy-Binomial distribution, which have only two additional parameters, have the same flexibility in modeling these consumption data. Thus we conclude that neither the Beta-Binomial nor the Kumaraswamy-Binomial modeling approach is very satisfactory in analyzing these data. On the other hand, a Generalized Beta-Binomial (GBB) distribution proposed and developed by Rodríguez-Avi et al. (2007) possesses great flexibility in modeling these data. Since Rodríguez-Avi's GBB distribution was developed in a different context, we do not consider it further at present. However, the newly proposed McGBB distribution provides an admirable fit to the alcohol consumption data compared to its nested distributions. The discrepancy between the observed frequencies and the expected frequencies is much smaller for the McGBB model than for the BB and KB models. For example, the observed number of respondents who consume alcohol on all seven days of a week is 95 in week 1 and 84 in week 2; the BB model gives expected frequencies of 87.8 and 76.7; the KB model gives a similar 87.1 and 76.45; while, as anticipated, the McGBB model gives 94.90 and 83.26 for the expected number of respondents who consume alcohol on all seven days of a week in week 1 and week 2, respectively. The smaller discrepancies between the observed and expected frequencies in the McGBB model also result in a substantial decrease in the χ² goodness of fit test statistic and accordingly larger p-values (0.7060 for week 1 and 0.4055 for week 2). This indicates that, at any standard statistical significance level, the McGBB distribution models these data well whereas the BB and KB distributions fail to fit. Moreover, ANODEV results comparing the McGBB model with its nested models in fitting the alcohol consumption data are presented in Table 2. The p-values in Table 2 indicate that both the BB and KB models are significantly rejected in favor of the McGBB model. Therefore, based on these results, we conclude that the proposed McGBB distribution provides a better fit to these data than its nested BB and KB distributions.

Table 3. The percentage of the significant p-values at 5% significance level which leads to reject the model under consideration out of the 1000 simulated datasets

Next, we examine the Analysis of Deviance comparisons performed to determine whether the complex McGBB model is superior to its simpler nested models in fitting overdispersed binomial outcome data. Table 4 presents the results of Procedure 2, described in section 7.3, for the ANODEV comparison of McGBB over BB for the first set of hypotheses stated in Table 2. Correspondingly, Table 5 contains the similar ANODEV comparison of McGBB over KB for the second set of hypotheses stated in Table 2.

Table 4. Percentage of significant p-values in the ANODEV of McGBB over BB

Table 5. Percentage of significant p-values in the ANODEV of McGBB over KB