Discriminating Among Several Semiparametric Models

The T-optimality criterion can be used to discriminate between two or more models. Another criterion for this purpose is the KL-criterion, which is based on the Kullback-Leibler distance. The KL-criterion can be used to discriminate between two non-normal models, and a generalization of it has been studied for discriminating among more than two non-normal models. In this paper, the generalized KL-criterion is used to discriminate among more than two semiparametric models. The proposed technique is illustrated with an application to three proportional hazards models using real data.


Introduction
Optimal designs are experimental designs generated according to an optimality criterion; they are generally optimal only for a specified statistical model. An optimality criterion measures how good a design is in terms of some mathematical property. One such criterion is T-optimality, proposed by Atkinson and Fedorov (1975a, 1975b), which is used to discriminate between two or more models with normal errors. Ponce de Leon and Atkinson (1992) extended T-optimality to discrimination between two generalized linear models; the result is called the generalized T-optimality criterion. Uciński and Bogacka (2005) introduced a generalization of this criterion for multiresponse models. Generalized T-optimality consists of maximizing the deviance from model 2 when the data are generated by model 1.
More recently, López-Fidalgo et al. (2005, 2007) extended the conventional T-optimality criterion to handle any distribution of the random errors and introduced a new criterion based on the Kullback-Leibler divergence, called the KL-optimality criterion. A design that maximizes this criterion is called a KL-optimal design. The Kullback-Leibler distance is one of the most widely applicable distances between statistical distributions; see Burnham and Anderson (1998). The KL-criterion function includes the T-optimality criterion as a special case and is applicable to any parametric regression model. López-Fidalgo et al. (2007) applied the KL-optimality criterion under non-normal distributions, such as the lognormal and gamma distributions. When discriminating between two binary response models, the KL-criterion and the generalized T-criterion are identical; see López-Fidalgo et al. (2007). Otsu (2008) extended the KL-optimality criterion of López-Fidalgo et al. (2007) to a semiparametric setup for discriminating between two regression models. Tomasi (2007) used a generalized KL-criterion to discriminate among more than two non-normal models.
In this paper, the generalized KL-criterion is used to discriminate among more than two semiparametric models. In Section 2, Cox's proportional hazards model is introduced. In Section 3, a generalized KL-criterion for discriminating among several Cox models is considered. In Section 4, a real data set is used to illustrate the method with three Cox proportional hazards models. Conclusions are given in Section 5.

Cox's Proportional Hazards Model
The proportional hazards model was first introduced by Cox (1972) and is the most common model in biostatistics. The advantages of using this model are:
• The hazard ratio is constant over time.
• The Cox model avoids making assumptions about the form of the baseline hazard.
Proportional hazards models were considered by Becker et al. (1989), who found D-optimal designs for models with one or two parameters and a completely specified baseline hazard. They used geometric arguments and empirical values for the hazard rate to investigate how censoring affects the D-optimal designs for different shapes of the design region.
Survival analysis is a collection of statistical techniques used to examine and model the time it takes for events to occur. In survival analysis, the occurrence of the event is called a failure, and the survival time is the time taken for the failure to occur.
The Cox proportional hazards model is a semiparametric model given by

$$h_i(t) = h_0(t)\,\exp\left(\beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}\right),$$

where the $\beta_j$'s are the parameters, $t$ is the time, and $h_0(t)$ is the baseline hazard function. If all of the $x$'s are zero, the exponential part of the equation equals 1 and $h_i(t) = h_0(t)$, which is why $h_0(t)$ is called the baseline hazard function (the hazard when all predictor variables have a value of zero). Even though the baseline hazard function is unspecified, it is still possible to estimate the parameters in the exponential part of the model. Cox (1972) showed how to derive a valid parameter estimate that does not require an estimate of the baseline hazard function.
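The role of the baseline hazard can be illustrated with a minimal Python sketch. The baseline hazard, the coefficients, and the covariate values below are hypothetical and chosen only for illustration; the function names `cox_hazard` and `hazard_ratio` are likewise not from the paper. The sketch evaluates the Cox hazard for two individuals and shows that their hazard ratio is the same at every time point, because $h_0(t)$ cancels.

```python
import math


def cox_hazard(t, x, beta, baseline_hazard):
    """Cox hazard h(t | x) = h0(t) * exp(beta_1 x_1 + ... + beta_p x_p)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)


def hazard_ratio(x_a, x_b, beta):
    """Hazard ratio of individual a to individual b; the baseline cancels."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_a, x_b)))


# Hypothetical baseline hazard and coefficients (assumptions, not estimates).
def baseline(t):
    return 0.5 * math.sqrt(t)


beta = [0.02, 0.3]
x_a = [140.0, 1.2]  # covariates of individual a (illustrative values)
x_b = [120.0, 0.8]  # covariates of individual b (illustrative values)

for t in (1.0, 5.0, 10.0):
    ratio = cox_hazard(t, x_a, beta, baseline) / cox_hazard(t, x_b, beta, baseline)
    print(t, ratio, hazard_ratio(x_a, x_b, beta))
```

Setting all covariates to zero in `cox_hazard` returns the baseline hazard itself, which matches the interpretation of $h_0(t)$ given above.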
The hazard function is the probability, per unit of time, that an individual will experience an event (for example, death) within a small time interval, given that the individual has survived up to the beginning of that interval. It can therefore be interpreted as the risk of dying at time $t$. If the hazard function does not depend on time and its value is completely determined by the covariates and the unknown parameters, the risk of failure is the same no matter how long the subject has been followed. The hazard function, denoted by $h(t)$, can be estimated as follows:

$$h(t) = \frac{\text{number of individuals experiencing an event in the interval beginning at } t}{(\text{number of individuals surviving at time } t) \times (\text{interval width})}.$$

Assumptions of the Cox model are as follows:
• The ratio of the hazard functions does not depend on time.
• Time is measured on a continuous scale.
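The interval-based estimate of the hazard function given above can be sketched in Python as follows. This is a minimal illustration that assumes fully observed (uncensored) event times; the function name `interval_hazard` and the sample times are hypothetical.

```python
def interval_hazard(event_times, interval_start, interval_width):
    """Estimate h(t) as events in [t, t + width) / (survivors at t * width).

    Assumes every event time is observed, i.e. there is no censoring.
    """
    at_risk = sum(1 for time in event_times if time >= interval_start)
    if at_risk == 0:
        raise ValueError("no individuals survive to the start of the interval")
    interval_end = interval_start + interval_width
    events = sum(1 for time in event_times if interval_start <= time < interval_end)
    return events / (at_risk * interval_width)


# Illustrative event times in years (not data from the paper).
times = [1, 2, 2, 3, 5, 7, 8, 10]
estimate = interval_hazard(times, interval_start=2, interval_width=2)
print(estimate)
```

With censored observations the number at risk would have to be adjusted, which this sketch deliberately leaves out.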
There are three different tests to assess the significance of the coefficients: the partial likelihood ratio test, the score test, and the Wald test.

Generalized KL-Criterion for Discriminating Among Several Cox Models
A statistical model is a collection of probability distribution functions or probability density functions. Let the $k$ rival statistical models be written as $f_i(y, x, \theta_i)$, $i = 1, \ldots, k$, where $y$ is the dependent variable, $x$ is a vector of experimental conditions, and $\theta_i \in \Omega_i \subset \mathbb{R}^{m_i}$ is the unknown parameter vector.
In order to discriminate among these $k$ rival models, an extended model that includes them is considered, following Atkinson and Cox (1974). The $k$ models are embedded in a more general model, $f_{k+1}(y, x, \theta_{k+1})$. The KL-criterion is used to discriminate between the $i$-th model and $f_{k+1}(y, x, \theta_{k+1})$. In this paper, the parameters of the extended model are assumed to be known.
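The $i$-th KL-optimality criterion function is

$$I_{i,k+1}(\xi) = \min_{\theta_i \in \Omega_i} \int_{\mathcal{X}} \mathcal{I}(f_{k+1}, f_i, x, \theta_i)\,\xi(dx), \qquad (1)$$

where $\mathcal{X}$ is the design region and

$$\mathcal{I}(f_{k+1}, f_i, x, \theta_i) = \int f_{k+1}(y, x, \theta_{k+1}) \log\frac{f_{k+1}(y, x, \theta_{k+1})}{f_i(y, x, \theta_i)}\,dy$$

is the Kullback-Leibler distance between the true model $f_{k+1}(y, x, \theta_{k+1})$ and the alternative model $f_i(y, x, \theta_i)$.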
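As a rough numerical illustration of the Kullback-Leibler distance at a single design point, the following Python sketch makes a strong simplifying assumption that is not part of the paper: the baseline hazard is a known constant and there is no censoring, so that under a proportional hazards model the survival time at covariate value $x$ is exponential with rate $\lambda_0 \exp(\beta x)$. For two exponential densities the distance has the closed form $\log(a/b) + b/a - 1$, which the sketch compares with a midpoint-rule integration. All numerical values and function names are hypothetical.

```python
import math


def kl_exponential(rate_true, rate_rival):
    """Closed-form KL distance between Exp(rate_true) and Exp(rate_rival)."""
    return math.log(rate_true / rate_rival) + rate_rival / rate_true - 1.0


def kl_numeric(density_true, density_rival, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of f * log(f / g) on [0, upper]."""
    width = upper / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * width
        f_y = density_true(y)
        g_y = density_rival(y)
        if f_y > 0.0:
            total += f_y * math.log(f_y / g_y) * width
    return total


# Hypothetical constant baseline hazard, coefficients, and design point.
baseline_rate = 0.1
x = 1.0
rate_true = baseline_rate * math.exp(0.5 * x)   # "true" (extended) model
rate_rival = baseline_rate * math.exp(0.2 * x)  # rival model

closed = kl_exponential(rate_true, rate_rival)
numeric = kl_numeric(
    lambda y: rate_true * math.exp(-rate_true * y),
    lambda y: rate_rival * math.exp(-rate_rival * y),
    upper=40.0 / rate_true,  # truncation point; the neglected tail mass is exp(-40)
)
print(closed, numeric)
```

The distance is zero only when the two rates coincide, which is what makes it usable as a measure of how well a design separates the rival model from the true one.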
If $\xi$ is any design, the efficiency of $\xi$ is the ratio of the criterion function (1) at $\xi$ to its maximum value, i.e.

$$\mathrm{Eff}_i(\xi) = \frac{I_{i,k+1}(\xi)}{I_{i,k+1}(\xi_i^*)},$$

where $\xi_i^*$ is the KL-optimum design for discriminating model $i$ from the general model.

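Let

$$I_\alpha(\xi) = \sum_{i=1}^{k} \alpha_i\,\frac{I_{i,k+1}(\xi)}{I_{i,k+1}(\xi_i^*)} \qquad (2)$$

be the generalized KL-criterion function used to compare more than two models, where $\alpha$ is the $k \times 1$ vector of the coefficients $\alpha_i$, which are such that $0 \le \alpha_i \le 1$ for $i = 1, \ldots, k$ and $\sum_{i=1}^{k} \alpha_i = 1$. A design $\xi$ for which the minimizing parameter value in each criterion function (1) is unique is called a regular design; otherwise it is called a singular design.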
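A minimal Python sketch of this computation is given below. It assumes that the generalized criterion is the $\alpha$-weighted sum of the $k$ efficiencies, with weights that sum to one; the function names and the numerical criterion values are hypothetical and serve only to show the arithmetic.

```python
def efficiency(criterion_value, optimum_value):
    """Efficiency of a design: criterion at the design over its maximum value."""
    return criterion_value / optimum_value


def generalized_kl_criterion(alpha, criterion_values, optimum_values):
    """Weighted sum of efficiencies, assuming weights in [0, 1] that sum to 1."""
    if not (len(alpha) == len(criterion_values) == len(optimum_values)):
        raise ValueError("all inputs must have the same length k")
    if any(a < 0.0 or a > 1.0 for a in alpha):
        raise ValueError("each weight must lie in [0, 1]")
    if abs(sum(alpha) - 1.0) > 1e-9:
        raise ValueError("the weights must sum to 1")
    return sum(
        a * efficiency(value, optimum)
        for a, value, optimum in zip(alpha, criterion_values, optimum_values)
    )


# Illustrative numbers for k = 3 rival models (not results from the paper).
alpha = [0.5, 0.3, 0.2]
criterion_at_design = [0.04, 0.10, 0.03]   # criterion (1) evaluated at a design
criterion_at_optimum = [0.05, 0.20, 0.03]  # criterion (1) at each KL-optimum design
value = generalized_kl_criterion(alpha, criterion_at_design, criterion_at_optimum)
print(value)
```

Because each efficiency lies between 0 and 1, the weighted sum does too, so designs can be compared on a common scale regardless of the magnitude of the individual criterion functions.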
In this section, we apply the KL-optimality criterion to discriminate among more than two semiparametric models. One of the most popular semiparametric models is the Cox proportional hazards model, given by

$$h_i(t) = h_0(t)\,\exp\left(\beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}\right).$$

This model consists of two parts: the first, $h_0(t)$, is called the baseline hazard function and depends on time only; the second includes the covariates and does not involve time. Hence the ratio of the hazards of two individuals does not depend on time, i.e., on $h_0(t)$.
To find the KL-optimum design for discriminating model $i$ from the general model, we first need to determine the $i$-th KL-criterion function given by (1), where

$$\mathcal{I}(f_{k+1}, f_i, x, \theta_i) = \int f_{k+1}(y, x, \theta_{k+1}) \log\frac{f_{k+1}(y, x, \theta_{k+1})}{f_i(y, x, \theta_i)}\,dy$$

is the Kullback-Leibler distance. In our case, three rival Cox models ($k = 3$) are considered, together with a combined model (model 4) that contains them. The parameters of the extended model are assumed to be known, so that the optimal designs can be determined. The criterion function (2) becomes

$$I_\alpha(\xi) = \sum_{i=1}^{3} \alpha_i\,\frac{I_{i,4}(\xi)}{I_{i,4}(\xi_i^*)}, \qquad (3)$$

where the numerator is the KL-optimality criterion function, given by

$$I_{i,4}(\xi) = \min_{\theta_i \in \Omega_i} \int_{\mathcal{X}} \mathcal{I}(f_4, f_i, x, \theta_i)\,\xi(dx).$$

A design $\xi_i^*$ that maximizes $I_{i,4}(\xi)$, $i = 1, 2, 3$, is a KL-optimum design. To evaluate a KL-optimum design numerically, the Kullback-Leibler distance is used in the expression of the directional derivative.
Atkinson (1970) investigated a method for discriminating between models. When it is desired to determine which of several alternative models adequately describes the data, he considered the properties of a combined distribution containing the component models as special cases. Using this distribution, statistics are developed for testing for departures from one model in the direction of another and for testing the hypothesis that all models fit the data equally well.
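How a KL-criterion function separates good designs from poor ones can be seen in a toy Python sketch. It is not the computation carried out in this paper: it assumes a known constant baseline hazard equal to 1, no censoring, a single design variable $x$, a hypothetical "true" model with log hazard $\beta_1 x + \beta_2 x^2$, and a hypothetical rival with log hazard $\theta x$. Under these assumptions survival times are exponential, the Kullback-Leibler distance has a closed form, and the minimization over $\theta$ is done by a simple grid search rather than by the directional derivative.

```python
import math


def kl_exponential(rate_true, rate_rival):
    """Closed-form KL distance between Exp(rate_true) and Exp(rate_rival)."""
    return math.log(rate_true / rate_rival) + rate_rival / rate_true - 1.0


def kl_criterion(design, beta_true, theta_grid):
    """Minimum over theta of the design-weighted KL distance.

    design: list of (x, weight) pairs whose weights sum to 1.
    """
    best = math.inf
    for theta in theta_grid:
        value = sum(
            weight
            * kl_exponential(
                math.exp(beta_true[0] * x + beta_true[1] * x * x),  # true rate
                math.exp(theta * x),                                # rival rate
            )
            for x, weight in design
        )
        best = min(best, value)
    return best


beta_true = (1.0, 0.5)                              # hypothetical known parameters
theta_grid = [i / 100 for i in range(-300, 301)]    # coarse search grid for theta

one_point = [(0.5, 1.0)]                # all observations at x = 0.5
two_point = [(-1.0, 0.5), (1.0, 0.5)]   # observations split between -1 and 1

print(kl_criterion(one_point, beta_true, theta_grid))
print(kl_criterion(two_point, beta_true, theta_grid))
```

In this toy setting the one-point design cannot discriminate at all, because the rival can reproduce the true hazard at a single covariate value, whereas the two-point design yields a strictly positive criterion value. A KL-optimum design would be obtained by maximizing this quantity over the support points and weights.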

An Application
In this section, a real data set taken from Lee and Wang (2003) is used to illustrate the proposed theoretical results. A sample of 200 cardiac patients was collected; the patients were asked about some demographic variables, and some clinical examinations were then recorded. These patients were followed for ten years and the following variables were collected: age, SBP, LACR and LTG. The proportional hazards model is used to identify which risk factors are the most important.
The event time of interest is the CVD-free time, measured in years. The covariates used in this application are systolic blood pressure (SBP), the logarithm of the ratio of urinary albumin to creatinine (LACR), and the logarithm of triglycerides (LTG).

Conclusion
In this paper, a generalization of the KL-optimality criterion was introduced to discriminate among several semiparametric models. The generalized KL-optimality criterion was applied to one of the most important semiparametric models, namely Cox's proportional hazards model. A real data set was used to illustrate the new theoretical results. The generalized KL-optimality criterion enabled us to discriminate among the three rival Cox models, embedded in a combined model, and to select the model with high efficiency.