Minimization of Negative Log Partial Likelihood Function Using Reproducing Kernel Hilbert Space

Reproducing kernel Hilbert space (RKHS) can be used to estimate values of functions, derivatives and integrals of models. The RKHS kernels are useful in finding the optimizer (


Introduction
The theory of reproducing kernels is a powerful instrument in many areas of mathematical research. See, for example, Aronszajn (1950), Hille (1972), Burbea (1976), Wahba (1998), Berlinet et al. (2003), and Li et al. (2003). Many researchers have shown that strong connections exist among problems in the applied sciences, mathematical analysis, and many areas of engineering. Many statistical problems can also be solved, and data analyzed, using models related to reproducing kernel Hilbert spaces (RKHS). RKHS methods enable us to estimate a variety of mathematical models. A simple linear kernel was used by Li et al. (2003) in Cox regression models to relate expression profiles in censored cancer data sets. N. Abdul Manaf et al. (2011) generated a new kernel to determine the function $f(x)$ of the general Cox model and utilized partial derivatives of the negative log partial likelihood to find the optimal values for HIV patient survival data.

Reproducing Kernel Hilbert Space (RKHS)
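On a domain $T_s$, let $H_s$ be a Hilbert space of functions such that there exists an element $K_t \in H_s$ for every $t \in T_s$, and the inner product in $H_s$ satisfies $\langle f, K_t \rangle = f(t)$ for every $f \in H_s$ and every $t \in T_s$. This reproducing property, which holds at any points $t_1, t_2, \ldots, t_n \in T_s$, is the origin of the name "reproducing kernel".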

Moore-Aronszajn Theorem
The Moore-Aronszajn theorem, as stated by Berlinet and Thomas-Agnan (2003), says that every kernel $K$ that is symmetric and positive definite on a set $T_s$ defines a unique RKHS.
The following process shows the construction of the space $H_s$ (Berlinet & Thomas-Agnan, 2003).

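Let $K_s(\cdot) = K(s, \cdot)$ for all $s$ in $T_s$. Let $H_0$ be the linear span of $\{K_s : s \in T_s\}$. Suppose the dot product on $H_0$ is defined as
$$\left\langle \sum_{j=1}^{m} b_j K_{t_j}, \sum_{i=1}^{n} a_i K_{s_i} \right\rangle = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j K(t_j, s_i).$$
The symmetry of $K$ generates the symmetry of the dot product.
If we let $H_s$ be the completion of $H_0$ with respect to the above dot product, then $H_s$ is the collection of functions
$$f(x) = \sum_{i=1}^{\infty} a_i K_{s_i}(x).$$
We observe from the Cauchy-Schwarz inequality that this series converges for every $x$. We obtain the inner product
$$\langle f, K_x \rangle = \sum_{i=1}^{\infty} a_i K(s_i, x) = f(x)$$
from the reproducing kernel properties.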
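The construction can be checked numerically on finite kernel expansions. The following is a minimal sketch, assuming a plain dot-product kernel as a stand-in for $K$ and randomly generated points; the function names are illustrative and do not come from the paper.

```python
import numpy as np

def linear_kernel(s, t):
    """Plain dot-product kernel, used here only as a stand-in for K."""
    return float(np.dot(s, t))

def gram(points_a, points_b, kernel):
    """Matrix of kernel values K(a, b) for every pair of points."""
    return np.array([[kernel(a, b) for b in points_b] for a in points_a])

def evaluate(coeffs, centers, x, kernel):
    """Evaluate f(x) = sum_i a_i K(s_i, x)."""
    return sum(a * kernel(s, x) for a, s in zip(coeffs, centers))

def inner_product(b, centers_t, a, centers_s, kernel):
    """<sum_j b_j K_{t_j}, sum_i a_i K_{s_i}> = sum_ij a_i b_j K(t_j, s_i)."""
    return float(b @ gram(centers_t, centers_s, kernel) @ a)

rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 3))   # hypothetical points s_1, ..., s_4
coeffs = rng.normal(size=4)         # hypothetical coefficients a_1, ..., a_4
x = rng.normal(size=3)

# Reproducing property: the inner product of K_x with f equals f(x).
lhs = inner_product(np.array([1.0]), x[None, :], coeffs, centers, linear_kernel)
rhs = evaluate(coeffs, centers, x, linear_kernel)
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding, which is the finite-sum version of the reproducing property.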

Kernel Method and Its Application
The kernel method seeks a function $f$ in an RKHS $H_K$ that minimizes a loss function $L$, where $L$ depends on the values $f(s_i)$ and on $R_i$, the set of HIV patients who were at risk at time $t_i$. The solution of this problem was given by Kimeldorf and Wahba (1971) and is known as the representer theorem, in which the optimizer function $f(s)$ has the form
$$f(s) = \sum_{i=1}^{n} a_i K(s, s_i) + c,$$
where $K$ is the reproducing kernel of $H_K$. The constant $c$ can be omitted in the solution procedure because it can be absorbed into the baseline hazard function.
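The remark about the constant can be illustrated with a small sketch. It assumes the usual Cox partial likelihood factor $\exp(f_i) / \sum_{j \in R_i} \exp(f_j)$, a dot-product kernel, and made-up points and coefficients; none of these values come from the HIV data.

```python
import numpy as np

def representer(coeffs, centers, kernel, c=0.0):
    """Return the function f(s) = sum_i a_i K(s, s_i) + c."""
    def f(s):
        return sum(a * kernel(s, s_i) for a, s_i in zip(coeffs, centers)) + c
    return f

def partial_likelihood_factor(f_values, i, risk_set):
    """exp(f_i) / sum_{j in R_i} exp(f_j), assuming the standard Cox form."""
    return np.exp(f_values[i]) / np.sum(np.exp(f_values[risk_set]))

def kernel(s, t):
    """Dot-product kernel, a stand-in for the paper's K."""
    return float(np.dot(s, t))

rng = np.random.default_rng(1)
centers = rng.normal(size=(5, 3))   # hypothetical s_1, ..., s_5
coeffs = rng.normal(size=5)         # hypothetical a_1, ..., a_5

f_without_c = representer(coeffs, centers, kernel)
f_with_c = representer(coeffs, centers, kernel, c=2.5)
values_without_c = np.array([f_without_c(s) for s in centers])
values_with_c = np.array([f_with_c(s) for s in centers])

risk_set = np.array([0, 2, 4])      # hypothetical risk set containing patient 0
factor_without_c = partial_likelihood_factor(values_without_c, 0, risk_set)
factor_with_c = partial_likelihood_factor(values_with_c, 0, risk_set)
print(np.isclose(factor_without_c, factor_with_c))
```

Adding $c$ multiplies the numerator and the denominator by the same factor $\exp(c)$, so the partial likelihood does not depend on it.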

Formulas for Partial Derivatives of Loss Function
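Our task is to find the function $f(s) = \sum_{i=1}^{n} a_i K(s, s_i)$ once the optimal values of the coefficient vector $a = (a_1, \ldots, a_n)$ are determined. Let $R_i$ denote the set of individuals who were at risk at time $t_i$.
The negative log partial likelihood function $R(f)$ is stated in terms of these risk sets and is minimized by using the Newton-Raphson method. As an illustrative example for $R_i$: if three patients have lifetimes $t_1 < t_2 < t_3$, then $R_1 = \{1, 2, 3\}$, $R_2 = \{2, 3\}$, and $R_3 = \{3\}$.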
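For orientation, a standard form of the Cox negative log partial likelihood and its first partial derivatives can be written as follows. This is a sketch under assumptions: the event indicator $\delta_i$ (1 for an observed death, 0 for a censored patient) and the absence of a regularization term are not taken from the text, and the authors' own expression may differ.

```latex
\begin{align}
R(f) &= -\sum_{i=1}^{n} \delta_i \left[ f(s_i)
        - \log \sum_{j \in R_i} \exp\bigl(f(s_j)\bigr) \right],
\qquad
f(s) = \sum_{k=1}^{n} a_k K(s, s_k), \\
\frac{\partial R}{\partial a_k}
     &= -\sum_{i=1}^{n} \delta_i \left[ K(s_i, s_k)
        - \frac{\sum_{j \in R_i} K(s_j, s_k) \exp\bigl(f(s_j)\bigr)}
               {\sum_{j \in R_i} \exp\bigl(f(s_j)\bigr)} \right].
\end{align}
```

Setting these derivatives to zero for $k = 1, \ldots, n$ gives the system that the Newton-Raphson iteration solves.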

Result and Discussion
Let $x_i$ = (gender, age, race) be the covariates of the $i$th patient, and let $h(t) = h_0(t) \exp(f(x))$ be the hazard function for the Cox model, where $h_0$ is the baseline hazard and $t_i$ is the lifetime of the $i$th patient. To use the Newton-Raphson method, we also need the partial derivatives of $R$ with respect to the coefficients $a_i$.
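A Newton-Raphson iteration for the coefficients $a_i$ might look like the sketch below. It assumes the standard, unpenalized Cox negative log partial likelihood, a dot-product kernel, and synthetic data generated on the spot (not the HIV data). The small damping term and the step halving are numerical safeguards added here, not part of the paper's method.

```python
import numpy as np

def neg_log_partial_likelihood(a, G, at_risk, events):
    """R(a) = -sum_i delta_i [f_i - log sum_{j in R_i} exp(f_j)], with f = G a."""
    f = G @ a
    log_denoms = np.log(at_risk @ np.exp(f))
    return -np.sum(events * (f - log_denoms))

def gradient_and_hessian(a, G, at_risk, events):
    """First and second derivatives of R with respect to the coefficients a."""
    f = G @ a
    weights = at_risk * np.exp(f)                  # row i holds exp(f_j) for j in R_i
    weights /= weights.sum(axis=1, keepdims=True)  # each row now sums to one
    grad_f = weights.T @ events - events
    hess_f = np.diag(weights.T @ events) - weights.T @ (events[:, None] * weights)
    return G.T @ grad_f, G.T @ hess_f @ G

def newton_raphson(G, at_risk, events, max_iter=50, tol=1e-8, damping=1e-8):
    """Damped Newton-Raphson with step halving (both are assumptions)."""
    a = np.zeros(G.shape[0])
    loss = neg_log_partial_likelihood(a, G, at_risk, events)
    for _ in range(max_iter):
        grad, hess = gradient_and_hessian(a, G, at_risk, events)
        if np.linalg.norm(grad) < tol:
            break
        # The Hessian is singular when G has low rank, so add a tiny ridge.
        step = np.linalg.solve(hess + damping * np.eye(len(a)), grad)
        scale = 1.0
        while scale > 1e-10:
            candidate = a - scale * step
            new_loss = neg_log_partial_likelihood(candidate, G, at_risk, events)
            if new_loss < loss:
                a, loss = candidate, new_loss
                break
            scale *= 0.5
        else:
            break  # no step length reduced the loss; stop iterating
    return a

rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 3))                       # synthetic covariates
times = rng.exponential(scale=np.exp(-X[:, 0]))   # synthetic lifetimes
events = (rng.random(n) < 0.7).astype(float)      # 1 = death observed, 0 = censored
G = X @ X.T                                       # Gram matrix of a dot-product kernel
at_risk = (times[None, :] >= times[:, None]).astype(float)  # entry (i, j): j in R_i

a_hat = newton_raphson(G, at_risk, events)
print(neg_log_partial_likelihood(a_hat, G, at_risk, events))
```

Step halving guarantees that the loss never increases between iterations, which plain Newton-Raphson does not.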
We used the kernel $K(x, y) = \langle Cx, Dy \rangle$, where $C$ and $D$ are diagonal matrices (Manaf et al., 2011). We verified the positive definiteness and symmetry of the kernel $K$. Our research and observations show that the greater the value of the function $f(x) = \sum_{i=1}^{n} a_i K(x, x_i)$, the lower the chance of survival among HIV patients chosen at random. This is shown in Figure 1. Using the Newton-Raphson method, the optimal values of $a_i$ are obtained by setting the derivatives of $R$ with respect to $a$ to zero. We compared the result with the Gaussian radial basis function kernel,
$$K(x, y) = \exp\left(-\frac{1}{2\sigma^2} \|x - y\|^2\right).$$
Our result is shown in Table 1.
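Symmetry and positive definiteness can be checked numerically on a sample of points, as in the sketch below. The diagonal entries of $C$ and $D$, the bandwidth `sigma`, and the sample points are hypothetical values chosen for illustration. A Gram-matrix check on one sample is a necessary condition only; it does not prove that a kernel is positive definite.

```python
import numpy as np

def diagonal_linear_kernel(x, y, c_diag, d_diag):
    """K(x, y) = <Cx, Dy> with C = diag(c_diag) and D = diag(d_diag)."""
    return float(np.dot(c_diag * x, d_diag * y))

def gaussian_rbf_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    diff = np.asarray(x) - np.asarray(y)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def gram_matrix(points, kernel):
    """Kernel values for every pair of sample points."""
    return np.array([[kernel(p, q) for q in points] for p in points])

def is_symmetric_psd(matrix, tol=1e-9):
    """True if the matrix is symmetric and has no eigenvalue below -tol."""
    if not np.allclose(matrix, matrix.T):
        return False
    return bool(np.linalg.eigvalsh(matrix).min() >= -tol)

rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))        # hypothetical covariate vectors

c_diag = np.array([1.0, 0.5, 2.0])      # hypothetical diagonal of C
d_diag = np.array([2.0, 1.0, 0.5])      # hypothetical diagonal of D
bad_d_diag = np.array([2.0, -1.0, 0.5]) # a negative product c_k * d_k breaks the property

linear_gram = gram_matrix(points, lambda x, y: diagonal_linear_kernel(x, y, c_diag, d_diag))
rbf_gram = gram_matrix(points, gaussian_rbf_kernel)
bad_gram = gram_matrix(points, lambda x, y: diagonal_linear_kernel(x, y, c_diag, bad_d_diag))

print(is_symmetric_psd(linear_gram), is_symmetric_psd(rbf_gram), is_symmetric_psd(bad_gram))
```

For diagonal $C$ and $D$ the kernel reduces to $\sum_k c_k d_k x_k y_k$, so the check passes when every product $c_k d_k$ is nonnegative and fails in the third case, where one product is negative.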
Table 1. Results of Two Different Kernels

Conclusion
Several other kernels can be generated and applied to different data sets. It is important to verify that the kernels satisfy the conditions required by RKHS theory. The derivatives used in this RKHS method are applicable to all kernels used to determine $f(x)$ in Cox hazard function models. RKHS kernels can be used in models from several research areas, such as business, engineering, and medical sciences, because of the connection between data distributions and kernels. Further research can be performed to show that the RKHS method can solve many other problems in mathematics and statistics.

Figure 1. Survival of HIV patients