Superlinear Convergence of a Modified Newton’s Method for Convex Optimization Problems With Constraints

We consider the constrained optimization problem defined by:

f(x*) = min_{x∈X} f(x)   (1)

where the function f : R^n → R is convex on a closed, bounded, convex set X. To solve problem (1), most methods transform it into a problem without constraints, either by introducing Lagrange multipliers or by a projection method. The purpose of this paper is to give a new method for solving some constrained optimization problems, based on the definition of a descent direction and a step size while remaining in the convex domain X. A convergence theorem is proven. The paper ends with some numerical examples.


Introduction
In applied mathematics, as in many scientific fields, we are often led to solve nonlinear optimization problems with constraints. Several authors have studied the solution of such problems, for example (Dennis & Schnabel, 1983; Ortega & Rheinboldt, 1970; Laurent, 1972; Culioli, 1994; Rhanizar, 2002; Rhanizar, 2020). Among the methods that solve problem (1) by transforming it into an unconstrained problem, we can cite the projection methods, defined by the iteration

x_{k+1} = P_X(x_k − ρ_k ∇f(x_k)),   ρ_k > 0,

where P_X denotes the orthogonal projection onto X. This method is only applicable if one can easily compute the projection P_X, for example if X = ∏_{i=1}^m [a_i, b_i] is a block of R^n. But if X is defined by inequality constraints, it is in general not easy to use this method. We also find the exterior penalization method, which introduces a function ψ : R^n → R with the following properties: ψ is continuous, ψ(x) ≥ 0 for all x ∈ R^n, and ψ(x) = 0 if and only if x ∈ X. For every ε > 0, the method considers a function f_ε : R^n → R defined by

f_ε(x) = f(x) + (1/ε) ψ(x),

and consists in minimizing f_ε(x) over R^n while letting ε tend to 0.
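As an illustration of the penalization idea (a minimal sketch, not the method of this paper; the instance f(x) = x², X = [1, 2] and the quadratic penalty ψ below are chosen purely for illustration), minimizing f_ε for decreasing ε drives the minimizers toward the constrained solution x* = 1:

```python
def psi(x):
    # quadratic exterior penalty: zero on X = [1, 2], positive outside
    return max(0.0, 1.0 - x) ** 2 + max(0.0, x - 2.0) ** 2

def f(x):
    # objective: unconstrained minimizer is 0, constrained minimizer is 1
    return x * x

def argmin_1d(phi, a, b, iters=200):
    # ternary search for the minimizer of a convex function on [a, b]
    for _ in range(iters):
        m1, m2 = a + (b - a) / 3, b - (b - a) / 3
        if phi(m1) < phi(m2):
            b = m2
        else:
            a = m1
    return 0.5 * (a + b)

# minimizers of f_eps approach the constrained solution x* = 1 as eps -> 0
minimizers = [argmin_1d(lambda x, e=eps: f(x) + psi(x) / e, -5.0, 5.0)
              for eps in (1.0, 1e-2, 1e-4)]
```

For this instance the minimizer of f_ε has the closed form 1/(1 + ε) once ε is small, so the iterates 0.5, ≈0.990, ≈0.9999 move monotonically toward the constrained solution, as the method predicts.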
This method is applicable whenever it is easy to build a function ψ with these properties.
We also find the SQP method (Sequential Quadratic Programming). By introducing the Lagrangian, this method consists in solving a sequence of quadratic subproblems, each minimizing a quadratic model of the Lagrangian subject to linearized constraints. Since Lagrange multipliers are added, the number of variables increases.
In the present paper we present a constrained optimization method that does not reduce the problem to the unconstrained case. In (Rhanizar, 2020), we developed a constrained optimization method based on first-order admissible directions; that method converges geometrically. In general, gradient methods have slow convergence. This is due to the fact that the admissible directions used are obtained from first-order approximations of the function to be optimized. Hence the idea of obtaining admissible directions from second-order approximations. This idea has been widely developed for unconstrained problems. In this paper, we propose to develop a second-order method applied to constrained optimization problems. Let us now give some notations to be used throughout this article: ⟨., .⟩ denotes the usual scalar product ⟨x, y⟩ = Σ_{i=1}^n x_i y_i for two vectors x and y of R^n.
∇f : the gradient of the function f.
∇²f : the Hessian matrix of f.
The remainder of the paper is organized as follows: The introduction is presented in section 1. In section 2, we describe how to choose the direction of descent and a new algorithm that solves problem (1). Section 3 is devoted to results of convergence of the new method. We then study the speed of convergence in section 4. In section 5, some numerical examples are elaborated. The conclusions are given in section 6.

Searching for a Direction of Descent
Instead of using a first-order approximation of the function to be optimized, we will determine the admissible directions from second-order approximations. Hence the idea of defining the directions d_k as follows. For each approximation x_k, we define d_k = y_k − x_k, with y_k a solution of the following problem:

min_{y∈X} g_k(y),   (2)

where, for each k, we consider the quadratic model g_k defined by:

g_k(y) = ⟨∇f(x_k), y − x_k⟩ + (1/2)⟨∇²f(x_k)(y − x_k), y − x_k⟩,   (3)

and we define x_{k+1} = x_k + α_k d_k with 0 < α_k ≤ 1. Using the fact that y_k is a minimum of g_k on X and that x_k ∈ X, we get

g_k(y_k) ≤ g_k(x_k) = 0.   (4)

For the direction d_k = y_k − x_k, we have two cases to consider.

First case: suppose d_k = 0, so y_k = x_k and g_k(y_k) = 0. Then x_k minimizes g_k over X, and the first-order optimality condition gives

⟨∇f(x_k), y − x_k⟩ ≥ 0 for all y ∈ X.   (5)

On the other hand, the convexity of f gives f(y) ≥ f(x_k) + ⟨∇f(x_k), y − x_k⟩ for all y ∈ X, and thereafter f(y) ≥ f(x_k) for all y ∈ X: x_k is a solution of problem (1).

Second case: if d_k ≠ 0, then by strict convexity of g_k we have g_k(y_k) < 0, and since ∇²f(x_k) is positive semidefinite,

⟨∇f(x_k), d_k⟩ ≤ g_k(y_k) < 0,   (6)

so d_k is a descent direction of f at x_k. But this condition alone is not sufficient for convergence (Rondepierre, 2017). This is why we are going to impose the sufficient-decrease condition

f(x_k + α_k d_k) ≤ f(x_k) + (α_k/2) g_k(y_k),   (7)

with α_k ∈ ]0, 1].
So we have the following algorithm:

1. Choose x_0 ∈ X and α_0 ∈ ]0, 1]; set k := 0.
2. Compute y_k, a solution of min_{y∈X} g_k(y), set d_k := y_k − x_k and α_k := α_0.
3. If d_k = 0, stop: x_k is a solution of problem (1).
4. If α_k satisfies condition (7), set x_{k+1} := x_k + α_k d_k, set k := k + 1 and go to 2.
5. Otherwise, set α_k := (1/2) α_k and go to 4.
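The steps above can be sketched in Python (a minimal sketch assuming NumPy; X is taken to be a box [lo, hi] for simplicity, and the quadratic subproblem is solved by projected gradient descent, a simple stand-in for any quadratic-programming solver):

```python
import numpy as np

def solve_subproblem(g, H, x, lo, hi, iters=500):
    """Minimize g_k(y) = <g, y - x> + 0.5 <H (y - x), y - x> over the box
    [lo, hi] by projected gradient descent (a stand-in for any QP solver)."""
    L = np.linalg.norm(H, 2)                # step size from the spectral norm of H
    y = x.copy()
    for _ in range(iters):
        grad = g + H @ (y - x)              # gradient of the quadratic model
        y = np.clip(y - grad / L, lo, hi)   # projected gradient step
    return y

def modified_newton(f, grad_f, hess_f, x0, lo, hi, alpha0=1.0,
                    tol=1e-10, max_iter=100):
    """Second-order method of the paper, on a box X = [lo, hi]."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g, H = grad_f(x), hess_f(x)
        y = solve_subproblem(g, H, x, lo, hi)
        d = y - x
        if np.linalg.norm(d) < tol:         # d_k = 0: x_k solves (1)
            return x
        gk = g @ d + 0.5 * d @ (H @ d)      # g_k(y_k) < 0
        alpha = alpha0
        while f(x + alpha * d) > f(x) + 0.5 * alpha * gk:
            alpha *= 0.5                    # halve until condition (7) holds
        x = x + alpha * d
    return x

# illustrative instance: f(x) = ||x - (2, 2)||^2 on the box [0, 1]^2;
# the constrained minimizer is (1, 1)
sol = modified_newton(lambda x: float((x - 2) @ (x - 2)),
                      lambda x: 2 * (x - 2),
                      lambda x: 2 * np.eye(2),
                      x0=[0.0, 0.0], lo=0.0, hi=1.0)
```

On this instance the full step α_k = 1 already satisfies (7), so the iteration reaches the boundary point (1, 1) in one step and then stops with d_k = 0, consistently with the first case of the analysis above.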

Convergence Study
The following theorem shows that a step α_k verifying (7) can always be chosen, and establishes the convergence of the sequence (g_k(y_k))_k to 0.
Theorem 1 Let f be of class C² on the bounded convex set X, and suppose that there exist m > 0 and M > 0 such that

m ||d||² ≤ ⟨∇²f(x)d, d⟩ ≤ M ||d||² for all x ∈ X and all d ∈ R^n.   (8)

Then:
1) if d_k ≠ 0, then d_k is a descent direction of f at x_k;
2) for each k, a step α_k ∈ ]0, 1] verifying (7) can be chosen by successive halving;
3) the sequence (f(x_k))_k is decreasing and convergent;
4) g_k(y_k) → 0 as k → +∞.

Proof. Point 1) was established in Section 2.

2) By applying the Taylor formula to f between x_k and x_k + α_k d_k, there exists ξ_k on the segment [x_k, x_k + α_k d_k] such that

f(x_k + α_k d_k) = f(x_k) + α_k ⟨∇f(x_k), d_k⟩ + (α_k²/2) ⟨∇²f(ξ_k) d_k, d_k⟩.   (9)

Using the convexity of g_k and g_k(x_k) = 0, we have

α_k ⟨∇f(x_k), d_k⟩ + (α_k²/2) ⟨∇²f(x_k) d_k, d_k⟩ = g_k(x_k + α_k d_k) ≤ α_k g_k(y_k).

By the relation (9) and the assumption (8) we get

f(x_k + α_k d_k) − f(x_k) ≤ α_k g_k(y_k) + (α_k²/2)(M − m) ||d_k||².

For the condition (7) to be verified, it is therefore sufficient that

α_k g_k(y_k) + (α_k²/2)(M − m) ||d_k||² ≤ (α_k/2) g_k(y_k),

so it suffices that α_k verifies

α_k ≤ −g_k(y_k) / ((M − m) ||y_k − x_k||²).   (10)

We can choose α_k = α_0 2^{−i}, with i the first index for which (7) is verified. This is always possible: indeed, −g_k(y_k)/((M − m) ||y_k − x_k||²) > 0 and 2^{−i} α_0 → 0 when i → +∞.

3) By condition (7) and g_k(y_k) ≤ 0, we have f(x_{k+1}) ≤ f(x_k) + (α_k/2) g_k(y_k) ≤ f(x_k). Then (f(x_k))_k is a decreasing sequence; since f is continuous on the compact set X, it is bounded below, so it converges.

4) Summing condition (7) gives Σ_k α_k (−g_k(y_k)) ≤ 2 (f(x_0) − lim_k f(x_k)) < +∞, hence α_k g_k(y_k) → 0. By the halving rule, α_k is bounded below either by α_0 or by half the threshold −g_k(y_k)/((M − m) ||d_k||²) of the previous point; since X is bounded, in both cases α_k g_k(y_k) → 0 forces g_k(y_k) → 0.

Convergence Speed Assessment
The proof of the theorem giving the speed of convergence of the sequence (x_k)_k requires the following Lemma 1, which is a result in the spirit of Banach's fixed point theorem for contractions.
Lemma 1 Let (x_k)_k be a sequence of R^n verifying the following hypothesis:

||x_{k+1} − x_k|| ≤ β_k ||x_k − x_{k−1}|| for all k ≥ 1, with β_k → 0 as k → +∞.

Then (x_k)_k converges to a limit x*, and the convergence is superlinear.

Lemma 2 Under the same assumptions as in Theorem 1, we have lim_{k→+∞} ||y_k − x_k|| = 0.

Proof. Since g_k(y_k) = min_{y∈X} g_k(y), the optimality of y_k applied with y = x_k gives ⟨∇g_k(y_k), x_k − y_k⟩ ≥ 0, which gives

⟨∇f(x_k), y_k − x_k⟩ + ⟨∇²f(x_k)(y_k − x_k), y_k − x_k⟩ ≤ 0.

As a result,

g_k(y_k) = ⟨∇f(x_k), y_k − x_k⟩ + (1/2) ⟨∇²f(x_k)(y_k − x_k), y_k − x_k⟩ ≤ −(1/2) ⟨∇²f(x_k)(y_k − x_k), y_k − x_k⟩.

Using the relation (8) we have thereafter

(m/2) ||y_k − x_k||² ≤ −g_k(y_k).   (11)

Using 4) of Theorem 1, we obtain ||y_k − x_k|| → 0.

Theorem 2 Under the same assumptions as in Theorem 1, we have:
1. ∃K > 0 such that ∀k ≥ K, x_{k+1} = y_k;
2. (x_k)_k converges to a limit x* in a superlinear way.

Proof.
1. By the relation (9) and the equality x_{k+1} − x_k = α_k (y_k − x_k), we have

f(x_k + α_k d_k) − f(x_k) ≤ α_k g_k(y_k) + (α_k²/2) ω_k ||d_k||²,

where ω_k = sup_{s∈[0,1]} ||∇²f(x_k + s α_k d_k) − ∇²f(x_k)||. To have the inequality (7), it is therefore enough that the following condition is verified:

α_k ≤ −g_k(y_k) / (ω_k ||y_k − x_k||²)

(when ω_k = 0, inequality (7) holds for every α_k ∈ ]0, 1]). On the other hand, as in the proof of Lemma 2, we have (m/2) ||y_k − x_k||² ≤ −g_k(y_k), which implies

−g_k(y_k) / (ω_k ||y_k − x_k||²) ≥ m / (2 ω_k).

By Lemma 2, ||y_k − x_k|| → 0, and since ∇²f is uniformly continuous on the compact set X, it follows that ω_k → 0. So ∃K > 0 such that ∀k ≥ K:

−g_k(y_k) / (ω_k ||y_k − x_k||²) ≥ 1.

Using this and the condition 0 < α_k ≤ 1, the inequality (7) is verified with α_k = 1 for all k ≥ K, which gives x_{k+1} = y_k, ∀k ≥ K.

2. Let k > K, so that x_k = y_{k−1} minimizes g_{k−1} on X. The optimality of x_k gives, for all y ∈ X,

⟨∇f(x_{k−1}) + ∇²f(x_{k−1})(x_k − x_{k−1}), y − x_k⟩ ≥ 0.

By the generalized Lagrange formula (Kolmogorov & Fomine, 1979),

∇f(x_k) = ∇f(x_{k−1}) + ∇²f(t_k)(x_k − x_{k−1}), with t_k = x_{k−1} + s(x_k − x_{k−1}) and s ∈ [0, 1],

which gives, for all y ∈ X,

⟨∇f(x_k), y − x_k⟩ ≥ −ε_k ||x_k − x_{k−1}|| ||y − x_k||, with ε_k = ||∇²f(t_k) − ∇²f(x_{k−1})||.

Using this, the relation g_k(x_k) = 0, and the assumption (8), we obtain for all y ∈ X:

−g_k(y) ≤ ε_k ||x_k − x_{k−1}|| ||y − x_k|| − (m/2) ||y − x_k||² ≤ (ε_k²/(2m)) ||x_k − x_{k−1}||².

In particular −g_k(y_k) ≤ (ε_k²/(2m)) ||x_k − x_{k−1}||², and combining this with (m/2) ||y_k − x_k||² ≤ −g_k(y_k), we get

||x_{k+1} − x_k|| = ||y_k − x_k|| ≤ β_k ||x_k − x_{k−1}||, with β_k = ε_k/m.

Since ||x_k − x_{k−1}|| → 0 and ∇²f is uniformly continuous on X, we have β_k → 0 as k → +∞. Using Lemma 1, we finally obtain that (x_k)_k is a convergent sequence and that its convergence to its limit x* is superlinear.

3. Let us show that x* is the solution of problem (1). The optimality of y_k gives, for all y ∈ X,

⟨∇f(x_k) + ∇²f(x_k)(y_k − x_k), y − y_k⟩ ≥ 0.

Using the Lemma 2 (y_k − x_k → 0), the convergence x_k → x* (hence y_k → x*), and the fact that ∇f and ∇²f are continuous and bounded on X, passing to the limit gives

⟨∇f(x*), y − x*⟩ ≥ 0 for all y ∈ X.

By the convexity of f, f(y) ≥ f(x*) + ⟨∇f(x*), y − x*⟩ ≥ f(x*) for all y ∈ X, which proves that x* is the solution to problem (1).

Numerical Examples
In this section, we present some numerical experiments. We compare the new method (N.M.) with a quadratic programming method (Q.M.). The comparison is summarized in tables giving, for each method, the number of iterations, the associated residual norms, and the convergence time.
Example 1: We consider the following problem:

min f(x_1, x_2) subject to: x_1 + x_2 ≤ 1, x_1 ≥ 0, and x_2 ≥ 0.

In this example, we have chosen for f the Rosenbrock function, defined by f(x_1, x_2) = 100 (x_2 − x_1²)² + (1 − x_1)².

Conclusions

1. The method described in this paper minimizes a sequence of quadratic problems under constraints without using Lagrange multipliers, which does not increase the number of variables.
2. With this method, we can solve an important class of problems encountered in numerical analysis, formulated as follows:

min f(x) subject to Ax ≤ b, x ≥ 0,

where A is the matrix that defines the constraints, x is the vector of variables and b is the vector of bounds of the variables. At each iteration x_k, we determine the direction d_k = y_k − x_k by solving the following problem:

Min (1/2) ⟨H_k (y − x_k), y − x_k⟩ + ⟨c_k, y − x_k⟩ subject to Ay ≤ b, y ≥ 0,

where H_k = ∇²f(x_k) and c_k = ∇f(x_k).
3. What is also important is that, in the sequence of quadratic problems solved by the method, only the objective function changes while the constraints remain fixed. This gives a gain in memory space and execution time.
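To make the subproblem of item 2 concrete, here is a minimal sketch on a hypothetical instance (in the special case where H_k is diagonal and the constraints reduce to simple bounds, the quadratic subproblem separates coordinate-wise, and each coordinate of y_k is the unconstrained Newton point clipped to its bounds):

```python
def subproblem_direction(x, grad, hess_diag, lo, hi):
    """Solve min <grad, y - x> + 0.5 <H (y - x), y - x> over the box [lo, hi]
    for a diagonal Hessian H: the problem separates coordinate-wise, and the
    minimizer of each 1-D quadratic is the clipped unconstrained Newton point."""
    y = []
    for xi, gi, hii, a, b in zip(x, grad, hess_diag, lo, hi):
        newton = xi - gi / hii            # unconstrained minimizer per coordinate
        y.append(min(max(newton, a), b))  # clip to the bound interval [a, b]
    d = [yi - xi for yi, xi in zip(y, x)]
    return y, d

# hypothetical instance: x_k = (0, 0), grad f = (-4, -8), Hessian diag(2, 4),
# bounds 0 <= y <= 1; the Newton point (2, 2) is clipped to y_k = (1, 1)
y, d = subproblem_direction([0.0, 0.0], [-4.0, -8.0], [2.0, 4.0],
                            [0.0, 0.0], [1.0, 1.0])
```

For general A the subproblem is a genuine QP, but this separable case shows why only the objective (H_k, c_k) changes from one iteration to the next while the constraint data stay fixed.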