Convergence of Symmetric Rank-one Method Based on Modified Quasi-Newton Equation

In this paper we investigate the convergence rate of a modified symmetric rank-one (SR1) method for unconstrained optimization problems. The modified SR1 method incorporates a modified secant equation into the standard SR1 update, and a restart procedure is applied to avoid loss of positive definiteness and a zero denominator. A remarkable feature of the modified SR1 method is that it is at most (n + 1)-step q-superlinearly convergent and 2n-step q-quadratically convergent without any assumption of uniformly independent steps.


Introduction
We consider the unconstrained optimization problem min_{x ∈ R^n} f(x), where f : R^n → R is a smooth function whose gradient at a point x_k is denoted ∇f(x_k), or g_k for the sake of simplicity. We assume that f is at least twice continuously differentiable. Among the various iterative methods for solving eq. (1), quasi-Newton (QN) methods constitute an important class. These methods are based on Newton's method, in which the Hessian matrix of f at x_k, ∇²f(x_k), is replaced by some matrix B_k to avoid the calculation of the Hessian. The QN method for solving eq. (1) takes the following iterative process.
Given the k-th iterate x_k and the gradient of the function at x_k, ∇f(x_k), we determine the QN direction d_k by B_k d_k + ∇f(x_k) = 0, where B_k is a secant approximation to ∇²f(x_k). Once d_k is obtained, the next iterate is generated by x_{k+1} = x_k + α_k d_k. Then B_k is updated to B_{k+1} so that B_{k+1} satisfies the secant equation B_{k+1} s_k = y_k, where s_k = x_{k+1} − x_k and y_k = g_{k+1} − g_k. However, the standard secant equation employs only gradient information and ignores function-value information. Therefore Wei et al. (Wei et al., (2006)) proposed the modified secant equation B_{k+1} s_k = ỹ_k, where ỹ_k = y_k + (ϑ_k / (s_k^T A_k s_k)) A_k s_k with ϑ_k = 2(f_k − f_{k+1}) + (g_{k+1} + g_k)^T s_k, and A_k is a simple symmetric and positive definite matrix.
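As a concrete illustration, the modified difference vector of Wei et al., ỹ_k = y_k + (ϑ_k / (s_k^T A_k s_k)) A_k s_k with ϑ_k = 2(f_k − f_{k+1}) + (g_{k+1} + g_k)^T s_k, can be computed as follows. This is only a sketch: the function name and interface are ours, and A_k defaults to the identity.

```python
import numpy as np

def modified_secant_vector(s, y, f_k, f_k1, g_k, g_k1, A=None):
    """Return the modified vector y_tilde used in place of y in the
    secant equation B_{k+1} s = y_tilde (form of Wei et al.).

    s   : step x_{k+1} - x_k
    y   : gradient difference g_{k+1} - g_k
    f_k, f_k1 : function values at x_k and x_{k+1}
    g_k, g_k1 : gradients at x_k and x_{k+1}
    A   : simple symmetric positive definite matrix (identity by default)
    """
    if A is None:
        A = np.eye(len(s))
    # theta blends function-value information into the secant condition
    theta = 2.0 * (f_k - f_k1) + (g_k1 + g_k) @ s
    As = A @ s
    return y + (theta / (s @ As)) * As
```

A useful sanity check: on a quadratic objective the correction term ϑ_k vanishes identically, so ỹ_k reduces to the ordinary y_k.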
In order to use both function and gradient information in the secant equation, they proposed a modified BFGS-type method for the solution of eq. (1) and proved the superlinear convergence of the BFGS-type algorithm. On the other hand, Conn et al. (Conn et al., (1991)) and Khalfan et al. (Khalfan et al., (1993)) analyzed the computational behavior of SR1 methods, and their numerical results showed that SR1 is a competitive formula among QN methods. Motivated by this, Modarres et al. (Farzin et al., (2009)) presented a modification of the secant equation of Wei et al. and employed it in the SR1 update. They used a restart procedure to preserve positive definiteness and to avoid unbounded updates in the modified SR1 update, and the global convergence of this method was proved. The convergence rate of the SR1 method was studied by Conn et al., who proved that the rate of convergence is q-superlinear under the assumption of uniformly linearly independent steps. This condition may be too strong in practice. Therefore Khalfan et al. made the weaker assumption that the matrices are positive definite and uniformly bounded. They showed that the standard SR1 update is at most (n + 1)-step q-superlinear and 2n-step q-quadratic.
Hence, it seems possible to extend similar results of Khalfan et al. to the modified SR1 method. In the next section, we present the algorithm of the modified SR1 method. Finally, we concentrate on the proof of the superlinear convergence of the modified SR1 method.

Description of algorithm
Modified SR1 Algorithm (+MSR1)
Step 0. Given an initial point x_0 and an initial positive definite matrix H_0 = I, set k = 0.
Step 1. If the gradient g_k is sufficiently small, stop.
Step 2. Compute a quasi-Newton direction d_k from H_k.
Step 3. Find an acceptable steplength α_k such that the Wolfe conditions hold.
Steps 4-5. If the denominator in the H_k update is sufficiently close to zero relative to the tolerance r ∈ (0, 1), set H_{k+1} = λ_k I, where λ_k is a scaling factor, and go to Step 8.
Step 6. Calculate ỹ_k using the modified secant equation.
Step 7. Compute the next inverse Hessian approximation H_{k+1} by the modified SR1 update.
Step 8. Set k = k + 1, and go to Step 1.
Remark 1 Note that the +MSR1 method preserves positive definiteness; in the case that B_k is not positive definite, replacing the update by λ_k I preserves positive definiteness of the Hessian approximation.
Remark 2 The scaling factor λ_{k−1} is derived in such a way as to improve the conditioning of the modified SR1 method while preserving positive definiteness.

Convergence Rate of +MSR1 Algorithm
In this section we show that the modified SR1 update generated by the +MSR1 Algorithm is at most (n + 1)-step q-superlinearly convergent and 2n-step q-quadratically convergent.
To give the convergence results, the following assumptions are made:
(i) The sequence of iterates {x_k} remains in a closed, bounded, convex set D.
(ii) The function f has a unique minimizer x* at which the Hessian ∇²f(x*) is positive definite, and ∇²f is Lipschitz continuous near x*; that is, there exists a constant τ > 0 such that ‖∇²f(x) − ∇²f(y)‖ ≤ τ‖x − y‖ for all x, y in some neighborhood of x*.
(iii) The sequence {x_k} converges to x*.
Since the modified SR1 method always generates positive definite updates, for a strongly convex objective function a line search implementation with the Wolfe conditions ensures that Assumption (iii) holds.
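For concreteness, the Wolfe conditions used in the line search can be checked as in the following sketch (the constants c1 and c2 are conventional choices, not values specified by the paper):

```python
import numpy as np

def satisfies_wolfe(f, grad, x, d, alpha, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for steplength alpha along
    direction d: sufficient decrease (Armijo) plus curvature."""
    fx, gx = f(x), grad(x)
    x_new = x + alpha * d
    armijo = f(x_new) <= fx + c1 * alpha * (gx @ d)        # decrease
    curvature = grad(x_new) @ d >= c2 * (gx @ d)           # curvature
    return bool(armijo and curvature)
```

For example, on f(x) = ½‖x‖² from x = (1) with d = (−1), the full step α = 1 satisfies both conditions, while a too-long step such as α = 2.5 violates the decrease condition.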
We first extend Lemma 1 of (Conn et al., (1991)) to the modified SR1 update; the result does not assume linear independence of the step directions.
Lemma 1 Let {x_k} be a sequence of iterates generated by the +MSR1 Algorithm. Suppose that Assumptions (i)-(iii) hold and, furthermore, that eq. (7) holds at each iteration. Then, for each j and all i ≥ j + 2, eq. (11) and eq. (12) hold, where τ is the Lipschitz constant from Assumption (ii).
Proof 1 It is obvious that eq. (11), and eq. (12) with i = j + 1, are an immediate consequence of B_{k+1} s_k = ỹ_k. We now prove eq. (12) by induction. First choose k ≥ j + 1 and suppose that eq. (12) holds for every i = j + 1, ..., k. Now consider the following estimate; in the last inequality we use the inductive assumption and the Cauchy-Schwarz inequality.
Using the mean value theorem, we obtain, for all l, eq. (14) with H̄_l = ∫₀¹ H(x_l + t s_l) dt. Then from eq. (14) we have eq. (15). Furthermore, we know from the triangle inequality and eq. (7) that eq. (16) holds. Hence, using the induction assumption and eq. (16) in eq. (17), we obtain eq. (18). Putting i = k + 1 in eq. (18), taking into account that r ∈ (0, 1), and using the inequality η_{k,j} ≤ η_{k+1,j}, we deduce eq. (12).
Using Lemma 1, we then obtain the following two useful lemmas, which state that when the +MSR1 Algorithm is used, there are at least p − n superlinear steps in every p > n iterations.
Lemma 2 Let {x_k} be the sequence of iterates generated by the +MSR1 Algorithm and let B_k be the corresponding Hessian approximations. Suppose the assumptions of Lemma 1 are satisfied and that, in addition, there exists M such that ‖B_k‖ ≤ M for all k. Then there exist K ≥ 0 and a set S = {s_{k_i}} satisfying the stated conditions.
Proof 2 The proof is similar to the proof of Lemma 1 of (Khalfan et al., (1993)).
For our final result, Theorem 1 below, which establishes the convergence rate of the modified SR1 method (+MSR1), we first establish the following lemma, which is closely related to the well-known superlinear convergence characterization of Dennis and Moré (Dennis and Moré, (1974)).
Lemma 3 Suppose Assumptions (i)-(iii) hold for the objective function f, and define e_k = x_k − x* in the +MSR1 method.
Proof 3 See Khalfan et al. (Khalfan et al., (1993)).
We are now ready to prove our final result.
Theorem 1 Consider the sequence {x_k} generated by the +MSR1 Algorithm and suppose that Assumptions (i)-(iii) hold.
If there exists K_0 such that B_k is positive definite for all k ≥ K_0, then for any p ≥ n + 1 there exists K_1 such that eq. (24) holds for all k ≥ K_1, where α is a constant.
On the other hand, using this fact together with Lemma 3 and eq. (24) implies that if e_k is sufficiently small, then eq. (25) holds for some constant α.
We may also apply Lemma 2 to the set {s_k, s_{k+1}, ..., s_{k+n}, s_{k+n+1}} − {s_{t_1}} to get t_2. Hence, by repeating this step p − n times, we obtain a set of integers t_1 < t_2 < ... < t_{p−n} with t_1 > k and t_{p−n} < k + p such that e_{t_i + 1} ≤ α e_k^{1/n} e_{t_i} for each t_i. Since we have a descent method, it follows that eq. (26) holds. Using eq. (23), we have that eq. (27) holds for an arbitrary k ≥ K_1 and i = 1, 2, ..., p − n. Therefore, using eq. (26) and eq. (27), we obtain an inequality which, by eq. (23), implies that e_{k+p} ≤ (ℓ_2 ℓ_1 e_k^{1/n})^{p−n} e_k.
Note that Theorem 1 only requires positive definiteness at the p − n out of p "good iterations" (that is, steps where f is reduced).
Finally, we give the rate of convergence for the +MSR1 Algorithm:
Corollary 1 Under the assumptions of Theorem 1, the sequence {x_k} generated by the +MSR1 Algorithm is at most (n + 1)-step q-superlinear, i.e., lim inf_{k→∞} e_{k+n+1}/e_k = 0, and 2n-step q-quadratic, i.e., lim sup_{k→∞} e_{k+2n}/e_k² < ∞.
Proof 5 The results follow by setting p = n + 1 and p = 2n, respectively, in Theorem 1.
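Writing c for the constant in the bound e_{k+p} ≤ (c e_k^{1/n})^{p−n} e_k of Theorem 1, the two rates in Corollary 1 follow by direct substitution:

```latex
\begin{align*}
p = n+1:&\quad e_{k+n+1} \le c\, e_k^{1/n}\, e_k
  \;\Longrightarrow\; \frac{e_{k+n+1}}{e_k} \le c\, e_k^{1/n} \to 0,\\
p = 2n:&\quad e_{k+2n} \le \bigl(c\, e_k^{1/n}\bigr)^{n} e_k = c^{\,n} e_k^{2}
  \;\Longrightarrow\; \frac{e_{k+2n}}{e_k^{2}} \le c^{\,n}.
\end{align*}
```

The first display gives the (n + 1)-step q-superlinear rate since e_k → 0, and the second gives the 2n-step q-quadratic rate with limsup bounded by c^n.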
Remark 3 By choosing p = n + 1 in Corollary 1, we need the positive definiteness condition at only one step out of every n + 1 "good steps". Hence, among every n + 1 steps beyond k, there is at least one good step (that is, a step where B_k (or H_k) is positive definite and bounded).

Conclusion
In summary, we have considered the convergence properties of the modified SR1 method based on the modified secant equation of Wei et al. An important feature of the proposed method is that it preserves positive definiteness of the updates.