A Shifted Power Method for Homogenous Polynomial Optimization over Unit Spheres

This work was supported by the NSF of China (11171180).

Abstract
In this paper, we propose a shifted power method for a type of polynomial optimization problem over unit spheres. The global convergence of the proposed method is established and an easily implemented range for the shift parameter is provided.


Introduction
Polynomial optimization problems, in which the objective function and the constraints are polynomial functions, have received much attention recently due to their wide applications in areas such as signal processing (Ghosh 2008, Qi 2003), biomedical engineering (Kofidis 2002, Lasserre 2001), material science (Soare 2008), quantum mechanics (Dahl et al. 2008, Wang et al. 2009), and numerical linear algebra (Hof 2009, Qi 2005); see (Klerk 2008) for a survey of the various classes of polynomial optimization with simplex, hypercube, or sphere constraints. Polynomial optimization is challenging: even the simplest instances, such as maximizing a cubic polynomial over a sphere, are NP-hard (Nesterov 2003). Nevertheless, researchers have made many contributions in this area, ranging from theory to numerical solution methods (Lathauwer 2000, He 2010, Lasserre 2001, Luo 2010, Qi 2004, 2009, Zhang 2012).
In this paper, we consider the following type of polynomial optimization over unit spheres:

$$\max\ f(x_1, x_2, \dots, x_s) \quad \text{s.t.} \quad x_i^\top x_i = 1,\ x_i \in \mathbb{R}^{n_i},\ i = 1, 2, \dots, s, \tag{1.1}$$

where $f : \mathbb{R}^{n_1 + \cdots + n_s} \to \mathbb{R}$ is a homogeneous polynomial function each term of which is also homogeneous of order $d_i$ with respect to $x_i$, $i = 1, 2, \dots, s$.
It is well known that tensors are a useful tool in polynomial optimization: a polynomial, especially a homogeneous one, has a very simple expression in terms of tensors, and, furthermore, the optimality conditions of a homogeneous polynomial optimization problem with a specially structured feasible region can be characterized in depth (Qi 2005, Qi 2009).
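A tensor is a multidimensional array of data whose elements are referred to by multiple indices, i.e., $\mathcal{A} = (a_{i_1 i_2 \cdots i_d}) \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, where $d$, the number of indices, is called the order of the tensor, the $d$-tuple $(n_1, n_2, \dots, n_d)$ is its dimension, and $a_{i_1 i_2 \cdots i_d}$ is the element indexed by $(i_1, i_2, \dots, i_d)$.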
By virtue of partially symmetric tensors, problem (1.1) can be written as

$$\max\ \mathcal{A} x_1^{d_1} x_2^{d_2} \cdots x_s^{d_s} \quad \text{s.t.} \quad x_i^\top x_i = 1,\ x_i \in \mathbb{R}^{n_i},\ i = 1, 2, \dots, s, \tag{1.2}$$

where $\mathcal{A}$ is a partially symmetric tensor w.r.t. its index blocks. This problem contains the problems of finding the best rank-1 approximation and of computing the largest eigenvalue in magnitude of a super-symmetric tensor (Kofidis 2002, Kolda 2011, Zhang 2012, Qi 2009) as special cases. If $s = 2$ and $d_1 = d_2 = 2$, then this problem reduces to the problem considered in (Dahl 2008, Wang 2009), which arises in the analysis of nonlinear elastic materials and in entanglement studies in quantum physics. The problem is also a special case of the spherically constrained homogeneous polynomial optimization problem considered in (Chen 2012).
Generally, there are three popular numerical solution methods for polynomial optimization. The first is the SOS approach, which is based on the decomposition of a multivariate polynomial into a sum of squares (Lasserre 2001, Parrilo 2003), and the second is the semidefinite programming relaxation technique (Luo 2010), in which the problem concerned is approximated by a specially constructed semidefinite programming problem. These two kinds of methods can obtain a global solution of the problem in a certain sense; however, their computational cost is high, so they are efficient only for small-scale polynomial optimization problems. The third method is the power method. It originates from the power method for computing the largest eigenvalue in magnitude of a square matrix (Golub 1996), and was later successfully extended to computing the largest singular value of a higher-order tensor (Lathauwer 2000) and the largest Z-eigenvalue in magnitude of a higher-order super-symmetric tensor (Kofidis 2002, Qi 2005), both of which are in fact homogeneous polynomial optimization problems with unit sphere constraints (Lathauwer 2000, Kofidis 2002). It is also used to compute the largest eigenvalue of a nonnegative tensor. The distinctive feature of this method is that it needs less computation at each iteration, while its convergence can be guaranteed only under a convexity assumption. To remove this restriction, a shift technique was introduced into the objective function (Kolda 2011, Wang 2009).
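For reference, the classical matrix power method mentioned above can be sketched as follows. This is a minimal illustration in plain Python, assuming a real symmetric matrix whose eigenvalue of largest magnitude is simple; the function name `power_method`, the iteration cap, and the tolerance are illustrative choices and are not taken from the cited works.

```python
import math

def power_method(A, x0, iters=500, tol=1e-12):
    """Estimate the eigenvalue of largest magnitude of a square matrix A.

    A is a list of rows and x0 a nonzero starting vector (hypothetical API).
    Returns the Rayleigh-quotient estimate and the last unit iterate.
    """
    n = len(A)
    norm = math.sqrt(sum(v * v for v in x0))
    x = [v / norm for v in x0]
    lam = 0.0
    for _ in range(iters):
        # Multiply by A and renormalize.
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            break  # A x = 0: the iteration cannot continue
        x = [v / norm for v in y]
        # Rayleigh quotient x^T A x as the eigenvalue estimate.
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam_new = sum(x[i] * Ax[i] for i in range(n))
        done = abs(lam_new - lam) < tol
        lam = lam_new
        if done:
            break
    return lam, x
```

For example, `power_method([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])` approaches the dominant eigenvalue 3 of that matrix. The tensor methods discussed in this paper replace the matrix-vector product by a tensor-vector contraction.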
In this paper, we extend the shifted power method to the more general type of polynomial optimization problem (1.1) and establish the convergence of the method. The main contributions of the paper are as follows.
(1) We apply the shifted power method to a more general type of polynomial optimization problem defined on unit spheres and establish its convergence.
(2) We provide an easily implemented range for the shift parameters used in the designed iterative method.
The rest of this paper is organized as follows. In Section 2, we give a short description of tensor algebra and introduce some notation. In Section 3, we present the shifted power method and establish its convergence.

Preliminaries and Notations
In this section, we give a short description of tensor algebra and introduce some notation used throughout the paper.
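For vectors $x_r \in \mathbb{R}^{n_r}$, $r = 1, 2, \dots, d$, their outer product $\mathcal{X} = x_1 \circ x_2 \circ \cdots \circ x_d$ is the rank-1 tensor with entries

$$\mathcal{X}_{i_1 i_2 \cdots i_d} = (x_1)_{i_1} (x_2)_{i_2} \cdots (x_d)_{i_d},$$

where $(x_r)_{i_r}$ denotes the $i_r$-th entry of the vector $x_r$. Based on the inner product of tensors of the same order and the same size, the inner product of $\mathcal{X}$ with a general tensor $\mathcal{A}$ is

$$\langle \mathcal{A}, \mathcal{X} \rangle = \sum_{i_1, i_2, \dots, i_d} a_{i_1 i_2 \cdots i_d} (x_1)_{i_1} (x_2)_{i_2} \cdots (x_d)_{i_d}.$$

For simplicity, we denote this product by $\mathcal{A} x_1 x_2 \cdots x_d$, and by $\mathcal{A} x^d$ when $x_1 = \cdots = x_d = x \in \mathbb{R}^n$. In the latter case, for a super-symmetric tensor $\mathcal{A}$, contracting only $d - 2$ of the indices with $x$ gives $\mathcal{A} x^{d-2}$, which is an $n \times n$ symmetric matrix.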
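The contraction notation can be made concrete with a small sketch for a third-order super-symmetric tensor ($d = 3$), stored here as nested Python lists. The function names `ax_d`, `ax_dm1` and `ax_dm2` are hypothetical and chosen only to mirror $\mathcal{A} x^d$, $\mathcal{A} x^{d-1}$ and $\mathcal{A} x^{d-2}$.

```python
def ax_d(A, x):
    """Scalar A x^3: contract all three indices of A with x."""
    n = len(x)
    return sum(A[i][j][k] * x[i] * x[j] * x[k]
               for i in range(n) for j in range(n) for k in range(n))

def ax_dm1(A, x):
    """Vector A x^2: contract the last two indices of A with x."""
    n = len(x)
    return [sum(A[i][j][k] * x[j] * x[k]
                for j in range(n) for k in range(n))
            for i in range(n)]

def ax_dm2(A, x):
    """Matrix A x^1: contract the last index of A with x.

    For a super-symmetric A the result is a symmetric n-by-n matrix.
    """
    n = len(x)
    return [[sum(A[i][j][k] * x[k] for k in range(n))
             for j in range(n)]
            for i in range(n)]

# A small super-symmetric example: a_ijk = i + j + k is invariant
# under any permutation of its indices.
A = [[[i + j + k for k in range(2)] for j in range(2)] for i in range(2)]
x = [1, 2]
M = ax_dm2(A, x)   # symmetric 2-by-2 matrix
v = ax_dm1(A, x)   # equals M applied to x
s = ax_d(A, x)     # equals the inner product of x and v
```

In this example the three quantities are consistent with one another: applying `M` to `x` reproduces `v`, and the inner product of `x` with `v` reproduces `s`.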
Throughout this paper, we use $\Gamma_n$ and $\Sigma_n$ to denote the unit ball and the unit sphere in $\mathbb{R}^n$, respectively; the subscript $n$ is omitted when no confusion arises.

Algorithm and Convergence
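For problem (1.1), the linear independence constraint qualification is always satisfied. Thus, for any optimal solution $(x_1^*, x_2^*, \dots, x_s^*)$ of the problem, there exists a multiplier $\lambda$ such that

$$\mathcal{A} (x_1^*)^{d_1} \cdots (x_i^*)^{d_i - 1} \cdots (x_s^*)^{d_s} = \lambda x_i^*, \quad (x_i^*)^\top x_i^* = 1, \quad i = 1, 2, \dots, s.$$

This system constitutes the KKT condition of problem (1.2), which corresponds to the stationary point condition of the Lagrange function of the problem (Nocedal 1999). As $(x_i^*)^\top x_i^* = 1$, taking the inner product of the $i$-th equation with $x_i^*$ gives $\lambda = \mathcal{A} (x_1^*)^{d_1} (x_2^*)^{d_2} \cdots (x_s^*)^{d_s}$. Certainly, this is the objective function value at the point $(x_1^*, x_2^*, \dots, x_s^*)$.
Based on the shifted power method for computing the largest eigenpairs of a super-symmetric tensor (Kolda 2011, Wang 2009), we have the following shifted power method for solving problem (1.1).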

Algorithm 3.1 (Shifted power method)
    ...
    end if
end for
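To illustrate the kind of iteration involved, here is a minimal Python sketch for the special case $s = 2$, $d_1 = d_2 = 2$, with $f(x, y) = \sum a_{ijkl} x_i x_j y_k y_l$ and $a_{ijkl} = a_{jikl} = a_{ijlk}$. It should not be read as the listing of Algorithm 3.1: the shifted function $g(x, y) = f(x, y) + \alpha (x^\top x + y^\top y)$, the simultaneous update of both blocks, the stopping rule, and all function names are assumptions made for illustration, in the spirit of (Kolda 2011, Wang 2009). The helper `crude_shift` returns $3 \sum |a_{ijkl}|$, a deliberately crude value for which a Geršgorin row-sum argument shows that the Hessian of this $g$ is positive semidefinite on the product of unit balls; it is not the range derived in this paper.

```python
import math

def normalize(v):
    """Scale a nonzero vector to unit Euclidean norm."""
    nrm = math.sqrt(sum(t * t for t in v))
    return [t / nrm for t in v]

def f_val(A, x, y):
    """f(x, y) = sum over i, j, k, l of a_ijkl * x_i * x_j * y_k * y_l."""
    n, m = len(x), len(y)
    return sum(A[i][j][k][l] * x[i] * x[j] * y[k] * y[l]
               for i in range(n) for j in range(n)
               for k in range(m) for l in range(m))

def axy2(A, x, y):
    """Vector A x y^2: half of the gradient of f in x (partial symmetry)."""
    n, m = len(x), len(y)
    return [sum(A[i][j][k][l] * x[j] * y[k] * y[l]
                for j in range(n) for k in range(m) for l in range(m))
            for i in range(n)]

def ax2y(A, x, y):
    """Vector A x^2 y: half of the gradient of f in y (partial symmetry)."""
    n, m = len(x), len(y)
    return [sum(A[i][j][k][l] * x[i] * x[j] * y[l]
                for i in range(n) for j in range(n) for l in range(m))
            for k in range(m)]

def crude_shift(A):
    """Illustrative shift 3 * sum |a_ijkl| (an assumption, not the paper's)."""
    return 3.0 * sum(abs(a) for Ai in A for Aij in Ai
                     for Aijk in Aij for a in Aijk)

def shifted_power(A, x, y, alpha, iters=5000, tol=1e-12):
    """Normalized-gradient iteration on g = f + alpha * (x^T x + y^T y)."""
    x, y = normalize(x), normalize(y)
    history = [f_val(A, x, y)]
    for _ in range(iters):
        # Half-gradients of g at the current point; both blocks use (x, y).
        gx = [a + alpha * t for a, t in zip(axy2(A, x, y), x)]
        gy = [a + alpha * t for a, t in zip(ax2y(A, x, y), y)]
        x, y = normalize(gx), normalize(gy)
        history.append(f_val(A, x, y))
        if history[-1] - history[-2] < tol:
            break  # the objective no longer increases noticeably
    return x, y, history

def kkt_residual(A, x, y):
    """Norm of (A x y^2 - lam x, A x^2 y - lam y) with lam = f(x, y)."""
    lam = f_val(A, x, y)
    rx = [a - lam * t for a, t in zip(axy2(A, x, y), x)]
    ry = [a - lam * t for a, t in zip(ax2y(A, x, y), y)]
    return math.sqrt(sum(t * t for t in rx + ry))
```

On the unit spheres $g$ differs from $f$ only by the constant $2\alpha$, so the recorded values of $f$ should be nondecreasing whenever $g$ is convex on the product of unit balls. A larger shift makes convexity easier to guarantee but slows the iteration, which is why a sharp, easily computed range for the shift parameters is of practical interest.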
To see that the algorithm is well defined, we need to show that, at each iteration, the gradient of the shifted objective function $g$ is nonzero.
Proof. With the tensor product notation, the function $g$ can be expressed in terms of $\mathrm{unfold}(\cdot)$, the unfolded matrix of a second-order tensor in $\mathbb{R}^{n_i \times n_i}$, and $I_{n_i}$, the identity matrix of dimension $n_i$.
To show the strict convexity of the function $g$, consider the unfolded matrix above. From the partial symmetry of the tensor $\mathcal{A}$, this matrix is symmetric, and from the Geršgorin disc theorem (Golub 1996), its spectral radius is bounded above by a sum of absolute values of entries of $\mathcal{A}$. Denote this sum by $\alpha$. Then, for any $\alpha_i > \alpha$, the correspondingly shifted matrix is positive definite. From the continuity of the smallest and largest eigenvalues of a matrix w.r.t. its elements and the boundedness of $\Gamma_{n_1} \times \cdots \times \Gamma_{n_s}$, we conclude that the eigenvalues of the above matrix are bounded away from zero on $\Gamma_{n_1} \times \cdots \times \Gamma_{n_s}$. This means that the matrix is nonsingular on $\Gamma_{n_1} \times \cdots \times \Gamma_{n_s}$.
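The role of the Geršgorin disc theorem can be checked numerically on a small symmetric matrix. The sketch below is only an illustration of the general principle (the largest absolute row sum bounds the spectral radius, so any larger shift yields a positive definite matrix); the function names and the test matrix are hypothetical and unrelated to the specific unfolded matrix of the proof.

```python
import math

def gershgorin_bound(M):
    """Largest absolute row sum, an upper bound on the spectral radius."""
    return max(sum(abs(v) for v in row) for row in M)

def add_shift(M, alpha):
    """Return M + alpha * I."""
    n = len(M)
    return [[M[i][j] + (alpha if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def is_positive_definite(M):
    """Try a Cholesky factorization of the symmetric matrix M.

    The factorization succeeds exactly when M is positive definite
    (up to rounding errors).
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # nonpositive pivot: not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

# A symmetric indefinite matrix and its shifted version.
M = [[0.0, 2.0, -1.0],
     [2.0, -3.0, 0.5],
     [-1.0, 0.5, 1.0]]
alpha = gershgorin_bound(M)          # bound on the spectral radius of M
shifted = add_shift(M, alpha + 0.1)  # any shift above the bound works
```

Here `is_positive_definite(M)` is false while `is_positive_definite(shifted)` is true, which mirrors the choice $\alpha_i > \alpha$ in the argument above.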
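Consequently, the gradient of the polynomial function $g$ is nonzero at each iterate, and the algorithm is well defined.

For the sequence generated by the algorithm, the following statements hold: (1) the sequence $\{g(x_1^{(k)}, \dots, x_s^{(k)})\}$ is strictly increasing; (2) if the algorithm terminates after finitely many steps, then the final point is a KKT point of problem (1.1), and if the algorithm generates an infinite sequence $\{(x_1^{(k)}, \dots, x_s^{(k)})\}$, ...
Proof. To prove (1), we first show that inequality (3.1) holds for any $k = 0, 1, 2, \dots$. In fact, the convexity of the function $g$ gives inequality (3.2). Owing to the unit-norm constraints, one has from the Cauchy-Schwarz inequality that the right-hand side of (3.2) is nonnegative. This means that inequality (3.1) holds.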
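The monotonicity argument can be written out for a single block as follows. This is a sketch under stated assumptions rather than a reproduction of (3.1) and (3.2): $g$ is assumed convex on the unit ball, $\|x\| = 1$, $\nabla g(x) \neq 0$, and the update is assumed to be the normalized gradient $x^{+} = \nabla g(x) / \|\nabla g(x)\|$, as in (Kolda 2011).

```latex
\begin{align*}
g(x^{+}) - g(x)
  &\ge \nabla g(x)^{\top} (x^{+} - x)
     && \text{(convexity of } g \text{)} \\
  &= \|\nabla g(x)\| - \nabla g(x)^{\top} x
     && \text{(since } x^{+} = \nabla g(x) / \|\nabla g(x)\| \text{)} \\
  &\ge \|\nabla g(x)\| - \|\nabla g(x)\| \, \|x\| = 0
     && \text{(Cauchy--Schwarz and } \|x\| = 1 \text{)}.
\end{align*}
```

Equality in the Cauchy-Schwarz step requires $x$ to be a positive multiple of $\nabla g(x)$, that is, $x^{+} = x$. With several blocks, the same estimate applies to each term $\nabla_{x_i} g^{\top} (x_i^{+} - x_i)$ of the inner product separately.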
For (2), we first consider the case in which the algorithm terminates after finitely many steps, say, at Step $k_0$. Then, from Algorithm 3.1 and the discussion for (1), the iterate no longer changes at Step $k_0$. Combining this with the iterative procedure of Algorithm 3.1 yields that the KKT system holds for each $i$. This means that $(x_1^{(k_0)}, x_2^{(k_0)}, \dots, x_s^{(k_0)})$ is a KKT point of problem (1.2). Now, consider the case in which the algorithm generates an infinite sequence. In this case, as the sequence $\{g(x_1^{(k)}, \dots, x_s^{(k)})\}$ is increasing and, being defined over unit spheres, bounded from above, it converges to some value $g^*$. In particular, one has

A tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ is called partially symmetric if its indices can be partitioned into $s$ index blocks and there exist positive integers $d_1$