J. Inequal. Appl. 2017 Sep 22;2017(1):236. doi: 10.1186/s13660-017-1510-0

A family of conjugate gradient methods for large-scale nonlinear equations

Dexiang Feng 1,2, Min Sun 3, Xueyong Wang 3
PMCID: PMC5610229  PMID: 28989261

Abstract

In this paper, we present a family of conjugate gradient projection methods for solving large-scale nonlinear equations. At each iteration, the method requires little storage and its subproblem can be solved easily. Compared with existing solution methods for this problem, its global convergence is established without requiring Lipschitz continuity of the underlying mapping. Preliminary numerical results are reported to show the efficiency of the proposed method.

Keywords: nonlinear equations, conjugate gradient method, projection method, global convergence

Introduction

Consider the following nonlinear equations problem of finding $x \in C$ such that

$$F(x) = 0, \qquad (1.1)$$

where $F:\mathbb{R}^n \to \mathbb{R}^n$ is a continuous nonlinear mapping and $C$ is a nonempty closed convex set of $\mathbb{R}^n$. The problem finds wide applications in areas such as ballistic trajectory computation and vibration systems [1, 2], the power flow equations [3-5], the economic equilibrium problem [6-8], etc.

Generally, there are two categories of solution methods for this problem. The first category consists of first-order methods, including the trust region method, the Levenberg-Marquardt method and the projection method. The second consists of second-order methods, including the Newton method and quasi-Newton methods. Among the first category, Zhang et al. [9] proposed a spectral gradient method for problem (1.1) with $C=\mathbb{R}^n$, and Wang et al. [3] proposed a projection method for problem (1.1). Later, Yu et al. [10] proposed a spectral gradient projection method for constrained nonlinear equations. Compared with the projection method in [3], the methods in [9, 10] need Lipschitz continuity of the underlying mapping $F(\cdot)$, while the former needs to solve a linear equation at each iteration, and its variants [11, 12] inherit this shortcoming. Different from the above, in this paper we consider the conjugate gradient method for solving problem (1.1). To this end, we briefly review the well-known conjugate gradient method for the unconstrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x).$$

The conjugate gradient method generates the sequence of iterates recursively by

$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots,$$

where $x_k$ is the current iterate, $\alpha_k > 0$ is the step-size determined by some line search, and $d_k$ is the search direction defined by

$$d_k = \begin{cases} -g_k & \text{if } k = 0, \\ -g_k + \beta_k d_{k-1} & \text{if } k \ge 1, \end{cases}$$

in which $g_k = \nabla f(x_k)$ and $\beta_k$ is a scalar parameter. Famous conjugate gradient methods include the Fletcher-Reeves (FR) method, the Polak-Ribière-Polyak (PRP) method, the Liu-Storey (LS) method and the Dai-Yuan (DY) method. Recently, Sun et al. [13] and Li et al. [14] proposed two variants of the PRP method which possess the following property:

$$|\beta_k| \le t \frac{\|g_k\|}{\|d_{k-1}\|}, \quad \forall k \ge 1,$$

where $t > 0$ is a constant.
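For readers who prefer code, the following is a generic Python sketch of such a nonlinear conjugate gradient iteration for unconstrained minimization, with a PRP-type $\beta_k$ clipped so that the bound above holds; the Armijo backtracking line search and all names are illustrative choices, not the method proposed in this paper.

```python
import numpy as np

def cg_descent(f, grad, x0, t=0.5, max_iter=200, tol=1e-6):
    """Generic nonlinear CG sketch for min f(x): PRP-type beta clipped so that
    |beta_k| <= t * ||g_k|| / ||d_{k-1}||, with simple Armijo backtracking."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Armijo backtracking on f along d (an illustrative line search choice).
        alpha, c = 1.0, 1e-4
        while f(x + alpha * d) > f(x) + c * alpha * (g @ d) and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        beta = g_new @ (g_new - g) / max(np.linalg.norm(g) ** 2, 1e-16)    # PRP formula
        cap = t * np.linalg.norm(g_new) / max(np.linalg.norm(d), 1e-16)    # bound from the text
        beta = np.clip(beta, -cap, cap)
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x, k

# Example: minimize the simple quadratic f(x) = 0.5 * ||x||^2.
x_min, iters = cg_descent(lambda x: 0.5 * x @ x, lambda x: x, np.ones(5))
```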

In this paper, motivated by the projection methods in [3, 11] and the conjugate gradient methods in [13, 14], we propose a new family of conjugate gradient projection methods for solving the nonlinear problem (1.1). The newly designed method is derivative-free, as it needs neither the Jacobian matrix of the underlying mapping in (1.1) nor any approximation of it. Further, the new method does not need to solve any linear equations at each iteration; thus it is suitable for solving the large-scale problem (1.1).

The remainder of this paper is organized as follows. Section 2 describes the new method and establishes its global convergence. The numerical results are reported in Section 3. Some concluding remarks are drawn in the last section.

Algorithm and convergence analysis

Throughout this paper, we assume that the mapping $F(\cdot)$ is monotone, or more generally pseudo-monotone, on $\mathbb{R}^n$ in the sense of Karamardian [15]. That is, denoting by $S$ the solution set of problem (1.1), it satisfies

$$\langle F(y), y - x^* \rangle \ge 0, \quad \text{for all } y \in \mathbb{R}^n,\ x^* \in S, \qquad (2.1)$$

where $\langle \cdot, \cdot \rangle$ denotes the usual inner product in $\mathbb{R}^n$. Further, we use $P_C(x)$ to denote the projection of a point $x \in \mathbb{R}^n$ onto the convex set $C$, which satisfies the following property:

$$\|P_C(x) - P_C(y)\|^2 \le \|x - y\|^2 - \|P_C(x) - x + y - P_C(y)\|^2, \quad \forall x, y \in \mathbb{R}^n. \qquad (2.2)$$

Now, we describe the new conjugate gradient projection method for nonlinear constrained equations.

Algorithm 2.1

Step 0.

Given an arbitrary initial point $x_0 \in \mathbb{R}^n$, parameters $0 < \rho < 1$, $\sigma > 0$, $t > 0$, $\beta > 0$, $\epsilon > 0$, and set $k := 0$.

Step 1.

If $\|F(x_k)\| < \epsilon$, stop; otherwise go to Step 2.

Step 2.

Compute

$$d_k = \begin{cases} -F(x_k) & \text{if } k = 0, \\ -\left(1 + \beta_k \dfrac{F(x_k)^\top d_{k-1}}{\|F(x_k)\|^2}\right) F(x_k) + \beta_k d_{k-1} & \text{if } k \ge 1, \end{cases} \qquad (2.3)$$

where $\beta_k$ is such that

$$|\beta_k| \le t \frac{\|F(x_k)\|}{\|d_{k-1}\|}, \quad \forall k \ge 1. \qquad (2.4)$$

Step 3.

Find the trial point $y_k = x_k + \alpha_k d_k$, where $\alpha_k = \beta \rho^{m_k}$ and $m_k$ is the smallest nonnegative integer $m$ such that

$$-\langle F(x_k + \beta \rho^m d_k), d_k \rangle \ge \sigma \beta \rho^m \|d_k\|^2. \qquad (2.5)$$

Step 4.

Compute

$$x_{k+1} = P_{H_k}[x_k - \xi_k F(y_k)], \qquad (2.6)$$

where

$$H_k = \{x \in \mathbb{R}^n \mid h_k(x) \le 0\},$$

with

$$h_k(x) = \langle F(y_k), x - y_k \rangle, \qquad (2.7)$$

and

$$\xi_k = \frac{\langle F(y_k), x_k - y_k \rangle}{\|F(y_k)\|^2}.$$

Set $k := k + 1$ and go to Step 1.
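For concreteness, the following is a minimal Python sketch of Algorithm 2.1 for the case $C = \mathbb{R}^n$, so that Step 4 reduces to the closed-form half-space projection given later in this section. The function name solve_cgp, the choice $\beta_k = \|F(x_k)\|/\|d_{k-1}\|$ (the $\beta_k^{S1}$ choice discussed below, which satisfies (2.4) with $t = 1$) and the cap on backtracking steps are illustrative assumptions; the paper's experiments use Matlab.

```python
import numpy as np

def solve_cgp(F, x0, t=1.0, sigma=0.01, rho=0.5, beta=1.0, eps=1e-6, max_iter=5000):
    """Minimal sketch of Algorithm 2.1 with C = R^n."""
    x = np.asarray(x0, dtype=float)
    F_x = F(x)
    d_prev = None
    for k in range(max_iter):
        if np.linalg.norm(F_x) < eps:                      # Step 1: stopping test
            break
        if d_prev is None:                                 # Step 2: direction (2.3)
            d = -F_x
        else:
            beta_k = np.linalg.norm(F_x) / np.linalg.norm(d_prev)   # satisfies (2.4), t = 1
            coef = 1.0 + beta_k * (F_x @ d_prev) / np.linalg.norm(F_x) ** 2
            d = -coef * F_x + beta_k * d_prev
        alpha = beta                                       # Step 3: line search (2.5)
        for _ in range(60):                                # safeguard on backtracking steps
            y = x + alpha * d
            F_y = F(y)
            if -(F_y @ d) >= sigma * alpha * np.linalg.norm(d) ** 2:
                break
            alpha *= rho
        if np.linalg.norm(F_y) < eps:                      # y_k already solves F(y) = 0
            return y, k
        xi = (F_y @ (x - y)) / np.linalg.norm(F_y) ** 2    # Step 4: projection step (2.6)
        z = x - xi * F_y
        viol = F_y @ (z - y)                               # h_k(z) from (2.7)
        if viol > 0.0:
            z = z - viol / np.linalg.norm(F_y) ** 2 * F_y  # project onto the half-space H_k
        x, F_x, d_prev = z, F(z), d
    return x, k
```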

Obviously, Algorithm 2.1 is different from the methods in [13, 14].

Now, we give some comments on the search direction $d_k$ defined by (2.3). We claim that it is derived from Schmidt orthogonalization. In fact, in order to make $d_k = -F(x_k) + \beta_k d_{k-1}$ satisfy the property

$$F(x_k)^\top d_k = -\|F(x_k)\|^2, \qquad (2.8)$$

we only need to ensure that the correction term added to $-F(x_k)$ is orthogonal to $F(x_k)$. As a matter of fact, by Schmidt orthogonalization, we have

$$d_k = -\left(1 + \beta_k \frac{F(x_k)^\top d_{k-1}}{\|F(x_k)\|^2}\right) F(x_k) + \beta_k d_{k-1}.$$

Equality (2.8) together with the Cauchy-Schwarz inequality implies that $\|d_k\| \ge \|F(x_k)\|$. In addition, by (2.3) and (2.4), we have

$$\|d_k\| \le \|F(x_k)\| + |\beta_k| \frac{\|F(x_k)\| \|d_{k-1}\|}{\|F(x_k)\|^2} \|F(x_k)\| + |\beta_k| \|d_{k-1}\| \le \|F(x_k)\| + t \|F(x_k)\| + t \|F(x_k)\| = (1 + 2t) \|F(x_k)\|.$$

Therefore, for all $k \ge 0$, it holds that

$$\|F(x_k)\| \le \|d_k\| \le (1 + 2t) \|F(x_k)\|. \qquad (2.9)$$
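The relations (2.8) and (2.9) can also be checked numerically. The short script below is an illustrative check, not part of the paper: it builds $d_k$ by (2.3) with a $\beta_k$ satisfying (2.4) and verifies both bounds for random data.

```python
import numpy as np

rng = np.random.default_rng(0)
t = 1.0
for _ in range(5):
    F_k = rng.normal(size=8)          # stands in for F(x_k), assumed nonzero
    d_prev = rng.normal(size=8)       # stands in for d_{k-1}
    # any beta_k with |beta_k| <= t ||F(x_k)|| / ||d_{k-1}||, as required by (2.4)
    beta_k = rng.uniform(-1, 1) * t * np.linalg.norm(F_k) / np.linalg.norm(d_prev)
    d_k = -(1 + beta_k * (F_k @ d_prev) / np.linalg.norm(F_k) ** 2) * F_k + beta_k * d_prev
    assert np.isclose(F_k @ d_k, -np.linalg.norm(F_k) ** 2)                  # property (2.8)
    assert np.linalg.norm(F_k) <= np.linalg.norm(d_k) + 1e-12                # lower bound in (2.9)
    assert np.linalg.norm(d_k) <= (1 + 2 * t) * np.linalg.norm(F_k) + 1e-12  # upper bound in (2.9)
```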

Furthermore, it is easy to see that the line search (2.5) is well defined whenever $F(x_k) \ne 0$.

For the parameter $\beta_k$ defined by (2.4), there are many admissible choices, such as $\beta_k^{S1} = \|F(x_k)\| / \|d_{k-1}\|$, or [13, 14]

$$\beta_k^{NWYL} = \frac{\left\langle F(x_k),\ F(x_k) - \frac{\|F(x_k)\|}{\|F(x_{k-1})\|} F(x_{k-1}) \right\rangle}{|F(x_k)^\top d_{k-1}| + t \|F(x_k)\| \|d_{k-1}\|}, \qquad \beta_k^{NPRP} = \frac{\langle F(x_k),\ F(x_k) - F(x_{k-1}) \rangle}{\max\{t \|d_{k-1}\|,\ \|F(x_{k-1})\|^2\}}.$$
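In code, these two choices read as follows; this is a direct transcription of the displayed formulas, with argument names that are illustrative rather than taken from the paper.

```python
import numpy as np

def beta_nwyl(F_k, F_prev, d_prev, t=1.0):
    # beta_k^{NWYL}: a WYL-type numerator with a damped denominator.
    num = F_k @ (F_k - (np.linalg.norm(F_k) / np.linalg.norm(F_prev)) * F_prev)
    den = abs(F_k @ d_prev) + t * np.linalg.norm(F_k) * np.linalg.norm(d_prev)
    return num / den

def beta_nprp(F_k, F_prev, d_prev, t=1.0):
    # beta_k^{NPRP}: a PRP-type numerator with a safeguarded denominator.
    num = F_k @ (F_k - F_prev)
    den = max(t * np.linalg.norm(d_prev), np.linalg.norm(F_prev) ** 2)
    return num / den
```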

From the structure of $H_k$, the orthogonal projection onto $H_k$ has a closed-form expression. That is,

$$P_{H_k}(x) = \begin{cases} x - \dfrac{\langle F(y_k), x - y_k \rangle}{\|F(y_k)\|^2} F(y_k) & \text{if } \langle F(y_k), x - y_k \rangle > 0, \\ x & \text{if } \langle F(y_k), x - y_k \rangle \le 0. \end{cases}$$
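This closed form is cheap to apply; a minimal standalone helper (with illustrative names) is sketched below.

```python
import numpy as np

def project_halfspace(x, y_k, F_yk):
    """Project x onto H_k = {z : <F(y_k), z - y_k> <= 0} using the closed form above."""
    viol = F_yk @ (x - y_k)                      # this is h_k(x)
    if viol > 0.0:
        return x - viol / np.linalg.norm(F_yk) ** 2 * F_yk
    return x
```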

Lemma 2.1

For the function $h_k(x)$ defined by (2.7) and any $x^* \in S$, it holds that

$$h_k(x_k) \ge \sigma \|x_k - y_k\|^2 \quad \text{and} \quad h_k(x^*) \le 0. \qquad (2.10)$$

In particular, if $x_k \ne y_k$, then $h_k(x_k) > 0$.

Proof

From $x_k - y_k = -\alpha_k d_k$ and the line search (2.5), we have

$$h_k(x_k) = \langle F(y_k), x_k - y_k \rangle = -\alpha_k \langle F(y_k), d_k \rangle \ge \sigma \alpha_k^2 \|d_k\|^2 = \sigma \|x_k - y_k\|^2.$$

On the other hand, from condition (2.1), we can obtain

$$h_k(x^*) = \langle F(y_k), x^* - y_k \rangle \le 0.$$

This completes the proof. □

Lemma 2.1 indicates that the hyperplane $\{x \in \mathbb{R}^n \mid h_k(x) = 0\}$ strictly separates the current iterate from the solutions of problem (1.1) whenever $x_k$ is not a solution. In addition, from Lemma 2.1 we can also derive that the solution set $S$ of problem (1.1) is included in the half-space $H_k$ for all $k$.

Certainly, if Algorithm 2.1 terminates at step k, then xk is a solution of problem (1.1). So, in the following analysis, we assume that Algorithm 2.1 always generates an infinite sequence {xk}. Based on the lemma, we can establish the convergence of the algorithm.

Theorem 2.1

If $F$ is continuous and condition (2.1) holds, then the sequence $\{x_k\}$ generated by Algorithm 2.1 globally converges to a solution of problem (1.1).

Proof

First, we show that the sequences $\{x_k\}$ and $\{y_k\}$ are both bounded. In fact, since $x^* \in S \subseteq H_k$, it follows from (2.1), (2.2) and (2.6) that

$$\begin{aligned} \|x_{k+1} - x^*\|^2 &\le \|x_k - \xi_k F(y_k) - x^*\|^2 \\ &= \|x_k - x^*\|^2 - 2\xi_k \langle F(y_k), x_k - x^* \rangle + \xi_k^2 \|F(y_k)\|^2 \\ &\le \|x_k - x^*\|^2 - 2\xi_k \langle F(y_k), x_k - y_k \rangle + \xi_k^2 \|F(y_k)\|^2 \\ &= \|x_k - x^*\|^2 - \frac{\langle F(y_k), x_k - y_k \rangle^2}{\|F(y_k)\|^2} \\ &\le \|x_k - x^*\|^2 - \frac{\sigma^2 \alpha_k^4 \|d_k\|^4}{\|F(y_k)\|^2}. \end{aligned}$$

Thus the sequence $\{\|x_k - x^*\|\}$ is decreasing and convergent, and hence the sequence $\{x_k\}$ is bounded; from (2.9), the sequence $\{d_k\}$ is also bounded. Then, by $y_k = x_k + \alpha_k d_k$ and $\alpha_k \le \beta$, the sequence $\{y_k\}$ is also bounded. Then, by the continuity of $F(\cdot)$, there exists a constant $M > 0$ such that $\|F(y_k)\| \le M$ for all $k$. So,

$$\|x_{k+1} - x^*\|^2 \le \|x_k - x^*\|^2 - \frac{\sigma^2 \alpha_k^4 \|d_k\|^4}{M^2}, \qquad (2.11)$$

from which we can deduce that

$$\lim_{k \to \infty} \alpha_k \|d_k\| = 0. \qquad (2.12)$$

If $\liminf_{k \to \infty} \|d_k\| = 0$, then from (2.9) it holds that $\liminf_{k \to \infty} \|F(x_k)\| = 0$. From the boundedness of $\{x_k\}$ and the continuity of $F(\cdot)$, $\{x_k\}$ has some accumulation point $\bar{x}$ such that $F(\bar{x}) = 0$. Then from (2.11), $\{\|x_k - \bar{x}\|\}$ converges, and thus the sequence $\{x_k\}$ globally converges to $\bar{x}$.

If $\liminf_{k \to \infty} \|d_k\| > 0$, then from (2.9) again we have

$$\liminf_{k \to \infty} \|F(x_k)\| > 0. \qquad (2.13)$$

By (2.12), it holds that

$$\lim_{k \to \infty} \alpha_k = 0. \qquad (2.14)$$

Therefore, by the definition of $m_k$ in the line search (2.5), the step-size $\beta \rho^{m_k - 1}$ does not satisfy (2.5) for all sufficiently large $k$, that is,

$$-\langle F(x_k + \beta \rho^{m_k - 1} d_k), d_k \rangle < \sigma \beta \rho^{m_k - 1} \|d_k\|^2. \qquad (2.15)$$

Since $\{x_k\}$ and $\{d_k\}$ are both bounded, letting $k \to \infty$ in (2.15) along a suitable subsequence yields

$$-\langle F(\bar{x}), \bar{d} \rangle \le 0, \qquad (2.16)$$

where $\bar{x}$ and $\bar{d}$ are the limits of the corresponding subsequences. In addition, from (2.8) and (2.13), we get

$$-\langle F(\bar{x}), \bar{d} \rangle = \|F(\bar{x})\|^2 > 0. \qquad (2.17)$$

Obviously, (2.16) contradicts (2.17). This completes the proof. □

Numerical results

In this section, numerical results are provided to substantiate the efficacy of the proposed method. The codes are written in Matlab R2010a and run on a personal computer with a 2.0 GHz CPU. For comparison, we also give the numerical results of the spectral gradient method (denoted by SGM) in [9] and the conjugate gradient method (denoted by CGM) in [16]. The parameters used in Algorithm 2.1 are set as $t = 1$, $\sigma = 0.01$, $\rho = 0.5$, $\beta = 1$, and

$$\beta_k = \frac{\|F(x_k)\|}{\|d_{k-1}\|}.$$

For SGM, we set $\beta = 0.4$, $\sigma = 0.01$, $r = 0.001$. For CGM, we choose $\rho = 0.1$, $\sigma = 10^{-4}$ and $\xi = 1$. Furthermore, the stopping criterion is set as $\|F(x_k)\| \le 10^{-6}$ for all the tested methods.

Problem 1

The mapping $F(\cdot)$ is taken as $F(x) = (f_1(x), f_2(x), \ldots, f_n(x))^\top$, where

$$f_i(x) = e^{x_i} - 1 \quad \text{for } i = 1, 2, \ldots, n.$$

Obviously, this problem has the unique solution $x^* = (0, 0, \ldots, 0)^\top$.
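As a usage illustration, Problem 1 can be coded and run with the parameter values listed above; this assumes the solve_cgp sketch from Section 2, whose name is illustrative and not from the paper.

```python
import numpy as np
# Assumes the illustrative solve_cgp sketch from Section 2 is available in scope.

def F_problem1(x):
    # f_i(x) = exp(x_i) - 1, whose unique zero is the origin.
    return np.exp(x) - 1.0

n = 100
x0 = np.ones(n)                               # initial point (1, 1, ..., 1)
x, iters = solve_cgp(F_problem1, x0, t=1.0, sigma=0.01, rho=0.5, beta=1.0, eps=1e-6)
print(iters, np.linalg.norm(F_problem1(x)))   # residual should fall below 1e-6
```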

Problem 2

The mapping $F(\cdot)$ is taken as $F(x) = (f_1(x), f_2(x), \ldots, f_n(x))^\top$, where

$$\begin{aligned} f_1(x) &= x_1 - e^{\cos\left(\frac{x_1 + x_2}{n+1}\right)}, \\ f_i(x) &= x_i - e^{\cos\left(\frac{x_{i-1} + x_i + x_{i+1}}{n+1}\right)}, \quad i = 2, 3, \ldots, n-1, \\ f_n(x) &= x_n - e^{\cos\left(\frac{x_{n-1} + x_n}{n+1}\right)}. \end{aligned}$$

Problem 3

The mapping $F: \mathbb{R}^4 \to \mathbb{R}^4$ is given by

$$F(x) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} + \begin{pmatrix} x_1^3 \\ x_2^3 \\ 2x_3^3 \\ 2x_4^3 \end{pmatrix} + \begin{pmatrix} -10 \\ 1 \\ -3 \\ 0 \end{pmatrix}.$$

This problem has a degenerate solution $x^* = (2, 0, 1, 0)^\top$.
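A quick standalone check (illustrative, not from the paper) confirms that $x^* = (2, 0, 1, 0)^\top$ is a zero of $F$ and that the Jacobian of $F$ at $x^*$ is singular, which is why the solution is called degenerate.

```python
import numpy as np

A = np.array([[1, 0,  0, 0],
              [0, 1, -1, 0],
              [0, 1,  1, 0],
              [0, 0,  0, 0]], dtype=float)
b = np.array([-10.0, 1.0, -3.0, 0.0])

def F_problem3(x):
    cubic = np.array([x[0]**3, x[1]**3, 2*x[2]**3, 2*x[3]**3])
    return A @ x + cubic + b

x_star = np.array([2.0, 0.0, 1.0, 0.0])
print(F_problem3(x_star))                    # -> [0. 0. 0. 0.]
J = A + np.diag([3*x_star[0]**2, 3*x_star[1]**2, 6*x_star[2]**2, 6*x_star[3]**2])
print(np.linalg.matrix_rank(J))              # 3 < 4, so the Jacobian at x_star is singular
```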

For Problem 1, the initial point is set as $x_0 = (1, 1, \ldots, 1)^\top$, and Table 1 gives the numerical results of Algorithm 2.1 with different dimensions, where Iter. denotes the number of iterations and CPU denotes the CPU time in seconds when the algorithm terminates. Table 2 lists the numerical results for Problem 3 with different initial points. The numerical results given in Table 1 and Table 2 show that the proposed method is efficient for solving the given test problems.

Table 1.

Numerical results with different dimensions of Problem 1

Dimension    10         50         100        500        1,000
Iter.        20         21         22         23         23
CPU          0.3594     0.8750     1.7031     41.4844    257.1406

Table 2.

Numerical results with different initial points of Problem 3

Initial point      Iter.    CPU       ‖F(x_k)‖
(0,0,0,0)          20       0.9219    9.9320×10−7
(3,0,0,0)          22       0.8438    7.0942×10−7
(1,1,1,0)          41       1.6250    7.2649×10−7
(0,1,1,1)          28       1.1875    3.8145×10−7
(0,100,100,1)      34       1.4063    6.6320×10−7

Conclusion

In this paper, we extended the conjugate gradient method to nonlinear equations. The major advantage of the method is that it needs neither the Jacobian matrix nor the solution of any linear equations at each iteration; thus it is suitable for solving large-scale nonlinear constrained equations. Under mild conditions, the proposed method possesses global convergence.

In Step 4 of Algorithm 2.1, we have to compute a projection onto the intersection of the feasible set $C$ and a half-space at each iteration, which amounts to solving a quadratic program and can be quite time-consuming. Hence, how to remove this projection step is one of our future research topics.

Acknowledgements

The authors thank anonymous referees for valuable comments and suggestions, which helped to improve the manuscript. This work is supported by the Natural Science Foundation of China (11671228).

Authors’ contributions

DXF and MS organized and wrote this paper. XYW examined all the steps of the proofs in this research and gave some advice. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Dexiang Feng, Email: dexiangfeng@163.com.

Min Sun, Email: ziyouxiaodou@163.com.

Xueyong Wang, Email: yonggk@163.com.

References

1. Wang YJ, Caccetta L, Zhou GL. Convergence analysis of a block improvement method for polynomial optimization over unit spheres. Numer. Linear Algebra Appl. 2015;22:1059–1076. doi:10.1002/nla.1996.
2. Zeidler E. Nonlinear Functional Analysis and Its Applications. Berlin: Springer; 1990.
3. Wang CW, Wang YJ. A superlinearly convergent projection method for constrained systems of nonlinear equations. J. Glob. Optim. 2009;40:283–296. doi:10.1007/s10898-008-9324-8.
4. Wang YJ, Caccetta L, Zhou GL. Convergence analysis of a block improvement method for polynomial optimization over unit spheres. Numer. Linear Algebra Appl. 2015;22:1059–1076. doi:10.1002/nla.1996.
5. Wood AJ, Wollenberg BF. Power Generation, Operation, and Control. New York: Wiley; 1996.
6. Chen HB, Wang YJ, Zhao HG. Finite convergence of a projected proximal point algorithm for the generalized variational inequalities. Oper. Res. Lett. 2012;40:303–305. doi:10.1016/j.orl.2012.03.011.
7. Dirkse SP, Ferris MC. MCPLIB: A collection of nonlinear mixed complementarity problems. Optim. Methods Softw. 1995;5:319–345. doi:10.1080/10556789508805619.
8. Wang YJ, Qi L, Luo S, Xu Y. An alternative steepest direction method for the optimization in evaluating geometric discord. Pac. J. Optim. 2014;10:137–149.
9. Zhang L, Zhou W. Spectral gradient projection method for solving nonlinear monotone equations. J. Comput. Appl. Math. 2006;196:478–484. doi:10.1016/j.cam.2005.10.002.
10. Yu ZS, Lin J, Sun J, Xiao YH, Liu LY, Li ZH. Spectral gradient projection method for monotone nonlinear equations with convex constraints. Appl. Numer. Math. 2009;59:2416–2423. doi:10.1016/j.apnum.2009.04.004.
11. Ma FM, Wang CW. Modified projection method for solving a system of monotone equations with convex constraints. J. Appl. Math. Comput. 2010;34:47–56. doi:10.1007/s12190-009-0305-y.
12. Zheng L. A new projection algorithm for solving a system of nonlinear equations with convex constraints. Bull. Korean Math. Soc. 2013;50:823–832. doi:10.4134/BKMS.2013.50.3.823.
13. Sun M, Wang YJ, Liu J. Generalized Peaceman-Rachford splitting method for multiple-block separable convex programming with applications to robust PCA. Calcolo. 2017;54:77–94. doi:10.1007/s10092-016-0177-0.
14. Li M, Qu AP. Some sufficient descent conjugate gradient methods and their global convergence. Comput. Appl. Math. 2014;33:333–347. doi:10.1007/s40314-013-0064-0.
15. Karamardian S. Complementarity problems over cones with monotone and pseudomonotone maps. J. Optim. Theory Appl. 1976;18:445–454. doi:10.1007/BF00932654.
16. Xiao YH, Zhu H. A conjugate gradient method to solve convex constrained monotone equations with applications in compressive sensing. J. Math. Anal. Appl. 2013;405:310–319. doi:10.1016/j.jmaa.2013.04.017.
