International Journal of Computer Mathematics 96(1), pp. 33–50. Published online 19 December 2017. doi: 10.1080/00207160.2017.1413552

A non-monotone pattern search approach for systems of nonlinear equations

Keyvan Amini a, Morteza Kimiaei b (corresponding author), Hassan Khotanlou c
PMCID: PMC6235546  PMID: 30487705

ABSTRACT

In this paper, a new pattern search method is proposed to solve systems of nonlinear equations. We introduce a new non-monotone strategy based on a convex combination of the maximum function value of some preceding successful iterates and the current function value. First, whenever the iterates are far away from the optimizer, we produce a non-monotone strategy stronger than that of Gasparo et al. [Nonmonotone algorithms for pattern search methods, Numer. Algorithms 28 (2001), pp. 171–186]. Second, when the iterates are near the optimizer, we produce a non-monotone strategy weaker than that of Ahookhosh and Amini [An efficient nonmonotone trust-region method for unconstrained optimization, Numer. Algorithms 59 (2012), pp. 523–540]. Third, whenever the iterates are neither near the optimizer nor far away from it, we produce a medium non-monotone strategy lying between these two. Global convergence of the proposed algorithm is established, and numerical results are reported.

Keywords: Nonlinear equation, pattern search, coordinate search, non-monotone technique, theoretical convergence

2010 AMS Subject Classifications: 90C30, 93E24, 34A34

1. Introduction

Consider the following nonlinear system of equations

F(x) = 0,   x ∈ ℝ^n,   (1)

for which F : ℝ^n → ℝ^n is a continuously differentiable mapping. Suppose that F has a zero. Then every solution x^* of the nonlinear equation problem (1) is a solution of the following unconstrained nonlinear least-squares problem

min f(x) := (1/2) ‖F(x)‖²   s.t.   x ∈ ℝ^n,   (2)

where ‖·‖ denotes the Euclidean norm. Conversely, if x^* solves Equation (2) and f(x^*) = 0, then x^* is a solution of (1). There are various methods to solve the nonlinear system (1), such as conjugate gradient methods [33,35], line-search methods [5,12,14–16,32] and trust-region methods [2,3,7–11,34,36,37,39], which are quite fast and robust; but they may have some shortcomings. First, a trust-region algorithm uses a ratio to control the agreement between the actual and predicted reductions essentially only along one direction; for more details on trust-region algorithms, cf. [26]. If this ratio is near one and the Jacobian matrix F′(x) is ill-conditioned, or f is a highly nonlinear function for which the quadratic approximation is poor, then the trust-region radius may increase before reaching a narrow curved valley. Afterwards, the radius must be reduced several times to get around this narrow curved valley, which increases the computational cost and can also produce unsuitable solutions in cases where highly accurate solutions are necessary. Second, solving the trust-region subproblems increases the CPU time. Third, these methods need to compute both F(x) and the gradient ∇f(x) = F′(x)ᵀF(x) to determine the quadratic approximation in each iteration. Pattern search methods represent a derivative-free subclass of direct search algorithms for minimizing a continuous function (see, e.g. [4,17,21,22]). Box [4] and Hooke and Jeeves [17] were the first researchers to introduce the original pattern search methods. Some researchers have shown that pattern search algorithms converge globally, see [13,19,20,30,31]. Lewis and Torczon successfully extended these algorithms to bound-constrained and linearly constrained minimization [19,20]. Torczon [29,30] presented a multidirectional search algorithm for parallel machines. For ill-conditioned problems, using a monotone pattern search as an auxiliary algorithm may adversely affect the performance of the whole procedure, cf. [13]. Hence, we introduce a new non-monotone pattern search framework that decreases the total number of function evaluations and the CPU time. This development enables us to produce a suitable non-monotone strategy at each iteration while maintaining global convergence. Numerical results show that the new modification of pattern search is efficient for solving systems of nonlinear equations.
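For concreteness, the following minimal Python sketch evaluates the least-squares merit function (2) for a toy two-dimensional system; the particular F below is an illustrative choice of ours, not one of the paper's test problems.

```python
import numpy as np

def F(x):
    # toy nonlinear system (illustrative choice, not from the paper's test set):
    # F1(x) = x1^2 + x2 - 3,  F2(x) = x1 + x2^2 - 5
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 5.0])

def f(x):
    # least-squares merit function (2): f(x) = 0.5 * ||F(x)||^2
    Fx = F(x)
    return 0.5 * float(Fx @ Fx)

print(f(np.array([1.0, 2.0])))  # 0.0, since x* = (1, 2) solves F(x) = 0
```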

Notation: The Euclidean vector norm or the associated matrix norm is denoted by ‖·‖. A set of directions {d_k^1, …, d_k^p} is said to positively span ℝ^n if for each y ∈ ℝ^n there exist λ_i ≥ 0, for i = 1, …, p, such that

y = Σ_{i=1}^{p} λ_i d_k^i.

Moreover, e_i, for i = 1, …, n, denotes the orthonormal set of coordinate directions. To simplify our notation, we set ℕ_0 := ℕ ∪ {0}.
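The positive-spanning property can be tested numerically: the columns of a matrix D positively span ℝ^n exactly when, for every target y, the system Dλ = y admits a solution with λ ≥ 0. The sketch below checks this by LP feasibility over a batch of random targets; the function name and the random-sampling heuristic are our own, and a finite sample provides evidence rather than a proof.

```python
import numpy as np
from scipy.optimize import linprog

def positively_spans(D, trials=200, seed=0):
    """Heuristic check that the columns of D positively span R^n:
    for random targets y, test feasibility of D @ lam = y with lam >= 0."""
    n, p = D.shape
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        y = rng.standard_normal(n)
        res = linprog(c=np.zeros(p), A_eq=D, b_eq=y,
                      bounds=[(0, None)] * p, method="highs")
        if not res.success:
            return False
    return True

I3 = np.eye(3)
print(positively_spans(np.hstack([I3, -I3])))  # True: [I, -I] positively spans R^3
print(positively_spans(I3))                    # False: I alone does not
```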

Organization. The rest of this paper is organized as follows. In Section 2, we first describe the exploratory moves and then the generalized pattern search is presented. A new non-monotone pattern search algorithm is presented in Section 3. In Section 4, the global convergence of the new algorithm is investigated. Numerical results are provided in Section 5 to show that the proposed algorithm is efficient and promising for systems of nonlinear equations. Finally, some concluding remarks are given in Section 6.

2. The generalized pattern search method

First of all, we define two components, namely a basis matrix and a generating matrix, cf. [31].

Definition 2.1

Any arbitrary non-singular matrix B ∈ ℝ^{n×n} is called a basis matrix.

Definition 2.2

The generating matrix C_k ∈ ℤ^{n×p} with p > 2n, divided into two parts, is defined as

C_k := [Γ_k   L_k],

in which Γ_k := [M_k   −M_k], M_k ∈ M ⊂ ℤ^{n×n}, M is a finite set of non-singular matrices, and L_k ∈ ℤ^{n×(p−2n)} is a matrix that contains at least one column of zeros.

A pattern P_k is defined by the columns of the matrix P_k = BC_k, in which B is a basis matrix. By the definition of C_k and the fact that M_k has rank n, it is clear that C_k also has rank n; this implies that the columns of P_k span ℝ^n. It is convenient to use the partition of the generating matrix C_k to partition P_k, as follows:

P_k := BC_k = [BΓ_k   BL_k].   (3)

Given x_k and a step-size Δ_k > 0, we define a trial step d_k^i to be any vector of the form

d_k^i := Δ_k B c_k^i,   i = 1, …, p,

in which c_k^i denotes the ith column of C_k; the vectors B c_k^i, called exploratory moves as proposed in [31], determine the step directions, and Δ_k is a step-size parameter. Furthermore, a trial point is any point of the form x_k^i := x_k + d_k^i, where x_k is the current iterate. Before declaring a new iterate and updating the associated information, pattern search methods use a series of exploratory moves to produce the new iterate. To prove the convergence of pattern search methods, we require that the exploratory moves be obtained by one of the following two procedures:

[Procedure 1 appears here as a figure in the original.]

In Procedure 1, note that y ∈ A means that the vector y is contained in the set of columns of the matrix A. Step (S.2) is the more interesting one; hence, let us describe how it works. As long as any of the 2n steps given by Δ_k B Γ_k produces a decrease in the function value at the current iterate, the exploratory moves must return a step that decreases the function value, without necessarily satisfying f(x_k + d_k) ≤ min{f(x_k + y) : y ∈ Δ_k B Γ_k}.

[Procedure 2 appears here as a figure in the original.]

In Procedure 2, (S.2) is replaced by a strong version, as presented above.
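To fix ideas, the following sketch forms the trial points x_k^i = x_k + Δ_k B c_k^i from a basis matrix B and a generating matrix C; it is an illustrative fragment with our own names, using the coordinate-search pattern C = [I  −I  0] as an example.

```python
import numpy as np

def trial_points(x, Delta, B, C):
    """Trial points x + d^i with d^i = Delta * B @ C[:, i] (Section 2)."""
    D = Delta * (B @ C)      # column i is the trial step d^i
    return x[:, None] + D    # column i is the trial point x^i

n = 2
C = np.hstack([np.eye(n), -np.eye(n), np.zeros((n, 1))])  # [I, -I, 0]
pts = trial_points(np.array([1.0, 2.0]), 0.5, np.eye(n), C)
```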

Algorithm 1 states the generalized pattern search method for systems of nonlinear equations, cf. [31]. [The listing of Algorithm 1 appears as a figure in the original.]

In Algorithm 1, the iteration is called successful if ρ_k > 0 (Line 8); otherwise, it is called an unsuccessful iteration. The parameter θ is the shrinkage parameter, with θ := τ^{w_0}, in which τ > 1 and w_0 is a negative integer, and λ_k is called the expanding factor, with

λ_k ∈ {τ^{w_1}, τ^{w_2}, …, τ^{w_l}},

in which w_1, w_2, …, w_l are positive integers, with l < ∞. In Line 4 of this algorithm, the step d_k can be obtained either by Procedure 1 or by Procedure 2. The algorithm is called generalized weak pattern search (GWPS) if d_k is obtained by Procedure 1; if d_k is obtained by Procedure 2, it is called generalized strong pattern search (GSPS).

Both GWPS and GSPS share a drawback: the quantity ρ_k cannot truly prevent the production of unsuccessful iterations in the presence of a narrow curved valley, which increases the CPU time and the total number of function evaluations. In order to overcome this drawback, Gasparo et al. [13] modified the quantity ρ_k.

Torczon [31] showed in Theorem 3.2 that each iterate x_n generated by GWPS can be written as

x_n := x_0 + (β^{r_LB} α^{−r_UB}) Δ_0 B Σ_{k=0}^{n−1} z_k,   (4)

in which α and β are relatively prime positive integers satisfying τ = β/α, r_LB := min{r_0, …, r_{n−1}}, r_UB := max{r_0, …, r_{n−1}} and z_k ∈ ℤ^n. Moreover, Torczon showed that Δ_k can be written as

Δ_k := τ^{r_k} Δ_0,   (5)

in which r_k ∈ ℤ. Both Equations (4) and (5) help us to prove Lemma 4.7 in Section 4.

3. The new non-monotone strategy

It is well known that globalization techniques such as pattern search can guarantee the global convergence of traditional direct search approaches. However, this globalization technique generates a monotone sequence of objective function values, which usually leads to short steps and, in consequence, to slow numerical convergence on highly nonlinear problems, see [1,5,13,14,27,28,38]. As an example, the generalized pattern search framework exploits the quantity ρ_k, which guarantees

f_k − f_{k+1} > 0;

this means that the sequence {f_k}_{k≥0} is monotonically decreasing. In order to avoid this drawback of globalization techniques, Gasparo et al. [13], based on the definition introduced by Grippo et al. [14], proposed a non-monotone strategy for pattern search algorithms with the quantity ρ̂_k satisfying

ρ̂_k := f_{l(k)} − f_{k+1},

for which

f_{l(k)} := max_{0 ≤ j ≤ m(k)} { f_{k−j} },   k ∈ ℕ_0,   (6)

in which m(0) := 0 and 0 ≤ m(k) ≤ min{m(k−1) + 1, N} with N ≥ 0. This strategy produced excellent results and motivated many researchers to investigate the effects of such strategies in a wide variety of optimization procedures and to propose other non-monotone techniques, see [1,13,14,27,28,38]. Although the non-monotone technique (6) has many advantages, it also suffers from some drawbacks, see [1,38]. Recently, Ahookhosh and Amini [1] presented a non-monotone strategy weaker than that of Grippo et al. [14], which overcomes some of its disadvantages, with the quantity ρ̄_k satisfying

ρ̄_k := R_k − f_{k+1},

where

R_k := η_k f_{l(k)} + (1 − η_k) f_k,   (7)

in which η_k ∈ [η_min, η_max], η_min ∈ [0, 1) and η_max ∈ [η_min, 1]. Although this proposal generates a more efficient algorithm, it depends on the choice of η_k, and an unsuitable choice can cause some shortcomings. According to the characteristics and expectations of our algorithm, we propose an appropriate η_k. In this regard, let us first define the ratio

Θ_k := f_{l(k)} / f_k,

which helps us to measure the distance between the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0}. It is clear that Θ_k ≥ 1 because f_{l(k)} ≥ f_k > 0, and Lemma 4.5 shows that lim_{k→∞} Θ_k = 1. Moreover, if Θ_k ≥ β (β > 1), then {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are far away from each other; otherwise they are close. Now, after defining

η̂_k := η_k / Θ_k   if Θ_k ≥ β,   and   η̂_k := η_k Θ_k   otherwise,   (8)

a new non-monotone pattern search formula is defined by

Λ_k := η̂_k f_{l(k)} + (1 − η̂_k) f_k,   (9)

for which the new quantity is considered as

ρ̃_k := Λ_k − f_{k+1}.   (10)

The theoretical and numerical results show that the new choice of ρ̃_k has remarkable positive effects on pattern search, yielding faster convergence, especially for highly nonlinear problems. Let us now use the following procedure to compute the non-monotone strategy (9):

[The procedure for computing Λ_k appears here as a figure in the original.]

Remark 3.1

The sequence {Λ_k}_{k≥0} yields the convergence behaviour of a stronger non-monotone strategy whenever the iterates are far away from the optimizer and the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are close to each other, while it yields the convergence behaviour of a weaker non-monotone strategy whenever the iterates are close to the optimizer and the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are far away from each other.
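A compact sketch of how the quantities (6)-(9) could be evaluated from a sliding window of recent function values; the window handling, the variable names and the guard f_k > 0 are our assumptions, and the case split follows (8).

```python
def new_nonmonotone_term(f_hist, eta, beta):
    """f_hist holds f_{k-m(k)}, ..., f_k (window of length <= N+1, f_k > 0);
    eta is eta_k and beta > 1 is the switching threshold of (8)."""
    fk = f_hist[-1]
    flk = max(f_hist)                                        # f_{l(k)} in (6)
    theta = flk / fk                                         # Theta_k >= 1
    eta_hat = eta / theta if theta >= beta else eta * theta  # (8)
    return eta_hat * flk + (1.0 - eta_hat) * fk              # Lambda_k in (9)
```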

Before presenting our algorithm, we describe how to determine the step dk by the following two procedures:

[Procedure 3 appears here as a figure in the original.]

Procedure 3 tries to find, among the 2n steps given by Δ_k B Γ_k, a step d_k satisfying f(x_k + d_k) < Λ_k at each iterate, without necessarily satisfying f(x_k + d_k) ≤ min{f(x_k + y) : y ∈ Δ_k B Γ_k}.

[Procedure 4 appears here as a figure in the original.]

Now, to investigate the effectiveness of the new pattern search, we add the new non-monotone strategy to the framework of the pattern search method, obtaining Algorithm 2. [The listing of Algorithm 2 appears as a figure in the original.]

Note that in Algorithm 2, if d_k is obtained by Procedure 3, the method is called non-monotone weak pattern search (NMWPS-N), while if d_k is obtained by Procedure 4, it is called non-monotone strong pattern search (NMSPS-N). To guarantee the global convergence of NMWPS-N, which uses Procedure 3 to determine d_k, we update Δ_k by

Δ_{k+1} := λ_k Δ_k   if ρ̃_k > 0,   and   Δ_{k+1} := θ Δ_k   otherwise,   (11)

while NMSPS-N, with d_k obtained by Procedure 4, updates Δ_k by

Δ_{k+1} := Δ_k   if ρ̃_k > 0,   and   Δ_{k+1} := θ Δ_k   otherwise,   (12)

where both θ and λ_k are updated as in Algorithm 1. We describe how to update C_k in Section 5.
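The two step-size updates can be summarized in a few lines; this is an illustrative sketch of (11) and (12), with the success test ρ̃_k > 0 supplied by the caller.

```python
def update_Delta(Delta, rho_tilde, lam, theta, strong=False):
    """Step-size update: (11) for NMWPS-N (strong=False), (12) for NMSPS-N."""
    if rho_tilde > 0:                           # successful iteration
        return Delta if strong else lam * Delta
    return theta * Delta                        # unsuccessful: shrink by theta
```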

The global convergence results of both NMWPS-N and NMSPS-N require the following assumptions:

  1. (H1) The level set L(x_0) := {x ∈ ℝ^n : f(x) ≤ f(x_0)} is bounded.

  2. (H2) F(x) is continuously differentiable on a compact convex set Ω containing L(x_0).

It can easily be seen that, in Algorithm 2, for any index k one of the following cases occurs:

I_1 := {k : Θ_k ≥ β},   I_2 := {k : Θ_k < β, η̂_k ∈ (0, 1]}   and   I_3 := {k : Θ_k < β, η̂_k > 1}.

Lemma 3.1

Suppose that the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then, we have the following properties:

  1. (P1) If k ∈ I_1, then f_k ≤ Λ_k ≤ R_k.

  2. (P2) If k ∈ I_2, then R_k < Λ_k ≤ f_{l(k)}.

  3. (P3) If k ∈ I_3, then Λ_k > f_{l(k)}.

Proof.

(1) The fact that Θ_k ≥ β > 1 implies η̂_k ≤ η_k, and consequently

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k = η̂_k (f_{l(k)} − f_k) + f_k ≤ η_k (f_{l(k)} − f_k) + f_k = R_k.

On the other hand, because f_{l(k)} ≥ f_k, it is easily seen that

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k ≥ η̂_k f_k + (1 − η̂_k) f_k = f_k.

So, (P1) is correct.

(2) The definition of f_{l(k)} along with the fact that 1 ≤ Θ_k < β implies η̂_k ≥ η_k, and so

R_k = η_k (f_{l(k)} − f_k) + f_k ≤ η̂_k (f_{l(k)} − f_k) + f_k = Λ_k = (1 − η̂_k)(f_k − f_{l(k)}) + f_{l(k)} ≤ f_{l(k)},

which gives (P2).

(3) The definition of f_{l(k)} and η̂_k > 1 result in

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k = (η̂_k − 1)(f_{l(k)} − f_k) + f_{l(k)} > f_{l(k)},

so, (P3) is correct.

Based on Lemma 3.1, using the new sequence {η̂_k}_{k≥0} yields some appropriate properties. If k ∈ I_1, then (P1) gives Λ_k ≤ R_k; in this case, where the iterates are close to the optimizer, the definition (9) proposes a non-monotone strategy weaker than the non-monotone strategy (7). If k ∈ I_2, then (P2) gives R_k < Λ_k ≤ f_{l(k)}, which leads to a medium non-monotone strategy whenever the iterates are neither near the optimizer nor far away from it. Finally, if k ∈ I_3, far away from the optimizer, (P3) gives Λ_k > f_{l(k)}, so the algorithm uses a non-monotone strategy stronger than the non-monotone strategy (6).

4. Convergence analysis

In this section, we investigate the global convergence results of the new proposed algorithm.

Lemma 4.1

Suppose that Assumption (H1) holds and the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then x_k ∈ L(x_0) for all k ∈ ℕ_0, and the sequence {f_{l(k)}} is a convergent decreasing sequence for all k ∈ I_1 ∪ I_2, and also for all k ∈ I_3 provided that f_{k+1} ≤ f_{l(k)}.

Proof.

If x_{k+1} is not accepted by Algorithm 2, then f_{k+1} = f_k and f_{l(k+1)} = f_{l(k)}. Otherwise, we have

f_{k+1} = f(x_k + d_k) ≤ Λ_k,   ∀ k ∈ ℕ_0.   (13)

In the sequel, we divide the proof into two parts.

(a) k ∈ I_1 ∪ I_2. Properties (P1) and (P2) of Lemma 3.1, along with (13), imply that f_{k+1} ≤ f_{l(k)}. In order to prove that the sequence {f_{l(k)}}_{k∈I_1∪I_2} is decreasing, we consider the following two cases:

(i) k < N. In this case m(k+1) = k + 1. It is easily seen that

f_{l(k+1)} = max_{0 ≤ j ≤ k+1} { f_{k+1−j} } = max{ f_{l(k)}, f_{k+1} } = f_{l(k)}.

(ii) k ≥ N. In this case, we have m(k+1) = N for all k. Therefore, the inequality f_{k+1} ≤ f_{l(k)} results in

f_{l(k+1)} = max_{0 ≤ j ≤ N} { f_{k+1−j} } ≤ max{ max_{0 ≤ j ≤ N} { f_{k−j} }, f_{k+1} } = max{ f_{l(k)}, f_{k+1} } = f_{l(k)},

where the last equality, together with k ∈ I_1 ∪ I_2, is a consequence of (13).

(b) k ∈ I_3 and f_{k+1} ≤ f_{l(k)}. The proof is similar to cases (i) and (ii) of part (a).

Now, by strong induction, assuming x_i ∈ L(x_0) for all i = 1, …, k, it is sufficient to show x_{k+1} ∈ L(x_0). Indeed, we obtain

f_{k+1} ≤ f_{l(k)} ≤ f_0.

Thus, the sequence {x_k}_{k≥0} is contained in L(x_0). Finally, Assumption (H1) along with x_k ∈ L(x_0) for all k ∈ ℕ_0 implies that the sequence {f_{l(k)}}_{k≥0} is bounded. Thus, being decreasing, the sequence {f_{l(k)}}_{k≥0} is convergent.

Lemma 4.2

Suppose that Assumption (H1) holds and the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then x_k ∈ L(x_0) for all k ∈ ℕ_0, and, whenever f_{k+1} > f_{l(k)}, the sequence {Λ_k}_{k≥0} is a convergent decreasing sequence for all k ∈ I_3.

Proof.

If x_{k+1} is not accepted by Algorithm 2, then f_{k+1} = f_k and f_{l(k+1)} = f_{l(k)}. Otherwise, we have

f_{k+1} = f(x_k + d_k) ≤ Λ_k,   ∀ k ∈ ℕ_0.

This fact along with f_{k+1} > f_{l(k)} and the definition of f_{l(k+1)} results in f_{l(k+1)} = f_{k+1}, and also

Λ_{k+1} = η̂_{k+1} f_{l(k+1)} + (1 − η̂_{k+1}) f_{k+1} = η̂_{k+1} f_{k+1} + (1 − η̂_{k+1}) f_{k+1} = f_{k+1} ≤ Λ_k.

Now, by strong induction, assuming x_i ∈ L(x_0) for all i = 1, …, k, it is sufficient to show x_{k+1} ∈ L(x_0). Indeed, we obtain

f_{k+1} = Λ_{k+1} ≤ Λ_k ≤ f_0.

Thus, the sequence {x_k}_{k≥0} is contained in L(x_0). Finally, Assumption (H1) along with x_k ∈ L(x_0) for all k ∈ ℕ_0 implies that the sequence {Λ_k}_{k≥0} is bounded. Thus, being decreasing, the sequence {Λ_k}_{k≥0} is convergent.

Lemma 4.3

Let {x_k}_{k≥0} be a bounded sequence of vectors in ℝ^n generated by the NMSPS-N algorithm, and let η ∈ ℝ be such that ‖∇f_k‖ ≥ η > 0. Then, under Assumptions (H1) and (H2), there exists δ > 0 such that, for all Δ_k > 0, if Δ_k ≤ δ, then the kth iteration of NMSPS-N is successful (ρ̃_k > 0) and Δ_{k+1} ≥ Δ_k.

Proof.

Similar to Proposition 6.4 in [31], if Δ_k < δ, then there exists at least one index i ∈ {1, …, p} with d_k^i ∈ Δ_k B C_k such that

f(x_k + d_k^i) − f(x_k) ≤ −(1/2) ξ ‖∇f_k‖ ‖d_k^i‖ < 0,

in which ξ > 0 is a constant. Hence, whenever Δ_k < δ, we have f(x_k + d_k^i) < f(x_k) ≤ Λ_k. If min{f(x_k + y) : y ∈ Δ_k B Γ_k} < Λ_k, then Procedure 3 guarantees f(x_k + d_k) < Λ_k and consequently ρ̃_k > 0. By the update rules (11) and (12), we then have Δ_{k+1} ≥ Δ_k.

Lemma 4.3 gives the following corollary, see Corollary 6.5 in [31].

Corollary 4.4

Let {x_k}_{k≥0} be a bounded sequence of vectors in ℝ^n generated by NMWPS-N, and let η ∈ ℝ be such that ‖∇f_k‖ ≥ η > 0. Then, under Assumptions (H1) and (H2), there exist ζ, δ > 0 such that, for all Δ_k > 0, if Δ_k ≤ δ, then

f_{k+1} ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖.

The above corollary helps us to establish the following lemma.

Lemma 4.5

Suppose that Assumptions (H1) and (H2) hold and the sequence {x_k}_{k≥0} is generated by the NMWPS-N algorithm. Then, we have

lim_{k→∞} f_{l(k)} = lim_{k→∞} f_k.

Proof.

Using the fact that x_k is not the optimum of (2), we can conclude that there exists a constant ε > 0 such that ‖∇f_k‖ ≥ ε. This fact along with Corollary 4.4 and f_k ≤ f_{l(k)} implies that, for some ζ > 0,

f_{k+1} = f(x_k + d_k) ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖ ≤ f_k − ζ ε ‖d_k‖ ≤ f_{l(k)} − ω ‖d_k‖,   (14)

where ω = ζε. By replacing k with l(k) − 1 in Equation (14), we have

f_{l(k)} ≤ f_{l(l(k)−1)} − ω ‖d_{l(k)−1}‖.   (15)

This fact along with Lemmas 4.1 and 4.2 results in

lim_{k→∞} ‖d_{l(k)−1}‖ = 0.   (16)

Assumption (H2) and (16) give

lim_{k→∞} f(x_{l(k)}) = lim_{k→∞} f(x_{l(k)−1}).   (17)

By letting l̂(k) := l(k + N + 2) and using induction, we can prove that, for all j ≥ 1,

lim_{k→∞} ‖d_{l̂(k)−j}‖ = 0.   (18)

Since {l̂(k)} ⊆ {l(k)}, Equation (16) shows that Equation (18) is satisfied for j = 1. Assume that Equation (18) holds for a given j, and take k large enough so that l̂(k) − (j + 1) > 0. Using Equation (14) and substituting k with l̂(k) − j − 1, we have

f(x_{l̂(k)−j}) ≤ f(x_{l̂(k)−j−1}) − ω ‖d_{l̂(k)−j−1}‖.

Following the same argument used to derive (17), we deduce that

lim_{k→∞} ‖d_{l̂(k)−j−1}‖ = 0

and also

lim_{k→∞} f(x_{l̂(k)−j−1}) = lim_{k→∞} f(x_{l(k)}).

Similarly to Equation (17), for any given j ≥ 1, we have

lim_{k→∞} f(x_{l̂(k)−j}) = lim_{k→∞} f(x_{l(k)}).

On the other hand, we can write

x_{k+1} = x_{l̂(k)} − Σ_{j=1}^{l̂(k)−k−1} d_{l̂(k)−j},   ∀ k.

This fact along with Equation (18) and l̂(k) − k − 1 ≤ N + 1 implies that

lim_{k→∞} ‖x_{k+1} − x_{l̂(k)}‖ = 0.

Hence, Assumption (H2) leads to

lim_{k→∞} f(x_{l(k)}) = lim_{k→∞} f(x_{l̂(k)}) = lim_{k→∞} f(x_k).

Using Lemma 4.5, we can obtain the following corollary.

Corollary 4.6

Suppose that Assumptions (H1) and (H2) hold and the sequence {x_k}_{k≥0} is generated by the NMWPS-N algorithm. Then, we have

lim_{k→∞} Λ_k = lim_{k→∞} f_k.

Proof.

(1) If k ∈ I_1 ∪ I_2, then the inequality f_k ≤ Λ_k ≤ f_{l(k)} along with Lemma 4.5 implies that

lim_{k→∞, k∈I_1∪I_2} Λ_k = lim_{k→∞, k∈I_1∪I_2} f_k.

(2) For k ∈ I_3, recalling Lemma 4.5 along with the definition of η̂_k results in

lim_{k→∞, k∈I_3} Λ_k = lim_{k→∞, k∈I_3} f_k.

The following results show that the NMWPS-N and NMSPS-N algorithms are well defined.

Lemma 4.7

Suppose that Assumption (H1) holds and that the NMWPS-N algorithm has constructed an infinite sequence {x_k}_{k≥0}. Then lim inf_{k→∞} Δ_k = 0.

Proof.

By contradiction, suppose that lim inf_{k→∞} Δ_k = 0 is not satisfied; hence, we can assume that there exist a constant Δ_LB > 0 and an index set K ⊆ ℕ_0 such that

Δ_k ≥ Δ_LB,   ∀ k ∈ K.

This fact along with Equation (5) results in

τ^{r_k} ≥ Δ_LB / Δ_0 > 0,   ∀ k ∈ K,

which means that the sequence {τ^{r_k}}_{k∈K} is bounded away from zero. Since {x_k}_{k≥0} ⊂ L(x_0) and L(x_0) is compact, Lemma 3.1 in [31] implies that the sequence {Δ_k}_{k≥0} has an upper bound, denoted by Δ_UB, and hence the sequence {τ^{r_k}}_{k∈K} is bounded above. In other words, the sequence {τ^{r_k}}_{k∈K} takes only finitely many values, and consequently {r_k}_{k≥0} has a lower and an upper bound, defined respectively by

r_LB := min{r_k : 0 ≤ k < +∞}   and   r_UB := max{r_k : 0 ≤ k < +∞};

hence, for any k ∈ K, it can be concluded that

x_k := x_0 + (β^{r_LB} α^{−r_UB}) Δ_0 B Σ_{j=0}^{k−1} z_j,

i.e. x_k lies on a translated integer lattice generated by x_0 and the columns of (β^{r_LB} α^{−r_UB}) Δ_0 B, denoted by K_1. Therefore x_k ∈ L(x_0) ∩ K_1, where L(x_0) ∩ K_1 is finite, so there must be at least one point x^* ∈ L(x_0) ∩ K_1 such that x_k = x^* for infinitely many k. By the steps of NMWPS-N, a lattice point can be revisited only finitely many times; hence a new step d_k is accepted if and only if Λ_k > f(x_k + d_k). This implies that there exists a positive index m such that x_k = x^* for all k ≥ m. This fact together with Corollary 4.6 yields ρ̃_k ≤ 0 for all sufficiently large k, and consequently Δ_k → 0, which is a contradiction since 0 < Δ_LB ≤ Δ_k.

Since NMSPS-N uses the relation (12) to update Δ_k, the same argument ensures that lim_{k→∞} Δ_k = 0.

Corollary 4.8

Suppose that Assumption (H1) holds and the NMSPS-N algorithm has constructed an infinite sequence {x_k}_{k≥0}. Then lim_{k→∞} Δ_k = 0.

Remark 4.1

Lucidi and Sciandrone, in Proposition 2 of [23], showed that if the sequences {c_k^i}_{k≥0}, i = 1, …, p, are bounded, and if each limit point (c^1, …, c^p) of the sequence {(c_k^1, …, c_k^p)}_{k≥0} is such that c^1, …, c^p positively span ℝ^n, then

lim_{k→∞} ‖∇f_k‖ = 0   ⟺   lim_{k→∞} Σ_{i=1}^{p} min{ 0, ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ } = 0.   (19)
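The right-hand quantity in (19) is directly computable from a gradient and a generating matrix; the sketch below evaluates it, skipping zero columns (which contribute nothing) — an implementation detail of ours.

```python
import numpy as np

def stationarity_measure(grad, C):
    """sum_i min{0, grad^T c_i / ||c_i||}, the right-hand quantity in (19)."""
    total = 0.0
    for i in range(C.shape[1]):
        ci = C[:, i]
        nrm = np.linalg.norm(ci)
        if nrm > 0:                 # skip the zero column of C
            total += min(0.0, float(grad @ ci) / nrm)
    return total
```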

Theorem 4.9

Suppose that Assumptions (H1) and (H2) hold. Let {x_k}_{k≥0} be the infinite sequence generated by NMWPS-N. Then,

lim inf_{k→∞} ‖∇f_k‖ = 0.   (20)

Proof.

By contradiction, we assume that Equation (20) does not hold. Then, there exists a constant δ > 0 such that ‖∇f_k‖ ≥ δ for all k ∈ ℕ_0. From Lemma 4.7, there exists an infinite index set K such that

lim inf_{k→∞, k∈K} Δ_k = 0.   (21)

By the continuous differentiability of f, one can find, for each k ∈ ℕ_0 and i = 1, …, p, a point ξ_k^i := x_k + ω_k d_k^i = x_k + ω_k Δ_k B c_k^i, in which ω_k ∈ (0, 1), such that

f(x_k + d_k^i) = f(x_k) + ∇f(ξ_k^i)ᵀ d_k^i ≤ Λ_k + ∇f(ξ_k^i)ᵀ d_k^i.   (22)

We get {ξ_k^i}_{k∈K} → x^*, because Equation (21) gives {d_k}_{k∈K} → 0 (with lim_{k→∞, k∈K} c_k^i / ‖c_k^i‖ = c^i) and {x_k}_{k∈K} → x^*. These facts, together with taking limits on both sides of (22), give, for i = 1, …, p,

∇f(x^*)ᵀ c^i = lim_{k→∞, k∈K} ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ = lim_{k→∞, k∈K} ∇f(ξ_k^i)ᵀ c_k^i / ‖c_k^i‖ ≥ 0,

yielding

lim_{k→∞, k∈K} Σ_{i=1}^{p} min{ 0, ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ } = 0.

Then, by Equation (19), we get

lim_{k→∞, k∈K} ‖∇f_k‖ = 0,

leading to

lim inf_{k→∞} ‖∇f_k‖ = 0,

which contradicts our assumption.

The following lemma helps us to establish the main global theorem.

Lemma 4.10

Suppose that Assumptions (H1) and (H2) hold and the columns of C_k are bounded in norm, i.e. there exist two positive constants γ_1 and γ_2 such that γ_1 ≤ ‖c_k^i‖ ≤ γ_2 for i = 1, …, p. Let {x_k}_{k≥0} be the sequence generated by NMSPS-N. If there exist a positive constant δ and a subsequence K ⊆ ℕ_0 such that ‖∇f_k‖ ≥ δ for k ∈ K, then

Σ_{k∈K} Δ_k < ∞.

Proof.

First, we show that

f_{k+1} ≤ f_0 − ζ δ γ_1 Σ_{j=0, j∈K}^{k} Δ_j.

By Corollary 4.4, we get

f_{k+1} ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖ ≤ f_k − ζ δ γ_1 Δ_k ≤ (f_{k−1} − ζ δ γ_1 Δ_{k−1}) − ζ δ γ_1 Δ_k ≤ ⋯ ≤ f_0 − ζ δ γ_1 Σ_{j=0, j∈K}^{k} Δ_j.

Suppose now that there exists a subset K′ ⊆ K such that Σ_{k∈K′} Δ_k = ∞. Then, we get

f_0 ≥ f_0 − f_k ≥ ζ δ γ_1 Σ_{j=0, j∈K}^{k−1} Δ_j → ∞,   as k → ∞,

yielding f_0 = ∞, which is a contradiction. Hence, we conclude that

Σ_{j∈K} Δ_j < ∞.

At this point, the global convergence of Algorithm 2, based on the assumptions of this section, can be established.

Theorem 4.11

Suppose that Assumptions (H1) and (H2) hold and the columns of C_k are bounded in norm, i.e. there exist two positive constants γ_1 and γ_2 such that γ_1 ≤ ‖c_k^i‖ ≤ γ_2 for i = 1, …, p. Then, for any {x_k}_{k≥0} generated by the non-monotone strong pattern search method (NMSPS-N),

lim_{k→∞} ‖∇f_k‖ = 0.   (23)

Proof.

By contradiction, let us assume that the conclusion does not hold. Then, there is a subsequence {x_{t_i}}_{i≥0} of successful iterations such that

‖∇f_{t_i}‖ ≥ 2δ > 0,   for some δ > 0.

Theorem 4.9 guarantees that, for each i, there exists a first successful iteration l(t_i) > t_i such that ‖∇f_{l(t_i)}‖ < δ. Denote l_i := l(t_i) and define the index set Ξ_i := {k : t_i ≤ k < l_i}; hence

‖∇f_k‖ ≥ δ,  k ∈ Ξ_i,   and   ‖∇f_{l_i}‖ < δ.   (24)

This fact along with taking Ξ := ∪_{i=0}^{∞} Ξ_i leads to

lim inf_{k→∞, k∈Ξ} ‖∇f_k‖ ≥ δ.

Then, Lemma 4.10 gives

Σ_{j∈Ξ} Δ_j < ∞,

leading to

lim_{i→∞} Σ_{j∈Ξ_i} Δ_j = 0.

Hence

‖x_{t_i} − x_{l_i}‖ ≤ Σ_{j∈Ξ_i} ‖x_j − x_{j+1}‖ ≤ γ_2 ‖B‖ Σ_{j∈Ξ_i} Δ_j → 0,   as i → ∞,

which, by the continuity of ∇f(x) on L(x_0), yields

lim_{i→∞} ‖∇f_{t_i} − ∇f_{l_i}‖ = 0.

This is a contradiction, since Equation (24) implies ‖∇f_{t_i} − ∇f_{l_i}‖ ≥ ‖∇f_{t_i}‖ − ‖∇f_{l_i}‖ ≥ 2δ − δ = δ.

5. Numerical experiments

One of the well-known pattern search methods is the generalized coordinate search method with fixed step lengths [31]. This section reports some numerical experiments. Our algorithm, NMSCS-N, is compared with the following algorithms:

  • GSCS: The generalized strong coordinate search [31]

  • NMSCS-G: Algorithm 2 with the non-monotone term of Grippo et al. [14]

  • NMSCS-A: Algorithm 2 with the non-monotone term of Ahookhosh and Amini [1]

  • NMSCS-Z: Algorithm 2 with the non-monotone term of Zhang and Hager [38]

Test problems were selected from a wide range of papers: Problems 1–23 from [25], Problems 24–31 from [24] and Problems 32–52 from [18].

All codes were written in the MATLAB 9 programming environment and run on a 2.7 GHz Pentium(R) Dual-Core Windows 7 PC with 2 GB of RAM, in double precision, using the same subroutines. In our numerical experiments, the algorithms are stopped whenever

Δ_k ≤ 10^{−6},

or whenever the total number of function evaluations exceeds 100,000. For all algorithms, we use the parameters λ := 1.5, θ := 0.5, Δ_0 := 1 and B := I. To calculate the non-monotone term f_{l(k)}, NMSCS-G, NMSCS-A and NMSCS-N use N := 5. For NMSCS-A, NMSCS-Z and NMSCS-N, we use η_0 := 0.001, and for NMSCS-A and NMSCS-N the parameter η_k is updated by

η_k := η_0 / 2   if k = 1,   and   η_k := (η_{k−1} + η_{k−2}) / 2   if k ≥ 2.

For NMSCS-N, we take β := 1 + ε_m, in which ε_m is the machine epsilon. For all iterations of the coordinate search method, the generating matrix is fixed, i.e. C_k := C. This matrix contains in its columns all possible combinations of {−1, 0, 1}, and consequently it has p = 3^n columns. In particular, the columns of C contain both I and −I, as well as a column of zeros.
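For reference, the fixed generating matrix C of the coordinate search and the η_k recurrence used in the experiments can be reproduced as follows; a sketch with our own function names, matching the description above.

```python
import itertools
import numpy as np

def coordinate_search_generator(n):
    """All p = 3^n columns with entries in {-1, 0, 1}; the columns include
    I, -I and a zero column, as required of the fixed matrix C."""
    cols = itertools.product((-1, 0, 1), repeat=n)
    return np.array(list(cols), dtype=float).T   # shape (n, 3^n)

def next_eta(k, eta_prev, eta_prev2, eta0=0.001):
    """eta_1 = eta_0 / 2 and eta_k = (eta_{k-1} + eta_{k-2}) / 2 for k >= 2."""
    return eta0 / 2.0 if k == 1 else (eta_prev + eta_prev2) / 2.0
```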

The following algorithm briefly summarizes how the exploratory move directions for non-monotone coordinate search are generated, see [31]:

[Algorithm 3 (exploratory moves for non-monotone coordinate search) appears here as a figure in the original.]

The exploratory moves are executed sequentially, in the sense that the selection of the next trial step is based on the success or failure of the previous trial step. Thus, we may compute as few as n trial steps, while there are 3^n possible trial steps, but we compute no more than 2n at any given iteration, see Figure 1 in [31]. However, in the worst case, the algorithm for coordinate search ensures that all 2n steps, defined by Δ_k B Γ = Δ_k B [M  −M] = Δ_k [I  −I], are tried before returning the step d_k = 0. In other words, the exploratory moves given in Algorithm 3 examine all 2n steps defined by Δ_k B Γ unless a step satisfying f(x_k + d_k) < Λ_k is found.
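One simplified realization of these exploratory moves is sketched below: it scans the 2n steps ±Δ_k e_i in a fixed order and returns the first step whose function value falls below Λ_k, returning d_k = 0 when all 2n steps fail. Algorithm 3 additionally orders the trial steps according to earlier successes and failures, which this sketch does not model.

```python
import numpy as np

def exploratory_moves(x, f, Delta, Lambda_k):
    """Return the first step d in {+/- Delta * e_i} with f(x + d) < Lambda_k,
    or the zero step after at most 2n failed trials (simplified sketch)."""
    n = x.size
    for i in range(n):
        for s in (+1.0, -1.0):
            d = np.zeros(n)
            d[i] = s * Delta
            if f(x + d) < Lambda_k:
                return d
    return np.zeros(n)
```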

At this point, to provide a more reliable comparison, demonstrate the overall behaviour of the present algorithms and gain more insight into the performance of the considered codes, the performance of all codes, based on both Ct (CPU time) and Nf (number of function evaluations) for the test functions listed in Table 1, is assessed in Figure 1 by means of the performance profiles proposed by Dolan and Moré [6]. Subfigures (a) and (b) of Figure 1 plot the function P(τ): [0, r_max] → ℝ_+, defined as

P(τ) := card({p ∈ P : r_{p,s} ≤ τ}) / card(P),   τ ≥ 1,

where P denotes the set of test problems, r_{p,s} denotes the ratio of the number of function evaluations (respectively, the CPU time) needed to solve problem p by method s to the least number of function evaluations (respectively, the least CPU time) needed to solve problem p, and r_max is the maximum value of r_{p,s}. The highest curve on the plot corresponds to the best solver.
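The profile P(τ) can be computed from a table of costs; a small sketch of the Dolan-Moré construction, assuming failures are recorded as infinite cost.

```python
import numpy as np

def performance_profile(T, taus):
    """T[p, s]: cost (Nf or CPU time) of solver s on problem p (np.inf = failure).
    Returns P(tau) for each solver: the fraction of problems with r_{p,s} <= tau."""
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    R = T / best                          # performance ratios r_{p,s}
    return np.array([[np.mean(R[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

T = np.array([[10., 12.], [20., 15.], [np.inf, 30.]])  # 3 problems, 2 solvers
P = performance_profile(T, taus=np.linspace(1.0, 3.0, 50))
```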

Table 1. List of test functions.

Problem name Dim Problem name Dim
Extended Powell badly scaled 2 Powell singular 4
Brent 3 Broyden banded 5
Seven-Diagonal System 7 Chebyquad 10
Extended Powell Singular 8 Brown almost linear 10
Tridiagonal exponential 10 Discrete integral equation 20
Generalized Broyden banded 10 Diag. func. premul. by … matrix 3
Flow in a channel 10 Function 18 3
Swirling flow 10 Strictly convex 2 5
Troesch 12 Strictly convex 1 5
Trig. exponential system 2 15 Zero Jacobian 5
Countercurrent reactors 1 16 Geometric 5
Countercurrent reactors 2 16 Extended Rosenbrock 6
Porous medium 16 Geometric programming 8
Trigonometric 20 Tridimensional valley 9
Singular Broyden 20 Chandrasekhar's H-equation 10
Broyden tridiagonal 20 Singular 10
Extended Wood 20 Logarithmic 10
Extended Cragg and Levy 24 Variable band 2 10
Trig. exponential system 1 25 Function 15 10
Structured Jacobian 25 Linear function-full rank 1 10
Discrete boundary value 25 Hanbook 10
Poisson 25 Variable band 1 15
Poisson 2 25 Linear function-full rank 2 20
Rosenbrock 2 Function 27 20
Powell badly scaled 2 Complementary 20
Helical valley 3 Function 21 21

Figure 1. A comparison among the proposed algorithms with the performance measures: (a) number of function evaluations (top); (b) CPU times (bottom).

Subfigure (a) of Figure 1 compares the algorithms with respect to the total number of function evaluations. It can easily be seen that NMSCS-N is the best algorithm in the sense of the most wins, on more than 50% of the test functions. To compare the CPU times, because of the variation in CPU time each problem was solved five times and the average of the CPU times was taken into account. Subfigure (b) of Figure 1 presents a comparison among the considered algorithms regarding CPU times. The results of this subfigure indicate that the performance of NMSCS-N is better than that of the other algorithms; in detail, the new algorithm is the best in more than 35% of all cases.

6. Concluding remarks

This paper proposes a new non-monotone coordinate search algorithm to solve systems of nonlinear equations. Our method can overcome some disadvantages of the method proposed by Ahookhosh and Amini [1] by presenting a new parameter, defined using a convex combination of the maximum function value of some preceding successful iterates and the current function value. This parameter prevents the production of a weaker non-monotone strategy whenever the iterates are far away from the optimizer, and of a stronger non-monotone strategy whenever the iterates are close to the optimizer. The global convergence properties of the proposed algorithms are established. Preliminary numerical results show the significant efficiency of the new algorithm.

Funding Statement

The second author acknowledges the financial support of the Doctoral Program ‘Vienna Graduate School on Computational Optimization’ funded by the Austrian Science Fund (FWF) under Project No. W1260-N35.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • [1].Ahookhosh M. and Amini K., An efficient nonmonotone trust-region method for unconstrained optimization, Numer. Algorithms 59 (2012), pp. 523–540. doi: 10.1007/s11075-011-9502-5 [DOI] [Google Scholar]
  • [2].Ahookhosh M., Amini K., and Kimiaei M., A globally convergent trust-region method for large-scale symmetric nonlinear systems, Numer. Funct. Anal. Optim. 36 (2015), pp. 830–855. doi: 10.1080/01630563.2015.1046080 [DOI] [Google Scholar]
  • [3].Ahookhosh M., Esmaeili H., and Kimiaei M., An effective trust-region-based approach for symmetric nonlinear systems, Int. J. Comput. Math. 90(3) (2013), pp. 671–690. doi: 10.1080/00207160.2012.736617 [DOI] [Google Scholar]
  • [4].Box G.E.P., Evolutionary operation: A method for increasing industrial productivity, Appl. Stat. 6 (1957), pp. 81–101. doi: 10.2307/2985505 [DOI] [Google Scholar]
  • [5].Dai Y.H., On the nonmonotone line search, J. Optim. Theory Appl. 112(2) (2002), pp. 315–330. doi: 10.1023/A:1013653923062 [DOI] [Google Scholar]
  • [6].Dolan E.D. and Moré J.J., Benchmarking optimization software with performance profiles, Math. Program. 91 (2002), pp. 201–213. doi: 10.1007/s101070100263 [DOI] [Google Scholar]
  • [7].Esmaeili H. and Kimiaei M., An improved adaptive trust-region method for unconstrained optimization, Math. Model. Anal. 19 (2014), pp. 469–490. doi: 10.3846/13926292.2014.956237 [DOI] [Google Scholar]
  • [8].Esmaeili H. and Kimiaei M., An efficient adaptive trust-region method for systems of nonlinear equations, Int. J. Comput. Math. 92 (2015), pp. 151–166. doi: 10.1080/00207160.2014.887701 [DOI] [Google Scholar]
  • [9].Esmaeili H. and Kimiaei M., A trust-region method with improved adaptive radius for systems of nonlinear equations, Math. Methods Oper. Res. 83(1) (2016), pp. 109–125. doi: 10.1007/s00186-015-0522-0 [DOI] [Google Scholar]
  • [10].Fan J.Y., Convergence rate of the trust region method for nonlinear equations under local error bound condition, Comput. Optim. Appl. 34 (2005), pp. 215–227. doi: 10.1007/s10589-005-3078-8 [DOI] [Google Scholar]
  • [11].Fan J. and Pan J., An improved trust region algorithm for nonlinear equations, Comput. Optim. Appl. 48(1) (2011), pp. 59–70. doi: 10.1007/s10589-009-9236-7 [DOI] [Google Scholar]
  • [12].Gasparo M.G., A nonmonotone hybrid method for nonlinear systems, Optim. Methods Softw. 13 (2000), pp. 79–94. doi: 10.1080/10556780008805776 [DOI] [Google Scholar]
  • [13].Gasparo M.G., Papini A., and Pasquali A., Nonmonotone algorithms for pattern search methods, Numer. Algorithms 28 (2001), pp. 171–186. doi: 10.1023/A:1014046817188 [DOI] [Google Scholar]
  • [14].Grippo L., Lampariello F., and Lucidi S., A nonmonotone line search technique for Newton's method, SIAM J. Numer. Anal. 23 (1986), pp. 707–716. doi: 10.1137/0723046 [DOI] [Google Scholar]
  • [15].Grippo L., Lampariello F., and Lucidi S., A truncated Newton method with nonmonotone line search for unconstrained optimization, J. Optim. Theory Appl. 60(3) (1989), pp. 401–419. doi: 10.1007/BF00940345 [DOI] [Google Scholar]
  • [16].Grippo L., Lampariello F., and Lucidi S., A class of nonmonotone stabilization methods in unconstrained optimization, Numer. Math. 59 (1991), pp. 779–805. doi: 10.1007/BF01385810 [DOI] [Google Scholar]
  • [17].Hooke R. and Jeeves T.A., Direct search solution of numerical and statistical problems, J. ACM 8 (1961), pp. 212–229. doi: 10.1145/321062.321069 [DOI] [Google Scholar]
  • [18].LaCruz W., Venezuela C., Martínez J.M., and Raydan M., Spectral residual method without gradient information for solving large-scale nonlinear systems of equations: Theory and experiments, Technical Report RT–04–08, July 2004.
  • [19].Lewis R.M. and Torczon V., Pattern search algorithms for bound constrained minimization, SIAM J. Optim. 9 (1999), pp. 1082–1099. doi: 10.1137/S1052623496300507 [DOI] [Google Scholar]
  • [20].Lewis R.M. and Torczon V., Pattern search methods for linearly constrained minimization, SIAM J. Optim. 10 (2000), pp. 917–941. doi: 10.1137/S1052623497331373 [DOI] [Google Scholar]
  • [21].Lewis R.M., Torczon V., and Trosset M.W., Why pattern search works, Optima (1998), pp. 1–7. [Google Scholar]
  • [22].Lewis R.M., Torczon V., and Trosset M.W., Direct search methods: Then and now, J. Comput. Appl. Math. 124 (2000), pp. 191–207. doi: 10.1016/S0377-0427(00)00423-4 [DOI] [Google Scholar]
  • [23].Lucidi S. and Sciandrone M., On the global convergence of derivative free methods for unconstrained optimization, Technical Report 32–96, DIS, Universita' di Roma ‘La Sapienza’, 1996.
  • [24].Lukšan L. and Vlček J., Sparse and partially separable test problems for unconstrained and equality constrained optimization, Technical Report No. 767, January 1999.
  • [25].Moré J.J., Garbow B.S., and Hillstrom K.E., Testing unconstrained optimization software, ACM Trans. Math. Softw. 7 (1981), pp. 17–41. doi: 10.1145/355934.355936 [DOI] [Google Scholar]
  • [26].Nocedal J. and Wright S.J., Numerical Optimization, Springer, New York, 2006. [Google Scholar]
  • [27].Shi Z.J. and Wang S., Modified nonmonotone Armijo line search for descent method, Numer. Algorithms 57(1) (2011), pp. 1–25. doi: 10.1007/s11075-010-9408-7 [DOI] [Google Scholar]
  • [28].Toint P.L., An assessment of nonmonotone linesearch techniques for unconstrained optimization, SIAM J. Sci. Comput. 17 (1996), pp. 725–739. doi: 10.1137/S106482759427021X [DOI] [Google Scholar]
  • [29].Torczon V., Multidirectional search: A direct search algorithm for parallel machines, Ph.D. thesis, Rice University, Houston, TX, 1989.
  • [30].Torczon V., On the convergence of the multidirectional search algorithm, SIAM J. Optim. 1 (1991), pp. 123–145. doi: 10.1137/0801010 [DOI] [Google Scholar]
  • [31].Torczon V., On the convergence of pattern search algorithms, SIAM J. Optim. 7 (1997), pp. 1–25. doi: 10.1137/S1052623493250780 [DOI] [Google Scholar]
  • [32].Yuan G.L. and Lu X.W., A new backtracking inexact BFGS method for symmetric nonlinear equations, Comput. Math. Appl. 55 (2008), pp. 116–129. doi: 10.1016/j.camwa.2006.12.081 [DOI] [Google Scholar]
  • [33].Yuan G.L. and Zhang M.J., A three-terms Polak–Ribière–Polyak conjugate gradient algorithm for large-scale nonlinear equations, J. Comput. Appl. Math. 286 (2015), pp. 186–195. doi: 10.1016/j.cam.2015.03.014 [DOI] [Google Scholar]
  • [34].Yuan G.L., Lu S., and Wei Z., A new trust-region method with line search for solving symmetric nonlinear equations, Int. J. Comput. Math. 88(10) (2011), pp. 2109–2123. doi: 10.1080/00207160.2010.526206 [DOI] [Google Scholar]
  • [35].Yuan G.L., Meng Z.H., and Li Y., A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations, J. Optim. Theory Appl. 168 (2016), pp. 129–152. doi: 10.1007/s10957-015-0781-1 [DOI] [Google Scholar]
  • [36].Yuan G.L., Lu X.W., and Wei Z.X., BFGS trust-region method for symmetric nonlinear equations, J. Comput. Appl. Math. 230 (2009), pp. 44–58. doi: 10.1016/j.cam.2008.10.062 [DOI] [Google Scholar]
  • [37].Yuan G.L., Wei Z.X., and Lu X.W., A BFGS trust-region method for nonlinear equations, Computing 92(4) (2011), pp. 317–333. doi: 10.1007/s00607-011-0146-z [DOI] [Google Scholar]
  • [38].Zhang H.C. and Hager W.W., A nonmonotone line search technique and its application to unconstrained optimization, SIAM J. Optim. 14(4) (2004), pp. 1043–1056. doi: 10.1137/S1052623403428208 [DOI] [Google Scholar]
  • [39].Zhang J. and Wang Y., A new trust region method for nonlinear equations, Math. Methods Oper. Res. 58 (2003), pp. 283–298. doi: 10.1007/s001860300302 [DOI] [Google Scholar]
