International Journal of Computer Mathematics 96(1), pp. 33–50. Published online 19 December 2017. doi: 10.1080/00207160.2017.1413552

A non-monotone pattern search approach for systems of nonlinear equations

Keyvan Amini a, Morteza Kimiaei b (corresponding author), Hassan Khotanlou c
PMCID: PMC6235546  PMID: 30487705

ABSTRACT

In this paper, a new pattern search method is proposed to solve systems of nonlinear equations. We introduce a new non-monotone strategy based on a convex combination of the maximum function value of some preceding successful iterates and the current function value. First, whenever the iterates are far away from the optimizer, we produce a non-monotone strategy stronger than that of Gasparo et al. [Nonmonotone algorithms for pattern search methods, Numer. Algorithms 28 (2001), pp. 171–186]. Second, when the iterates are near the optimizer, we produce a non-monotone strategy weaker than that of Ahookhosh and Amini [An efficient nonmonotone trust-region method for unconstrained optimization, Numer. Algorithms 59 (2012), pp. 523–540]. Third, whenever the iterates are neither near the optimizer nor far away from it, we produce a medium non-monotone strategy lying between these two. Global convergence of the proposed algorithm is established, and numerical results are reported.

Keywords: Nonlinear equation, pattern search, coordinate search, non-monotone technique, theoretical convergence

2010 AMS Subject Classifications: 90C30, 93E24, 34A34

1. Introduction

Consider the following nonlinear system of equations

F(x) = 0,   x ∈ ℝ^n,   (1)

for which F : ℝ^n → ℝ^n is a continuously differentiable mapping. Suppose that F has a zero. Then every solution x^* of the nonlinear equation problem (1) is a solution of the following unconstrained nonlinear least-squares problem

min f(x) := (1/2) ‖F(x)‖²   s.t.   x ∈ ℝ^n,   (2)

where ‖·‖ denotes the Euclidean norm. Conversely, if x^* solves Equation (2) and f(x^*) = 0, then x^* is a solution of (1). There are various methods to solve the nonlinear system (1), such as conjugate gradient methods [33,35], line-search methods [5,12,14–16,32] and trust-region methods [2,3,7–11,34,36,37,39], which are quite fast and robust; but they may have some shortcomings. First, a trust-region algorithm uses a ratio to control the agreement between the actual and predicted reductions essentially only along one direction; for more details on trust-region algorithms, cf. [26]. If this ratio is near one and the Jacobian matrix F′(x) is ill-conditioned, or f is a highly nonlinear function for which the quadratic approximation is poor, then the trust-region radius may increase before reaching a narrow curved valley. Afterwards, the radius must be reduced several times to get around this narrow curved valley, which increases the computational cost and can also produce unsuitable solutions in cases where highly accurate solutions are necessary. Second, solving the trust-region subproblems increases the CPU time. Third, these methods need to compute both F(x) and the gradient ∇f(x) = F′(x)ᵀF(x) to determine the quadratic approximation in each iteration. Pattern search methods represent a derivative-free subclass of direct search algorithms for minimizing a continuous function (see, e.g. [4,17,21,22]). Box [4] and Hooke and Jeeves [17] were the first researchers to introduce the original pattern search methods. Some researchers have shown that pattern search algorithms converge globally, see [13,19,20,30,31]. Lewis and Torczon successfully extended these algorithms to bound-constrained and linearly constrained minimization [19,20]. Torczon [29,30] presented a multidirectional search algorithm for parallel machines. For ill-conditioned problems, using a monotone pattern search as an auxiliary algorithm may adversely affect the performance of the whole procedure, cf. [13]. Hence, we introduce a new non-monotone pattern search framework that decreases the total number of function evaluations and the CPU time. This development enables us to produce a suitable non-monotone strategy at each iteration while maintaining global convergence. Numerical results show that the new modification of pattern search is efficient for solving systems of nonlinear equations.
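For concreteness, the following minimal Python sketch evaluates the least-squares merit function (2) for a toy two-dimensional system; the particular F below is an illustrative choice of ours, not one of the paper's test problems.

```python
import numpy as np

def F(x):
    # toy nonlinear system (illustrative choice, not from the paper's test set):
    # F1(x) = x1^2 + x2 - 3,  F2(x) = x1 + x2^2 - 5
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 5.0])

def f(x):
    # least-squares merit function (2): f(x) = 0.5 * ||F(x)||^2
    Fx = F(x)
    return 0.5 * float(Fx @ Fx)

print(f(np.array([1.0, 2.0])))  # 0.0, since x* = (1, 2) solves F(x) = 0
```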

Notation: The Euclidean vector norm or the associated matrix norm is denoted by ‖·‖. A set of directions {d_k^1, …, d_k^p} is said to positively span ℝ^n if for each y ∈ ℝ^n there exist λ_i ≥ 0, for i = 1, …, p, such that

y = Σ_{i=1}^{p} λ_i d_k^i.

Moreover, e_i, for i = 1, …, n, denotes the orthonormal set of coordinate directions. To simplify our notation, we set ℕ_0 := ℕ ∪ {0}.
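The positive-spanning property can be tested numerically: the columns of a matrix D positively span ℝ^n exactly when, for every target y, the system Dλ = y admits a solution with λ ≥ 0. The sketch below checks this by LP feasibility over a batch of random targets; the function name and the random-sampling heuristic are our own, and a finite sample provides evidence rather than a proof.

```python
import numpy as np
from scipy.optimize import linprog

def positively_spans(D, trials=200, seed=0):
    """Heuristic check that the columns of D positively span R^n:
    for random targets y, test feasibility of D @ lam = y with lam >= 0."""
    n, p = D.shape
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        y = rng.standard_normal(n)
        res = linprog(c=np.zeros(p), A_eq=D, b_eq=y,
                      bounds=[(0, None)] * p, method="highs")
        if not res.success:
            return False
    return True

I3 = np.eye(3)
print(positively_spans(np.hstack([I3, -I3])))  # True: [I, -I] positively spans R^3
print(positively_spans(I3))                    # False: I alone does not
```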

Organization. The rest of this paper is organized as follows. In Section 2, we first describe the exploratory moves and then the generalized pattern search is presented. A new non-monotone pattern search algorithm is presented in Section 3. In Section 4, the global convergence of the new algorithm is investigated. Numerical results are provided in Section 5 to show that the proposed algorithm is efficient and promising for systems of nonlinear equations. Finally, some concluding remarks are given in Section 6.

2. The generalized pattern search method

First of all, we define two components, namely a basis matrix and a generating matrix, cf. [31].

Definition 2.1

Any arbitrary non-singular matrix B ∈ ℝ^{n×n} is called a basis matrix.

Definition 2.2

The generating matrix C_k ∈ ℤ^{n×p} with p > 2n, divided into two parts, is defined as

C_k := [Γ_k   L_k],

in which Γ_k := [M_k   −M_k], M_k ∈ M ⊂ ℤ^{n×n}, M is a finite set of non-singular matrices, and L_k ∈ ℤ^{n×(p−2n)} is a matrix that contains at least one column of zeros.

A pattern P_k is defined by the columns of the matrix P_k = BC_k, in which B is a basis matrix. By the definition of C_k and the fact that M_k has rank n, it is clear that C_k also has rank n; this implies that the columns of P_k span ℝ^n. It is convenient to use the partition of the generating matrix C_k to partition P_k, as follows:

P_k := BC_k = [BΓ_k   BL_k].   (3)

Given x_k and a step-size Δ_k > 0, we define a trial step d_k^i to be any vector of the form

d_k^i := Δ_k B c_k^i,   i = 1, …, p,

in which c_k^i denotes the ith column of C_k; the vectors B c_k^i, called exploratory moves as proposed in [31], determine the step directions, and Δ_k is a step-size parameter. Furthermore, a trial point is any point of the form x_k^i := x_k + d_k^i, where x_k is the current iterate. Before declaring a new iterate and updating the associated information, pattern search methods use a series of exploratory moves to produce the new iterate. To prove the convergence of pattern search methods, we require that the exploratory moves be obtained by one of the following two procedures:

[Procedure 1 appears here as a figure in the original.]

In Procedure 1, note that y ∈ A means that the vector y is contained in the set of columns of the matrix A. Step (S.2) is the more interesting one; hence, let us describe how it works. As long as any of the 2n steps given by Δ_k B Γ_k produces a decrease in the function value at the current iterate, the exploratory moves must return a step that decreases the function value, without necessarily satisfying f(x_k + d_k) ≤ min{f(x_k + y) : y ∈ Δ_k B Γ_k}.

[Procedure 2 appears here as a figure in the original.]

In Procedure 2, (S.2) is replaced by a strong version, as presented above.
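To fix ideas, the following sketch forms the trial points x_k^i = x_k + Δ_k B c_k^i from a basis matrix B and a generating matrix C; it is an illustrative fragment with our own names, using the coordinate-search pattern C = [I  −I  0] as an example.

```python
import numpy as np

def trial_points(x, Delta, B, C):
    """Trial points x + d^i with d^i = Delta * B @ C[:, i] (Section 2)."""
    D = Delta * (B @ C)      # column i is the trial step d^i
    return x[:, None] + D    # column i is the trial point x^i

n = 2
C = np.hstack([np.eye(n), -np.eye(n), np.zeros((n, 1))])  # [I, -I, 0]
pts = trial_points(np.array([1.0, 2.0]), 0.5, np.eye(n), C)
```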

Algorithm 1 states the generalized pattern search method for systems of nonlinear equations, cf. [31]. [The listing of Algorithm 1 appears as a figure in the original.]

In Algorithm 1, the iteration is called successful if ρ_k > 0 (Line 8); otherwise, it is called an unsuccessful iteration. The parameter θ is the shrinkage parameter, with θ := τ^{w_0}, in which τ > 1 and w_0 is a negative integer, and λ_k is called the expanding factor, with

λ_k ∈ {τ^{w_1}, τ^{w_2}, …, τ^{w_l}},

in which w_1, w_2, …, w_l are positive integers, with l < ∞. In Line 4 of this algorithm, the step d_k can be obtained either by Procedure 1 or by Procedure 2. The algorithm is called generalized weak pattern search (GWPS) if d_k is obtained by Procedure 1; if d_k is obtained by Procedure 2, it is called generalized strong pattern search (GSPS).

Both GWPS and GSPS share a drawback: the quantity ρ_k cannot truly prevent the production of unsuccessful iterations in the presence of a narrow curved valley, which increases the CPU time and the total number of function evaluations. In order to overcome this drawback, Gasparo et al. [13] modified the quantity ρ_k.

Torczon [31] showed in Theorem 3.2 that each iterate x_n generated by GWPS can be written as

x_n := x_0 + (β^{r_LB} α^{−r_UB}) Δ_0 B Σ_{k=0}^{n−1} z_k,   (4)

in which α and β are relatively prime positive integers satisfying τ = β/α, r_LB := min{r_0, …, r_{n−1}}, r_UB := max{r_0, …, r_{n−1}} and z_k ∈ ℤ^n. Moreover, Torczon showed that Δ_k can be written as

Δ_k := τ^{r_k} Δ_0,   (5)

in which r_k ∈ ℤ. Both Equations (4) and (5) help us to prove Lemma 4.7 in Section 4.

3. The new non-monotone strategy

It is well known that globalization techniques such as pattern search can guarantee the global convergence of traditional direct search approaches. However, this globalization technique generates a monotone sequence of objective function values, which usually leads to short steps and, in consequence, to slow numerical convergence on highly nonlinear problems, see [1,5,13,14,27,28,38]. As an example, the generalized pattern search framework exploits the quantity ρ_k, which guarantees

f_k − f_{k+1} > 0;

this means that the sequence {f_k}_{k≥0} is monotonically decreasing. In order to avoid this drawback of globalization techniques, Gasparo et al. [13], based on the definition introduced by Grippo et al. [14], proposed a non-monotone strategy for pattern search algorithms with the quantity ρ̂_k satisfying

ρ̂_k := f_{l(k)} − f_{k+1},

for which

f_{l(k)} := max_{0 ≤ j ≤ m(k)} { f_{k−j} },   k ∈ ℕ_0,   (6)

in which m(0) := 0 and 0 ≤ m(k) ≤ min{m(k−1) + 1, N} with N ≥ 0. This strategy produced excellent results and motivated many researchers to investigate the effects of such strategies in a wide variety of optimization procedures and to propose other non-monotone techniques, see [1,13,14,27,28,38]. Although the non-monotone technique (6) has many advantages, it also suffers from some drawbacks, see [1,38]. Recently, Ahookhosh and Amini [1] presented a non-monotone strategy weaker than that of Grippo et al. [14], which overcomes some of its disadvantages, with the quantity ρ̄_k satisfying

ρ̄_k := R_k − f_{k+1},

where

R_k := η_k f_{l(k)} + (1 − η_k) f_k,   (7)

in which η_k ∈ [η_min, η_max], η_min ∈ [0, 1) and η_max ∈ [η_min, 1]. Although this proposal generates a more efficient algorithm, it depends on the choice of η_k, and an unsuitable choice can cause some shortcomings. According to the characteristics and expectations of our algorithm, we propose an appropriate η_k. In this regard, let us first define the ratio

Θ_k := f_{l(k)} / f_k,

which helps us to measure the distance between the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0}. It is clear that Θ_k ≥ 1 because f_{l(k)} ≥ f_k > 0, and Lemma 4.5 shows that lim_{k→∞} Θ_k = 1. Moreover, if Θ_k ≥ β (β > 1), then {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are far away from each other; otherwise they are close. Now, after defining

η̂_k := η_k / Θ_k   if Θ_k ≥ β,   and   η̂_k := η_k Θ_k   otherwise,   (8)

a new non-monotone pattern search formula is defined by

Λ_k := η̂_k f_{l(k)} + (1 − η̂_k) f_k,   (9)

for which the new quantity is considered as

ρ̃_k := Λ_k − f_{k+1}.   (10)

The theoretical and numerical results show that the new choice of ρ̃_k has remarkable positive effects on pattern search, yielding faster convergence, especially for highly nonlinear problems. Let us now use the following procedure to compute the non-monotone strategy (9):

[The procedure for computing Λ_k appears here as a figure in the original.]

Remark 3.1

The sequence {Λ_k}_{k≥0} yields the convergence behaviour of a stronger non-monotone strategy whenever the iterates are far away from the optimizer and the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are close to each other, while it yields the convergence behaviour of a weaker non-monotone strategy whenever the iterates are close to the optimizer and the members of {f_k}_{k≥0} and {f_{l(k)}}_{k≥0} are far away from each other.
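A compact sketch of how the quantities (6)-(9) could be evaluated from a sliding window of recent function values; the window handling, the variable names and the guard f_k > 0 are our assumptions, and the case split follows (8).

```python
def new_nonmonotone_term(f_hist, eta, beta):
    """f_hist holds f_{k-m(k)}, ..., f_k (window of length <= N+1, f_k > 0);
    eta is eta_k and beta > 1 is the switching threshold of (8)."""
    fk = f_hist[-1]
    flk = max(f_hist)                                        # f_{l(k)} in (6)
    theta = flk / fk                                         # Theta_k >= 1
    eta_hat = eta / theta if theta >= beta else eta * theta  # (8)
    return eta_hat * flk + (1.0 - eta_hat) * fk              # Lambda_k in (9)
```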

Before presenting our algorithm, we describe how to determine the step dk by the following two procedures:

[Procedure 3 appears here as a figure in the original.]

Procedure 3 tries to find, among the 2n steps given by Δ_k B Γ_k, a step d_k satisfying f(x_k + d_k) < Λ_k at each iterate, without necessarily satisfying f(x_k + d_k) ≤ min{f(x_k + y) : y ∈ Δ_k B Γ_k}.

[Procedure 4 appears here as a figure in the original.]

Now, to investigate the effectiveness of the new pattern search, we add the new non-monotone strategy to the framework of the pattern search method, obtaining Algorithm 2. [The listing of Algorithm 2 appears as a figure in the original.]

Note that in Algorithm 2, if d_k is obtained by Procedure 3, the method is called non-monotone weak pattern search (NMWPS-N), while if d_k is obtained by Procedure 4, it is called non-monotone strong pattern search (NMSPS-N). To guarantee the global convergence of NMWPS-N, which uses Procedure 3 to determine d_k, we update Δ_k by

Δ_{k+1} := λ_k Δ_k   if ρ̃_k > 0,   and   Δ_{k+1} := θ Δ_k   otherwise,   (11)

while NMSPS-N, with d_k obtained by Procedure 4, updates Δ_k by

Δ_{k+1} := Δ_k   if ρ̃_k > 0,   and   Δ_{k+1} := θ Δ_k   otherwise,   (12)

where both θ and λ_k are updated as in Algorithm 1. We describe how to update C_k in Section 5.
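The two step-size updates can be summarized in a few lines; this is an illustrative sketch of (11) and (12), with the success test ρ̃_k > 0 supplied by the caller.

```python
def update_Delta(Delta, rho_tilde, lam, theta, strong=False):
    """Step-size update: (11) for NMWPS-N (strong=False), (12) for NMSPS-N."""
    if rho_tilde > 0:                           # successful iteration
        return Delta if strong else lam * Delta
    return theta * Delta                        # unsuccessful: shrink by theta
```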

The global convergence results of both NMWPS-N and NMSPS-N require the following assumptions:

  1. (H1) The level set L(x_0) := {x ∈ ℝ^n : f(x) ≤ f(x_0)} is bounded.

  2. (H2) F(x) is continuously differentiable on a compact convex set Ω containing L(x_0).

It can easily be seen that, in Algorithm 2, for any index k one of the following cases occurs:

I_1 := {k : Θ_k ≥ β},   I_2 := {k : Θ_k < β, η̂_k ∈ (0, 1]}   and   I_3 := {k : Θ_k < β, η̂_k > 1}.

Lemma 3.1

Suppose that the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then, we have the following properties:

  1. (P1) If k ∈ I_1, then f_k ≤ Λ_k ≤ R_k.

  2. (P2) If k ∈ I_2, then R_k < Λ_k ≤ f_{l(k)}.

  3. (P3) If k ∈ I_3, then Λ_k > f_{l(k)}.

Proof.

(1) The fact that Θ_k ≥ β > 1 implies η̂_k ≤ η_k, and consequently

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k = η̂_k (f_{l(k)} − f_k) + f_k ≤ η_k (f_{l(k)} − f_k) + f_k = R_k.

On the other hand, because f_{l(k)} ≥ f_k, it is easily seen that

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k ≥ η̂_k f_k + (1 − η̂_k) f_k = f_k.

So, (P1) is correct.

(2) The definition of f_{l(k)} along with the fact that 1 ≤ Θ_k < β implies η̂_k ≥ η_k, and so

R_k = η_k (f_{l(k)} − f_k) + f_k ≤ η̂_k (f_{l(k)} − f_k) + f_k = Λ_k = (1 − η̂_k)(f_k − f_{l(k)}) + f_{l(k)} ≤ f_{l(k)},

which gives (P2).

(3) The definition of f_{l(k)} and η̂_k > 1 result in

Λ_k = η̂_k f_{l(k)} + (1 − η̂_k) f_k = (η̂_k − 1)(f_{l(k)} − f_k) + f_{l(k)} > f_{l(k)},

so, (P3) is correct.

Based on Lemma 3.1, using the new sequence {η̂_k}_{k≥0} yields some appropriate properties. If k ∈ I_1, then (P1) gives Λ_k ≤ R_k; in this case, where the iterates are close to the optimizer, the definition (9) proposes a non-monotone strategy weaker than the non-monotone strategy (7). If k ∈ I_2, then (P2) gives R_k < Λ_k ≤ f_{l(k)}, which leads to a medium non-monotone strategy whenever the iterates are neither near the optimizer nor far away from it. Finally, if k ∈ I_3, far away from the optimizer, (P3) gives Λ_k > f_{l(k)}, so the algorithm uses a non-monotone strategy stronger than the non-monotone strategy (6).

4. Convergence analysis

In this section, we investigate the global convergence results of the new proposed algorithm.

Lemma 4.1

Suppose that Assumption (H1) holds and the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then x_k ∈ L(x_0) for all k ∈ ℕ_0, and the sequence {f_{l(k)}} is a convergent decreasing sequence for all k ∈ I_1 ∪ I_2, and also for all k ∈ I_3 provided that f_{k+1} ≤ f_{l(k)}.

Proof.

If x_{k+1} is not accepted by Algorithm 2, then f_{k+1} = f_k and f_{l(k+1)} = f_{l(k)}. Otherwise, we have

f_{k+1} = f(x_k + d_k) ≤ Λ_k,   ∀ k ∈ ℕ_0.   (13)

In the sequel, we divide the proof into two parts.

(a) k ∈ I_1 ∪ I_2. Properties (P1) and (P2) of Lemma 3.1, along with (13), imply that f_{k+1} ≤ f_{l(k)}. In order to prove that the sequence {f_{l(k)}}_{k∈I_1∪I_2} is decreasing, we consider the following two cases:

(i) k < N. In this case m(k+1) = k + 1. It is easily seen that

f_{l(k+1)} = max_{0 ≤ j ≤ k+1} { f_{k+1−j} } = max{ f_{l(k)}, f_{k+1} } = f_{l(k)}.

(ii) k ≥ N. In this case, we have m(k+1) = N for all k. Therefore, the inequality f_{k+1} ≤ f_{l(k)} results in

f_{l(k+1)} = max_{0 ≤ j ≤ N} { f_{k+1−j} } ≤ max{ max_{0 ≤ j ≤ N} { f_{k−j} }, f_{k+1} } = max{ f_{l(k)}, f_{k+1} } = f_{l(k)},

where the last equality, together with k ∈ I_1 ∪ I_2, is a consequence of (13).

(b) k ∈ I_3 and f_{k+1} ≤ f_{l(k)}. The proof is similar to cases (i) and (ii) of part (a).

Now, by strong induction, assuming x_i ∈ L(x_0) for all i = 1, …, k, it is sufficient to show x_{k+1} ∈ L(x_0). Indeed, we obtain

f_{k+1} ≤ f_{l(k)} ≤ f_0.

Thus, the sequence {x_k}_{k≥0} is contained in L(x_0). Finally, Assumption (H1) along with x_k ∈ L(x_0) for all k ∈ ℕ_0 implies that the sequence {f_{l(k)}}_{k≥0} is bounded. Thus, being decreasing, the sequence {f_{l(k)}}_{k≥0} is convergent.

Lemma 4.2

Suppose that Assumption (H1) holds and the sequence {x_k}_{k≥0} is generated by Algorithm 2. Then x_k ∈ L(x_0) for all k ∈ ℕ_0, and, whenever f_{k+1} > f_{l(k)}, the sequence {Λ_k}_{k≥0} is a convergent decreasing sequence for all k ∈ I_3.

Proof.

If x_{k+1} is not accepted by Algorithm 2, then f_{k+1} = f_k and f_{l(k+1)} = f_{l(k)}. Otherwise, we have

f_{k+1} = f(x_k + d_k) ≤ Λ_k,   ∀ k ∈ ℕ_0.

This fact along with f_{k+1} > f_{l(k)} and the definition of f_{l(k+1)} results in f_{l(k+1)} = f_{k+1}, and also

Λ_{k+1} = η̂_{k+1} f_{l(k+1)} + (1 − η̂_{k+1}) f_{k+1} = η̂_{k+1} f_{k+1} + (1 − η̂_{k+1}) f_{k+1} = f_{k+1} ≤ Λ_k.

Now, by strong induction, assuming x_i ∈ L(x_0) for all i = 1, …, k, it is sufficient to show x_{k+1} ∈ L(x_0). Indeed, we obtain

f_{k+1} = Λ_{k+1} ≤ Λ_k ≤ f_0.

Thus, the sequence {x_k}_{k≥0} is contained in L(x_0). Finally, Assumption (H1) along with x_k ∈ L(x_0) for all k ∈ ℕ_0 implies that the sequence {Λ_k}_{k≥0} is bounded. Thus, being decreasing, the sequence {Λ_k}_{k≥0} is convergent.

Lemma 4.3

Let {x_k}_{k≥0} be a bounded sequence of vectors in ℝ^n generated by the NMSPS-N algorithm, and let η ∈ ℝ be such that ‖∇f_k‖ ≥ η > 0. Then, under Assumptions (H1) and (H2), there exists δ > 0 such that, for all Δ_k > 0, if Δ_k ≤ δ, then the kth iteration of NMSPS-N is successful (ρ̃_k > 0) and Δ_{k+1} ≥ Δ_k.

Proof.

Similar to Proposition 6.4 in [31], if Δ_k < δ, then there exists at least one index i ∈ {1, …, p} with d_k^i ∈ Δ_k B C_k such that

f(x_k + d_k^i) − f(x_k) ≤ −(1/2) ξ ‖∇f_k‖ ‖d_k^i‖ < 0,

in which ξ > 0 is a constant. Hence, whenever Δ_k < δ, we have f(x_k + d_k^i) < f(x_k) ≤ Λ_k. If min{f(x_k + y) : y ∈ Δ_k B Γ_k} < Λ_k, then Procedure 3 guarantees f(x_k + d_k) < Λ_k and consequently ρ̃_k > 0. By the update rules (11) and (12), we then have Δ_{k+1} ≥ Δ_k.

Lemma 4.3 gives the following corollary, see Corollary 6.5 in [31].

Corollary 4.4

Let {x_k}_{k≥0} be a bounded sequence of vectors in ℝ^n generated by NMWPS-N, and let η ∈ ℝ be such that ‖∇f_k‖ ≥ η > 0. Then, under Assumptions (H1) and (H2), there exist ζ, δ > 0 such that, for all Δ_k > 0, if Δ_k ≤ δ, then

f_{k+1} ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖.

The above corollary helps us to establish the following lemma.

Lemma 4.5

Suppose that Assumptions (H1) and (H2) hold and the sequence {x_k}_{k≥0} is generated by the NMWPS-N algorithm. Then, we have

lim_{k→∞} f_{l(k)} = lim_{k→∞} f_k.

Proof.

Using the fact that x_k is not the optimum of (2), we can conclude that there exists a constant ε > 0 such that ‖∇f_k‖ ≥ ε. This fact along with Corollary 4.4 and f_k ≤ f_{l(k)} implies that, for some ζ > 0,

f_{k+1} = f(x_k + d_k) ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖ ≤ f_k − ζ ε ‖d_k‖ ≤ f_{l(k)} − ω ‖d_k‖,   (14)

where ω = ζε. By replacing k with l(k) − 1 in Equation (14), we have

f_{l(k)} ≤ f_{l(l(k)−1)} − ω ‖d_{l(k)−1}‖.   (15)

This fact along with Lemmas 4.1 and 4.2 results in

lim_{k→∞} ‖d_{l(k)−1}‖ = 0.   (16)

Assumption (H2) and (16) give

lim_{k→∞} f(x_{l(k)}) = lim_{k→∞} f(x_{l(k)−1}).   (17)

By letting l̂(k) := l(k + N + 2) and using induction, we can prove that, for all j ≥ 1,

lim_{k→∞} ‖d_{l̂(k)−j}‖ = 0.   (18)

Since {l̂(k)} ⊆ {l(k)}, Equation (16) shows that Equation (18) is satisfied for j = 1. Assume that Equation (18) holds for a given j, and take k large enough so that l̂(k) − (j + 1) > 0. Using Equation (14) and substituting k with l̂(k) − j − 1, we have

f(x_{l̂(k)−j}) ≤ f(x_{l̂(k)−j−1}) − ω ‖d_{l̂(k)−j−1}‖.

Following the same argument used to derive (17), we deduce that

lim_{k→∞} ‖d_{l̂(k)−j−1}‖ = 0

and also

lim_{k→∞} f(x_{l̂(k)−j−1}) = lim_{k→∞} f(x_{l(k)}).

Similarly to Equation (17), for any given j ≥ 1, we have

lim_{k→∞} f(x_{l̂(k)−j}) = lim_{k→∞} f(x_{l(k)}).

On the other hand, we can write

x_{k+1} = x_{l̂(k)} − Σ_{j=1}^{l̂(k)−k−1} d_{l̂(k)−j},   ∀ k.

This fact along with Equation (18) and l̂(k) − k − 1 ≤ N + 1 implies that

lim_{k→∞} ‖x_{k+1} − x_{l̂(k)}‖ = 0.

Hence, Assumption (H2) leads to

lim_{k→∞} f(x_{l(k)}) = lim_{k→∞} f(x_{l̂(k)}) = lim_{k→∞} f(x_k).

Using Lemma 4.5, we can obtain the following corollary.

Corollary 4.6

Suppose that Assumptions (H1) and (H2) hold and the sequence {x_k}_{k≥0} is generated by the NMWPS-N algorithm. Then, we have

lim_{k→∞} Λ_k = lim_{k→∞} f_k.

Proof.

(1) If k ∈ I_1 ∪ I_2, then the inequality f_k ≤ Λ_k ≤ f_{l(k)} along with Lemma 4.5 implies that

lim_{k→∞, k∈I_1∪I_2} Λ_k = lim_{k→∞, k∈I_1∪I_2} f_k.

(2) For k ∈ I_3, recalling Lemma 4.5 along with the definition of η̂_k results in

lim_{k→∞, k∈I_3} Λ_k = lim_{k→∞, k∈I_3} f_k.

The following results show that the NMWPS-N and NMSPS-N algorithms are well defined.

Lemma 4.7

Suppose that Assumption (H1) holds and that the NMWPS-N algorithm has constructed an infinite sequence {x_k}_{k≥0}. Then lim inf_{k→∞} Δ_k = 0.

Proof.

By contradiction, suppose that lim inf_{k→∞} Δ_k = 0 is not satisfied; hence, we can assume that there exist a constant Δ_LB > 0 and an index set K ⊆ ℕ_0 such that

Δ_k ≥ Δ_LB,   ∀ k ∈ K.

This fact along with Equation (5) results in

τ^{r_k} ≥ Δ_LB / Δ_0 > 0,   ∀ k ∈ K,

which means that the sequence {τ^{r_k}}_{k∈K} is bounded away from zero. Since {x_k}_{k≥0} ⊂ L(x_0) and L(x_0) is compact, Lemma 3.1 in [31] implies that the sequence {Δ_k}_{k≥0} has an upper bound, denoted by Δ_UB, and hence the sequence {τ^{r_k}}_{k∈K} is bounded above. In other words, the sequence {τ^{r_k}}_{k∈K} takes only finitely many values, and consequently {r_k}_{k≥0} has a lower and an upper bound, defined respectively by

r_LB := min{r_k : 0 ≤ k < +∞}   and   r_UB := max{r_k : 0 ≤ k < +∞};

hence, for any k ∈ K, it can be concluded that

x_k := x_0 + (β^{r_LB} α^{−r_UB}) Δ_0 B Σ_{j=0}^{k−1} z_j,

i.e. x_k lies on a translated integer lattice generated by x_0 and the columns of (β^{r_LB} α^{−r_UB}) Δ_0 B, denoted by K_1. Therefore x_k ∈ L(x_0) ∩ K_1, where L(x_0) ∩ K_1 is finite, so there must be at least one point x^* ∈ L(x_0) ∩ K_1 such that x_k = x^* for infinitely many k. By the steps of NMWPS-N, a lattice point can be revisited only finitely many times; hence a new step d_k is accepted if and only if Λ_k > f(x_k + d_k). This implies that there exists a positive index m such that x_k = x^* for all k ≥ m. This fact together with Corollary 4.6 yields ρ̃_k ≤ 0 for all sufficiently large k, and consequently Δ_k → 0, which is a contradiction since 0 < Δ_LB ≤ Δ_k.

Since NMSPS-N uses the relation (12) to update Δ_k, the same argument ensures that lim_{k→∞} Δ_k = 0.

Corollary 4.8

Suppose that Assumption (H1) holds and the NMSPS-N algorithm has constructed an infinite sequence {x_k}_{k≥0}. Then lim_{k→∞} Δ_k = 0.

Remark 4.1

Lucidi and Sciandrone, in Proposition 2 of [23], showed that if the sequences {c_k^i}_{k≥0}, i = 1, …, p, are bounded, and if each limit point (c^1, …, c^p) of the sequence {(c_k^1, …, c_k^p)}_{k≥0} is such that c^1, …, c^p positively span ℝ^n, then

lim_{k→∞} ‖∇f_k‖ = 0   ⟺   lim_{k→∞} Σ_{i=1}^{p} min{ 0, ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ } = 0.   (19)
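The right-hand quantity in (19) is directly computable from a gradient and a generating matrix; the sketch below evaluates it, skipping zero columns (which contribute nothing) — an implementation detail of ours.

```python
import numpy as np

def stationarity_measure(grad, C):
    """sum_i min{0, grad^T c_i / ||c_i||}, the right-hand quantity in (19)."""
    total = 0.0
    for i in range(C.shape[1]):
        ci = C[:, i]
        nrm = np.linalg.norm(ci)
        if nrm > 0:                 # skip the zero column of C
            total += min(0.0, float(grad @ ci) / nrm)
    return total
```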

Theorem 4.9

Suppose that Assumptions (H1) and (H2) hold. Let {x_k}_{k≥0} be the infinite sequence generated by NMWPS-N. Then,

lim inf_{k→∞} ‖∇f_k‖ = 0.   (20)

Proof.

By contradiction, we assume that Equation (20) does not hold. Then, there exists a constant δ > 0 such that ‖∇f_k‖ ≥ δ for all k ∈ ℕ_0. From Lemma 4.7, there exists an infinite index set K such that

lim inf_{k→∞, k∈K} Δ_k = 0.   (21)

By the continuous differentiability of f, one can find, for each k ∈ ℕ_0 and i = 1, …, p, a point ξ_k^i := x_k + ω_k d_k^i = x_k + ω_k Δ_k B c_k^i, in which ω_k ∈ (0, 1), such that

f(x_k + d_k^i) = f(x_k) + ∇f(ξ_k^i)ᵀ d_k^i ≤ Λ_k + ∇f(ξ_k^i)ᵀ d_k^i.   (22)

We get {ξ_k^i}_{k∈K} → x^*, because Equation (21) gives {d_k}_{k∈K} → 0 (with lim_{k→∞, k∈K} c_k^i / ‖c_k^i‖ = c^i) and {x_k}_{k∈K} → x^*. These facts, together with taking limits on both sides of (22), give, for i = 1, …, p,

∇f(x^*)ᵀ c^i = lim_{k→∞, k∈K} ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ = lim_{k→∞, k∈K} ∇f(ξ_k^i)ᵀ c_k^i / ‖c_k^i‖ ≥ 0,

yielding

lim_{k→∞, k∈K} Σ_{i=1}^{p} min{ 0, ∇f(x_k)ᵀ c_k^i / ‖c_k^i‖ } = 0.

Then, by Equation (19), we get

lim_{k→∞, k∈K} ‖∇f_k‖ = 0,

leading to

lim inf_{k→∞} ‖∇f_k‖ = 0,

which contradicts our assumption.

The following lemma helps us to establish the main global theorem.

Lemma 4.10

Suppose that Assumptions (H1) and (H2) hold and the columns of C_k are bounded in norm, i.e. there exist two positive constants γ_1 and γ_2 such that γ_1 ≤ ‖c_k^i‖ ≤ γ_2 for i = 1, …, p. Let {x_k}_{k≥0} be the sequence generated by NMSPS-N. If there exist a positive constant δ and a subsequence K ⊆ ℕ_0 such that ‖∇f_k‖ ≥ δ for k ∈ K, then

Σ_{k∈K} Δ_k < ∞.

Proof.

First, we show that

f_{k+1} ≤ f_0 − ζ δ γ_1 Σ_{j=0, j∈K}^{k} Δ_j.

By Corollary 4.4, we get

f_{k+1} ≤ f_k − ζ ‖∇f_k‖ ‖d_k‖ ≤ f_k − ζ δ γ_1 Δ_k ≤ (f_{k−1} − ζ δ γ_1 Δ_{k−1}) − ζ δ γ_1 Δ_k ≤ ⋯ ≤ f_0 − ζ δ γ_1 Σ_{j=0, j∈K}^{k} Δ_j.

Suppose now that there exists a subset K′ ⊆ K such that Σ_{k∈K′} Δ_k = ∞. Then, we get

f_0 ≥ f_0 − f_k ≥ ζ δ γ_1 Σ_{j=0, j∈K}^{k−1} Δ_j → ∞,   as k → ∞,

yielding f_0 = ∞, which is a contradiction. Hence, we conclude that

Σ_{j∈K} Δ_j < ∞.

At this point, the global convergence of Algorithm 2, based on the assumptions of this section, can be established.

Theorem 4.11

Suppose that Assumptions (H1) and (H2) hold and the columns of C_k are bounded in norm, i.e. there exist two positive constants γ_1 and γ_2 such that γ_1 ≤ ‖c_k^i‖ ≤ γ_2 for i = 1, …, p. Then, for any {x_k}_{k≥0} generated by the non-monotone strong pattern search method (NMSPS-N),

lim_{k→∞} ‖∇f_k‖ = 0.   (23)

Proof.

By contradiction, let us assume that the conclusion does not hold. Then, there is a subsequence {x_{t_i}}_{i≥0} of successful iterations such that

‖∇f_{t_i}‖ ≥ 2δ > 0,   for some δ > 0.

Theorem 4.9 guarantees that, for each i, there exists a first successful iteration l(t_i) > t_i such that ‖∇f_{l(t_i)}‖ < δ. Denote l_i := l(t_i) and define the index set Ξ_i := {k : t_i ≤ k < l_i}; hence

‖∇f_k‖ ≥ δ,  k ∈ Ξ_i,   and   ‖∇f_{l_i}‖ < δ.   (24)

This fact along with taking Ξ := ∪_{i=0}^{∞} Ξ_i leads to

lim inf_{k→∞, k∈Ξ} ‖∇f_k‖ ≥ δ.

Then, Lemma 4.10 gives

Σ_{j∈Ξ} Δ_j < ∞,

leading to

lim_{i→∞} Σ_{j∈Ξ_i} Δ_j = 0.

Hence

‖x_{t_i} − x_{l_i}‖ ≤ Σ_{j∈Ξ_i} ‖x_j − x_{j+1}‖ ≤ γ_2 ‖B‖ Σ_{j∈Ξ_i} Δ_j → 0,   as i → ∞,

which, by the continuity of ∇f(x) on L(x_0), yields

lim_{i→∞} ‖∇f_{t_i} − ∇f_{l_i}‖ = 0.

This is a contradiction, since Equation (24) implies ‖∇f_{t_i} − ∇f_{l_i}‖ ≥ ‖∇f_{t_i}‖ − ‖∇f_{l_i}‖ ≥ 2δ − δ = δ.

5. Numerical experiments

One of the well-known pattern search methods is the generalized coordinate search method with fixed step lengths [31]. This section reports some numerical experiments. Our algorithm, NMSCS-N, is compared with the following algorithms:

  • GSCS: The generalized strong coordinate search [31]

  • NMSCS-G: Algorithm 2 with the non-monotone term of Grippo et al. [14]

  • NMSCS-A: Algorithm 2 with the non-monotone term of Ahookhosh and Amini [1]

  • NMSCS-Z: Algorithm 2 with the non-monotone term of Zhang and Hager [38]

Test problems were selected from a wide range of papers: Problems 1–23 from [25], Problems 24–31 from [24] and Problems 32–52 from [18].

All codes were written in the MATLAB 9 programming environment and run on a 2.7 GHz Pentium(R) Dual-Core Windows 7 PC with 2 GB of RAM, in double precision, using the same subroutines. In our numerical experiments, the algorithms are stopped whenever

Δ_k ≤ 10^{−6},

or whenever the total number of function evaluations exceeds 100,000. For all algorithms, we use the parameters λ := 1.5, θ := 0.5, Δ_0 := 1 and B := I. To calculate the non-monotone term f_{l(k)}, NMSCS-G, NMSCS-A and NMSCS-N use N := 5. For NMSCS-A, NMSCS-Z and NMSCS-N, we use η_0 := 0.001, and for NMSCS-A and NMSCS-N the parameter η_k is updated by

η_k := η_0 / 2   if k = 1,   and   η_k := (η_{k−1} + η_{k−2}) / 2   if k ≥ 2.

For NMSCS-N, we take β := 1 + ε_m, in which ε_m is the machine epsilon. For all iterations of the coordinate search method, the generating matrix is fixed, i.e. C_k := C. This matrix contains in its columns all possible combinations of {−1, 0, 1}, and consequently it has p = 3^n columns. In particular, the columns of C contain both I and −I, as well as a column of zeros.
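For reference, the fixed generating matrix C of the coordinate search and the η_k recurrence used in the experiments can be reproduced as follows; a sketch with our own function names, matching the description above.

```python
import itertools
import numpy as np

def coordinate_search_generator(n):
    """All p = 3^n columns with entries in {-1, 0, 1}; the columns include
    I, -I and a zero column, as required of the fixed matrix C."""
    cols = itertools.product((-1, 0, 1), repeat=n)
    return np.array(list(cols), dtype=float).T   # shape (n, 3^n)

def next_eta(k, eta_prev, eta_prev2, eta0=0.001):
    """eta_1 = eta_0 / 2 and eta_k = (eta_{k-1} + eta_{k-2}) / 2 for k >= 2."""
    return eta0 / 2.0 if k == 1 else (eta_prev + eta_prev2) / 2.0
```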

The following algorithm briefly summarizes how the exploratory move directions for non-monotone coordinate search are generated, see [31]:

[Algorithm 3 (exploratory moves for non-monotone coordinate search) appears here as a figure in the original.]

The exploratory moves are executed sequentially, in the sense that the selection of the next trial step is based on the success or failure of the previous trial step. Thus, we may compute as few as n trial steps, while there are 3^n possible trial steps, but we compute no more than 2n at any given iteration, see Figure 1 in [31]. However, in the worst case, the algorithm for coordinate search ensures that all 2n steps, defined by Δ_k B Γ = Δ_k B [M  −M] = Δ_k [I  −I], are tried before returning the step d_k = 0. In other words, the exploratory moves given in Algorithm 3 examine all 2n steps defined by Δ_k B Γ unless a step satisfying f(x_k + d_k) < Λ_k is found.
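One simplified realization of these exploratory moves is sketched below: it scans the 2n steps ±Δ_k e_i in a fixed order and returns the first step whose function value falls below Λ_k, returning d_k = 0 when all 2n steps fail. Algorithm 3 additionally orders the trial steps according to earlier successes and failures, which this sketch does not model.

```python
import numpy as np

def exploratory_moves(x, f, Delta, Lambda_k):
    """Return the first step d in {+/- Delta * e_i} with f(x + d) < Lambda_k,
    or the zero step after at most 2n failed trials (simplified sketch)."""
    n = x.size
    for i in range(n):
        for s in (+1.0, -1.0):
            d = np.zeros(n)
            d[i] = s * Delta
            if f(x + d) < Lambda_k:
                return d
    return np.zeros(n)
```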

At this point, to provide a more reliable comparison, demonstrate the overall behaviour of the present algorithms and gain more insight into the performance of the considered codes, the performance of all codes, based on both Ct (CPU time) and Nf (number of function evaluations) for the test functions listed in Table 1, is assessed in Figure 1 by means of the performance profiles proposed by Dolan and Moré [6]. Subfigures (a) and (b) of Figure 1 plot the function P(τ): [0, r_max] → ℝ_+, defined as

P(τ) := card({p ∈ P : r_{p,s} ≤ τ}) / card(P),   τ ≥ 1,

where P denotes the set of test problems, r_{p,s} denotes the ratio of the number of function evaluations (respectively, the CPU time) needed to solve problem p by method s to the least number of function evaluations (respectively, the least CPU time) needed to solve problem p, and r_max is the maximum value of r_{p,s}. The highest curve on the plot corresponds to the best solver.
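The profile P(τ) can be computed from a table of costs; a small sketch of the Dolan-Moré construction, assuming failures are recorded as infinite cost.

```python
import numpy as np

def performance_profile(T, taus):
    """T[p, s]: cost (Nf or CPU time) of solver s on problem p (np.inf = failure).
    Returns P(tau) for each solver: the fraction of problems with r_{p,s} <= tau."""
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    R = T / best                          # performance ratios r_{p,s}
    return np.array([[np.mean(R[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

T = np.array([[10., 12.], [20., 15.], [np.inf, 30.]])  # 3 problems, 2 solvers
P = performance_profile(T, taus=np.linspace(1.0, 3.0, 50))
```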

Table 1. List of test functions.

Problem name Dim Problem name Dim
Extended Powell badly scaled 2 Powell singular 4
Brent 3 Broyden banded 5
Seven-Diagonal System 7 Chebyquad 10
Extended Powell Singular 8 Brown almost linear 10
Tridiagonal exponential 10 Discrete integral equation 20
Generalized Broyden banded 10 Diag. func. premul. by … matrix 3
Flow in a channel 10 Function 18 3
Swirling flow 10 Strictly convex 2 5
Troesch 12 Strictly convex 1 5
Trig. exponential system 2 15 Zero Jacobian 5
Countercurrent reactors 1 16 Geometric 5
Countercurrent reactors 2 16 Extended Rosenbrock 6
Porous medium 16 Geometric programming 8
Trigonometric 20 Tridimensional valley 9
Singular Broyden 20 Chandrasekhar's H-equation 10
Broyden tridiagonal 20 Singular 10
Extended Wood 20 Logarithmic 10
Extended Cragg and Levy 24 Variable band 2 10
Trig. exponential system 1 25 Function 15 10
Structured Jacobian 25 Linear function-full rank 1 10
Discrete boundary value 25 Hanbook 10
Poisson 25 Variable band 1 15
Poisson 2 25 Linear function-full rank 2 20
Rosenbrock 2 Function 27 20
Powell badly scaled 2 Complementary 20
Helical valley 3 Function 21 21

Figure 1. A comparison among the proposed algorithms with the performance measures: (a) number of function evaluations (top); (b) CPU times (bottom).

Subfigure (a) of Figure 1 compares the algorithms with respect to the total number of function evaluations. It can easily be seen that NMSCS-N is the best algorithm in the sense of the most wins, on more than 50% of the test functions. To compare the CPU times, because of the variation in CPU time each problem was solved five times and the average of the CPU times was taken into account. Subfigure (b) of Figure 1 presents a comparison among the considered algorithms regarding CPU times. The results of this subfigure indicate that the performance of NMSCS-N is better than that of the other algorithms; in detail, the new algorithm is the best in more than 35% of all cases.

6. Concluding remarks

This paper proposes a new non-monotone coordinate search algorithm to solve systems of nonlinear equations. Our method can overcome some disadvantages of the method proposed by Ahookhosh and Amini [1] by presenting a new parameter, defined using a convex combination of the maximum function value of some preceding successful iterates and the current function value. This parameter prevents the production of a weaker non-monotone strategy whenever the iterates are far away from the optimizer, and of a stronger non-monotone strategy whenever the iterates are close to the optimizer. The global convergence properties of the proposed algorithms are established. Preliminary numerical results show the significant efficiency of the new algorithm.

Funding Statement

The second author acknowledges the financial support of the Doctoral Program ‘Vienna Graduate School on Computational Optimization’ funded by the Austrian Science Fund (FWF) under Project No. W1260-N35.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • [1].Ahookhosh M. and Amini K., An efficient nonmonotone trust-region method for unconstrained optimization, Numer. Algorithms 59 (2012), pp. 523–540. doi: 10.1007/s11075-011-9502-5 [DOI] [Google Scholar]
  • [2].Ahookhosh M., Amini K., and Kimiaei M., A globally convergent trust-region method for large-scale symmetric nonlinear systems, Numer. Funct. Anal. Optim. 36 (2015), pp. 830–855. doi: 10.1080/01630563.2015.1046080 [DOI] [Google Scholar]
  • [3].Ahookhosh M., Esmaeili H., and Kimiaei M., An effective trust-region-based approach for symmetric nonlinear systems, Int. J. Comput. Math. 90(3) (2013), pp. 671–690. doi: 10.1080/00207160.2012.736617 [DOI] [Google Scholar]
  • [4].Box G.E.P., Evolutionary operation: A method for increasing industrial productivity, Appl. Stat. 6 (1957), pp. 81–101. doi: 10.2307/2985505 [DOI] [Google Scholar]
  • [5].Dai Y.H., On the nonmonotone line search, J. Optim. Theory Appl. 112(2) (2002), pp. 315–330. doi: 10.1023/A:1013653923062 [DOI] [Google Scholar]
  • [6].Dolan E.D. and Moré J.J., Benchmarking optimization software with performance profiles, Math. Program. 91 (2002), pp. 201–213. doi: 10.1007/s101070100263 [DOI] [Google Scholar]
  • [7].Esmaeili H. and Kimiaei M., An improved adaptive trust-region method for unconstrained optimization, Math. Model. Anal. 19 (2014), pp. 469–490. doi: 10.3846/13926292.2014.956237 [DOI] [Google Scholar]
  • [8].Esmaeili H. and Kimiaei M., An efficient adaptive trust-region method for systems of nonlinear equations, Int. J. Comput. Math. 92 (2015), pp. 151–166. doi: 10.1080/00207160.2014.887701 [DOI] [Google Scholar]
  • [9].Esmaeili H. and Kimiaei M., A trust-region method with improved adaptive radius for systems of nonlinear equations, Math. Methods Oper. Res. 83(1) (2016), pp. 109–125. doi: 10.1007/s00186-015-0522-0 [DOI] [Google Scholar]
  • [10].Fan J.Y., Convergence rate of the trust region method for nonlinear equations under local error bound condition, Comput. Optim. Appl. 34 (2005), pp. 215–227. doi: 10.1007/s10589-005-3078-8 [DOI] [Google Scholar]
  • [11].Fan J. and Pan J., An improved trust region algorithm for nonlinear equations, Comput. Optim. Appl. 48(1) (2011), pp. 59–70. doi: 10.1007/s10589-009-9236-7 [DOI] [Google Scholar]
  • [12].Gasparo M.G., A nonmonotone hybrid method for nonlinear systems, Optim. Methods Softw. 13 (2000), pp. 79–94. doi: 10.1080/10556780008805776 [DOI] [Google Scholar]
  • [13].Gasparo M.G., Papini A., and Pasquali A., Nonmonotone algorithms for pattern search methods, Numer. Algorithms 28 (2001), pp. 171–186. doi: 10.1023/A:1014046817188 [DOI] [Google Scholar]
  • [14].Grippo L., Lampariello F., and Lucidi S., A nonmonotone line search technique for Newton's method, SIAM J. Numer. Anal. 23 (1986), pp. 707–716. doi: 10.1137/0723046 [DOI] [Google Scholar]
  • [15].Grippo L., Lampariello F., and Lucidi S., A truncated Newton method with nonmonotone line search for unconstrained optimization, J. Optim. Theory Appl. 60(3) (1989), pp. 401–419. doi: 10.1007/BF00940345 [DOI] [Google Scholar]
  • [16].Grippo L., Lampariello F., and Lucidi S., A class of nonmonotone stabilization methods in unconstrained optimization, Numer. Math. 59 (1991), pp. 779–805. doi: 10.1007/BF01385810 [DOI] [Google Scholar]
  • [17].Hooke R. and Jeeves T.A., Direct search solution of numerical and statistical problems, J. ACM 8 (1961), pp. 212–229. doi: 10.1145/321062.321069 [DOI] [Google Scholar]
  • [18].LaCruz W., Venezuela C., Martínez J.M., and Raydan M., Spectral residual method without gradient information for solving large-scale nonlinear systems of equations: Theory and experiments, Technical Report RT–04–08, July 2004.
  • [19].Lewis R.M. and Torczon V., Pattern search algorithms for bound constrained minimization, SIAM J. Optim. 9 (1999), pp. 1082–1099. doi: 10.1137/S1052623496300507 [DOI] [Google Scholar]
  • [20].Lewis R.M. and Torczon V., Pattern search methods for linearly constrained minimization, SIAM J. Optim. 10 (2000), pp. 917–941. doi: 10.1137/S1052623497331373 [DOI] [Google Scholar]
  • [21].Lewis R.M., Torczon V., and Trosset M.W., Why pattern search works, Optima (1998), pp. 1–7. [Google Scholar]
  • [22].Lewis R.M., Torczon V., and Trosset M.W., Direct search methods: Then and now, J. Comput. Appl. Math. 124 (2000), pp. 191–207. doi: 10.1016/S0377-0427(00)00423-4 [DOI] [Google Scholar]
  • [23].Lucidi S. and Sciandrone M., On the global convergence of derivative free methods for unconstrained optimization, Technical Report 32–96, DIS, Universita' di Roma ‘La Sapienza’, 1996.
  • [24].Lukšan L. and Vlček J., Sparse and partially separable test problems for unconstrained and equality constrained optimization, Technical Report No. 767, January 1999.
  • [25].Moré J.J., Garbow B.S., and Hillstrom K.E., Testing unconstrained optimization software, ACM Trans. Math. Softw. 7 (1981), pp. 17–41. doi: 10.1145/355934.355936 [DOI] [Google Scholar]
  • [26].Nocedal J. and Wright S.J., Numerical Optimization, Springer, New York, 2006. [Google Scholar]
  • [27].Shi Z.J. and Wang S., Modified nonmonotone Armijo line search for descent method, Numer. Algorithms 57(1) (2011), pp. 1–25. doi: 10.1007/s11075-010-9408-7 [DOI] [Google Scholar]
  • [28].Toint P.L., An assessment of nonmonotone linesearch techniques for unconstrained optimization, SIAM J. Sci. Comput. 17 (1996), pp. 725–739. doi: 10.1137/S106482759427021X [DOI] [Google Scholar]
  • [29].Torczon V., Multidirectional search: A direct search algorithm for parallel machines, Ph.D. thesis, Rice University, Houston, TX, 1989.
  • [30].Torczon V., On the convergence of the multidirectional search algorithm, SIAM J. Optim. 1 (1991), pp. 123–145. doi: 10.1137/0801010 [DOI] [Google Scholar]
  • [31].Torczon V., On the convergence of pattern search algorithms, SIAM J. Optim. 7 (1997), pp. 1–25. doi: 10.1137/S1052623493250780 [DOI] [Google Scholar]
  • [32].Yuan G.L. and Lu X.W., A new backtracking inexact BFGS method for symmetric nonlinear equations, Comput. Math. Appl. 55 (2008), pp. 116–129. doi: 10.1016/j.camwa.2006.12.081 [DOI] [Google Scholar]
  • [33].Yuan G.L. and Zhang M.J., A three-terms Polak–Ribière–Polyak conjugate gradient algorithm for large-scale nonlinear equations, J. Comput. Appl. Math. 286 (2015), pp. 186–195. doi: 10.1016/j.cam.2015.03.014 [DOI] [Google Scholar]
  • [34].Yuan G.L., Lu S., and Wei Z., A new trust-region method with line search for solving symmetric nonlinear equations, Int. J. Comput. Math. 88(10) (2011), pp. 2109–2123. doi: 10.1080/00207160.2010.526206 [DOI] [Google Scholar]
  • [35].Yuan G.L., Meng Z.H., and Li Y., A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations, J. Optim. Theory Appl. 168 (2016), pp. 129–152. doi: 10.1007/s10957-015-0781-1 [DOI] [Google Scholar]
  • [36].Yuan G.L., Lu X.W., and Wei Z.X., BFGS trust-region method for symmetric nonlinear equations, J. Comput. Appl. Math. 230 (2009), pp. 44–58. doi: 10.1016/j.cam.2008.10.062 [DOI] [Google Scholar]
  • [37].Yuan G.L., Wei Z.X., and Lu X.W., A BFGS trust-region method for nonlinear equations, Computing 92(4) (2011), pp. 317–333. doi: 10.1007/s00607-011-0146-z [DOI] [Google Scholar]
  • [38].Zhang H.C. and Hager W.W., A nonmonotone line search technique and its application to unconstrained optimization, SIAM J. Optim. 14(4) (2004), pp. 1043–1056. doi: 10.1137/S1052623403428208 [DOI] [Google Scholar]
  • [39].Zhang J. and Wang Y., A new trust region method for nonlinear equations, Math. Methods Oper. Res. 58 (2003), pp. 283–298. doi: 10.1007/s001860300302 [DOI] [Google Scholar]
