Heliyon. 2021 Jul 7;7(7):e07499. doi: 10.1016/j.heliyon.2021.e07499

Alternative structured spectral gradient algorithms for solving nonlinear least-squares problems

Mahmoud Muhammad Yahaya a,c, Poom Kumam a,c,, Aliyu Muhammed Awwal a,b,d, Sani Aji a,d
PMCID: PMC8319484  PMID: 34345725

Abstract

The study of efficient iterative algorithms for addressing nonlinear least-squares (NLS) problems is of great importance. NLS problems, which belong to a special class of unconstrained optimization problems, are of particular interest because of the special structure of their gradients and Hessians. In this paper, based on the spectral parameters of Barzilai and Borwein (1988), we propose three structured spectral gradient algorithms for solving NLS problems. Each spectral parameter in the respective algorithms incorporates the structured gradient and the information gained from the structured Hessian approximation. Moreover, we develop a safeguarding technique for the first two structured spectral parameters to avoid negative curvature directions. Furthermore, using a nonmonotone line-search strategy, we show that the proposed algorithms are globally convergent under some standard conditions. Comparative computational results on some standard test problems show that the proposed algorithms are efficient.

Keywords: Iterative algorithm, Spectral gradient algorithm, Nonlinear least squares, Line–search, Quasi–Newton algorithm



1. Introduction

Consider the general unconstrained optimization problem:

$\min \{ f(x) : x \in \mathbb{R}^n \}$,  (1.1)

where $f : \mathbb{R}^n \to \mathbb{R}$ is assumed to be a twice continuously differentiable function that is bounded below. Popular iterative algorithms, such as Newton's algorithm and quasi-Newton algorithms, generate a sequence of iterates $\{x_k\} \subset \mathbb{R}^n$ that eventually converges to a solution of problem (1.1) using the following recurrence relation

$x_{k+1} = x_k + \alpha_k d_k$,  (1.2)

where the scalar $\alpha_k > 0$ is a step size usually computed through a suitable line-search strategy, while the vector $d_k$ is a search direction obtained using different types of approaches. One of the famous and successful approaches for calculating the search direction $d_k$ is the quasi-Newton approach, defined as follows:

$d_k = -B_k^{-1} g_k$, $B_k \approx H_k$,  or  $d_k = -Q_k g_k$, $Q_k \approx H_k^{-1}$,  (1.3)

where $H_k$ and $g_k$ are the Hessian matrix and the gradient of $f$ at $x_k$, respectively, and $B_0 = I$, where $I$ is the identity matrix. The approximations $B_k$ and $Q_k$ are usually required to satisfy the following secant equations

$B_k s_{k-1} = y_{k-1}, \qquad Q_k y_{k-1} = s_{k-1}$,  (1.4)

where $s_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = g_k - g_{k-1}$.

Researchers developed quasi-Newton algorithms to address the high computational cost of computing the exact Hessian matrix $H_k$ in the classical Newton's algorithm, where the second derivative of $f$ at $x_k$ must be evaluated in every iteration. However, most variants of quasi-Newton algorithms still need to form and store $n \times n$ matrices in every iteration. This also makes those algorithms computationally expensive and unsuitable for large-scale problems. Therefore, algorithms that do not require the storage of any matrix are a better alternative.

One such matrix-free algorithm is the Barzilai and Borwein (BB) algorithm [1]. This algorithm uses (1.2) to update its sequence of iterates, where the search direction is given by $d_k = -g_k$. The scalar $\alpha_k$ is determined as follows. Let $D_k = \alpha_k^{-1} I$, where $I$ is the identity matrix, be a diagonal approximation of the Hessian matrix; the matrix $D_k$ is supposed to satisfy the quasi-Newton equation (1.4). However, when $x_k$ has more than one entry, it is usually impossible to find an $\alpha_k$ such that $D_k = \alpha_k^{-1} I$ satisfies the secant equation (1.4) exactly. As a result, Barzilai and Borwein required $D_k$ to satisfy the quasi-Newton equation (1.4) approximately, by finding the $\alpha_k \in \mathbb{R}$ that minimizes the following least-squares problem

$\min_{\alpha} \frac{1}{2}\|\alpha^{-1} s_{k-1} - y_{k-1}\|^2$.  (1.5)

The solution of problem (1.5) is as follows

$\alpha_k^{BB1} = \dfrac{\|s_{k-1}\|^2}{s_{k-1}^T y_{k-1}}$.  (1.6)

Similarly, we can obtain another choice of $\alpha_k$ by minimizing:

$\min_{\alpha} \frac{1}{2}\|s_{k-1} - \alpha y_{k-1}\|^2$,

whose solution is

$\alpha_k^{BB2} = \dfrac{s_{k-1}^T y_{k-1}}{\|y_{k-1}\|^2}$.  (1.7)

It is pertinent to point out here that, for general unconstrained optimization problems, the BB algorithm with the spectral parameter (1.7) performs numerically better than with the spectral parameter (1.6) on some problems (see [2]). However, despite the simplicity and good performance of the BB algorithm, the spectral parameters may produce negative values for non-convex objective functions [2]. To overcome this, Raydan [3] restricted the spectral parameter to the interval $[10^{-30}, 10^{30}]$. However, this interval looks artificial, and therefore a geometric mean of (1.6) and (1.7) was proposed and analyzed by Dai et al. [2]. This geometric mean is given by:

$\alpha_k = \dfrac{\|s_{k-1}\|}{\|y_{k-1}\|}$.  (1.8)
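For illustration, the snippet below is a minimal Python sketch of how the two BB step sizes (1.6)-(1.7) and the geometric mean (1.8) can be computed from the difference vectors; the function name `bb_stepsizes` and the tiny quadratic used in the check are our own illustrative choices, not part of the original paper.

```python
import numpy as np

def bb_stepsizes(s, y, eps=1e-12):
    """Return (alpha_BB1, alpha_BB2, geometric mean) from s = x_k - x_{k-1}
    and y = g_k - g_{k-1}; None is returned when a denominator is tiny."""
    sty, yty = float(s @ y), float(y @ y)
    alpha_bb1 = float(s @ s) / sty if abs(sty) > eps else None   # (1.6)
    alpha_bb2 = sty / yty if yty > eps else None                 # (1.7)
    alpha_geo = float(np.linalg.norm(s) / np.linalg.norm(y))     # (1.8)
    return alpha_bb1, alpha_bb2, alpha_geo

# quick check on the quadratic f(x) = 0.5 * x^T diag(1, 10) x with gradient diag(1, 10) x
grad = lambda x: np.array([1.0, 10.0]) * x
x_prev, x_curr = np.array([1.0, 1.0]), np.array([0.9, 0.5])
print(bb_stepsizes(x_curr - x_prev, grad(x_curr) - grad(x_prev)))
```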

This paper deals with a special class of problem (1.1), called "nonlinear least-squares problems." The problem is defined as follows:

$\min_{x \in \mathbb{R}^n} f(x), \qquad f(x) = \frac{1}{2}\|R(x)\|^2$,  (1.9)

where $R : \mathbb{R}^n \to \mathbb{R}^m$ (usually $m \ge n$) is continuous and bounded below, $R(x) = (R_1(x), \ldots, R_m(x))^T$, each $R_j : \mathbb{R}^n \to \mathbb{R}$, $j = 1, 2, \ldots, m$, is a twice differentiable function, and $\|\cdot\|$ is the Euclidean norm.

The gradient $\nabla f(x)$ and the Hessian $\nabla^2 f(x)$ of the objective function (1.9) have special structures, which are respectively given by:

$\nabla f(x) = \sum_{j=1}^{m} R_j(x) \nabla R_j(x) = J(x)^T R(x)$,  (1.10)
$\nabla^2 f(x) = \sum_{j=1}^{m} \nabla R_j(x) \nabla R_j(x)^T + \sum_{j=1}^{m} R_j(x) \nabla^2 R_j(x) = J(x)^T J(x) + P(x)$,  (1.11)

where $J(x) = R'(x)$ is the Jacobian matrix of the residual function and $P(x) = \sum_{j=1}^{m} R_j(x) S_j(x)$, with $R_j(x)$ the $j$-th component of the residual vector $R(x)$ and $S_j(x)$ the Hessian matrix of $R_j(x)$.
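To make the structure in (1.10) concrete, the following sketch evaluates the objective (1.9) and its structured gradient $J(x)^T R(x)$ for a small, hypothetical residual function; the residual and its Jacobian below are illustrative assumptions and are not one of the benchmark problems used later.

```python
import numpy as np

def residual(x):
    """Hypothetical residual R: R^2 -> R^3, chosen only for illustration."""
    return np.array([x[0] - 1.0,
                     x[1] - x[0] ** 2,
                     np.exp(x[0]) - x[1]])

def jacobian(x):
    """Analytic Jacobian J(x) of the residual above."""
    return np.array([[1.0,           0.0],
                     [-2.0 * x[0],   1.0],
                     [np.exp(x[0]), -1.0]])

def f_and_grad(x):
    R, J = residual(x), jacobian(x)
    f = 0.5 * float(R @ R)   # objective (1.9)
    g = J.T @ R              # structured gradient (1.10)
    return f, g

print(f_and_grad(np.array([0.5, 0.5])))
```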

The algorithms for solving (1.9) include Newton-like approaches such as the Gauss-Newton algorithm (GN), the Levenberg-Marquardt algorithm (LM), and structured quasi-Newton algorithms. Trust-region algorithms are another class of algorithms for solving (1.9). These algorithms do not use a line search; instead, they generate steps using a quadratic model of the objective function. Variations of these algorithms include the quasi-Newton trust-region algorithm proposed by Sun et al. [4] and the adaptive trust-region algorithm developed by Sheng et al. [5]. For details about these algorithms, the interested reader may refer to the surveys of methods for solving nonlinear least-squares problems by Mohammad et al. [6] and Yuan [7].

Gauss-Newton and Levenberg-Marquardt algorithms are efficient for solving small-scale problems; however, they tend to perform poorly when applied to non-zero residual problems [8]. As a result of this shortcoming, Brown and Dennis [9] introduced the structured quasi-Newton (SQN) algorithm, which combines GN and quasi-Newton steps to exploit the structure of the Hessian of the objective function (1.9). The SQN algorithm shows remarkable numerical improvement over the GN and LM algorithms (see [10], [11], [12]). However, SQN algorithms need to compute and store matrices in each iteration, which limits their performance on large-scale problems. Consequently, structured matrix-free algorithms for solving (1.9) are preferable [13], [14], [15]. For instance, Kobayashi et al. [16] introduced a structured matrix-free algorithm that uses a conjugate gradient direction to solve large-scale nonlinear least-squares problems. Their algorithm incorporated approaches such as GN, LM, and SQN into the Dai and Liao conjugate gradient algorithm [17]. They showed the global convergence of the algorithm under some standard assumptions.

In a different approach, Mohammad and Waziri [18] proposed two BB-like algorithms for solving nonlinear least-squares problems. The two algorithms update their search directions by incorporating a structured vector, which approximately satisfies the structured secant equation, into the BB spectral parameters (1.7) and (1.8). However, they derived their structured vector by approximating both the first and the second term of the Hessian matrix (1.11), which we believe may lead to some loss of Hessian information. In this paper, we propose three alternative matrix-free algorithms with a different structured vector, obtained by retaining the first term of (1.11) and approximating only its second term, since computing the exact second term is computationally expensive. As a result, our proposed search directions carry more information about the Hessian of the objective function. Moreover, to avoid negative curvature directions, we provide a safeguarding technique for the first two of the proposed spectral parameters whenever they are nonpositive at a particular iteration. Our modification improves the numerical performance of the algorithms of Mohammad and Waziri [18]. The numerical experiments in Section 4 support this claim.

The remainder of the paper is organized as follows. In Section 2, we present the formulation of the proposed algorithms. We establish the global convergence of the proposed algorithms in Section 3. In Section 4, we present numerical experiments. Finally, we give some conclusions in Section 5.

2. Formulation of the three spectral parameters and their algorithms

In this section, we derive the main components of the proposed algorithms. Consider the second term, $P(x)$, of the Hessian matrix (1.11) at a certain iteration, say $k-1$, for $k > 0$, i.e.,

$P(x_{k-1}) = \sum_{j=1}^{m} R_j(x_{k-1}) \nabla^2 R_j(x_{k-1})$.  (2.1)

We wish to derive a structured vector, say $\gamma_{k-1}$, in such a way that the matrix $P(x_k)$ satisfies the following secant equation:

$P(x_k) s_{k-1} = \gamma_{k-1}$,  (2.2)

where $s_{k-1} = x_k - x_{k-1}$. Using Taylor's series expansion of $\nabla R_j(x_{k-1})$ about $x_k$ and simplifying gives

$\nabla R_j(x_{k-1}) \approx \nabla R_j(x_k) - \nabla^2 R_j(x_k)^T s_{k-1}, \qquad j = 1, 2, 3, \ldots, m$,  (2.3)

therefore, pre-multiplying (2.3) by $R_j(x_k)$ and simplifying gives

$R_j(x_k) \nabla^2 R_j(x_k)^T s_{k-1} \approx R_j(x_k) \nabla R_j(x_k) - R_j(x_k) \nabla R_j(x_{k-1})$.  (2.4)

Summing both sides of (2.4) over $j = 1, 2, 3, \ldots, m$ and using (2.1) gives

$P(x_k) s_{k-1} \approx (J(x_k) - J(x_{k-1}))^T R(x_k)$.  (2.5)

Multiplying equation (1.11) by $s_{k-1}$ and substituting (2.5) into it, we have

$\nabla^2 f(x_k) s_{k-1} \approx J(x_k)^T J(x_k) s_{k-1} + (J(x_k) - J(x_{k-1}))^T R(x_k)$.  (2.6)

Suppose $D_k \approx \nabla^2 f(x_k)$, such that

$D_k s_{k-1} \approx \gamma_{k-1}$,  (2.7)

is satisfied. Then from (2.6), we have

$\gamma_{k-1} = J(x_k)^T J(x_k) s_{k-1} + (J(x_k) - J(x_{k-1}))^T R(x_k)$.  (2.8)

In a similar manner to the classical BB method, suppose $D_k = \alpha_k^{-1} I$; we require $D_k$ to approximately satisfy the above modified secant equation (2.7) by finding the $\alpha_k \in \mathbb{R}$ that minimizes the following least-squares problem

$\min_{\alpha} \frac{1}{2}\|\alpha^{-1} s_{k-1} - \gamma_{k-1}\|^2$.  (2.9)

The solution of problem (2.9) is as follows

$\alpha_k^{M1} = \dfrac{\|s_{k-1}\|^2}{s_{k-1}^T \gamma_{k-1}}$.  (2.10)

Similarly, another choice of $\alpha_k$ is obtained by minimizing:

$\min_{\alpha} \frac{1}{2}\|s_{k-1} - \alpha \gamma_{k-1}\|^2$,

which has the following solution:

$\alpha_k^{M2} = \dfrac{s_{k-1}^T \gamma_{k-1}}{\|\gamma_{k-1}\|^2}$.  (2.11)

The geometric mean of the $\alpha_k^{M1}$ and $\alpha_k^{M2}$ parameters is given by:

$\alpha_k^{M3} = \dfrac{\|s_{k-1}\|}{\|\gamma_{k-1}\|}$.  (2.12)

To build the proposed algorithms, we define the search directions using the above spectral parameters in equations (2.10), (2.11) and (2.12) as follows

$d_k^{M1} = -\alpha_k^{M1} g_k, \qquad d_k^{M2} = -\alpha_k^{M2} g_k, \qquad d_k^{M3} = -\alpha_k^{M3} g_k$,  (2.13)

where

$\alpha_k^{M1} = \dfrac{\|s_{k-1}\|^2}{s_{k-1}^T \gamma_{k-1}}, \qquad \alpha_k^{M2} = \dfrac{s_{k-1}^T \gamma_{k-1}}{\|\gamma_{k-1}\|^2}, \qquad \alpha_k^{M3} = \dfrac{\|s_{k-1}\|}{\|\gamma_{k-1}\|}$.  (2.14)

In the case of the search directions $d_k^{M1}$ and $d_k^{M2}$, negative curvature directions can be avoided by keeping $s_{k-1}^T \gamma_{k-1}$ strictly positive. As a result, a suitable safeguarding parameter, updated in every iteration, is usually used to replace $s_{k-1}^T \gamma_{k-1}$ whenever $s_{k-1}^T \gamma_{k-1} \le 0$. For example, Luengo and Raydan [19] developed a retarding technique: if $s_{k-1}^T \gamma_{k-1} \le 0$, then $\alpha_k = \vartheta \alpha_{k-1}$, where $\vartheta$ is a suitable positive constant. Recently, Mohammad and Santos [20] presented another safeguarding parameter, $\eta_k = \max\{\vartheta \alpha_{k-1},\, s_{k-1}^T \gamma_{k-1} + \|s_{k-1}\|\,\|\gamma_{k-1}\|\}$, to replace $s_{k-1}^T \gamma_{k-1}$ whenever it is nonpositive.

Similarly, we propose another safeguarding technique for the spectral parameters in order to avoid negative curvature directions. Clearly, $\alpha_k^{M1}$ and $\alpha_k^{M2}$ will be nonpositive only if $s_{k-1}^T \gamma_{k-1} \le 0$. Therefore, we present the safeguarding technique as follows:

Since it holds that $s_{k-1}^T \gamma_{k-1} \le 0.5(\|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2)$, whenever $s_{k-1}^T \gamma_{k-1} \le 0$ in $\alpha_k^{M1}$ or $\alpha_k^{M2}$ we replace it with the following parameter

$\eta_k = \max\{\vartheta \alpha_{k-1},\, \|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2\}$,  (2.15)

where $\vartheta$ is a positive constant. In the case of $\alpha_k^{M3}$, computing $s_{k-1}^T \gamma_{k-1}$ is not required, and as a result $\alpha_k^{M3}$ is expected to perform reasonably well. The preliminary numerical experiments we conducted support this expectation.
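Putting (2.8), (2.14) and (2.15) together, the following hedged Python sketch computes the structured vector and a safeguarded spectral parameter; how the previous $\alpha_{k-1}$, the constant $\vartheta$ and the choice among M1, M2 and M3 are passed in is our own arrangement, and the function names are illustrative.

```python
import numpy as np

def structured_gamma(J_curr, J_prev, R_curr, s):
    """Structured vector gamma_{k-1} of eq. (2.8)."""
    return J_curr.T @ (J_curr @ s) + (J_curr - J_prev).T @ R_curr

def spectral_parameter(s, gamma, alpha_prev, choice="M3", vartheta=1000.0):
    """Spectral parameters (2.14) with the safeguard (2.15) applied to M1 and M2."""
    if choice == "M3":                           # geometric mean (2.12): no s^T gamma needed
        return np.linalg.norm(s) / np.linalg.norm(gamma)
    stg = float(s @ gamma)
    if stg <= 0.0:                               # negative curvature: replace s^T gamma by (2.15)
        stg = max(vartheta * alpha_prev, float(s @ s) + float(gamma @ gamma))
    if choice == "M1":
        return float(s @ s) / stg                # (2.10)
    return stg / float(gamma @ gamma)            # choice "M2", (2.11)
```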

To ensure the global convergence of the proposed algorithms, we adopt the nonmonotone line search of Zhang and Hager [21], which we describe as follows. Suppose a direction $d_k$ (that is, $d_k^{M1}$, $d_k^{M2}$ or $d_k^{M3}$) is a sufficient descent direction; then a step length $h$ is determined such that it satisfies the following nonmonotone Armijo-type line-search condition:

$f(x_k + h d_k) \le U_k + \delta h g_k^T d_k, \qquad \delta \in (0,1), \quad g_k = J_k^T R_k$,  (2.16)

where,

$U_0 = f(x_0), \qquad U_{k+1} = \dfrac{\mu_k W_k U_k + f(x_{k+1})}{W_{k+1}}, \qquad W_0 = 1, \qquad W_{k+1} = \mu_k W_k + 1$,  (2.17)

and $J_k = J(x_k)$, $R_k = R(x_k)$.

The degree of nonmonotonicity is controlled by $\mu_k$. If $\mu_k = 0$ for all $k$, then the nonmonotone line search (2.16) reduces to the standard (monotone) Armijo-type line search.
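A minimal Python sketch of the nonmonotone Armijo backtracking (2.16) together with the reference-value update (2.17) is given below. The backtracking factor, the initial trial step and the use of a fixed $\mu_k$ are assumptions made only for illustration, since the paper merely confines $\mu_k$ to $[\mu_{\min}, \mu_{\max}]$.

```python
def nonmonotone_armijo(f, x, U_k, d, g_dot_d, delta=1e-4, shrink=0.5,
                       h0=1.0, max_back=30):
    """Backtrack until f(x + h d) <= U_k + delta * h * g_k^T d_k, cf. (2.16)."""
    h = h0
    for _ in range(max_back):
        if f(x + h * d) <= U_k + delta * h * g_dot_d:
            return h
        h *= shrink
    return h  # smallest trial step if the test never passed

def update_reference(U, W, f_new, mu=0.5):
    """Map (U_k, W_k) to (U_{k+1}, W_{k+1}) via (2.17) for a given mu_k."""
    W_new = mu * W + 1.0
    U_new = (mu * W * U + f_new) / W_new
    return U_new, W_new
```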

Remark 2.1

[21] The sequence $\{U_k\}$ lies between $f(x_k)$ and $C_k$, where

$C_k := \dfrac{1}{k+1}\sum_{j=0}^{k} f(x_j), \qquad k \ge 0$.  (2.18)

Next, we outline the steps of the proposed alternative structured spectral algorithms for solving nonlinear least-squares problems in Algorithm 1.

Remark 2.2

It is important to point out that the above algorithm comprises three different algorithms in one; different choices of the parameters $\alpha_k^{M1}$, $\alpha_k^{M2}$ and $\alpha_k^{M3}$ correspond to different algorithms. When $\alpha_k^{M1}$, $\alpha_k^{M2}$ or $\alpha_k^{M3}$ is used, we denote the algorithm by ASSA1, ASSA2 or ASSA3, respectively. Also, for each problem considered, we write MATLAB code for the structured gradient $g_k$ and the structured vector $\gamma_{k-1}$ in such a way that their matrix-vector product components are computed directly, without explicitly forming or storing a matrix throughout the iteration process.
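To illustrate the matrix-free evaluation described in Remark 2.2, the sketch below assembles $g_k = J_k^T R_k$ and $\gamma_{k-1}$ from Jacobian-vector and transposed-Jacobian-vector products only, so no $m \times n$ matrix is ever stored. The forward-difference approximation of these products is our own illustrative substitute for the hand-coded, problem-specific expressions used by the authors.

```python
import numpy as np

def jvp(residual, x, v, eps=1e-8):
    """Forward-difference approximation of the Jacobian-vector product J(x) v."""
    return (residual(x + eps * v) - residual(x)) / eps

def vjp(residual, x, w, eps=1e-8):
    """Approximate J(x)^T w one component at a time, never forming J."""
    out = np.empty(x.size)
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = 1.0
        out[i] = jvp(residual, x, e, eps) @ w   # column i of J dotted with w
    return out

def structured_quantities(residual, x_curr, x_prev):
    """Matrix-free g_k = J_k^T R_k and gamma_{k-1} of eq. (2.8)."""
    s = x_curr - x_prev
    R_curr = residual(x_curr)
    g = vjp(residual, x_curr, R_curr)            # structured gradient (1.10)
    Js = jvp(residual, x_curr, s)                # J_k s_{k-1}
    gamma = vjp(residual, x_curr, Js) + g - vjp(residual, x_prev, R_curr)
    return g, gamma
```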

3. Convergence analysis

To discuss the global convergence of the proposed algorithms, we first state some valuable assumptions as follows:

Assumption 3.1

The level set $\ell = \{x \in \mathbb{R}^n \,|\, f(x) \le f(x_0)\}$ is bounded, i.e., there exists a positive constant $\nu$ such that $\|x\| \le \nu$ for all $x \in \ell$.

Assumption 3.2

The Jacobian matrix $J(x) = \nabla R(x)^T$ is Lipschitz continuous on some neighborhood $N$ of $\ell$ with Lipschitz constant $p_1$, i.e., $\|J(x) - J(y)\| \le p_1 \|x - y\|$ for all $x, y \in N$.

It can be deduced from Assumption 3.2 above that there exist positive constants $p_2, p_3, q_1, q_2, q_3$ such that, for every $x, y \in \ell$,

$\|R(x) - R(y)\| \le p_2 \|x - y\|, \quad \|g(x) - g(y)\| \le p_3 \|x - y\|, \quad \|J(x)\| < q_1, \quad \|R(x)\| < q_2, \quad \|g(x)\| \le q_3$.

Lemma 3.3

Suppose Assumptions 3.1 and 3.2 hold. Then there exists a positive constant $L$ such that, for all $k > 0$,

$\|\gamma_{k-1}\| \le L \|s_{k-1}\|$.  (3.1)

Proof

$\|\gamma_{k-1}\| = \|J(x_k)^T J(x_k) s_{k-1} + (J(x_k) - J(x_{k-1}))^T R(x_k)\|$
$\le \|J(x_k)^T J(x_k) s_{k-1}\| + \|(J(x_k) - J(x_{k-1}))^T R(x_k)\|$  (using the triangle inequality)
$\le \|J(x_k)\|^2 \|s_{k-1}\| + \|J(x_k) - J(x_{k-1})\|\,\|R(x_k)\|$  (using the matrix norm property)
$\le q_1^2 \|s_{k-1}\| + p_1 \|x_k - x_{k-1}\|\,\|R(x_k)\|$
$\le q_1^2 \|s_{k-1}\| + p_1 q_2 \|s_{k-1}\| = (q_1^2 + p_1 q_2)\|s_{k-1}\|$.

Hence, by setting $L := q_1^2 + p_1 q_2$, the inequality (3.1) holds. □

Lemma 3.4

The parameter $\zeta_k$ defined by (2.20) is well-defined and bounded for every $k \ge 0$.

Proof

For $k = 0$, $\zeta_0 = \alpha_0 = 1$.

Now, for $k \ge 1$, we consider the following three choices of $\alpha_k$.

Choice 1: $\alpha_k = \alpha_k^{M1} = \frac{\|s_{k-1}\|^2}{s_{k-1}^T \gamma_{k-1}}$. Clearly, if $s_{k-1}^T \gamma_{k-1}$ is positive, then $\alpha_k > 0$ and as a result $\zeta_k > 0$. Otherwise, set $\alpha_k = \frac{\|s_{k-1}\|^2}{\max\{\vartheta \alpha_{k-1},\, \|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2\}}$. If $\vartheta \alpha_{k-1} > \|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2$, then it is obvious that $\alpha_k > 0$, which means $\zeta_k > 0$. Otherwise, if $\|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2 \ge \vartheta \alpha_{k-1}$, then we have

$\alpha_k = \dfrac{\|s_{k-1}\|^2}{\|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2} \ge \dfrac{\|s_{k-1}\|^2}{\|s_{k-1}\|^2 + L^2\|s_{k-1}\|^2} = \dfrac{\|s_{k-1}\|^2}{(L^2+1)\|s_{k-1}\|^2} = \dfrac{1}{L^2+1} > 0$,

where the inequality follows from (3.1).

Choice 2: $\alpha_k = \alpha_k^{M2} = \frac{s_{k-1}^T \gamma_{k-1}}{\|\gamma_{k-1}\|^2}$.

If $s_{k-1}^T \gamma_{k-1} > 0$, then $\alpha_k > 0$ and consequently $\zeta_k > 0$. Otherwise, we set $\alpha_k = \frac{\max\{\vartheta \alpha_{k-1},\, \|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2\}}{\|\gamma_{k-1}\|^2}$.

If $\vartheta \alpha_{k-1} > \|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2$, then $\alpha_k > 0$, which means $\zeta_k > 0$.

Otherwise, if $\|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2 \ge \vartheta \alpha_{k-1}$, then

$\alpha_k = \dfrac{\|s_{k-1}\|^2 + \|\gamma_{k-1}\|^2}{\|\gamma_{k-1}\|^2} \ge \dfrac{\frac{1}{L^2}\|\gamma_{k-1}\|^2 + \|\gamma_{k-1}\|^2}{\|\gamma_{k-1}\|^2} = \dfrac{\big(\frac{1}{L^2}+1\big)\|\gamma_{k-1}\|^2}{\|\gamma_{k-1}\|^2} = \dfrac{1}{L^2} + 1 > 0$,

where the inequality follows from (3.1).

Choice 3:

$\alpha_k = \alpha_k^{M3} = \dfrac{\|s_{k-1}\|}{\|\gamma_{k-1}\|} \ge \dfrac{\|s_{k-1}\|}{L\|s_{k-1}\|} = \dfrac{1}{L} > 0$,

where the inequality follows from (3.1). Therefore, from Choices 1, 2 and 3 above, it follows that $\alpha_k$ is well-defined for all three choices; thus $\zeta_k$ is well-defined. Also, from the definition of $\zeta_k$ in (2.20), it follows that, for all $k \ge 0$, $\zeta_k$ is bounded above by $\zeta_{\max}$ and below by $\zeta_{\min}$.

Hence, combining all the cases, we have

$\max\Big\{\dfrac{1}{L^2+1},\, \zeta_{\min}\Big\} \le \zeta_k \le \zeta_{\max}$.

 □

Lemma 3.5

Suppose the sequence $\{x_k\}$ and the search directions $\{d_k\}$ are generated by Algorithm 1, and let $c_1$ and $c_2$ be two positive constants. Then for all $k \ge 0$, the following inequalities hold:

  • (a) $g_k^T d_k \le -c_1 \|g_k\|^2$,

  • (b) $\|d_k\| \le c_2 \|g_k\|$,

  • (c) $f_k \le U_k \le C_k$.

Proof

For inequality (a), suppose $d_k$ is defined by (2.19); then

$g_k^T d_k = -\zeta_k \|g_k\|^2 \le -\max\Big\{\dfrac{1}{L^2+1},\, \zeta_{\min}\Big\} \|g_k\|^2$,  (3.2)

and the result follows by setting $c_1 = \max\{\frac{1}{L^2+1}, \zeta_{\min}\}$. The inequality in (b) follows from Lemma 3.4. To show inequality (c), let $t \ge 0$ and define $\Phi : \mathbb{R} \to \mathbb{R}$ by

$\Phi(t) = \dfrac{t U_{k-1} + f(x_k)}{t + 1}$.  (3.3)

Differentiating (3.3) with respect to $t$ gives

$\dfrac{d\Phi}{dt} = \dfrac{U_{k-1} - f(x_k)}{(t+1)^2}$.  (3.4)

By the relation $g_k^T d_k \le -c_1 \|g_k\|^2$ for all $k$, we have from (2.16) that

$f(x_k) = f(x_{k-1} + h d_{k-1}) \le U_{k-1} + \delta h g_{k-1}^T d_{k-1} \le U_{k-1} - c_1 \delta h \|g_{k-1}\|^2 \le U_{k-1}$.

This means that $\frac{d\Phi}{dt} \ge 0$ for all $t \ge 0$, which implies that $\Phi$ is nondecreasing. Hence (3.3) satisfies $f(x_k) = \Phi(0) \le \Phi(t)$ for all $t \ge 0$. In particular, taking $t = \mu_{k-1} W_{k-1}$ gives

$f(x_k) = \Phi(0) \le \Phi(\mu_{k-1} W_{k-1}) = \dfrac{\mu_{k-1} W_{k-1} U_{k-1} + f(x_k)}{\mu_{k-1} W_{k-1} + 1} = \dfrac{\mu_{k-1} W_{k-1} U_{k-1} + f(x_k)}{W_k} = U_k$ (from (2.17)).  (3.5)

Hence, the lower bound on $U_k$ is established.

Next, we show $U_k \le C_k$ by induction. For $k = 0$, from (2.17) we have $U_0 = C_0 = f(x_0)$. Now, suppose that $U_j \le C_j$ for all $0 \le j < k$. Since $\mu_k \in [0,1]$ and $W_0 = 1$, by (2.17) we have

$W_{j+1} = 1 + \sum_{i=0}^{j} \prod_{l=0}^{i} \mu_{j-l} \le j + 2$.  (3.6)

Combining (3.3), (3.5) and (3.6), we have,

$U_k = \Phi(W_k - 1) = \Phi(\mu_{k-1} W_{k-1}) = \Phi\Big(\sum_{i=0}^{k-1} \prod_{n=0}^{i} \mu_{k-n-1}\Big) \le \Phi(k)$.

By the induction hypothesis, we obtain

$\Phi(k) = \dfrac{k U_{k-1} + f(x_k)}{k+1} \le \dfrac{k C_{k-1} + f(x_k)}{k+1} = C_k$.  (3.7)

Hence it holds that $U_k \le C_k$. □

Theorem 3.6

Assume $f(x)$ is given by (1.9) and Assumptions 3.1 and 3.2 hold. Then the sequence of iterates $\{x_k\}$ generated by Algorithm 1 is contained in the level set $\ell$ and

$\liminf_{k \to \infty} \|g_k\| = 0$.  (3.8)

Furthermore, if $\mu_{\max} < 1$, then

$\lim_{k \to \infty} \|g_k\| = 0$.  (3.9)

Proof

The proof follows from [21]. □

Algorithm 1. Alternative structured spectral algorithm (ASSA). (Pseudocode presented as a figure in the original article.)
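Since the pseudocode of Algorithm 1 is presented as a figure, the driver below is our own reconstruction assembled from the pieces described in Section 2: the search direction (2.13), the safeguarded parameters (2.14)-(2.15), and the nonmonotone line search (2.16)-(2.17). The projection of $\alpha_k$ onto $[\zeta_{\min}, \zeta_{\max}]$ is an assumption standing in for the quantity $\zeta_k$ of (2.19)-(2.20), which is not reproduced here, and the constant $\mu_k = 0.5$ and the backtracking factor are likewise illustrative.

```python
import numpy as np

def assa(residual, jacobian, x0, choice="M3", tol=1e-4, max_iter=1000,
         zeta_min=1e-30, zeta_max=1e30, delta=1e-4, mu=0.5, vartheta=1000.0):
    """Reconstructed ASSA driver (illustrative sketch, not the authors' MATLAB code)."""
    f = lambda z: 0.5 * float(residual(z) @ residual(z))       # objective (1.9)
    x = np.asarray(x0, dtype=float)
    R, J = residual(x), jacobian(x)
    g = J.T @ R                                                # structured gradient (1.10)
    U, W, alpha = f(x), 1.0, 1.0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        zeta = min(max(alpha, zeta_min), zeta_max)             # assumed projection giving zeta_k
        d = -zeta * g                                          # search direction (2.13)
        gTd = float(g @ d)
        h = 1.0                                                # nonmonotone Armijo backtracking (2.16)
        while f(x + h * d) > U + delta * h * gTd and h > 1e-16:
            h *= 0.5
        x_new = x + h * d
        R_new, J_new = residual(x_new), jacobian(x_new)
        g_new, s = J_new.T @ R_new, x_new - x
        gamma = J_new.T @ (J_new @ s) + (J_new - J).T @ R_new  # structured vector (2.8)
        stg = float(s @ gamma)
        if choice == "M3":
            alpha = np.linalg.norm(s) / np.linalg.norm(gamma)  # (2.12)
        else:
            if stg <= 0.0:                                     # safeguard (2.15)
                stg = max(vartheta * alpha, float(s @ s) + float(gamma @ gamma))
            alpha = float(s @ s) / stg if choice == "M1" else stg / float(gamma @ gamma)
        W_new = mu * W + 1.0                                   # reference update (2.17)
        U, W = (mu * W * U + f(x_new)) / W_new, W_new
        x, R, J, g = x_new, R_new, J_new, g_new
    return x

# usage on a tiny hypothetical zero-residual problem (not one of the benchmark problems)
res = lambda x: np.array([x[0] - 1.0, 10.0 * (x[1] - x[0] ** 2)])
jac = lambda x: np.array([[1.0, 0.0], [-20.0 * x[0], 10.0]])
print(assa(res, jac, np.array([-1.2, 1.0])))
```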

4. Numerical experiments

Numerical experiments are reported in this section to evaluate the computational performance of the proposed algorithms ASSA1, ASSA2 and ASSA3 in comparison with the algorithms SSGM1 and SSGM2 reported in [20].

Among the 35 benchmark test problems considered in the experiments, 27 are large-scale while the remaining 8 are small-scale. The list of the test problems, their initial guesses, and their respective references are reported in Tables 1 and 2.

Table 1.

List of large–scale test problems with their respective references and starting points.

Problem Function name Starting point
P1 Penalty function I [22] (1/3,1/3,...,1/3)T
P2 Trigonometric function [23] (1/n,...,1/n)T
P3 Discrete boundary-value function [23] $\left(\frac{1}{n+1}\left(\frac{1}{n+1}-1\right), \ldots, \frac{1}{n+1}\left(\frac{n}{n+1}-1\right)\right)^T$
P4 Linear full rank function [23] (1,1,...,1)T
P5 Linear rank-1 function [23] (1,1,...,1)T
P6 Problem 202 [24] (2,2,...2)T
P7 Problem 206 [24] (1/n,...,1/n)T
P8 Problem 212 [24] (0.5,...,0.5)T
P9 Ascher and Russel boundary value problem [24] (1/n,1/n,...,1/n)T
P10 Strictly convex function (Raydan 1) [22] (1/n,2/n,...,1)T
P11 Singular function 2 [22] (1,1,...,1)
P12 Exponential function 1 [22] (n/n−1,n/n−1,...,n/n−1)T
P13 Exponential function 2 [22] (1/n2,1/n2,...,1/n2)T
P14 Logarithm function [22] (1,1,...,1)T
P15 Trigonometric exponential function [24] (0.5,0.5,...,0.5)T
P16 Extended Powell singular function [22] (1.5E−4,...,1.5E−4)T
P17 Function 21 [22] (1,1,...,1)T
P18 Tridiagonal Broyden function [23] (−1,−1,...,−1)T
P19 Extended Himmelblau function [25] (1,1/n,1,1/n,...,1,1/n)T
P20 Function 27 [22] (100,1/n2,...,1/n2)T
P21 Trigonometric Logarithmic function [20] (1,1,...,1)T
P22 Zero Jacobian function [22] fori=1,100(n100)n,fori2,(n1000)(n500)(60n)2
P23 Exponential function [22] (1/(4n2),2/(4n2),...,n/(4n2))
P24 Singular Broyden function 1 [24] (−1,−1,...,−1)T
P25 Brown almost linear function [23] (0.5,0.5,...,0.5)T
P26 Extended Freudenstein and Roth function [22] (6,3,6,3,...,6,3)T
P27 Generalized Tridiagonal Broyden [24] (−1,−1,...,−1)T

Table 2.

List of small-scale test problems with their respective references and starting points.

Problem Function name Starting point
P28 Brown Badly Scaled function [23] (1,1)T
P29 Jennrich and Sampson function [23] (0.2,0.2)T
P30 Box three-dimensional function [23] (0,0.1)T
P31 Rank deficient Jacobian [26] (−1,1)T
P32 Rosenbrock function [23] (−1,1)T
P33 Parameterized function [27] (10,10)T
P34 Freudenstein and Roth function [23] (0.5,−2)T
P35 Beale Function [23] (2,3)

For the large-scale test problems, we solved each test problem using the following dimensions: 1000, 3000, 5000, 7000, 9000, 11000, and 13000. In total, we solved 197 problem instances in the course of the experiments.

The parameters used in the execution of all our algorithms and SSGM algorithms are as follows:

  • ASSA algorithms: $\zeta_{\min} = 10^{-30}$, $\zeta_{\max} = 10^{30}$, $\delta = 10^{-4}$, $\mu_{\min} = 0.1$, $\mu_{\max} = 0.85$, tolerance $\epsilon = 10^{-4}$ and $\vartheta = 1000$.

  • SSGM Algorithms: All the parameters used in these algorithms remain the same as in [20].

All the algorithms were coded and executed in MATLAB R2019b on an ACER PC with an Intel Core i5-8265U (8th generation) CPU @ 1.60 GHz and 8 GB of RAM. The iteration process continues as long as the inequality $\|g_k\| > \epsilon = 10^{-4}$ holds. The iteration terminates when this inequality fails, or the number of iterations exceeds 1000, or the number of function evaluations exceeds 5000. In all the above-stated instances, success is reported only if the inequality $\|g_k\| \le \epsilon = 10^{-4}$ is satisfied; otherwise a failure, represented by F, is reported.

The results of the experiments, as reported in Tables 3, 4, 5 and 6, show the number of iterations (ITER), the number of function evaluations (FVAL), and the CPU time required by an algorithm to approximately converge to a solution (CPU–TIME).

Table 3.

Numerical results of ASSA algorithms and SSGM algorithms on large-scaled problems 1 - 9 with their dimensions.

PROBLEMS | DIM | ASSA1: ITER FEVAL CPU–TIME | SSGM1: ITER FEVAL CPU–TIME | ASSA2: ITER FEVAL CPU–TIME | SSGM2: ITER FEVAL CPU–TIME | ASSA3: ITER FEVAL CPU–TIME
P1
1000 4 5 0.1218 4 5 0.0040 4 5 0.0525 4 5 0.0139 4 5 0.0314
3000 3 4 0.0475 4 5 0.0087 3 4 0.0275 4 5 0.0160 3 4 0.0332
5000 3 4 0.0194 3 4 0.0181 3 4 0.0212 3 4 0.0155 3 4 0.0110
7000 3 4 0.0170 3 4 0.0296 3 4 0.0308 3 4 0.0162 3 4 0.0399
9000 3 4 0.0198 3 4 0.0174 3 4 0.0333 3 4 0.0230 3 4 0.0722
11000 2 3 0.0202 2 3 0.0130 2 3 0.0184 2 3 0.0379 2 3 0.0405
13000 2 3 0.0218 2 3 0.0345 2 3 0.0341 2 3 0.0315 2 3 0.0242

P2
1000 20 40 0.1442 3 4 0.0068 20 40 0.1150 3 4 0.0144 20 40 0.0527
3000 18 42 0.0842 F F F 20 44 0.1984 F F F 19 43 0.2013
5000 23 48 0.2431 F F F 23 48 0.1957 F F F 23 48 0.2336
7000 23 49 0.1707 F F F 24 50 0.2819 F F F 23 49 0.2680
9000 21 48 0.2654 F F F 23 50 0.3223 F F F 22 49 0.5296
11000 24 51 0.3632 F F F 25 52 0.4220 F F F 25 52 0.5226
13000 22 50 0.2699 F F F 24 52 0.2889 F F F 23 51 0.6974

P3
1000 10 14 0.1322 10 14 0.0376 8 12 0.0178 8 12 0.0546 10 14 0.0522
3000 4 8 0.0364 4 8 0.0283 4 8 0.0321 4 8 0.0236 3 7 0.0374
5000 3 7 0.0155 3 7 0.0274 2 6 0.0478 2 6 0.0348 3 7 0.0567
7000 2 6 0.0300 2 6 0.0327 2 6 0.0220 2 6 0.0285 2 6 0.0293
9000 1 5 0.0345 1 5 0.0535 1 5 0.0697 1 5 0.0381 1 5 0.0554
11000 1 5 0.0551 1 5 0.0308 1 5 0.0185 1 5 0.0643 1 5 0.0305
13000 1 5 0.0307 1 5 0.0377 1 5 0.0274 1 5 0.0474 1 5 0.0729

P4
1000 2 3 0.0455 2 3 0.0051 2 3 0.0056 2 3 0.0045 2 3 0.0049
3000 2 3 0.0098 2 3 0.0072 2 3 0.0052 2 3 0.0102 2 3 0.0057
5000 2 3 0.0070 2 3 0.0258 2 3 0.0054 2 3 0.0105 2 3 0.0136
7000 2 3 0.0053 2 3 0.0098 2 3 0.0204 2 3 0.0115 2 3 0.0294
9000 2 3 0.0220 2 3 0.0103 2 3 0.0151 2 3 0.0170 2 3 0.0283
11000 2 3 0.0110 2 3 0.0267 2 3 0.0277 2 3 0.0147 2 3 0.0524
13000 2 3 0.0126 2 3 0.0178 2 3 0.0165 2 3 0.0350 2 3 0.0160

P5
1000 2 59 0.0660 F F F 2 59 0.0144 F F F 2 59 0.0235
3000 3 70 0.0286 F F F 3 70 0.0348 F F F 3 70 0.0508
5000 3 74 0.0956 F F F 3 74 0.0785 F F F 3 74 0.0361
7000 3 77 0.0523 F F F 3 77 0.0445 F F F 3 77 0.1397
9000 3 79 0.0724 F F F 3 79 0.1199 F F F 3 79 0.1095
11000 3 81 0.1140 F F F 3 81 0.1731 F F F 3 81 0.1185

P6
1000 5 6 0.0569 5 6 0.0027 5 6 0.0038 5 6 0.0142 5 6 0.0047
3000 5 6 0.0059 5 6 0.0086 5 6 0.0136 5 6 0.0083 5 6 0.0101
5000 5 6 0.0130 5 6 0.0135 5 6 0.0202 5 6 0.0171 5 6 0.0278
7000 5 6 0.0100 5 6 0.0328 5 6 0.0426 5 6 0.0295 5 6 0.0465
9000 5 6 0.0307 5 6 0.0376 5 6 0.0288 5 6 0.0382 5 6 0.0765
11000 5 6 0.0393 5 6 0.0360 5 6 0.0378 5 6 0.0936 5 6 0.0371
13000 5 6 0.0703 5 6 0.0478 5 6 0.0416 5 6 0.0671 5 6 0.0916

P7
1000 10 14 0.1364 10 14 0.0129 8 12 0.0199 8 12 0.0186 10 14 0.0194
3000 4 8 0.0176 4 8 0.0226 4 8 0.0408 4 8 0.0181 3 7 0.0103
5000 3 7 0.0256 3 7 0.0530 2 6 0.0143 2 6 0.0240 3 7 0.0214
7000 2 6 0.0208 2 6 0.0194 2 6 0.0432 2 6 0.0199 2 6 0.0426
9000 1 5 0.0401 1 5 0.0274 1 5 0.0216 1 5 0.0498 1 5 0.0226
11000 1 5 0.0167 1 5 0.0189 1 5 0.0185 1 5 0.0241 1 5 0.0656
13000 1 5 0.0141 1 5 0.0215 1 5 0.0469 1 5 0.0323 1 5 0.0597

P8
1000 5 7 0.0857 6 8 0.0148 5 7 0.0097 5 7 0.0118 5 7 0.0252
3000 5 7 0.0180 6 8 0.0154 5 7 0.0210 5 7 0.0253 5 7 0.0345
5000 5 7 0.0171 6 8 0.0293 5 7 0.0264 6 8 0.0279 5 7 0.0582
7000 5 7 0.0348 6 8 0.0531 5 7 0.0482 6 8 0.1154 5 7 0.0302
9000 5 7 0.0476 6 8 0.0770 5 7 0.0772 6 8 0.0476 5 7 0.0440
11000 5 7 0.0770 6 8 0.0565 5 7 0.0469 6 8 0.0768 5 7 0.0447
13000 5 7 0.0691 6 8 0.1026 5 7 0.0889 6 8 0.0688 5 7 0.0541

P9 1000 273 556 0.3340 428 922 0.3705 253 271 0.1782 240 259 0.3296 694 1014 0.9078
3000 306 670 0.5481 484 1076 0.8123 345 367 0.5969 467 488 1.1298 456 653 0.8818
5000 293 622 0.9203 646 1439 1.6881 313 332 0.7301 283 301 1.1130 819 1185 3.8548
7000 355 758 1.5064 396 893 1.9556 408 425 1.4273 399 418 2.3700 628 925 4.1467
9000 438 976 2.5387 430 933 2.7037 387 407 2.0961 323 341 2.7924 573 826 5.8853
11000 467 1028 3.2065 391 838 2.9445 337 356 2.0743 340 360 3.3758 392 556 3.7520
13000 384 832 2.8887 272 543 2.1416 310 325 2.0715 200 215 2.1939 513 745 6.9349

Table 4.

Numerical results of ASSA algorithms and SSGM algorithms on large–scale problems 10 - 18 with their dimensions.

PROBLEMS | DIM | ASSA1: ITER FEVAL CPU–TIME | SSGM1: ITER FEVAL CPU–TIME | ASSA2: ITER FEVAL CPU–TIME | SSGM2: ITER FEVAL CPU–TIME | ASSA3: ITER FEVAL CPU–TIME
P10 1000 4 5 0.0864 4 5 0.0100 4 5 0.0148 4 5 0.0150 4 5 0.0203
3000 4 5 0.0144 4 5 0.0382 4 5 0.0091 4 5 0.0269 4 5 0.0186
5000 4 5 0.0441 4 5 0.0510 4 5 0.0194 4 5 0.0326 4 5 0.0242
7000 4 5 0.0341 4 5 0.0354 4 5 0.0291 4 5 0.0519 4 5 0.1436
9000 4 5 0.0558 4 5 0.0345 4 5 0.0474 4 5 0.0716 4 5 0.1129
11000 4 5 0.0585 4 5 0.0904 4 5 0.0821 4 5 0.0523 4 5 0.0546
13000 4 5 0.0746 4 5 0.0644 4 5 0.1171 4 5 0.0854 4 5 0.0774
P11 1000 4 5 0.0364 4 5 0.0031 4 5 0.0081 4 5 0.0041 4 5 0.0209
3000 4 5 0.0096 4 5 0.0102 4 5 0.0214 4 5 0.0114 4 5 0.0256
5000 4 5 0.0291 4 5 0.0166 4 5 0.0147 4 5 0.0138 4 5 0.0168
7000 4 5 0.0180 4 5 0.0163 4 5 0.0235 4 5 0.0184 4 5 0.0475
9000 4 5 0.0588 4 5 0.0258 4 5 0.0417 4 5 0.0289 4 5 0.0608
11000 4 5 0.0240 4 5 0.0300 4 5 0.0347 4 5 0.0492 4 5 0.0464
13000 4 5 0.0341 4 5 0.0480 4 5 0.0648 4 5 0.0474 4 5 0.0687
P12 1000 2 3 0.0685 2 3 0.0051 2 3 0.0079 2 3 0.0089 2 3 0.0100
3000 1 2 0.0061 1 2 0.0048 1 2 0.0036 1 2 0.0043 1 2 0.0136
5000 1 2 0.0099 1 2 0.0085 1 2 0.0048 1 2 0.0123 1 2 0.0067
7000 1 2 0.0163 1 2 0.0065 1 2 0.0049 1 2 0.0066 1 2 0.0082
9000 1 2 0.0103 1 2 0.0114 1 2 0.0184 1 2 0.0123 1 2 0.0097
P13 1000 26 52 0.1101 26 52 0.0290 29 45 0.0305 29 45 0.0503 31 54 0.0477
3000 26 55 0.0834 26 55 0.1085 29 48 0.0479 29 48 0.1119 26 50 0.0495
5000 30 62 0.1575 30 62 0.1749 28 49 0.1847 28 49 0.1152 27 51 0.1838
7000 25 56 0.2409 25 56 0.1829 29 50 0.2571 29 50 0.1678 32 59 0.1780
9000 29 63 0.2558 29 63 0.2257 28 53 0.2789 28 53 0.2955 27 53 0.2893
11000 25 59 0.1936 25 59 0.2356 30 52 0.3664 30 52 0.3051 31 61 0.4500
13000 30 65 0.2439 30 65 0.4019 29 52 0.2868 29 52 0.4385 31 59 0.4889
P14 1000 4 6 0.0533 5 7 0.0039 4 6 0.0037 5 7 0.0140 4 6 0.0043
3000 4 6 0.0175 5 7 0.0116 4 6 0.0131 5 7 0.0214 4 6 0.0280
5000 4 6 0.0183 5 7 0.0315 4 6 0.0182 5 7 0.0236 4 6 0.0157
7000 4 6 0.0152 5 7 0.0273 4 6 0.0307 5 7 0.0293 4 6 0.0560
9000 4 6 0.0242 5 7 0.0365 4 6 0.0292 5 7 0.1082 4 6 0.0267
11000 4 6 0.0696 5 7 0.0696 4 6 0.0327 5 7 0.0334 4 6 0.0298
13000 4 6 0.0449 5 7 0.0743 4 6 0.0709 5 7 0.0394 4 6 0.0340
P15 1000 26 33 0.2736 76 122 0.2511 36 43 0.1526 57 71 0.2341 31 38 0.1591
3000 24 31 0.1887 F F F 36 43 0.1628 44 57 0.6031 34 41 0.2349
5000 25 32 0.5459 F F F 34 41 0.3250 45 58 0.6044 34 41 0.5982
7000 25 32 0.3477 F F F 37 44 0.5041 48 61 0.6758 32 39 0.7530
9000 27 34 0.4433 F F F 37 44 0.4586 45 58 0.9181 35 42 0.7845
11000 26 33 0.5158 F F F 35 42 0.6068 47 60 1.0753 34 41 1.1040
13000 29 36 0.5643 F F F 35 42 0.6318 47 60 1.2024 34 41 1.0577
P16 1000 16 29 0.1196 16 30 0.0104 16 25 0.0321 17 28 0.0216 13 22 0.0171
3000 16 29 0.0520 16 30 0.0584 16 25 0.0458 17 28 0.0358 13 22 0.0312
5000 16 29 0.0654 16 30 0.0838 16 25 0.0928 17 28 0.1675 13 22 0.0831
7000 16 29 0.1123 16 30 0.0834 16 25 0.0954 17 28 0.0990 13 22 0.1071
9000 16 29 0.1325 16 30 0.1218 16 25 0.1648 17 28 0.0994 13 22 0.1498
11000 16 29 0.1117 16 30 0.1210 16 25 0.1841 17 28 0.1260 13 22 0.1710
13000 16 29 0.1850 16 30 0.2385 16 25 0.2639 17 28 0.1582 13 22 0.3017
P17 1000 2 9 0.1048 2 9 0.0046 2 9 0.0104 2 9 0.0091 2 9 0.0223
3000 2 9 0.0065 2 9 0.0065 2 9 0.0080 2 9 0.0108 2 9 0.0099
5000 2 9 0.0689 2 9 0.0093 2 9 0.0144 2 9 0.0205 2 9 0.0158
7000 2 9 0.0228 2 9 0.0283 2 9 0.0310 2 9 0.0531 2 9 0.0418
9000 2 9 0.0187 2 9 0.0237 2 9 0.0206 2 9 0.0347 2 9 0.0360
11000 2 9 0.0210 2 9 0.0622 2 9 0.0340 2 9 0.0382 2 9 0.0723
13000 2 9 0.0387 2 9 0.0332 2 9 0.0495 2 9 0.0564 2 9 0.0390
P18 1000 24 34 0.1386 F F F F F F 22 29 0.0615 21 28 0.0992
3000 24 34 0.1585 F F F F F F 22 29 0.1494 21 28 0.1959
5000 24 34 0.1509 F F F F F F 22 29 0.2431 21 28 0.1483
7000 24 34 0.3281 29 43 0.1560 21 28 0.1371 22 29 0.3078 21 28 0.4024
9000 24 34 0.3605 F F F 14 21 0.1691 22 29 0.3780 21 28 0.4940
11000 24 34 0.4601 F F F 14 21 0.1947 22 29 0.2749 21 28 0.4075
13000 24 34 0.3309 F F F 14 21 0.2054 22 29 0.3192 21 28 0.3085

Table 5.

Numerical results of ASSA algorithms and SSGM algorithms on large–scale problems 19 - 27 with their dimensions.

PROBLEMS | DIM | ASSA1: ITER FEVAL CPU–TIME | SSGM1: ITER FEVAL CPU–TIME | ASSA2: ITER FEVAL CPU–TIME | SSGM2: ITER FEVAL CPU–TIME | ASSA3: ITER FEVAL CPU–TIME
P19 1000 124 227 0.2454 99 175 0.0548 F F F 61 68 0.0516 75 102 0.0666
3000 72 116 0.1881 440 863 0.5243 70 78 0.0692 63 85 0.0733 67 83 0.1909
5000 75 107 0.2482 121 217 0.3774 F F F 64 72 0.1592 83 111 0.3084
7000 69 106 0.3913 262 503 1.2322 68 79 0.2577 72 82 0.1816 111 149 0.5482
9000 54 84 0.4995 293 595 1.7250 75 86 0.5050 62 73 0.3605 116 160 1.1176
11000 143 265 1.7144 157 271 1.0503 69 81 0.5512 73 83 0.4355 80 104 0.7283
13000 113 206 1.5045 131 251 0.9464 77 89 0.5620 59 66 0.3730 54 64 0.4717
P20 1000 1 2 0.0737 1 2 0.0013 1 2 0.0035 1 2 0.0057 1 2 0.0071
3000 1 2 0.0101 1 2 0.0054 1 2 0.0043 1 2 0.0027 1 2 0.0024
5000 1 2 0.0119 1 2 0.0063 1 2 0.0083 1 2 0.0040 1 2 0.0103
7000 1 2 0.0056 1 2 0.0058 1 2 0.0053 1 2 0.0043 1 2 0.0063
9000 1 2 0.0158 1 2 0.0067 1 2 0.0108 1 2 0.0065 1 2 0.0087
11000 1 2 0.0096 1 2 0.0224 1 2 0.0098 1 2 0.0105 1 2 0.0109
13000 1 2 0.0135 1 2 0.0085 1 2 0.0127 1 2 0.0054 1 2 0.0047
P21 1000 13 18 0.1035 13 17 0.0345 14 18 0.0222 11 15 0.0217 10 14 0.0113
3000 13 18 0.0181 13 17 0.0294 14 18 0.0220 11 15 0.0270 10 14 0.0276
5000 13 18 0.0522 13 17 0.0407 14 18 0.0427 11 15 0.0427 10 14 0.0549
7000 13 18 0.1027 13 17 0.0551 14 18 0.0672 11 15 0.0365 10 14 0.0926
9000 13 18 0.1262 13 17 0.0888 14 18 0.1334 11 15 0.0618 10 14 0.0540
11000 13 18 0.1401 13 17 0.0895 14 18 0.0806 11 15 0.1284 10 14 0.1137
13000 13 18 0.1009 13 17 0.0926 14 18 0.1125 11 15 0.0749 10 14 0.1147
P22 1000 17 32 0.1077 22 49 0.0174 17 32 0.0331 21 36 0.0307 17 32 0.0148
3000 17 32 0.0273 22 49 0.0501 17 32 0.0399 21 36 0.0356 17 32 0.0901
5000 17 32 0.0324 22 49 0.1321 17 32 0.0758 21 36 0.1068 17 32 0.0893
7000 17 32 0.0856 22 49 0.1598 17 32 0.1456 21 36 0.1223 17 32 0.1837
9000 17 32 0.1175 22 49 0.1947 17 32 0.2250 21 36 0.2084 17 32 0.2152
11000 17 32 0.1803 22 49 0.2920 17 32 0.1969 21 36 0.2336 17 32 0.1968
13000 17 32 0.2482 22 49 0.3538 17 32 0.2299 21 36 0.1970 17 32 0.2889
P23 1000 4 6 0.0419 5 7 0.0049 4 6 0.0158 5 7 0.0082 4 6 0.0145
3000 4 6 0.0138 5 7 0.0196 4 6 0.0211 5 7 0.0214 4 6 0.0262
5000 4 6 0.0143 5 7 0.0244 4 6 0.0205 5 7 0.0234 4 6 0.0450
7000 4 6 0.0304 5 7 0.0717 4 6 0.0285 5 7 0.0443 4 6 0.0317
9000 4 6 0.0273 5 7 0.0443 4 6 0.1002 5 7 0.0402 4 6 0.0491
11000 4 6 0.0221 5 7 0.0552 4 6 0.0460 5 7 0.0998 4 6 0.0366
13000 4 6 0.0462 5 7 0.0557 4 6 0.0362 5 7 0.0710 4 6 0.0721
P24 1000 21 35 0.1239 15 29 0.0076 21 35 0.0273 15 29 0.0168 21 35 0.0394
3000 16 31 0.0655 22 47 0.0445 16 31 0.0445 19 34 0.0348 16 31 0.0576
5000 17 32 0.0969 22 48 0.0715 17 32 0.0500 20 35 0.0735 17 32 0.0902
7000 17 32 0.1311 23 49 0.1438 17 32 0.0811 20 35 0.0653 17 32 0.0985
9000 17 32 0.2278 23 49 0.1736 17 32 0.1244 20 35 0.2216 17 32 0.2913
11000 17 32 0.2305 20 47 0.2080 17 32 0.1851 20 35 0.2208 17 32 0.2307
13000 17 32 0.1758 20 47 0.3059 17 32 0.2854 21 36 0.2856 17 32 0.2278
P25 1000 17 29 0.0882 21 34 0.0628 18 30 0.0201 18 30 0.0343 17 29 0.0494
3000 20 35 0.0741 26 41 0.1577 21 36 0.0540 21 36 0.0474 20 35 0.1115
5000 17 29 0.1038 19 31 0.1272 17 29 0.1251 19 31 0.1644 17 29 0.1261
7000 18 31 0.1181 20 33 0.2038 18 31 0.1356 20 33 0.1977 18 31 0.1256
9000 19 33 0.1547 20 34 0.1436 19 33 0.2348 21 35 0.1163 19 33 0.2147
11000 19 33 0.2715 20 34 0.2641 20 34 0.1575 22 36 0.1926 20 34 0.2546
13000 19 34 0.2095 21 36 0.2056 20 35 0.3355 22 37 0.1945 20 35 0.2838
P26 1000 336 735 0.6566 314 703 0.3193 272 291 0.1699 230 248 0.1212 441 636 0.6060
3000 316 666 1.1565 283 585 0.5389 248 275 0.4805 199 218 0.2794 393 564 1.1398
5000 302 647 1.5136 293 601 1.1529 260 282 0.7733 312 327 0.7131 468 654 2.2223
7000 468 984 3.8041 387 833 1.5875 266 291 1.0078 264 280 1.4697 437 631 1.6654
9000 339 714 3.2110 318 690 1.6699 188 212 0.8802 280 300 1.2895 542 781 2.8109
11000 270 554 3.0366 263 537 1.7500 347 360 2.0573 297 312 1.3763 503 685 3.1762
13000 254 512 1.7371 472 981 3.4959 293 314 1.7208 277 294 1.5208 401 565 4.1690
P27 1000 2 23 0.0913 2 23 0.0089 2 23 0.0138 2 23 0.0116 2 23 0.0249
3000 2 27 0.0230 2 27 0.0396 2 27 0.0237 2 27 0.0177 2 27 0.0188
5000 2 28 0.0542 2 28 0.0621 2 28 0.0460 2 28 0.0502 2 28 0.0496
7000 2 29 0.0684 2 29 0.0641 2 29 0.0913 2 29 0.0489 2 29 0.0550
9000 2 30 0.1293 2 30 0.0781 2 30 0.0717 2 30 0.0733 2 30 0.0986
11000 2 30 0.1330 2 30 0.1141 2 30 0.1279 2 30 0.1292 2 30 0.1260
13000 2 31 0.1211 2 31 0.1041 2 31 0.1184 2 31 0.0579 2 31 0.1242

Table 6.

Numerical results of ASSA algorithms and SSGM algorithms on small scale problems 28 - 35 with their dimensions.

PROBLEMS | DIM | ASSA1: ITER FEVAL CPU–TIME | SSGM1: ITER FEVAL CPU–TIME | ASSA2: ITER FEVAL CPU–TIME | SSGM2: ITER FEVAL CPU–TIME | ASSA3: ITER FEVAL CPU–TIME
P28 n=2, m=3 F F F F F F 9 27 0.0123 F F F 9 48 0.0113
P29 n=2, m=20 F 11 0.0248 1 7 0.0116 1 11 0.0078 1 7 0.0128 1 11 0.0051
P30 n=3, m=10 35 56 0.0298 F F F 17 22 0.0193 39 52 0.0341 13 18 0.0180
P31 n=2, m=3 8 11 0.0485 10 13 0.0048 10 13 0.0048 8 11 0.0213 8 11 0.0051
P32 n=2, m=2 F F F 49 127 0.0310 F F F 70 172 0.0331 72 125 0.0110
P33 n=2, m=3 11 27 0.0184 8 24 0.0151 7 23 0.0216 9 25 0.0061 14 33 0.0067
P34 n=2, m=2 26 61 0.0166 57 145 0.0196 23 38 0.0104 25 45 0.0109 38 70 0.0205
P35 n=2, m=3 25 42 0.0376 28 47 0.0140 26 32 0.0042 29 43 0.0133 25 38 0.0106

It can be seen from the data reported in Tables 3, 4, 5 and 6 that ASSA3, i.e., Algorithm 1 with the choice $\alpha_k^{M3}$, solved all the considered test problems successfully. Each of the remaining algorithms, however, recorded failure in at least 3 instances. In comparison with the ASSA1 and ASSA2 algorithms, the algorithms SSGM1 and SSGM2 of [20] recorded the highest number of failures in the course of solving the test problems. This means that fusing the better Hessian approximation defined in equation (2.8) into the BB parameters, coupled with the safeguarding technique, resulted in improved numerical performance to some extent.

The results reported in Tables 3, 4, 5 and 6 are summarized with the aid of the well-known Dolan and Moré [28] performance profiles. The comparison is conducted based on the following three metrics: ITER, FVAL and CPU–TIME.
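For readers who wish to reproduce the summary plots, the following is a minimal Python sketch of a Dolan-Moré performance profile [28] computed from a solver-by-problem cost matrix; the small `demo` matrix is a synthetic placeholder, not the ITER/FVAL/CPU data of Tables 3-6.

```python
import numpy as np

def performance_profile(costs, taus):
    """costs[i, j] = cost of solver j on problem i (np.inf marks a failure).
    Returns rho[t, j] = fraction of problems solved by solver j within a
    factor taus[t] of the best solver, cf. Dolan and More [28]."""
    best = np.min(costs, axis=1, keepdims=True)          # best cost per problem
    ratios = costs / best                                # performance ratios
    return np.array([[np.mean(ratios[:, j] <= t) for j in range(costs.shape[1])]
                     for t in taus])

# placeholder data: 4 problems x 2 solvers, one failure marked with np.inf
demo = np.array([[10.0, 12.0], [5.0, 4.0], [8.0, np.inf], [20.0, 25.0]])
print(performance_profile(demo, taus=[1.0, 1.5, 2.0]))
```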

Figs. 1a, 2a and 3a show the comparison between the proposed ASSA1 and SSGM1; Figs. 1b, 2b and 3b compare the proposed ASSA2 and SSGM2; Figs. 1c, 2c and 3c compare the proposed ASSA3 with SSGM1 and SSGM2; and Figs. 1d, 2d and 3d compare the three proposed algorithms ASSA1, ASSA2 and ASSA3.

Figure 1. Comparisons based on ITER: (a) ASSA1 versus SSGM1; (b) ASSA2 versus SSGM2; (c) ASSA3 versus SSGM1 and SSGM2; (d) ASSA1, ASSA2 and ASSA3.

Figure 2. Comparisons based on FVAL: (a) ASSA1 versus SSGM1; (b) ASSA2 versus SSGM2; (c) ASSA3 versus SSGM1 and SSGM2; (d) ASSA1, ASSA2 and ASSA3.

Figure 3. Comparisons based on CPU–TIME: (a) ASSA1 versus SSGM1; (b) ASSA2 versus SSGM2; (c) ASSA3 versus SSGM1 and SSGM2; (d) ASSA1, ASSA2 and ASSA3.

In terms of ITER, Figs. 1a–1c show that the curves of our algorithms ASSA1, ASSA2 and ASSA3 attain success rates of about 93%, 84% and 82%, respectively, as displayed by the height of their performance profiles for $\tau > 0.6$. This shows that our algorithms outperform their main competitors, SSGM1 and SSGM2. Fig. 1d shows that the three proposed algorithms are competitive with one another, with ASSA3 performing slightly better than ASSA1 and ASSA2. Therefore, the ASSA3 algorithm can be considered the most reliable of all the algorithms, since it was able to solve all the problems under consideration.

Moreover, with regard to FVAL, we can see in Figs. 2a–2c that the ASSA1, ASSA2 and ASSA3 algorithms won about 85%, 83% and 79% of the experiments, respectively, in comparison with the SSGM1 and SSGM2 algorithms. On the other hand, the comparison among the three proposed algorithms reported in Fig. 2d shows that ASSA2 performs moderately better than its competitors ASSA1 and ASSA3.

Furthermore, regarding the CPU–TIME, Figs. 3a–3b show that the proposed algorithms ASSA1 and ASSA2 are very competitive against SSGM1 and SSGM2, with ASSA1 and ASSA2 performing slightly better. Hence, our algorithms are more efficient than their respective counterparts. Although ASSA3 achieves better results than ASSA1, ASSA2, SSGM1 and SSGM2 in terms of ITER and solved all the problems under consideration, it nevertheless loses to them in terms of CPU–TIME. All things considered, the proposed algorithms have better computational performance, and in particular ASSA3 solved all the problem instances.

5. Conclusions

We have proposed three structured spectral gradient algorithms for solving nonlinear least-squares problems. We built the three algorithms by incorporating the structured vector $\gamma_{k-1}$ into three different spectral parameters. The structured vector $\gamma_{k-1}$ is obtained by approximating the Hessian of the objective function such that a secant equation is satisfied. We implemented the three algorithms without creating or storing matrices throughout the iteration process, which makes them suitable for large-scale problems. Moreover, we devised a safeguarding strategy, different from that of [20], to avoid negative curvature directions. The numerical experiments presented in Section 4 show that our proposed algorithms work well and perform better than SSGM1 and SSGM2 of [20]. Furthermore, the implementation of Algorithm 1 using $\alpha_k^{M3}$, that is, the geometric mean of $\alpha_k^{M1}$ and $\alpha_k^{M2}$, proved to be the most reliable, as it successfully solved all the test problems considered in our numerical experiments without any failure. Our future study will include using the two-step approach of [29] to solve nonlinear least-squares problems and applying these algorithms to motion control problems [30], [31], [32].

Declarations

Author contribution statement

M.M. Yahaya: Conceived and designed the experiments; Performed the experiments; Wrote the paper.

P. Kumam: Analyzed and interpreted the data; Wrote the paper.

A.M. Awwal: Conceived and designed the experiments; Analyzed and interpreted the data; Wrote the paper.

S. Aji: Performed the experiments; Wrote the paper.

Funding statement

The authors acknowledge the financial support provided by the Petchra Pra Jom Klao Scholarship of King Mongkut's University of Technology Thonburi (KMUTT) and Center of Excellence in Theoretical and Computational Science (TaCSCoE), KMUTT (TaCS-CoE 2021). The first author got support from Petchra Pra Jom Klao Masters Research Scholarship from King Mongkut's University of Technology Thonburi (Contract No. 5/2562). Also, Aliyu Muhammed Awwal would like to thank the Postdoctoral Fellowship from King Mongkut's University of Technology Thonburi (KMUTT), Thailand. Moreover, this research project is supported by Thailand Science Research and Innovation (TSRI) Basic Research Fund: Fiscal year 2021 under project number 64A306000005.

Data availability statement

Data included in article/supplementary material/referenced in article.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

References

  • 1.Barzilai Jonathan, Borwein Jonathan M. Two-point step size gradient methods. IMA J. Numer. Anal. 1988;8(1):141–148. [Google Scholar]
  • 2.Dai Yu-Hong, Al-Baali Mehiddin, Yang Xiaoqi. Numerical Analysis and Optimization. Springer; 2015. A positive Barzilai–Borwein-like stepsize and an extension for symmetric linear systems; pp. 59–75. [Google Scholar]
  • 3.Raydan Marcos. The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 1997;7(1):26–33. [Google Scholar]
  • 4.Sun Wenyu, Sampaio R.J.B., Yuan J. Quasi-Newton trust region algorithm for non-smooth least squares problems. Appl. Math. Comput. 1999;105(2–3):183–194. [Google Scholar]
  • 5.Sheng Zhou, Yuan Gonglin, Cui Zengru, Duan Xiabin, Wang Xiaoliang. An adaptive trust region algorithm for large-residual nonsmooth least squares problems. J. Ind. Manag. Optim. 2018;14(2):707. [Google Scholar]
  • 6.Mohammad Hassan, Waziri Mohammed Yusuf, Santos Sandra Augusta. A brief survey of methods for solving nonlinear least-squares problems. Numer. Algebra Control Optim. 2019;9(1):1–13. [Google Scholar]
  • 7.Yuan Ya-Xiang. Recent advances in numerical methods for nonlinear equations and nonlinear least squares. Numer. Algebra Control Optim. 2011;1(1):15–34. [Google Scholar]
  • 8.Al-Baali Mehiddin, Fletcher Roger. Variational methods for non-linear least-squares. J. Oper. Res. Soc. 1985;36(5):405–421. [Google Scholar]
  • 9.Dennis J.E., Martinez Héctor J., Tapia Richard A. Convergence theory for the structured BFGS secant method with an application to nonlinear least squares. J. Optim. Theory Appl. 1989;61(2):161–178. [Google Scholar]
  • 10.Dennis John E., Jr, Gay David M., Walsh Roy E. An adaptive nonlinear least-squares algorithm. ACM Trans. Math. Softw. 1981;7(3):348–368. [Google Scholar]
  • 11.Spedicato E., Vespucci M.T. Numerical experiments with variations of the Gauss-Newton algorithm for nonlinear least squares. J. Optim. Theory Appl. 1988;57(2):323–339. [Google Scholar]
  • 12.Wang Fei, Li Dong-Hui, Qi Liqun. Global convergence of Gauss-Newton-MBFGS method for solving the nonlinear least squares problem. Adv. Model. Optim. 2010;12(1):1–20. [Google Scholar]
  • 13.Knoll Dana A., Keyes David E. Jacobian-free Newton–Krylov methods: a survey of approaches and applications. J. Comput. Phys. 2004;193(2):357–397. [Google Scholar]
  • 14.Xu Wei, Coleman Thomas F., Liu Gang. A secant method for nonlinear least-squares minimization. Comput. Optim. Appl. 2012;51(1):159–173. [Google Scholar]
  • 15.Xu Wei, Zheng Ning, Hayami Ken. Jacobian-free implicit inner-iteration preconditioner for nonlinear least squares problems. J. Sci. Comput. 2016;68(3):1055–1081. [Google Scholar]
  • 16.Kobayashi Michiya, Narushima Yasushi, Yabe Hiroshi. Nonlinear conjugate gradient methods with structured secant condition for nonlinear least squares problems. J. Comput. Appl. Math. 2010;234(2):375–397. [Google Scholar]
  • 17.Dai Y-H., Liao L-Z. New conjugacy conditions and related nonlinear conjugate gradient methods. Appl. Math. Optim. 2001;43(1):87–101. [Google Scholar]
  • 18.Mohammad Hassan, Waziri Mohammed Yusuf. Structured two-point stepsize gradient methods for nonlinear least squares. J. Optim. Theory Appl. 2019;181(1):298–317. [Google Scholar]
  • 19.Luengo Francisco, Raydan Marcos. Gradient method with dynamical retards for large-scale optimization problems. Electron. Trans. Numer. Anal. 2003;16:186–193. [Google Scholar]
  • 20.Mohammad Hassan, Santos Sandra A. A structured diagonal Hessian approximation method with evaluation complexity analysis for nonlinear least squares. Comput. Appl. Math. 2018;37(5):6619–6653. [Google Scholar]
  • 21.Zhang Hongchao, Hager William W. A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim. 2004;14(4):1043–1056. [Google Scholar]
  • 22.La Cruz William, Martínez José Mario, Raydan Marcos. Spectral residual method without gradient information for solving large-scale nonlinear systems: theory and experiments. 2004. http://kuainasi.ciens.ucv.ve/mraydan/download_papers/TechRep.pdf
  • 23.Moré Jorge J., Garbow Burton S., Hillstrom Kenneth E. Testing unconstrained optimization software. ACM Trans. Math. Softw. 1981;7(1):17–41. [Google Scholar]
  • 24.Lukšan Ladislav, Vlcek Jan. 2003. Test problems for unconstrained optimization. Academy of Sciences of the Czech Republic, Institute of Computer Science, Technical Report, (897) [Google Scholar]
  • 25.Momin Jamil, Yang Xin-She. A literature survey of benchmark functions for global optimization problems. 2013. arXiv:1308.4008 arXiv preprint.
  • 26.Gonçalves Douglas S., Santos Sandra A. Local analysis of a spectral correction for the Gauss-Newton model applied to quadratic residual problems. Numer. Algorithms. 2016;73(2):407–431. [Google Scholar]
  • 27.Huschens Jürgen. On the use of product structure in secant methods for nonlinear least squares problems. SIAM J. Optim. 1994;4(1):108–129. [Google Scholar]
  • 28.Dolan Elizabeth D., Moré Jorge J. Benchmarking optimization software with performance profiles. Math. Program. 2002;91(2):201–213. [Google Scholar]
  • 29.Awwal Aliyu Muhammed, Wang Lin, Kumam Poom, Mohammad Hassan. A two-step spectral gradient projection method for system of nonlinear monotone equations and image deblurring problems. Symmetry. 2020;12(6):874. [Google Scholar]
  • 30.Yahaya Mahmoud Muhammad, Kumam Poom, Awwal Aliyu Muhammed, Aji Sani. A structured quasi–Newton algorithm with nonmonotone search strategy for structured NLS problems and its application in robotic motion control. J. Comput. Appl. Math. 2021 [Google Scholar]
  • 31.Awwal Aliyu Muhammed, Kumam Poom, Wang Lin, Huang Shuang, Kumam Wiyada. Inertial-based derivative-free method for system of monotone nonlinear equations and application. IEEE Access. 2020;8:226921–226930. [Google Scholar]
  • 32.Aji Sani, Kumam Poom, Awwal Aliyu Muhammed, Yahaya Mahmoud Muhammad, Kumam Wiyada. Two hybrid spectral methods with inertial effect for solving system of nonlinear monotone equations with application in robotics. IEEE Access. 2021;9:30918–30928. [Google Scholar]


