A HS-PRP-Type Hybrid Conjugate Gradient Method with Sufficient Descent Property

Xiaodi Wu; Yihan Zhu; Jianghua Yin

2021 Oct 22;2021:2087438. doi: 10.1155/2021/2087438


PMCID: PMC8556087 PMID: 34721562

Abstract

In this paper, based on the HS method and a modified version of the PRP method, a hybrid conjugate gradient (CG) method is proposed for solving large-scale unconstrained optimization problems. The CG parameter generated by the method is always nonnegative. Moreover, the search direction possesses the sufficient descent property independent of line search. Utilizing the standard Wolfe–Powell line search rule to yield the stepsize, the global convergence of the proposed method is shown under the common assumptions. Finally, numerical results show that the proposed method is promising compared with two existing methods.

1. Introduction

Consider the problem of minimizing f over Rⁿ:

\begin{matrix} \min_{x \in R^{n}} f (x), \end{matrix}

(1)

where f : Rⁿ⟶R is continuously differentiable. Throughout, the gradient of f at x is denoted by g(x), i.e., g(x) := ∇f(x). Conjugate gradient (CG) methods are popular and effective for solving the unconstrained optimization problem (1), especially in the large-scale case, owing to their simplicity and low memory requirements. These features have promoted their application in various areas such as image deblurring and denoising, neural networks, compressed sensing, and others. We refer interested readers to some recent works [1–3] and the references therein for more details. The numerical results reported in [1] reveal that the CG method has great potential in solving image restoration problems.

Generally, the iterative formula of the CG method for solving problem (1) can be read as

\begin{matrix} x_{k + 1} = x_{k} + α_{k} d_{k}, \end{matrix}

(2)

where α_k > 0 is the stepsize computed by some line search and d_k is the search direction, defined as follows:

\begin{matrix} d_{k} = \begin{cases} - g_{k}, & \text{if } k = 1, \\ - g_{k} + β_{k} d_{k - 1}, & \text{if } k \geq 2, \end{cases} \end{matrix}

(3)

where β_k ∈ R is the so-called CG parameter and g_k abbreviates g(x_k), i.e., g_k := g(x_k). The two key factors that affect the numerical performance of a CG method are the stepsize and the CG parameter. We first outline several well-known line search criteria from the literature.

(a) The exact line search rule: calculate a stepsize α_k satisfying
$\begin{matrix} f (x_{k} + α_{k} d_{k}) = \min_{α \geq 0} f (x_{k} + α d_{k}) . \end{matrix}$ (4)
(b) The standard (weak) Wolfe–Powell (WWP) line search rule: calculate a stepsize α_k satisfying
$\begin{matrix} f (x_{k} + α_{k} d_{k}) \leq f (x_{k}) + δ α_{k} g_{k}^{T} d_{k}, \end{matrix}$ (5)
and
$\begin{matrix} g {(x_{k} + α_{k} d_{k})}^{T} d_{k} \geq σ g_{k}^{T} d_{k}, \end{matrix}$ (6)
where 0 < δ < σ < 1.
(c) The strong Wolfe–Powell (SWP) line search rule: calculate a stepsize α_k satisfying (5) and
$\begin{matrix} |g {(x_{k} + α_{k} d_{k})}^{T} d_{k}| \leq σ |g_{k}^{T} d_{k}| . \end{matrix}$ (7)
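
To make the three rules concrete, the following minimal sketch tests a given trial stepsize against conditions (5), (6), and (7). The function name `wolfe_conditions` and the toy objective are hypothetical choices made for illustration; the default values of δ and σ are the ones used later in the numerical experiments.

```python
import numpy as np

def wolfe_conditions(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    """Return (armijo, weak_curvature, strong_curvature) for the stepsize alpha."""
    g0_d = grad(x) @ d                  # g_k^T d_k, negative for a descent direction
    x_new = x + alpha * d
    g1_d = grad(x_new) @ d              # g(x_k + alpha d_k)^T d_k
    armijo = f(x_new) <= f(x) + delta * alpha * g0_d   # condition (5)
    weak = g1_d >= sigma * g0_d                         # condition (6)
    strong = abs(g1_d) <= sigma * abs(g0_d)             # condition (7)
    return bool(armijo), bool(weak), bool(strong)

# Toy example (an assumption for illustration): f(x) = 0.5 * ||x||^2,
# steepest descent direction from x = (1, 1).
f = lambda x: 0.5 * float(x @ x)
grad = lambda x: x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)

for alpha in (0.01, 1.0, 1.95):
    print(alpha, wolfe_conditions(f, grad, x0, d0, alpha))
```

On this toy problem, a very short step satisfies (5) but violates the curvature condition (6), while a long step can satisfy the WWP rule and still fail the stricter SWP condition (7).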

On the other hand, different CG methods are determined by different CG parameters. The well-known CG methods include the Fletcher–Reeves (FR) [4], Polak–Ribière–Polyak (PRP) [5, 6], Hestenes–Stiefel (HS) [7], Liu–Storey (LS) [8], Fletcher (CD) [9], and Dai–Yuan (DY) [10] methods, and their CG parameters β_k are, respectively, given by

\begin{matrix} β_{k}^{FR} = \frac{{‖g_{k}‖}^{2}}{{‖g_{k - 1}‖}^{2}}, \\ β_{k}^{PRP} = \frac{g_{k}^{T} y_{k - 1}}{{‖g_{k - 1}‖}^{2}}, \\ β_{k}^{HS} = \frac{g_{k}^{T} y_{k - 1}}{d_{k - 1}^{T} y_{k - 1}}, \\ β_{k}^{LS} = - \frac{g_{k}^{T} y_{k - 1}}{g_{k - 1}^{T} d_{k - 1}}, \\ β_{k}^{CD} = - \frac{{‖g_{k}‖}^{2}}{d_{k - 1}^{T} g_{k - 1}}, \\ β_{k}^{DY} = \frac{{‖g_{k}‖}^{2}}{d_{k - 1}^{T} y_{k - 1}}, \end{matrix}

(8)

where y_{k−1} := g_k − g_{k−1} and ‖·‖ stands for the Euclidean norm. The methods generated by the above CG parameters are called the classical CG methods, and their convergence analysis and numerical performance have been extensively studied (see, e.g., [4–12]). The above formulas for the CG parameters are equivalent when f(x) is a convex quadratic and the stepsize α_k is obtained by the exact line search rule (4). For general objective functions, however, the numerical performance depends strongly on the choice of β_k. The FR, CD, and DY methods possess good convergence properties, but their numerical performance is somewhat unsatisfactory for general unconstrained nonlinear optimization problems [12–14]. In contrast, the convergence properties of the PRP, HS, and LS methods are weaker, but these methods often perform better computationally [12–14]. Therefore, in the past few decades, many formulas for β_k have been designed on the basis of (8) to obtain CG methods with both good global convergence properties and promising numerical performance (see [12–16] and references therein).
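
As a small illustration of the equivalence claim, the sketch below evaluates the six formulas in (8) after one exact line search step on a two-dimensional convex quadratic. The function name `classical_betas` and the particular matrix `A` and vector `b` are assumptions made for this example only.

```python
import numpy as np

def classical_betas(g, g_prev, d_prev):
    """Evaluate the six classical CG parameters of (8)."""
    y = g - g_prev                      # y_{k-1} = g_k - g_{k-1}
    return {
        "FR": (g @ g) / (g_prev @ g_prev),
        "PRP": (g @ y) / (g_prev @ g_prev),
        "HS": (g @ y) / (d_prev @ y),
        "LS": -(g @ y) / (g_prev @ d_prev),
        "CD": -(g @ g) / (d_prev @ g_prev),
        "DY": (g @ g) / (d_prev @ y),
    }

# Illustrative convex quadratic f(x) = 0.5 x^T A x - b^T x (A symmetric positive definite).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda x: A @ x - b

x1 = np.zeros(2)
g1 = grad(x1)
d1 = -g1                                   # first direction from (3)
alpha1 = -(g1 @ d1) / (d1 @ A @ d1)        # exact line search (4) for a quadratic
x2 = x1 + alpha1 * d1
g2 = grad(x2)

betas = classical_betas(g2, g1, d1)
print(betas)
```

With the exact stepsize, g₂ is orthogonal to both g₁ and d₁, which is why all six values coincide here; with an inexact line search or a non-quadratic f they generally differ.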

To our knowledge, the first hybrid CG method in the literature was proposed by Touati-Ahmed and Storey [17] (TS method), where β_k is computed as

\begin{matrix} β_{k}^{TS} = \begin{cases} β_{k}^{PRP}, & \text{if } 0 \leq β_{k}^{PRP} \leq β_{k}^{FR}, \\ β_{k}^{FR}, & \text{otherwise}. \end{cases} \end{matrix}

(9)

Apparently, the TS method inherits some good properties of the FR and PRP methods since β_k^TS is a hybrid of β_k^FR and β_k^PRP. Combining the HS and DY methods, Dai and Yuan [18] proposed another hybrid CG method (hHD method), in which the hybrid CG parameter β_k is obtained by

\begin{matrix} β_{k}^{hHD} = \max \{0, \min \{β_{k}^{HS}, β_{k}^{DY}\}\} . \end{matrix}

(10)

When the WWP line search rule is used to compute the stepsize, the resulting search direction in [18] is a descent direction and the global convergence of the hHD method is proved. Moreover, the numerical experiments reported in [18] illustrate that the hHD method is competitive and practicable. For other closely related works, we refer the readers to [18, 19] and the references therein. It is worth noting that the CG parameters β_k defined in [17–19] are restricted to nonnegative values. As explained in [19], this restriction is what yields the global convergence of the algorithm. In recent years, many hybrid CG methods have been proposed on the basis of discrete combinations of several CG parameters (see, e.g., [1, 13, 20–23]). The combination parameter is computed by some secant equations [13, 20], by the conjugacy condition [21, 22], or by minimizing a least-squares problem consisting of the unknown search direction and an existing one (see [23] and the references therein).
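
The two switching rules (9) and (10) are simple enough to state directly in code. The sketch below assumes that the classical parameters have already been computed; the function names `beta_ts` and `beta_hhd` are hypothetical.

```python
def beta_ts(beta_prp, beta_fr):
    """Touati-Ahmed and Storey hybrid (9): use PRP only when 0 <= PRP <= FR."""
    return beta_prp if 0.0 <= beta_prp <= beta_fr else beta_fr

def beta_hhd(beta_hs, beta_dy):
    """Dai and Yuan hybrid (10): truncate min(HS, DY) at zero."""
    return max(0.0, min(beta_hs, beta_dy))

# Illustrative values only.
print(beta_ts(0.3, 0.5), beta_ts(-0.2, 0.5), beta_ts(0.8, 0.5))
print(beta_hhd(-0.1, 0.4), beta_hhd(0.2, 0.4), beta_hhd(0.6, 0.4))
```

Since β_k^FR is a ratio of squared norms, both rules always return a nonnegative value, which is the restriction discussed above.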

In 2006, Wei et al. [24] introduced a modified PRP method, usually called the WYL method, in which the parameter β_k is given by

\begin{matrix} β_{k}^{WYL} = \frac{{‖g_{k}‖}^{2} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} g_{k}^{T} g_{k - 1}}{{‖g_{k - 1}‖}^{2}} . \end{matrix}

(11)

Under the assumption that the direction d_k generated by the WYL method [24] satisfies the so-called sufficient descent condition

\begin{matrix} g_{k}^{T} d_{k} \leq - c {‖g_{k}‖}^{2}, c > 0, \end{matrix}

(12)

the WYL method is globally convergent under the WWP line search rule and possesses superior numerical performance. Subsequently, Dai and Wen [25] proposed two improved CG methods with sufficient descent property. The CG parameters β_k in [25] are defined as

\begin{matrix} β_{k}^{DHS} = \frac{{‖g_{k}‖}^{2} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} |g_{k}^{T} g_{k - 1}|}{d_{k - 1}^{T} y_{k - 1} + μ |g_{k}^{T} d_{k - 1}|}, \\ β_{k}^{DPRP} = \frac{{‖g_{k}‖}^{2} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} |g_{k}^{T} g_{k - 1}|}{{‖g_{k - 1}‖}^{2} + μ |g_{k}^{T} d_{k - 1}|}, \end{matrix}

(13)

where μ > 1. Clearly, the search direction yielded by β_k^DPRP satisfies the sufficient descent condition without depending on any line search. However, the sufficient descent property associated with β_k^DHS relies on the WWP line search rule.

Based on the above observations, it is interesting to design a hybrid CG method such that the CG parameter is nonnegative and the resulting search direction possesses the sufficient descent property independent of line search technique. Motivated by the methods in [24, 25] and considering that the HS method performs best among the classical CG methods, a new formula for the CG parameter β_k is given by

\begin{matrix} β_{k}^{hHPR} = \min \left\{ |β_{k}^{HS}|, \frac{{‖g_{k}‖}^{2} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} g_{k}^{T} g_{k - 1}}{{‖g_{k - 1}‖}^{2} + γ |g_{k}^{T} d_{k - 1}|} \right\}, \end{matrix}

(14)

where γ > 2. It is not difficult to see that β_k^hHPR is a hybrid of β_k^HS, β_k^WYL, and β_k^DPRP. Interestingly, the above parameter β_k^hHPR is always nonnegative. To see this, let θ_k be the angle between g_k and g_k−1. Thus, we know from (14) that

\begin{matrix} β_{k}^{hHPR} \leq \frac{{‖g_{k}‖}^{2} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} g_{k}^{T} g_{k - 1}}{{‖g_{k - 1}‖}^{2} + γ |g_{k}^{T} d_{k - 1}|} = \frac{{‖g_{k}‖}^{2} (1 - \cos θ_{k})}{{‖g_{k - 1}‖}^{2} + γ |g_{k}^{T} d_{k - 1}|} \leq \frac{2 {‖g_{k}‖}^{2}}{{‖g_{k - 1}‖}^{2} + γ |g_{k}^{T} d_{k - 1}|}, \end{matrix}

(15)

which, together with |β_k^HS| ≥ 0 and 1 − cos θ_k ≥ 0, further implies

\begin{matrix} 0 \leq β_{k}^{hHPR} \leq \frac{2 {‖g_{k}‖}^{2}}{{‖g_{k - 1}‖}^{2}} . \end{matrix}

(16)

Moreover, plugging the CG parameter β_k : =β_k^hHPR into (3), we can show that the resulting search direction possesses the sufficient descent property independent of line search technique (see Lemma 1 below).
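
For readers who prefer code, formula (14) can be sketched as follows. The function name `beta_hhpr` is hypothetical, the default γ = 3 is the value used in the numerical experiments, and the sketch assumes d_{k−1}^T y_{k−1} ≠ 0 and g_{k−1} ≠ 0 so that no division by zero occurs.

```python
import numpy as np

def beta_hhpr(g, g_prev, d_prev, gamma=3.0):
    """Hybrid parameter (14): min of |beta^HS| and a damped WYL-type ratio."""
    y = g - g_prev
    beta_hs = (g @ y) / (d_prev @ y)            # assumes d_{k-1}^T y_{k-1} != 0
    ng, ng_prev = np.linalg.norm(g), np.linalg.norm(g_prev)
    num = ng**2 - (ng / ng_prev) * (g @ g_prev)  # equals ||g_k||^2 (1 - cos(theta_k))
    den = ng_prev**2 + gamma * abs(g @ d_prev)
    return min(abs(beta_hs), num / den)

# Illustrative vectors only; here g is orthogonal to both g_prev and d_prev.
beta_example = beta_hhpr(np.array([0.5, -0.25]),
                         np.array([-1.0, -2.0]),
                         np.array([1.0, 2.0]))
print(beta_example)
```

In exact arithmetic the numerator is nonnegative by the Cauchy–Schwarz inequality; in floating point it may come out marginally below zero when g_k and g_{k−1} are nearly parallel, so a practical implementation might clip it at zero.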

The rest of this paper is organized as follows. In Section 2, our algorithm framework is presented, and the sufficient descent property of the resulting search direction is discussed in detail. Section 3 is devoted to establishing the convergence of the proposed method with the WWP line search rule. In Section 4, some preliminary numerical results are reported to verify the efficiency of the presented method.

2. The Algorithm

In this section, we first propose the algorithm framework for solving problem (1), in which we do not specify which line search rule generates the stepsize. Subsequently, we analyze the sufficient descent property for the search direction. By inserting the WWP line search rule into the algorithm framework, our hybrid CG method is proposed.
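
A minimal Python sketch of such a framework, assuming the iteration (2)–(3) with β_k taken from (14), is given below. The names `beta_hhpr`, `cg_framework`, and `exact_quadratic_step` are hypothetical, the line search is passed in as a callable because the framework leaves it open, and the small quadratic test problem is an illustrative assumption rather than one of the test problems of Section 4.

```python
import numpy as np

def beta_hhpr(g, g_prev, d_prev, gamma):
    # Formula (14); assumes d_{k-1}^T y_{k-1} != 0.
    y = g - g_prev
    beta_hs = (g @ y) / (d_prev @ y)
    ng, ng_prev = np.linalg.norm(g), np.linalg.norm(g_prev)
    num = ng**2 - (ng / ng_prev) * (g @ g_prev)
    den = ng_prev**2 + gamma * abs(g @ d_prev)
    return min(abs(beta_hs), num / den)

def cg_framework(f, grad, x1, line_search, gamma=3.0, tol=1e-6, max_iter=2000):
    """Generic CG loop: x_{k+1} = x_k + alpha_k d_k with d_k from (3) and (14)."""
    x = np.asarray(x1, dtype=float)
    g = grad(x)
    d = -g                                   # d_1 = -g_1
    iters = 0
    while np.linalg.norm(g) > tol and iters < max_iter:
        alpha = line_search(f, grad, x, d)   # any rule producing alpha_k > 0
        x = x + alpha * d
        g_new = grad(x)
        beta = beta_hhpr(g_new, g, d, gamma)
        d = -g_new + beta * d                # d_k = -g_k + beta_k d_{k-1}
        g = g_new
        iters += 1
    return x, iters

# Illustrative problem: f(x) = 0.5 x^T A x - b^T x with an exact line search.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

def exact_quadratic_step(f, grad, x, d):
    # Closed-form minimizer of f along d; valid only for this quadratic.
    return -(grad(x) @ d) / (d @ A @ d)

x_star, n_iter = cg_framework(f, grad, np.zeros(2), exact_quadratic_step)
print(x_star, n_iter)
```

Passing the line search as an argument mirrors the text: the descent property of d_k does not depend on how α_k is chosen, so the same loop can be combined with the WWP rule or any other stepsize rule.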

The following lemma shows that the direction sequence {d_k} generated by Algorithm 1 possesses the sufficient descent property independent of any line search.

Lemma 1.

Let {d_k} be a sequence generated by Algorithm 1. Then, for some constant M ∈ (0,1), it holds that

$\begin{matrix} g_{k}^{T} d_{k} \leq - M {‖g_{k}‖}^{2}, \quad \forall k \geq 1. \end{matrix}$ (17)

Proof.

When k=1, it follows from the definition of d_k in (3) that g₁^Td₁=−‖g₁‖² ≤ −M‖g₁‖². So, the relation in (17) holds when k=1. Now, consider the case k ≥ 2. If g_k^Td_{k−1}=0, it follows from (3) that

$\begin{matrix} g_{k}^{T} d_{k} = - {‖g_{k}‖}^{2} \leq - M {‖g_{k}‖}^{2} . \end{matrix}$ (18)

Otherwise, g_k^Td_{k−1} ≠ 0, and it follows from (3), (15), and (16) that

$\begin{matrix} g_{k}^{T} d_{k} = - {‖g_{k}‖}^{2} + β_{k}^{hHPR} g_{k}^{T} d_{k - 1} \\ \leq - {‖g_{k}‖}^{2} + \frac{2 {‖g_{k}‖}^{2}}{{‖g_{k - 1}‖}^{2} + γ |g_{k}^{T} d_{k - 1}|} |g_{k}^{T} d_{k - 1}| \\ \leq - {‖g_{k}‖}^{2} + \frac{2 {‖g_{k}‖}^{2}}{γ |g_{k}^{T} d_{k - 1}|} |g_{k}^{T} d_{k - 1}| \\ \leq - (1 - \frac{2}{γ}) {‖g_{k}‖}^{2} ≕ - M {‖g_{k}‖}^{2}, \end{matrix}$ (19)

which completes the proof.
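
The bound in (19) can also be checked numerically. The following sketch samples random vectors, draws β_k anywhere between 0 and the upper bound in (15), and records the largest observed ratio g_k^T d_k / ‖g_k‖². The dimension, the sample size, and the random seed are arbitrary assumptions; γ = 3 is the value used in the experiments, so M = 1 − 2/γ = 1/3.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 3.0
M = 1.0 - 2.0 / gamma                  # constant from (19)

worst = -np.inf
for _ in range(10000):
    g, g_prev, d_prev = rng.normal(size=(3, 5))
    # Upper bound on beta_k^hHPR from (15).
    upper = 2.0 * (g @ g) / (g_prev @ g_prev + gamma * abs(g @ d_prev))
    beta = rng.uniform(0.0, upper)     # any admissible nonnegative value
    d = -g + beta * d_prev             # direction from (3)
    worst = max(worst, (g @ d) / (g @ g))

print(worst, -M)
```

The check uses only the bounds (15) and (16), not the specific value of β_k^hHPR, which reflects how the proof itself proceeds.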

For convenience, in what follows we refer to Algorithm 1 equipped with the WWP line search rule as the hHPR CG method.

3. Convergence

In this section, we analyze the convergence of the hHPR CG method. To this end, the following common assumptions are needed.

Assumption 1.

(i) The level set Ω={x ∈ Rⁿ|f(x) ≤ f(x₁)} is bounded. Here, x₁ is the given initial point.

(ii) In some neighborhood N of the level set Ω, the objective function f(x) is continuously differentiable, and its gradient g(x) is Lipschitz continuous, i.e., there exists a constant L > 0 such that
$\begin{matrix} ‖g (x) - g (y)‖ \leq L ‖x - y‖, \forall x, y \in N . \end{matrix}$ (20)

The following lemma provides the convergence for the PRP-type CG method, which was originally introduced in [19].

Lemma 2.

Consider the general CG method (2) and (3) with the following three properties:

(i) The CG parameter is always nonnegative, i.e., β_k ≥ 0 for all k ≥ 1.

(ii) The line search satisfies (5) and (6), and the sufficient descent condition (12) holds.

(iii) Property (∗) holds.

Then,
$\begin{matrix} \underset{k ⟶ \infty}{\lim \inf} ‖g_{k}‖ = 0. \end{matrix}$ (21)

Property 1.

(∗) Consider a method of forms (2) and (3). Suppose that

$\begin{matrix} 0 < γ \leq ‖g_{k}‖ \leq \bar{γ}, \quad \forall k \geq 1. \end{matrix}$ (22)

We say that the method has property (∗) if there exist constants b > 1 and λ > 0 such that, for all k ≥ 1, |β_k| ≤ b and, whenever ‖s_{k−1}‖ ≤ λ with s_{k−1}=x_k − x_{k−1}, we have |β_k| ≤ 1/(2b). Here, γ and $\bar{γ}$ denote bounds on the gradient norm and should not be confused with the parameter γ in (14).

From (16) and Lemmas 1 and 2, to obtain the global convergence of the hHPR CG method, it suffices to prove that our method has property (∗).

Lemma 3.

Consider the method of forms (2) and (3) in which β_k=β_k^hHPR. If Assumption 1 holds, then β_k^hHPR satisfies property (∗).

Proof.

Considering the method of forms (2) and (3) and using the constants γ and $\bar{γ}$ in (22), we have from (16) that

$\begin{matrix} 0 \leq β_{k}^{hHPR} \leq \frac{2 {‖g_{k}‖}^{2}}{{‖g_{k - 1}‖}^{2}} \leq \frac{2 {\bar{γ}}^{2}}{γ^{2}} . \end{matrix}$ (23)

Let $b = 2 {\bar{γ}}^{2} / γ^{2} \geq 2$ and $λ = γ^{4} / (8 L {\bar{γ}}^{3}) > 0$. If ‖s_{k−1}‖ ≤ λ, we obtain from Assumption 1(ii) and (15) that

$\begin{matrix} β_{k}^{hHPR} \leq \frac{g_{k}^{T} \left(g_{k} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} g_{k - 1}\right)}{{‖g_{k - 1}‖}^{2}} \\ \leq \frac{‖g_{k}‖ \cdot ‖g_{k} - g_{k - 1} + g_{k - 1} - \frac{‖g_{k}‖}{‖g_{k - 1}‖} g_{k - 1}‖}{{‖g_{k - 1}‖}^{2}} \\ \leq \frac{‖g_{k}‖ \cdot ‖g_{k} - g_{k - 1}‖ + ‖g_{k}‖ \cdot |‖g_{k - 1}‖ - ‖g_{k}‖|}{{‖g_{k - 1}‖}^{2}} \\ \leq \frac{2 ‖g_{k}‖ ‖g_{k} - g_{k - 1}‖}{{‖g_{k - 1}‖}^{2}} \leq \frac{2 L ‖s_{k - 1}‖ ‖g_{k}‖}{{‖g_{k - 1}‖}^{2}} \\ \leq \frac{2 L λ \bar{γ}}{γ^{2}} = \frac{1}{2 b} . \end{matrix}$ (24)

Therefore, the proof is completed.
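
For completeness, the final equality in (24) and the bound b ≥ 2 can be verified by direct substitution of the constants chosen in the proof, where γ and $\bar{γ}$ are the bounds from (22):

```latex
\frac{2 L \lambda \bar{\gamma}}{\gamma^{2}}
  = \frac{2 L \bar{\gamma}}{\gamma^{2}} \cdot \frac{\gamma^{4}}{8 L \bar{\gamma}^{3}}
  = \frac{\gamma^{2}}{4 \bar{\gamma}^{2}}
  = \frac{1}{2} \cdot \frac{\gamma^{2}}{2 \bar{\gamma}^{2}}
  = \frac{1}{2 b},
\qquad
b = \frac{2 \bar{\gamma}^{2}}{\gamma^{2}} \geq 2
  \quad \text{since } 0 < \gamma \leq \bar{\gamma}.
```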

With (16) and Lemmas 1–3 at hand, one can establish the global convergence of the hHPR CG method.

Theorem 1.

Let {x_k} be a sequence generated by the hHPR CG method. If Assumption 1 holds, then lim inf_{k⟶∞} ‖g_k‖=0.

4. Numerical Experiments

In this section, we assess the efficiency and robustness of the hHPR CG method (hHPR for short) on a set of classical test problems and compare it with two well-known CG methods: DHS and DPRP in [25].

Some of the test problems are from the well-known CUTE library [26], and the others come from [27]. Their dimensions range from 2 to 1000000. All codes were written in MATLAB R2016a, and the numerical experiments were conducted on a Dell PC with an Intel Core CPU at 3.00 GHz and 16.00 GB RAM. For all three methods, we reset the search direction by taking d_k := −g_k once an ascent direction occurs. For the sake of fairness, all stepsizes α_k are generated by the WWP line search rule using a bisection algorithm proposed in [28], with the parameters set to δ=0.01 and σ=0.1. Moreover, we adopt the strategy described in [29] to compute the initial stepsize.
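
The sketch below shows one commonly used bisection scheme for finding a stepsize that satisfies (5) and (6). Whether it matches the procedure of [28] in every detail is an assumption, and the function name `wwp_bisection`, the default initial trial step `t0`, and the iteration cap are hypothetical; the initial stepsize strategy of [29] is not reproduced.

```python
import numpy as np

def wwp_bisection(f, grad, x, d, delta=0.01, sigma=0.1, t0=1.0, max_iter=100):
    """Bracket-and-bisect search for a stepsize satisfying (5) and (6)."""
    lo, hi = 0.0, np.inf               # current bracket [lo, hi]
    t = t0
    fx = f(x)
    gd = grad(x) @ d                   # g_k^T d_k, assumed negative
    for _ in range(max_iter):
        if f(x + t * d) > fx + delta * t * gd:
            hi = t                     # (5) fails: step too long, shrink
            t = 0.5 * (lo + hi)
        elif grad(x + t * d) @ d < sigma * gd:
            lo = t                     # (6) fails: step too short, grow
            t = 2.0 * lo if np.isinf(hi) else 0.5 * (lo + hi)
        else:
            return t                   # both WWP conditions hold
    return t                           # fallback after max_iter trials

# Toy objective (an assumption for illustration): f(x) = 0.5 * ||x||^2.
f = lambda x: 0.5 * float(x @ x)
grad = lambda x: x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)

t_small = wwp_bisection(f, grad, x0, d0, t0=0.01)   # starts too short
t_large = wwp_bisection(f, grad, x0, d0, t0=10.0)   # starts too long
print(t_small, t_large)
```

The trial step is doubled while no upper bracket is known and bisected afterwards, so the search handles initial guesses that are either too short or too long.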

Let γ=3 for hHPR, and let μ=2 for DHS and DPRP. Denote the number of iterations, the CPU time in seconds, and the final value of ‖g_k‖ by Itr, Tcpu, and ‖g_∗‖, respectively. The program stops if ‖g_k‖ ≤ 10⁻⁶ or Itr > 2000. In the latter case, i.e., Itr > 2000, we use “—” in place of Itr, Tcpu, and ‖g_∗‖.

The numerical results are listed in Tables 1 and 2, where “TP” denotes the test problem and “Dim” stands for its dimension.

Table 1.

Numerical results for the three tested methods.

Problems	hHPR	DHS	DPRP
TP/Dim	Itr/Tcpu/‖g_∗‖	Itr/Tcpu/‖g_∗‖	Itr/Tcpu/‖g_∗‖
bdexp/50000	2/0.157/3.51e-89	2/0.136/3.51e-89	2/0.138/3.51e-89
bdexp/100000	2/0.223/4.58e-106	2/0.223/4.58e-106	2/0.220/4.58e-106
bdexp/1000000	2/2.857/1.42e-170	2/2.829/1.42e-170	2/2.848/1.42e-170
exdenschnf/50000	38/0.260/1.53e-07	30/0.230/9.80e-07	30/0.227/2.39e-07
exdenschnf/100000	42/0.496/6.67e-07	25/0.425/8.47e-07	24/0.412/2.50e-07
exdenschnb/5000	24/0.020/6.69e-07	16/0.011/3.36e-08	19/0.011/3.61e-07
exdenschnb/20000	25/0.047/2.79e-07	17/0.043/1.77e-07	22/0.057/4.59e-07
exdenschnb/100000	17/0.215/5.21e-07	17/0.182/1.48e-07	17/0.183/1.03e-07
himmelbg/20000	2/0.029/6.91e-28	2/0.022/6.91e-28	2/0.021/6.91e-28
himmelbg/100000	2/0.095/1.46e-28	2/0.094/1.46e-28	2/0.095/1.46e-28
genquartic/20000	21/0.061/3.91e-07	18/0.052/5.09e-07	13/0.045/5.23e-07
genquartic/100000	17/0.231/6.08e-07	18/0.230/4.49e-07	17/0.228/6.68e-07
genquartic/1000000	28/3.189/1.74e-07	18/2.818/5.34e-07	16/2.469/4.73e-07
biggsb1/200	1218/0.064/9.68e-07	1607/0.103/9.61e-07	1742/0.114/1.00e-06
biggsb1/400	—	—	—
sine/100	95/0.013/9.33e-07	29/0.003/5.63e-07	27/0.003/2.45e-07
sinquad/3	79/0.015/6.74e-07	359/0.022/9.39e-07	224/0.012/5.27e-07
fletcbv3/20	101/0.014/4.42e-07	110/0.006/9.01e-07	147/0.008/5.29e-07
fletcbv3/40	409/0.024/4.63e-07	485/0.026/7.67e-07	504/0.025/7.84e-07
eg2/100	428/0.074/9.04e-07	—	1428/0.115/7.75e-07
eg2/170	1353/0.395/2.61e-07	—	—
nonscomp/10000	—	—	—
nonscomp/20000	50/0.077/7.79e-07	49/0.076/2.55e-07	46/0.073/8.37e-07
nonscomp/50000	74/0.251/6.30e-07	69/0.239/6.07e-07	80/0.266/8.04e-07
cosine/1000	51/0.020/9.66e-07	20/0.009/5.27e-07	21/0.009/6.42e-07
cosine/10000	51/0.090/5.31e-07	33/0.072/4.32e-07	21/0.057/1.04e-07
dixmaana/3000	20/0.132/2.60e-07	22/0.113/4.60e-07	17/0.107/5.22e-07
dixmaanb/3000	18/0.129/8.63e-07	20/0.107/1.96e-07	14/0.100/3.01e-07
dixmaanc/3000	17/0.130/3.91e-07	17/0.098/2.62e-07	16/0.126/6.68e-07
dixmaand/3000	23/0.146/2.23e-07	17/0.106/4.12e-07	17/0.116/3.92e-07
dixmaane/3000	428/0.915/8.90e-07	612/1.731/8.10e-07	584/1.676/9.45e-07
dixmaanf/3000	287/0.633/9.86e-07	540/1.527/9.46e-07	477/1.364/9.29e-07
dixmaang/3000	431/0.951/7.93e-07	553/1.587/9.31e-07	598/1.759/8.21e-07
dixmaanh/3000	538/1.292/5.87e-07	960/3.049/7.59e-07	918/2.984/8.35e-07
dixmaanj/3000	—	—	—
dixmaank/3000	156/0.387/6.90e-07	—	—
dixmaanl/3000	—	—	1199/3.390/9.95e-07
dixon3dq/80	876/0.043/8.55e-07	912/0.050/8.66e-07	1097/0.062/9.83e-07
dixon3dq/160	—	—	—
dqdrtic/10000	67/0.073/4.83e-07	296/0.189/7.98e-07	243/0.160/5.41e-07
dqdrtic/100000	85/0.468/9.32e-07	256/1.248/5.90e-07	153/0.819/2.43e-07
dqdrtic/1000000	70/5.007/6.70e-07	265/14.500/8.11e-07	273/14.436/7.49e-07
dqrtic/200	27/0.014/2.68e-07	24/0.008/5.68e-07	26/0.009/5.17e-08
dqrtic/500	33/0.021/5.05e-07	34/0.023/6.73e-07	29/0.020/5.84e-07
edensch/1000	40/0.051/7.37e-07	41/0.056/7.02e-07	36/0.046/8.63e-07
edensch/4000	73/0.493/8.66e-07	46/0.133/3.62e-07	43/0.204/8.20e-07
edensch/8000	44/0.620/8.57e-07	68/1.022/5.23e-07	39/0.456/6.70e-07
engval1/6	30/0.009/8.33e-07	37/0.002/2.36e-07	38/0.002/8.34e-07
errinros/3	314/0.028/8.62e-07	—	—
fletchcr/100	84/0.022/9.11e-07	82/0.005/4.67e-07	67/0.005/7.99e-07
fletchcr/300	45/0.005/8.17e-07	110/0.008/7.13e-07	117/0.007/8.66e-07
freuroth/50	263/0.047/8.81e-07	788/0.107/8.23e-07	664/0.054/9.91e-07
genrose/5000	—	—	—
genrose/10000	177/0.152/6.78e-07	499/0.433/9.48e-07	776/0.659/8.02e-07
genrose/20000	—	—	—


Table 2.

Numerical results for the three tested methods (continued).

Problems	hHPR	DHS	DPRP
TP/Dim	Itr/Tcpu/‖g_∗‖	Itr/Tcpu/‖g_∗‖	Itr/Tcpu/‖g_∗‖
liarwhd/5000	94/0.066/7.37e-07	—	—
liarwhd/10000	138/0.186/8.02e-07	—	—
liarwhd/20000	125/0.388/7.88e-07	—	—
nondquar/30	534/0.072/8.10e-07	760/0.085/9.19e-07	—
penalty1/1000	30/0.418/2.55e-07	29/0.411/2.65e-07	29/0.390/9.90e-07
penalty1/5000	117/39.785/2.62e-07	278/103.383/4.27e-07	203/72.649/1.01e-07
power1/50	595/0.029/9.03e-07	743/0.039/9.46e-07	917/0.050/2.82e-07
power1/100	1468/0.068/9.91e-07	1666/0.101/6.89e-07	—
quartc/100	23/0.010/6.45e-08	21/0.004/3.28e-07	27/0.006/1.06e-07
quartc/560	32/0.023/7.15e-07	29/0.020/7.08e-07	33/0.022/5.70e-07
tridia/100	409/0.026/6.30e-07	477/0.030/3.94e-07	681/0.038/9.21e-07
tridia/1000	1564/0.153/9.39e-07	—	—
raydan1/100	77/0.008/9.73e-07	127/0.010/8.52e-07	106/0.005/9.59e-07
raydan1/500	210/0.015/9.14e-07	281/0.023/5.80e-07	268/0.022/7.43e-07
raydan2/5000	12/0.027/9.72e-07	12/0.021/3.54e-07	12/0.023/3.54e-07
raydan2/10000	12/0.051/3.75e-07	13/0.052/7.31e-08	13/0.050/7.31e-08
raydan2/50000	16/0.239/2.43e-08	16/0.217/8.05e-07	17/0.272/7.70e-07
diagonal1/40	61/0.010/7.58e-07	81/0.005/7.52e-07	63/0.004/8.67e-07
diagonal2/10000	830/1.082/9.48e-07	1398/2.017/9.75e-07	1134/1.660/6.18e-07
diagonal2/20000	1241/3.050/8.03e-07	1387/4.025/9.98e-07	1879/5.192/8.08e-07
diagonal3/10	39/0.007/7.69e-07	37/0.002/4.92e-07	39/0.002/4.59e-07
diagonal3/90	91/0.008/7.31e-07	144/0.010/9.04e-07	166/0.016/6.14e-07
bv/1000	138/0.447/8.56e-07	117/0.513/8.14e-07	127/0.550/9.09e-07
bv/2000	105/1.174/9.03e-07	107/1.476/8.99e-07	130/1.701/9.46e-07
ie/50	16/0.051/9.24e-07	11/0.039/2.20e-07	15/0.044/1.83e-07
ie/10	14/0.162/7.41e-07	12/0.156/2.72e-07	13/0.167/2.10e-07
singx/100	277/0.051/3.49e-07	—	—
singx/1000	565/2.397/7.35e-07	—	453/1.711/5.18e-07
woods/10000	177/0.175/2.74e-07	589/0.492/2.25e-07	814/0.631/8.15e-07
band/3	19/0.012/2.13e-07	15/0.002/6.71e-07	18/0.002/9.14e-07
bard/3	101/0.028/3.00e-07	698/0.097/4.95e-07	811/0.112/8.16e-07
beale/2	48/0.010/6.38e-07	131/0.007/2.30e-07	96/0.007/8.34e-07
biggs/6	—	—	—
box/3	74/0.014/5.48e-07	203/0.017/1.45e-07	185/0.015/5.45e-07
froth/2	92/0.014/6.85e-07	338/0.025/9.02e-07	381/0.026/8.13e-07
gauss/3	13/0.010/3.75e-07	24/0.004/4.71e-07	21/0.003/3.27e-07
helix/3	102/0.019/4.65e-07	361/0.032/9.10e-07	418/0.036/9.20e-07
jensam/2	39/0.009/2.79e-07	148/0.009/9.70e-07	149/0.009/8.40e-07
kowosb/4	224/0.029/9.98e-07	1029/0.075/3.25e-07	797/0.056/9.64e-07
lin/100	13/0.062/8.86e-07	13/0.054/8.86e-07	13/0.054/8.86e-07
lin/500	18/0.418/9.57e-07	18/0.420/9.38e-07	18/0.421/9.38e-07
osb2/11	734/0.113/7.42e-07	1513/0.259/4.38e-07	1174/0.198/7.47e-07
pen1/60	78/0.025/4.74e-07	390/0.056/8.20e-07	668/0.079/8.95e-07
pen2/100	151/0.074/7.94e-07	162/0.045/5.89e-07	235/0.081/8.49e-07
rose/2	79/0.011/9.71e-07	584/0.037/9.06e-07	858/0.053/2.86e-07
rosex/100	104/0.022/4.68e-07	1190/0.132/7.73e-07	705/0.080/8.08e-07
rosex/1000	106/0.899/2.49e-07	1040/6.318/8.60e-07	1374/8.180/2.95e-07
sing/4	213/0.023/6.12e-07	—	1111/0.070/9.15e-07
trid/100	80/0.021/4.41e-07	112/0.022/4.22e-07	128/0.024/6.31e-07
trid/200	37/0.015/6.81e-07	46/0.016/7.20e-07	42/0.017/6.19e-07
vardim/8	28/0.011/4.13e-07	26/0.004/6.51e-07	31/0.004/9.75e-07
watson/6	1811/0.500/7.74e-07	—	—
wood/4	173/0.020/6.93e-07	622/0.047/5.42e-07	966/0.069/9.26e-07


The performance profile introduced in [30] is a widely used tool for measuring the performance of numerical algorithms. Figures 1 and 2 plot the performance profiles of hHPR, DHS, and DPRP in terms of Itr and Tcpu, respectively. On the left side of Figures 1 and 2, the curve of the proposed method lies clearly above the other two curves, which shows that, compared with DHS and DPRP, the proposed method is efficient and encouraging. On the right side of Figures 1 and 2, the proposed method successfully solves about 90% of the test problems and clearly outperforms the other two methods.
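
To illustrate how such a profile is computed, the sketch below evaluates the performance profile of [30] on four rows of Table 1, using the iteration counts as the cost measure and `np.inf` for failed runs. The function name `performance_profile` and the choice of rows and of the thresholds τ are assumptions made for illustration; the sketch does not reproduce Figures 1 and 2.

```python
import numpy as np

def performance_profile(costs, taus):
    """Fraction of problems each solver handles within a factor tau of the best.

    costs: (n_problems, n_solvers) array; np.inf marks a failure.
    Problems on which every solver fails should be removed beforehand.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                     # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

inf = np.inf
# Itr for (hHPR, DHS, DPRP), copied from Table 1.
itr = [
    [79, 359, 224],     # sinquad/3
    [428, inf, 1428],   # eg2/100 (DHS failed)
    [67, 296, 243],     # dqdrtic/10000
    [95, 29, 27],       # sine/100
]
rho = performance_profile(itr, taus=[1.0, 4.0])
print(rho)
```

The value at τ = 1 (the left end of the curve) is the fraction of problems on which a method is the fastest, and the value for large τ (the right end) is the fraction of problems it solves at all, which is how the left and right sides of the figures are read above.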

Acknowledgments

The corresponding author acknowledges the Natural Science Foundation of Guangxi Province (grant no. 2021GXNSFAA075001).

Data Availability

All the datasets used in this paper are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

1. Alhawarat A., Salleh Z., Masmali I. A. A convex combination between two different search directions of conjugate gradient method and application in image restoration. Mathematical Problems in Engineering. 2021;2021:15.9941757
2. Alhawarat A., Alhamzi G., Masmali I., Salleh Z. A descent four-term conjugate gradient method with global convergence properties for large-scale unconstrained optimisation problems. Mathematical Problems in Engineering. 2021;2021:14.6219062
3. Masmali I. A., Salleh Z., Salleh Z., Alhawarat A. A decent three term conjugate gradient method with global convergence properties for large scale unconstrained optimization problems. AIMS Mathematics. 2021;6(10):10742–10764. doi: 10.3934/math.2021624.
4. Fletcher R., Reeves C. M. Function minimization by conjugate gradients. The Computer Journal. 1964;7(2):149–154. doi: 10.1093/comjnl/7.2.149.
5. Polak E., Ribière G. Note sur la convergence de méthodes de directions conjuguées. Revue française d’informatique et de recherche opérationnelle. Série rouge. 1969;3(16):35–43. doi: 10.1051/m2an/196903r100351.
6. Polyak B. T. The conjugate gradient method in extremal problems. USSR Computational Mathematics and Mathematical Physics. 1969;9(4):94–112. doi: 10.1016/0041-5553(69)90035-4.
7. Hestenes M. R., Stiefel E. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards. 1952;49(6):409–436. doi: 10.6028/jres.049.044.
8. Liu Y., Storey C. Efficient generalized conjugate gradient algorithms, part 1: Theory. Journal of Optimization Theory and Applications. 1991;69(1):129–137. doi: 10.1007/bf00940464.
9. Fletcher R. Unconstrained Optimization. New York, NY, USA: John Wiley & Sons; 1987. Practical methods of optimization.
10. Dai Y. H., Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM Journal on Optimization. 1999;10(1):177–182. doi: 10.1137/s1052623497318992.
11. Dai Y. H., Yuan Y. Nonlinear Conjugate Gradient Methods (In Chinese). Shanghai, China: Shanghai Scientific and Technical Publishers; 2000.
12. Liu J. K., Li S. J. New hybrid conjugate gradient method for unconstrained optimization. Applied Mathematics and Computation. 2014;245:36–43. doi: 10.1016/j.amc.2014.07.096.
13. Babaie-Kafaki S., Ghanbari R. Two hybrid nonlinear conjugate gradient methods based on a modified secant equation. Optimization. 2014;63(7):1027–1042. doi: 10.1080/02331934.2012.693083.
14. Jian J., Han L., Jiang X. A hybrid conjugate gradient method with descent property for unconstrained optimization. Applied Mathematical Modelling. 2015;39(3-4):1281–1290. doi: 10.1016/j.apm.2014.08.008.
15. Sun M., Liu J. Three modified Polak-Ribière-Polyak conjugate gradient methods with sufficient descent property. Journal of Inequalities and Applications. 2015;2015(1):125–138. doi: 10.1186/s13660-015-0649-9.
16. Arzuka I., Abu Bakar M. R., Leong W. J. A scaled three-term conjugate gradient method for unconstrained optimization. Journal of Inequalities and Applications. 2016;2016(1):325–340. doi: 10.1186/s13660-016-1239-1.
17. Touati-Ahmed D., Storey C. Efficient hybrid conjugate gradient techniques. Journal of Optimization Theory and Applications. 1990;64(2):379–397. doi: 10.1007/bf00939455.
18. Dai Y. H., Yuan Y. An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research. 2001;103:33–47.
19. Gilbert J. C., Nocedal J. Global convergence properties of conjugate gradient methods for optimization. SIAM Journal on Optimization. 1992;2(1):21–42. doi: 10.1137/0802003.
20. Andrei N. Another hybrid conjugate gradient algorithm for unconstrained optimization. Numerical Algorithms. 2008;47(2):143–156. doi: 10.1007/s11075-007-9152-9.
21. Andrei N. Hybrid conjugate gradient algorithm for unconstrained optimization. Journal of Optimization Theory and Applications. 2009;141(2):249–264. doi: 10.1007/s10957-008-9505-0.
22. Djordjević S. S. New hybrid conjugate gradient method as a convex combination of LS and FR methods. Acta Mathematica Scientia. 2019;39:214–228.
23. Babaie-Kafaki S., Ghanbari R. A hybridization of the Hestenes-Stiefel and Dai-Yuan conjugate gradient methods based on a least-squares approach. Optimization Methods and Software. 2015;30(4):673–681. doi: 10.1080/10556788.2014.966825.
24. Wei Z., Yao S., Liu L. The convergence properties of some new conjugate gradient methods. Applied Mathematics and Computation. 2006;183(2):1341–1350. doi: 10.1016/j.amc.2006.05.150.
25. Dai Z., Wen F. Another improved Wei-Yao-Liu nonlinear conjugate gradient method with sufficient descent property. Applied Mathematics and Computation. 2012;218(14):7421–7430. doi: 10.1016/j.amc.2011.12.091.
26. Bongartz I., Conn A. R., Gould N., Toint P. L. CUTE: constrained and unconstrained testing environment. ACM Transactions on Mathematical Software. 1995;21(1):123–160. doi: 10.1145/200979.201043.
27. Moré J. J., Garbow B. S., Hillstrom K. E. Testing unconstrained optimization software. ACM Transactions on Mathematical Software. 1981;7(1):17–41. doi: 10.1145/355934.355936.
28. Burke J. V., Engle A. Line search methods for convex-composite optimization. 2018. https://arxiv.org/abs/1806.05218.
29. Sellami B., Laskri Y., Benzine R. A new two-parameter family of nonlinear conjugate gradient methods. Optimization. 2015;64(4):993–1009. doi: 10.1080/02331934.2013.830118.
30. Dolan E. D., Moré J. J. Benchmarking optimization software with performance profiles. Mathematical Programming. 2002;91(2):201–213. doi: 10.1007/s101070100263.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

All the datasets used in this paper are available from the corresponding author upon request.

