J Inequal Appl. 2018 May 3;2018(1):105. doi: 10.1186/s13660-018-1696-9

A new smoothing modified three-term conjugate gradient method for l1-norm minimization problem

Shouqiang Du 1, Miao Chen 1
PMCID: PMC5934501  PMID: 29755245

Abstract

We consider a class of nonsmooth optimization problems with l1-norm minimization, which has many applications in compressed sensing, signal reconstruction, and related engineering problems. Using smoothing approximation techniques, this kind of nonsmooth optimization problem can be transformed into a general unconstrained optimization problem, which can be solved by the proposed smoothing modified three-term conjugate gradient method. The method is based on the Polak–Ribière–Polyak conjugate gradient method, which has good numerical properties. The proposed method possesses the sufficient descent property without any line search, and it is also proved to be globally convergent. Finally, numerical experiments show the efficiency of the proposed method.

Keywords: Nonsmooth optimization problem, Smoothing modified three-term conjugate gradient method, Global convergence

Introduction

In this paper, we consider the following nonsmooth optimization problem with l1-norm minimization:

$$\min_{x\in R^n}\ \frac{1}{2}\|Ax-b\|_2^2+\tau\|x\|_1, \tag{1}$$

where $x\in R^n$, $A\in R^{m\times n}$ ($m\ll n$), $b\in R^m$, $\tau>0$, $\|v\|_2$ denotes the Euclidean norm of $v$, and $\|v\|_1=\sum_{i=1}^n|v_i|$ is the l1-norm of $v$. This problem is widely used in compressed sensing, signal reconstruction, and analog-to-information conversion, and is related to many mathematical problems [1–16]. Problem (1) is also a typical compressed sensing scenario, in which a length-$n$ sparse signal is reconstructed from $m$ observations. From the Bayesian perspective, problem (1) can also be seen as a maximum a posteriori criterion for estimating $x$ from observations $b=Ax+\omega$, where $\omega$ is Gaussian noise of variance $\sigma^2$. Many researchers have studied numerical algorithms for solving problem (1) with large-scale data, such as the fixed point method [1], the gradient projection method for sparse reconstruction [2], the interior-point continuation method [3, 4], iterative shrinkage-thresholding algorithms [5, 6], the linearized Bregman method [7, 8], alternating direction algorithms [9], the nonsmooth equations-based method [10], and some related methods [11, 12]. Recently, a smoothing gradient method was given for solving problem (1) based on newly transformed absolute value equations in [14, 15]. The transformation is based on the equivalence between a linear complementarity problem and an absolute value equation problem [17, 18]. The complementarity problem, the absolute value equation problem, and the related constrained optimization problem are three important kinds of optimization problems [19–23]. On the other hand, nonlinear conjugate gradient methods and smoothing methods are widely used to solve large-scale optimization problems [24, 25], total variation image restoration [26], monotone nonlinear equations with convex constraints [27], and nonsmooth optimization problems, such as nonsmooth nonconvex problems [28], minimax problems [29], and P0 nonlinear complementarity problems [30]. In particular, the three-term conjugate gradient method based on the Hager–Zhang and Polak–Ribière–Polyak conjugate gradient methods [31–33] is widely used, and its effectiveness and numerical performance have been widely studied. Based on the above analysis, in this paper we propose a new smoothing modified three-term conjugate gradient method to solve problem (1). The global convergence analysis of the proposed method is also presented.

The remainder of this paper is organized as follows. In Sect. 2, we give the transformation of problem (1), which includes transforming a linear complementarity problem into an absolute value equation problem. In Sect. 3, we present the smoothing modified three-term conjugate gradient method and give its convergence analysis. Finally, we give some numerical results that show the effectiveness of the method.

Results: the transformation of the problem

In this section, as in [9, 10, 14, 15], we set

$$x=u-v,\quad u\ge 0,\quad v\ge 0,$$

where $u_i=(x_i)_+$ and $v_i=(-x_i)_+$ for all $i=1,2,\dots,n$ with $(x_i)_+=\max\{x_i,0\}$. We also have $\|x\|_1=1_n^Tu+1_n^Tv$, where $1_n=[1,1,\dots,1]^T$ is the $n$-dimensional vector of ones. Thus, problem (1) can be transformed into the following problem:
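The splitting above can be illustrated with a minimal Python sketch. The helper name `split_positive_negative` and the sample vector are hypothetical and only serve to check that $x=u-v$ and that the l1-norm equals the sum of the entries of $u$ and $v$.

```python
def split_positive_negative(x):
    """Return (u, v) with u_i = max(x_i, 0) and v_i = max(-x_i, 0)."""
    u = [max(xi, 0.0) for xi in x]
    v = [max(-xi, 0.0) for xi in x]
    return u, v

# Hypothetical sample vector, chosen only for illustration.
x = [1.5, -2.0, 0.0, 0.25]
u, v = split_positive_negative(x)

# x = u - v holds componentwise.
assert all(abs(xi - (ui - vi)) < 1e-12 for xi, ui, vi in zip(x, u, v))

# The l1-norm of x is the sum of all entries of u and v.
l1_norm = sum(abs(xi) for xi in x)
assert abs(l1_norm - (sum(u) + sum(v))) < 1e-12
```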

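$$\min_{z=(u,v)^T\ge 0}\ \frac{1}{2}\|b-A(u-v)\|_2^2+\tau 1_n^Tu+\tau 1_n^Tv,$$

i.e.,

$$\min_{z\ge 0}\ \frac{1}{2}z^THz+c^Tz, \tag{2}$$

where

$$z=(u,v)^T,\quad c=\tau 1_{2n}+\begin{pmatrix}-A^Tb\\ A^Tb\end{pmatrix},\quad H=\begin{pmatrix}A^TA & -A^TA\\ -A^TA & A^TA\end{pmatrix}.$$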
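As a sanity check of this reformulation, the following Python sketch builds $H$ and $c$ for a small dense matrix and verifies that the quadratic objective of (2) differs from the objective of (1) only by the constant $\frac{1}{2}b^Tb$, which (2) drops. The helper names and the sample data are hypothetical; the sketch assumes $A$ is stored as a list of rows.

```python
def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def build_qp(A, b, tau):
    """Return (H, c) of problem (2) for a dense A given as a list of rows."""
    At = transpose(A)
    G = matmul(At, A)                                   # A^T A, n x n
    n = len(G)
    Atb = [sum(At[i][k] * b[k] for k in range(len(b))) for i in range(n)]
    H = [G[i] + [-g for g in G[i]] for i in range(n)] + \
        [[-g for g in G[i]] + G[i] for i in range(n)]   # [[G, -G], [-G, G]]
    c = [tau - t for t in Atb] + [tau + t for t in Atb] # tau*1 + (-A^T b; A^T b)
    return H, c

def qp_value(H, c, z):
    Hz = [sum(h * zj for h, zj in zip(row, z)) for row in H]
    return 0.5 * sum(zi * hi for zi, hi in zip(z, Hz)) + \
        sum(ci * zi for ci, zi in zip(c, z))

def l1_objective(A, b, tau, x):
    r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(ri * ri for ri in r) + tau * sum(abs(xi) for xi in x)

# Hypothetical 2 x 3 instance, for illustration only.
A = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
b = [1.0, 2.0]
tau = 0.5
x = [1.0, -1.0, 0.5]
z = [max(xi, 0.0) for xi in x] + [max(-xi, 0.0) for xi in x]   # z = (u, v)

H, c = build_qp(A, b, tau)
half_btb = 0.5 * sum(bi * bi for bi in b)
assert abs(qp_value(H, c, z) + half_btb - l1_objective(A, b, tau, x)) < 1e-9
```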

Since $H$ is a positive semi-definite matrix, problem (2) can be translated into a linear variational inequality problem, which is to find $z\in R^{2n}$, $z\ge 0$, such that

$$\langle Hz+c,\ \tilde z-z\rangle\ge 0,\quad \forall\,\tilde z\ge 0. \tag{3}$$

By the structure of the feasible region of $z$, problem (3) is equivalent to the linear complementarity problem, which is to find $z\in R^{2n}$ such that

$$z\ge 0,\quad Hz+c\ge 0,\quad z^T(Hz+c)=0. \tag{4}$$

Due to the equivalence of linear complementarity problems and absolute value equation problems, problem (4) can be transformed into the following absolute value equation problem:

$$(H+I)z+c=|(H-I)z+c|.$$

Then problem (1) can be transformed into the following problem:

$$\min_{z\in R^{2n}}\ f(z)=\frac{1}{2}\big\|(H+I)z+c-|(H-I)z+c|\big\|^2. \tag{5}$$
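The equivalence between the linear complementarity problem and the absolute value equation can be checked in a few lines. Write $w=Hz+c$, so that $(H+I)z+c=w+z$ and $(H-I)z+c=w-z$. Using the elementary identity $a+b-|a-b|=2\min\{a,b\}$ for real numbers (a standard fact that is not stated in the text above), we obtain

$$\begin{aligned}
w+z=|w-z| &\iff \min\{z_i,w_i\}=0,\quad i=1,2,\dots,2n,\\
&\iff z\ge 0,\quad w\ge 0,\quad z_iw_i=0,\quad i=1,2,\dots,2n,\\
&\iff z\ge 0,\quad Hz+c\ge 0,\quad z^T(Hz+c)=0.
\end{aligned}$$

In particular, $f(z)=0$ in (5) exactly when $z$ solves the complementarity problem.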

Main results and discussions

In this section, we present the smoothing modified three-term conjugate gradient method to solve problem (1). Firstly, we give the definition of smoothing function and smoothing approximation function of the absolute value function [14, 15, 29].

Definition 1

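Let $f:R^n\to R$ be a locally Lipschitz continuous function. We call $\tilde f:R^n\times R_+\to R$ a smoothing function of $f$ if $\tilde f_\mu(\cdot)=\tilde f(\cdot,\mu)$ is continuously differentiable in $R^n$ for any fixed $\mu>0$ and

$$\lim_{\mu\to 0}\tilde f_\mu(x)=f(x)$$

for any fixed $x\in R^n$.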

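The smoothing function of the absolute value function is defined by

$$\Phi_{i\mu}(z)=\sqrt{\big((H-I)z+c\big)_i^2+\mu^2},\quad \mu\in R_+,\ i=1,2,\dots,2n, \tag{6}$$

and satisfies

$$\lim_{\mu\to 0}\Phi_{i\mu}(z)=\big|\big((H-I)z+c\big)_i\big|,\quad i=1,2,\dots,2n.$$

Based on (6), we obtain the following unconstrained optimization problem:

$$\min_{z\in R^{2n}}\ \bar f_\mu(z)=\frac{1}{2}\sum_{i=1}^{2n}\bar f_{i\mu}^2(z),$$

where $\bar f_{i\mu}(z)=\big((H+I)z+c\big)_i-\Phi_{i\mu}(z)$ for $i=1,2,\dots,2n$, so that $\bar f_\mu$ is a smoothing function of $f(z)$ in (5).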
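The smoothed objective and its gradient can be sketched in Python as follows. The gradient formula is not given in the text; it is derived here by the chain rule, $\nabla_z\bar f_\mu(z)=\sum_i \bar f_{i\mu}(z)\,[(H+I)_{i\cdot}-(t_i/\Phi_{i\mu}(z))(H-I)_{i\cdot}]$ with $t_i=((H-I)z+c)_i$, and is checked against central finite differences. The function names and the small matrix `H_demo` are hypothetical.

```python
import math

def residual_terms(H, c, z, mu):
    """Return (r, t, phi) with t_i = ((H-I)z+c)_i, phi_i = sqrt(t_i^2 + mu^2),
    and r_i = ((H+I)z+c)_i - phi_i."""
    r, t, phi = [], [], []
    for i, row in enumerate(H):
        hz = sum(h * zj for h, zj in zip(row, z)) + c[i]
        ti = hz - z[i]
        pi = math.sqrt(ti * ti + mu * mu)
        r.append(hz + z[i] - pi)
        t.append(ti)
        phi.append(pi)
    return r, t, phi

def smoothed_objective(H, c, z, mu):
    r, _, _ = residual_terms(H, c, z, mu)
    return 0.5 * sum(ri * ri for ri in r)

def smoothed_gradient(H, c, z, mu):
    """Chain rule: grad = sum_i r_i * [(H+I)_i - (t_i / phi_i) * (H-I)_i]."""
    r, t, phi = residual_terms(H, c, z, mu)
    n = len(z)
    grad = [0.0] * n
    for i, row in enumerate(H):
        w = t[i] / phi[i]
        for j in range(n):
            delta = 1.0 if i == j else 0.0
            grad[j] += r[i] * ((row[j] + delta) - w * (row[j] - delta))
    return grad

# Hypothetical small instance, for illustration only.
H_demo = [[2.0, -1.0], [-1.0, 2.0]]
c_demo = [-1.0, 0.5]
z_demo = [0.3, -0.2]
mu_demo = 0.1

g_demo = smoothed_gradient(H_demo, c_demo, z_demo, mu_demo)
h = 1e-6
for j in range(2):
    zp = list(z_demo); zp[j] += h
    zm = list(z_demo); zm[j] -= h
    fd = (smoothed_objective(H_demo, c_demo, zp, mu_demo)
          - smoothed_objective(H_demo, c_demo, zm, mu_demo)) / (2 * h)
    assert abs(fd - g_demo[j]) < 1e-5
```

With `mu = 0.0` the function `smoothed_objective` reduces to $f(z)$ in (5), so evaluating it for a decreasing sequence of `mu` shows the approximation tightening.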

Now, we give the smoothing modified three-term conjugate gradient method.

Algorithm 1

(Smoothing modified three-term conjugate gradient method)

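  • Step 0.

    Choose $0<\sigma<1$, $0<\rho<1$, $r>0$, $\mu=2$, $\eta=1$, $\varepsilon>0$, $\mu_0>1$ and, given an initial point $z_0\in R^{2n}$, let $d_0=-\tilde g_0$, where $\tilde g_0=\nabla_z\tilde f(z_0,\mu_0)$.

  • Step 1.

    If $\|\nabla_z\tilde f(z_k,\mu_k)\|\le\varepsilon$, stop; otherwise, go to Step 2.

  • Step 2.
    Compute the search direction by using $\tilde\beta_k^{BZAU}$ and $\tilde\theta_k^{BZAU}$, which are defined by
    $$\tilde\beta_k^{BZAU}=\frac{\nabla_z\tilde f_\mu(z_k)^T\big(\nabla_z\tilde f_\mu(z_k)-\nabla_z\tilde f_\mu(z_{k-1})\big)}{-\eta\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}+\mu\,|\nabla_z\tilde f_\mu(z_k)^Td_{k-1}|}, \tag{7}$$
    $$\tilde\theta_k^{BZAU}=\frac{\nabla_z\tilde f_\mu(z_k)^Td_{k-1}}{-\eta\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}+\mu\,|\nabla_z\tilde f_\mu(z_k)^Td_{k-1}|},\qquad d_k=\begin{cases}-\nabla_z\tilde f_\mu(z_k) & \text{if } k=0,\\ -\nabla_z\tilde f_\mu(z_k)+\tilde\beta_k^{BZAU}d_{k-1}-\tilde\theta_k^{BZAU}y_{k-1} & \text{if } k\ge 1,\end{cases} \tag{8}$$
    where $y_{k-1}=\nabla_z\tilde f_\mu(z_k)-\nabla_z\tilde f_\mu(z_{k-1})$.
  • Step 3.
    Compute $\alpha_k$ by the Armijo line search, where $\alpha_k=\max\{\rho^0,\rho^1,\rho^2,\dots\}$ and $\rho^i$ satisfies
    $$\tilde f(z_k+\rho^id_k,\mu_k)\le\tilde f(z_k,\mu_k)+\sigma\rho^i\nabla_z\tilde f_\mu(z_k)^Td_k. \tag{9}$$
  • Step 4.

    Compute $z_{k+1}=z_k+\alpha_kd_k$. If $\|\nabla_z\bar f(z_{k+1},\mu_k)\|\ge r\mu_k$, set $\mu_{k+1}=\mu_k$; otherwise, let $\mu_{k+1}=\sigma\mu_k$.

  • Step 5.

    Set $k:=k+1$ and go to Step 1.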
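The steps of Algorithm 1 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the callables `f(z, mu)` and `grad(z, mu)` stand for the smoothing function and its gradient in $z$; the constant $\mu$ of (7) and (8) is renamed `mu_beta` to keep it apart from the smoothing parameter `mu_k`; $y_{k-1}$ is formed from gradients evaluated at the current smoothing parameter of each iterate; the lower bound on the step size and the fallback when the denominator vanishes are safeguards added here. The demo function `f_demo` is a hypothetical smoothed test problem, not one of the paper's examples.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def smoothing_three_term_cg(f, grad, z0, mu0=1.0, sigma=0.2, rho=0.4, r=0.5,
                            eta=2.0, mu_beta=5.0, eps=1e-6, max_iter=1000):
    """Sketch of Algorithm 1; returns the last iterate and smoothing parameter."""
    z, mu_k = list(z0), mu0
    g = grad(z, mu_k)
    d = [-gi for gi in g]                          # Step 0: d_0 = -g_0
    for _ in range(max_iter):
        if norm(g) <= eps:                         # Step 1: stopping test
            break
        # Step 3: Armijo backtracking over alpha = rho^0, rho^1, ...
        alpha, f_z, slope = 1.0, f(z, mu_k), dot(g, d)
        while alpha > 1e-16:                       # safeguard, not in the paper
            trial = [zi + alpha * di for zi, di in zip(z, d)]
            if f(trial, mu_k) <= f_z + sigma * alpha * slope:
                break
            alpha *= rho
        z_new = [zi + alpha * di for zi, di in zip(z, d)]
        # Step 4: shrink mu_k only when the gradient is small relative to mu_k
        if norm(grad(z_new, mu_k)) < r * mu_k:
            mu_k *= sigma
        g_new = grad(z_new, mu_k)
        # Step 2 (for the next iteration): three-term direction (7)-(8)
        y = [a - b for a, b in zip(g_new, g)]
        denom = -eta * dot(g, d) + mu_beta * abs(dot(g_new, d))
        if denom > 0.0:
            beta, theta = dot(g_new, y) / denom, dot(g_new, d) / denom
        else:
            beta = theta = 0.0                     # fall back to steepest descent
        d = [-gn + beta * di - theta * yi for gn, di, yi in zip(g_new, d, y)]
        z, g = z_new, g_new
    return z, mu_k

# Hypothetical smoothed test problem: a quadratic plus 0.1 * sum sqrt(z_i^2 + mu^2).
def f_demo(z, mu):
    quad = 0.5 * ((z[0] - 1.0) ** 2 + 4.0 * (z[1] + 0.5) ** 2)
    return quad + 0.1 * sum(math.sqrt(zi * zi + mu * mu) for zi in z)

def grad_demo(z, mu):
    s = [0.1 * zi / math.sqrt(zi * zi + mu * mu) for zi in z]
    return [z[0] - 1.0 + s[0], 4.0 * (z[1] + 0.5) + s[1]]

z_demo_final, mu_final = smoothing_three_term_cg(f_demo, grad_demo, [0.0, 0.0])
print(z_demo_final, mu_final)
```

In the setting of this paper, `f` and `grad` would be $\bar f_\mu$ and $\nabla_z\bar f_\mu$ built from $H$ and $c$; they are kept as generic callables here so that the sketch stays self-contained.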

Now, we give convergence analysis of Algorithm 1. In order to get the global convergence of Algorithm 1, we give the following assumptions.

Assumption 1

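  • (i)

    The level set $\Omega=\{z\in R^{2n}\mid\tilde f_\mu(z)\le\tilde f_\mu(z_0)\}$ is bounded.

  • (ii)
    There exists a constant $L>0$ such that $\nabla_z\tilde f_\mu$ is Lipschitz continuous on an open convex set $B\supseteq\Omega$, i.e., for any $z_1,z_2\in B$,
    $$\|\nabla_z\tilde f_\mu(z_1)-\nabla_z\tilde f_\mu(z_2)\|\le L\|z_1-z_2\|.$$
  • (iii)
    There exists a positive constant $m$ such that
    $$m\|d\|^2\le d^T\nabla_z^2\tilde f_\mu(z)\,d,\quad\forall z,d\in R^{2n},$$
    where $\nabla_z^2\tilde f_\mu(z)$ is the Hessian matrix of $\tilde f_\mu$.

By Assumption 1, we can see that there exist positive constants $\gamma$ and $b$ such that

$$\|\nabla_z\tilde f_\mu(z_k)\|\le\gamma,\quad\forall z_k\in\Omega$$

and

$$\|z_1-z_2\|\le b,\quad\forall z_1,z_2\in\Omega.$$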

Lemma 1

Suppose $\{z_k\}$ and $\{d_k\}$ are generated by Algorithm 1. Then

$$\nabla_z\tilde f_\mu(z_k)^Td_k=-\|\nabla_z\tilde f_\mu(z_k)\|^2$$

and

$$\|\nabla_z\tilde f_\mu(z_k)\|\le\|d_k\|.$$

Proof

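By Algorithm 1, we have

$$d_k=-\nabla_z\tilde f_\mu(z_k)+\tilde\beta_k^{BZAU}d_{k-1}-\tilde\theta_k^{BZAU}y_{k-1}.$$

Multiplying both sides of the above equation by $\nabla_z\tilde f_\mu(z_k)^T$, we obtain

$$\begin{aligned}\nabla_z\tilde f_\mu(z_k)^Td_k={}&-\|\nabla_z\tilde f_\mu(z_k)\|^2+\frac{\nabla_z\tilde f_\mu(z_k)^T\big(\nabla_z\tilde f_\mu(z_k)-\nabla_z\tilde f_\mu(z_{k-1})\big)\big(\nabla_z\tilde f_\mu(z_k)^Td_{k-1}\big)}{-\eta\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}+\mu\,|\nabla_z\tilde f_\mu(z_k)^Td_{k-1}|}\\&-\frac{\big(\nabla_z\tilde f_\mu(z_k)^Td_{k-1}\big)\nabla_z\tilde f_\mu(z_k)^T\big(\nabla_z\tilde f_\mu(z_k)-\nabla_z\tilde f_\mu(z_{k-1})\big)}{-\eta\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}+\mu\,|\nabla_z\tilde f_\mu(z_k)^Td_{k-1}|},\end{aligned}$$

i.e.,

$$\nabla_z\tilde f_\mu(z_k)^Td_k=-\|\nabla_z\tilde f_\mu(z_k)\|^2.$$

Now, we have

$$|\nabla_z\tilde f_\mu(z_k)^Td_k|=\|\nabla_z\tilde f_\mu(z_k)\|^2$$

and, by the Cauchy–Schwarz inequality,

$$|\nabla_z\tilde f_\mu(z_k)^Td_k|\le\|\nabla_z\tilde f_\mu(z_k)\|\,\|d_k\|.$$

By

$$\|\nabla_z\tilde f_\mu(z_k)\|^2\le\|\nabla_z\tilde f_\mu(z_k)\|\,\|d_k\|$$

we have

$$\|\nabla_z\tilde f_\mu(z_k)\|\le\|d_k\|.$$

Hence, the proof is complete. □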
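The sufficient descent identity of Lemma 1 is purely algebraic, so it can be checked numerically for arbitrary vectors. The Python sketch below does this for hypothetical sample vectors; the function name `three_term_direction` and the parameter values are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def three_term_direction(g_new, g_old, d_old, eta=2.0, mu=5.0):
    """d_k = -g_k + beta_k * d_{k-1} - theta_k * y_{k-1}, with beta and theta as in (7)-(8)."""
    y = [a - b for a, b in zip(g_new, g_old)]
    denom = -eta * dot(g_old, d_old) + mu * abs(dot(g_new, d_old))
    beta = dot(g_new, y) / denom
    theta = dot(g_new, d_old) / denom
    return [-gn + beta * do - theta * yi for gn, do, yi in zip(g_new, d_old, y)]

# Hypothetical vectors; d_old is a descent direction for g_old, so denom > 0.
g_old = [1.0, -2.0, 0.5]
d_old = [-1.0, 2.0, -0.5]
g_new = [0.3, 0.4, -1.2]

d_new = three_term_direction(g_new, g_old, d_old)

# Lemma 1: g_k^T d_k = -||g_k||^2 and ||g_k|| <= ||d_k||.
assert abs(dot(g_new, d_new) + dot(g_new, g_new)) < 1e-10
assert math.sqrt(dot(g_new, g_new)) <= math.sqrt(dot(d_new, d_new)) + 1e-12
```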

Lemma 2

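Suppose Assumption 1 holds and $\{z_k\}$ and $\{d_k\}$ are generated by Algorithm 1. Then

$$\sum_{k=0}^{\infty}\frac{\big(\nabla_z\tilde f_\mu(z_k)^Td_k\big)^2}{\|d_k\|^2}<+\infty$$

and

$$\sum_{k=0}^{\infty}\frac{\|\nabla_z\tilde f_\mu(z_k)\|^4}{\|d_k\|^2}<+\infty.$$

Proof

This lemma follows by techniques similar to those used for the lemmas in [31–33], so the details are omitted. □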

Lemma 3

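Suppose Assumption 1 holds and $\{z_k\}$ and $\{d_k\}$ are generated by Algorithm 1. Then

$$a_1\alpha_k\|d_k\|^2\le-\nabla_z\tilde f_\mu(z_k)^Td_k, \tag{10}$$

where $a_1=(1-\sigma)^{-1}(m/2)$, $m$ is the positive constant in Assumption 1(iii), and $0<\sigma<1$.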

Proof

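By using Taylor’s expansion, we have

$$\tilde f(z_{k+1})=\tilde f(z_k)+\nabla_z\tilde f_\mu(z_k)^Ts_k+\frac{1}{2}s_k^TG_ks_k, \tag{11}$$

where $s_k=z_{k+1}-z_k=\alpha_kd_k$ and

$$G_k=2\int_0^1(1-\tau)\,\nabla_z^2\tilde f_\mu(z_k+\tau s_k)\,d\tau.$$

By the Armijo line search, we know that

$$\tilde f(z_{k+1})\le\tilde f(z_k)+\sigma\nabla_z\tilde f_\mu(z_k)^Ts_k. \tag{12}$$

By (11) and (12), we have

$$\frac{1}{2}s_k^TG_ks_k\le(1-\sigma)\big(-\nabla_z\tilde f_\mu(z_k)^Ts_k\big).$$

By Assumption 1(iii), $s_k^TG_ks_k\ge m\|s_k\|^2=m\alpha_k^2\|d_k\|^2$. Substituting $s_k=\alpha_kd_k$ and dividing by $(1-\sigma)\alpha_k>0$, we get

$$\frac{1}{2}(1-\sigma)^{-1}m\,\alpha_k\|d_k\|^2\le-\nabla_z\tilde f_\mu(z_k)^Td_k.$$

Denoting $a_1=(1-\sigma)^{-1}(m/2)$, we get (10). Thus, we complete the proof. □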

By Lemmas 1, 2, and 3, we can get global convergence of the given method, i.e., the following theorem.

Theorem 1

Suppose Assumption 1 holds. Then

$$\lim_{k\to\infty}\|\nabla_z\tilde f_\mu(z_k)\|=0.$$

Proof

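From Assumption 1, (7), and (10), we have

$$|\tilde\beta_k^{BZAU}|\le\left|\frac{\nabla_z\tilde f_\mu(z_k)^T\big(\nabla_z\tilde f_\mu(z_k)-\nabla_z\tilde f_\mu(z_{k-1})\big)}{\eta\big(-\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}\big)}\right|\le\frac{\|\nabla_z\tilde f_\mu(z_k)\|\,L\,\alpha_{k-1}\|d_{k-1}\|}{\eta\big(a_1\alpha_{k-1}\|d_{k-1}\|^2\big)},$$

so that

$$|\tilde\beta_k^{BZAU}|\,\|d_{k-1}\|\le\left(\frac{L\|\nabla_z\tilde f_\mu(z_k)\|}{\eta\big(a_1\|d_{k-1}\|\big)}\right)\|d_{k-1}\|=\frac{L\|\nabla_z\tilde f_\mu(z_k)\|}{\eta a_1}. \tag{13}$$

Similarly,

$$|\tilde\theta_k^{BZAU}|\,\|y_{k-1}\|\le\left|\frac{\nabla_z\tilde f_\mu(z_k)^Td_{k-1}}{\eta\big(-\nabla_z\tilde f_\mu(z_{k-1})^Td_{k-1}\big)}\right|\|y_{k-1}\|.$$

From Assumption 1, (8), and (10), we have

$$|\tilde\theta_k^{BZAU}|\,\|y_{k-1}\|\le\left(\frac{\|\nabla_z\tilde f_\mu(z_k)\|\,L\,\|z_k-z_{k-1}\|}{\eta\big(a_1\alpha_{k-1}\|d_{k-1}\|^2\big)}\right)\|d_{k-1}\|=\frac{L\|\nabla_z\tilde f_\mu(z_k)\|}{\eta a_1}. \tag{14}$$

Combining (13), (14), and $d_k$ generated in Algorithm 1, we obtain

$$\begin{aligned}\|d_k\|&\le\|\nabla_z\tilde f_\mu(z_k)\|+|\tilde\beta_k^{BZAU}|\,\|d_{k-1}\|+|\tilde\theta_k^{BZAU}|\,\|y_{k-1}\|\\&\le\|\nabla_z\tilde f_\mu(z_k)\|+\frac{L\|\nabla_z\tilde f_\mu(z_k)\|}{\eta a_1}+\frac{L\|\nabla_z\tilde f_\mu(z_k)\|}{\eta a_1}=\left(1+\frac{2L}{\eta a_1}\right)\|\nabla_z\tilde f_\mu(z_k)\|.\end{aligned}$$

Denoting $B=\left(1+\frac{2L}{\eta a_1}\right)^2$, we have $\|d_k\|^2\le B\|\nabla_z\tilde f_\mu(z_k)\|^2$, i.e.,

$$\frac{1}{\|d_k\|^2}\ge\frac{1}{B\|\nabla_z\tilde f_\mu(z_k)\|^2}$$

and

$$\frac{B\|\nabla_z\tilde f_\mu(z_k)\|^4}{\|d_k\|^2}\ge\frac{\|\nabla_z\tilde f_\mu(z_k)\|^4}{\|\nabla_z\tilde f_\mu(z_k)\|^2}=\|\nabla_z\tilde f_\mu(z_k)\|^2.$$

By Lemma 2, we have

$$\sum_{k=0}^{\infty}\|\nabla_z\tilde f_\mu(z_k)\|^2<+\infty,$$

which implies $\lim_{k\to\infty}\|\nabla_z\tilde f_\mu(z_k)\|=0$. This completes the proof. □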

Numerical experiments

In this section, we give some numerical experiments with Algorithm 1 on examples that are also considered in [2, 9, 10, 14, 15]. We compare Algorithm 1 with the smoothing gradient method, the GPSR method, and the debiased and minimum norm methods proposed in [2, 9, 10, 14], respectively. The numerical results of all the examples show that Algorithm 1 is effective. All codes run in MATLAB 8.0. For Examples 1 and 2, the parameters used in Algorithm 1 are chosen as $\sigma=0.2$, $\mu=5$, $\eta=2$, $\gamma=0.5$, $\varepsilon=10^{-6}$, $\rho=0.4$.

Example 1

Consider

$$A=\begin{pmatrix}3&5&8&4&1&5\\2&9&6&5&7&4\\3&4&7&2&1&6\\8&9&6&5&7&4\end{pmatrix},\quad b=(2,4,1,7)^T,$$

and τ=5.

From [14], we know that this example has a solution $x=(0.3461, 0.0850, 0, 0, 0.3719, 0)^T$. The solution computed by Algorithm 1 is $x=(0.3459, 0.0850, 0.0001, 0.0009, 0.3717, 0.0001)^T$. In Figs. 1 and 2, we plot the evolution of the objective function versus the number of iterations when solving Example 1 with Algorithm 1 and the smoothing gradient method, respectively.

Figure 1. Numerical results for solving Example 1 with Algorithm 1

Figure 2. Numerical results for solving Example 1 with smoothing gradient method

Example 2

Consider

A=(111000000110001110001100000011111)m×n,b=(111)T,

and τ=2. In this example, we choose m=30, n=100. The numerical results are given in Figs. 3 and 4.

Figure 3. Numerical results for solving Example 2 with Algorithm 1

Figure 4. Numerical results for solving Example 2 with smoothing gradient method

Example 3

Consider

A=(100111011011011011100111)m×n,b=(111)T,

and τ=6. In this example, we choose m=100, n=110. The numerical results are given in Figs. 5 and 6.

Figure 5. Numerical results for solving Example 3 with Algorithm 1

Figure 6. Numerical results for solving Example 3 with smoothing gradient method

Example 4

Consider

A=(4100111410141001411)m×n,b=(111)T,

and τ=10. In this example, we choose m=200, n=210. The numerical results are given in Figs. 7 and 8.

Figure 7. Numerical results for solving Example 4 with Algorithm 1

Figure 8. Numerical results for solving Example 4 with smoothing gradient method

Example 5

In this example, we consider a typical compressed sensing problem, which is also considered in [9, 10, 14, 15]. We choose m=24, n=26, $\sigma=0.5$, $\rho=0.4$, $\gamma=0.5$, $\varepsilon=10^{-6}$, $\mu=5$, $\eta=2$. The original signal contains 520 randomly generated ±1 spikes. The $m\times n$ matrix $A$ is obtained by first filling it with independent samples of a standard Gaussian distribution and then orthogonalizing its rows. We choose the noise variance $\sigma^2=10^{-4}$ and $\tau=0.1\|A^Ty\|_\infty$, as suggested in [14]. The numerical results are shown in Fig. 9.
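The construction of this test problem can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' setup: the sizes in the usage line are hypothetical and much smaller than in a real experiment, the rows are orthonormalized by modified Gram–Schmidt, and $\tau$ is taken as 0.1 times the infinity norm of $A^Ty$, following the GPSR setup of [2].

```python
import math
import random

def orthonormalize_rows(A):
    """Modified Gram-Schmidt on the rows of A (assumes linearly independent rows)."""
    Q = []
    for row in A:
        v = list(row)
        for q in Q:
            proj = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        nrm = math.sqrt(sum(vi * vi for vi in v))
        Q.append([vi / nrm for vi in v])
    return Q

def make_problem(m, n, n_spikes, noise_var, seed=0):
    """Return (A, x, y, tau) for a synthetic compressed sensing instance."""
    rng = random.Random(seed)
    gaussian = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    A = orthonormalize_rows(gaussian)
    x = [0.0] * n
    for i in rng.sample(range(n), n_spikes):      # random +-1 spikes
        x[i] = rng.choice([-1.0, 1.0])
    noise_std = math.sqrt(noise_var)
    y = [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, noise_std)
         for row in A]                            # y = A x + noise
    Aty = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
    tau = 0.1 * max(abs(t) for t in Aty)          # tau = 0.1 * ||A^T y||_inf
    return A, x, y, tau

# Hypothetical small sizes, for illustration only.
A_cs, x_cs, y_cs, tau_cs = make_problem(m=8, n=32, n_spikes=3, noise_var=1e-4)
```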

Figure 9. Numerical results for solving Example 5 with Algorithm 1

Conclusion

In this paper, we have proposed a new smoothing modified three-term conjugate gradient method for solving l1-norm nonsmooth problems. Compared with the smoothing gradient method, the GPSR method, and the other methods proposed in [2, 9, 10, 14], the smoothing modified three-term conjugate gradient method is simple and needs little storage. Compared with the smoothing gradient method proposed in [14], it is significantly faster, especially in solving large-scale problems.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 11671220) and the Natural Science Foundation of Shandong Province (No. ZR2016AM29).

Authors’ contributions

All authors contributed equally. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Shouqiang Du, Email: sqdu@qdu.edu.cn.

Miao Chen, Email: 1551244396@qq.com.

References

  • 1. Hale E.T., Yin W., Zhang Y. Fixed-point continuation for l1-minimization: methodology and convergence. SIAM J. Optim. 2008;19(3):1107–1130. doi: 10.1137/070698920.
  • 2. Figueiredo M.A.T., Nowak R.D., Wright S.J. Gradient projection for sparse reconstruction, application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Process. 2007;1(4):586–597. doi: 10.1109/JSTSP.2007.910281.
  • 3. Kim S.J., Koh K., Lustig M., Boyd S., Gorinevsky D. An interior-point method for large-scale l1-regularized least squares. IEEE J. Sel. Top. Signal Process. 2007;4:606–617. doi: 10.1109/JSTSP.2007.910971.
  • 4. Turlach B.A., Venables W.N., Wright S.J. Simultaneous variable selection. Technometrics. 2005;47(3):349–363. doi: 10.1198/004017005000000139.
  • 5. Daubechies I., Defrise M., Mol C.D. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 2004;57(11):1413–1457. doi: 10.1002/cpa.20042.
  • 6. Beck A., Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009;2:183–202. doi: 10.1137/080716542.
  • 7. Osher S., Mao Y., Dong B., Yin W. Fast linearized Bregman iteration for compressive sensing and sparse denoising. Math. Comput. 2011;8(1):93–111.
  • 8. Yin W., Osher S., Goldfarb D., Darbon J. Bregman iterative algorithm for l1 minimization with applications to compressed sensing. SIAM J. Imaging Sci. 2008;1(1):143–168. doi: 10.1137/070703983.
  • 9. Yang J., Zhang Y. Alternating direction algorithms for l1-problems in compressive sensing. SIAM J. Sci. Comput. 2011;33:250–278. doi: 10.1137/090777761.
  • 10. Xiao Y.H., Wang Q.Y., Hu Q.J. Non-smooth equations based method for l1-norm problems with applications to compressed sensing. Nonlinear Anal. 2011;74(11):3570–3577. doi: 10.1016/j.na.2011.02.040.
  • 11. Wakin M.B., Laska J.N., Duarte M.F., Baron D., Sarvotham S., Takhar D., Kelly K.F., Baraniuk R.G. 2006 International Conference on Image Processing. 2006. An architecture for compressive imaging; pp. 1273–1276.
  • 12. Lustig M., Donoho D., Pauly J.M. Sparse MRI: the application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 2010;58(6):1182–1195. doi: 10.1002/mrm.21391.
  • 13. Farajzadeh A.P., Plubtieng S., Ungchittrakool K. On best proximity point theorems without ordering. Abstr. Appl. Anal. 2014;2014:130439.
  • 14. Chen Y.Y., Gao Y., Liu Z.M., Du S.Q. The smoothing gradient method for a kind of special optimization problem. Oper. Res. Trans. 2017;21(2):119–125.
  • 15. Chen M., Du S.Q. The smoothing FR conjugate gradient method for solving a kind of nonsmooth optimization problem with l1-norm. Math. Probl. Eng. 2018;2018:5817931.
  • 16. Wang Y.J., Zhou G.L., Caccetta L., Liu W.Q. An alternative Lagrange-dual based algorithm for sparse signal reconstruction. IEEE Trans. Signal Process. 2011;59(4):1895–1901. doi: 10.1109/TSP.2010.2103066.
  • 17. Mangasarian O.L. Absolute value programming. Comput. Optim. Appl. 2007;36(1):43–53. doi: 10.1007/s10589-006-0395-5.
  • 18. Mangasarian O.L. A generalized Newton method for absolute value equations. Optim. Lett. 2009;3(1):101–108. doi: 10.1007/s11590-008-0094-5.
  • 19. Caccetta L., Qu B., Zhou G.L. A globally and quadratically convergent method for absolute value equations. Comput. Optim. Appl. 2011;48(1):45–58. doi: 10.1007/s10589-009-9242-9.
  • 20. Mangasarian O.L. Absolute value equation solution via concave minimization. Optim. Lett. 2007;1(1):3–8. doi: 10.1007/s11590-006-0005-6.
  • 21. Chen H.B., Wang Y.J., Zhao H.G. Finite convergence of a projected proximal point algorithm for generalized variational inequalities. Oper. Res. Lett. 2012;40(4):303–305. doi: 10.1016/j.orl.2012.03.011.
  • 22. Chen Y.Y., Gao Y. Two kinds of new Levenberg–Marquardt method for nonsmooth nonlinear complementarity problem. ScienceAsia. 2014;40:89–93. doi: 10.2306/scienceasia1513-1874.2014.40.089.
  • 23. Qu B., Xiu N.H. A relaxed extragradient-like method for a class of constrained optimization problem. J. Ind. Manag. Optim. 2007;3(4):645–654. doi: 10.3934/jimo.2007.3.645.
  • 24. Fletcher R., Reeves C.M. Function minimization by conjugate gradients. Comput. J. 1964;7(2):149–154. doi: 10.1093/comjnl/7.2.149.
  • 25. Dai Y.H., Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 1999;10(1):177–182. doi: 10.1137/S1052623497318992.
  • 26. Yu G.H., Qi L.Q., Dai Y.H. On nonmonotone Chambolle gradient projection algorithms for total variation image restoration. J. Math. Imaging Vis. 2009;35(2):143–154. doi: 10.1007/s10851-009-0160-3.
  • 27. Yu Z.S., Lin J., Sun J., Xiao Y.H., Lin L.Y., Li Z.H. Spectral gradient projection method for monotone nonlinear equations with convex constraints. Appl. Numer. Math. 2009;59(10):2416–2423. doi: 10.1016/j.apnum.2009.04.004.
  • 28. Chen X.J. Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 2012;134:71–99. doi: 10.1007/s10107-012-0569-0.
  • 29. Pang D.Y., Du S.Q., Ju J.J. The smoothing Fletcher–Reeves conjugate gradient method for solving finite minimax problems. ScienceAsia. 2016;42:40–45. doi: 10.2306/scienceasia1513-1874.2016.42.040.
  • 30. Zhang L., Wu S.Y., Gao T. Improved smoothing Newton methods for P0 nonlinear complementarity problems. Appl. Math. Comput. 2009;215(1):324–332.
  • 31. Zhang L., Zhou W. On the global convergence of the Hager–Zhang conjugate gradient method with Armijo line search. Acta Math. Sci. 2008;28(5):840–845.
  • 32. Sun M., Liu J. Three modified Polak–Ribière–Polyak conjugate gradient methods with sufficient descent property. J. Inequal. Appl. 2015;2015:125. doi: 10.1186/s13660-015-0649-9.
  • 33. Baluch B., Salleh Z., Alhawarat A., Roslan U. A new modified three-term conjugate gradient method with sufficient descent property and its global convergence. J. Math. 2017;2017:2715854.

Articles from Journal of Inequalities and Applications are provided here courtesy of Springer
