2021 Dec 20;2021(1):25. doi: 10.1186/s13663-021-00709-0

On compositions of special cases of Lipschitz continuous operators

Pontus Giselsson 1, Walaa M. Moursi 2,3
PMCID: PMC8685197  PMID: 34993526

Abstract

Many iterative optimization algorithms involve compositions of special cases of Lipschitz continuous operators, namely firmly nonexpansive, averaged, and nonexpansive operators. The structure and properties of the compositions are of particular importance in the proofs of convergence of such algorithms. In this paper, we systematically study the compositions of further special cases of Lipschitz continuous operators. Applications of our results include compositions of scaled conically nonexpansive mappings, as well as the Douglas–Rachford and forward–backward operators, when applied to solve certain structured monotone inclusion and optimization problems. Several examples illustrate and tighten our conclusions.

Keywords: Compositions of operators, Conically nonexpansive operators, Douglas–Rachford algorithm, Forward-backward algorithm, Hypoconvex function, Maximally monotone operator, Proximal operator, Resolvent

Introduction

In this paper, we assume that

X is a real Hilbert space

with the inner product ⟨·,·⟩ and the induced norm ‖·‖. Let L>0 and let T:X→X. Then T is L-Lipschitz continuous if (∀(x,y)∈X×X) ‖Tx−Ty‖ ≤ L‖x−y‖, and T is nonexpansive if T is 1-Lipschitz continuous, i.e., (∀(x,y)∈X×X) ‖Tx−Ty‖ ≤ ‖x−y‖. In this paper, we study compositions of what we call (see Definition 3.1) identity-nonexpansive decompositions (I-N decompositions for short) of Lipschitz continuous operators. Let (α,β)∈ℝ² and let Id:X→X be the identity operator on X. A Lipschitz continuous operator R admits an (α,β)-I-N decomposition if R = αId + βN for some nonexpansive operator N:X→X. For instance, averaged, conically nonexpansive, and cocoercive operators are all Lipschitz continuous operators that admit special I-N decompositions.

We consider compositions of the form

R = Rm∘⋯∘R1,  (1)

where m∈{2,3,…}, I = {1,…,m}, and (Ri)i∈I is a family of Lipschitz continuous operators such that, for each i∈I, Ri admits an (αi,βi)-I-N decomposition. That is, Ri = αiId + βiNi for all i∈I, where αi and βi are real numbers and Ni:X→X is nonexpansive for all i∈I. A straightforward (and naive) conclusion is that the composition is Lipschitz continuous with constant ∏i∈I(|αi|+|βi|). However, such a conclusion can be further refined when, for instance, each Ri is an averaged operator. Indeed, in this case it is known that the composition is an averaged (and not just Lipschitz continuous) operator (see, e.g., [2, Proposition 4.46], [6, Lemma 2.2], and [21, Theorem 3]). In this paper, we provide a systematic study of the structure of R under additional assumptions on the decomposition parameters.
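The refinement for averaged operators can be probed numerically. The sketch below is our own illustration (the rotations, dimensions, and parameters are hypothetical choices, not from the paper): it builds two averaged operators Ti = (1−αi)Id + αiNi from nonexpansive rotations and checks the standard averagedness inequality of the composition with the constant (α1+α2−2α1α2)/(1−α1α2) from [2, Proposition 4.46].

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(t):
    # 2x2 rotation matrix: orthogonal, hence nonexpansive
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

a1, a2 = 0.7, 0.5                      # averagedness parameters in ]0,1[
I = np.eye(2)
T1 = (1 - a1) * I + a1 * rot(0.9)      # T1 = (1-a1)Id + a1*N1
T2 = (1 - a2) * I + a2 * rot(-1.3)     # T2 = (1-a2)Id + a2*N2
T = T2 @ T1                            # composition T2 T1

a = (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)   # claimed averagedness constant

# a-averagedness test: ||Tx-Ty||^2 + (1-a)/a * ||(Id-T)x-(Id-T)y||^2 <= ||x-y||^2
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    d = x - y
    t = T @ d
    lhs = t @ t + (1 - a) / a * ((d - t) @ (d - t))
    ok = ok and lhs <= d @ d + 1e-10
print(ok)  # -> True
```

The naive Lipschitz bound only certifies nonexpansiveness of the composition; the averagedness constant above is strictly stronger information.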

Our main result is stated in Theorem 3.4. We show that, for m=2, under a mild assumption on (α1,α2,β1,β2), composition (1) is a scalar multiple of a conically nonexpansive operator. As a consequence of Theorem 3.4, we show in Theorem 4.2 that, under additional assumptions on the decomposition parameters, compositions of scaled conically nonexpansive mappings are scaled conically nonexpansive mappings; see also [1] for a relevant result. Special cases of Theorem 4.2 include, e.g., compositions of averaged operators [2, Proposition 4.46] and compositions of averaged and negatively averaged operators [12].

Of particular interest are compositions R that are averaged, conically nonexpansive, or contractive. Let x0∈X. For an averaged (respectively contractive) operator R, the sequence (R^k x0)k∈ℕ converges weakly (respectively strongly) to a fixed point of R (if one exists) [2, Theorem 5.14]. For conically nonexpansive operators, a simple averaging trick gives an averaged operator with the same fixed point set as the conically nonexpansive operator. Iterating the new averaged operator yields a sequence that converges weakly to a fixed point of the conically nonexpansive operator. These properties have been instrumental in proving convergence for the Douglas–Rachford algorithm and the forward–backward algorithm. In this paper, we apply our composition result Theorem 4.2 to prove convergence of these splitting methods in new settings.

The Douglas–Rachford and forward–backward methods traditionally solve monotone inclusion problems of the form

Find x∈X such that 0 ∈ Ax + Bx,  (2)

where A:X⇉X and B:X⇉X are maximally monotone and, in the case of the forward–backward method, A is additionally assumed to be cocoercive. The Douglas–Rachford method iterates the Douglas–Rachford map T = (1/2)(Id + RγB RγA), where γ>0 is a positive step-size. The Douglas–Rachford map is an averaged map of the composition of reflected resolvents. The forward–backward method iterates the forward–backward map T = JγB(Id−γA), where γ>0 is a positive step-size. The forward–backward map is a composition of a resolvent and a forward step.
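For intuition, here is a minimal one-dimensional sketch of both iterations (a toy instance of our own, not from the paper): A(x) = x−1 and B(x) = x, whose sum vanishes at x = 1/2, and whose resolvents are available in closed form.

```python
# Toy monotone inclusion 0 ∈ Ax + Bx with A(x) = x - 1 and B(x) = x;
# the unique zero of A + B is x = 1/2.
gamma = 0.5

A = lambda x: x - 1.0
J_gA = lambda x: (x + gamma) / (1 + gamma)   # resolvent of gamma*A
J_gB = lambda x: x / (1 + gamma)             # resolvent of gamma*B
R_gA = lambda x: 2 * J_gA(x) - x             # reflected resolvents
R_gB = lambda x: 2 * J_gB(x) - x

# Douglas-Rachford: z <- (1/2)(z + R_gB(R_gA(z))); the shadow J_gA(z) solves (2)
z = 5.0
for _ in range(200):
    z = 0.5 * (z + R_gB(R_gA(z)))
print(round(J_gA(z), 6))   # -> 0.5

# forward-backward: x <- J_gB(x - gamma*A(x))
x = 5.0
for _ in range(200):
    x = J_gB(x - gamma * A(x))
print(round(x, 6))         # -> 0.5
```

Both maps are contractions here, so the iterates converge linearly; in general one only gets averagedness and weak convergence.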

In this paper, we show that for Douglas–Rachford splitting we need not impose monotonicity on the individual operators, but only on the sum, provided that the sum is strongly monotone. The reflected resolvents RγA and RγB are negatively conically nonexpansive, their composition is conically nonexpansive, and a sufficient averaging gives an averaged map whose iterates converge to a fixed point. Relevant work appears in [9, 16], and [17].

More strikingly, for the forward–backward method, we show that it is sufficient that the sum is monotone (not strongly monotone as for DR). More specifically, we show that the identity can be shifted between the two operators while still guaranteeing averagedness of the forward–backward map T = JγB(Id−γA). Indeed, the resolvent JγB is cocoercive and the forward step (Id−γA) is scaled averaged. This implies that the composition is averaged (given restrictions on the cocoercivity and averagedness parameters). Moreover, when the sum is strongly monotone, again with no monotonicity assumptions on the individual operators, we show that the forward–backward map is contractive. We also prove tightness of our contraction factor.

We also provide, in Theorem 4.7, a generalization of Theorem 4.2 to the setting in (1) of compositions of more than two operators. We assume that all Ri are scaled conically nonexpansive operators and provide conditions on the parameters that give a specific scaled conically nonexpansive representation of R. Our condition is symmetric in the individual operators and allows one of them to be scaled conic, while the rest must be scaled averaged. This is consistent with the m=2 case in Theorem 4.2.

Finally, in Sect. 8, we provide graphical 2D representations of different operator classes that admit I-N decompositions, such as Lipschitz continuous operators, averaged operators, and cocoercive operators. We also provide 2D representations of compositions of two such operator classes. Illustrations of the firmly nonexpansive (1/2-averaged) and nonexpansive operator classes have previously appeared in [10, 11], and illustrations of more operator classes that admit particular I-N decompositions and their compositions have appeared in [14, 24] and in early preprints of [15].

Organization and notation

The remainder of this paper is organized as follows: Sect. 2 presents useful facts and auxiliary results that are used throughout the paper. In Sect. 3, we present the main abstract results of the paper. Section 4 presents the main composition results of Lipschitz continuous operators that admit I-N decompositions, under mild assumptions on the decomposition parameters, as well as illustrative and limiting examples. In Sect. 5 and Sect. 6, we present applications of our composition results to the Douglas–Rachford and forward–backward algorithms, respectively. In Sect. 7 we present applications of our results to optimization problems. Finally, in Sect. 8, we provide graphical representations of many different I-N decompositions and their compositions.

The notation we use is standard and follows, e.g., [2] or [23].

Facts and auxiliary results

Let ρ∈ℝ and let A:X⇉X. Recall that A is ρ-monotone if (∀(x,u)∈gra A) (∀(y,v)∈gra A)

⟨x−y, u−v⟩ ≥ ρ‖x−y‖²  (3)

and is maximally ρ-monotone if any proper extension of gra A violates (3). In passing we point out that A is (maximally) monotone (respectively ρ-hypomonotone, ρ-strongly monotone) if ρ=0 (respectively ρ<0, ρ>0); see, e.g., [2, Chap. 20], [4, Definition 6.9.1], [7, Definition 2.2], and [23, Example 12.28].

Fact 2.1

Let A:X⇉X, let B:X⇉X, let λ∈ℝ∖{0}, and suppose that zer(A+B) = (A+B)⁻¹(0) ≠ ∅. Suppose that JA and JB are single-valued and that dom JA = dom JB = X. Set

T = (1−λ)Id + λRBRA.  (4)

Then T is single-valued, dom T = X, and

zer(A+B) = JA(Fix RBRA) = JA(Fix T).  (5)

Proof

See [9, Lemma 4.1]. □

Proposition 2.2

Let A:X→X, let B:X⇉X, and suppose that zer(A+B) = (A+B)⁻¹(0) ≠ ∅. Suppose that JB is single-valued and that dom JB = X. Set

T = JB(Id−A).  (6)

Then T is single-valued, dom T = X, and

zer(A+B) = Fix T.  (7)

Proof

The proof is similar to the proof of [2, Proposition 26.1(iv)]. Indeed, let x∈X. Then x∈zer(A+B) ⟺ −Ax∈Bx ⟺ (Id−A)x ∈ (Id+B)x ⟺ x = JB(Id−A)x = Tx. □

Lemma 2.3

Let λ∈ℝ, let R1:X→X, let R2:X→X, and set

R(λ) = (1−λ)Id + λR2R1.  (8)

Let (x,y)∈X×X. Then

⟨R(λ)x−R(λ)y, (Id−R(λ))x−(Id−R(λ))y⟩ = (1−2λ)⟨x−y, (Id−R(λ))x−(Id−R(λ))y⟩ + λ²⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + λ²⟨(Id+R2)R1x−(Id+R2)R1y, (Id−R2)R1x−(Id−R2)R1y⟩.  (9)

Proof

See Appendix A. □

Proposition 2.4

Let α∈ℝ, let β∈ℝ, let N:X→X, and set T = αId+βN. Let (x,y)∈X×X. Then the following hold:

β²(‖x−y‖² − ‖Nx−Ny‖²) = (β²−α²)‖x−y‖² − ‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩  (10a)
= (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² + 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩  (10b)
= (β²−α(α−1))‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖²).  (10c)

Proof

Indeed, we have

β²(‖x−y‖² − ‖Nx−Ny‖²) = β²‖x−y‖² − ‖(Tx−αx)−(Ty−αy)‖²  (11a)
= β²‖x−y‖² − (‖Tx−Ty‖² + α²‖x−y‖² − 2α⟨Tx−Ty, x−y⟩)  (11b)
= (β²−α²)‖x−y‖² − (‖Tx−Ty‖² − 2α⟨Tx−Ty, x−y⟩)  (11c)
= (β²−α²+α)‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖Tx−Ty‖² − 2α⟨Tx−Ty, x−y⟩ + α‖x−y‖²)  (11d)
= (β²−α(α−1))‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖²).  (11e)

This proves (10a) and (10c) in view of (11c) and (11e). Finally, note that (β²−α²)‖x−y‖² − ‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩ = (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² − 2α‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩ = (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² + 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩. This proves (10b). □
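The identities (10a)–(10c) are purely algebraic, so they can be sanity-checked on arbitrary data. The sketch below is a hypothetical numerical check of our own; N is just a random linear map (the identities do not require N to be nonexpansive).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
alpha, beta = -0.3, 1.7
N = rng.standard_normal((n, n))          # arbitrary linear map
T = alpha * np.eye(n) + beta * N         # T = alpha*Id + beta*N

x, y = rng.standard_normal(n), rng.standard_normal(n)
d, t = x - y, T @ (x - y)
Nd = N @ (x - y)

lhs = beta**2 * (d @ d - Nd @ Nd)
r10a = (beta**2 - alpha**2) * (d @ d) - t @ t + 2 * alpha * (d @ t)
r10b = (beta**2 - alpha**2) * (d @ d) - (1 - 2 * alpha) * (t @ t) \
       + 2 * alpha * (t @ (d - t))
r10c = (beta**2 - alpha * (alpha - 1)) * (d @ d) \
       - ((1 - alpha) * (t @ t) + alpha * ((d - t) @ (d - t)))
print(np.allclose([lhs, lhs, lhs], [r10a, r10b, r10c]))  # -> True
```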

Proposition 2.5

Let α∈ℝ, let β∈ℝ, let N:X→X, and set T = αId+βN. Let (x,y)∈X×X. Then the following are equivalent:

  • (i)

    N is nonexpansive.

  • (ii)

    ‖Tx−Ty‖² − 2α⟨x−y, Tx−Ty⟩ ≤ (β²−α²)‖x−y‖².

  • (iii)

    (1−2α)‖Tx−Ty‖² − 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (β²−α²)‖x−y‖².

  • (iv)

    (2α−1)‖(Id−T)x−(Id−T)y‖² − 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (β²−(1−α)²)‖x−y‖².

  • (v)

    (1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖² ≤ (β²−α(α−1))‖x−y‖².

Proof

(i)⇔(ii)⇔(iii)⇔(v): This is a direct consequence of Proposition 2.4. (i)⇔(iv): Applying (10b) with (T,α,β) replaced by (Id−T,1−α,β) yields β²(‖x−y‖² − ‖Nx−Ny‖²) = (β²−(1−α)²)‖x−y‖² − (2α−1)‖(Id−T)x−(Id−T)y‖² + 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩. The proof is complete. □

Proposition 2.6

Let α∈ℝ, let N:X→X, and set T = (1−α)Id+αN. Let (x,y)∈X×X. Then the following are equivalent:

  • (i)

    N is nonexpansive.

  • (ii)

    ‖Tx−Ty‖² − 2(1−α)⟨x−y, Tx−Ty⟩ ≤ (2α−1)‖x−y‖².

  • (iii)

    (2α−1)‖Tx−Ty‖² − 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (2α−1)‖x−y‖².

  • (iv)

    (1−2α)‖(Id−T)x−(Id−T)y‖² ≤ 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩.

  • (v)

    (1−α)‖(Id−T)x−(Id−T)y‖² ≤ α‖x−y‖² − α‖Tx−Ty‖².

Proof

Apply Proposition 2.5 with (α,β) replaced by (1−α,α). □

Lemma 2.7

Let λ<1. Then (∀(x,y)∈X×X)

‖x‖² − λ‖y‖² ≥ −(λ/(1−λ))‖x+y‖².  (12)

Proof

Let δ>0. By Young's inequality, ‖x+y‖² = ‖x‖² + 2⟨x,y⟩ + ‖y‖² ≥ (1−δ)‖x‖² + (1−δ⁻¹)‖y‖². Now apply this inequality with (x,y,δ) replaced by (x+y, −y, 1/(1−λ)) to obtain ‖x‖² ≥ −(λ/(1−λ))‖x+y‖² + λ‖y‖², which is (12). □
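A quick numerical check of (12) (our own sketch), sampling random vectors for several values of λ<1, including negative ones:

```python
import numpy as np

rng = np.random.default_rng(2)
ok = True
for lam in (-2.0, -0.5, 0.0, 0.3, 0.9):
    for _ in range(500):
        x, y = rng.standard_normal(4), rng.standard_normal(4)
        lhs = x @ x - lam * (y @ y)
        rhs = -lam / (1 - lam) * ((x + y) @ (x + y))
        ok = ok and lhs >= rhs - 1e-10   # inequality (12)
print(ok)  # -> True
```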

Proposition 2.8

Let α∈]0,1[, let β>0, and let T:X→X. Then T is α-averaged if and only if T = (1−β)Id + βM, where M is (α/β)-conically nonexpansive.

Proof

Indeed, T is α-averaged if and only if there exists a nonexpansive mapping N:X→X such that T = (1−α)Id+αN. Equivalently,

T = (1−α)Id + αN = (1−β)Id + β((1−α/β)Id + (α/β)N),

and the conclusion follows by setting M = (1−α/β)Id + (α/β)N. □

The following three lemmas can be verified directly; hence we omit the proofs.

Lemma 2.9

Let α>0, and let T:X→X. Then T is α-conically nonexpansive ⟺ Id−T is 1/(2α)-cocoercive ⟹ Id−T is maximally monotone.

Lemma 2.10

Let β>0, let μ∈ℝ, and let A:X→X. Suppose that A is maximally μ-monotone and (1/β)-cocoercive. Then μ ≤ β.

Lemma 2.11

Let β>0, let T:X→X, and let β′ ≥ β. Suppose that T is (1/β)-cocoercive. Then T is (1/β′)-cocoercive.

Lemma 2.12

Let β>0, and let A:X→X. Suppose that A is β-Lipschitz continuous. Then the following hold:

  • (i)

    A is maximally (−β)-monotone.

  • (ii)

    A+βId is 1/(2β)-cocoercive.

Proof

See Appendix B. □

Lemma 2.13

Let β>δ>0, let T1:X→X, and let T2:X→X. Suppose that T1 (respectively T2) is (1/β)-cocoercive (respectively (1/δ)-cocoercive). Then T1−T2 is β-Lipschitz continuous.

Proof

See Appendix C. □

As a corollary, we obtain the following result which was stated in [27, page 4].

Corollary 2.14

Let f1:X→ℝ and f2:X→ℝ be Fréchet differentiable convex functions, and let β>δ>0. Suppose that ∇f1 (respectively ∇f2) is β-Lipschitz continuous (respectively δ-Lipschitz continuous). Then the following hold:

  • (i)

    ∇f1−∇f2 is β-Lipschitz continuous.

  • (ii)

    Suppose that f1−f2 is convex. Then ∇f1−∇f2 is (1/β)-cocoercive.

Proof

See Appendix D. □

Lemma 2.15

Let α∈]0,1[, let δ∈]0,1], and let T:X→X. Suppose that T is α-averaged. Then the following hold:

  • (i)

    δT is (1−δ(1−α))-averaged.

  • (ii)

    Suppose that δ∈]0,1[. Then δT is a Banach contraction with constant δ.

Proof

See Appendix E. □

Let A be maximally ρ-monotone, where ρ>−1. Then (see [9, Proposition 3.4] and [3, Corollary 2.11 and Proposition 2.12]) we have

JA is single-valued and dom JA = X.  (13)

The following result involves resolvents and reflected resolvents of ρ-monotone operators.

Proposition 2.16

Let A be ρ-monotone, where ρ>−1. Then the following hold:

  • (i)

    JA is (1+ρ)-cocoercive, in which case JA is Lipschitz continuous with constant 1/(1+ρ).

  • (ii)

    RA is 1/(1+ρ)-conically nonexpansive.

  • (iii)

    Suppose that ρ≤0. Then RA is Lipschitz continuous with constant (1−ρ)/(1+ρ).

Proof

(i): See [9, Lemma 3.3(ii)]. Alternatively, it follows from [3, Corollary 3.8(ii)] that Id−JA is 1/(2(1+ρ))-averaged. Now apply Lemma 2.9 with T replaced by Id−JA. (ii): It follows from (i) that there exists a nonexpansive operator N:X→X such that JA = (1/(2(1+ρ)))(Id+N). Now, RA = Id−2JA = Id − (1/(1+ρ))(Id+N) = (1 − 1/(1+ρ))Id − (1/(1+ρ))N. (iii): Indeed, let (x,y)∈X×X and let N be as defined above. We have

‖RAx−RAy‖ = ‖(ρ/(1+ρ))(x−y) − (1/(1+ρ))(Nx−Ny)‖ ≤ (|ρ|/(1+ρ))‖x−y‖ + (1/(1+ρ))‖Nx−Ny‖  (14a)
≤ ((1−ρ)/(1+ρ))‖x−y‖.  (14b)

The proof is complete. □
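Proposition 2.16(i) can be illustrated with the linear ρ-monotone operator A = ρId + cS, where S is the rotation by π/2. This is a hypothetical example of ours: since ⟨Sx, x⟩ = 0, we get ⟨x, Ax⟩ = ρ‖x‖², so A is ρ-monotone even for negative ρ.

```python
import numpy as np

rho, c = -0.5, 2.0                       # rho-monotone with rho in ]-1, 0[
S = np.array([[0.0, -1.0], [1.0, 0.0]])  # skew: <Sx, x> = 0
A = rho * np.eye(2) + c * S
JA = np.linalg.inv(np.eye(2) + A)        # resolvent (Id + A)^{-1}, linear here

# Lipschitz constant of JA is 1/sigma_min(Id+A), bounded by 1/(1+rho)
lip = np.linalg.norm(JA, 2)
print(lip <= 1 / (1 + rho) + 1e-12)      # -> True

# (1+rho)-cocoercivity: <x - y, JAx - JAy> >= (1+rho)||JAx - JAy||^2
rng = np.random.default_rng(3)
ok = True
for _ in range(500):
    d = rng.standard_normal(2)
    u = JA @ d
    ok = ok and d @ u >= (1 + rho) * (u @ u) - 1e-12
print(ok)  # -> True
```

For this skew-plus-scaling example the cocoercivity inequality actually holds with equality, since ⟨(Id+A)u, u⟩ = (1+ρ)‖u‖².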

Compositions

Definition 3.1

((α,β)-I-N decomposition)

Let R:X→X be Lipschitz continuous, and let (α,β)∈ℝ×ℝ₊. We say that R admits an (α,β)-identity-nonexpansive (I-N) decomposition if there exists a nonexpansive operator N:X→X such that R = αId + βN.

Throughout the rest of this paper, we assume that

R1:X→X and R2:X→X are Lipschitz continuous operators.

Proposition 3.2

Let α1∈]−∞,1[, let α2∈]−∞,1[, let β1∈ℝ₊, let β2∈ℝ₊, and suppose that α2(α2−1) ≤ β2². Set

δ1 = (α1/(1−α1))(1 − ((1−α2)²−β2²)/(1−α2)),  (15a)
δ2 = α2/(1−α2),  (15b)
δ3 = 1 − ((((1−α1)²−β1²)/(1−α1))(1 − ((1−α2)²−β2²)/(1−α2)) + ((1−α2)²−β2²)/(1−α2)).  (15c)

Suppose that R1 admits an (α1,β1)-I-N decomposition and that R2 admits an (α2,β2)-I-N decomposition. Then (∀(x,y)∈X×X) we have

‖R2R1x−R2R1y‖² + δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖² ≤ δ3‖x−y‖².  (16)

Proof

Set Ti = (1/2)(Id+Ri) = ((1+αi)/2)Id + (βi/2)Ni, and observe that, by Proposition 2.5 applied with (T,α,β) replaced by (Ti,(1+αi)/2,βi/2), i∈{1,2}, we have (∀(x,y)∈X×X)

⟨Tix−Tiy, (Id−Ti)x−(Id−Ti)y⟩ ≥ (αi/(1−αi))‖(Id−Ti)x−(Id−Ti)y‖² + (((1−αi)²−βi²)/(4(1−αi)))‖x−y‖².  (17)

Equivalently,

⟨(Id+Ri)x−(Id+Ri)y, (Id−Ri)x−(Id−Ri)y⟩ ≥ (αi/(1−αi))‖(Id−Ri)x−(Id−Ri)y‖² + (((1−αi)²−βi²)/(1−αi))‖x−y‖².  (18)

Observe also that, because α2<1, we have

α2(α2−1) ≤ β2² ⟺ 1 − ((1−α2)²−β2²)/(1−α2) ≥ 0.  (19)

It follows from (18), applied with i=2 and (x,y) replaced by (R1x,R1y) in (20c) and with i=1 in (20f), in view of (19), that

‖x−y‖² − ‖R2R1x−R2R1y‖² = ‖x−y‖² − ‖R1x−R1y‖² + ‖R1x−R1y‖² − ‖R2R1x−R2R1y‖²  (20a)
= ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + ⟨(Id+R2)R1x−(Id+R2)R1y, (Id−R2)R1x−(Id−R2)R1y⟩  (20b)
≥ ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖R1x−R1y‖²  (20c)
= ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))(‖x−y‖² − ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩)  (20d)
= (1 − ((1−α2)²−β2²)/(1−α2))⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖x−y‖²  (20e)
≥ (1 − ((1−α2)²−β2²)/(1−α2))((α1/(1−α1))‖(Id−R1)x−(Id−R1)y‖² + (((1−α1)²−β1²)/(1−α1))‖x−y‖²) + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖x−y‖²  (20f)
= δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖² + (1−δ3)‖x−y‖².  (20g)

Rearranging yields the desired result. □

Rearranging yields the desired result. □

Theorem 3.3

Let α1∈]−∞,1[, let α2∈]−∞,1[, let β1∈ℝ₊, let β2∈ℝ₊, and suppose that α2(α2−1) ≤ β2². Let δ1, δ2, and δ3 be defined as in (15a)–(15c). Set

δ4 = δ1δ2/(δ1+δ2),  (21)

and suppose that δ1+δ2>0, that δ3δ4+δ3−δ4 ≥ 0, and that δ4>−1. Suppose that R1 admits an (α1,β1)-I-N decomposition, and that R2 admits an (α2,β2)-I-N decomposition. Then R2R1 admits an (α,β)-I-N decomposition, where

α = δ4/(1+δ4),  β = √(δ3δ4+δ3−δ4)/(1+δ4).  (22)

Proof

Let δ_ := min(δ1,δ2), let δ̄ := max(δ1,δ2), and let λ := δ_/δ̄ (i.e., λ = δ1/δ2 if δ1 ≤ δ2, and λ = δ2/δ1 if δ1 ≥ δ2). Then Proposition 3.2 and Lemma 2.7 imply that

δ3‖x−y‖² − ‖R2R1x−R2R1y‖² ≥ δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖²  (23a)
= δ̄((δ1/δ̄)‖(Id−R1)x−(Id−R1)y‖² + (δ2/δ̄)‖(Id−R2)R1x−(Id−R2)R1y‖²)  (23b)
≥ δ̄(λ/(1+λ))‖(Id−R1)x−(Id−R1)y + (Id−R2)R1x−(Id−R2)R1y‖²  (23c)
= (λδ̄/(1+λ))‖(Id−R2R1)x−(Id−R2R1)y‖²  (23d)
= (δ_δ̄/(δ̄+δ_))‖(Id−R2R1)x−(Id−R2R1)y‖²  (23e)
= δ4‖(Id−R2R1)x−(Id−R2R1)y‖²,  (23f)

where (23c) follows from Lemma 2.7 applied with λ replaced by −λ. Comparing (23a)–(23f) to Proposition 2.5(v) applied with T replaced by R2R1, we learn that there exist a nonexpansive operator N:X→X and (α,β)∈ℝ² such that R2R1 = αId+βN, where δ3 = (β²+α(1−α))/(1−α) and δ4 = α/(1−α). Equivalently, α = δ4/(1+δ4), hence β = √(δ3δ4+δ3−δ4)/(1+δ4), as claimed. □

Theorem 3.4

Let α1∈ℝ, let α2∈ℝ, let β1>0, let β2>0, suppose that α1+β1>0, that α2+β2>0, and that either β1β2/((α1+β1)(α2+β2)) < 1 or max{β1/(α1+β1), β2/(α2+β2)} = 1. Set

κ = (α1+β1)(α2+β2),  (24a)
θ = (β1α2+β2α1)/(α1α2+α1β2+α2β1) if β1β2/((α1+β1)(α2+β2)) < 1; θ = 1 if max{β1/(α1+β1), β2/(α2+β2)} = 1.  (24b)

Suppose that R1 admits an (α1,β1)-I-N decomposition, and that R2 admits an (α2,β2)-I-N decomposition. Then θ∈]0,+∞[ and R2R1 admits a (κ(1−θ),κθ)-I-N decomposition, i.e., R2R1 is κ-scaled θ-conically nonexpansive. That is, there exists a nonexpansive operator N:X→X such that

R2R1 = κ((1−θ)Id + θN).  (25)

Proof

Let θi = βi/(αi+βi) > 0, and observe that

Ri = (αi+βi)((1−θi)Id + θiNi),  i∈{1,2}.  (26)

Next, let Ñ2 = (1/(α1+β1))N2∘((α1+β1)Id), i.e., Ñ2: x ↦ N2((α1+β1)x)/(α1+β1), and note that Ñ2 is nonexpansive. Now, set

R̃1 = (1−θ1)Id + θ1N1,  R̃2 = (1−θ2)Id + θ2Ñ2.  (27)

Then (26) and (27) yield

R2R1 = ((α2+β2)((1−θ2)Id+θ2N2)) ∘ ((α1+β1)((1−θ1)Id+θ1N1))  (28a)
= (α1+β1)(α2+β2)((1/(α1+β1))((1−θ2)Id+θ2N2)∘((α1+β1)Id)) ∘ R̃1  (28b)
= (α1+β1)(α2+β2)R̃2R̃1.  (28c)

We proceed by cases. Case I: α1α2 = 0. Observe that 0∈{α1,α2} implies max{β1/(α1+β1), β2/(α2+β2)} = max{θ1,θ2} = 1; by assumption we then also have θ1 ≤ 1 and θ2 ≤ 1. The conclusion follows by observing that R̃i is nonexpansive, i∈{1,2}.

Case II: α1α2 ≠ 0. By assumption we must have θ1θ2 = (β1/(α1+β1))(β2/(α2+β2)) < 1. We claim that the R̃i, i∈{1,2}, satisfy the conditions of Theorem 3.3 with (αi,βi) replaced by (1−θi,θi). Indeed, observe that (1−θ2)((1−θ2)−1) ≤ θ2² ⟺ −θ2(1−θ2) ≤ θ2² ⟺ −θ2 ≤ 0, which is always true. Moreover, replacing (αi,βi) by (1−θi,θi) yields δ1 = (1−θ1)/θ1, δ2 = (1−θ2)/θ2, δ3 = 1, and, consequently, δ4 = (1−θ1)(1−θ2)/(θ2(1−θ1)+θ1(1−θ2)); in particular δ3δ4+δ3−δ4 = 1 ≥ 0. We claim that

θ1+θ2−2θ1θ2 > 0.  (29)

Indeed, recall that θ1θ2<1; hence θ1+θ2−2θ1θ2 = θ1θ2(1/θ1 + 1/θ2 − 2) > θ1θ2(1/θ1 + θ1 − 2) = θ1θ2(√θ1 − 1/√θ1)² ≥ 0. This implies that δ1+δ2 = (θ1+θ2−2θ1θ2)/(θ1θ2) > 0. Moreover,

δ4 = (1−θ1)(1−θ2)/(θ2(1−θ1)+θ1(1−θ2)) = (1−θ1−θ2+θ1θ2)/(θ1+θ2−2θ1θ2) = −1 + (1−θ1θ2)/(θ1+θ2−2θ1θ2) > −1.  (30)

Therefore, by Theorem 3.3, we conclude that there exists a nonexpansive operator N:X→X such that R̃2R̃1 = αId+βN, with α = δ4/(1+δ4) = (1−θ1−θ2+θ1θ2)/(1−θ1θ2) = α1α2/(α1α2+α1β2+α2β1), and β = 1/(1+δ4) = (θ1+θ2−2θ1θ2)/(1−θ1θ2) = (β1α2+β2α1)/(α1α2+α1β2+α2β1) = θ. Now combine with (28a)–(28c). □

Applications to special cases

We start this section by recording the following simple lemma which can be easily verified, hence we omit the proof.

Lemma 4.1

Set (R̃1,R̃2) = (−R1, R2∘(−Id)). Then the following hold:

  • (i)

    R2R1 = R̃2R̃1.

  • (ii)

    Let αi>0, let δi∈ℝ∖{0}, and suppose that (1/δi)Ri is αi-conically nonexpansive. Then (−1/δi)R̃i is αi-conically nonexpansive.

Theorem 4.2

Let i∈{1,2}, let αi>0, let δi∈ℝ∖{0}, and let Ri:X→X be such that (1/δi)Ri is αi-conically nonexpansive. Suppose that either α1α2<1 or max{α1,α2}=1. Set

α = (α1+α2−2α1α2)/(1−α1α2) if α1α2<1; α = 1 if max{α1,α2}=1.  (31)

Then there exists a nonexpansive operator N:X→X such that

R2R1 = δ1δ2((1−α)Id + αN).  (32)

Furthermore, α<1 ⟺ [α1<1 and α2<1].

Proof

Set (R̃1,R̃2) = (−R1, R2∘(−Id)) and set R = R2R1. The proof proceeds by cases.

Case I: δi>0, i∈{1,2}. By assumption, there exist nonexpansive operators Ni:X→X such that Ri = δi(1−αi)Id + δiαiNi. Moreover, one can easily check that the Ri satisfy the assumptions of Theorem 3.4 with (αi,βi) replaced by (δi(1−αi),δiαi). Applying Theorem 3.4, with (αi,βi) replaced by (δi(1−αi),δiαi), we learn that there exists a nonexpansive operator N:X→X such that R = (δ1(1−α1)+δ1α1)(δ2(1−α2)+δ2α2)((1−α)Id+αN) = δ1δ2((1−α)Id+αN), where

α = (δ1α1δ2(1−α2) + δ2α2δ1(1−α1))/(δ1α1δ2(1−α2) + δ2α2δ1(1−α1) + δ1(1−α1)δ2(1−α2)) = (α1+α2−2α1α2)/(1−α1α2).  (33)

Finally, observe that α<1 ⟺ [α1α2<1 and (α1+α2−2α1α2)/(1−α1α2)<1] ⟺ [α1α2<1 and 1−α1α2 > α1+α2−2α1α2] ⟺ [α1α2<1 and (1−α1)(1−α2)>0] ⟺ [α1<1 and α2<1].

Case II: δi<0, i∈{1,2}. Observe that (1/δi)Ri = −(1/|δi|)Ri is αi-conically nonexpansive. Therefore, Lemma 4.1(ii) implies that (1/|δi|)R̃i = (−1/δi)R̃i is αi-conically nonexpansive, i∈{1,2}. Now combine Lemma 4.1(i) and Case I, applied with (Ri,δi) replaced by (R̃i,|δi|), and note that |δ1||δ2| = δ1δ2.

Case III: δ1<0 and δ2>0. Observe that (1/|δ1|)R̃1 = (1/δ1)R1 is α1-conically nonexpansive and that, by Lemma 4.1(ii), (−1/δ2)R̃2 is α2-conically nonexpansive. Using Lemma 4.1(i), we have R = R2R1 = R̃2R̃1. Now combine with Case IV below, applied with (R1,R2,δ1,δ2) replaced by (R̃1,R̃2,|δ1|,−δ2), to learn that there exists a nonexpansive mapping N:X→X such that R = |δ1|(−δ2)((1−α)Id+αN) = δ1δ2((1−α)Id+αN), and the conclusion follows.

Case IV: δ1>0 and δ2<0. Indeed, R = R2R1 = −((−R2)R1), and (1/|δ2|)(−R2) = (1/δ2)R2 is α2-conically nonexpansive. Now combine with Case I, applied with (R2,δ2) replaced by (−R2,|δ2|), and note that −δ1|δ2| = δ1δ2. □

Corollary 4.3

Let α∈]0,1[, let β>0, let δ∈ℝ∖{0}, let {i,j}={1,2}, and suppose that (1/δ)Ri is α-averaged and that Rj is (1/β)-cocoercive. Set ᾱ = 1/(2−α). Then ᾱ∈]0,1[, and there exists a nonexpansive operator N:X→X such that

R2R1 = βδ((1−ᾱ)Id + ᾱN).  (34)

Proof

Suppose first that (i,j)=(1,2), and observe that there exists a nonexpansive operator N2 such that R2 = (β/2)(Id+N2); equivalently, (1/β)R2 is (1/2)-averaged. Applying Theorem 4.7 with m=2 and (α1,α2,δ1,δ2) replaced by (α,1/2,δ,β) yields that there exists a nonexpansive operator N such that R2R1 = βδ((1−ᾱ)Id+ᾱN), where

ᾱ = (α + 1/2 − 2·α·(1/2))/(1 − α/2) = (1/2)/(1−α/2) = 1/(2−α) ∈ ]0,1[.  (35)

The case (i,j)=(2,1) follows similarly. □

The assumption α1α2<1 is critical in the conclusion of Theorem 4.2 as we illustrate below.

Example 4.4

(α1=α2>1)

Let α>1, and set R1 = R2 = (1−α)Id + α(−Id) = (1−2α)Id. Then

R2R1 = (1−2α)²Id = (1−4α+4α²)Id.  (36)

Hence, Id−R2R1 = 4α(1−α)Id. Because α>1, we have 4α(1−α)<0; that is, Id−R2R1 is not monotone. Hence, R2R1 is not conically nonexpansive by Lemma 2.9 applied with T replaced by R2R1.

The following proposition provides an abstract framework to construct a family of operators R1 and R2 such that R1 is α1-conically nonexpansive, R2 is α2-conically nonexpansive, α1α2>1, and the composition R2R1 fails to be conically nonexpansive.

Proposition 4.5

Let θ∈ℝ, let α1>0, let α2>0, let

Rθ = [[cosθ, −sinθ], [sinθ, cosθ]],  (37)

set

R1 = (1−α1)Id + α1Rθ,  R2 = (1−α2)Id − α2Rθ,  (38)

and set

κ = α1+α2−2α1α2sin²θ−(α1−α2)cosθ.  (39)

Then R1 is α1-conically nonexpansive, and R2 is α2-conically nonexpansive. Moreover, we have the implication κ<0 ⟹ R2R1 is not conically nonexpansive.

Proof

Set S = Rπ/2, and observe that S² = −Id and that Rθ = (cosθ)Id + (sinθ)S. Now,

R2R1 = ((1−α2)Id − α2Rθ)((1−α1)Id + α1Rθ)  (40a)
= (1−α1−α2+α1α2)Id + (α1−α2)Rθ − α1α2R2θ  (40b)
= (1−α1−α2+α1α2+(α1−α2)cosθ−α1α2cos(2θ))Id + ((α1−α2)sinθ−α1α2sin(2θ))S  (40c)
= (1−α1−α2+α1α2+(α1−α2)cosθ−α1α2(2cos²θ−1))Id + ((α1−α2)sinθ−α1α2sin(2θ))S  (40d)
= (1−α1−α2+2α1α2sin²θ+(α1−α2)cosθ)Id + ((α1−α2)sinθ−α1α2sin(2θ))S.  (40e)

Consequently,

Id−R2R1 = (α1+α2−2α1α2sin²θ−(α1−α2)cosθ)Id − ((α1−α2)sinθ−α1α2sin(2θ))S.  (41)

Hence, (∀x∈ℝ²)

⟨(Id−R2R1)x, x⟩ = (α1+α2−2α1α2sin²θ−(α1−α2)cosθ)‖x‖² = κ‖x‖².  (42)

Now, R2R1 is conically nonexpansive ⟹ Id−R2R1 is monotone by Lemma 2.9, and the conclusion follows in view of (42). □
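The quantity κ in (39) is easy to probe numerically. The sketch below uses our own parameter choices (θ = π/3 with α1 = α2 = 2/sin²θ, as in Example 4.6(i) with ϵ = δ = 1): here α1α2 > 1 and κ < 0, so the composition fails to be conically nonexpansive.

```python
import numpy as np

theta = np.pi / 3
a1 = a2 = (1 + 1.0) / np.sin(theta) ** 2     # Example 4.6(i) with eps = delta = 1
Rt = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
I = np.eye(2)
R1 = (1 - a1) * I + a1 * Rt
R2 = (1 - a2) * I - a2 * Rt

kappa = a1 + a2 - 2 * a1 * a2 * np.sin(theta) ** 2 - (a1 - a2) * np.cos(theta)

# identity (42): <(Id - R2 R1)x, x> = kappa * ||x||^2 for every x
x = np.array([1.0, -2.0])
val = x @ (I - R2 @ R1) @ x
print(np.isclose(val, kappa * (x @ x)), kappa < 0)  # -> True True
```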

The following example provides two concrete instances where: (i) α1>1, α2>1, hence α1α2>1, (ii) α1>1, α2<1, α1α2>1. In both cases, R2R1 is not conically nonexpansive.

Example 4.6

Suppose that one of the following holds:

  • (i)

θ∈]0,π/2[, ϵ≥0, δ≥0 with (ϵ,δ)≠(0,0), α1 = (1+ϵ)/sin²θ, and α2 = (1+δ)/sin²θ.

  • (ii)

θ∈]π/4,π/2[, ϵ > cos²θ(2−cos²θ)/((1−2cos²θ)(1+cosθ)+cosθ), α1 = (1+ϵ)/sin²θ, and α2 = sin²θ.

Let Rθ be defined as in (37), let R1 = (1−α1)Id + α1Rθ, and let R2 = (1−α2)Id − α2Rθ. Then α1α2>1, and R2R1 is not conically nonexpansive.

Proof

Let κ be defined as in (39). In view of Proposition 4.5, it is sufficient to show that κ<0. (i): Note that κ<0 ⟺ κsin²θ<0. Now,

κsin²θ = 2+ϵ+δ−(ϵ−δ)cosθ−2−2ϵ−2δ−2ϵδ  (43a)
= −(ϵ(1+cosθ) + δ(1−cosθ) + 2ϵδ) < 0.  (43b)

(ii): We have

κ = (1+ϵ+sin⁴θ)/sin²θ − 2(1+ϵ)sin²θ − ((1+ϵ−sin⁴θ)/sin²θ)cosθ  (44a)
= −(1/sin²θ)(2(1+ϵ)sin⁴θ − (1+ϵ+sin⁴θ) + (1+ϵ−sin⁴θ)cosθ)  (44b)
= −(1/(1−cos²θ))((2sin⁴θ+cosθ−1)ϵ + sin⁴θ(1−cosθ) − (1−cosθ))  (44c)
= −((1−cosθ)/(1−cos²θ))((2(1+cosθ)(1−cos²θ)−1)ϵ + cos⁴θ−2cos²θ)  (44d)
= −(1/(1+cosθ))((1+2cosθ−2cos²θ−2cos³θ)ϵ − cos²θ(2−cos²θ))  (44e)
= −(1/(1+cosθ))(((1−2cos²θ)(1+cosθ)+cosθ)ϵ − cos²θ(2−cos²θ)).  (44f)

Now, observe that (∀θ∈]π/4,π/2[) 1−2cos²θ = −cos(2θ) > 0. Consequently, (1−2cos²θ)(1+cosθ)+cosθ > cosθ > 0. Now use the assumption ϵ > cos²θ(2−cos²θ)/((1−2cos²θ)(1+cosθ)+cosθ) to learn that ((1−2cos²θ)(1+cosθ)+cosθ)ϵ − cos²θ(2−cos²θ) > 0; hence κ<0, and the conclusion follows. □

Theorem 4.7

(composition of m scaled conically nonexpansive operators)

Let m≥2 be an integer, set I = {1,…,m}, let (Ri)i∈I be a family of operators from X to X, let r∈I, let αi be real numbers such that {αi : i∈I∖{r}} ⊆ ]0,1[ and αr>0, let δi be real numbers in ℝ∖{0}, and suppose that, for every i∈I, (1/δi)Ri is αi-conically nonexpansive. Set

α̃ = (∑_{i∈I∖{r}} αi/(1−αi)) / (1 + ∑_{i∈I∖{r}} αi/(1−αi)).  (45)

Suppose that αrα̃ < 1, and set

α = (∑_{i∈I} αi/(1−αi)) / (1 + ∑_{i∈I} αi/(1−αi)) if αr ≠ 1; α = 1 if αr = 1.  (46)

Then there exists a nonexpansive operator N:X→X such that

Rm⋯R1 = δm⋯δ1((1−α)Id + αN).  (47)

Proof

First, observe that (∀i∈I∖{r}) (1/δi)Ri is αi-averaged, hence nonexpansive. If αr=1, then (∀i∈{1,…,m}) Ri is |δi|-Lipschitz continuous and the conclusion readily follows. Now, suppose that αr≠1. We proceed by induction on k∈{2,…,m}. At k=2, the claim holds by Theorem 4.2. Now, suppose that the claim holds for some k∈{2,…,m−1}. Let (Ri)1≤i≤k+1 be a family of operators from X to X, let r∈{1,…,k+1}, let αi be real numbers such that {αi : i∈{1,…,k+1}∖{r}} ⊆ ]0,1[ and αr∈]0,+∞[∖{1}, let δi be real numbers in ℝ∖{0}, and suppose that, for every i∈{1,…,k+1}, (1/δi)Ri is αi-conically nonexpansive. Set β = (∑_{i≤k+1, i≠r} αi/(1−αi))/(1 + ∑_{i≤k+1, i≠r} αi/(1−αi)), and suppose that αrβ<1. We examine two cases.

Case I: r = k+1. In this case the conclusion follows by applying Theorem 4.2, in view of the inductive hypothesis, with (R1,R2) replaced by (Rk⋯R1, Rk+1) and (δ1,δ2,α1,α2) replaced by (δ1⋯δk, δk+1, (∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)), αk+1).

Case II: r ≤ k. We claim that

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1.  (48)

To this end, set s = ∑_{i≤k, i≠r} αi/(1−αi) and α̂ = s/(1+s), and observe that α̂ < β. By assumption we have αrβ<1. Altogether, we conclude that αrα̂ < 1. It follows from the inductive hypothesis that

(1/(δ1⋯δk))(Rk⋯R1) is (∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi))-conically nonexpansive.  (49)

Next note that

(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) = (s + αr/(1−αr))/(1 + s + αr/(1−αr))  (50a)
= (α̂(1−αr)(1+s) + αr)/((1−αr)(1+s) + αr)  (50b)
= (αr(1−α̂(1+s)) + α̂(1+s))/(1+(1−αr)s).  (50c)

Because αrα̂ < 1, we learn that 1+(1−αr)s > 0. Moreover, because α̂ < 1, we have αk+1α̂ < 1. Therefore, (50a)–(50c) implies

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1  (51a)
⟺ αk+1(αr(1−α̂(1+s)) + α̂(1+s)) < 1+(1−αr)s  (51b)
⟺ αr(αk+1(1−α̂(1+s)) + s) < (1+s)(1−αk+1α̂)  (51c)
⟺ αr(αk+1(1−s) + s) < (1+s)(1−αk+1α̂)  (51d)
⟺ αr(αk+1(1−s)+s)/((1+s)(1−αk+1α̂)) < 1.  (51e)

Now, observe that

αk+1(1−s)+s = (s + αk+1/(1−αk+1))(1−αk+1) = (∑_{i≤k+1, i≠r} αi/(1−αi))(1−αk+1)  (52)

and

(1+s)(1−αk+1α̂) = 1+s−αk+1s  (53a)
= (1+s+αk+1/(1−αk+1))(1−αk+1)  (53b)
= (1+∑_{i≤k+1, i≠r} αi/(1−αi))(1−αk+1).  (53c)

In view of (52) and (53a)–(53c), (51a)–(51e) becomes

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1 ⟺ αr·(∑_{i≤k+1, i≠r} αi/(1−αi))/(1+∑_{i≤k+1, i≠r} αi/(1−αi)) = αrβ < 1,  (54)

and the right-hand side holds by assumption. This proves (48). Now proceed similarly to Case I in view of (48) and (49). □

The assumption αrα̃ < 1 is critical in the conclusion of the above theorem, as we illustrate in the following example.

Example 4.8

Let ϵ>0, let δ≥1, let α1 ∈ ]0, (√((ϵ+δ)²+4)−(ϵ+δ))/2[, let α2 = α1+δ+ϵ, and let

S = [[0, −1], [1, 0]].  (55)

Set R1 = (1−α1)Id−α1S, R2 = (1−α2)Id+α2S, R3 = −(1/δ)S, and

R = R3R2R1.  (56)

Then R = R3R1R2 = R1R2R3 = R1R3R2 = R2R3R1 = R2R1R3. Moreover, the following hold:

  • (i)

    α1∈]0,1[, α2>1, and α1α2<1.

  • (ii)

    R3 is α3-conically nonexpansive, where α3 = (1+δ)/(2δ) ∈ ]1/2,1].

  • (iii)

    ((α1+α2−2α1α2)/(1−α1α2))·α3 > 1.

  • (iv)

    R = ((ϵ+δ)/δ)Id + ((α1+α2−2α1α2−1)/δ)S.

  • (v)

    Id−R = −(ϵ/δ)Id − ((α1+α2−2α1α2−1)/δ)S. Hence, Id−R is not monotone.

  • (vi)

    R is not conically nonexpansive.

Proof

It is straightforward to verify that R = R3R1R2 = R1R2R3 = R1R3R2 = R2R3R1 = R2R1R3. (i): It is clear that α1∈]0,1[ and that α2>1. Note that α1α2<1 ⟺ α1²+(ϵ+δ)α1−1<0 ⟺ α1 lies between the roots of the quadratic x²+(ϵ+δ)x−1, and the conclusion follows from the quadratic formula. (ii): This follows from [2, Proposition 4.38]. (iii): Indeed, in view of (i) we have

((α1+α2−2α1α2)/(1−α1α2))·α3 > 1 ⟺ (α1+α2−2α1α2)α3 > 1−α1α2  (57a)
⟺ (α1+α2−2α1α2)(1+δ) > 2(1−α1α2)δ  (57b)
⟺ (α1+α2)(1+δ) − 2α1α2 − 2α1α2δ > 2δ − 2α1α2δ  (57c)
⟺ (α1+α2)(1+δ) − 2α1α2 > 2δ  (57d)
⟺ (2α1+ϵ+δ)(1+δ) − 2α1(α1+ϵ+δ) > 2δ  (57e)
⟺ 2α1(1+δ−α1−ϵ−δ) + δ² + δ(1+ϵ) + ϵ > 2δ  (57f)
⟺ 2α1(α1−1+ϵ) < δ²−δ+ϵδ+ϵ = δ²−δ+(1+δ)ϵ.  (57g)

Now, because α1<1 and δ≥1, we learn that 2α1(α1−1+ϵ) < 2α1ϵ < (1+δ)ϵ ≤ (1+δ)ϵ+δ²−δ, and the conclusion follows. (iv): It is straightforward, by noting that S²=−Id, to verify that R2R1 = R1R2 = (1−α1−α2+α1α2)Id + (α2−α1)S − α1α2S² = (1−α1−α2+2α1α2)Id + (α2−α1)S. Consequently, R3R2R1 = −(1/δ)((1−α1−α2+2α1α2)S + (α2−α1)S²) = (1/δ)((α2−α1)Id − (1−α1−α2+2α1α2)S) = ((ϵ+δ)/δ)Id + ((α1+α2−2α1α2−1)/δ)S. (v): This is a direct consequence of (iv); indeed, (∀x∈ℝ²∖{0}) ⟨(Id−R)x, x⟩ = −(ϵ/δ)‖x‖² < 0. (vi): Combine (v) and Lemma 2.9. □

Theorem 4.9

(Composition of cocoercive operators)

Let m≥1 be an integer, set I = {1,…,m}, let (Ri)i∈I be a family of operators from X to X, let βi be real numbers in ]0,+∞[, and suppose that, for every i∈I, Ri is (1/βi)-cocoercive. Then there exists a nonexpansive operator N:X→X such that

Rm⋯R1 = βm⋯β1((1/(1+m))Id + (m/(1+m))N).  (58)

Proof

Apply Theorem 4.7 with (αi,δi) replaced by (1/2,βi), i∈{1,…,m}. □
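Theorem 4.9 can be stress-tested numerically. In the sketch below (our own construction), three cocoercive operators are built as βi times firmly nonexpansive resolvents of random monotone linear maps, and the scaled composition is checked against the m/(1+m)-averagedness inequality.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 4, 3
I = np.eye(n)

Rs, betas = [], []
for _ in range(m):
    B = rng.standard_normal((n, n))
    M = B @ B.T + (B - B.T)              # monotone: PSD part + skew part
    F = np.linalg.inv(I + M)             # firmly nonexpansive resolvent
    beta = rng.uniform(0.5, 2.0)
    Rs.append(beta * F)                  # (1/beta)-cocoercive operator
    betas.append(beta)

R = Rs[2] @ Rs[1] @ Rs[0]                # composition R3 R2 R1
T = R / np.prod(betas)                   # scaled composition
a = m / (1 + m)                          # claimed averagedness constant

ok = True
for _ in range(1000):
    d = rng.standard_normal(n)
    t = T @ d
    ok = ok and t @ t + (1 - a) / a * ((d - t) @ (d - t)) <= d @ d + 1e-10
print(ok)  # -> True
```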

Application to the Douglas–Rachford algorithm

Theorem 5.1

(Averagedness of the Douglas–Rachford operator)

Let μ>ω≥0, and let γ∈]0,(μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

A is maximally (−ω)-monotone and B is maximally μ-monotone.

  • (ii)

A is maximally μ-monotone and B is maximally (−ω)-monotone.

Set

T = (1/2)(Id + RγBRγA),  and  α = (μ−ω)/(2(μ−ω−γμω)).  (59)

Then α∈]0,1[ and T is α-averaged.

Proof

Suppose that (i) holds. Note that γA is (−γω)-monotone, and

−γω > −(μ−ω)/(2μ) ≥ −1/2 > −1.  (60)

Using (13) and Fact 2.1 we learn that JγA and, in turn, T are single-valued and dom JγA = dom T = X. It follows from [3, Proposition 4.3 and Table 1] that RγA is 1/(1−γω)-conically nonexpansive and RγB is 1/(1+γμ)-conically nonexpansive. It follows from Theorem 4.2, applied with (α1,δ1,α2,δ2) replaced by (1/(1−γω),1,1/(1+γμ),1), that RγBRγA is (μ−ω)/(μ−ω−γμω)-conically nonexpansive. Therefore, there exists a nonexpansive mapping N:X→X such that

RγBRγA = (1−δ)Id + δN,  δ = (μ−ω)/(μ−ω−γμω).  (61)

The conclusion now follows by applying Proposition 2.8 with (β,M) replaced by (1/2, RγBRγA), which shows that T is (δ/2)-averaged. Finally, notice that γ < (μ−ω)/(2μω), which implies that 0 < μ−ω < 2(μ−ω−γμω). Therefore,

α = (μ−ω)/(2(μ−ω−γμω)) ∈ ]0,1[.  (62)

The proof of (ii) follows similarly. □

Corollary 5.2

([9, Theorem 4.5(ii)])

Let μ>ω≥0, and let γ∈]0,(μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

A is maximally (−ω)-monotone and B is maximally μ-monotone.

  • (ii)

A is maximally μ-monotone and B is maximally (−ω)-monotone.

Set T = (1/2)(Id+RγBRγA) and let x0∈X. Then there exists x̄ ∈ Fix T = Fix RγBRγA such that Tⁿx0 ⇀ x̄.

Proof

Combine Theorem 5.1 and [2, Theorem 5.15]. □
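A scalar sketch of this phenomenon (our own toy instance, not from the paper): A = −ωId is not monotone, B(x) = μ(x−1) is strongly monotone, and the sum is (μ−ω)-strongly monotone. DR still converges, and the shadow sequence JγA(zⁿ) reaches the zero of A+B.

```python
mu, omega = 2.0, 0.5
gamma = 0.5                 # gamma < (mu - omega)/(2*mu*omega) = 0.75

J_gA = lambda x: x / (1 - gamma * omega)              # resolvent of gamma*A, A(x) = -omega*x
R_gA = lambda x: 2 * J_gA(x) - x
J_gB = lambda x: (x + gamma * mu) / (1 + gamma * mu)  # resolvent of gamma*B, B(x) = mu*(x-1)
R_gB = lambda x: 2 * J_gB(x) - x

z = 10.0
for _ in range(100):
    z = 0.5 * (z + R_gB(R_gA(z)))       # Douglas-Rachford iteration

# zer(A+B): -omega*x + mu*(x-1) = 0  =>  x = mu/(mu-omega) = 4/3
print(round(J_gA(z), 6))   # -> 1.333333
```

Note that A is −ω-monotone but not monotone, so classical DR theory does not apply directly; only the sum is (strongly) monotone here.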

Remark 5.3

In view of (13), one might think that the scaling factor γ is required only to guarantee the single-valuedness and the full domain of T. However, it is actually critical to guarantee convergence as well, as we illustrate in Example 5.4.

Example 5.4

Let μ>ω>0, let U be a closed linear subspace of X, and suppose that

A = NU + μId,  B = −ωId.  (63)

Then A is μ-monotone, B is (−ω)-monotone, and (∀γ∈[1/(2ω), 1/ω[) JγB is single-valued. Furthermore, we have

T = (1/2)(Id+RγBRγA) = ((1+γω)/((1−γω)(1+γμ)))PU − (γω/(1−γω))Id,  (64)

and (∀x0∈U⊥∖{0}) (Tⁿx0)n∈ℕ does not converge.

Proof

Indeed, one can verify that

JγA = (1/(1+γμ))PU,  JγB = (1/(1−γω))Id.  (65)

Consequently,

RγA = (2/(1+γμ))PU − Id,  RγB = ((1+γω)/(1−γω))Id,  (66)

and (64) follows. Therefore,

T|U⊥ = −(γω/(1−γω))Id  and  −γω/(1−γω) ∈ ]−∞,−1].  (67)

Hence, (∀x0∈U⊥∖{0}) (Tⁿx0)n∈ℕ does not converge. □

Before we proceed to the convergence analysis, we recall that if T is averaged and Fix T ≠ ∅, then (∀x∈X) we have (see, e.g., [22, Theorem 3.7])

Tⁿx − Tⁿ⁺¹x → 0.  (68)

We conclude this section by proving the strong convergence of the shadow sequence of the Douglas–Rachford algorithm.

Theorem 5.5

(Convergence analysis of the Douglas–Rachford algorithm)

Let μ>ω≥0, and let γ ∈ ]0, (μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

    A is maximally μ-monotone and B is maximally (ω)-monotone.

  • (ii)

    A is maximally (ω)-monotone and B is maximally μ-monotone.

Set

T = ½(Id + RγBRγA), (69)

and let x0 ∈ X. Then zer(A+B) ≠ ∅. Moreover, there exists x ∈ Fix T = Fix RγBRγA such that zer(A+B) = {JγA x} = {JγB RγA x}, T^n x0 ⇀ x, JγA T^n x0 → JγA x, and JγB RγA T^n x0 → JγB RγA x.

Proof

Suppose that (i) holds. Since A+B is (μ−ω)-monotone and μ−ω>0, we conclude from [2, Proposition 23.35] that zer(A+B) is a singleton. Combining this with Fact 2.1, applied with (A,B) replaced by (γA,γB), yields zer(A+B) = zer(γA+γB) = {JγA x} = {JγB RγA x}. The claim that T^n x0 ⇀ x follows from Corollary 5.2. It remains to show that JγA T^n x0 → JγA x and JγB RγA T^n x0 → JγB RγA x. To this end, note that (T^n x0)_{n∈ℕ} is bounded; consequently, since JγA and JγB RγA are Lipschitz continuous (see Proposition 2.16(i)&(ii)), we learn that

(JγA T^n x0)_{n∈ℕ} and (JγB RγA T^n x0)_{n∈ℕ} are bounded. (70)

On the one hand, in view of (68) we have

(Id−T)T^n x0 = T^n x0 − T^{n+1} x0 = JγA T^n x0 − JγB RγA T^n x0 → 0. (71)

Combining (70) and (71) yields

‖JγA T^n x0 − JγA x‖² − ‖JγB RγA T^n x0 − JγB RγA x‖² (72a)
= ⟨JγA T^n x0 − JγB RγA T^n x0, JγA T^n x0 + JγB RγA T^n x0 − JγA x − JγB RγA x⟩ (72b)
= ⟨T^n x0 − T^{n+1} x0, JγA T^n x0 + JγB RγA T^n x0 − JγA x − JγB RγA x⟩ → 0. (72c)

On the other hand, combining Lemma 2.3, applied with (R1, R2, R(λ), λ) replaced by (RγA, RγB, T, 1/2) and (x,y) replaced by (T^n x0, x), with (68) yields

0 ← ⟨T^{n+1} x0 − x, T^n x0 − T^{n+1} x0⟩ (73a)
≥ γμ(‖JγA T^n x0 − JγA x‖² − (ω/μ)‖JγB RγA T^n x0 − JγB RγA x‖²) (73b)
≥ −γ(μω/(μ−ω))‖T^n x0 − T^{n+1} x0‖² → 0. (73c)

Therefore,

‖JγA T^n x0 − JγA x‖² − (ω/μ)‖JγB RγA T^n x0 − JγB RγA x‖² → 0. (74)

Combining (72a)–(72c) and (74) and noting that ω/μ < 1 yields ‖JγA T^n x0 − JγA x‖² → 0 and ‖JγB RγA T^n x0 − JγB RγA x‖² → 0, which proves (i). The proof of (ii) proceeds similarly. □
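For intuition, the scalar instance A = μ·Id, B = −ω·Id (a toy model chosen here for illustration, not the general argument) can be iterated directly; both the governing sequence and the shadow sequence converge to the unique zero of A+B at the origin:

```python
mu, om = 2.0, 0.5
gamma = 0.5                                   # < (μ-ω)/(2μω) = 0.75
JA = lambda x: x / (1 + gamma * mu)           # J_{γA} for A = μ·Id
RA = lambda x: 2 * JA(x) - x                  # reflected resolvent of γA
RB = lambda x: (1 + gamma * om) / (1 - gamma * om) * x   # R_{γB} for B = -ω·Id
T = lambda x: 0.5 * (x + RB(RA(x)))           # Douglas-Rachford operator (69)
x = 1.0
for _ in range(200):
    x = T(x)
assert abs(x) < 1e-8        # T^n x0 converges to the fixed point 0
assert abs(JA(x)) < 1e-8    # shadow J_{γA} T^n x0 converges to zer(A+B) = {0}
```

With these constants T contracts by a factor ½ per iteration, in line with the averagedness constant of Theorem 5.1.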

Remark 5.6

(Relaxed Douglas–Rachford algorithm)

A careful look at the proofs of Theorem 5.1 and Theorem 5.5 reveals that analogous conclusions can be drawn for the relaxed Douglas–Rachford operator defined by Tλ = (1−λ)Id + λRγBRγA, λ ∈ ]0,1[. In this case, we choose γ ∈ ]0, ((1−λ)(μ−ω))/(μω)[. One can verify that the corresponding averagedness constant is α = λ(μ−ω)/(μ−ω−γμω) ∈ ]0,1[.

Application to the forward–backward algorithm

Throughout this section we assume that

A: X→X, B: X⇉X, μ≥0, ω≥0, and β>0.

In the rest of this section, we prove that the forward–backward operator is averaged, hence its iterates form a weakly convergent sequence in each of the following situations:

A is maximally μ-monotone, A−μId is (1/β)-cocoercive, B is maximally (−ω)-monotone, and μ≥ω.

A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, B is maximally μ-monotone, and μ≥ω.

A is β-Lipschitz continuous, B is maximally μ-monotone, and μ≥β.

That is, we do not require A and B to be monotone; it is enough that the sum A+B is monotone to obtain an averaged forward–backward map. In addition, we show that the forward–backward map is a Banach contraction if the sum A+B is strongly monotone, and we prove the tightness of our contraction factor.

Theorem 6.1

(Case I: A is μ-monotone)

Let μ≥ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ ]0, 2/(β+2μ)[. Set T = JγB(Id−γA), set ν = γβ/(2(1−γμ)), set δ = (1−γμ)/(1−γω), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Clearly, δ ∈ ]0,1] and ν>0. Moreover, we have ν<1 ⇔ γβ<2(1−γμ) ⇔ γ<2/(β+2μ). Hence, ν ∈ ]0,1[ as claimed. Next, note that μ < (β+2μ)/2; hence γω ≤ γμ < 2μ/(β+2μ) < 1. It follows from Proposition 2.2 that JγB and, in turn, T are single-valued and dom JγB = dom T = X. The assumption on A implies that there exists a nonexpansive operator N: X→X such that A−μId = (β/2)Id + (β/2)N. Therefore,

Id−γA = Id − γ(A−μId) − γμId = (1−γμ)Id − (γβ/2)(Id+N) (75a)
= (1−γμ)((1−ν)Id + ν(−N)). (75b)

Moreover, Proposition 2.16(i) implies that

JγB is (1−γω)-cocoercive. (76)

(i): It follows from Corollary 4.3, applied with (R1,R2) replaced by (Id−γA, JγB) and (α,β,δ) replaced by (ν, 1/(1−γω), 1−γμ), in view of (75a)–(75b) and (76), that there exists a nonexpansive operator N such that T = JγB(Id−γA) = δ((1−ν)Id+νN). (ii): Combine (i) and Lemma 2.15(i). (iii): Combine (i) and (ii). (iv): Applying Proposition 2.2 with (A,B) replaced by (γA,γB) yields zer(A+B) = zer(γA+γB) = Fix T. The claim that T^n x0 ⇀ x follows from combining (ii) and [2, Theorem 5.15]. (v): Observe that δ<1 ⇔ μ>ω. Now combine with (iii). (vi): Since A+B is maximally (μ−ω)-monotone and μ−ω>0, we conclude from [2, Proposition 23.35] that zer(A+B) is a singleton. Alternatively, use (iii) to learn that T is a Banach contraction with constant δ<1; hence zer(A+B) = Fix T is a singleton, and the conclusion follows. □
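The Lipschitz bound δ of part (iii) can be probed numerically in one dimension. In the sketch below (all constants and the choice N = sin are our illustrative assumptions), A−μId = (β/2)(Id+sin) is (1/β)-cocoercive in the sense of footnote 3, since sin is nonexpansive, and B = −ω·Id so that JγB = (1−γω)^(−1)·Id:

```python
import math, itertools

mu, om, beta = 0.5, 0.1, 2.0
gamma = 0.4                                    # < 2/(β+2μ) = 2/3
delta = (1 - gamma * mu) / (1 - gamma * om)    # claimed Lipschitz constant of T

A = lambda x: mu * x + (beta / 2) * (x + math.sin(x))   # A-μId is (1/β)-cocoercive
T = lambda x: (x - gamma * A(x)) / (1 - gamma * om)     # JγB(Id-γA), B = -ω·Id

pts = [i / 7 for i in range(-20, 21)]
for x, y in itertools.combinations(pts, 2):
    # Theorem 6.1(iii): |Tx - Ty| ≤ δ|x - y|
    assert abs(T(x) - T(y)) <= delta * abs(x - y) + 1e-12
```

Here δ ≈ 0.83 < 1, so T is a Banach contraction, consistent with part (v) since μ > ω.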

Theorem 6.2

Let μ>ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ [2/(β+2μ), 2/(β+μ+ω)[. Set T = JγB(Id−γA), set ν = γβ/(2(γ(μ+β)−1)), set δ = (1−γ(μ+β))/(1−γω), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

We proceed similarly to the proof of Theorem 6.1 to verify that T is single-valued, dom T = X, ν ∈ ]0,1[, and δ ∈ ]−1,0]. The assumption on A implies that there exists a nonexpansive operator N: X→X such that A−μId = (β/2)Id + (β/2)N. Therefore,

Id−γA = Id − γ(A−μId) − γμId = (1−γμ)Id − (γβ/2)(Id+N) (77a)
= (1−γ(μ+β))((1−ν)Id + νN). (77b)

Now proceed similarly to the proof of Theorem 6.1(i), (v), and (vi) in view of (76). □

Corollary 6.3

Let μ>ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ ]0, 2/(β+μ+ω)[. Set T = JγB(Id−γA), set δ = max{1−γμ, γ(μ+β)−1}/(1−γω), and let x0 ∈ X. Then δ ∈ [0,1[, T is a Banach contraction with constant δ, and there exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x.

Proof

Combine Theorem 6.1 and Theorem 6.2. □

Remark 6.4

(Tightness of the Lipschitz constant)

  • (i)

Suppose that the setting of Theorem 6.1 holds. Set (A,B) = (μId, −ωId). Then T = ((1−γμ)/(1−γω))Id. Hence, the claimed Lipschitz constant is tight.

  • (ii)

Suppose that the setting of Theorem 6.2 holds. Set (A,B) = ((μ+β)Id, −ωId). Then T = ((1−γ(μ+β))/(1−γω))Id, so that ‖T‖ = (γ(μ+β)−1)/(1−γω). Hence, the claimed contraction factor is tight.

Note in particular that the worst cases are subgradients of convex functions. Hence, the worst cases are attained by the proximal gradient method.
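The tightness claim in (i) is immediate to confirm in one dimension, where (A,B) = (μId, −ωId) makes T exactly δ·Id (the constants are illustrative choices):

```python
mu, om = 1.0, 0.2
gamma = 0.3                                    # in ]0, 2/(β+2μ)[ for, e.g., β = 4
delta = (1 - gamma * mu) / (1 - gamma * om)
T = lambda x: (x - gamma * mu * x) / (1 - gamma * om)   # JγB(Id-γA) in 1-D
assert abs(T(1.0) - delta) < 1e-15             # T is exactly δ·Id: the bound is attained
```

No nonlinearity is needed: the worst case is already realized by these linear (gradient) operators.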

Theorem 6.5

(Case II: A+ωId is cocoercive)

Let μ≥ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ ]0, 2/(β̄−2ω)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(1+γω)), set δ = (1+γω)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Observe that the assumption on A and Lemma 2.11, applied with T replaced by A+ωId, imply that there exists a nonexpansive operator N: X→X such that A+ωId = (β̄/2)Id + (β̄/2)N. Therefore,

Id−γA = Id − γ(A+ωId) + γωId = (1+γω)Id − (γβ̄/2)(Id+N) (78a)
= (1+γω)((1−ν)Id + ν(−N)). (78b)

Moreover, Proposition 2.16(i) implies that

JγB is (1+γμ)-cocoercive. (79)

Now proceed similarly to the proof of Theorem 6.1, using (78a)–(78b) and (79). □

Theorem 6.6

Let μ>ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ [2/(β̄−2ω), 2/(β̄−μ−ω)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(γβ̄−γω−1)), set δ = (1+γω−γβ̄)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

Observe that the assumption on A and Lemma 2.11, applied with T replaced by A+ωId, implies that there exists a nonexpansive operator N: X→X such that A+ωId = (β̄/2)Id + (β̄/2)N. Therefore,

Id−γA = Id − γ(A+ωId) + γωId = (1+γω)Id − (γβ̄/2)(Id+N) (80a)
= (1+γω−γβ̄)((1−ν)Id + νN). (80b)

Now proceed similarly to the proof of Theorem 6.5 in view of (79). □

Corollary 6.7

Let μ>ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ ]0, 2/(β̄−μ−ω)[. Set T = JγB(Id−γA), set δ = max{1+γω, γβ̄−γω−1}/(1+γμ), and let x0 ∈ X. Then δ ∈ [0,1[, T is a Banach contraction with constant δ, and there exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x.

Proof

Combine Theorem 6.5 and Theorem 6.6. □

Theorem 6.8

(Case III: A is β-Lipschitz continuous)

Let μ≥β>0. Suppose that A is β-Lipschitz continuous and that B is maximally μ-monotone. Let β̄ ∈ ]2β, +∞[, and let γ ∈ ]0, 2/(β̄−2β)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(1+γβ)), set δ = (1+γβ)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>β. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Combine Lemma 2.12 and Theorem 6.5 applied with (ω,β) replaced by (β,2β). □

Theorem 6.9

Let μ>β>0. Suppose that A is β-Lipschitz continuous and that B is maximally μ-monotone. Let β̄ ∈ ]μ+β, +∞[, and let γ ∈ [2/(β̄−2β), 2/(β̄−μ−β)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(γβ̄−γβ−1)), set δ = (1+γβ−γβ̄)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

Combine Lemma 2.12 and Theorem 6.6 applied with (ω,β) replaced by (β,2β). □

Applications to optimization problems

Let f: X → ]−∞,+∞], and let g: X → ]−∞,+∞]. Throughout this section, we shall assume that

f and g are proper lower semicontinuous functions.

We shall use ∂f to denote the subdifferential mapping from convex analysis.

Definition 7.1

(see [3, Definition 6.1])

An abstract subdifferential # associates a subset #f(x) of X with f at each x ∈ X, and it satisfies the following properties:

  • (i)

#f = ∂f if f is a proper lower semicontinuous convex function;

  • (ii)

#f = ∇f if f is continuously differentiable;

  • (iii)

0 ∈ #f(x) if f attains a local minimum at x ∈ dom f;

  • (iv)

    for every β ∈ ℝ and every x ∈ X, #(f + (β/2)‖·−x‖²) = #f + β(Id−x).

The Clarke–Rockafellar subdifferential, the Mordukhovich subdifferential, and the Fréchet subdifferential all satisfy Definition 7.1(i)–(iv) (see, e.g., [5, 19, 20]), so each of them is an abstract subdifferential #.

Let λ>0. Recall that f is λ-hypoconvex (see [23, 26]) if

f((1−τ)x + τy) ≤ (1−τ)f(x) + τf(y) + (λ/2)τ(1−τ)‖x−y‖² (81)

for all (x,y) ∈ X×X and τ ∈ ]0,1[ or, equivalently, if

f + (λ/2)‖·‖² is convex. (82)

For γ>0, the proximal mapping Proxγf is defined at x ∈ X by

Proxγf(x) = argmin_{y∈X} (f(y) + (1/(2γ))‖x−y‖²). (83)
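As a concrete 1-D illustration of (83) in the hypoconvex setting (the function, constants, and the 1/(2γ) scaling of the penalty are the assumptions of this sketch), the prox of f(y) = −(λ/2)y² has the closed form x/(1−γλ) whenever γ < 1/λ, which makes the argmin single-valued; the grid search is only a crude numerical cross-check:

```python
lam, gamma, x = 0.5, 1.0, 2.0                 # γ < 1/λ = 2: prox is single-valued
closed_form = x / (1 - gamma * lam)           # stationarity: -λy + (y - x)/γ = 0
obj = lambda y: -0.5 * lam * y * y + (y - x) ** 2 / (2 * gamma)
grid = [i * 1e-3 for i in range(-10000, 10001)]   # y in [-10, 10]
num = min(grid, key=obj)
assert abs(num - closed_form) < 1e-2          # grid minimizer matches closed form
```

For γ ≥ 1/λ the objective in (83) loses convexity (and eventually coercivity), so the restriction γ ∈ ]0, 1/λ[ below is essential.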

Fact 7.2

Suppose that f: X → ]−∞,+∞] is a proper lower semicontinuous λ-hypoconvex function. Then

#f = ∂(f + (λ/2)‖·‖²) − λId. (84)

Moreover, we have:

  • (i)

The Clarke–Rockafellar, Mordukhovich, and Fréchet subdifferential operators of f all coincide.

  • (ii)

#f is maximally (−λ)-monotone.

  • (iii)

(∀γ ∈ ]0,1/λ[) Proxγf is single-valued and dom Proxγf = X.

Proof

See [3, Proposition 6.2 and Proposition 6.3]. □

Proposition 7.3

Let μ≥ω≥0. Suppose that argmin(f+g) ≠ ∅ and that one of the following conditions is satisfied:

  • (i)

f is μ-strongly convex and g is ω-hypoconvex.

  • (ii)

f is ω-hypoconvex and g is μ-strongly convex.

Then f+g is convex and #(f+g) = ∂(f+g).

If, in addition, one of the following conditions is satisfied:

  (a) 0 ∈ sri(dom f − dom g).

  (b) X is finite-dimensional and 0 ∈ ri(dom f − dom g).

  (c) X is finite-dimensional, f and g are polyhedral, and dom f ∩ dom g ≠ ∅.

Then

#(f+g) = ∂(f+g) = #f + #g, (85)

and

zer #(f+g) = zer(#f + #g) = argmin(f+g). (86)

Proof

It is clear that either (i) or (ii) implies that f+g is convex, and the identity #(f+g) = ∂(f+g) follows in view of Definition 7.1(i). Now, suppose that (i) holds along with one of the assumptions (a)–(c). Write (f̃, g̃) = (f − (μ/2)‖·‖², g + (ω/2)‖·‖²) and observe that both f̃ and g̃ are convex, as is f̃+g̃. Moreover, we have dom f̃ = dom f and dom g̃ = dom g. Now,

#(f+g) = #(f̃ + g̃ + ((μ−ω)/2)‖·‖²) (87a)
= #(f̃+g̃) + (μ−ω)Id = ∂(f̃+g̃) + (μ−ω)Id (87b)
= ∂f̃ + ∂g̃ + (μ−ω)Id = (∂f̃ + μId) + (∂g̃ − ωId) (87c)
= ∂f + #g = #f + #g. (87d)

Here, (87b) follows from applying Definition 7.1(iv) to f̃+g̃, (87c) follows from [2, Theorem 16.47] applied to f̃ and g̃, and (87d) follows from applying Fact 7.2 to g and Definition 7.1(i) to f; this verifies (85). Finally, (86) follows from combining (85) and [2, Theorem 16.3]. □

The following theorem provides an alternative proof to [17, Theorem 4.4] and [9, Theorem 5.4(ii)].

Theorem 7.4

Let μ>ω≥0, and let γ ∈ ]0, (μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

f is μ-strongly convex and g is ω-hypoconvex;

  • (ii)

f is ω-hypoconvex and g is μ-strongly convex;

and that zer(#f + #g) ≠ ∅ (see Proposition 7.3 for sufficient conditions). Set

T = ½(Id + (2Proxγg − Id)(2Proxγf − Id)) and α = (μ−ω)/(2(μ−ω−γμω)), (88)

and let x0 ∈ X. Then α ∈ ]0,1[, and T is α-averaged. Moreover, there exists x ∈ Fix T such that T^n x0 ⇀ x, argmin(f+g) = {Proxγf x}, and Proxγf T^n x0 → Proxγf x.

Proof

Suppose that (i) holds. Then [2, Example 22.4] (respectively Fact 7.2(ii)) implies that #f = ∂f (respectively #g) is maximally μ-monotone (respectively maximally (−ω)-monotone). The conclusion follows from applying Theorem 5.5(i) with (A,B) replaced by (#f, #g). The proof for (ii) follows similarly by using Theorem 5.5(ii). □

Before we proceed further, we recall the following useful fact.

Fact 7.5

(Baillon–Haddad)

Let f: X → ℝ be a Fréchet differentiable convex function, and let β>0. Then ∇f is β-Lipschitz continuous if and only if ∇f is (1/β)-cocoercive.

Proof

See, e.g., [2, Corollary 18.17]. □

Lemma 7.6

Let μ≥0, let β>0, and let f: X → ℝ be a Fréchet differentiable function. Suppose that f is μ-strongly convex with a β-Lipschitz continuous gradient. Then the following hold:

  • (i)

f − (μ/2)‖·‖² is convex.

  • (ii)

∇f is maximally μ-monotone.

  • (iii)

∇f − μId is (1/β)-cocoercive.

Proof

(i): See, e.g., [2, Proposition 10.8]. (ii): See, e.g., [2, Example 22.4(iv)]. (iii): Combine (i), Lemma 2.10, and Corollary 2.14(ii) applied with (f1,f2) replaced by (f, (μ/2)‖·‖²). □

Theorem 7.7

(The forward–backward algorithm when f is μ-strongly convex)

Let μ≥ω≥0, and let β>0. Let f be μ-strongly convex and Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be ω-hypoconvex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β+2μ)[, and set δ = (1−γμ)/(1−γω). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Note that Definition 7.1(ii) implies that #f = ∇f. Set (A,B) = (∇f, #g) and observe that Proposition 7.3 and Proposition 2.2 imply that Fix T = zer(A+B) = argmin(f+g). It follows from [2, Example 22.4] (respectively Fact 7.2(ii)) that A (respectively B) is maximally μ-monotone (respectively maximally (−ω)-monotone). Moreover, Lemma 7.6(iii) implies that A−μId is (1/β)-cocoercive. (i)–(ii): Apply Theorem 6.1(iv)&(vi). □
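A quadratic instance (our illustrative choice, not taken from the paper) shows the linear rate concretely: take f = (a/2)x² with μ ≤ a ≤ β and g = −(ω/2)x², for which Proxγg = (1−γω)^(−1)·Id and the proximal gradient map is linear with factor (1−γa)/(1−γω):

```python
mu, beta, om = 1.0, 3.0, 0.2
gamma = 0.35                                  # < 2/(β+2μ) = 0.4
delta = (1 - gamma * mu) / (1 - gamma * om)   # rate claimed in Theorem 7.7(ii)
a = 2.0                                       # f = (a/2)x²: μ-strongly convex, ∇f β-Lipschitz
T = lambda x: (1 - gamma * a) * x / (1 - gamma * om)  # Proxγg(Id - γ∇f), g = -(ω/2)x²
x0 = 1.0
x = x0
for n in range(1, 30):
    x = T(x)
    assert abs(x) <= delta ** n * abs(x0) + 1e-12     # linear convergence to argmin = {0}
```

Here argmin(f+g) = {0}, and the per-step contraction |1−γa|/(1−γω) is indeed bounded by δ for every a ∈ [μ, β].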

To proceed to the next result, we need the following lemma.

Lemma 7.8

Let ω≥0, let β>0, and let f: X → ℝ be a Fréchet differentiable function. Suppose that f is ω-hypoconvex with a (1/β)-Lipschitz continuous gradient. Then ∇f + ωId is β/(1+ωβ)-cocoercive.

Theorem 7.9

(The forward–backward algorithm when f is ω-hypoconvex)

Let μ≥ω≥0, let β>0, and let β̄ ∈ ]max{β, 2ω}, +∞[. Let f be ω-hypoconvex and Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be μ-strongly convex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β̄−2ω)[, and set δ = (1+γω)/(1+γμ). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Proceed similarly to the proof of Theorem 7.7, but use Theorem 6.5(iv)&(vi). □

Theorem 7.10

(The forward–backward algorithm when f is β-hypoconvex)

Let μ≥β>0, and let β̄ ∈ ]2β, +∞[. Let f be Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be μ-strongly convex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β̄−2β)[, and set δ = (1+γβ)/(1+γμ). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>β. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Combine Lemma 2.12 applied with A replaced by ∇f and Theorem 7.9 applied with (ω,β) replaced by (β,2β). □

Remark 7.11

The results of Theorem 6.2, Theorem 6.6, and Theorem 6.9 can be applied directly to optimization settings in the same fashion as Theorem 7.7, Theorem 7.9, and Theorem 7.10.

Graphical characterizations

This section contains 2D graphical representations of different Lipschitz continuous operator classes that admit I-N decompositions and of their composition classes. We illustrate the exact shapes of the composition classes in 2D together with the conservative estimates from Theorem 3.4 and Theorem 4.2. Similar graphical representations have appeared before in the literature. In [10, 11], nonexpansiveness and firm nonexpansiveness (½-averagedness) are characterized. Early preprints of [15] have more 2D graphical representations, and the lecture notes [14] contain many such characterizations with the purpose of illustrating how different properties relate to each other and of providing intuition on why different algorithms converge. This has been further extended and formalized in [24]. These illustrations provide more than intuition: it is a straightforward consequence of, e.g., [24, 25] that for compositions of two operator classes that admit I-N decompositions, there always exists a 2D worst case. Hence, if the 2D illustration implies that the composition class admits a specific (α,β)-I-N decomposition, then so does the full operator class.

In Sect. 8.1, we characterize many well-known special cases of operator classes that admit I-N decompositions. In Sect. 8.2, we characterize classes obtained by compositions of such operator classes and highlight differences between the true composition classes and their characterizations using Theorem 3.4.

Single operators

We consider classes of Lipschitz continuous operators that admit (α,β)-I-N decompositions and graphically illustrate properties of some special cases. The illustrations should be read as follows. Assume that x−y is represented by the marker in the figure. The diagram then shows where Rx−Ry can end up in relation to x−y. If the point x−y is rotated in the picture, the rest of the picture rotates with it. The characterization is, by construction of (α,β)-I-N decompositions, always a circle of radius β‖x−y‖ shifted α‖x−y‖ along the line defined by the origin and the point x−y.

Lipschitz continuous operators

Let β>0 and let R: X→X. Then R is β-Lipschitz continuous if and only if R admits an (α,β)-I-N decomposition with α chosen as 0. Figure 1 shows the case β=0.8. The radius of the Lipschitz circle is β‖x−y‖.

Figure 1.

Figure 1

Illustration of β-Lipschitz continuous operator with β=0.8

Cocoercive operators

Let β>0, and let R: X→X. Then R is (1/β)-cocoercive if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (β/2, β/2). Figure 2 shows the cases β=1.4 and β=0.7. The diameter of the circle is β‖x−y‖. The figure clearly illustrates that (1/β)-cocoercive operators are also β-Lipschitz continuous (but not necessarily the other way around).

Figure 2.

Figure 2

Illustration of (1/β)-cocoercive operators with β=0.7 and β=1.4

Averaged operators

Let α ∈ ]0,1[, and let R: X→X. Then R is α-averaged if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (1−α, α). Figure 3 shows the cases α=0.25, α=0.5, and α=0.75. All averaged operators are nonexpansive.

Figure 3.

Figure 3

Illustration of α-averaged operators with α=0.25, α=0.5, and α=0.75

Conic operators

Let α>0, and let R: X→X. Then R is α-conically nonexpansive if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (1−α, α). Figure 4 shows the cases α=1.2 and α=1.5. Conically nonexpansive operators fail to be nonexpansive for α>1.

Figure 4.

Figure 4

Illustration of α-conically nonexpansive operators with α=1.2 and α=1.5

μ-Monotone operators

Let μ ∈ ℝ, and suppose that A: X→X is μ-monotone. The shortest distance between the vertical line and the origin in the illustration is |μ|‖x−y‖. Figure 5 shows the case μ=0.2.

Figure 5.

Figure 5

Illustration of μ-monotone operator with μ=0.2

Compositions of two operators

In this section, we provide illustrations of compositions of different classes of Lipschitz continuous operators. We consider compositions of the form

R = R2R1, where Ri admits an (αi,βi)-I-N decomposition,

i ∈ {1,2}. Let (x,y) ∈ X×X. We illustrate the regions within which R2R1x − R2R1y can end up. For most of the composition classes considered, we provide two illustrations. The left illustration explicitly shows how the composition is constructed. It shows the region within which R1x − R1y must end up. The second operator R2 is applied at a subset, marked by crosses, of boundary points of that region. Given these as starting points for the application of R2, the dashed circles show where R2R1x − R2R1y can end up for this subset. The right illustration shows, in gray, the resulting exact shape of the composition. It also contains the estimate from Theorem 3.4 that provides an I-N decomposition of the composition. From these illustrations, it is obvious that many different I-N decompositions are valid. The illustrations also reveal that the specific I-N decompositions provided in Theorem 3.4 are indeed suitable for our purpose of characterizing the composition as averaged, conic, or contractive.

Averaged-averaged composition

We first consider αi-averaged Ri with αi ∈ ]0,1[. A special case is the forward–backward splitting operator T = JγB(Id−γA) with (1/β)-cocoercive A and maximally monotone B. This implies that Id−γA is (γβ/2)-averaged for γ ∈ ]0, 2/β[ and that JγB is ½-averaged. The example in Fig. 6 has individual averagedness parameters α1=0.5 and α2=0.5, i.e., R = R2R1 with R1 = 0.5Id + 0.5N1 and R2 = 0.5Id + 0.5N2. Theorem 3.4 shows that the composition is of the form 0.33Id + 0.67N, where N is nonexpansive, i.e., it is 0.67-averaged. The fact that the composition is averaged is already known; see [8, 12].

Figure 6.

Figure 6

Illustration of composition of α1-averaged and α2-averaged operators with α1=α2=0.5

The example in Fig. 7 has α1=0.7 and α2=0.6. Theorem 3.4 shows that the composition is of the form 0.21Id + 0.79N, where N is nonexpansive, i.e., it is 0.79-averaged.

Figure 7.

Figure 7

Illustration of composition of α1-averaged and α2-averaged operators with α1=0.7 and α2=0.6
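The two quoted constants can be reproduced from the standard averaged-composition formula (we assume here that this is what Theorem 3.4 specializes to in the averaged case; the function name is ours):

```python
def compose_averaged(a1, a2):
    # assumed averagedness parameter of the composition of an
    # a1-averaged and an a2-averaged operator (requires a1*a2 < 1)
    return (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)

assert round(compose_averaged(0.5, 0.5), 2) == 0.67   # constant quoted for Fig. 6
assert round(compose_averaged(0.7, 0.6), 2) == 0.79   # constant quoted for Fig. 7
```

Both figures' averagedness constants follow from the same two-parameter expression.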

Conic-conic composition

We now consider αi-conically nonexpansive Ri with αi>0. Several examples in this setting arise for Douglas–Rachford splitting and forward–backward splitting in Sect. 5 and Sect. 6. We know from Theorem 4.2 that the composition is conic if α1α2<1. The example in Fig. 8 has α1=1.7 and α2=0.45, which satisfies α1α2=0.765<1. Theorem 4.2 shows that the composition is of the form −1.64Id + 2.64N, where N is nonexpansive, i.e., it is 2.64-conic.

Figure 8.

Figure 8

Illustration of composition of α1-conic operator and α2-averaged operator with α1=1.7 and α2=0.45

In Example 4.6, we showed that the assumption α1α2<1 is critical for the composition to be conic. Figure 9 illustrates the case α1=1.7 and α2=0.7, which satisfies α1α2=1.19>1; hence Theorem 4.2 cannot be used to deduce that the composition is conic. Indeed, we see from the figure that the composition is not conic: it is impossible to draw a circle that touches the marker at x−y and extends only to the left.

Figure 9.

Figure 9

Illustration of composition of α1-conic operator and α2-averaged operator with α1=1.7 and α2=0.7

We conclude the conic-composed-with-conic examples with a forward–backward example. The forward–backward splitting operator JγB(Id−γA), with A (1/β)-cocoercive and B (maximally) monotone, is the composition of the ½-averaged resolvent JγB and the (γβ/2)-conic forward step Id−γA. The composition R = R2R1 with Ri αi-conic is conic if α1α2<1 (Theorem 4.2). In the forward–backward setting, this corresponds to γ ∈ ]0, 4/β[, which doubles the allowed range compared to guaranteeing an averaged composition. This extended range has been shown before, e.g., in [13, 18].
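The same two-parameter expression is assumed here to extend to the conic regime; a short check confirms both the 2.64-conic value quoted for Fig. 8 and the γ < 4/β threshold (the function name and the closed form are our illustrative assumptions):

```python
def conic_compose(a1, a2):
    # assumed conic parameter from Theorem 4.2 (valid only when a1*a2 < 1)
    assert a1 * a2 < 1
    return (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)

assert abs(conic_compose(1.7, 0.45) - 2.64) < 5e-3     # constant quoted for Fig. 8
beta = 1.0
gamma = 3.9 / beta
a1, a2 = gamma * beta / 2, 0.5    # forward step is (γβ/2)-conic, resolvent ½-averaged
assert a1 * a2 < 1                # a1*a2 < 1  ⟺  γ < 4/β
assert abs(conic_compose(a1, a2) - 20.0) < 0.1         # ≈ the 19.99-conic map of Fig. 10
```

As γ approaches 4/β the denominator 1−α1α2 vanishes, which is exactly the blow-up of the conic radius described for Fig. 10 below.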

In Fig. 10, we illustrate the forward–backward setting with γ = 3.9/β. This corresponds to conic parameters α1=1.95 and α2=0.5, i.e., R = R2R1 with R1 = −0.95Id + 1.95N1 and R2 = 0.5Id + 0.5N2. The composition is of the form −18.99Id + 19.99N, where N is nonexpansive, i.e., it is 19.99-conic (Theorem 4.2). The left figure shows the resulting composition and (parts of) the conic approximation. The conic approximation is very large compared to the actual region. This is due to the local behavior around the point x−y, where the exact shape is almost vertical. As γ → 4/β, the exact shape approaches being vertical around x−y and the radius of the conic circle tends to infinity. For γ > 4/β, the exact shape extends to the right of x−y (as in Fig. 9), and the composition is not conic.

Figure 10.

Figure 10

To the left is an illustration of the forward–backward composition JγB(Id−γA) with γ = 3.9/β, where 1/β is the cocoercivity constant of A. It is a composition of an α1-conic operator and an α2-averaged operator with α1=1.95 and α2=0.5. To the right is an illustration of a θ-relaxation of the same forward–backward map with θ=0.04

In the right figure, we consider the relaxed forward–backward map (1−θ)Id + θJγB(Id−γA) with θ>0. If the composition JγB(Id−γA) is α-conic, it is straightforward to verify that the relaxed map is θα-conic. Therefore, any θ ∈ ]0, α^{−1}[ gives a θα-averaged relaxed forward–backward map. An averaged map is needed to guarantee convergence to a fixed point when iterated. In the figure, we let θ=0.04, which satisfies θ < α^{−1} ≈ 0.05. The approximation is indeed averaged, but the region within which the composition can end up is very small compared to the conic approximation.

Scaled averaged and cocoercive compositions

Compositions of scaled averaged and cocoercive operators are also special cases of the scaled conic compositions treated in Theorem 4.2. This setting covers the forward–backward examples in Sect. 6, where the identity is shifted between the operators and the sum is (strongly) monotone. The operators in the composition are of the form R1 = δ1((1−α1)Id + α1N1) and R2 = (β2/2)(Id + N2), where α1 ∈ ]0,1[, δ1>0, and β2>0.

In Fig. 11, we consider the forward–backward setting of Theorem 6.5. The forward–backward map is JγB(Id−γA), and we let A+0.3Id be 1-cocoercive and B be maximally 0.3-monotone. That is, we have shifted 0.3Id from A to B, and the sum is monotone. We use step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.6-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.27Id + 0.73N, where N is nonexpansive, i.e., it is 0.73-averaged.

Figure 11.

Figure 11

Illustration of composition of 1.6-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.3Id 1-cocoercive, B 0.3-monotone, and γ=2

Figure 12 considers a similar forward–backward setting, but with a strongly monotone sum. We let A+0.2Id be 1-cocoercive and B be maximally 0.3-monotone, which implies that the sum is 0.1-strongly monotone. We keep the step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.4-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.19Id + 0.68N, where N is nonexpansive, i.e., it is 0.87-contractive.

Figure 12.

Figure 12

Illustration of composition of 1.4-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.2Id 1-cocoercive, B 0.3-monotone, and γ=2

The final example in Fig. 13 considers a similar forward–backward setting where the sum is not monotone. We let A+0.4Id be 1-cocoercive, B be maximally 0.3-monotone, which implies that the sum is −0.1-monotone, i.e., it is not monotone. We use step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.8-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.35Id+0.78N, where N is nonexpansive, i.e., it is 1.12-Lipschitz and not conic, averaged, or contractive.

Figure 13.

Figure 13

Illustration of composition of 1.8-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.4Id 1-cocoercive, B 0.3-monotone, and γ=2

Acknowledgements

Not applicable.

Appendix A

Proof of Lemma 2.3

Indeed, observe that

R(λ) = (1−2λ)Id + λ(Id + R2R1) (89)

and

Id − R(λ) = λ(Id − R2R1). (90)

In view of (89) and (90) we have

⟨R(λ)x − R(λ)y, (Id−R(λ))x − (Id−R(λ))y⟩
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²⟨(x−y) − (R2R1x − R2R1y), (x−y) + (R2R1x − R2R1y)⟩
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²(‖x−y‖² − ‖R2R1x − R2R1y‖²)
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²(‖x−y‖² − ‖R1x − R1y‖² + ‖R1x − R1y‖² − ‖R2R1x − R2R1y‖²)
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²⟨(Id+R1)x − (Id+R1)y, (Id−R1)x − (Id−R1)y⟩ + λ²⟨(Id+R2)R1x − (Id+R2)R1y, (Id−R2)R1x − (Id−R2)R1y⟩,

and the conclusion follows. □

Appendix B

Proof of Lemma 2.12

(i): Because (1/β)A is nonexpansive, we learn from [2, Example 20.7] that Id + (1/β)A, and therefore also βId + A, is maximally monotone. The conclusion now follows in view of, e.g., [3, Lemma 2.5]. (ii): This is clear by observing that (1/(2β))(βId + A) = ½(Id + (1/β)A). □

Appendix C

Proof of Lemma 2.13

Indeed, by assumption, there exist nonexpansive mappings N1:XX and N2:XX such that

T1 = (β/2)Id + (β/2)N1, T2 = (δ/2)Id + (δ/2)N2. (91)

Now,

(1/β)(T1 − T2) = (1/β)T1 − (1/β)T2 = ½Id + ½N1 − (δ/(2β))Id − (δ/(2β))N2 (92a)
= ((β−δ)/(2β))Id + ½N1 − (δ/(2β))N2. (92b)

Using the triangle inequality, one can directly verify that (1/β)(T1 − T2) is Lipschitz continuous with constant (β−δ)/(2β) + 1/2 + δ/(2β) = 1. The proof is complete. □
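A 1-D numerical check of this Lipschitz claim, with the illustrative nonexpansive choices N1 = sin and N2 = cos (our assumptions, with δ ≤ β), confirms that T1 − T2 is β-Lipschitz, i.e., (1/β)(T1 − T2) is nonexpansive:

```python
import math, itertools

beta, delta = 2.0, 1.0                         # here δ ≤ β
T1 = lambda x: beta / 2 * (x + math.sin(x))    # (1/β)-cocoercive, cf. footnote 3
T2 = lambda x: delta / 2 * (x + math.cos(x))   # (1/δ)-cocoercive
pts = [i / 5 for i in range(-15, 16)]
for x, y in itertools.combinations(pts, 2):
    # |(T1 - T2)x - (T1 - T2)y| ≤ β |x - y|
    assert abs((T1(x) - T2(x)) - (T1(y) - T2(y))) <= beta * abs(x - y) + 1e-12
```

The constant β is the one produced by the triangle-inequality computation above.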

Appendix D

Proof of Corollary 2.14

(i): It follows from Fact 7.5 that ∇f1 (respectively ∇f2) is (1/β)-cocoercive (respectively (1/δ)-cocoercive). Now apply Lemma 2.13 with (T1,T2) replaced by (∇f1, ∇f2). (ii): Combine (i) with Fact 7.5 applied with f replaced by f1 − f2. □

Appendix E

Proof of Lemma 2.15

(i): Indeed, we have δT = (1 − (1 − δ(1−α)))Id + δαN = (1 − (1 − δ(1−α)))Id + (1 − δ(1−α))Ñ, where Ñ = (δα/(1 − δ(1−α)))N. Note that δα/(1 − δ(1−α)) ≤ 1; hence Ñ is nonexpansive and the conclusion follows. (ii): Clear. □

Authors’ contributions

All authors contributed equally in writing this article. All authors read and approved the final manuscript.

Funding

PG was partially supported by the Swedish Research Council and the Wallenberg AI, Autonomous Systems and Software Program (WASP). WMM was partially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (NSERC-DG).

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Declarations

Competing interests

The authors declare that they have no competing interests.

Footnotes

1

Let T: X→X. Then T is α-averaged if α ∈ ]0,1[ and there exists a nonexpansive operator N: X→X such that T = (1−α)Id + αN.

2

Let T: X→X. Then T is α-conically nonexpansive if α ∈ ]0,+∞[ and there exists a nonexpansive operator N: X→X such that T = (1−α)Id + αN.

3

Let T: X→X, and let β>0. Then T is (1/β)-cocoercive if there exists a nonexpansive operator N: X→X such that T = (β/2)(Id + N).

4

The paper [1] appeared online while putting the finishing touches on this paper. Partial results of this work were presented by the second author at the Numerical Algorithms in Nonsmooth Optimization workshop at Erwin Schrödinger International Institute for Mathematics and Physics (ESI) in Vienna in February 2019 and at the Operator Splitting Methods in Data Analysis workshop at the Flatiron Institute, in New York in March 2019. Both workshops predate [1].

5

Let A: X → X be an operator. The resolvent of A, denoted by JA, is defined by JA = (Id + A)⁻¹, and the reflected resolvent of A, denoted by RA, is defined by RA = 2JA − Id.
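Footnote 5 can be illustrated numerically for a linear monotone operator, where the resolvent and reflected resolvent are explicit matrix formulas. In this sketch, A is a skew-symmetric matrix (our illustrative choice, which is monotone since ⟨Ax, x⟩ = 0):

```python
import numpy as np

# A monotone linear operator on R^2: a skew-symmetric matrix, <Ax, x> = 0.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
I = np.eye(2)

J_A = np.linalg.inv(I + A)  # resolvent J_A = (Id + A)^(-1)
R_A = 2 * J_A - I           # reflected resolvent R_A = 2*J_A - Id

# For a monotone A, J_A is firmly nonexpansive and R_A is nonexpansive,
# so both operator norms are at most 1.
print(np.linalg.norm(J_A, 2))
print(np.linalg.norm(R_A, 2))
```

For this particular A, the reflected resolvent turns out to be a rotation, so its operator norm is exactly 1, consistent with nonexpansiveness.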

6

In passing, we mention that [2, Proposition 26.1(iv)] assumes that A and B are maximally monotone, which is not required here; however, the proof is the same.

7

Here and elsewhere, we use R+ to denote the interval [0, +∞[.

8

The assumption that β ∈ R+ is not restrictive. Indeed, since N is nonexpansive if and only if −N is, an operator admits an (α, β)-I-N decomposition if and only if it admits an (α, −β)-I-N decomposition. This is the reason why we define it only for nonnegative β.

9

Let C be a nonempty, closed convex subset of X. Here and elsewhere, we shall use NC to denote the normal cone operator associated with C, defined by NC(x) = {u ∈ X | sup⟨C − x, u⟩ ≤ 0} if x ∈ C, and NC(x) = ∅ otherwise.
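As a concrete illustration (the box C = [0, 1]^n is our choice, not from the paper), the resolvent of NC is the projection PC, i.e., x = PC(z) if and only if z − x ∈ NC(x). The following sketch checks the defining inequality sup⟨C − x, z − x⟩ ≤ 0 numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative closed convex set: the box C = [0, 1]^n. Its projection P_C
# is coordinatewise clipping, and the resolvent of N_C equals P_C.
def P_C(z):
    return np.clip(z, 0.0, 1.0)

# x = P_C(z) is characterized by z - x in N_C(x), i.e.
# sup over c in C of <c - x, z - x> is at most 0.
for _ in range(1000):
    z = 3.0 * rng.standard_normal(3)
    x = P_C(z)
    u = z - x
    # Over the box, the sup is attained coordinatewise at c_i = 0 or c_i = 1.
    sup_val = np.sum(np.maximum(u * (0.0 - x), u * (1.0 - x)))
    assert sup_val <= 1e-12
print("characterization verified on 1000 random points")
```

The separable structure of the box makes the supremum over C computable coordinate by coordinate, which is why a one-line check suffices here.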

Contributor Information

Pontus Giselsson, Email: pontus.giselsson@control.lth.se.

Walaa M. Moursi, Email: walaa.moursi@uwaterloo.ca

References

  • 1. Bartz, S., Dao, M.N., Phan, H.M.: Conical averagedness and convergence analysis of fixed point algorithms (2019). arXiv:1910.14185
  • 2. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)
  • 3. Bauschke, H.H., Moursi, W.M., Wang, X.: Generalized monotone operators and their averaged resolvents. Math. Program., Ser. B (2020). doi:10.1007/s10107-020-01500-6
  • 4. Burachik, R.S., Iusem, A.N.: Set-Valued Mappings and Enlargements of Monotone Operators. Springer, Berlin (2007)
  • 5. Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)
  • 6. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5–6), 475–504 (2004). doi:10.1080/02331930412331327157
  • 7. Combettes, P.L., Pennanen, T.: Proximal methods for cohypomonotone operators. SIAM J. Control Optim. 43(2), 731–742 (2004). doi:10.1137/S0363012903427336
  • 8. Combettes, P.L., Yamada, I.: Compositions and convex combinations of averaged nonexpansive operators. J. Math. Anal. Appl. 425(1), 55–70 (2015). doi:10.1016/j.jmaa.2014.11.044
  • 9. Dao, M.N., Phan, H.M.: Adaptive Douglas–Rachford splitting algorithm for the sum of two operators. SIAM J. Optim. 29(4), 2697–2724 (2019). doi:10.1137/18M121160X
  • 10. Eckstein, J.: Splitting methods for monotone operators with applications to parallel optimization. Ph.D. thesis, MIT (1989)
  • 11. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1), 293–318 (1992). doi:10.1007/BF01581204
  • 12. Giselsson, P.: Tight global linear convergence rate bounds for Douglas–Rachford splitting. J. Fixed Point Theory Appl. 19, 2241–2270 (2017). doi:10.1007/s11784-017-0417-1
  • 13. Giselsson, P.: Nonlinear forward–backward splitting with projection correction (2019). arXiv:1908.07449
  • 14. Giselsson, P.: Lecture notes on large-scale convex optimization (2015). http://control.lth.se/education/doctorate-program/large-scale-convex-optimization/
  • 15. Giselsson, P., Boyd, S.: Linear convergence and metric selection for Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62(2), 532–544 (2017). doi:10.1109/TAC.2016.2564160
  • 16. Guo, K., Han, D.: A note on the Douglas–Rachford splitting method for optimization problems involving hypoconvex functions. J. Glob. Optim. 72(3), 431–441 (2018). doi:10.1007/s10898-018-0660-z
  • 17. Guo, K., Han, D., Yuan, X.: Convergence analysis of Douglas–Rachford splitting method for strongly + weakly convex programming. SIAM J. Numer. Anal. 55(4), 1549–1577 (2017). doi:10.1137/16M1078604
  • 18. Latafat, P., Patrinos, P.: Asymmetric forward–backward-adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 68(1), 57–93 (2017). doi:10.1007/s10589-017-9909-6
  • 19. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation I: Basic Theory. Springer, Berlin (2006)
  • 20. Mordukhovich, B.S.: Variational Analysis and Applications. Springer (2018)
  • 21. Ogura, N., Yamada, I.: Non-strictly convex minimization over the fixed point set of an asymptotically shrinking nonexpansive mapping. Numer. Funct. Anal. Optim. 22(1–2), 113–137 (2002). doi:10.1081/NFA-120003674
  • 22. Reich, S.: On the asymptotic behavior of nonlinear semigroups and the range of accretive operators. J. Math. Anal. Appl. 79(1), 113–126 (1981). doi:10.1016/0022-247X(81)90013-5
  • 23. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
  • 24. Ryu, E.K., Hannah, R., Yin, W.: Scaled relative graph: nonexpansive operators via 2D Euclidean geometry (2019). arXiv:1902.09788
  • 25. Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020). doi:10.1137/19M1304854
  • 26. Wang, X.: On Chebyshev functions and Klee functions. J. Math. Anal. Appl. 368(1), 293–310 (2010). doi:10.1016/j.jmaa.2010.03.041
  • 27. Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27(1), 124–145 (2017). doi:10.1137/16M1055323



Articles from Fixed Point Theory and Algorithms for Sciences and Engineering are provided here courtesy of Springer
