2021 Dec 20;2021(1):25. doi: 10.1186/s13663-021-00709-0

On compositions of special cases of Lipschitz continuous operators

Pontus Giselsson 1, Walaa M. Moursi 2,3
PMCID: PMC8685197  PMID: 34993526

Abstract

Many iterative optimization algorithms involve compositions of special cases of Lipschitz continuous operators, namely firmly nonexpansive, averaged, and nonexpansive operators. The structure and properties of the compositions are of particular importance in the proofs of convergence of such algorithms. In this paper, we systematically study the compositions of further special cases of Lipschitz continuous operators. Applications of our results include compositions of scaled conically nonexpansive mappings, as well as the Douglas–Rachford and forward–backward operators, when applied to solve certain structured monotone inclusion and optimization problems. Several examples illustrate and tighten our conclusions.

Keywords: Compositions of operators, Conically nonexpansive operators, Douglas–Rachford algorithm, Forward-backward algorithm, Hypoconvex function, Maximally monotone operator, Proximal operator, Resolvent

Introduction

In this paper, we assume that

X is a real Hilbert space

with the inner product ⟨·,·⟩ and the induced norm ‖·‖. Let L>0 and let T:X→X. Then T is L-Lipschitz continuous if (∀(x,y)∈X×X) ‖Tx−Ty‖ ≤ L‖x−y‖, and T is nonexpansive if T is 1-Lipschitz continuous, i.e., (∀(x,y)∈X×X) ‖Tx−Ty‖ ≤ ‖x−y‖. In this paper, we study compositions of what we call (see Definition 3.1) identity-nonexpansive decompositions (I-N decompositions for short) of Lipschitz continuous operators. Let (α,β)∈ℝ² and let Id:X→X be the identity operator on X. A Lipschitz continuous operator R admits an (α,β)-I-N decomposition if R = αId + βN for some nonexpansive operator N:X→X. For instance, averaged, conically nonexpansive, and cocoercive operators are all Lipschitz continuous operators that admit special I-N decompositions.

We consider compositions of the form

R = Rm∘⋯∘R1,  (1)

where m∈{2,3,…}, I = {1,…,m}, and (Ri)i∈I is a family of Lipschitz continuous operators such that, for each i∈I, Ri admits an (αi,βi)-I-N decomposition. That is, Ri = αiId + βiNi for all i∈I, where αi and βi are real numbers and Ni:X→X is nonexpansive for all i∈I. A straightforward (and naive) conclusion is that the composition is Lipschitz continuous with constant ∏i∈I(|αi|+|βi|). However, such a conclusion can be further refined when, for instance, each Ri is an averaged operator. Indeed, in this case it is known that the composition is an averaged (and not just Lipschitz continuous) operator (see, e.g., [2, Proposition 4.46], [6, Lemma 2.2], and [21, Theorem 3]). In this paper, we provide a systematic study of the structure of R under additional assumptions on the decomposition parameters.
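The refinement for averaged operators can be probed numerically. The sketch below is our own illustration (the rotations, dimensions, and parameters are hypothetical choices, not from the paper): it builds two averaged operators Ti = (1−αi)Id + αiNi from nonexpansive rotations and checks the standard averagedness inequality of the composition with the constant (α1+α2−2α1α2)/(1−α1α2) from [2, Proposition 4.46].

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(t):
    # 2x2 rotation matrix: orthogonal, hence nonexpansive
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

a1, a2 = 0.7, 0.5                      # averagedness parameters in ]0,1[
I = np.eye(2)
T1 = (1 - a1) * I + a1 * rot(0.9)      # T1 = (1-a1)Id + a1*N1
T2 = (1 - a2) * I + a2 * rot(-1.3)     # T2 = (1-a2)Id + a2*N2
T = T2 @ T1                            # composition T2 T1

a = (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)   # claimed averagedness constant

# a-averagedness test: ||Tx-Ty||^2 + (1-a)/a * ||(Id-T)x-(Id-T)y||^2 <= ||x-y||^2
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    d = x - y
    t = T @ d
    lhs = t @ t + (1 - a) / a * ((d - t) @ (d - t))
    ok = ok and lhs <= d @ d + 1e-10
print(ok)  # -> True
```

The naive Lipschitz bound only certifies nonexpansiveness of the composition; the averagedness constant above is strictly stronger information.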

Our main result is stated in Theorem 3.4. We show that, for m=2, under a mild assumption on (α1,α2,β1,β2), composition (1) is a scalar multiple of a conically nonexpansive operator. As a consequence of Theorem 3.4, we show in Theorem 4.2 that, under additional assumptions on the decomposition parameters, compositions of scaled conically nonexpansive mappings are scaled conically nonexpansive mappings; see also [1] for a relevant result. Special cases of Theorem 4.2 include, e.g., compositions of averaged operators [2, Proposition 4.46] and compositions of averaged and negatively averaged operators [12].

Of particular interest are compositions R that are averaged, conically nonexpansive, or contractive. Let x0∈X. For an averaged (respectively contractive) operator R, the sequence (R^k x0)k∈ℕ converges weakly (respectively strongly) to a fixed point of R (if one exists) [2, Theorem 5.14]. For conically nonexpansive operators, a simple averaging trick gives an averaged operator with the same fixed point set as the conically nonexpansive operator. Iterating the new averaged operator yields a sequence that converges weakly to a fixed point of the conically nonexpansive operator. These properties have been instrumental in proving convergence for the Douglas–Rachford algorithm and the forward–backward algorithm. In this paper, we apply our composition result Theorem 4.2 to prove convergence of these splitting methods in new settings.

The Douglas–Rachford and forward–backward methods traditionally solve monotone inclusion problems of the form

Find x∈X such that 0 ∈ Ax + Bx,  (2)

where A:X⇉X and B:X⇉X are maximally monotone and, in the case of the forward–backward method, A is additionally assumed to be cocoercive. The Douglas–Rachford method iterates the Douglas–Rachford map T = (1/2)(Id + RγB RγA), where γ>0 is a positive step-size. The Douglas–Rachford map is an averaged map of the composition of reflected resolvents. The forward–backward method iterates the forward–backward map T = JγB(Id−γA), where γ>0 is a positive step-size. The forward–backward map is a composition of a resolvent and a forward step.
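For intuition, here is a minimal one-dimensional sketch of both iterations (a toy instance of our own, not from the paper): A(x) = x−1 and B(x) = x, whose sum vanishes at x = 1/2, and whose resolvents are available in closed form.

```python
# Toy monotone inclusion 0 ∈ Ax + Bx with A(x) = x - 1 and B(x) = x;
# the unique zero of A + B is x = 1/2.
gamma = 0.5

A = lambda x: x - 1.0
J_gA = lambda x: (x + gamma) / (1 + gamma)   # resolvent of gamma*A
J_gB = lambda x: x / (1 + gamma)             # resolvent of gamma*B
R_gA = lambda x: 2 * J_gA(x) - x             # reflected resolvents
R_gB = lambda x: 2 * J_gB(x) - x

# Douglas-Rachford: z <- (1/2)(z + R_gB(R_gA(z))); the shadow J_gA(z) solves (2)
z = 5.0
for _ in range(200):
    z = 0.5 * (z + R_gB(R_gA(z)))
print(round(J_gA(z), 6))   # -> 0.5

# forward-backward: x <- J_gB(x - gamma*A(x))
x = 5.0
for _ in range(200):
    x = J_gB(x - gamma * A(x))
print(round(x, 6))         # -> 0.5
```

Both maps are contractions here, so the iterates converge linearly; in general one only gets averagedness and weak convergence.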

In this paper, we show that for Douglas–Rachford splitting we need not impose monotonicity on the individual operators, but only on the sum, provided that the sum is strongly monotone. The reflected resolvents RγA and RγB are negatively conically nonexpansive, their composition is conically nonexpansive, and a sufficient averaging gives an averaged map whose iterates converge to a fixed point. Relevant work appears in [9, 16], and [17].

More strikingly, for the forward–backward method, we show that it is sufficient that the sum is monotone (not strongly monotone as for DR). More specifically, we show that the identity can be shifted between the two operators while still guaranteeing averagedness of the forward–backward map T = JγB(Id−γA). Indeed, the resolvent JγB is cocoercive and the forward step (Id−γA) is scaled averaged. This implies that the composition is averaged (given restrictions on the cocoercivity and averagedness parameters). Moreover, when the sum is strongly monotone, again with no monotonicity assumptions on the individual operators, we show that the forward–backward map is contractive. We also prove tightness of our contraction factor.

We also provide, in Theorem 4.7, a generalization of Theorem 4.2 to the setting in (1) of compositions of more than two operators. We assume that all Ri are scaled conically nonexpansive operators and provide conditions on the parameters that give a specific scaled conically nonexpansive representation of R. Our condition is symmetric in the individual operators and allows one of them to be scaled conic, while the rest must be scaled averaged. This is consistent with the m=2 case in Theorem 4.2.

Finally, in Sect. 8, we provide graphical 2D representations of different operator classes that admit I-N decompositions, such as Lipschitz continuous operators, averaged operators, and cocoercive operators. We also provide 2D representations of compositions of two such operator classes. Illustrations of the firmly nonexpansive (1/2-averaged) and nonexpansive operator classes have previously appeared in [10, 11], and illustrations of more operator classes that admit particular I-N decompositions and their compositions have appeared in [14, 24] and in early preprints of [15].

Organization and notation

The remainder of this paper is organized as follows: Sect. 2 presents useful facts and auxiliary results that are used throughout the paper. In Sect. 3, we present the main abstract results of the paper. Section 4 presents the main composition results of Lipschitz continuous operators that admit I-N decompositions, under mild assumptions on the decomposition parameters, as well as illustrative and limiting examples. In Sect. 5 and Sect. 6, we present applications of our composition results to the Douglas–Rachford and forward–backward algorithms, respectively. In Sect. 7 we present applications of our results to optimization problems. Finally, in Sect. 8, we provide graphical representations of many different I-N decompositions and their compositions.

The notation we use is standard and follows, e.g., [2] or [23].

Facts and auxiliary results

Let ρ∈ℝ and let A:X⇉X. Recall that A is ρ-monotone if (∀(x,u)∈gra A) (∀(y,v)∈gra A)

⟨x−y, u−v⟩ ≥ ρ‖x−y‖²  (3)

and is maximally ρ-monotone if any proper extension of gra A violates (3). In passing we point out that A is (maximally) monotone (respectively ρ-hypomonotone, ρ-strongly monotone) if ρ=0 (respectively ρ<0, ρ>0); see, e.g., [2, Chap. 20], [4, Definition 6.9.1], [7, Definition 2.2], and [23, Example 12.28].

Fact 2.1

Let A:X⇉X, let B:X⇉X, let λ∈ℝ∖{0}, and suppose that zer(A+B) = (A+B)⁻¹(0) ≠ ∅. Suppose that JA and JB are single-valued and that dom JA = dom JB = X. Set

T = (1−λ)Id + λRBRA.  (4)

Then T is single-valued, dom T = X, and

zer(A+B) = JA(Fix RBRA) = JA(Fix T).  (5)

Proof

See [9, Lemma 4.1]. □

Proposition 2.2

Let A:X→X, let B:X⇉X, and suppose that zer(A+B) = (A+B)⁻¹(0) ≠ ∅. Suppose that JB is single-valued and that dom JB = X. Set

T = JB(Id−A).  (6)

Then T is single-valued, dom T = X, and

zer(A+B) = Fix T.  (7)

Proof

The proof is similar to the proof of [2, Proposition 26.1(iv)]. Indeed, let x∈X. Then x∈zer(A+B) ⟺ −Ax∈Bx ⟺ (Id−A)x ∈ (Id+B)x ⟺ x = JB(Id−A)x = Tx. □

Lemma 2.3

Let λ∈ℝ, let R1:X→X, let R2:X→X, and set

R(λ) = (1−λ)Id + λR2R1.  (8)

Let (x,y)∈X×X. Then

⟨R(λ)x−R(λ)y, (Id−R(λ))x−(Id−R(λ))y⟩ = (1−2λ)⟨x−y, (Id−R(λ))x−(Id−R(λ))y⟩ + λ²⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + λ²⟨(Id+R2)R1x−(Id+R2)R1y, (Id−R2)R1x−(Id−R2)R1y⟩.  (9)

Proof

See Appendix A. □

Proposition 2.4

Let α∈ℝ, let β∈ℝ, let N:X→X, and set T = αId+βN. Let (x,y)∈X×X. Then the following hold:

β²(‖x−y‖² − ‖Nx−Ny‖²) = (β²−α²)‖x−y‖² − ‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩  (10a)
= (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² + 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩  (10b)
= (β²−α(α−1))‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖²).  (10c)

Proof

Indeed, we have

β²(‖x−y‖² − ‖Nx−Ny‖²) = β²‖x−y‖² − ‖(Tx−αx)−(Ty−αy)‖²  (11a)
= β²‖x−y‖² − (‖Tx−Ty‖² + α²‖x−y‖² − 2α⟨Tx−Ty, x−y⟩)  (11b)
= (β²−α²)‖x−y‖² − (‖Tx−Ty‖² − 2α⟨Tx−Ty, x−y⟩)  (11c)
= (β²−α²+α)‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖Tx−Ty‖² − 2α⟨Tx−Ty, x−y⟩ + α‖x−y‖²)  (11d)
= (β²−α(α−1))‖x−y‖² − ((1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖²).  (11e)

This proves (10a) and (10c) in view of (11c) and (11e). Finally, note that (β²−α²)‖x−y‖² − ‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩ = (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² − 2α‖Tx−Ty‖² + 2α⟨x−y, Tx−Ty⟩ = (β²−α²)‖x−y‖² − (1−2α)‖Tx−Ty‖² + 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩. This proves (10b). □
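The identities (10a)–(10c) are purely algebraic, so they can be sanity-checked on arbitrary data. The sketch below is a hypothetical numerical check of our own; N is just a random linear map (the identities do not require N to be nonexpansive).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
alpha, beta = -0.3, 1.7
N = rng.standard_normal((n, n))          # arbitrary linear map
T = alpha * np.eye(n) + beta * N         # T = alpha*Id + beta*N

x, y = rng.standard_normal(n), rng.standard_normal(n)
d, t = x - y, T @ (x - y)
Nd = N @ (x - y)

lhs = beta**2 * (d @ d - Nd @ Nd)
r10a = (beta**2 - alpha**2) * (d @ d) - t @ t + 2 * alpha * (d @ t)
r10b = (beta**2 - alpha**2) * (d @ d) - (1 - 2 * alpha) * (t @ t) \
       + 2 * alpha * (t @ (d - t))
r10c = (beta**2 - alpha * (alpha - 1)) * (d @ d) \
       - ((1 - alpha) * (t @ t) + alpha * ((d - t) @ (d - t)))
print(np.allclose([lhs, lhs, lhs], [r10a, r10b, r10c]))  # -> True
```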

Proposition 2.5

Let α∈ℝ, let β∈ℝ, let N:X→X, and set T = αId+βN. Let (x,y)∈X×X. Then the following are equivalent:

  • (i)

    N is nonexpansive.

  • (ii)

    ‖Tx−Ty‖² − 2α⟨x−y, Tx−Ty⟩ ≤ (β²−α²)‖x−y‖².

  • (iii)

    (1−2α)‖Tx−Ty‖² − 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (β²−α²)‖x−y‖².

  • (iv)

    (2α−1)‖(Id−T)x−(Id−T)y‖² − 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (β²−(1−α)²)‖x−y‖².

  • (v)

    (1−α)‖Tx−Ty‖² + α‖(Id−T)x−(Id−T)y‖² ≤ (β²−α(α−1))‖x−y‖².

Proof

(i)⇔(ii)⇔(iii)⇔(v): This is a direct consequence of Proposition 2.4. (i)⇔(iv): Applying (10b) with (T,α,β) replaced by (Id−T,1−α,β) yields β²(‖x−y‖² − ‖Nx−Ny‖²) = (β²−(1−α)²)‖x−y‖² − (2α−1)‖(Id−T)x−(Id−T)y‖² + 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩. The proof is complete. □

Proposition 2.6

Let α∈ℝ, let N:X→X, and set T = (1−α)Id+αN. Let (x,y)∈X×X. Then the following are equivalent:

  • (i)

    N is nonexpansive.

  • (ii)

    ‖Tx−Ty‖² − 2(1−α)⟨x−y, Tx−Ty⟩ ≤ (2α−1)‖x−y‖².

  • (iii)

    (2α−1)‖Tx−Ty‖² − 2(1−α)⟨Tx−Ty, (Id−T)x−(Id−T)y⟩ ≤ (2α−1)‖x−y‖².

  • (iv)

    (1−2α)‖(Id−T)x−(Id−T)y‖² ≤ 2α⟨Tx−Ty, (Id−T)x−(Id−T)y⟩.

  • (v)

    (1−α)‖(Id−T)x−(Id−T)y‖² ≤ α‖x−y‖² − α‖Tx−Ty‖².

Proof

Apply Proposition 2.5 with (α,β) replaced by (1−α,α). □

Lemma 2.7

Let λ<1. Then (∀(x,y)∈X×X)

‖x‖² − λ‖y‖² ≥ −(λ/(1−λ))‖x+y‖².  (12)

Proof

Let δ>0. By Young's inequality, ‖x+y‖² = ‖x‖² + 2⟨x,y⟩ + ‖y‖² ≥ (1−δ)‖x‖² + (1−δ⁻¹)‖y‖². Now apply this inequality with (x,y,δ) replaced by (x+y, −y, 1/(1−λ)) to obtain ‖x‖² ≥ −(λ/(1−λ))‖x+y‖² + λ‖y‖², which is (12). □
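A quick numerical check of (12) (our own sketch), sampling random vectors for several values of λ<1, including negative ones:

```python
import numpy as np

rng = np.random.default_rng(2)
ok = True
for lam in (-2.0, -0.5, 0.0, 0.3, 0.9):
    for _ in range(500):
        x, y = rng.standard_normal(4), rng.standard_normal(4)
        lhs = x @ x - lam * (y @ y)
        rhs = -lam / (1 - lam) * ((x + y) @ (x + y))
        ok = ok and lhs >= rhs - 1e-10   # inequality (12)
print(ok)  # -> True
```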

Proposition 2.8

Let α∈]0,1[, let β>0, and let T:X→X. Then T is α-averaged if and only if T = (1−β)Id + βM, where M is (α/β)-conically nonexpansive.

Proof

Indeed, T is α-averaged if and only if there exists a nonexpansive mapping N:X→X such that T = (1−α)Id+αN. Equivalently,

T = (1−α)Id + αN = (1−β)Id + β((1−α/β)Id + (α/β)N),

and the conclusion follows by setting M = (1−α/β)Id + (α/β)N. □

The following three lemmas can be verified directly; hence we omit the proofs.

Lemma 2.9

Let α>0, and let T:X→X. Then T is α-conically nonexpansive ⟺ Id−T is 1/(2α)-cocoercive ⟹ Id−T is maximally monotone.

Lemma 2.10

Let β>0, let μ∈ℝ, and let A:X→X. Suppose that A is maximally μ-monotone and (1/β)-cocoercive. Then μ ≤ β.

Lemma 2.11

Let β>0, let T:X→X, and let β′ ≥ β. Suppose that T is (1/β)-cocoercive. Then T is (1/β′)-cocoercive.

Lemma 2.12

Let β>0, and let A:X→X. Suppose that A is β-Lipschitz continuous. Then the following hold:

  • (i)

    A is maximally (−β)-monotone.

  • (ii)

    A+βId is 1/(2β)-cocoercive.

Proof

See Appendix B. □

Lemma 2.13

Let β>δ>0, let T1:X→X, and let T2:X→X. Suppose that T1 (respectively T2) is (1/β)-cocoercive (respectively (1/δ)-cocoercive). Then T1−T2 is β-Lipschitz continuous.

Proof

See Appendix C. □

As a corollary, we obtain the following result which was stated in [27, page 4].

Corollary 2.14

Let f1:X→ℝ and f2:X→ℝ be Fréchet differentiable convex functions, and let β>δ>0. Suppose that ∇f1 (respectively ∇f2) is β-Lipschitz continuous (respectively δ-Lipschitz continuous). Then the following hold:

  • (i)

    ∇f1−∇f2 is β-Lipschitz continuous.

  • (ii)

    Suppose that f1−f2 is convex. Then ∇f1−∇f2 is (1/β)-cocoercive.

Proof

See Appendix D. □

Lemma 2.15

Let α∈]0,1[, let δ∈]0,1], and let T:X→X. Suppose that T is α-averaged. Then the following hold:

  • (i)

    δT is (1−δ(1−α))-averaged.

  • (ii)

    Suppose that δ∈]0,1[. Then δT is a Banach contraction with constant δ.

Proof

See Appendix E. □

Let A be maximally ρ-monotone, where ρ>−1. Then (see [9, Proposition 3.4] and [3, Corollary 2.11 and Proposition 2.12]) we have

JA is single-valued and dom JA = X.  (13)

The following result involves resolvents and reflected resolvents of ρ-monotone operators.

Proposition 2.16

Let A be ρ-monotone, where ρ>−1. Then the following hold:

  • (i)

    JA is (1+ρ)-cocoercive, in which case JA is Lipschitz continuous with constant 1/(1+ρ).

  • (ii)

    RA is 1/(1+ρ)-conically nonexpansive.

  • (iii)

    Suppose that ρ≤0. Then RA is Lipschitz continuous with constant (1−ρ)/(1+ρ).

Proof

(i): See [9, Lemma 3.3(ii)]. Alternatively, it follows from [3, Corollary 3.8(ii)] that Id−JA is 1/(2(1+ρ))-averaged. Now apply Lemma 2.9 with T replaced by Id−JA. (ii): It follows from (i) that there exists a nonexpansive operator N:X→X such that JA = (1/(2(1+ρ)))(Id+N). Now, RA = Id−2JA = Id − (1/(1+ρ))(Id+N) = (1 − 1/(1+ρ))Id − (1/(1+ρ))N. (iii): Indeed, let (x,y)∈X×X and let N be as defined above. We have

‖RAx−RAy‖ = ‖(ρ/(1+ρ))(x−y) − (1/(1+ρ))(Nx−Ny)‖ ≤ (|ρ|/(1+ρ))‖x−y‖ + (1/(1+ρ))‖Nx−Ny‖  (14a)
≤ ((1−ρ)/(1+ρ))‖x−y‖.  (14b)

The proof is complete. □
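Proposition 2.16(i) can be illustrated with the linear ρ-monotone operator A = ρId + cS, where S is the rotation by π/2. This is a hypothetical example of ours: since ⟨Sx, x⟩ = 0, we get ⟨x, Ax⟩ = ρ‖x‖², so A is ρ-monotone even for negative ρ.

```python
import numpy as np

rho, c = -0.5, 2.0                       # rho-monotone with rho in ]-1, 0[
S = np.array([[0.0, -1.0], [1.0, 0.0]])  # skew: <Sx, x> = 0
A = rho * np.eye(2) + c * S
JA = np.linalg.inv(np.eye(2) + A)        # resolvent (Id + A)^{-1}, linear here

# Lipschitz constant of JA is 1/sigma_min(Id+A), bounded by 1/(1+rho)
lip = np.linalg.norm(JA, 2)
print(lip <= 1 / (1 + rho) + 1e-12)      # -> True

# (1+rho)-cocoercivity: <x - y, JAx - JAy> >= (1+rho)||JAx - JAy||^2
rng = np.random.default_rng(3)
ok = True
for _ in range(500):
    d = rng.standard_normal(2)
    u = JA @ d
    ok = ok and d @ u >= (1 + rho) * (u @ u) - 1e-12
print(ok)  # -> True
```

For this skew-plus-scaling example the cocoercivity inequality actually holds with equality, since ⟨(Id+A)u, u⟩ = (1+ρ)‖u‖².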

Compositions

Definition 3.1

((α,β)-I-N decomposition)

Let R:X→X be Lipschitz continuous, and let (α,β)∈ℝ×ℝ₊. We say that R admits an (α,β)-identity-nonexpansive (I-N) decomposition if there exists a nonexpansive operator N:X→X such that R = αId + βN.

Throughout the rest of this paper, we assume that

R1:X→X and R2:X→X are Lipschitz continuous operators.

Proposition 3.2

Let α1∈]−∞,1[, let α2∈]−∞,1[, let β1∈ℝ₊, let β2∈ℝ₊, and suppose that α2(α2−1) ≤ β2². Set

δ1 = (α1/(1−α1))(1 − ((1−α2)²−β2²)/(1−α2)),  (15a)
δ2 = α2/(1−α2),  (15b)
δ3 = 1 − ((((1−α1)²−β1²)/(1−α1))(1 − ((1−α2)²−β2²)/(1−α2)) + ((1−α2)²−β2²)/(1−α2)).  (15c)

Suppose that R1 admits an (α1,β1)-I-N decomposition and that R2 admits an (α2,β2)-I-N decomposition. Then (∀(x,y)∈X×X) we have

‖R2R1x−R2R1y‖² + δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖² ≤ δ3‖x−y‖².  (16)

Proof

Set Ti = (1/2)(Id+Ri) = ((1+αi)/2)Id + (βi/2)Ni, and observe that, by Proposition 2.5 applied with (T,α,β) replaced by (Ti,(1+αi)/2,βi/2), i∈{1,2}, we have (∀(x,y)∈X×X)

⟨Tix−Tiy, (Id−Ti)x−(Id−Ti)y⟩ ≥ (αi/(1−αi))‖(Id−Ti)x−(Id−Ti)y‖² + (((1−αi)²−βi²)/(4(1−αi)))‖x−y‖².  (17)

Equivalently,

⟨(Id+Ri)x−(Id+Ri)y, (Id−Ri)x−(Id−Ri)y⟩ ≥ (αi/(1−αi))‖(Id−Ri)x−(Id−Ri)y‖² + (((1−αi)²−βi²)/(1−αi))‖x−y‖².  (18)

Observe also that, because α2<1, we have

α2(α2−1) ≤ β2² ⟺ 1 − ((1−α2)²−β2²)/(1−α2) ≥ 0.  (19)

It follows from (18), applied with i=2 and (x,y) replaced by (R1x,R1y) in (20c) and with i=1 in (20f), in view of (19), that

‖x−y‖² − ‖R2R1x−R2R1y‖² = ‖x−y‖² − ‖R1x−R1y‖² + ‖R1x−R1y‖² − ‖R2R1x−R2R1y‖²  (20a)
= ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + ⟨(Id+R2)R1x−(Id+R2)R1y, (Id−R2)R1x−(Id−R2)R1y⟩  (20b)
≥ ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖R1x−R1y‖²  (20c)
= ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))(‖x−y‖² − ⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩)  (20d)
= (1 − ((1−α2)²−β2²)/(1−α2))⟨(Id+R1)x−(Id+R1)y, (Id−R1)x−(Id−R1)y⟩ + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖x−y‖²  (20e)
≥ (1 − ((1−α2)²−β2²)/(1−α2))((α1/(1−α1))‖(Id−R1)x−(Id−R1)y‖² + (((1−α1)²−β1²)/(1−α1))‖x−y‖²) + (α2/(1−α2))‖(Id−R2)R1x−(Id−R2)R1y‖² + (((1−α2)²−β2²)/(1−α2))‖x−y‖²  (20f)
= δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖² + (1−δ3)‖x−y‖².  (20g)

Rearranging yields the desired result. □

Rearranging yields the desired result. □

Theorem 3.3

Let α1∈]−∞,1[, let α2∈]−∞,1[, let β1∈ℝ₊, let β2∈ℝ₊, and suppose that α2(α2−1) ≤ β2². Let δ1, δ2, and δ3 be defined as in (15a)–(15c). Set

δ4 = δ1δ2/(δ1+δ2),  (21)

and suppose that δ1+δ2>0, that δ3δ4+δ3−δ4 ≥ 0, and that δ4>−1. Suppose that R1 admits an (α1,β1)-I-N decomposition, and that R2 admits an (α2,β2)-I-N decomposition. Then R2R1 admits an (α,β)-I-N decomposition, where

α = δ4/(1+δ4),  β = √(δ3δ4+δ3−δ4)/(1+δ4).  (22)

Proof

Let δ_ := min(δ1,δ2), let δ̄ := max(δ1,δ2), and let λ := δ_/δ̄ (i.e., λ = δ1/δ2 if δ1 ≤ δ2, and λ = δ2/δ1 if δ1 ≥ δ2). Then Proposition 3.2 and Lemma 2.7 imply that

δ3‖x−y‖² − ‖R2R1x−R2R1y‖² ≥ δ1‖(Id−R1)x−(Id−R1)y‖² + δ2‖(Id−R2)R1x−(Id−R2)R1y‖²  (23a)
= δ̄((δ1/δ̄)‖(Id−R1)x−(Id−R1)y‖² + (δ2/δ̄)‖(Id−R2)R1x−(Id−R2)R1y‖²)  (23b)
≥ δ̄(λ/(1+λ))‖(Id−R1)x−(Id−R1)y + (Id−R2)R1x−(Id−R2)R1y‖²  (23c)
= (λδ̄/(1+λ))‖(Id−R2R1)x−(Id−R2R1)y‖²  (23d)
= (δ_δ̄/(δ̄+δ_))‖(Id−R2R1)x−(Id−R2R1)y‖²  (23e)
= δ4‖(Id−R2R1)x−(Id−R2R1)y‖²,  (23f)

where (23c) follows from Lemma 2.7 applied with λ replaced by −λ. Comparing (23a)–(23f) to Proposition 2.5(v) applied with T replaced by R2R1, we learn that there exist a nonexpansive operator N:X→X and (α,β)∈ℝ² such that R2R1 = αId+βN, where δ3 = (β²+α(1−α))/(1−α) and δ4 = α/(1−α). Equivalently, α = δ4/(1+δ4), hence β = √(δ3δ4+δ3−δ4)/(1+δ4), as claimed. □

Theorem 3.4

Let α1∈ℝ, let α2∈ℝ, let β1>0, let β2>0, suppose that α1+β1>0, that α2+β2>0, and that either β1β2/((α1+β1)(α2+β2)) < 1 or max{β1/(α1+β1), β2/(α2+β2)} = 1. Set

κ = (α1+β1)(α2+β2),  (24a)
θ = (β1α2+β2α1)/(α1α2+α1β2+α2β1) if β1β2/((α1+β1)(α2+β2)) < 1; θ = 1 if max{β1/(α1+β1), β2/(α2+β2)} = 1.  (24b)

Suppose that R1 admits an (α1,β1)-I-N decomposition, and that R2 admits an (α2,β2)-I-N decomposition. Then θ∈]0,+∞[ and R2R1 admits a (κ(1−θ),κθ)-I-N decomposition, i.e., R2R1 is κ-scaled θ-conically nonexpansive. That is, there exists a nonexpansive operator N:X→X such that

R2R1 = κ((1−θ)Id + θN).  (25)

Proof

Let θi = βi/(αi+βi) > 0, and observe that

Ri = (αi+βi)((1−θi)Id + θiNi),  i∈{1,2}.  (26)

Next, let Ñ2 = (1/(α1+β1))N2∘((α1+β1)Id), i.e., Ñ2: x ↦ N2((α1+β1)x)/(α1+β1), and note that Ñ2 is nonexpansive. Now, set

R̃1 = (1−θ1)Id + θ1N1,  R̃2 = (1−θ2)Id + θ2Ñ2.  (27)

Then (26) and (27) yield

R2R1 = ((α2+β2)((1−θ2)Id+θ2N2)) ∘ ((α1+β1)((1−θ1)Id+θ1N1))  (28a)
= (α1+β1)(α2+β2)((1/(α1+β1))((1−θ2)Id+θ2N2)∘((α1+β1)Id)) ∘ R̃1  (28b)
= (α1+β1)(α2+β2)R̃2R̃1.  (28c)

We proceed by cases. Case I: α1α2 = 0. Observe that 0∈{α1,α2} implies max{β1/(α1+β1), β2/(α2+β2)} = max{θ1,θ2} = 1; by assumption we then also have θ1 ≤ 1 and θ2 ≤ 1. The conclusion follows by observing that R̃i is nonexpansive, i∈{1,2}.

Case II: α1α2 ≠ 0. By assumption we must have θ1θ2 = (β1/(α1+β1))(β2/(α2+β2)) < 1. We claim that the R̃i, i∈{1,2}, satisfy the conditions of Theorem 3.3 with (αi,βi) replaced by (1−θi,θi). Indeed, observe that (1−θ2)((1−θ2)−1) ≤ θ2² ⟺ −θ2(1−θ2) ≤ θ2² ⟺ −θ2 ≤ 0, which is always true. Moreover, replacing (αi,βi) by (1−θi,θi) yields δ1 = (1−θ1)/θ1, δ2 = (1−θ2)/θ2, δ3 = 1, and, consequently, δ4 = (1−θ1)(1−θ2)/(θ2(1−θ1)+θ1(1−θ2)); in particular δ3δ4+δ3−δ4 = 1 ≥ 0. We claim that

θ1+θ2−2θ1θ2 > 0.  (29)

Indeed, recall that θ1θ2<1; hence θ1+θ2−2θ1θ2 = θ1θ2(1/θ1 + 1/θ2 − 2) > θ1θ2(1/θ1 + θ1 − 2) = θ1θ2(√θ1 − 1/√θ1)² ≥ 0. This implies that δ1+δ2 = (θ1+θ2−2θ1θ2)/(θ1θ2) > 0. Moreover,

δ4 = (1−θ1)(1−θ2)/(θ2(1−θ1)+θ1(1−θ2)) = (1−θ1−θ2+θ1θ2)/(θ1+θ2−2θ1θ2) = −1 + (1−θ1θ2)/(θ1+θ2−2θ1θ2) > −1.  (30)

Therefore, by Theorem 3.3, we conclude that there exists a nonexpansive operator N:X→X such that R̃2R̃1 = αId+βN, with α = δ4/(1+δ4) = (1−θ1−θ2+θ1θ2)/(1−θ1θ2) = α1α2/(α1α2+α1β2+α2β1), and β = 1/(1+δ4) = (θ1+θ2−2θ1θ2)/(1−θ1θ2) = (β1α2+β2α1)/(α1α2+α1β2+α2β1) = θ. Now combine with (28a)–(28c). □

Applications to special cases

We start this section by recording the following simple lemma which can be easily verified, hence we omit the proof.

Lemma 4.1

Set (R̃1,R̃2) = (−R1, R2∘(−Id)). Then the following hold:

  • (i)

    R2R1 = R̃2R̃1.

  • (ii)

    Let αi>0, let δi∈ℝ∖{0}, and suppose that (1/δi)Ri is αi-conically nonexpansive. Then (−1/δi)R̃i is αi-conically nonexpansive.

Theorem 4.2

Let i∈{1,2}, let αi>0, let δi∈ℝ∖{0}, and let Ri:X→X be such that (1/δi)Ri is αi-conically nonexpansive. Suppose that either α1α2<1 or max{α1,α2}=1. Set

α = (α1+α2−2α1α2)/(1−α1α2) if α1α2<1; α = 1 if max{α1,α2}=1.  (31)

Then there exists a nonexpansive operator N:X→X such that

R2R1 = δ1δ2((1−α)Id + αN).  (32)

Furthermore, α<1 ⟺ [α1<1 and α2<1].

Proof

Set (R̃1,R̃2) = (−R1, R2∘(−Id)) and set R = R2R1. The proof proceeds by cases.

Case I: δi>0, i∈{1,2}. By assumption, there exist nonexpansive operators Ni:X→X such that Ri = δi(1−αi)Id + δiαiNi. Moreover, one can easily check that the Ri satisfy the assumptions of Theorem 3.4 with (αi,βi) replaced by (δi(1−αi),δiαi). Applying Theorem 3.4, with (αi,βi) replaced by (δi(1−αi),δiαi), we learn that there exists a nonexpansive operator N:X→X such that R = (δ1(1−α1)+δ1α1)(δ2(1−α2)+δ2α2)((1−α)Id+αN) = δ1δ2((1−α)Id+αN), where

α = (δ1α1δ2(1−α2) + δ2α2δ1(1−α1))/(δ1α1δ2(1−α2) + δ2α2δ1(1−α1) + δ1(1−α1)δ2(1−α2)) = (α1+α2−2α1α2)/(1−α1α2).  (33)

Finally, observe that α<1 ⟺ [α1α2<1 and (α1+α2−2α1α2)/(1−α1α2)<1] ⟺ [α1α2<1 and 1−α1α2 > α1+α2−2α1α2] ⟺ [α1α2<1 and (1−α1)(1−α2)>0] ⟺ [α1<1 and α2<1].

Case II: δi<0, i∈{1,2}. Observe that (1/δi)Ri = −(1/|δi|)Ri is αi-conically nonexpansive. Therefore, Lemma 4.1(ii) implies that (1/|δi|)R̃i = (−1/δi)R̃i is αi-conically nonexpansive, i∈{1,2}. Now combine Lemma 4.1(i) and Case I, applied with (Ri,δi) replaced by (R̃i,|δi|), and note that |δ1||δ2| = δ1δ2.

Case III: δ1<0 and δ2>0. Observe that (1/|δ1|)R̃1 = (1/δ1)R1 is α1-conically nonexpansive and that, by Lemma 4.1(ii), (−1/δ2)R̃2 is α2-conically nonexpansive. Using Lemma 4.1(i), we have R = R2R1 = R̃2R̃1. Now combine with Case IV below, applied with (R1,R2,δ1,δ2) replaced by (R̃1,R̃2,|δ1|,−δ2), to learn that there exists a nonexpansive mapping N:X→X such that R = |δ1|(−δ2)((1−α)Id+αN) = δ1δ2((1−α)Id+αN), and the conclusion follows.

Case IV: δ1>0 and δ2<0. Indeed, R = R2R1 = −((−R2)R1), and (1/|δ2|)(−R2) = (1/δ2)R2 is α2-conically nonexpansive. Now combine with Case I, applied with (R2,δ2) replaced by (−R2,|δ2|), and note that −δ1|δ2| = δ1δ2. □

Corollary 4.3

Let α∈]0,1[, let β>0, let δ∈ℝ∖{0}, let {i,j}={1,2}, and suppose that (1/δ)Ri is α-averaged and that Rj is (1/β)-cocoercive. Set ᾱ = 1/(2−α). Then ᾱ∈]0,1[, and there exists a nonexpansive operator N:X→X such that

R2R1 = βδ((1−ᾱ)Id + ᾱN).  (34)

Proof

Suppose first that (i,j)=(1,2), and observe that there exists a nonexpansive operator N2 such that R2 = (β/2)(Id+N2); equivalently, (1/β)R2 is (1/2)-averaged. Applying Theorem 4.7 with m=2 and (α1,α2,δ1,δ2) replaced by (α,1/2,δ,β) yields that there exists a nonexpansive operator N such that R2R1 = βδ((1−ᾱ)Id+ᾱN), where

ᾱ = (α + 1/2 − 2·α·(1/2))/(1 − α/2) = (1/2)/(1−α/2) = 1/(2−α) ∈ ]0,1[.  (35)

The case (i,j)=(2,1) follows similarly. □

The assumption α1α2<1 is critical in the conclusion of Theorem 4.2 as we illustrate below.

Example 4.4

(α1=α2>1)

Let α>1, and set R1 = R2 = (1−α)Id + α(−Id) = (1−2α)Id. Then

R2R1 = (1−2α)²Id = (1−4α+4α²)Id.  (36)

Hence, Id−R2R1 = 4α(1−α)Id. Because α>1, we have 4α(1−α)<0; that is, Id−R2R1 is not monotone. Hence, R2R1 is not conically nonexpansive by Lemma 2.9 applied with T replaced by R2R1.

The following proposition provides an abstract framework to construct a family of operators R1 and R2 such that R1 is α1-conically nonexpansive, R2 is α2-conically nonexpansive, α1α2>1, and the composition R2R1 fails to be conically nonexpansive.

Proposition 4.5

Let θ∈ℝ, let α1>0, let α2>0, let

Rθ = [[cosθ, −sinθ], [sinθ, cosθ]],  (37)

set

R1 = (1−α1)Id + α1Rθ,  R2 = (1−α2)Id − α2Rθ,  (38)

and set

κ = α1+α2−2α1α2sin²θ−(α1−α2)cosθ.  (39)

Then R1 is α1-conically nonexpansive, and R2 is α2-conically nonexpansive. Moreover, we have the implication κ<0 ⟹ R2R1 is not conically nonexpansive.

Proof

Set S = Rπ/2, and observe that S² = −Id and that Rθ = (cosθ)Id + (sinθ)S. Now,

R2R1 = ((1−α2)Id − α2Rθ)((1−α1)Id + α1Rθ)  (40a)
= (1−α1−α2+α1α2)Id + (α1−α2)Rθ − α1α2R2θ  (40b)
= (1−α1−α2+α1α2+(α1−α2)cosθ−α1α2cos(2θ))Id + ((α1−α2)sinθ−α1α2sin(2θ))S  (40c)
= (1−α1−α2+α1α2+(α1−α2)cosθ−α1α2(2cos²θ−1))Id + ((α1−α2)sinθ−α1α2sin(2θ))S  (40d)
= (1−α1−α2+2α1α2sin²θ+(α1−α2)cosθ)Id + ((α1−α2)sinθ−α1α2sin(2θ))S.  (40e)

Consequently,

Id−R2R1 = (α1+α2−2α1α2sin²θ−(α1−α2)cosθ)Id − ((α1−α2)sinθ−α1α2sin(2θ))S.  (41)

Hence, (∀x∈ℝ²)

⟨(Id−R2R1)x, x⟩ = (α1+α2−2α1α2sin²θ−(α1−α2)cosθ)‖x‖² = κ‖x‖².  (42)

Now, R2R1 is conically nonexpansive ⟹ Id−R2R1 is monotone by Lemma 2.9, and the conclusion follows in view of (42). □
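The quantity κ in (39) is easy to probe numerically. The sketch below uses our own parameter choices (θ = π/3 with α1 = α2 = 2/sin²θ, as in Example 4.6(i) with ϵ = δ = 1): here α1α2 > 1 and κ < 0, so the composition fails to be conically nonexpansive.

```python
import numpy as np

theta = np.pi / 3
a1 = a2 = (1 + 1.0) / np.sin(theta) ** 2     # Example 4.6(i) with eps = delta = 1
Rt = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
I = np.eye(2)
R1 = (1 - a1) * I + a1 * Rt
R2 = (1 - a2) * I - a2 * Rt

kappa = a1 + a2 - 2 * a1 * a2 * np.sin(theta) ** 2 - (a1 - a2) * np.cos(theta)

# identity (42): <(Id - R2 R1)x, x> = kappa * ||x||^2 for every x
x = np.array([1.0, -2.0])
val = x @ (I - R2 @ R1) @ x
print(np.isclose(val, kappa * (x @ x)), kappa < 0)  # -> True True
```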

The following example provides two concrete instances where: (i) α1>1, α2>1, hence α1α2>1, (ii) α1>1, α2<1, α1α2>1. In both cases, R2R1 is not conically nonexpansive.

Example 4.6

Suppose that one of the following holds:

  • (i)

θ∈]0,π/2[, ϵ≥0, δ≥0 with (ϵ,δ)≠(0,0), α1 = (1+ϵ)/sin²θ, and α2 = (1+δ)/sin²θ.

  • (ii)

θ∈]π/4,π/2[, ϵ > cos²θ(2−cos²θ)/((1−2cos²θ)(1+cosθ)+cosθ), α1 = (1+ϵ)/sin²θ, and α2 = sin²θ.

Let Rθ be defined as in (37), let R1 = (1−α1)Id + α1Rθ, and let R2 = (1−α2)Id − α2Rθ. Then α1α2>1, and R2R1 is not conically nonexpansive.

Proof

Let κ be defined as in (39). In view of Proposition 4.5, it is sufficient to show that κ<0. (i): Note that κ<0 ⟺ κsin²θ<0. Now,

κsin²θ = 2+ϵ+δ−(ϵ−δ)cosθ−2−2ϵ−2δ−2ϵδ  (43a)
= −(ϵ(1+cosθ) + δ(1−cosθ) + 2ϵδ) < 0.  (43b)

(ii): We have

κ = (1+ϵ+sin⁴θ)/sin²θ − 2(1+ϵ)sin²θ − ((1+ϵ−sin⁴θ)/sin²θ)cosθ  (44a)
= −(1/sin²θ)(2(1+ϵ)sin⁴θ − (1+ϵ+sin⁴θ) + (1+ϵ−sin⁴θ)cosθ)  (44b)
= −(1/(1−cos²θ))((2sin⁴θ+cosθ−1)ϵ + sin⁴θ(1−cosθ) − (1−cosθ))  (44c)
= −((1−cosθ)/(1−cos²θ))((2(1+cosθ)(1−cos²θ)−1)ϵ + cos⁴θ−2cos²θ)  (44d)
= −(1/(1+cosθ))((1+2cosθ−2cos²θ−2cos³θ)ϵ − cos²θ(2−cos²θ))  (44e)
= −(1/(1+cosθ))(((1−2cos²θ)(1+cosθ)+cosθ)ϵ − cos²θ(2−cos²θ)).  (44f)

Now, observe that (∀θ∈]π/4,π/2[) 1−2cos²θ = −cos(2θ) > 0. Consequently, (1−2cos²θ)(1+cosθ)+cosθ > cosθ > 0. Now use the assumption ϵ > cos²θ(2−cos²θ)/((1−2cos²θ)(1+cosθ)+cosθ) to learn that ((1−2cos²θ)(1+cosθ)+cosθ)ϵ − cos²θ(2−cos²θ) > 0; hence κ<0, and the conclusion follows. □

Theorem 4.7

(composition of m scaled conically nonexpansive operators)

Let m≥2 be an integer, set I = {1,…,m}, let (Ri)i∈I be a family of operators from X to X, let r∈I, let αi be real numbers such that {αi : i∈I∖{r}} ⊆ ]0,1[ and αr>0, let δi be real numbers in ℝ∖{0}, and suppose that, for every i∈I, (1/δi)Ri is αi-conically nonexpansive. Set

α̃ = (∑_{i∈I∖{r}} αi/(1−αi)) / (1 + ∑_{i∈I∖{r}} αi/(1−αi)).  (45)

Suppose that αrα̃ < 1, and set

α = (∑_{i∈I} αi/(1−αi)) / (1 + ∑_{i∈I} αi/(1−αi)) if αr ≠ 1; α = 1 if αr = 1.  (46)

Then there exists a nonexpansive operator N:X→X such that

Rm⋯R1 = δm⋯δ1((1−α)Id + αN).  (47)

Proof

First, observe that (∀i∈I∖{r}) (1/δi)Ri is αi-averaged, hence nonexpansive. If αr=1, then (∀i∈{1,…,m}) Ri is |δi|-Lipschitz continuous and the conclusion readily follows. Now, suppose that αr≠1. We proceed by induction on k∈{2,…,m}. At k=2, the claim holds by Theorem 4.2. Now, suppose that the claim holds for some k∈{2,…,m−1}. Let (Ri)1≤i≤k+1 be a family of operators from X to X, let r∈{1,…,k+1}, let αi be real numbers such that {αi : i∈{1,…,k+1}∖{r}} ⊆ ]0,1[ and αr∈]0,+∞[∖{1}, let δi be real numbers in ℝ∖{0}, and suppose that, for every i∈{1,…,k+1}, (1/δi)Ri is αi-conically nonexpansive. Set β = (∑_{i≤k+1, i≠r} αi/(1−αi))/(1 + ∑_{i≤k+1, i≠r} αi/(1−αi)), and suppose that αrβ<1. We examine two cases.

Case I: r = k+1. In this case the conclusion follows by applying Theorem 4.2, in view of the inductive hypothesis, with (R1,R2) replaced by (Rk⋯R1, Rk+1) and (δ1,δ2,α1,α2) replaced by (δ1⋯δk, δk+1, (∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)), αk+1).

Case II: r ≤ k. We claim that

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1.  (48)

To this end, set s = ∑_{i≤k, i≠r} αi/(1−αi) and α̂ = s/(1+s), and observe that α̂ < β. By assumption we have αrβ<1. Altogether, we conclude that αrα̂ < 1. It follows from the inductive hypothesis that

(1/(δ1⋯δk))(Rk⋯R1) is (∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi))-conically nonexpansive.  (49)

Next note that

(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) = (s + αr/(1−αr))/(1 + s + αr/(1−αr))  (50a)
= (α̂(1−αr)(1+s) + αr)/((1−αr)(1+s) + αr)  (50b)
= (αr(1−α̂(1+s)) + α̂(1+s))/(1+(1−αr)s).  (50c)

Because αrα̂ < 1, we learn that 1+(1−αr)s > 0. Moreover, because α̂ < 1, we have αk+1α̂ < 1. Therefore, (50a)–(50c) implies

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1  (51a)
⟺ αk+1(αr(1−α̂(1+s)) + α̂(1+s)) < 1+(1−αr)s  (51b)
⟺ αr(αk+1(1−α̂(1+s)) + s) < (1+s)(1−αk+1α̂)  (51c)
⟺ αr(αk+1(1−s) + s) < (1+s)(1−αk+1α̂)  (51d)
⟺ αr(αk+1(1−s)+s)/((1+s)(1−αk+1α̂)) < 1.  (51e)

Now, observe that

αk+1(1−s)+s = (s + αk+1/(1−αk+1))(1−αk+1) = (∑_{i≤k+1, i≠r} αi/(1−αi))(1−αk+1)  (52)

and

(1+s)(1−αk+1α̂) = 1+s−αk+1s  (53a)
= (1+s+αk+1/(1−αk+1))(1−αk+1)  (53b)
= (1+∑_{i≤k+1, i≠r} αi/(1−αi))(1−αk+1).  (53c)

In view of (52) and (53a)–(53c), (51a)–(51e) becomes

αk+1·(∑_{i≤k} αi/(1−αi))/(1+∑_{i≤k} αi/(1−αi)) < 1 ⟺ αr·(∑_{i≤k+1, i≠r} αi/(1−αi))/(1+∑_{i≤k+1, i≠r} αi/(1−αi)) = αrβ < 1,  (54)

and the right-hand side holds by assumption. This proves (48). Now proceed similarly to Case I in view of (48) and (49). □

The assumption αrα̃ < 1 is critical in the conclusion of the above theorem, as we illustrate in the following example.

Example 4.8

Let ϵ>0, let δ≥1, let α1 ∈ ]0, (√((ϵ+δ)²+4)−(ϵ+δ))/2[, let α2 = α1+δ+ϵ, and let

S = [[0, −1], [1, 0]].  (55)

Set R1 = (1−α1)Id−α1S, R2 = (1−α2)Id+α2S, R3 = −(1/δ)S, and

R = R3R2R1.  (56)

Then R = R3R1R2 = R1R2R3 = R1R3R2 = R2R3R1 = R2R1R3. Moreover, the following hold:

  • (i)

    α1∈]0,1[, α2>1, and α1α2<1.

  • (ii)

    R3 is α3-conically nonexpansive, where α3 = (1+δ)/(2δ) ∈ ]1/2,1].

  • (iii)

    ((α1+α2−2α1α2)/(1−α1α2))·α3 > 1.

  • (iv)

    R = ((ϵ+δ)/δ)Id + ((α1+α2−2α1α2−1)/δ)S.

  • (v)

    Id−R = −(ϵ/δ)Id − ((α1+α2−2α1α2−1)/δ)S. Hence, Id−R is not monotone.

  • (vi)

    R is not conically nonexpansive.

Proof

It is straightforward to verify that R = R3R1R2 = R1R2R3 = R1R3R2 = R2R3R1 = R2R1R3. (i): It is clear that α1∈]0,1[ and that α2>1. Note that α1α2<1 ⟺ α1²+(ϵ+δ)α1−1<0 ⟺ α1 lies between the roots of the quadratic x²+(ϵ+δ)x−1, and the conclusion follows from the quadratic formula. (ii): This follows from [2, Proposition 4.38]. (iii): Indeed, in view of (i) we have

((α1+α2−2α1α2)/(1−α1α2))·α3 > 1 ⟺ (α1+α2−2α1α2)α3 > 1−α1α2  (57a)
⟺ (α1+α2−2α1α2)(1+δ) > 2(1−α1α2)δ  (57b)
⟺ (α1+α2)(1+δ) − 2α1α2 − 2α1α2δ > 2δ − 2α1α2δ  (57c)
⟺ (α1+α2)(1+δ) − 2α1α2 > 2δ  (57d)
⟺ (2α1+ϵ+δ)(1+δ) − 2α1(α1+ϵ+δ) > 2δ  (57e)
⟺ 2α1(1+δ−α1−ϵ−δ) + δ² + δ(1+ϵ) + ϵ > 2δ  (57f)
⟺ 2α1(α1−1+ϵ) < δ²−δ+ϵδ+ϵ = δ²−δ+(1+δ)ϵ.  (57g)

Now, because α1<1 and δ≥1, we learn that 2α1(α1−1+ϵ) < 2α1ϵ < (1+δ)ϵ ≤ (1+δ)ϵ+δ²−δ, and the conclusion follows. (iv): It is straightforward, by noting that S²=−Id, to verify that R2R1 = R1R2 = (1−α1−α2+α1α2)Id + (α2−α1)S − α1α2S² = (1−α1−α2+2α1α2)Id + (α2−α1)S. Consequently, R3R2R1 = −(1/δ)((1−α1−α2+2α1α2)S + (α2−α1)S²) = (1/δ)((α2−α1)Id − (1−α1−α2+2α1α2)S) = ((ϵ+δ)/δ)Id + ((α1+α2−2α1α2−1)/δ)S. (v): This is a direct consequence of (iv); indeed, (∀x∈ℝ²∖{0}) ⟨(Id−R)x, x⟩ = −(ϵ/δ)‖x‖² < 0. (vi): Combine (v) and Lemma 2.9. □

Theorem 4.9

(Composition of cocoercive operators)

Let m≥1 be an integer, set I = {1,…,m}, let (Ri)i∈I be a family of operators from X to X, let βi be real numbers in ]0,+∞[, and suppose that, for every i∈I, Ri is (1/βi)-cocoercive. Then there exists a nonexpansive operator N:X→X such that

Rm⋯R1 = βm⋯β1((1/(1+m))Id + (m/(1+m))N).  (58)

Proof

Apply Theorem 4.7 with (αi,δi) replaced by (1/2,βi), i∈{1,…,m}. □
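Theorem 4.9 can be stress-tested numerically. In the sketch below (our own construction), three cocoercive operators are built as βi times firmly nonexpansive resolvents of random monotone linear maps, and the scaled composition is checked against the m/(1+m)-averagedness inequality.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 4, 3
I = np.eye(n)

Rs, betas = [], []
for _ in range(m):
    B = rng.standard_normal((n, n))
    M = B @ B.T + (B - B.T)              # monotone: PSD part + skew part
    F = np.linalg.inv(I + M)             # firmly nonexpansive resolvent
    beta = rng.uniform(0.5, 2.0)
    Rs.append(beta * F)                  # (1/beta)-cocoercive operator
    betas.append(beta)

R = Rs[2] @ Rs[1] @ Rs[0]                # composition R3 R2 R1
T = R / np.prod(betas)                   # scaled composition
a = m / (1 + m)                          # claimed averagedness constant

ok = True
for _ in range(1000):
    d = rng.standard_normal(n)
    t = T @ d
    ok = ok and t @ t + (1 - a) / a * ((d - t) @ (d - t)) <= d @ d + 1e-10
print(ok)  # -> True
```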

Application to the Douglas–Rachford algorithm

Theorem 5.1

(Averagedness of the Douglas–Rachford operator)

Let μ>ω≥0, and let γ∈]0,(μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

A is maximally (−ω)-monotone and B is maximally μ-monotone.

  • (ii)

A is maximally μ-monotone and B is maximally (−ω)-monotone.

Set

T = (1/2)(Id + RγBRγA),  and  α = (μ−ω)/(2(μ−ω−γμω)).  (59)

Then α∈]0,1[ and T is α-averaged.

Proof

Suppose that (i) holds. Note that γA is (−γω)-monotone, and

−γω > −(μ−ω)/(2μ) ≥ −1/2 > −1.  (60)

Using (13) and Fact 2.1 we learn that JγA and, in turn, T are single-valued and dom JγA = dom T = X. It follows from [3, Proposition 4.3 and Table 1] that RγA is 1/(1−γω)-conically nonexpansive and RγB is 1/(1+γμ)-conically nonexpansive. It follows from Theorem 4.2, applied with (α1,δ1,α2,δ2) replaced by (1/(1−γω),1,1/(1+γμ),1), that RγBRγA is (μ−ω)/(μ−ω−γμω)-conically nonexpansive. Therefore, there exists a nonexpansive mapping N:X→X such that

RγBRγA = (1−δ)Id + δN,  δ = (μ−ω)/(μ−ω−γμω).  (61)

The conclusion now follows by applying Proposition 2.8 with (β,M) replaced by (1/2, RγBRγA), which shows that T is (δ/2)-averaged. Finally, notice that γ < (μ−ω)/(2μω), which implies that 0 < μ−ω < 2(μ−ω−γμω). Therefore,

α = (μ−ω)/(2(μ−ω−γμω)) ∈ ]0,1[.  (62)

The proof of (ii) follows similarly. □

Corollary 5.2

([9, Theorem 4.5(ii)])

Let μ>ω≥0, and let γ∈]0,(μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

A is maximally (−ω)-monotone and B is maximally μ-monotone.

  • (ii)

A is maximally μ-monotone and B is maximally (−ω)-monotone.

Set T = (1/2)(Id+RγBRγA) and let x0∈X. Then there exists x̄ ∈ Fix T = Fix RγBRγA such that Tⁿx0 ⇀ x̄.

Proof

Combine Theorem 5.1 and [2, Theorem 5.15]. □
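A scalar sketch of this phenomenon (our own toy instance, not from the paper): A = −ωId is not monotone, B(x) = μ(x−1) is strongly monotone, and the sum is (μ−ω)-strongly monotone. DR still converges, and the shadow sequence JγA(zⁿ) reaches the zero of A+B.

```python
mu, omega = 2.0, 0.5
gamma = 0.5                 # gamma < (mu - omega)/(2*mu*omega) = 0.75

J_gA = lambda x: x / (1 - gamma * omega)              # resolvent of gamma*A, A(x) = -omega*x
R_gA = lambda x: 2 * J_gA(x) - x
J_gB = lambda x: (x + gamma * mu) / (1 + gamma * mu)  # resolvent of gamma*B, B(x) = mu*(x-1)
R_gB = lambda x: 2 * J_gB(x) - x

z = 10.0
for _ in range(100):
    z = 0.5 * (z + R_gB(R_gA(z)))       # Douglas-Rachford iteration

# zer(A+B): -omega*x + mu*(x-1) = 0  =>  x = mu/(mu-omega) = 4/3
print(round(J_gA(z), 6))   # -> 1.333333
```

Note that A is −ω-monotone but not monotone, so classical DR theory does not apply directly; only the sum is (strongly) monotone here.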

Remark 5.3

In view of (13), one might think that the scaling factor γ is required only to guarantee the single-valuedness and the full domain of T. However, it is actually critical to guarantee convergence as well, as we illustrate in Example 5.4.

Example 5.4

Let μ>ω>0, let U be a closed linear subspace of X, and suppose that

A = NU + μId,  B = −ωId.  (63)

Then A is μ-monotone, B is (−ω)-monotone, and (∀γ∈[1/(2ω), 1/ω[) JγB is single-valued. Furthermore, we have

T = (1/2)(Id+RγBRγA) = ((1+γω)/((1−γω)(1+γμ)))PU − (γω/(1−γω))Id,  (64)

and (∀x0∈U⊥∖{0}) (Tⁿx0)n∈ℕ does not converge.

Proof

Indeed, one can verify that

JγA = (1/(1+γμ))PU,  JγB = (1/(1−γω))Id.  (65)

Consequently,

RγA = (2/(1+γμ))PU − Id,  RγB = ((1+γω)/(1−γω))Id,  (66)

and (64) follows. Therefore,

T|U⊥ = −(γω/(1−γω))Id  and  −γω/(1−γω) ∈ ]−∞,−1].  (67)

Hence, (∀x0∈U⊥∖{0}) (Tⁿx0)n∈ℕ does not converge. □

Before we proceed to the convergence analysis, we recall that if T is averaged and Fix T ≠ ∅, then (∀x∈X) we have (see, e.g., [22, Theorem 3.7])

Tⁿx − Tⁿ⁺¹x → 0.  (68)

We conclude this section by proving the strong convergence of the shadow sequence of the Douglas–Rachford algorithm.

Theorem 5.5

(Convergence analysis of the Douglas–Rachford algorithm)

Let μ>ω≥0, and let γ ∈ ]0, (μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

    A is maximally μ-monotone and B is maximally (ω)-monotone.

  • (ii)

    A is maximally (ω)-monotone and B is maximally μ-monotone.

Set

T = ½(Id + RγBRγA), (69)

and let x0 ∈ X. Then zer(A+B) ≠ ∅. Moreover, there exists x ∈ Fix T = Fix RγBRγA such that zer(A+B) = {JγA x} = {JγB RγA x}, T^n x0 ⇀ x, JγA T^n x0 → JγA x, and JγB RγA T^n x0 → JγB RγA x.

Proof

Suppose that (i) holds. Since A+B is (μ−ω)-monotone and μ−ω>0, we conclude from [2, Proposition 23.35] that zer(A+B) is a singleton. Combining this with Fact 2.1, applied with (A,B) replaced by (γA,γB), yields zer(A+B) = zer(γA+γB) = {JγA x} = {JγB RγA x}. The claim that T^n x0 ⇀ x follows from Corollary 5.2. It remains to show that JγA T^n x0 → JγA x and JγB RγA T^n x0 → JγB RγA x. To this end, note that (T^n x0)_{n∈ℕ} is bounded; consequently, since JγA and JγB RγA are Lipschitz continuous (see Proposition 2.16(i)&(ii)), we learn that

(JγA T^n x0)_{n∈ℕ} and (JγB RγA T^n x0)_{n∈ℕ} are bounded. (70)

On the one hand, in view of (68) we have

(Id−T)T^n x0 = T^n x0 − T^{n+1} x0 = JγA T^n x0 − JγB RγA T^n x0 → 0. (71)

Combining (70) and (71) yields

‖JγA T^n x0 − JγA x‖² − ‖JγB RγA T^n x0 − JγB RγA x‖² (72a)
= ⟨JγA T^n x0 − JγB RγA T^n x0, JγA T^n x0 + JγB RγA T^n x0 − JγA x − JγB RγA x⟩ (72b)
= ⟨T^n x0 − T^{n+1} x0, JγA T^n x0 + JγB RγA T^n x0 − JγA x − JγB RγA x⟩ → 0. (72c)

On the other hand, combining Lemma 2.3, applied with (R1, R2, R(λ), λ) replaced by (RγA, RγB, T, 1/2) and (x,y) replaced by (T^n x0, x), with (68) yields

0 ← ⟨T^{n+1} x0 − x, T^n x0 − T^{n+1} x0⟩ (73a)
≥ γμ(‖JγA T^n x0 − JγA x‖² − (ω/μ)‖JγB RγA T^n x0 − JγB RγA x‖²) (73b)
≥ −γ(μω/(μ−ω))‖T^n x0 − T^{n+1} x0‖² → 0. (73c)

Therefore,

‖JγA T^n x0 − JγA x‖² − (ω/μ)‖JγB RγA T^n x0 − JγB RγA x‖² → 0. (74)

Combining (72a)–(72c) and (74) and noting that ω/μ < 1 yields ‖JγA T^n x0 − JγA x‖² → 0 and ‖JγB RγA T^n x0 − JγB RγA x‖² → 0, which proves (i). The proof of (ii) proceeds similarly. □
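For intuition, the scalar instance A = μ·Id, B = −ω·Id (a toy model chosen here for illustration, not the general argument) can be iterated directly; both the governing sequence and the shadow sequence converge to the unique zero of A+B at the origin:

```python
mu, om = 2.0, 0.5
gamma = 0.5                                   # < (μ-ω)/(2μω) = 0.75
JA = lambda x: x / (1 + gamma * mu)           # J_{γA} for A = μ·Id
RA = lambda x: 2 * JA(x) - x                  # reflected resolvent of γA
RB = lambda x: (1 + gamma * om) / (1 - gamma * om) * x   # R_{γB} for B = -ω·Id
T = lambda x: 0.5 * (x + RB(RA(x)))           # Douglas-Rachford operator (69)
x = 1.0
for _ in range(200):
    x = T(x)
assert abs(x) < 1e-8        # T^n x0 converges to the fixed point 0
assert abs(JA(x)) < 1e-8    # shadow J_{γA} T^n x0 converges to zer(A+B) = {0}
```

With these constants T contracts by a factor ½ per iteration, in line with the averagedness constant of Theorem 5.1.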

Remark 5.6

(Relaxed Douglas–Rachford algorithm)

A careful look at the proofs of Theorem 5.1 and Theorem 5.5 reveals that analogous conclusions can be drawn for the relaxed Douglas–Rachford operator defined by Tλ = (1−λ)Id + λRγBRγA, λ ∈ ]0,1[. In this case, we choose γ ∈ ]0, ((1−λ)(μ−ω))/(μω)[. One can verify that the corresponding averagedness constant is α = λ(μ−ω)/(μ−ω−γμω) ∈ ]0,1[.

Application to the forward–backward algorithm

Throughout this section we assume that

A: X→X, B: X⇉X, μ≥0, ω≥0, and β>0.

In the rest of this section, we prove that the forward–backward operator is averaged, hence its iterates form a weakly convergent sequence in each of the following situations:

A is maximally μ-monotone, A−μId is (1/β)-cocoercive, B is maximally (−ω)-monotone, and μ≥ω.

A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, B is maximally μ-monotone, and μ≥ω.

A is β-Lipschitz continuous, B is maximally μ-monotone, and μ≥β.

That is, we do not require A and B to be monotone; it is enough that the sum A+B is monotone to obtain an averaged forward–backward map. In addition, we show that the forward–backward map is a Banach contraction if the sum A+B is strongly monotone, and we prove the tightness of our contraction factor.

Theorem 6.1

(Case I: A is μ-monotone)

Let μ≥ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ ]0, 2/(β+2μ)[. Set T = JγB(Id−γA), set ν = γβ/(2(1−γμ)), set δ = (1−γμ)/(1−γω), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Clearly, δ ∈ ]0,1] and ν>0. Moreover, we have ν<1 ⇔ γβ<2(1−γμ) ⇔ γ<2/(β+2μ). Hence, ν ∈ ]0,1[ as claimed. Next, note that μ < (β+2μ)/2; hence γω ≤ γμ < 2μ/(β+2μ) < 1. It follows from Proposition 2.2 that JγB and, in turn, T are single-valued and dom JγB = dom T = X. The assumption on A implies that there exists a nonexpansive operator N: X→X such that A−μId = (β/2)Id + (β/2)N. Therefore,

Id−γA = Id − γ(A−μId) − γμId = (1−γμ)Id − (γβ/2)(Id+N) (75a)
= (1−γμ)((1−ν)Id + ν(−N)). (75b)

Moreover, Proposition 2.16(i) implies that

JγB is (1−γω)-cocoercive. (76)

(i): It follows from Corollary 4.3, applied with (R1,R2) replaced by (Id−γA, JγB) and (α,β,δ) replaced by (ν, 1/(1−γω), 1−γμ), in view of (75a)–(75b) and (76), that there exists a nonexpansive operator N such that T = JγB(Id−γA) = δ((1−ν)Id+νN). (ii): Combine (i) and Lemma 2.15(i). (iii): Combine (i) and (ii). (iv): Applying Proposition 2.2 with (A,B) replaced by (γA,γB) yields zer(A+B) = zer(γA+γB) = Fix T. The claim that T^n x0 ⇀ x follows from combining (ii) and [2, Theorem 5.15]. (v): Observe that δ<1 ⇔ μ>ω. Now combine with (iii). (vi): Since A+B is maximally (μ−ω)-monotone and μ−ω>0, we conclude from [2, Proposition 23.35] that zer(A+B) is a singleton. Alternatively, use (iii) to learn that T is a Banach contraction with constant δ<1; hence zer(A+B) = Fix T is a singleton, and the conclusion follows. □
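The Lipschitz bound δ of part (iii) can be probed numerically in one dimension. In the sketch below (all constants and the choice N = sin are our illustrative assumptions), A−μId = (β/2)(Id+sin) is (1/β)-cocoercive in the sense of footnote 3, since sin is nonexpansive, and B = −ω·Id so that JγB = (1−γω)^(−1)·Id:

```python
import math, itertools

mu, om, beta = 0.5, 0.1, 2.0
gamma = 0.4                                    # < 2/(β+2μ) = 2/3
delta = (1 - gamma * mu) / (1 - gamma * om)    # claimed Lipschitz constant of T

A = lambda x: mu * x + (beta / 2) * (x + math.sin(x))   # A-μId is (1/β)-cocoercive
T = lambda x: (x - gamma * A(x)) / (1 - gamma * om)     # JγB(Id-γA), B = -ω·Id

pts = [i / 7 for i in range(-20, 21)]
for x, y in itertools.combinations(pts, 2):
    # Theorem 6.1(iii): |Tx - Ty| ≤ δ|x - y|
    assert abs(T(x) - T(y)) <= delta * abs(x - y) + 1e-12
```

Here δ ≈ 0.83 < 1, so T is a Banach contraction, consistent with part (v) since μ > ω.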

Theorem 6.2

Let μ>ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ [2/(β+2μ), 2/(β+μ+ω)[. Set T = JγB(Id−γA), set ν = γβ/(2(γ(μ+β)−1)), set δ = (1−γ(μ+β))/(1−γω), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

We proceed similarly to the proof of Theorem 6.1 to verify that T is single-valued, dom T = X, ν ∈ ]0,1[, and δ ∈ ]−1,0]. The assumption on A implies that there exists a nonexpansive operator N: X→X such that A−μId = (β/2)Id + (β/2)N. Therefore,

Id−γA = Id − γ(A−μId) − γμId = (1−γμ)Id − (γβ/2)(Id+N) (77a)
= (1−γ(μ+β))((1−ν)Id + νN). (77b)

Now proceed similarly to the proof of Theorem 6.1(i), (v), and (vi) in view of (76). □

Corollary 6.3

Let μ>ω≥0, and let β>0. Suppose that A is maximally μ-monotone, A−μId is (1/β)-cocoercive, and B is maximally (−ω)-monotone. Let γ ∈ ]0, 2/(β+μ+ω)[. Set T = JγB(Id−γA), set δ = max{1−γμ, γ(μ+β)−1}/(1−γω), and let x0 ∈ X. Then δ ∈ [0,1[, T is a Banach contraction with constant δ, and there exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x.

Proof

Combine Theorem 6.1 and Theorem 6.2. □

Remark 6.4

(Tightness of the Lipschitz constant)

  • (i)

Suppose that the setting of Theorem 6.1 holds. Set (A,B) = (μId, −ωId). Then T = ((1−γμ)/(1−γω))Id. Hence, the claimed Lipschitz constant is tight.

  • (ii)

Suppose that the setting of Theorem 6.2 holds. Set (A,B) = ((μ+β)Id, −ωId). Then T = ((1−γ(μ+β))/(1−γω))Id, so that ‖T‖ = (γ(μ+β)−1)/(1−γω). Hence, the claimed contraction factor is tight.

Note in particular that the worst cases are subgradients of convex functions. Hence, the worst cases are attained by the proximal gradient method.
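The tightness claim in (i) is immediate to confirm in one dimension, where (A,B) = (μId, −ωId) makes T exactly δ·Id (the constants are illustrative choices):

```python
mu, om = 1.0, 0.2
gamma = 0.3                                    # in ]0, 2/(β+2μ)[ for, e.g., β = 4
delta = (1 - gamma * mu) / (1 - gamma * om)
T = lambda x: (x - gamma * mu * x) / (1 - gamma * om)   # JγB(Id-γA) in 1-D
assert abs(T(1.0) - delta) < 1e-15             # T is exactly δ·Id: the bound is attained
```

No nonlinearity is needed: the worst case is already realized by these linear (gradient) operators.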

Theorem 6.5

(Case II: A+ωId is cocoercive)

Let μ≥ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ ]0, 2/(β̄−2ω)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(1+γω)), set δ = (1+γω)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Observe that the assumption on A and Lemma 2.11, applied with T replaced by A+ωId, imply that there exists a nonexpansive operator N: X→X such that A+ωId = (β̄/2)Id + (β̄/2)N. Therefore,

Id−γA = Id − γ(A+ωId) + γωId = (1+γω)Id − (γβ̄/2)(Id+N) (78a)
= (1+γω)((1−ν)Id + ν(−N)). (78b)

Moreover, Proposition 2.16(i) implies that

JγB is (1+γμ)-cocoercive. (79)

Now proceed similarly to the proof of Theorem 6.1, using (78a)–(78b) and (79). □

Theorem 6.6

Let μ>ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ [2/(β̄−2ω), 2/(β̄−μ−ω)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(γβ̄−γω−1)), set δ = (1+γω−γβ̄)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

Observe that the assumption on A and Lemma 2.11, applied with T replaced by A+ωId, implies that there exists a nonexpansive operator N: X→X such that A+ωId = (β̄/2)Id + (β̄/2)N. Therefore,

Id−γA = Id − γ(A+ωId) + γωId = (1+γω)Id − (γβ̄/2)(Id+N) (80a)
= (1+γω−γβ̄)((1−ν)Id + νN). (80b)

Now proceed similarly to the proof of Theorem 6.5 in view of (79). □

Corollary 6.7

Let μ>ω≥0, let β>0, and let β̄ ∈ ]max{β, μ+ω}, +∞[. Suppose that A is maximally (−ω)-monotone, A+ωId is (1/β)-cocoercive, and B is maximally μ-monotone. Let γ ∈ ]0, 2/(β̄−μ−ω)[. Set T = JγB(Id−γA), set δ = max{1+γω, γβ̄−γω−1}/(1+γμ), and let x0 ∈ X. Then δ ∈ [0,1[, T is a Banach contraction with constant δ, and there exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x.

Proof

Combine Theorem 6.5 and Theorem 6.6. □

Theorem 6.8

(Case III: A is β-Lipschitz continuous)

Let μ≥β>0. Suppose that A is β-Lipschitz continuous and that B is maximally μ-monotone. Let β̄ ∈ ]2β, +∞[, and let γ ∈ ]0, 2/(β̄−2β)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(1+γβ)), set δ = (1+γβ)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]0,1] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

T is (1 − δ(1−ν)/(2−ν))-averaged.

  • (iii)

    T is δ-Lipschitz continuous.

  • (iv)

There exists x ∈ Fix T = zer(A+B) such that T^n x0 ⇀ x.

Suppose that μ>β. Then we additionally have:

  • (v)

    T is a Banach contraction with a constant δ<1.

  • (vi)

zer(A+B) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Combine Lemma 2.12 and Theorem 6.5 applied with (ω,β) replaced by (β,2β). □

Theorem 6.9

Let μ>β>0. Suppose that A is β-Lipschitz continuous and that B is maximally μ-monotone. Let β̄ ∈ ]μ+β, +∞[, and let γ ∈ [2/(β̄−2β), 2/(β̄−μ−β)[. Set T = JγB(Id−γA), set ν = γβ̄/(2(γβ̄−γβ−1)), set δ = (1+γβ−γβ̄)/(1+γμ), and let x0 ∈ X. Then δ ∈ ]−1,0] and ν ∈ ]0,1[. Moreover, the following hold:

  • (i)

T = δ((1−ν)Id + νN), where N: X→X is nonexpansive.

  • (ii)

    T is a Banach contraction with a constant |δ|<1.

  • (iii)

There exists x ∈ X such that Fix T = zer(A+B) = {x} and T^n x0 → x with a linear rate |δ|<1.

Proof

Combine Lemma 2.12 and Theorem 6.6 applied with (ω,β) replaced by (β,2β). □

Applications to optimization problems

Let f: X → ]−∞,+∞], and let g: X → ]−∞,+∞]. Throughout this section, we shall assume that

f and g are proper lower semicontinuous functions.

We shall use ∂f to denote the subdifferential mapping from convex analysis.

Definition 7.1

(see [3, Definition 6.1])

An abstract subdifferential # associates a subset #f(x) of X with f at each x ∈ X, and it satisfies the following properties:

  • (i)

#f = ∂f if f is a proper lower semicontinuous convex function;

  • (ii)

#f = ∇f if f is continuously differentiable;

  • (iii)

0 ∈ #f(x) if f attains a local minimum at x ∈ dom f;

  • (iv)

    for every β ∈ ℝ and every x ∈ X, #(f + (β/2)‖·−x‖²) = #f + β(Id−x).

The Clarke–Rockafellar subdifferential, the Mordukhovich subdifferential, and the Fréchet subdifferential all satisfy Definition 7.1(i)–(iv) (see, e.g., [5, 19, 20]), so each of them is an abstract subdifferential #.

Let λ>0. Recall that f is λ-hypoconvex (see [23, 26]) if

f((1−τ)x + τy) ≤ (1−τ)f(x) + τf(y) + (λ/2)τ(1−τ)‖x−y‖² (81)

for all (x,y) ∈ X×X and τ ∈ ]0,1[ or, equivalently, if

f + (λ/2)‖·‖² is convex. (82)

For γ>0, the proximal mapping Proxγf is defined at x ∈ X by

Proxγf(x) = argmin_{y∈X} (f(y) + (1/(2γ))‖x−y‖²). (83)
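As a concrete 1-D illustration of (83) in the hypoconvex setting (the function, constants, and the 1/(2γ) scaling of the penalty are the assumptions of this sketch), the prox of f(y) = −(λ/2)y² has the closed form x/(1−γλ) whenever γ < 1/λ, which makes the argmin single-valued; the grid search is only a crude numerical cross-check:

```python
lam, gamma, x = 0.5, 1.0, 2.0                 # γ < 1/λ = 2: prox is single-valued
closed_form = x / (1 - gamma * lam)           # stationarity: -λy + (y - x)/γ = 0
obj = lambda y: -0.5 * lam * y * y + (y - x) ** 2 / (2 * gamma)
grid = [i * 1e-3 for i in range(-10000, 10001)]   # y in [-10, 10]
num = min(grid, key=obj)
assert abs(num - closed_form) < 1e-2          # grid minimizer matches closed form
```

For γ ≥ 1/λ the objective in (83) loses convexity (and eventually coercivity), so the restriction γ ∈ ]0, 1/λ[ below is essential.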

Fact 7.2

Suppose that f: X → ]−∞,+∞] is a proper lower semicontinuous λ-hypoconvex function. Then

#f = ∂(f + (λ/2)‖·‖²) − λId. (84)

Moreover, we have:

  • (i)

The Clarke–Rockafellar, Mordukhovich, and Fréchet subdifferential operators of f all coincide.

  • (ii)

#f is maximally (−λ)-monotone.

  • (iii)

(∀γ ∈ ]0,1/λ[) Proxγf is single-valued and dom Proxγf = X.

Proof

See [3, Proposition 6.2 and Proposition 6.3]. □

Proposition 7.3

Let μ≥ω≥0. Suppose that argmin(f+g) ≠ ∅ and that one of the following conditions is satisfied:

  • (i)

f is μ-strongly convex and g is ω-hypoconvex.

  • (ii)

f is ω-hypoconvex and g is μ-strongly convex.

Then f+g is convex and #(f+g) = ∂(f+g).

If, in addition, one of the following conditions is satisfied:

  (a) 0 ∈ sri(dom f − dom g).

  (b) X is finite-dimensional and 0 ∈ ri(dom f − dom g).

  (c) X is finite-dimensional, f and g are polyhedral, and dom f ∩ dom g ≠ ∅.

Then

#(f+g) = ∂(f+g) = #f + #g, (85)

and

zer #(f+g) = zer(#f + #g) = argmin(f+g). (86)

Proof

It is clear that either (i) or (ii) implies that f+g is convex, and the identity #(f+g) = ∂(f+g) follows in view of Definition 7.1(i). Now, suppose that (i) holds along with one of the assumptions (a)–(c). Write (f̃, g̃) = (f − (μ/2)‖·‖², g + (ω/2)‖·‖²) and observe that both f̃ and g̃ are convex, as is f̃+g̃. Moreover, we have dom f̃ = dom f and dom g̃ = dom g. Now,

#(f+g) = #(f̃ + g̃ + ((μ−ω)/2)‖·‖²) (87a)
= #(f̃+g̃) + (μ−ω)Id = ∂(f̃+g̃) + (μ−ω)Id (87b)
= ∂f̃ + ∂g̃ + (μ−ω)Id = (∂f̃ + μId) + (∂g̃ − ωId) (87c)
= ∂f + #g = #f + #g. (87d)

Here, (87b) follows from applying Definition 7.1(iv) to f̃+g̃, (87c) follows from [2, Theorem 16.47] applied to f̃ and g̃, and (87d) follows from applying Fact 7.2 to g and Definition 7.1(i) to f; this verifies (85). Finally, (86) follows from combining (85) and [2, Theorem 16.3]. □

The following theorem provides an alternative proof to [17, Theorem 4.4] and [9, Theorem 5.4(ii)].

Theorem 7.4

Let μ>ω≥0, and let γ ∈ ]0, (μ−ω)/(2μω)[. Suppose that one of the following holds:

  • (i)

f is μ-strongly convex and g is ω-hypoconvex;

  • (ii)

f is ω-hypoconvex and g is μ-strongly convex;

and that zer(#f + #g) ≠ ∅ (see Proposition 7.3 for sufficient conditions). Set

T = ½(Id + (2Proxγg − Id)(2Proxγf − Id)) and α = (μ−ω)/(2(μ−ω−γμω)), (88)

and let x0 ∈ X. Then α ∈ ]0,1[, and T is α-averaged. Moreover, there exists x ∈ Fix T such that T^n x0 ⇀ x, argmin(f+g) = {Proxγf x}, and Proxγf T^n x0 → Proxγf x.

Proof

Suppose that (i) holds. Then [2, Example 22.4] (respectively Fact 7.2(ii)) implies that #f = ∂f (respectively #g) is maximally μ-monotone (respectively maximally (−ω)-monotone). The conclusion follows from applying Theorem 5.5(i) with (A,B) replaced by (#f, #g). The proof for (ii) follows similarly by using Theorem 5.5(ii). □

Before we proceed further, we recall the following useful fact.

Fact 7.5

(Baillon–Haddad)

Let f: X → ℝ be a Fréchet differentiable convex function, and let β>0. Then ∇f is β-Lipschitz continuous if and only if ∇f is (1/β)-cocoercive.

Proof

See, e.g., [2, Corollary 18.17]. □

Lemma 7.6

Let μ≥0, let β>0, and let f: X → ℝ be a Fréchet differentiable function. Suppose that f is μ-strongly convex with a β-Lipschitz continuous gradient. Then the following hold:

  • (i)

f − (μ/2)‖·‖² is convex.

  • (ii)

∇f is maximally μ-monotone.

  • (iii)

∇f − μId is (1/β)-cocoercive.

Proof

(i): See, e.g., [2, Proposition 10.8]. (ii): See, e.g., [2, Example 22.4(iv)]. (iii): Combine (i), Lemma 2.10, and Corollary 2.14(ii) applied with (f1,f2) replaced by (f, (μ/2)‖·‖²). □

Theorem 7.7

(The forward–backward algorithm when f is μ-strongly convex)

Let μ≥ω≥0, and let β>0. Let f be μ-strongly convex and Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be ω-hypoconvex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β+2μ)[, and set δ = (1−γμ)/(1−γω). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Note that Definition 7.1(ii) implies that #f = ∇f. Set (A,B) = (∇f, #g) and observe that Proposition 7.3 and Proposition 2.2 imply that Fix T = zer(A+B) = argmin(f+g). It follows from [2, Example 22.4] (respectively Fact 7.2(ii)) that A (respectively B) is maximally μ-monotone (respectively maximally (−ω)-monotone). Moreover, Lemma 7.6(iii) implies that A−μId is (1/β)-cocoercive. (i)–(ii): Apply Theorem 6.1(iv)&(vi). □
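A quadratic instance (our illustrative choice, not taken from the paper) shows the linear rate concretely: take f = (a/2)x² with μ ≤ a ≤ β and g = −(ω/2)x², for which Proxγg = (1−γω)^(−1)·Id and the proximal gradient map is linear with factor (1−γa)/(1−γω):

```python
mu, beta, om = 1.0, 3.0, 0.2
gamma = 0.35                                  # < 2/(β+2μ) = 0.4
delta = (1 - gamma * mu) / (1 - gamma * om)   # rate claimed in Theorem 7.7(ii)
a = 2.0                                       # f = (a/2)x²: μ-strongly convex, ∇f β-Lipschitz
T = lambda x: (1 - gamma * a) * x / (1 - gamma * om)  # Proxγg(Id - γ∇f), g = -(ω/2)x²
x0 = 1.0
x = x0
for n in range(1, 30):
    x = T(x)
    assert abs(x) <= delta ** n * abs(x0) + 1e-12     # linear convergence to argmin = {0}
```

Here argmin(f+g) = {0}, and the per-step contraction |1−γa|/(1−γω) is indeed bounded by δ for every a ∈ [μ, β].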

To proceed to the next result, we need the following lemma.

Lemma 7.8

Let ω≥0, let β>0, and let f: X → ℝ be a Fréchet differentiable function. Suppose that f is ω-hypoconvex with a (1/β)-Lipschitz continuous gradient. Then ∇f + ωId is β/(1+ωβ)-cocoercive.

Theorem 7.9

(The forward–backward algorithm when f is ω-hypoconvex)

Let μ≥ω≥0, let β>0, and let β̄ ∈ ]max{β, 2ω}, +∞[. Let f be ω-hypoconvex and Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be μ-strongly convex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β̄−2ω)[, and set δ = (1+γω)/(1+γμ). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>ω. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Proceed similarly to the proof of Theorem 7.7, but use Theorem 6.5(iv)&(vi). □

Theorem 7.10

(The forward–backward algorithm when f is β-hypoconvex)

Let μ≥β>0, and let β̄ ∈ ]2β, +∞[. Let f be Fréchet differentiable with a β-Lipschitz continuous gradient, and let g be μ-strongly convex. Suppose that argmin(f+g) ≠ ∅. Let γ ∈ ]0, 2/(β̄−2β)[, and set δ = (1+γβ)/(1+γμ). Set T = Proxγg(Id − γ∇f), and let x0 ∈ X. Then the following hold:

  • (i)

There exists x ∈ Fix T = argmin(f+g) such that T^n x0 ⇀ x.

Suppose that μ>β. Then we additionally have:

  • (ii)

Fix T = argmin(f+g) = {x} and T^n x0 → x with a linear rate δ<1.

Proof

Combine Lemma 2.12 applied with A replaced by ∇f and Theorem 7.9 applied with (ω,β) replaced by (β,2β). □

Remark 7.11

The results of Theorem 6.2, Theorem 6.6, and Theorem 6.9 can be applied directly to optimization settings in the same fashion as Theorem 7.7, Theorem 7.9, and Theorem 7.10.

Graphical characterizations

This section contains 2D graphical representations of different Lipschitz continuous operator classes that admit I-N decompositions and of their composition classes. We illustrate the exact shapes of the composition classes in 2D together with the conservative estimates from Theorem 3.4 and Theorem 4.2. Similar graphical representations have appeared before in the literature. In [10, 11], nonexpansiveness and firm nonexpansiveness (½-averagedness) are characterized. Early preprints of [15] have more 2D graphical representations, and the lecture notes [14] contain many such characterizations with the purpose of illustrating how different properties relate to each other and of providing intuition on why different algorithms converge. This has been further extended and formalized in [24]. These illustrations provide more than intuition: it is a straightforward consequence of, e.g., [24, 25] that for compositions of two operator classes that admit I-N decompositions, there always exists a 2D worst case. Hence, if the 2D illustration implies that the composition class admits a specific (α,β)-I-N decomposition, then so does the full operator class.

In Sect. 8.1, we characterize many well-known special cases of operator classes that admit I-N decompositions. In Sect. 8.2, we characterize classes obtained by compositions of such operator classes and highlight differences between the true composition classes and their characterizations using Theorem 3.4.

Single operators

We consider classes of Lipschitz continuous operators that admit (α,β)-I-N decompositions and graphically illustrate properties of some special cases. The illustrations should be read as follows. Assume that x−y is represented by the marker in the figure. The diagram then shows where Rx−Ry can end up in relation to x−y. If the point x−y is rotated in the picture, the rest of the picture rotates with it. The characterization is, by construction of (α,β)-I-N decompositions, always a circle of radius β‖x−y‖ shifted α‖x−y‖ along the line defined by the origin and the point x−y.

Lipschitz continuous operators

Let β>0 and let R: X→X. Then R is β-Lipschitz continuous if and only if R admits an (α,β)-I-N decomposition with α chosen as 0. Figure 1 shows the case β=0.8. The radius of the Lipschitz circle is β‖x−y‖.

Figure 1.

Figure 1

Illustration of β-Lipschitz continuous operator with β=0.8

Cocoercive operators

Let β>0, and let R: X→X. Then R is (1/β)-cocoercive if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (β/2, β/2). Figure 2 shows the cases β=1.4 and β=0.7. The diameter of the circle is β‖x−y‖. The figure clearly illustrates that (1/β)-cocoercive operators are also β-Lipschitz continuous (but not necessarily the other way around).

Figure 2.

Figure 2

Illustration of (1/β)-cocoercive operators with β=0.7 and β=1.4

Averaged operators

Let α ∈ ]0,1[, and let R: X→X. Then R is α-averaged if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (1−α, α). Figure 3 shows the cases α=0.25, α=0.5, and α=0.75. All averaged operators are nonexpansive.

Figure 3.

Figure 3

Illustration of α-averaged operators with α=0.25, α=0.5, and α=0.75

Conic operators

Let α>0, and let R: X→X. Then R is α-conically nonexpansive if and only if R admits an (α,β)-I-N decomposition with (α,β) chosen as (1−α, α). Figure 4 shows the cases α=1.2 and α=1.5. Conically nonexpansive operators fail to be nonexpansive for α>1.

Figure 4.

Figure 4

Illustration of α-conically nonexpansive operators with α=1.2 and α=1.5

μ-Monotone operators

Let μ ∈ ℝ, and suppose that A: X→X is μ-monotone. The shortest distance between the vertical line and the origin in the illustration is |μ|‖x−y‖. Figure 5 shows the case μ=0.2.

Figure 5.

Figure 5

Illustration of μ-monotone operator with μ=0.2

Compositions of two operators

In this section, we provide illustrations of compositions of different classes of Lipschitz continuous operators. We consider compositions of the form

R = R2R1, where Ri admits an (αi,βi)-I-N decomposition,

i ∈ {1,2}. Let (x,y) ∈ X×X. We illustrate the regions within which R2R1x − R2R1y can end up. For most of the composition classes considered, we provide two illustrations. The left illustration explicitly shows how the composition is constructed. It shows the region within which R1x − R1y must end up. The second operator R2 is applied at a subset, marked by crosses, of boundary points of that region. Given these as starting points for the application of R2, the dashed circles show where R2R1x − R2R1y can end up for this subset. The right illustration shows, in gray, the resulting exact shape of the composition. It also contains the estimate from Theorem 3.4 that provides an I-N decomposition of the composition. From these illustrations, it is obvious that many different I-N decompositions are valid. The illustrations also reveal that the specific I-N decompositions provided in Theorem 3.4 are indeed suitable for our purpose of characterizing the composition as averaged, conic, or contractive.

Averaged-averaged composition

We first consider αi-averaged Ri with αi ∈ ]0,1[. A special case is the forward–backward splitting operator T = JγB(Id−γA) with (1/β)-cocoercive A and maximally monotone B. This implies that Id−γA is (γβ/2)-averaged for γ ∈ ]0, 2/β[ and that JγB is ½-averaged. The example in Fig. 6 has individual averagedness parameters α1=0.5 and α2=0.5, i.e., R = R2R1 with R1 = 0.5Id + 0.5N1 and R2 = 0.5Id + 0.5N2. Theorem 3.4 shows that the composition is of the form 0.33Id + 0.67N, where N is nonexpansive, i.e., it is 0.67-averaged. The fact that the composition is averaged is already known; see [8, 12].

Figure 6.

Figure 6

Illustration of composition of α1-averaged and α2-averaged operators with α1=α2=0.5

The example in Fig. 7 has α1=0.7 and α2=0.6. Theorem 3.4 shows that the composition is of the form 0.21Id + 0.79N, where N is nonexpansive, i.e., it is 0.79-averaged.

Figure 7.

Figure 7

Illustration of composition of α1-averaged and α2-averaged operators with α1=0.7 and α2=0.6
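The two quoted constants can be reproduced from the standard averaged-composition formula (we assume here that this is what Theorem 3.4 specializes to in the averaged case; the function name is ours):

```python
def compose_averaged(a1, a2):
    # assumed averagedness parameter of the composition of an
    # a1-averaged and an a2-averaged operator (requires a1*a2 < 1)
    return (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)

assert round(compose_averaged(0.5, 0.5), 2) == 0.67   # constant quoted for Fig. 6
assert round(compose_averaged(0.7, 0.6), 2) == 0.79   # constant quoted for Fig. 7
```

Both figures' averagedness constants follow from the same two-parameter expression.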

Conic-conic composition

We now consider αi-conically nonexpansive Ri with αi>0. Several examples in this setting arise for Douglas–Rachford splitting and forward–backward splitting in Sect. 5 and Sect. 6. We know from Theorem 4.2 that the composition is conic if α1α2<1. The example in Fig. 8 has α1=1.7 and α2=0.45, which satisfies α1α2=0.765<1. Theorem 4.2 shows that the composition is of the form −1.64Id + 2.64N, where N is nonexpansive, i.e., it is 2.64-conic.

Figure 8.

Figure 8

Illustration of composition of α1-conic operator and α2-averaged operator with α1=1.7 and α2=0.45

In Example 4.6, we showed that the assumption α1α2<1 is critical for the composition to be conic. Figure 9 illustrates the case α1=1.7 and α2=0.7, which satisfies α1α2=1.19>1; hence Theorem 4.2 cannot be used to deduce that the composition is conic. Indeed, we see from the figure that the composition is not conic: it is impossible to draw a circle that touches the marker at x−y and extends only to the left.

Figure 9.

Figure 9

Illustration of composition of α1-conic operator and α2-averaged operator with α1=1.7 and α2=0.7

We conclude the conic-composed-with-conic examples with a forward–backward example. The forward–backward splitting operator JγB(Id−γA), with A (1/β)-cocoercive and B (maximally) monotone, is the composition of the ½-averaged resolvent JγB and the (γβ/2)-conic forward step Id−γA. The composition R = R2R1 with Ri αi-conic is conic if α1α2<1 (Theorem 4.2). In the forward–backward setting, this corresponds to γ ∈ ]0, 4/β[, which doubles the allowed range compared to guaranteeing an averaged composition. This extended range has been shown before, e.g., in [13, 18].
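The same two-parameter expression is assumed here to extend to the conic regime; a short check confirms both the 2.64-conic value quoted for Fig. 8 and the γ < 4/β threshold (the function name and the closed form are our illustrative assumptions):

```python
def conic_compose(a1, a2):
    # assumed conic parameter from Theorem 4.2 (valid only when a1*a2 < 1)
    assert a1 * a2 < 1
    return (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)

assert abs(conic_compose(1.7, 0.45) - 2.64) < 5e-3     # constant quoted for Fig. 8
beta = 1.0
gamma = 3.9 / beta
a1, a2 = gamma * beta / 2, 0.5    # forward step is (γβ/2)-conic, resolvent ½-averaged
assert a1 * a2 < 1                # a1*a2 < 1  ⟺  γ < 4/β
assert abs(conic_compose(a1, a2) - 20.0) < 0.1         # ≈ the 19.99-conic map of Fig. 10
```

As γ approaches 4/β the denominator 1−α1α2 vanishes, which is exactly the blow-up of the conic radius described for Fig. 10 below.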

In Fig. 10, we illustrate the forward–backward setting with γ = 3.9/β. This corresponds to conic parameters α1=1.95 and α2=0.5, i.e., R = R2R1 with R1 = −0.95Id + 1.95N1 and R2 = 0.5Id + 0.5N2. The composition is of the form −18.99Id + 19.99N, where N is nonexpansive, i.e., it is 19.99-conic (Theorem 4.2). The left figure shows the resulting composition and (parts of) the conic approximation. The conic approximation is very large compared to the actual region. This is due to the local behavior around the point x−y, where the exact shape is almost vertical. As γ → 4/β, the exact shape approaches being vertical around x−y and the radius of the conic circle tends to infinity. For γ > 4/β, the exact shape extends to the right of x−y (as in Fig. 9), and the composition is not conic.

Figure 10.

Figure 10

To the left is an illustration of the forward–backward composition JγB(Id−γA) with γ = 3.9/β, where 1/β is the cocoercivity constant of A. It is a composition of an α1-conic operator and an α2-averaged operator with α1=1.95 and α2=0.5. To the right is an illustration of a θ-relaxation of the same forward–backward map with θ=0.04

In the right figure, we consider the relaxed forward–backward map (1−θ)Id + θJγB(Id−γA) with θ>0. If the composition JγB(Id−γA) is α-conic, it is straightforward to verify that the relaxed map is θα-conic. Therefore, any θ ∈ ]0, α^{−1}[ gives a θα-averaged relaxed forward–backward map. An averaged map is needed to guarantee convergence to a fixed point when iterated. In the figure, we let θ=0.04, which satisfies θ < α^{−1} ≈ 0.05. The approximation is indeed averaged, but the region within which the composition can end up is very small compared to the conic approximation.

Scaled averaged and cocoercive compositions

Compositions of scaled averaged and cocoercive operators are also special cases of the scaled conic compositions treated in Theorem 4.2. This setting covers the forward–backward examples in Sect. 6, where the identity is shifted between the operators and the sum is (strongly) monotone. The operators in the composition are of the form R1 = δ1((1−α1)Id + α1N1) and R2 = (β2/2)(Id + N2), where α1 ∈ ]0,1[, δ1>0, and β2>0.

In Fig. 11, we consider the forward–backward setting of Theorem 6.5. The forward–backward map is JγB(Id−γA), and we let A+0.3Id be 1-cocoercive and B be maximally 0.3-monotone. That is, we have shifted 0.3Id from A to B, and the sum is monotone. We use step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.6-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.27Id + 0.73N, where N is nonexpansive, i.e., it is 0.73-averaged.

Figure 11.

Figure 11

Illustration of composition of 1.6-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.3Id 1-cocoercive, B 0.3-monotone, and γ=2

Figure 12 considers a similar forward–backward setting, but with a strongly monotone sum. We let A+0.2Id be 1-cocoercive and B be maximally 0.3-monotone, which implies that the sum is 0.1-strongly monotone. We keep the step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.4-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.19Id + 0.68N, where N is nonexpansive, i.e., it is 0.87-contractive.

Figure 12.

Figure 12

Illustration of composition of 1.4-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.2Id 1-cocoercive, B 0.3-monotone, and γ=2

The final example in Fig. 13 considers a similar forward–backward setting where the sum is not monotone. We let A+0.4Id be 1-cocoercive, B be maximally 0.3-monotone, which implies that the sum is −0.1-monotone, i.e., it is not monotone. We use step-length γ=2. The proof of Theorem 6.5 shows that, in our setting, R1 is 1.8-scaled 0.62-averaged and that R2 is 1.6-cocoercive. Theorem 3.4 implies that the composition is of the form 0.35Id+0.78N, where N is nonexpansive, i.e., it is 1.12-Lipschitz and not conic, averaged, or contractive.

Figure 13.

Figure 13

Illustration of composition of 1.8-scaled 0.62-averaged operator with 1.6-cocoercive operator. The composition comes from the forward–backward map JγB(IdγA) with A+0.4Id 1-cocoercive, B 0.3-monotone, and γ=2

Acknowledgements

Not applicable.

Appendix A

Proof of Lemma 2.3

Indeed, observe that

R(λ) = (1−2λ)Id + λ(Id + R2R1) (89)

and

Id − R(λ) = λ(Id − R2R1). (90)

In view of (89) and (90) we have

⟨R(λ)x − R(λ)y, (Id−R(λ))x − (Id−R(λ))y⟩
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²⟨(x−y) − (R2R1x − R2R1y), (x−y) + (R2R1x − R2R1y)⟩
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²(‖x−y‖² − ‖R2R1x − R2R1y‖²)
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²(‖x−y‖² − ‖R1x − R1y‖² + ‖R1x − R1y‖² − ‖R2R1x − R2R1y‖²)
= (1−2λ)⟨x−y, (Id−R(λ))x − (Id−R(λ))y⟩ + λ²⟨(Id+R1)x − (Id+R1)y, (Id−R1)x − (Id−R1)y⟩ + λ²⟨(Id+R2)R1x − (Id+R2)R1y, (Id−R2)R1x − (Id−R2)R1y⟩,

and the conclusion follows. □

Appendix B

Proof of Lemma 2.12

(i): Because (1/β)A is nonexpansive, we learn from [2, Example 20.7] that Id + (1/β)A, and therefore also βId + A, is maximally monotone. The conclusion now follows in view of, e.g., [3, Lemma 2.5]. (ii): This is clear by observing that (1/(2β))(βId + A) = ½(Id + (1/β)A). □

Appendix C

Proof of Lemma 2.13

Indeed, by assumption, there exist nonexpansive mappings N1:XX and N2:XX such that

T1 = (β/2)Id + (β/2)N1, T2 = (δ/2)Id + (δ/2)N2. (91)

Now,

(1/β)(T1 − T2) = (1/β)T1 − (1/β)T2 = ½Id + ½N1 − (δ/(2β))Id − (δ/(2β))N2 (92a)
= ((β−δ)/(2β))Id + ½N1 − (δ/(2β))N2. (92b)

Using the triangle inequality, one can directly verify that (1/β)(T1 − T2) is Lipschitz continuous with constant (β−δ)/(2β) + 1/2 + δ/(2β) = 1. The proof is complete. □
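A 1-D numerical check of this Lipschitz claim, with the illustrative nonexpansive choices N1 = sin and N2 = cos (our assumptions, with δ ≤ β), confirms that T1 − T2 is β-Lipschitz, i.e., (1/β)(T1 − T2) is nonexpansive:

```python
import math, itertools

beta, delta = 2.0, 1.0                         # here δ ≤ β
T1 = lambda x: beta / 2 * (x + math.sin(x))    # (1/β)-cocoercive, cf. footnote 3
T2 = lambda x: delta / 2 * (x + math.cos(x))   # (1/δ)-cocoercive
pts = [i / 5 for i in range(-15, 16)]
for x, y in itertools.combinations(pts, 2):
    # |(T1 - T2)x - (T1 - T2)y| ≤ β |x - y|
    assert abs((T1(x) - T2(x)) - (T1(y) - T2(y))) <= beta * abs(x - y) + 1e-12
```

The constant β is the one produced by the triangle-inequality computation above.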

Appendix D

Proof of Corollary 2.14

(i): It follows from Fact 7.5 that ∇f1 (respectively ∇f2) is (1/β)-cocoercive (respectively (1/δ)-cocoercive). Now apply Lemma 2.13 with (T1,T2) replaced by (∇f1, ∇f2). (ii): Combine (i) with Fact 7.5 applied with f replaced by f1 − f2. □

Appendix E

Proof of Lemma 2.15

(i): Indeed, we have δT = (1 − (1 − δ(1−α)))Id + δαN = (1 − (1 − δ(1−α)))Id + (1 − δ(1−α))Ñ, where Ñ = (δα/(1 − δ(1−α)))N. Note that δα/(1 − δ(1−α)) ≤ 1; hence Ñ is nonexpansive and the conclusion follows. (ii): Clear. □

Authors’ contributions

All authors contributed equally in writing this article. All authors read and approved the final manuscript.

Funding

PG was partially supported by the Swedish Research Council and the Wallenberg AI, Autonomous Systems and Software Program (WASP). WMM was partially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (NSERC-DG).

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Declarations

Competing interests

The authors declare that they have no competing interests.

Footnotes

1

Let T: X→X. Then T is α-averaged if α ∈ ]0,1[ and there exists a nonexpansive operator N: X→X such that T = (1−α)Id + αN.

2

Let T: X→X. Then T is α-conically nonexpansive if α ∈ ]0,+∞[ and there exists a nonexpansive operator N: X→X such that T = (1−α)Id + αN.

3

Let T: X→X, and let β>0. Then T is (1/β)-cocoercive if there exists a nonexpansive operator N: X→X such that T = (β/2)(Id + N).

4

The paper [1] appeared online while putting the finishing touches on this paper. Partial results of this work were presented by the second author at the Numerical Algorithms in Nonsmooth Optimization workshop at Erwin Schrödinger International Institute for Mathematics and Physics (ESI) in Vienna in February 2019 and at the Operator Splitting Methods in Data Analysis workshop at the Flatiron Institute, in New York in March 2019. Both workshops predate [1].

5

Let A: X → X be an operator. The resolvent of A, denoted by JA, is defined by JA = (Id + A)⁻¹, and the reflected resolvent of A, denoted by RA, is defined by RA = 2JA − Id.
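Footnote 5 can be illustrated numerically for a linear monotone operator, where the resolvent and reflected resolvent are explicit matrix formulas. In this sketch, A is a skew-symmetric matrix (our illustrative choice, which is monotone since ⟨Ax, x⟩ = 0):

```python
import numpy as np

# A monotone linear operator on R^2: a skew-symmetric matrix, <Ax, x> = 0.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
I = np.eye(2)

J_A = np.linalg.inv(I + A)  # resolvent J_A = (Id + A)^(-1)
R_A = 2 * J_A - I           # reflected resolvent R_A = 2*J_A - Id

# For a monotone A, J_A is firmly nonexpansive and R_A is nonexpansive,
# so both operator norms are at most 1.
print(np.linalg.norm(J_A, 2))
print(np.linalg.norm(R_A, 2))
```

For this particular A, the reflected resolvent turns out to be a rotation, so its operator norm is exactly 1, consistent with nonexpansiveness.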

6

In passing, we mention that [2, Proposition 26.1(iv)] assumes that A and B are maximally monotone, which is not required here; however, the proof is the same.

7

Here and elsewhere, we use R+ to denote the interval [0, +∞[.

8

The assumption that β ∈ R+ is not restrictive. Indeed, since N is nonexpansive if and only if −N is, an operator admits an (α, β)-I-N decomposition if and only if it admits an (α, −β)-I-N decomposition. This is the reason why we define it only for nonnegative β.

9

Let C be a nonempty, closed convex subset of X. Here and elsewhere, we shall use NC to denote the normal cone operator associated with C, defined by NC(x) = {u ∈ X | sup⟨C − x, u⟩ ≤ 0} if x ∈ C, and NC(x) = ∅ otherwise.
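As a concrete illustration (the box C = [0, 1]^n is our choice, not from the paper), the resolvent of NC is the projection PC, i.e., x = PC(z) if and only if z − x ∈ NC(x). The following sketch checks the defining inequality sup⟨C − x, z − x⟩ ≤ 0 numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative closed convex set: the box C = [0, 1]^n. Its projection P_C
# is coordinatewise clipping, and the resolvent of N_C equals P_C.
def P_C(z):
    return np.clip(z, 0.0, 1.0)

# x = P_C(z) is characterized by z - x in N_C(x), i.e.
# sup over c in C of <c - x, z - x> is at most 0.
for _ in range(1000):
    z = 3.0 * rng.standard_normal(3)
    x = P_C(z)
    u = z - x
    # Over the box, the sup is attained coordinatewise at c_i = 0 or c_i = 1.
    sup_val = np.sum(np.maximum(u * (0.0 - x), u * (1.0 - x)))
    assert sup_val <= 1e-12
print("characterization verified on 1000 random points")
```

The separable structure of the box makes the supremum over C computable coordinate by coordinate, which is why a one-line check suffices here.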

Contributor Information

Pontus Giselsson, Email: pontus.giselsson@control.lth.se.

Walaa M. Moursi, Email: walaa.moursi@uwaterloo.ca

References

  • 1. Bartz, S., Dao, M.N., Phan, H.M.: Conical averagedness and convergence analysis of fixed point algorithms (2019). arXiv:1910.14185
  • 2. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)
  • 3. Bauschke, H.H., Moursi, W.M., Wang, X.: Generalized monotone operators and their averaged resolvents. Math. Program., Ser. B (2020). doi:10.1007/s10107-020-01500-6
  • 4. Burachik, R.S., Iusem, A.N.: Set-Valued Mappings and Enlargements of Monotone Operators. Springer, Berlin (2007)
  • 5. Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)
  • 6. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5–6), 475–504 (2004). doi:10.1080/02331930412331327157
  • 7. Combettes, P.L., Pennanen, T.: Proximal methods for cohypomonotone operators. SIAM J. Control Optim. 43(2), 731–742 (2004). doi:10.1137/S0363012903427336
  • 8. Combettes, P.L., Yamada, I.: Compositions and convex combinations of averaged nonexpansive operators. J. Math. Anal. Appl. 425(1), 55–70 (2015). doi:10.1016/j.jmaa.2014.11.044
  • 9. Dao, M.N., Phan, H.M.: Adaptive Douglas–Rachford splitting algorithm for the sum of two operators. SIAM J. Optim. 29(4), 2697–2724 (2019). doi:10.1137/18M121160X
  • 10. Eckstein, J.: Splitting methods for monotone operators with applications to parallel optimization. Ph.D. thesis, MIT (1989)
  • 11. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1), 293–318 (1992). doi:10.1007/BF01581204
  • 12. Giselsson, P.: Tight global linear convergence rate bounds for Douglas–Rachford splitting. J. Fixed Point Theory Appl. 19, 2241–2270 (2017). doi:10.1007/s11784-017-0417-1
  • 13. Giselsson, P.: Nonlinear forward–backward splitting with projection correction (2019). arXiv:1908.07449
  • 14. Giselsson, P.: Lecture notes on large-scale convex optimization (2015). http://control.lth.se/education/doctorate-program/large-scale-convex-optimization/
  • 15. Giselsson, P., Boyd, S.: Linear convergence and metric selection for Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62(2), 532–544 (2017). doi:10.1109/TAC.2016.2564160
  • 16. Guo, K., Han, D.: A note on the Douglas–Rachford splitting method for optimization problems involving hypoconvex functions. J. Glob. Optim. 72(3), 431–441 (2018). doi:10.1007/s10898-018-0660-z
  • 17. Guo, K., Han, D., Yuan, X.: Convergence analysis of Douglas–Rachford splitting method for strongly + weakly convex programming. SIAM J. Numer. Anal. 55(4), 1549–1577 (2017). doi:10.1137/16M1078604
  • 18. Latafat, P., Patrinos, P.: Asymmetric forward–backward-adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 68(1), 57–93 (2017). doi:10.1007/s10589-017-9909-6
  • 19. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation I: Basic Theory. Springer, Berlin (2006)
  • 20. Mordukhovich, B.S.: Variational Analysis and Applications. Springer (2018)
  • 21. Ogura, N., Yamada, I.: Non-strictly convex minimization over the fixed point set of an asymptotically shrinking nonexpansive mapping. Numer. Funct. Anal. Optim. 22(1–2), 113–137 (2002). doi:10.1081/NFA-120003674
  • 22. Reich, S.: On the asymptotic behavior of nonlinear semigroups and the range of accretive operators. J. Math. Anal. Appl. 79(1), 113–126 (1981). doi:10.1016/0022-247X(81)90013-5
  • 23. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
  • 24. Ryu, E.K., Hannah, R., Yin, W.: Scaled relative graph: nonexpansive operators via 2D Euclidean geometry (2019). arXiv:1902.09788
  • 25. Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020). doi:10.1137/19M1304854
  • 26. Wang, X.: On Chebyshev functions and Klee functions. J. Math. Anal. Appl. 368(1), 293–310 (2010). doi:10.1016/j.jmaa.2010.03.041
  • 27. Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27(1), 124–145 (2017). doi:10.1137/16M1055323



Articles from Fixed Point Theory and Algorithms for Sciences and Engineering are provided here courtesy of Springer
