ABSTRACT
We investigate a forward–backward splitting algorithm of penalty type with inertial effects for finding the zeros of the sum of a maximally monotone operator, a cocoercive operator and the convex normal cone to the set of zeros of another cocoercive operator. Weak ergodic convergence is obtained for the iterates, provided that a condition expressed via the Fitzpatrick function of the operator describing the underlying set of the normal cone is satisfied. Under strong monotonicity assumptions, strong convergence of the sequence of generated iterates is proved. As a particular instance we consider a convex bilevel minimization problem, with the sum of a non-smooth and a smooth function in the upper level and another smooth function in the lower level. We show that in this context weak non-ergodic and strong convergence can also be achieved under inf-compactness assumptions for the involved functions.
KEYWORDS: Maximally monotone operator, Fitzpatrick function, forward–backward splitting algorithm, convex bilevel optimization
AMS SUBJECT CLASSIFICATIONS: 47H05, 65K05, 90C25
1. Introduction and preliminaries
1.1. Motivation and problem formulation
During the last couple of years one can observe in the optimization community an increasing interest in numerical schemes for solving variational inequalities expressed as monotone inclusion problems of the form

\[ 0 \in Ax + N_M(x), \tag{1} \]

where $\mathcal{H}$ is a real Hilbert space, $A : \mathcal{H} \rightrightarrows \mathcal{H}$ is a maximally monotone operator, $M = \operatorname{argmin}_{\mathcal{H}} h$ is the set of global minima of the proper, convex and lower semicontinuous function $h : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ and $N_M$ is the normal cone of the set $M$. The article [1] was the starting point for a series of papers [2–12] addressing this topic or related ones. All these papers share the common feature that the proposed iterative schemes use penalization strategies, namely, they evaluate the penalty function $h$ via its gradient, in case the function is smooth (see, for instance, [3]), and via its proximal operator, in case it is non-smooth (see, for instance, [4]).
Weak ergodic convergence has been obtained in [3,4] under the hypothesis

\[ \sum_{n \geq 1} \lambda_n \beta_n \left[ h^*\!\left(\frac{p}{\beta_n}\right) - \sigma_M\!\left(\frac{p}{\beta_n}\right) \right] < +\infty \quad \text{for all } p \in \operatorname{ran} N_M, \tag{2} \]

with $(\lambda_n)_{n \geq 1}$ the sequence of step sizes, $(\beta_n)_{n \geq 1}$ the sequence of penalty parameters, $h^*$ the Fenchel conjugate function of $h$, $\sigma_M$ the support function of $M$ and $\operatorname{ran} N_M$ the range of the normal cone operator $N_M$. Let us mention that (2) is the discretized counterpart of a condition introduced in [1] for continuous-time non-autonomous differential inclusions.
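To give (2) a concrete shape, consider the classical penalty $h = \frac{1}{2}\operatorname{dist}^2(\cdot, M)$ for a non-empty, convex and closed set $M$; the short computation below is our own illustration and not part of the cited statements. Since $h^* = \frac{1}{2}\|\cdot\|^2 + \sigma_M$, one gets

\[ h^*\!\left(\frac{p}{\beta_n}\right) - \sigma_M\!\left(\frac{p}{\beta_n}\right) = \frac{\|p\|^2}{2\beta_n^2}, \]

so in this case (2) reduces to the simple summability condition $\sum_{n \geq 1} \lambda_n / \beta_n < +\infty$.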
One motivation for studying numerical algorithms for monotone inclusions of type (1) comes from the fact that, when $A = \partial f$ is the convex subdifferential of a proper, convex and lower semicontinuous function $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$, they furnish iterative methods for solving bilevel optimization problems of the form

\[ \min_{x \in \mathcal{H}} f(x) \quad \text{subject to} \quad x \in \operatorname{argmin}_{\mathcal{H}} h. \tag{3} \]
Among the applications where bilevel programming problems play an important role we mention the modelling of Stackelberg games, the determination of Wardrop equilibria for network flows, convex feasibility problems [13], domain decomposition methods for PDEs [14], image processing problems [6], and optimal control problems [4].
Later on, in [7], the following monotone inclusion problem, which turned out to be more suitable for applications, was addressed in the same spirit of penalty algorithms:

\[ 0 \in Ax + Dx + N_M(x), \qquad M = \operatorname{zer} B, \tag{4} \]

where $A : \mathcal{H} \rightrightarrows \mathcal{H}$ is a maximally monotone operator, $D : \mathcal{H} \to \mathcal{H}$ is a cocoercive operator and the constraint set $M$ is the set of zeros of another cocoercive operator $B : \mathcal{H} \to \mathcal{H}$. The provided algorithm of forward–backward type evaluates the operator $A$ by a backward step and the two single-valued operators by forward steps. For the convergence analysis, (2) has been replaced by a condition formulated in terms of the Fitzpatrick function associated with the operator $B$, which we will also use in this paper. In [5], several particular situations in which this new condition is fulfilled have been provided.
The aim of this work is to endow the forward–backward penalty scheme for solving (4) from [7] with inertial effects, which means that the new iterate is defined in terms of the previous two iterates. Inertial algorithms have their roots in the time discretization of second-order differential systems [15]. They can accelerate the convergence of the iterates when minimizing a differentiable function [16] and the convergence of the objective function values when minimizing the sum of a convex non-smooth and a convex smooth function [17,18]. Moreover, as emphasized in [19], see also [20], algorithms with inertial effects may detect optimal solutions of minimization problems which cannot be found by their non-inertial variants. In recent years, inertial algorithms have attracted considerable interest (see, for instance, [8,9,15,17,20–32]).
We prove weak ergodic convergence of the sequence generated by the inertial forward–backward penalty algorithm to a solution of the monotone inclusion problem (4), under reasonable assumptions for the sequences of step sizes, penalty and inertial parameters. When the operator A is assumed to be strongly monotone, we also prove strong convergence of the generated iterates to the unique solution of (4).
In Section 3, we address the minimization of the sum of a convex non-smooth and a convex smooth function over the set of minimizers of another convex and smooth function. Besides the convergence results obtained from the general case, we achieve weak non-ergodic and strong convergence statements under inf-compactness assumptions for the involved functions. The weak non-ergodic theorem is a useful alternative to the one in [9], where a similar statement has been obtained for the inertial forward–backward penalty algorithm with constant inertial parameter under assumptions which are quite complicated and hard to verify (see also [11,12]).
1.2. Notations and preliminaries
In this subsection we introduce some notions and basic results which we will use throughout this paper (see [33–35]). Let $\mathcal{H}$ be a real Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and associated norm $\|\cdot\| = \sqrt{\langle \cdot, \cdot \rangle}$.

For a function $\Psi : \mathcal{H} \to \mathbb{R} \cup \{\pm\infty\}$, we denote by $\operatorname{dom} \Psi = \{x \in \mathcal{H} : \Psi(x) < +\infty\}$ its effective domain and say that $\Psi$ is proper, if $\operatorname{dom} \Psi \neq \emptyset$ and $\Psi(x) > -\infty$ for all $x \in \mathcal{H}$. The conjugate function of $\Psi$ is $\Psi^* : \mathcal{H} \to \mathbb{R} \cup \{\pm\infty\}$, $\Psi^*(p) = \sup_{x \in \mathcal{H}} \{\langle p, x \rangle - \Psi(x)\}$. The convex subdifferential of $\Psi$ at the point $x \in \mathcal{H}$ is the set $\partial \Psi(x) = \{p \in \mathcal{H} : \Psi(y) \geq \Psi(x) + \langle p, y - x \rangle \text{ for all } y \in \mathcal{H}\}$, whenever $\Psi(x) \in \mathbb{R}$. We take by convention $\partial \Psi(x) = \emptyset$, if $\Psi(x) \in \{\pm\infty\}$.

Let $M$ be a non-empty subset of $\mathcal{H}$. The indicator function of $M$, which is denoted by $\delta_M$, takes the value $0$ on $M$ and $+\infty$ otherwise. The convex subdifferential of the indicator function is the normal cone of $M$, that is $N_M(x) = \{p \in \mathcal{H} : \langle p, y - x \rangle \leq 0 \text{ for all } y \in M\}$, if $x \in M$, and $N_M(x) = \emptyset$ otherwise. Notice that for $x \in M$ we have $p \in N_M(x)$ if and only if $\sigma_M(p) = \langle p, x \rangle$, where $\sigma_M = \delta_M^*$ is the support function of $M$.
For an arbitrary set-valued operator $A : \mathcal{H} \rightrightarrows \mathcal{H}$ we denote by $\operatorname{Gr} A = \{(x, u) \in \mathcal{H} \times \mathcal{H} : u \in Ax\}$ its graph, by $\operatorname{dom} A = \{x \in \mathcal{H} : Ax \neq \emptyset\}$ its domain, by $\operatorname{ran} A = \{u \in \mathcal{H} : \exists x \in \mathcal{H} \text{ with } u \in Ax\}$ its range and by $A^{-1} : \mathcal{H} \rightrightarrows \mathcal{H}$ its inverse operator, defined by $(u, x) \in \operatorname{Gr} A^{-1}$ if and only if $(x, u) \in \operatorname{Gr} A$. We also use the notation $\operatorname{zer} A = \{x \in \mathcal{H} : 0 \in Ax\}$ for the set of zeros of the operator $A$. We say that $A$ is monotone, if $\langle x - y, u - v \rangle \geq 0$ for all $(x, u), (y, v) \in \operatorname{Gr} A$. A monotone operator $A$ is said to be maximally monotone, if there exists no proper monotone extension of the graph of $A$ on $\mathcal{H} \times \mathcal{H}$. Let us mention that if $A$ is maximally monotone, then $\operatorname{zer} A$ is a convex and closed set [33, Proposition 23.39]. We refer to [33, Section 23.4] for conditions ensuring that $\operatorname{zer} A$ is non-empty. If $A$ is maximally monotone, then one has the following characterization for the set of its zeros:

\[ u \in \operatorname{zer} A \quad \text{if and only if} \quad \langle y - u, w \rangle \geq 0 \ \text{for all } (y, w) \in \operatorname{Gr} A. \tag{5} \]

The operator $A$ is said to be $\gamma$-strongly monotone with $\gamma > 0$, if $\langle x - y, u - v \rangle \geq \gamma \|x - y\|^2$ for all $(x, u), (y, v) \in \operatorname{Gr} A$. If $A$ is maximally monotone and strongly monotone, then $\operatorname{zer} A$ is a singleton, thus non-empty [33, Corollary 23.27].

The resolvent of $A$, $J_A : \mathcal{H} \rightrightarrows \mathcal{H}$, is defined by $J_A = (\operatorname{Id} + A)^{-1}$, where $\operatorname{Id}$ denotes the identity operator on $\mathcal{H}$. If $A$ is maximally monotone, then $J_A : \mathcal{H} \to \mathcal{H}$ is single-valued and maximally monotone [33, Proposition 23.7, Corollary 23.10]. For an arbitrary $\gamma > 0$, we have the following identity [33, Proposition 23.18]:

\[ J_{\gamma A}(x) + \gamma J_{\gamma^{-1} A^{-1}}\!\left(\frac{x}{\gamma}\right) = x \quad \text{for all } x \in \mathcal{H}. \]
We denote by $\Gamma(\mathcal{H})$ the family of proper, convex and lower semicontinuous extended real-valued functions defined on $\mathcal{H}$. When $\Psi \in \Gamma(\mathcal{H})$ and $\gamma > 0$, we denote by $\operatorname{prox}_{\gamma \Psi}(x)$ the proximal point with parameter $\gamma$ of the function $\Psi$ at the point $x \in \mathcal{H}$, which is the unique optimal solution of the optimization problem

\[ \min_{y \in \mathcal{H}} \left\{ \Psi(y) + \frac{1}{2\gamma} \|y - x\|^2 \right\}. \]

Notice that $\operatorname{prox}_{\gamma \Psi} = J_{\gamma \partial \Psi}$, thus $\operatorname{prox}_{\gamma \Psi} : \mathcal{H} \to \mathcal{H}$ is a single-valued operator fulfilling the so-called Moreau decomposition formula:

\[ \operatorname{prox}_{\gamma \Psi}(x) + \gamma \operatorname{prox}_{(1/\gamma) \Psi^*}\!\left(\frac{x}{\gamma}\right) = x \quad \text{for all } x \in \mathcal{H}. \]
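For illustration purposes only (the example is ours and not part of the cited references): for $\Psi = \|\cdot\|_1$ on $\mathbb{R}^d$ the proximal operator is the componentwise soft-thresholding map, and Moreau's decomposition can be verified numerically. The following sketch assumes NumPy is available.

import numpy as np

def prox_l1(x, gamma):
    # prox_{gamma * ||.||_1}(x): componentwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def project_linf_ball(x):
    # Prox of the conjugate (||.||_1)^* = indicator of the unit l_inf ball,
    # i.e. the projection onto {y : ||y||_inf <= 1}; scaling the indicator
    # by 1/gamma does not change the projection.
    return np.clip(x, -1.0, 1.0)

gamma = 0.7
x = np.array([2.0, -0.3, 0.5, -1.4])
# Moreau's decomposition: x = prox_{gamma Psi}(x) + gamma * prox_{Psi*/gamma}(x/gamma)
assert np.allclose(prox_l1(x, gamma) + gamma * project_linf_ball(x / gamma), x)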
The function $\Psi : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ is said to be $\gamma$-strongly convex with $\gamma > 0$, if $\Psi - \frac{\gamma}{2}\|\cdot\|^2$ is a convex function. This property implies that $\partial \Psi$ is $\gamma$-strongly monotone.
The Fitzpatrick function [36] associated to a monotone operator $A$ is defined as

\[ \varphi_A : \mathcal{H} \times \mathcal{H} \to \mathbb{R} \cup \{+\infty\}, \quad \varphi_A(x, u) = \sup_{(y, v) \in \operatorname{Gr} A} \{\langle x, v \rangle + \langle y, u \rangle - \langle y, v \rangle\}, \]

and it is a convex and lower semicontinuous function. For insights into the outstanding role played by the Fitzpatrick function in relating convex analysis with the theory of monotone operators we refer to [33,34,37–39] and the references therein. If $A$ is maximally monotone, then $\varphi_A$ is proper and it fulfills

\[ \varphi_A(x, u) \geq \langle x, u \rangle \quad \text{for all } (x, u) \in \mathcal{H} \times \mathcal{H}, \]

with equality if and only if $(x, u) \in \operatorname{Gr} A$. Notice that if $\Psi \in \Gamma(\mathcal{H})$, then $\partial \Psi$ is a maximally monotone operator and it holds $(\partial \Psi)^{-1} = \partial \Psi^*$. Furthermore, the following inequality is true (see [37]):

\[ \varphi_{\partial \Psi}(x, u) \leq \Psi(x) + \Psi^*(u) \quad \text{for all } (x, u) \in \mathcal{H} \times \mathcal{H}. \tag{6} \]
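As a quick sanity check of (6), which we add here for orientation, take $A = \operatorname{Id} = \partial \Psi$ for $\Psi = \frac{1}{2}\|\cdot\|^2 = \Psi^*$. Then

\[ \varphi_{\operatorname{Id}}(x, u) = \sup_{y \in \mathcal{H}} \left\{ \langle y, x + u \rangle - \|y\|^2 \right\} = \frac{1}{4}\|x + u\|^2 \leq \frac{1}{2}\|x\|^2 + \frac{1}{2}\|u\|^2, \]

with equality if and only if $u = x$, that is, if and only if $(x, u) \in \operatorname{Gr} \operatorname{Id}$.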
We present in the following some statements that will be essential when carrying out the convergence analysis. Let $(x_n)_{n \geq 1}$ be a sequence in $\mathcal{H}$ and $(\lambda_n)_{n \geq 1}$ be a sequence of positive real numbers. The sequence $(z_n)_{n \geq 1}$ of weighted averages is defined for every $n \geq 1$ as

\[ z_n = \frac{1}{\sum_{k=1}^{n} \lambda_k} \sum_{k=1}^{n} \lambda_k x_k. \tag{7} \]
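In implementations, the ergodic sequence (7) can be tracked via a running weighted sum instead of storing all iterates; the following small sketch (our own, assuming NumPy) shows one way to do this.

import numpy as np

class ErgodicAverage:
    """Running weighted average z_n = (sum_{k<=n} lam_k x_k) / (sum_{k<=n} lam_k), cf. (7)."""
    def __init__(self, dim):
        self.weighted_sum = np.zeros(dim)
        self.total_weight = 0.0

    def update(self, x, lam):
        # Incorporate the new iterate x with weight lam and return the current z_n.
        self.weighted_sum += lam * x
        self.total_weight += lam
        return self.weighted_sum / self.total_weight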
Lemma 1.1 (Opial–Passty)

Let $Z$ be a non-empty subset of $\mathcal{H}$ and assume that the limit $\lim_{n \to +\infty} \|x_n - z\|$ exists for every element $z \in Z$. If every sequential weak cluster point of $(x_n)_{n \geq 1}$ (respectively $(z_n)_{n \geq 1}$) lies in $Z$, then the sequence $(x_n)_{n \geq 1}$ (respectively $(z_n)_{n \geq 1}$) converges weakly to an element in $Z$ as $n \to +\infty$.
The following two results can be found in [5,7].
Lemma 1.2
Let and be sequences in with . If there exists such that
and α such that
then the following statements are true:

(i) , where ;

(ii) the limit exists;

(iii) the sequence belongs to .
The following result follows from Lemma 1.2, applied in the case and for all , where ρ is a lower bound for .
Lemma 1.3
Let be a sequence in , which is bounded from below, and , be sequences in with . If there exists such that
then the following statements are true:

(i) the sequence is convergent;

(ii) the sequence belongs to .
The following result, which will be useful in this work, shows that statement (ii) in Lemma 1.3 can also be obtained when the sequence is not bounded from below, but has a particular form.
Lemma 1.4
Let be a sequence in and be sequences in with and
where are sequences in and there exists α such that
If there exists such that
(8) then the sequence belongs to .
Proof.
We fix an integer , sum up the inequalities in (8) for and obtain
(9) Hence the sequence is bounded from above. Let be an upper bound of this sequence. For all it holds
from which we deduce that
(10) By induction we obtain for all
(11) Then inequality (9) combined with (10) and (11) leads to
(12) We let converge to and obtain that .
2. The general monotone inclusion problem
In this section we address the following monotone inclusion problem.
Problem 2.1
Let $\mathcal{H}$ be a real Hilbert space, $A : \mathcal{H} \rightrightarrows \mathcal{H}$ a maximally monotone operator, $D : \mathcal{H} \to \mathcal{H}$ an $\eta$-cocoercive operator with $\eta > 0$, $B : \mathcal{H} \to \mathcal{H}$ a $\mu$-cocoercive operator with $\mu > 0$, and assume that $M = \operatorname{zer} B \neq \emptyset$. The monotone inclusion problem to solve reads

\[ 0 \in Ax + Dx + N_M(x). \]
The following forward–backward penalty algorithm with inertial effects for solving Problem 2.1 will be in the focus of our investigations in this paper.

Algorithm 2.2

Choose $x_0, x_1 \in \mathcal{H}$ and set for every $n \geq 1$

\[ x_{n+1} = J_{\lambda_n A}\left[x_n + \alpha_n(x_n - x_{n-1}) - \lambda_n D x_n - \lambda_n \beta_n B x_n\right], \]

where $(\lambda_n)_{n \geq 1}$ is a sequence of positive step sizes, $(\beta_n)_{n \geq 1}$ a sequence of positive penalty parameters and $(\alpha_n)_{n \geq 1}$ a sequence of non-negative inertial parameters.

When $D = 0$ and $B = \nabla h$, where $h : \mathcal{H} \to \mathbb{R}$ is a convex and differentiable function with Lipschitz continuous gradient fulfilling $\min_{\mathcal{H}} h = 0$, then Problem 2.1 recovers the monotone inclusion problem addressed in [3, Section 3] and Algorithm 2.2 can be seen as an inertial version of the iterative scheme considered there. When $B = 0$, we have $M = \mathcal{H}$ and $N_M(x) = \{0\}$ for all $x \in \mathcal{H}$, and Algorithm 2.2 is nothing else than an inertial version of the classical forward–backward algorithm (see, for instance, [33,40]).
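To fix ideas on the computational side, we include a small implementation sketch of Algorithm 2.2. The code, the toy operators and the parameter choices are our own illustration and are not prescribed by the analysis below; the resolvent of $\lambda_n A$ is assumed to be available as a black box.

import numpy as np

def inertial_fb_penalty(resolvent, D, B, x0, x1, lam, beta, alpha, n_iter=1000):
    # One possible implementation sketch of Algorithm 2.2:
    #   resolvent(l, y)   evaluates J_{l A}(y)                 (backward step)
    #   D, B              cocoercive operators                 (forward steps)
    #   lam, beta, alpha  callables n -> lambda_n, beta_n, alpha_n
    x_prev, x = x0, x1
    for n in range(1, n_iter + 1):
        y = x + alpha(n) * (x - x_prev) - lam(n) * D(x) - lam(n) * beta(n) * B(x)
        x_prev, x = x, resolvent(lam(n), y)
    return x

# Toy data (ours): A = normal cone of the box [-1,1]^3, so J_{l A} is the
# projection onto the box; D(x) = x - b is 1-cocoercive; B(x) = <a, x> a with
# ||a|| = 1 is the gradient of (1/2) dist^2(., {x : <a, x> = 0}), hence 1-cocoercive.
b = np.array([2.0, -0.5, 0.25])
a = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
x_sol = inertial_fb_penalty(
    resolvent=lambda l, y: np.clip(y, -1.0, 1.0),
    D=lambda x: x - b,
    B=lambda x: np.dot(a, x) * a,
    x0=np.zeros(3), x1=np.zeros(3),
    lam=lambda n: 0.1, beta=lambda n: 0.05 * n, alpha=lambda n: 0.01)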
Hypotheses 2.2

The convergence analysis will be carried out under the following hypotheses (see also [7]):

(i) $A + N_M$ is maximally monotone and $\operatorname{zer}(A + D + N_M) \neq \emptyset$;

(ii) $\sum_{n \geq 1} \lambda_n \beta_n \left[ \sup_{u \in M} \varphi_B\!\left(u, \frac{p}{\beta_n}\right) - \sigma_M\!\left(\frac{p}{\beta_n}\right) \right] < +\infty$ for every $p \in \operatorname{ran} N_M$.
Since $A$ and $N_M$ are maximally monotone operators, the sum $A + N_M$ is maximally monotone, provided some specific regularity conditions are fulfilled (see [33–35,38]). Furthermore, since $D$ is also maximally monotone [33, Example 20.28] and $\operatorname{dom} D = \mathcal{H}$, if $A + N_M$ is maximally monotone, then $A + D + N_M$ is also maximally monotone.

Let us also notice that for $p \in \operatorname{ran} N_M$ there exists $\bar{u} \in M$ such that $p \in N_M(\bar{u})$, hence, for every $\beta > 0$ it holds

\[ \sup_{u \in M} \varphi_B\!\left(u, \frac{p}{\beta}\right) - \sigma_M\!\left(\frac{p}{\beta}\right) \geq \varphi_B\!\left(\bar{u}, \frac{p}{\beta}\right) - \left\langle \bar{u}, \frac{p}{\beta} \right\rangle \geq 0, \]

so the terms of the series in hypothesis (ii) are non-negative.
For situations in which hypothesis (ii) is satisfied we refer the reader to [5,8,9,11].
Before formulating the main theorem of this section we will prove some useful technical results.
Lemma 2.3
Let $(x_n)_{n \geq 1}$ be the sequence generated by Algorithm 2.2 and $(u, y) \in \operatorname{Gr}(A + D + N_M)$ be such that $y = v + Du + p$ with $v \in Au$ and $p \in N_M(u)$. Further, let be such that . Then the following inequality holds for all
(13)
Proof.
Let $n \geq 1$ be fixed. According to the definition of the resolvent of the operator A we have
(14) and, since , the monotonicity of A guarantees
(15) or, equivalently,
(16) For the term in the left-hand side of (16) we have
(17) Since
and
by adding the two inequalities, we obtain the following estimation for the second term in the right-hand side of (16)
(18) We turn now our attention to the first term in the right-hand side of (16), which can be written as
(19) We have
(20) and
(21) On the other hand, we have
(22) Since and Bu=0, the cocoercivity of B gives us
(23) Similarly, the cocoercivity of D gives us
(25) Combining (23)–(24) with (22) and by using the definition of the Fitzpatrick function and the fact that , we obtain
(25) The inequalities (20), (21) and (25) lead to
(26)
From now on we will assume that the constants and the sequences and are chosen such that
As a consequence, there exists , which means that for all it holds
| (27) |
On the other hand, there exists , which means that for all it holds
| (28) |
Remark 2.4
Since , one can always find such that . One possible choice is and . From the second inequality in it follows that .
It is always possible to choose s such that . Since in this case , one has (27).
The following proposition brings us closer to the convergence result.
Proposition 2.5
Let , and the sequences and satisfy condition . Let be the sequence generated by Algorithm 2.2 and assume that the Hypotheses 2.2 are verified. Then the following statements are true:
(i) the sequence belongs to and the sequence belongs to ;

(ii) if, moreover, , then and thus every cluster point of the sequence lies in M;

(iii) for every , the limit exists.
Proof.
Since , there exists an integer such that for all . According to Lemma 2.3, for every $(u, y) \in \operatorname{Gr}(A + D + N_M)$ such that $y = v + Du + p$, with $v \in Au$ and $p \in N_M(u)$, and all the following inequality holds
(29) We consider , which means that we can take y=0 in (29). For all we denote
(30) and
Using that is non-decreasing, for all we obtain
(32) where s,t>0 are chosen according to (27) and (28), respectively.
Thanks to () and (C) it holds
(33) Hence, according to Lemma 1.4, we obtain
(34) which proves (i). If, in addition, , then , which means that every cluster point of the sequence lies in .
In order to prove (iii), we consider again the inequality (29) for an arbitrary element and y=0. With the notations in (30) and (31), we get for all
(35) According to (33) and (34) we have
(36) therefore, by Lemma 1.2, the limit exists, which means that the limit exists, too.
Remark 2.6
The condition (C) that we imposed, in combination with , on the sequence of inertial parameters is the one proposed in [15, Proposition 2.1] when addressing the convergence of the inertial proximal point algorithm. However, the statements in the proposition above and in the following convergence theorem remain valid if one alternatively assumes that there exists such that for all and
This can be realized if one chooses for a fixed p>1
Indeed, in this situation we have that for all , which gives
Now we are ready to prove the main theorem of this section, which addresses the convergence of the sequence generated by Algorithm 2.2.
Theorem 2.8
Let , and the sequences and satisfy condition . Let be the sequence generated by Algorithm 2.2, be the sequence defined in (7) and assume that the Hypotheses 2.2 are verified. Then the following statements are true:
(i) the sequence converges weakly to an element in as ;

(ii) if A is γ-strongly monotone with , then converges strongly to the unique element in as .
Proof.
According to Proposition 2.5 (iii), the limit exists for every . Let z be a sequential weak cluster point of . We will show that , by using the characterization (5) of maximal monotonicity, and the conclusion will follow by Lemma 1.1. To this end we consider an arbitrary $(u, y) \in \operatorname{Gr}(A + D + N_M)$ such that $y = v + Du + p$, where $v \in Au$ and $p \in N_M(u)$. From (29), with the notations (30) and (31), we have for all

Recall from (33) that . Since is bounded, the sequence is also bounded.
(37) We fix an arbitrary integer and sum up the inequalities in (37) for . This yields

After dividing this last inequality by , we obtain
where . By passing in (38) to the limit and by using that , we get
(38) As z is a sequential weak cluster point of , the above inequality gives us , which finally means that .
Let be the unique element in . Since A is strongly monotone with , the formula in (15) reads for all

or, equivalently,
By using again (17), (18) and (26) we obtain for all
By using the notations in (30) and (31), this yields for all
By taking into account (36), from Lemma 1.2 we get
According to (C) we have , which implies that the limit must be equal to zero. This provides the desired conclusion.
3. Applications to convex bilevel programming
We will employ the results obtained in the previous section, in the context of monotone inclusions, for solving convex bilevel programming problems.
Problem 3.1
Let $\mathcal{H}$ be a real Hilbert space, $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower semicontinuous function and $g, h : \mathcal{H} \to \mathbb{R}$ convex and differentiable functions with $L_g$-Lipschitz continuous and, respectively, $L_h$-Lipschitz continuous gradients. Suppose that $\min_{\mathcal{H}} h = 0$ and $\operatorname{argmin}_{\mathcal{H}} h \neq \emptyset$. The bilevel programming problem to solve reads

\[ \min_{x \in \mathcal{H}} f(x) + g(x) \quad \text{subject to} \quad x \in \operatorname{argmin}_{\mathcal{H}} h. \]

The assumption $\min_{\mathcal{H}} h = 0$ is not restrictive as, otherwise, one can replace $h$ with $h - \min_{\mathcal{H}} h$.
Hypotheses 3.2
The convergence analysis will be carried out under the following hypotheses:

(i) $\partial f + N_{\operatorname{argmin} h}$ is maximally monotone and $\operatorname{zer}(\partial f + \nabla g + N_{\operatorname{argmin} h}) \neq \emptyset$;

(ii) $\sum_{n \geq 1} \lambda_n \beta_n \left[ h^*\!\left(\frac{p}{\beta_n}\right) - \sigma_{\operatorname{argmin} h}\!\left(\frac{p}{\beta_n}\right) \right] < +\infty$ for every $p \in \operatorname{ran} N_{\operatorname{argmin} h}$.
In the above hypotheses, we have that $\operatorname{argmin}_{\mathcal{H}} h = \operatorname{zer} \nabla h$ and hence $M := \operatorname{argmin}_{\mathcal{H}} h$ is the set of zeros of the cocoercive operator $\nabla h$. Since, according to the Theorem of Baillon–Haddad (see, e.g. [33, Corollary 18.16]), $\nabla g$ and $\nabla h$ are $\frac{1}{L_g}$-cocoercive and, respectively, $\frac{1}{L_h}$-cocoercive, solving the bilevel programming problem in Problem 3.1 reduces to solving the monotone inclusion

\[ 0 \in \partial f(x) + \nabla g(x) + N_M(x). \]

By using to this end Algorithm 2.2, we obtain the following iterative scheme.

Algorithm 3.3

Choose $x_0, x_1 \in \mathcal{H}$ and set for every $n \geq 1$

\[ x_{n+1} = \operatorname{prox}_{\lambda_n f}\left[x_n + \alpha_n(x_n - x_{n-1}) - \lambda_n \nabla g(x_n) - \lambda_n \beta_n \nabla h(x_n)\right]. \]
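For illustration (a toy example of our own, with parameter choices that are merely plausible and not derived from the conditions of this paper), Algorithm 3.3 can be run on a small instance of Problem 3.1 in $\mathbb{R}^2$ as follows, assuming NumPy:

import numpy as np

# Toy instance of Problem 3.1 (ours): f(x) = ||x||_1, g(x) = 0.5*||x - c||^2,
# h(x) = 0.5*<a, x>^2 with ||a|| = 1, so argmin h = {x : <a, x> = 0} and min h = 0.
c = np.array([1.0, 2.0])
a = np.array([1.0, 1.0]) / np.sqrt(2.0)
prox_f = lambda l, x: np.sign(x) * np.maximum(np.abs(x) - l, 0.0)
grad_g = lambda x: x - c
grad_h = lambda x: np.dot(a, x) * a

x_prev = x = np.zeros(2)
for n in range(1, 20001):
    lam, beta, alpha = 1.0 / (n + 1.0), 0.5 * np.sqrt(n + 1.0), 0.2
    y = x + alpha * (x - x_prev) - lam * grad_g(x) - lam * beta * grad_h(x)
    x_prev, x = x, prox_f(lam, y)
# x now approximates the minimizer of f + g over the hyperplane <a, x> = 0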
By using the inequality (6), one can easily notice that hypothesis (ii) in Hypotheses 3.2 implies hypothesis (ii) in Hypotheses 2.2 written for $B = \nabla h$, which means that the convergence statements for Algorithm 3.3 can be derived as particular instances of the ones derived in the previous section.
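For the reader's convenience we record the one-line estimate behind this implication (our own restatement, using that $h$ vanishes on $\operatorname{argmin}_{\mathcal{H}} h$ because $\min_{\mathcal{H}} h = 0$):

\[ \sup_{u \in \operatorname{argmin} h} \varphi_{\nabla h}\!\left(u, \frac{p}{\beta_n}\right) \overset{(6)}{\leq} \sup_{u \in \operatorname{argmin} h} \left\{ h(u) + h^*\!\left(\frac{p}{\beta_n}\right) \right\} = h^*\!\left(\frac{p}{\beta_n}\right), \]

so the summands in Hypotheses 2.2(ii) are dominated by those in Hypotheses 3.2(ii).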
Alternatively, one can use to this end the following lemma and employ the same ideas and techniques as in Section 2. Lemma 3.3 is similar to Lemma 2.3; however, it will allow us to provide convergence statements also for the sequence of function values.
Lemma 3.3
Let $(x_n)_{n \geq 1}$ be the sequence generated by Algorithm 3.3 and $(u, y) \in \operatorname{Gr}(\partial f + \nabla g + N_{\operatorname{argmin} h})$ be such that $y = v + \nabla g(u) + p$ with $v \in \partial f(u)$ and $p \in N_{\operatorname{argmin} h}(u)$. Further, let be such that . Then the following inequality holds for all
Proof.
Let $n \geq 1$ be fixed. The proof follows by combining the estimates used in the proof of Lemma 2.3 with some inequalities which better exploit the convexity of h. From (23) we have
Since h is convex, the following relation also holds
Summing up the two inequalities above gives
Using the same techniques as in the derivation of (25), we get
With these improved estimates, the conclusion follows as in the proof of Lemma 2.3.
By using now Lemma 3.3, one obtains, after slightly adapting the proof of Proposition 2.5, the following result.
Proposition 3.4
Let , and the sequences and satisfy condition . Let be the sequence generated by Algorithm 3.3 and assume that the Hypotheses 3.2 are verified. Then the following statements are true:
(i) the sequence belongs to and the sequences and belong to ;

(ii) if, moreover, , then and thus every cluster point of the sequence lies in ;

(iii) for every , the limit exists.
Finally, the above proposition leads to the following convergence result.
Theorem 3.5
Let , and the sequences and satisfy condition . Let be the sequence generated by Algorithm 3.3, be the sequence defined in (7) and assume that the Hypotheses 3.2 are verified. Then the following statements are true:
(i) the sequence converges weakly to an element in as ;

(ii) if f is strongly convex with , then converges strongly to the unique element in as .
In what follows we show that under inf-compactness assumptions one can achieve weak non-ergodic convergence for the sequence . Weak non-ergodic convergence has been obtained for Algorithm 3.3 in [9] when for all and for restrictive choices of both the sequence of step sizes and the sequence of penalty parameters.
We denote by $\mathcal{S}$ the set of optimal solutions of Problem 3.1. For every element $x$ in $\mathcal{H}$, we denote by $\operatorname{dist}(x, M)$ the distance from $x$ to $M := \operatorname{argmin}_{\mathcal{H}} h$. In particular, $\operatorname{dist}(x, M) = \|x - P_M(x)\|$, where $P_M(x)$ denotes the projection of $x$ onto $M$. The projection operator $P_M$ is firmly non-expansive [33, Proposition 4.8], which means

\[ \|P_M x - P_M y\|^2 + \|(\operatorname{Id} - P_M)x - (\operatorname{Id} - P_M)y\|^2 \leq \|x - y\|^2 \quad \text{for all } x, y \in \mathcal{H}. \tag{39} \]

Denoting $d := \frac{1}{2}\operatorname{dist}^2(\cdot, M)$, one has that $d$ is differentiable and it holds $\nabla d(x) = x - P_M(x)$ for all $x \in \mathcal{H}$.
Lemma 3.6
Let be the sequence generated by Algorithm 3.3 and assume that the Hypotheses 3.2 are verified. Then the following inequality holds for all
(40)
Proof.
Let $n \geq 1$ be fixed. Since d is convex, we have
(41) Then there exists such that (see (14))
and, so,
(42) Since , we get
(43) Using the convexity of g it follows
(44) On the other hand, the Descent Lemma gives
(45) By adding (44) and (45), we obtain
(46) Using the cocoercivity of combined with the fact that (as belongs to ), we obtain
Therefore
(47) Further, we have
and
By adding the two relations above, we obtain
(48) By combining (43), (46), (47) and (48) with (42) we obtain the desired conclusion.
Definition 3.7

A function $\Psi : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ is said to be inf-compact if for every $r > 0$ and every $s \in \mathbb{R}$ the set

\[ \{x \in \mathcal{H} : \|x\| \leq r, \ \Psi(x) \leq s\} \]

is relatively compact in $\mathcal{H}$. Notice that in finite-dimensional spaces every function is inf-compact, while, for instance, $\Psi = \frac{1}{2}\|\cdot\|^2$ fails to be inf-compact in infinite-dimensional Hilbert spaces, since closed balls are then not relatively compact.
A useful property of inf-compact functions follows.
Lemma 3.8
Let $\Psi$ be inf-compact and $(x_n)_{n \geq 1}$ be a bounded sequence in $\mathcal{H}$ such that $(\Psi(x_n))_{n \geq 1}$ is bounded as well. If the sequence $(x_n)_{n \geq 1}$ converges weakly to an element in $\mathcal{H}$ as $n \to +\infty$, then it converges strongly to this element.
Proof.
Let be and such that for all
Hence, belongs to the set , which is relatively compact. Then has at least one strongly convergent subsequence. Since every strongly convergent subsequence of has as limit , the desired conclusion follows.
We can formulate now the weak non-ergodic convergence result.
Theorem 3.10
Let , , the sequences and satisfy the condition , be the sequence generated by Algorithm 3.3, and assume that the Hypotheses 3.2 are verified and that either f+g or h is inf-compact. Then the following statements are true:
(i) ;

(ii) the sequence converges weakly to an element in as ;

(iii) if h is inf-compact, then the sequence converges strongly to an element in as .
Proof.
Thanks to Lemma 3.6, for all we have

where
(49) From Proposition 3.4 (i), combined with the fact that both sequences and are bounded, it follows that .
In general, since is not necessarily included in , we have to treat two different cases.
Case 1: There exists an integer such that for all . In this case, we obtain from Lemma 1.2 that:
the limit exists. Moreover, since , we must have
(50) Consider a subsequence of such that

and note that, thanks to (50), the sequence is bounded. From Proposition 3.4 (ii)–(iii) we get that also and are bounded. Thus, since either f+g or h is inf-compact, there exists a subsequence of , which converges strongly to an element as . According to Proposition 3.4 (ii)–(iii), belongs to . On the other hand,
We deduce from (50)–(51) that , or in other words, that . In conclusion, thanks to the continuity of d,
(51) Case 2: For all there exists some such that . We define the set
There exists an integer such that for all the set is non-empty. Hence, for all the number
is well-defined. By definition for all and moreover the sequence is non-decreasing and . Indeed, if , then for all it holds , a contradiction. Choose an integer .
If , then and we have
(54) For all we define . In both cases it holds

Passing in (55) to the limit as we obtain that
(55) Let be . For all we have
(56) which shows that is bounded, as exists. We obtain
Further, for all we have , which gives
(57) This means that the sequence is bounded from above. Consider a subsequence of such that
(58) From Proposition 3.4 (ii)–(iii) we get that also and are bounded. Thus, since either f+g or h is inf-compact, there exists a subsequence of , which converges strongly to an element as . According to Proposition 3.4 (ii)–(iii), belongs to . Furthermore, it holds
We deduce from (58) and (59) that
(59) which gives . Thanks to the continuity of d we get
By combining (56), (57) and (60), we obtain
(60) which implies and thus
According to (i) we have , thus every weak cluster point of the sequence belongs to . From Lemma 1.1 it follows that converges weakly to a point in as .
Funding Statement
Research partially supported by FWF (Austrian Science Fund), projects I 2419-N32 and W1260-N35.
Acknowledgments
The second author gratefully acknowledges the financial support of the Doctoral Programme Vienna Graduate School on Computational Optimization (VGSCO), which is funded by the Austrian Science Fund (FWF, project W1260-N35).
Disclosure statement
No potential conflict of interest was reported by the authors.
ORCID
Radu Ioan Boţ http://orcid.org/0000-0002-4469-314X
References
- [1]. Attouch H, Czarnecki M-O. Asymptotic behavior of coupled dynamical systems with multiscale aspects. J Differ Equ. 2010;248(6):1315–1344.
- [2]. Attouch H, Cabot A, Czarnecki M-O. Asymptotic behavior of nonautonomous monotone and subgradient evolution equations. Trans Am Math Soc. 2018;370(2):755–790.
- [3]. Attouch H, Czarnecki M-O, Peypouquet J. Coupling forward–backward with penalty schemes and parallel splitting for constrained variational inequalities. SIAM J Optim. 2011;21(4):1251–1274.
- [4]. Attouch H, Czarnecki M-O, Peypouquet J. Prox-penalization and splitting methods for constrained variational problems. SIAM J Optim. 2011;21(1):149–173.
- [5]. Banert S, Boţ RI. Backward penalty schemes for monotone inclusion problems. J Optim Theory Appl. 2015;166(3):930–948.
- [6]. Boţ RI, Csetnek ER. A Tseng's type penalty scheme for solving inclusion problems involving linearly composed and parallel-sum type monotone operators. Vietnam J Math. 2014;42(4):451–465.
- [7]. Boţ RI, Csetnek ER. Forward–backward and Tseng's type penalty schemes for monotone inclusion problems. Set-Valued Var Anal. 2014;22(2):313–331.
- [8]. Boţ RI, Csetnek ER, Nimana N. Gradient-type penalty method with inertial effects for solving constrained convex optimization problems with smooth data. Optim Lett. 2018;12(1):17–33.
- [9]. Boţ RI, Csetnek ER, Nimana N. An inertial proximal-gradient penalization scheme for constrained convex optimization problems. Vietnam J Math. 2018;46(1):53–71.
- [10]. Frankel P, Peypouquet J. Lagrangian-penalization algorithm for constrained optimization and variational inequalities. Set-Valued Var Anal. 2012;20(2):169–185.
- [11]. Noun N, Peypouquet J. Forward–backward penalty scheme for constrained convex minimization without inf-compactness. J Optim Theory Appl. 2013;158(3):787–795.
- [12]. Peypouquet J. Coupling the gradient method with a general exterior penalization scheme for convex minimization. J Optim Theory Appl. 2012;153(1):123–138.
- [13]. Attouch H, Briceno-Arias LM, Combettes PL. A parallel splitting method for coupled monotone inclusions. SIAM J Control Optim. 2010;48(5):3246–3270.
- [14]. Attouch H, Bolte J, Redont P, Soubeyran A. Alternating proximal algorithms for weakly coupled convex minimization problems, applications to dynamical games and PDE's. J Convex Anal. 2008;15(3):485–506.
- [15]. Alvarez F, Attouch H. An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 2001;9(1):3–11.
- [16]. Polyak BT. Introduction to optimization. Translations Series in Mathematics and Engineering. New York: Publications Division, Optimization Software Inc.; 1987.
- [17]. Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci. 2009;2(1):183–202.
- [18]. Chambolle A, Dossal C. On the convergence of the iterates of the 'Fast Iterative Shrinkage/Thresholding Algorithm'. J Optim Theory Appl. 2015;166(3):968–982.
- [19]. Bertsekas DP. Nonlinear programming. Cambridge (MA): Athena Scientific; 1999.
- [20]. Boţ RI, Csetnek ER, Laszlo SC. An inertial forward–backward algorithm for the minimization of the sum of two nonconvex functions. EURO J Comput Optim. 2016;4(1):3–25.
- [21]. Alvarez F. On the minimizing property of a second order dissipative system in Hilbert spaces. SIAM J Control Optim. 2000;38(4):1102–1119.
- [22]. Alvarez F. Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM J Optim. 2004;14(3):773–782.
- [23]. Attouch H, Czarnecki M-O. Asymptotic behavior of gradient-like dynamical systems involving inertia and multiscale aspects. J Differ Equ. 2017;262(3):2745–2770.
- [24]. Attouch H, Peypouquet J, Redont P. A dynamical approach to an inertial forward–backward algorithm for convex minimization. SIAM J Optim. 2014;24(1):232–256.
- [25]. Boţ RI, Csetnek ER. An inertial forward–backward–forward primal–dual splitting algorithm for solving monotone inclusion problems. Numer Algorithms. 2016;71(3):519–540.
- [26]. Boţ RI, Csetnek ER. Penalty schemes with inertial effects for monotone inclusion problems. Optimization. 2017;66(6):313–331.
- [27]. Boţ RI, Csetnek ER, Hendrich C. Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl Math Comput. 2015;256(1):472–487.
- [28]. Chen C, Chan RH, Ma S, Yang J. Inertial proximal ADMM for linearly constrained separable convex optimization. SIAM J Imag Sci. 2015;8(4):2239–2267.
- [29]. Chen C, Ma S, Yang J. A general inertial proximal point algorithm for mixed variational inequality problem. SIAM J Optim. 2015;25(4):2120–2142.
- [30]. Maingé P-E. Convergence theorems for inertial Krasnosel'skiĭ–Mann type algorithms. J Comput Appl Math. 2008;219(1):223–236.
- [31]. Maingé P-E, Moudafi A. Convergence of new inertial proximal methods for DC programming. SIAM J Optim. 2008;19(1):397–413.
- [32]. Moudafi A, Oliny M. Convergence of a splitting inertial proximal method for monotone operators. J Comput Appl Math. 2003;155(2):447–454.
- [33]. Bauschke HH, Combettes PL. Convex analysis and monotone operator theory in Hilbert spaces. CMS Books in Mathematics. New York (NY): Springer; 2011.
- [34]. Boţ RI. Conjugate duality in convex optimization. Lecture Notes in Economics and Mathematical Systems, Vol. 637. Berlin, Heidelberg: Springer; 2010.
- [35]. Zălinescu C. Convex analysis in general vector spaces. Singapore: World Scientific; 2002.
- [36]. Fitzpatrick S. Representing monotone operators by convex functions. In: Workshop/Miniconference on Functional Analysis and Optimization. Proceedings of the Centre for Mathematical Analysis, Vol. 20. Canberra: Australian National University; 1988.
- [37]. Bauschke HH, McLaren DA, Sendov HS. Fitzpatrick functions: inequalities, examples and remarks on a problem by S. Fitzpatrick. J Convex Anal. 2006;13(3):499–523.
- [38]. Borwein JM. Maximal monotonicity via convex analysis. J Convex Anal. 2006;13(3):561–586.
- [39]. Burachik RS, Svaiter BF. Maximal monotone operators, convex functions and a special family of enlargements. Set-Valued Anal. 2002;10(4):297–316.
- [40]. Combettes PL. Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization. 2004;53(5–6):475–504.
