Goal-oriented adaptive finite element methods with optimal computational complexity

Roland Becker; Gregor Gantner; Michael Innerberger; Dirk Praetorius

doi:10.1007/s00211-022-01334-8

2022 Nov 16;153(1):111–140. doi: 10.1007/s00211-022-01334-8

PMCID: PMC9829645 PMID: 36644212

Abstract

We consider a linear symmetric and elliptic PDE and a linear goal functional. We design and analyze a goal-oriented adaptive finite element method, which steers the adaptive mesh-refinement as well as the approximate solution of the arising linear systems by means of a contractive iterative solver like the optimally preconditioned conjugate gradient method or geometric multigrid. We prove linear convergence of the proposed adaptive algorithm with optimal algebraic rates. Unlike prior work, we do not only consider rates with respect to the number of degrees of freedom but even prove optimal complexity, i.e., optimal convergence rates with respect to the total computational cost.

Mathematics Subject Classification: 65N30, 65N50, 65Y20, 41A25, 65N22

Introduction

Let $Ω \subset R^{d}$ be a bounded Lipschitz domain, $d \geq 2$. For given $f \in L^{2} (Ω)$ and $\boldsymbol{f} \in {[L^{2} (Ω)]}^{d}$, we consider a linear symmetric and elliptic partial differential equation

\begin{matrix} - \operatorname{div} A \nabla u^{⋆} + c u^{⋆} = f + \operatorname{div} \boldsymbol{f} & \text{in } Ω, \\ u^{⋆} = 0 & \text{on } Γ := \partial Ω, \qquad (1) \end{matrix}

where $A (x) \in R_{sym}^{d \times d}$ is symmetric and $c (x) \in R$ . As usual, we assume that $A, c \in L^{\infty} (Ω)$ , that $A$ is uniformly positive definite and that the weak form (see (5) below) fits into the setting of the Lax–Milgram lemma. Standard adaptivity aims to approximate the unknown solution $u^{⋆} \in H_{0}^{1} (Ω)$ of (1) in the energy norm at optimal rate; see [1, 6, 7, 9, 11, 19, 23] for adaptive finite element methods (AFEMs) and [5] for an overview of available results. Instead, the quantity of interest for goal-oriented adaptivity is only some functional value of the unknown solution $u^{⋆} \in H_{0}^{1} (Ω)$ of (1), and the present paper aims to compute the linear goal functional

\begin{matrix} G (u^{⋆}) := \int_{Ω} (g u^{⋆} - \boldsymbol{g} \cdot \nabla u^{⋆}) \, d x, \qquad (2) \end{matrix}

for given $g \in L^{2} (Ω)$ and $\boldsymbol{g} \in {[L^{2} (Ω)]}^{d}$. To approximate $G (u^{⋆})$ accurately, it is not necessary (and might even waste computational time) to accurately approximate the solution $u^{⋆}$ on the whole computational domain. Due to this potential decrease of computational cost, goal-oriented adaptivity is of high relevance in practice as well as in mathematical research; see, e.g., [3, 4, 10, 17] for some prominent contributions.

The present work formulates a goal-oriented adaptive finite element method (GOAFEM), where the sought goal $G (u^{⋆})$ is approximated by some computable $G_{ℓ}$ such that

\begin{matrix} | G (u^{⋆}) - G_{ℓ} | \overset{ℓ \to \infty}{\longrightarrow} 0 \quad \text{even at optimal algebraic rate.} \end{matrix}

The earlier works [2, 12, 14, 20] are essentially concerned with optimal convergence rates for GOAFEM, where all arising linear FEM systems are solved exactly. While [12, 14] particularly aim to transfer ideas from the AFEM analysis of [5, 6] to GOAFEM for general elliptic PDEs, the seminal work [20] considers the Poisson model problem and additionally addresses the total computational cost by formulating realistic assumptions on a generic inexact solver (called GALSOLVE in [20, 23]).

The focus of the present work is also on the iterative (and hence inexact) solution of the arising FEM systems. However, we do not impose the realistic assumptions of [20, 23] on the solver, but rather rely on energy contraction per solver step, which is proved to hold for the preconditioned CG method with optimal multilevel additive Schwarz preconditioner [8] or the geometric multigrid method [25]. In the proposed GOAFEM algorithm, the termination of such a contractive iterative solver is then based on appropriate computable a posteriori error estimates by a criterion similar to that in [20, 23]. We discuss several implementations of such termination criteria and prove that these allow us to control the total computational cost of computing the approximate goal value $G_{ℓ}$. We stress already here that $G (u^{⋆}) \approx G_{ℓ} = G (u_{ℓ}) + R_{ℓ}$, where $u_{ℓ} \approx u^{⋆}$ is a FEM approximation of $u^{⋆}$ and $R_{ℓ}$ is a residual correction related to the inexact solution of the FEM formulation. While [20] shows algebraic convergence with optimal rates (in the present setting of FEM on quasi-uniform meshes) with respect to the overall computational cost for the final iterates on every level for sufficiently small adaptivity parameters (for mesh-refinement and solver termination), our main contribution is full linear convergence, i.e., linear convergence of the estimator product independently of the algorithmic decision for either mesh-refinement or solver step and even for arbitrary adaptivity parameters. An immediate consequence is that the convergence rate of the computed solutions with respect to the number of elements is the same as with respect to the overall computational cost (i.e., the cumulative computational time). Moreover, for sufficiently small adaptivity parameters, we show convergence with optimal rates with respect to the number of elements and, hence, with respect to the overall computational cost. This extends the results of [20] to the present setting of symmetric second-order linear elliptic PDEs. Finally, we stress that, unlike [20], our GOAFEM algorithm does not require any inner loop for data approximation and therefore does not require different (but still nested) meshes for the primal and dual problem. Overall, the present paper thus provides further mathematical understanding for bridging the gap between applied GOAFEM and theoretical optimality results.

Outline. In Sect. 2, we present our GOAFEM algorithm (Algorithm 3) and the details of its individual steps. This includes the details of our finite element discretization as well as the precise assumptions for the iterative solver, the marking strategy, and the error estimators. We then state in Sect. 3 that Algorithm 3 leads to linear convergence for arbitrary stopping parameters (Theorem 6) and even achieves optimal rates with respect to the total computational cost if the adaptivity parameters are sufficiently small (Theorem 8). We emphasize that linear convergence applies to all steps of the adaptive strategy, independently of whether the algorithm decides for one solver step or one step of local mesh-refinement. This turns out to be the key argument for optimal rates with respect to the total computational cost (see Corollary 7). Section 3.2 comments on alternative termination criteria for the iterative solver. Section 4 then illustrates our theoretical findings with numerical experiments. Finally, we give a proof of our main Theorems 6 and 8 in Sects. 5 and 6, respectively.

Notation. In the following text, we write $a ≲ b$ for $a, b \in R$ if there exists a constant $C > 0$ (which is independent of the mesh width $h$) such that $a \leq C b$. If there holds $a ≲ b ≲ a$, we abbreviate this by $a ≃ b$. Furthermore, we denote by $\# A$ the cardinality of a finite set $A$ and by $| ω |$ the $d$-dimensional Lebesgue measure of a subset $ω \subset R^{d}$.

Goal-oriented adaptive finite element method

Variational formulation

Defining the symmetric bilinear form

\begin{matrix} a (u, v) : = \int_{Ω} A \nabla u \cdot \nabla v d x + \int_{Ω} c u v d x, \end{matrix}

we suppose that $a (\cdot, \cdot)$ is continuous and elliptic on $H_{0}^{1} (Ω)$ and thus fits into the setting of the Lax–Milgram lemma, i.e., there exist constants $0 < C_{ell} \leq C_{cnt} < \infty$ such that

\begin{matrix} C_{ell} \, {‖ u ‖}_{H_{0}^{1} (Ω)}^{2} \leq a (u, u) \quad \text{and} \quad a (u, v) \leq C_{cnt} \, {‖ u ‖}_{H_{0}^{1} (Ω)} {‖ v ‖}_{H_{0}^{1} (Ω)} \quad \text{for all } u, v \in H_{0}^{1} (Ω) . \end{matrix}

In particular, $a (\cdot, \cdot)$ is a scalar product that yields an equivalent norm ${| | | v | | |}^{2} : = a (v, v)$ on $H_{0}^{1} (Ω)$ . The weak formulation of (1) reads

\begin{matrix} a (u^{⋆}, v) = F (v) := \int_{Ω} (f v - \boldsymbol{f} \cdot \nabla v) \, d x \quad \text{for all } v \in H_{0}^{1} (Ω) . \qquad (5) \end{matrix}

The Lax–Milgram lemma proves existence and uniqueness of the solution $u^{⋆} \in H_{0}^{1} (Ω)$ of (5). The same argument applies and proves that the dual problem

\begin{matrix} a (v, z^{⋆}) = G (v) \quad \text{for all } v \in H_{0}^{1} (Ω) \qquad (6) \end{matrix}

admits a unique solution $z^{⋆} \in H_{0}^{1} (Ω)$ , where the linear goal functional $G \in H^{- 1} (Ω) : = H_{0}^{1} {(Ω)}^{'}$ is defined by (2).

Remark 1

For ease of presentation, we restrict our model problem (1) to homogeneous Dirichlet boundary conditions. We note, however, that for mixed homogeneous Dirichlet and inhomogeneous Neumann boundary conditions our main results hold true with the obvious modifications. In particular, with the partition $\partial Ω = {\bar{Γ}}_{D} \cup {\bar{Γ}}_{N}$ into Dirichlet boundary $Γ_{D}$ with $| Γ_{D} | > 0$ and Neumann boundary $Γ_{N}$, the space $H_{0}^{1} (Ω)$ (and its discretization) has to be replaced by $H_{D}^{1} (Ω) := \{ v \in H^{1} (Ω) : v |_{Γ_{D}} = 0 \text{ in the sense of traces} \}$ and the Neumann data has to be given in $L^{2} (Γ_{N})$. Furthermore, the coefficient $\boldsymbol{f}$ must vanish in a neighborhood of $Γ_{N}$ to go from the strong form (1) to the weak form (5) via integration by parts.

Finite element discretization and solution

For a conforming triangulation $T_{H}$ of $Ω$ into compact simplices and a polynomial degree $p \geq 1$ , let

\begin{matrix} X_{H} := \{ v_{H} \in H_{0}^{1} (Ω) : \forall T \in T_{H}, \ v_{H} |_{T} \text{ is a polynomial of degree} \leq p \} . \qquad (7) \end{matrix}

To obtain conforming finite element approximations $u^{⋆} \approx u_{H} \in X_{H}$ and $z^{⋆} \approx z_{H} \in X_{H}$ , we consider the Galerkin discretizations of (5)–(6). First, we note that the Lax–Milgram lemma yields the existence and uniqueness of exact discrete solutions $u_{H}^{⋆}, z_{H}^{⋆} \in X_{H}$ , i.e., there holds that

\begin{matrix} a (u_{H}^{⋆}, v_{H}) = F (v_{H}) \quad \text{and} \quad a (v_{H}, z_{H}^{⋆}) = G (v_{H}) \quad \text{for all } v_{H} \in X_{H} . \qquad (8) \end{matrix}

In practice, the discrete systems (8) are rarely solved exactly (or up to machine precision). Instead, a suitable iterative solver is employed, which yields approximate discrete solutions $u_{H}^{m}, z_{H}^{n} \in X_{H}$ . We suppose that this iterative solver is contractive, i.e., for all $m, n \in N$ , it holds that

\begin{matrix} | | | u_{H}^{⋆} - u_{H}^{m} | | | \leq q_{ctr} \, | | | u_{H}^{⋆} - u_{H}^{m - 1} | | | \quad \text{and} \quad | | | z_{H}^{⋆} - z_{H}^{n} | | | \leq q_{ctr} \, | | | z_{H}^{⋆} - z_{H}^{n - 1} | | |, \qquad (9) \end{matrix}

where $0 < q_{ctr} < 1$ is a generic constant and, in particular, independent of $X_{H}$. Assumption (9) is satisfied, e.g., for an optimally preconditioned conjugate gradient (PCG) method (see [8]) or geometric multigrid solvers (see [25]); see also the discussion in [16]. We note that these solvers are also guaranteed to satisfy the realistic assumptions from [20, 23] (which require that any initial energy error can be improved by a factor $0 < τ < 1$ at $O (| \log (τ) | \, \# T_{H})$ cost). However, while (9) is slightly less general, it allows us to prove full linear convergence; see Theorem 6 below.
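The contraction assumption (9) can be checked numerically on a small model system. The following is a minimal sketch, not one of the solvers named above: it uses damped Richardson iteration on a 1D Laplacian matrix as a stand-in, and the names `energy_norm`, `richardson_step`, and the matrix size are illustrative assumptions. Unlike the PCG and multigrid solvers covered by the analysis, the contraction factor of this toy solver degrades as the matrix grows, so it only illustrates the inequality itself.

```python
import numpy as np

def energy_norm(A, v):
    """Energy norm |||v||| = sqrt(v^T A v) induced by the SPD matrix A."""
    return float(np.sqrt(v @ (A @ v)))

def richardson_step(A, b, x, alpha):
    """One damped Richardson step x + alpha * (b - A x)."""
    return x + alpha * (b - A @ x)

# Toy Galerkin system (assumption): tridiagonal 1D Laplacian, which is SPD.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_star = np.linalg.solve(A, b)            # plays the role of the exact discrete solution

eigs = np.linalg.eigvalsh(A)              # eigenvalues in ascending order
alpha = 2.0 / (eigs[0] + eigs[-1])        # optimal damping parameter
q_ctr = (eigs[-1] - eigs[0]) / (eigs[-1] + eigs[0])  # energy contraction factor

x = np.zeros(n)
ratios = []                               # observed error reduction per step
for m in range(1, 11):
    x_new = richardson_step(A, b, x, alpha)
    ratios.append(energy_norm(A, x_star - x_new) / energy_norm(A, x_star - x))
    x = x_new
```

Every entry of `ratios` stays below `q_ctr`, which is the discrete analogue of (9) for this particular solver.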

Discrete goal quantity

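To approximate $G (u^{⋆})$, we proceed as in [17]: For any $u_{H}, z_{H} \in X_{H}$, the dual problem (6) and the primal problem (5) show that

\begin{matrix} G (u^{⋆}) - G (u_{H}) = a (u^{⋆} - u_{H}, z^{⋆}) = a (u^{⋆} - u_{H}, z^{⋆} - z_{H}) + [F (z_{H}) - a (u_{H}, z_{H})] . \end{matrix}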

Defining the discrete quantity of interest

\begin{matrix} G_{H} (u_{H}, z_{H}) := G (u_{H}) + [F (z_{H}) - a (u_{H}, z_{H})], \qquad (10) \end{matrix}

the goal error can be controlled by means of the Cauchy–Schwarz inequality

\begin{matrix} | G (u^{⋆}) - G_{H} (u_{H}, z_{H}) | \leq | a (u^{⋆} - u_{H}, z^{⋆} - z_{H}) | \leq | | | u^{⋆} - u_{H} | | | \, | | | z^{⋆} - z_{H} | | | . \qquad (11) \end{matrix}

We note that the additional term in (10) is the residual of the discrete primal problem (8) evaluated at an arbitrary function $z_{H} \in X_{H}$ and hence $G (u_{H}^{⋆}) = G_{H} (u_{H}^{⋆}, z_{H})$ .
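The effect of the residual correction in (10) can be seen in a finite-dimensional analogue. The sketch below is an illustration under assumptions: the space $H_{0}^{1} (Ω)$ is replaced by $R^{n}$, the bilinear form by a random SPD matrix `A`, and `f`, `g`, `u_h`, `z_h` are hypothetical data and perturbed approximations. It computes the corrected goal value and compares its error with the product bound (11).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)       # SPD matrix, a(u, v) = v^T A u
f = rng.standard_normal(n)        # primal right-hand side, F(v) = f^T v
g = rng.standard_normal(n)        # goal functional, G(v) = g^T v

u_star = np.linalg.solve(A, f)    # primal solution
z_star = np.linalg.solve(A, g)    # dual solution (A is symmetric)

# Arbitrary inexact approximations of the primal and dual solutions.
u_h = u_star + 0.1 * rng.standard_normal(n)
z_h = z_star + 0.1 * rng.standard_normal(n)

def energy_norm(v):
    """Energy norm induced by A."""
    return float(np.sqrt(v @ (A @ v)))

goal_exact = float(g @ u_star)
goal_plain = float(g @ u_h)                                   # G(u_h) only
goal_corrected = float(g @ u_h + (f @ z_h - z_h @ (A @ u_h)))  # discrete goal (10)

error_plain = abs(goal_exact - goal_plain)
error_corrected = abs(goal_exact - goal_corrected)
error_product = abs(float((u_star - u_h) @ (A @ (z_star - z_h))))
bound = energy_norm(u_star - u_h) * energy_norm(z_star - z_h)
```

Here `error_corrected` coincides with `error_product` up to rounding and is bounded by `bound`, whereas `error_plain` is in general only of first order in the primal error.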

In the following, we design an adaptive algorithm that provides a computable upper bound to (11) which tends to zero at optimal algebraic rate with respect to the number of elements $# T_{H}$ as well as with respect to the total computational cost.

Mesh refinement

Let $T_{0}$ be a given conforming triangulation of $Ω$. We suppose that the mesh-refinement is a deterministic and fixed strategy, e.g., newest vertex bisection [24]. For each conforming triangulation $T_{H}$ and marked elements $M_{H} \subseteq T_{H}$, let $T_{h} := \operatorname{refine} (T_{H}, M_{H})$ be the coarsest conforming triangulation, where all $T \in M_{H}$ have been refined, i.e., $M_{H} \subseteq T_{H} \setminus T_{h}$. We write $T_{h} \in T (T_{H})$, if $T_{h}$ results from $T_{H}$ by finitely many steps of refinement. To abbreviate notation, let $T := T (T_{0})$. We note that the order on $T$ is respected by the finite element spaces, i.e., $T_{h} \in T (T_{H})$ implies that $X_{H} \subseteq X_{h}$.

We further suppose that each refined element has at least two sons, i.e.,

\begin{matrix} \# (T_{H} \setminus T_{h}) + \# T_{H} \leq \# T_{h} \quad \text{for all } T_{H} \in T \text{ and all } T_{h} \in T (T_{H}), \qquad (12) \end{matrix}

and that the refinement rule satisfies the mesh-closure estimate

\begin{matrix} \# T_{ℓ} - \# T_{0} \leq C_{cls} \sum_{j = 0}^{ℓ - 1} \# M_{j} \quad \text{for all } ℓ \in N, \qquad (13) \end{matrix}

where $C_{cls} > 0$ depends only on $T_{0}$ . For newest vertex bisection, this has been proved under an additional admissibility assumption on $T_{0}$ in [1, 24] and for 2D even without any additional assumption in [18]. Finally, we suppose that the overlay estimate holds, i.e., for all triangulations $T_{H}, T_{h} \in T$ , there exists a common refinement $T_{H} \oplus T_{h} \in T (T_{H}) \cap T (T_{h})$ which satisfies that

\begin{matrix} \# (T_{H} \oplus T_{h}) \leq \# T_{H} + \# T_{h} - \# T_{0}, \qquad (14) \end{matrix}

which has been proved in [6, 23] for newest vertex bisection.

Estimator properties

For $T_{H} \in T$ and $v_{H} \in X_{H}$ , let

\begin{matrix} η_{H} (T, v_{H}) \geq 0 \quad \text{and} \quad ζ_{H} (T, v_{H}) \geq 0 \quad \text{for all } T \in T_{H} \end{matrix}

be given refinement indicators. For $μ_{H} \in \{ η_{H}, ζ_{H} \}$, we use the usual convention that

\begin{matrix} μ_{H} (v_{H}) := μ_{H} (T_{H}, v_{H}), \quad \text{where} \quad μ_{H} (U_{H}, v_{H}) = \Big( \sum_{T \in U_{H}} μ_{H} {(T, v_{H})}^{2} \Big)^{1 / 2} \end{matrix}

for all $v_{H} \in X_{H}$ and all $U_{H} \subseteq T_{H}$ .

We suppose that the estimators $η_{H}$ and $ζ_{H}$ satisfy the so-called axioms of adaptivity (which are designed for, but not restricted to, weighted-residual error estimators) from [5]: There exist constants $C_{stab}, C_{rel}, C_{drel} > 0$ and $0 < q_{red} < 1$ such that for all $T_{H} \in T (T_{0})$ and all $T_{h} \in T (T_{H})$ , the following assumptions are satisfied:

(A1) Stability: For all $v_{h} \in X_{h}$, $v_{H} \in X_{H}$, and $U_{H} \subseteq T_{h} \cap T_{H}$, it holds that
$\begin{matrix} | η_{h} (U_{H}, v_{h}) - η_{H} (U_{H}, v_{H}) | + | ζ_{h} (U_{H}, v_{h}) - ζ_{H} (U_{H}, v_{H}) | \leq C_{stab} \, | | | v_{h} - v_{H} | | | . \end{matrix}$
(A2) Reduction: For all $v_{H} \in X_{H}$, it holds that
$\begin{matrix} η_{h} (T_{h} \setminus T_{H}, v_{H}) \leq q_{red} \, η_{H} (T_{H} \setminus T_{h}, v_{H}) \quad \text{and} \\ ζ_{h} (T_{h} \setminus T_{H}, v_{H}) \leq q_{red} \, ζ_{H} (T_{H} \setminus T_{h}, v_{H}) . \end{matrix}$
(A3) Reliability: The Galerkin solutions $u_{H}^{⋆}, z_{H}^{⋆} \in X_{H}$ to (8) satisfy that
$\begin{matrix} | | | u^{⋆} - u_{H}^{⋆} | | | \leq C_{rel} \, η_{H} (u_{H}^{⋆}) \quad \text{and} \quad | | | z^{⋆} - z_{H}^{⋆} | | | \leq C_{rel} \, ζ_{H} (z_{H}^{⋆}) . \end{matrix}$
(A4) Discrete reliability: The Galerkin solutions $u_{H}^{⋆}, z_{H}^{⋆} \in X_{H}$ and $u_{h}^{⋆}, z_{h}^{⋆} \in X_{h}$ to (8) satisfy that
$\begin{matrix} | | | u_{h}^{⋆} - u_{H}^{⋆} | | | \leq C_{drel} \, η_{H} (T_{H} \setminus T_{h}, u_{H}^{⋆}) \quad \text{and} \quad | | | z_{h}^{⋆} - z_{H}^{⋆} | | | \leq C_{drel} \, ζ_{H} (T_{H} \setminus T_{h}, z_{H}^{⋆}) . \end{matrix}$

By assumptions (A1) and (A3), we can estimate for every discrete function $w_{H} \in X_{H}$ the errors in the energy norm of the primal and the dual problem by

\begin{matrix} | | | u^{⋆} - w_{H} | | | \leq C \, [η_{H} (w_{H}) + | | | u_{H}^{⋆} - w_{H} | | |] \quad \text{and} \\ | | | z^{⋆} - w_{H} | | | \leq C \, [ζ_{H} (w_{H}) + | | | z_{H}^{⋆} - w_{H} | | |], \end{matrix}

respectively, where $C = \max \{ C_{rel}, C_{rel} C_{stab} + 1 \} > 0$. Together with (11), we then obtain that the goal error for approximations $u_{H}^{m} \approx u_{H}^{⋆}$ and $z_{H}^{n} \approx z_{H}^{⋆}$ in $X_{H}$ is bounded by

\begin{matrix} | G (u^{⋆}) - G_{H} (u_{H}^{m}, z_{H}^{n}) | \leq C^{2} \, [η_{H} (u_{H}^{m}) + | | | u_{H}^{⋆} - u_{H}^{m} | | |] \, [ζ_{H} (z_{H}^{n}) + | | | z_{H}^{⋆} - z_{H}^{n} | | |] . \qquad (16) \end{matrix}

In the following sections, we provide building blocks for our adaptive algorithm that allow us to control the arising estimators (by a suitable marking strategy) as well as the arising norms in the upper bound of (16) (by an appropriate stopping criterion for the iterative solver).

Marking strategy

We suppose that the refinement indicators $η_{H} (T, u_{H}^{m})$ and $ζ_{H} (T, z_{H}^{n})$ for some $m, n \in N$ are used to mark a subset $M_{H} \subseteq T_{H}$ of elements for refinement, which, for fixed marking parameter $0 < θ \leq 1$ , satisfies that

\begin{matrix} 2 θ \, η_{H} {(u_{H}^{m})}^{2} ζ_{H} {(z_{H}^{n})}^{2} \leq η_{H} {(M_{H}, u_{H}^{m})}^{2} ζ_{H} {(z_{H}^{n})}^{2} + ζ_{H} {(M_{H}, z_{H}^{n})}^{2} η_{H} {(u_{H}^{m})}^{2} . \qquad (17) \end{matrix}

Remark 2

Given $0 < ϑ \leq 1$ , possible choices of marking strategies satisfying assumption (17) are the following:

(a) The strategy proposed in [2] defines the weighted estimator
$\begin{matrix} ρ_{H} {(T, u_{H}^{m}, z_{H}^{n})}^{2} := η_{H} {(T, u_{H}^{m})}^{2} ζ_{H} {(z_{H}^{n})}^{2} + η_{H} {(u_{H}^{m})}^{2} ζ_{H} {(T, z_{H}^{n})}^{2} \end{matrix}$
and then determines a set $M_{H} \subseteq T_{H}$ such that
$\begin{matrix} ϑ \, ρ_{H} (u_{H}^{m}, z_{H}^{n}) \leq ρ_{H} (M_{H}, u_{H}^{m}, z_{H}^{n}), \qquad (18) \end{matrix}$
which is the Dörfler marking criterion introduced in [9] and well-known in the context of AFEM analysis; see, e.g., [5]. This strategy satisfies (17) with $θ = ϑ^{2}$.
(b) The strategy proposed in [20] determines sets ${\bar{M}}_{H}^{u}, {\bar{M}}_{H}^{z} \subseteq T_{H}$ such that
$\begin{matrix} ϑ \, η_{H} (u_{H}^{m}) \leq η_{H} ({\bar{M}}_{H}^{u}, u_{H}^{m}) \quad \text{and} \quad ϑ \, ζ_{H} (z_{H}^{n}) \leq ζ_{H} ({\bar{M}}_{H}^{z}, z_{H}^{n}) \qquad (19) \end{matrix}$
and then chooses $M_{H} := \operatorname{arg\,min} \{ \# {\bar{M}}_{H}^{u}, \# {\bar{M}}_{H}^{z} \}$. This strategy satisfies (17) with $θ = ϑ^{2} / 2$.
(c) A more aggressive variant of (b) was proposed in [14]: Let ${\bar{M}}_{H}^{u}$ and ${\bar{M}}_{H}^{z}$ be as above. Then, choose $M_{H}^{u} \subseteq {\bar{M}}_{H}^{u}$ and $M_{H}^{z} \subseteq {\bar{M}}_{H}^{z}$ with $\# M_{H}^{u} = \# M_{H}^{z} = \min \{ \# {\bar{M}}_{H}^{u}, \# {\bar{M}}_{H}^{z} \}$. Finally, define $M_{H} := M_{H}^{u} \cup M_{H}^{z}$. Again, this strategy satisfies (17) with $θ = ϑ^{2} / 2$.

Note that our main results of Theorems 6 and 8 below hold true for all presented marking criteria (a)–(c). For our numerical experiments, we focus on criterion (a), which empirically tends to achieve slightly better performance in practice.
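The marking strategy of Remark 2 based on the weighted estimator $ρ_{H}$ and the Dörfler criterion (18) can be sketched as follows. This is a minimal illustration under assumptions: the function name `mark_weighted_doerfler` and the sample indicator values are hypothetical, and the set of minimal cardinality is found by sorting, which costs $O (N \log N)$ rather than the linear cost discussed in the references.

```python
import numpy as np

def mark_weighted_doerfler(eta_T, zeta_T, vartheta):
    """Return indices of a minimal set M with vartheta * rho <= rho(M), cf. (18).

    eta_T, zeta_T: element-wise primal and dual refinement indicators.
    """
    eta_T = np.asarray(eta_T, dtype=float)
    zeta_T = np.asarray(zeta_T, dtype=float)
    eta_sq = np.sum(eta_T**2)                 # global eta^2
    zeta_sq = np.sum(zeta_T**2)               # global zeta^2
    # Element-wise weighted estimator rho(T)^2.
    rho_sq_T = eta_T**2 * zeta_sq + eta_sq * zeta_T**2
    order = np.argsort(-rho_sq_T)             # largest contributions first
    cumulative = np.cumsum(rho_sq_T[order])
    target = vartheta**2 * cumulative[-1]     # squared form of (18)
    # First position where the cumulative sum reaches the target.
    count = int(np.searchsorted(cumulative, target, side="left")) + 1
    count = min(count, len(order))
    return order[:count]

# Hypothetical indicator values on a mesh with four elements.
eta_T = np.array([0.5, 0.1, 0.3, 0.2])
zeta_T = np.array([0.1, 0.4, 0.2, 0.3])
vartheta = 0.7
marked = mark_weighted_doerfler(eta_T, zeta_T, vartheta)
```

Since the sum of all $ρ_{H} (T, \cdot, \cdot)^{2}$ equals $2 η_{H}^{2} ζ_{H}^{2}$, the returned set satisfies (17) with $θ = ϑ^{2}$, which can be verified directly on the sample data.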

Adaptive algorithm

Any adaptive algorithm strives to drive down the bound in (16). However, the errors of the iterative solver, $| | | u_{H}^{⋆} - u_{H}^{m} | | |$ and $| | | z_{H}^{⋆} - z_{H}^{n} | | |$ , cannot be computed in general since the exact discrete solutions $u_{H}^{⋆}, z_{H}^{⋆} \in X_{H}$ to (8) are unknown and will not be computed. Thus, we note that (9) and the triangle inequality prove that

\begin{matrix} (1 - q_{ctr}) \, | | | u_{H}^{⋆} - u_{H}^{m - 1} | | | \leq | | | u_{H}^{m} - u_{H}^{m - 1} | | | \leq (1 + q_{ctr}) \, | | | u_{H}^{⋆} - u_{H}^{m - 1} | | | \qquad (20a) \end{matrix}

as well as

\begin{matrix} (1 - q_{ctr}) \, | | | z_{H}^{⋆} - z_{H}^{n - 1} | | | \leq | | | z_{H}^{n} - z_{H}^{n - 1} | | | \leq (1 + q_{ctr}) \, | | | z_{H}^{⋆} - z_{H}^{n - 1} | | | . \qquad (20b) \end{matrix}

With $C_{goal} = \max \{ C_{rel}, C_{rel} C_{stab} + 1 \} \, (1 + q_{ctr} / (1 - q_{ctr}))$, (16) leads to

\begin{matrix} | G (u^{⋆}) - G_{H} (u_{H}^{m}, z_{H}^{n}) | \leq C_{goal}^{2} \, [η_{H} (u_{H}^{m}) + | | | u_{H}^{m} - u_{H}^{m - 1} | | |] \, [ζ_{H} (z_{H}^{n}) + | | | z_{H}^{n} - z_{H}^{n - 1} | | |], \qquad (21) \end{matrix}

which is a computable upper bound to the goal error if $m, n \geq 1$. Moreover, given some $λ_{ctr} > 0$, this motivates stopping the iterative solvers as soon as

\begin{matrix} | | | u_{H}^{m} - u_{H}^{m - 1} | | | \leq λ_{ctr} \, η_{H} (u_{H}^{m}) \quad \text{and} \quad | | | z_{H}^{n} - z_{H}^{n - 1} | | | \leq λ_{ctr} \, ζ_{H} (z_{H}^{n}) \end{matrix}

to equibalance the contributions of the upper bound in (21); alternative stopping criteria are introduced and analyzed below. Overall, we thus consider the following adaptive algorithm.

Algorithm 3

Let $u_{0}^{0}, z_{0}^{0} \in X_{0}$ be initial guesses. Let $0 < θ \leq 1$ as well as $λ_{ctr} > 0$ be arbitrary but fixed marking parameters. For all $ℓ = 0, 1, 2, \dots$ , perform the following steps (i)–(vi):

(i) Employ (at least one step of) the iterative solver to compute iterates $u_{ℓ}^{1}, \dots, u_{ℓ}^{m}$ and $z_{ℓ}^{1}, \dots, z_{ℓ}^{n}$ together with the corresponding refinement indicators $η_{ℓ} (T, u_{ℓ}^{k})$ and $ζ_{ℓ} (T, z_{ℓ}^{k})$ for all $T \in T_{ℓ}$, until
$\begin{matrix} | | | u_{ℓ}^{m} - u_{ℓ}^{m - 1} | | | \leq λ_{ctr} \, η_{ℓ} (u_{ℓ}^{m}) \quad \text{and} \quad | | | z_{ℓ}^{n} - z_{ℓ}^{n - 1} | | | \leq λ_{ctr} \, ζ_{ℓ} (z_{ℓ}^{n}) . \qquad (22) \end{matrix}$
(ii) Define $\underline{m} (ℓ) := m$ and $\underline{n} (ℓ) := n$.
(iii) If $η_{ℓ} (u_{ℓ}^{m}) = 0$ or $ζ_{ℓ} (z_{ℓ}^{n}) = 0$, then define $\underline{ℓ} := ℓ$ and terminate.
(iv) Otherwise, find a set $M_{ℓ} \subseteq T_{ℓ}$ such that the marking criterion (17) is satisfied.
(v) Generate $T_{ℓ + 1} := \operatorname{refine} (T_{ℓ}, M_{ℓ})$.
(vi) Define the initial guesses $u_{ℓ + 1}^{0} := u_{ℓ}^{m}$ and $z_{ℓ + 1}^{0} := z_{ℓ}^{n}$ for the iterative solver.
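Step (i) of Algorithm 3, for one of the two problems, can be sketched as a generic loop. The sketch below makes several assumptions that are not part of the algorithm above: the solver is damped Richardson iteration on a toy 1D Laplacian matrix, and the error estimator is replaced by a constant placeholder, whereas a real estimator depends on the current iterate. The names `solve_until_stopped`, `step`, `estimator`, and `norm` are hypothetical.

```python
import numpy as np

def solve_until_stopped(step, estimator, norm, x0, lam_ctr, max_steps=10_000):
    """Iterate x^m = step(x^{m-1}) until |||x^m - x^{m-1}||| <= lam_ctr * estimator(x^m).

    At least one solver step is performed. Returns the accepted iterate,
    its index m, and the norm of the last increment.
    """
    x_prev = x0
    for m in range(1, max_steps + 1):
        x = step(x_prev)
        increment = norm(x - x_prev)
        if increment <= lam_ctr * estimator(x):
            return x, m, increment
        x_prev = x
    raise RuntimeError("stopping criterion not reached within max_steps")

# Toy setup (assumption): 1D Laplacian matrix with damped Richardson solver.
n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
eigs = np.linalg.eigvalsh(A)
alpha = 2.0 / (eigs[0] + eigs[-1])
q_ctr = (eigs[-1] - eigs[0]) / (eigs[-1] + eigs[0])   # energy contraction factor

step = lambda x: x + alpha * (b - A @ x)               # one contractive solver step
norm = lambda v: float(np.sqrt(v @ (A @ v)))           # energy norm
estimator = lambda x: 1e-2                             # placeholder for the estimator value
lam_ctr = 0.5

x_m, m, increment = solve_until_stopped(step, estimator, norm, np.zeros(n), lam_ctr)
solver_error = norm(np.linalg.solve(A, b) - x_m)       # not computable in practice
```

By contraction and the triangle inequality, the (in practice unknown) solver error of the accepted iterate is at most `q_ctr / (1 - q_ctr) * increment`, which is why the computable increment suffices to steer the termination.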

Remark 4

Theorem 6 below proves (linear) convergence for any choice of the marking parameters $0 < θ \leq 1$ and $λ_{ctr} > 0$ , and for any of the marking strategies from Remark 2. Theorem 8 below proves optimal convergence rates (with respect to the number of elements and the total computational cost) if both parameters are sufficiently small (see (32) for the precise condition) and if the set $M_{ℓ}$ is constructed by one of the strategies from Remark 2, where the respective sets have quasi-minimal cardinality.

Remark 5

Note that Algorithm 3(i) requires evaluating the error estimator after each solver step. Clearly, it would be favorable to replace $η_{ℓ} (u_{ℓ}^{m})$ (resp. $ζ_{ℓ} (z_{ℓ}^{n})$) by $η_{ℓ} (u_{ℓ}^{0})$ (resp. $ζ_{ℓ} (z_{ℓ}^{0})$) in (22). Arguing as in [13, Lemma 8], this allows us to prove convergence of the adaptive strategy, but full linear convergence (Theorem 6 below) and optimal convergence rates (Theorem 8 below) are expected to fail.

For each adaptive level $ℓ$ , Algorithm 3 performs at least one solver step to compute $u_{ℓ}^{m}$ as well as one solver step to compute $z_{ℓ}^{n}$ . By definition, $\underline{m} (ℓ) \geq 1$ is the solver step, for which the discrete solution $u_{ℓ}^{\underline{m} (ℓ)}$ is accepted (to contribute to the set of marked elements $M_{ℓ}$ ). Analogously, $\underline{n} (ℓ) \geq 1$ is the solver step, for which the discrete solution $z_{ℓ}^{\underline{n} (ℓ)}$ is accepted (to contribute to $M_{ℓ}$ ). If the iterative solver for either the primal or the dual problem fails to terminate for some level $ℓ \in N_{0}$ , i.e., (22) cannot be achieved for finite m, or n, we define $\underline{m} (ℓ) : = \infty$ , or $\underline{n} (ℓ) : = \infty$ , respectively, and $\underline{ℓ} : = ℓ$ . With $\underline{k} (ℓ) : = max {\underline{m} (ℓ), \underline{n} (ℓ)}$ , we define

\begin{matrix} u_{ℓ}^{k} & := u_{ℓ}^{\underline{m} (ℓ)} & \text{for all } k \in N \text{ with } \underline{m} (ℓ) < k \leq \underline{k} (ℓ), \\ z_{ℓ}^{k} & := z_{ℓ}^{\underline{n} (ℓ)} & \text{for all } k \in N \text{ with } \underline{n} (ℓ) < k \leq \underline{k} (ℓ) . \qquad (23) \end{matrix}

For ease of presentation, we omit the $ℓ$ -dependence of the indices for final iterates $\underline{m} (ℓ)$ , $\underline{n} (ℓ)$ , and $\underline{k} (ℓ)$ in the following, if they appear as upper indices and write, e.g., $u_{ℓ}^{\underline{m}} : = u_{ℓ}^{\underline{m} (ℓ)}$ and $u_{ℓ}^{\underline{m} - 1} : = u_{ℓ}^{\underline{m} (ℓ) - 1}$ . If Algorithm 3 does not terminate in step (iii) for some $ℓ \in N$ , then we define $\underline{ℓ} : = \infty$ . To formulate the convergence of Algorithm 3, we define the ordered set

\begin{matrix} Q := \{ (ℓ, k) \in N_{0}^{2} : ℓ \leq \underline{ℓ} \text{ and } 1 \leq k \leq \underline{k} (ℓ) \}, \quad \text{where} \quad | (ℓ, k) | := k + \sum_{j = 0}^{ℓ - 1} \underline{k} (j) . \qquad (24) \end{matrix}

Note that $| (ℓ, k) |$ is proportional to the overall number of solver steps to compute the estimator product $η_{ℓ} (u_{ℓ}^{k}) ζ_{ℓ} (z_{ℓ}^{k})$ . Additionally, we sometimes require the notation

\begin{matrix} Q_{0} := \{ (ℓ, k) \in N_{0}^{2} : ℓ \leq \underline{ℓ} \text{ and } 0 \leq k \leq \underline{k} (ℓ) \} = Q \cup \{ (ℓ, 0) \in N_{0}^{2} : ℓ \leq \underline{ℓ} \} . \qquad (25) \end{matrix}

To estimate the work necessary to compute a pair $(u_{ℓ}^{k}, z_{ℓ}^{k}) \in X_{ℓ} \times X_{ℓ}$ , we make the following assumptions which are usually satisfied in practice:

The iterates $u_{ℓ}^{k}$ and $z_{ℓ}^{k}$ are computed in parallel and each step of the solver in Algorithm 3(i) can be done in linear complexity $O (# T_{ℓ})$ ;
Computation of all indicators $η_{ℓ} (T, u_{ℓ}^{k})$ and $ζ_{ℓ} (T, z_{ℓ}^{k})$ for $T \in T_{ℓ}$ requires $O (# T_{ℓ})$ steps;
The marking in Algorithm 3(iv) can be performed at linear cost $O (\# T_{ℓ})$ (according to [23], this can be done for the strategies outlined in Remark 2 with $M_{ℓ}$ having almost minimal cardinality; moreover, we refer to our recent algorithm in [21] with linear cost even for $M_{ℓ}$ having minimal cardinality);
We have linear cost $O (# T_{ℓ})$ to generate the new mesh $T_{ℓ + 1}$ .

Since a step $(ℓ, k) \in Q$ of Algorithm 3 depends on the full history of preceding steps, the total work spent to compute $(u_{ℓ}^{k}, z_{ℓ}^{k}) \in X_{ℓ} \times X_{ℓ}$ is then of order

\begin{matrix} \operatorname{work} (ℓ, k) := \sum_{\substack{(ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) |}} \# T_{ℓ^{'}} \quad \text{for all } (ℓ, k) \in Q . \qquad (26) \end{matrix}

Finally, we note that Algorithm 3(vi) employs nested iteration to obtain the initial guesses $u_{ℓ + 1}^{0}, z_{ℓ + 1}^{0}$ of the solver from the final iterates $u_{ℓ}^{\underline{m}}, z_{ℓ}^{\underline{n}}$ for the mesh $T_{ℓ}$. According to (21), this allows for a posteriori error control for all indices $(ℓ, k) \in Q_{0} \setminus \{ (0, 0) \}$ beyond the initial step.
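The step counter $| (ℓ, k) |$ and the cost measure (26) reduce to simple bookkeeping. The following sketch assumes that the numbers of solver steps per level and the numbers of elements per level are given as plain Python lists; the names `k_final` and `n_elements` and the sample values are hypothetical.

```python
def step_index(level, k, k_final):
    """Total step counter |(level, k)| = k + sum of final solver steps on coarser levels."""
    return k + sum(k_final[:level])

def work(level, k, k_final, n_elements):
    """Cumulative cost (26): sum of #T over all steps in Q up to and including (level, k)."""
    total = 0
    for lp in range(level + 1):
        # All solver steps on coarser levels count, on the current level only the first k.
        steps_on_level = k_final[lp] if lp < level else k
        total += steps_on_level * n_elements[lp]
    return total

# Hypothetical run: three levels with 3, 2, 4 solver steps and 4, 8, 16 elements.
k_final = [3, 2, 4]
n_elements = [4, 8, 16]
example_index = step_index(2, 1, k_final)          # 1 + (3 + 2) = 6
example_work = work(2, 1, k_final, n_elements)     # 3*4 + 2*8 + 1*16 = 44
```

The closed form in `work` uses that every step on a coarser level precedes every step on a finer level in the ordering induced by the step counter.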

Main results

Linear convergence with optimal rates

Our first main result states linear convergence of the quasi-error product

\begin{matrix} Λ_{ℓ}^{k} := [| | | u_{ℓ}^{⋆} - u_{ℓ}^{k} | | | + η_{ℓ} (u_{ℓ}^{k})] \, [| | | z_{ℓ}^{⋆} - z_{ℓ}^{k} | | | + ζ_{ℓ} (z_{ℓ}^{k})] \quad \text{for all } (ℓ, k) \in Q_{0} \qquad (27) \end{matrix}

for every choice of the stopping parameter $λ_{ctr} > 0$. Recall from (16) that the quasi-error product is, up to the constant $C^{2}$, an upper bound for the error $| G (u^{⋆}) - G_{ℓ} (u_{ℓ}^{k}, z_{ℓ}^{k}) |$. Moreover, if $k = \underline{k} (ℓ)$, then (20) and (22) give that $Λ_{ℓ}^{\underline{k}} ≃ η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}})$.

Theorem 6

Suppose (A1)–(A3). Suppose that $0 < θ \leq 1$ and $λ_{ctr} > 0$ . Then, Algorithm 3 satisfies linear convergence in the sense of

\begin{matrix} Λ_{ℓ^{'}}^{k^{'}} \leq C_{lin} \, q_{lin}^{| (ℓ^{'}, k^{'}) | - | (ℓ, k) |} Λ_{ℓ}^{k} \quad \text{for all } (ℓ, k), (ℓ^{'}, k^{'}) \in Q \cup \{ (0, 0) \} \text{ with } | (ℓ^{'}, k^{'}) | \geq | (ℓ, k) | . \qquad (28) \end{matrix}

The constants $C_{lin} > 0$ and $0 < q_{lin} < 1$ depend only on $C_{stab}$ , $q_{red}$ , $C_{rel}$ , $q_{ctr}$ , and the (arbitrary) adaptivity parameters $0 < θ \leq 1$ and $λ_{ctr} > 0$ .

Full linear convergence implies that convergence rates with respect to degrees of freedom and with respect to total computational cost are equivalent. From this point of view, full linear convergence indeed turns out to be the core argument for optimal complexity.

Corollary 7

Recall the definition of the total computational cost $\operatorname{work} (ℓ, k)$ from (26). Let $r > 0$ and $C_{r} := \sup_{(ℓ, k) \in Q} {(\# T_{ℓ} - \# T_{0} + 1)}^{r} Λ_{ℓ}^{k} \in [0, \infty]$. Then, under the assumptions of Theorem 6, it holds that

\begin{matrix} C_{r} \leq \sup_{(ℓ, k) \in Q} {(\# T_{ℓ})}^{r} Λ_{ℓ}^{k} \leq \sup_{(ℓ, k) \in Q} \operatorname{work} {(ℓ, k)}^{r} Λ_{ℓ}^{k} \leq C_{rate} \, C_{r}, \qquad (29) \end{matrix}

where the constant $C_{rate} > 0$ depends only on r, $# T_{0}$ , and on the constants $q_{lin}, C_{lin}$ from Theorem 6.

Proof

The first two estimates in (29) are obvious. It remains to prove the last estimate in (29). To this end, note that it follows from the definition of $C_{r}$ that

\begin{matrix} \# T_{ℓ} - \# T_{0} + 1 \leq (Λ_{ℓ}^{k})^{- 1 / r} C_{r}^{1 / r} \quad \text{for all } (ℓ, k) \in Q . \end{matrix}

Moreover, elementary algebra yields that

\begin{matrix} \# T_{ℓ^{'}} \leq \# T_{0} \, (\# T_{ℓ^{'}} - \# T_{0} + 1) \quad \text{for all } (ℓ^{'}, 0) \in Q_{0} . \end{matrix}

For $(ℓ, k) \in Q$, Theorem 6 and the geometric series thus show that

\begin{matrix} \operatorname{work} (ℓ, k) & \overset{(26)}{=} \sum_{\substack{(ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) |}} \# T_{ℓ^{'}} \leq \# T_{0} \sum_{\substack{(ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) |}} (\# T_{ℓ^{'}} - \# T_{0} + 1) \\ & \leq \# T_{0} \, C_{r}^{1 / r} \sum_{\substack{(ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) |}} (Λ_{ℓ^{'}}^{k^{'}})^{- 1 / r} \leq \# T_{0} \, C_{r}^{1 / r} C_{lin}^{1 / r} \frac{1}{1 - q_{lin}^{1 / r}} (Λ_{ℓ}^{k})^{- 1 / r} . \end{matrix}

With $C_{rate} := {(\# T_{0})}^{r} C_{lin} / {(1 - q_{lin}^{1 / r})}^{r}$, this gives that

\begin{matrix} \operatorname{work} {(ℓ, k)}^{r} Λ_{ℓ}^{k} \leq C_{rate} \, C_{r} \quad \text{for all } (ℓ, k) \in Q . \end{matrix}

This shows the final inequality in (29) and thus concludes the proof.
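The chain of inequalities in (29) can be checked on synthetic data. The sketch below is an illustration under assumptions, not an output of Algorithm 3: the quasi-error product is set to $Λ = q_{lin}^{n}$ at the $n$-th step, so that linear convergence holds with $C_{lin} = 1$, and the mesh sizes and step counts are hypothetical. The function name `corollary7_check` is likewise illustrative.

```python
def corollary7_check(n_elements, k_final, q_lin, r):
    """Evaluate the three relevant quantities of (29) for a synthetic, linearly convergent run.

    Returns (C_r, sup of work^r * Lambda, C_rate * C_r) with C_lin = 1.
    """
    # Enumerate all steps (level, k) in their natural order and attach Lambda = q_lin**index.
    steps = []
    index = 0
    for n_el, k_max in zip(n_elements, k_final):
        for _ in range(k_max):
            index += 1
            steps.append((n_el, q_lin**index))

    n0 = n_elements[0]
    C_r = max((n_el - n0 + 1) ** r * lam for n_el, lam in steps)
    C_rate = n0**r / (1.0 - q_lin ** (1.0 / r)) ** r   # constant from the proof, C_lin = 1

    cumulative_work = 0
    sup_work = 0.0
    for n_el, lam in steps:
        cumulative_work += n_el                         # work(l, k) up to this step
        sup_work = max(sup_work, cumulative_work**r * lam)
    return C_r, sup_work, C_rate * C_r

# Hypothetical run with five levels.
C_r, sup_work, upper = corollary7_check(
    n_elements=[4, 8, 16, 32, 64], k_final=[2, 3, 1, 2, 2], q_lin=0.5, r=1.0
)
```

For this data the three values are ordered as predicted, `C_r <= sup_work <= upper`; the sketch only illustrates the statement and does not replace the proof.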

If $θ$ and $λ_{ctr}$ are small enough, we are able to show that linear convergence from Theorem 6 even guarantees optimal rates with respect to both the number of unknowns $# T_{ℓ}$ and the total cost $work (ℓ, k)$ . Given $N \in N_{0}$ , let $T (N)$ be the set of all $T_{H} \in T$ with $# T_{H} - # T_{0} \leq N$ . With

\begin{matrix} ‖ u^{⋆} ‖_{A_{r}} := \sup_{N \in N_{0}} {(N + 1)}^{r} \min_{T_{opt} \in T (N)} η_{opt} (u_{opt}^{⋆}) \in [0, \infty] \qquad (30a) \end{matrix}

and

\begin{matrix} ‖ z^{⋆} ‖_{A_{r}} := \sup_{N \in N_{0}} {(N + 1)}^{r} \min_{T_{opt} \in T (N)} ζ_{opt} (z_{opt}^{⋆}) \in [0, \infty] \qquad (30b) \end{matrix}

for all $r > 0$ , there holds the following result.

Theorem 8

Recall the definition of the total computational cost $work (ℓ, k)$ from (26). Suppose the mesh properties (12)–(14) as well as the axioms (A1)–(A4). Define

\begin{matrix} θ_{⋆} := \frac{1}{1 + C_{stab}^{2} C_{drel}^{2}} \quad \text{and} \quad λ_{⋆} := \frac{1 - q_{ctr}}{q_{ctr} C_{stab}} . \qquad (31) \end{matrix}

Let both adaptivity parameters $0 < θ \leq 1$ and $0 < λ_{ctr} < λ_{⋆}$ be sufficiently small such that

\begin{matrix} 0 < \Big( \frac{\sqrt{2 θ} + λ_{ctr} / λ_{⋆}}{1 - λ_{ctr} / λ_{⋆}} \Big)^{2} < θ_{⋆} . \qquad (32) \end{matrix}

Let $1 \leq C_{mark} < \infty$ . Suppose that the set of marked elements $M_{ℓ}$ in Algorithm 3(iv) is constructed by one of the strategies from Remark 2(a)–(c), where the sets in (18) and (19) have up to the factor $C_{mark}$ minimal cardinality. Let $s, t > 0$ with $‖ u^{⋆} ‖_{A_{s}} + {‖ z^{⋆} ‖}_{A_{t}} < \infty$ . Then, there exists a constant $C_{opt} > 0$ such that

\begin{matrix} \sup_{(ℓ, k) \in Q} \operatorname{work} {(ℓ, k)}^{s + t} Λ_{ℓ}^{k} \leq C_{opt} \max \{ ‖ u^{⋆} ‖_{A_{s}} ‖ z^{⋆} ‖_{A_{t}}, Λ_{0}^{0} \} . \qquad (33) \end{matrix}

The constant $C_{opt}$ depends only on $C_{cls}$ , $C_{stab}$ , $q_{red}$ , $C_{rel}$ , $C_{drel}$ , $q_{ctr}$ , $C_{mark}$ , $θ$ , $λ_{ctr}$ , $# T_{0}$ , s, and t.

Remark 9

The constraint (32) is enforced by our analysis of the marking strategy from Remark 2(a), while the marking strategies from Remark 2(b)–(c) allow us to relax the condition to

\begin{matrix} 0 < (\frac{\sqrt{θ} + λ_{ctr} / λ_{⋆}}{1 - λ_{ctr} / λ_{⋆}})^{2} < θ_{⋆} . \end{matrix}
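The parameter constraints of Theorem 8 and Remark 9 can be checked mechanically once the constants are known. The sketch below is only an illustration: the function names and the flag `relaxed` (which switches from $\sqrt{2 θ}$ to $\sqrt{θ}$) are hypothetical, and in practice the constants $C_{stab}$, $C_{drel}$, and $q_{ctr}$ are rarely available explicitly.

```python
import math


def theta_star(c_stab: float, c_drel: float) -> float:
    # theta_* = 1 / (1 + C_stab^2 C_drel^2)
    return 1.0 / (1.0 + c_stab ** 2 * c_drel ** 2)


def lambda_star(q_ctr: float, c_stab: float) -> float:
    # lambda_* = (1 - q_ctr) / (q_ctr C_stab)
    return (1.0 - q_ctr) / (q_ctr * c_stab)


def parameters_admissible(theta, lam_ctr, q_ctr, c_stab, c_drel, relaxed=False):
    """Return True if (theta, lam_ctr) satisfies the smallness constraint.

    relaxed=False uses sqrt(2*theta), relaxed=True uses sqrt(theta).
    """
    lam_s = lambda_star(q_ctr, c_stab)
    if not (0.0 < theta <= 1.0 and 0.0 < lam_ctr < lam_s):
        return False
    ratio = lam_ctr / lam_s
    root = math.sqrt(theta if relaxed else 2.0 * theta)
    value = ((root + ratio) / (1.0 - ratio)) ** 2
    return 0.0 < value < theta_star(c_stab, c_drel)
```

With the (made-up) constants $C_{stab} = C_{drel} = 1$ and $q_{ctr} = 0.5$, the pair $θ = 0.3$, $λ_{ctr} = 0.01$ violates the first constraint but satisfies the relaxed one.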

Alternative termination criteria for iterative solver

The above formulation of Algorithm 3 stops the iterative solver for $u_{ℓ}^{m}$ and the iterative solver for $z_{ℓ}^{n}$ independently of each other as soon as the respective termination criteria in (22) are satisfied. In this section, we briefly discuss two alternative termination criteria:

Stronger termination: The current proof of linear convergence (and the subsequent proof of optimal convergence) only exploits that $u_{ℓ}^{\underline{k}}$ and $z_{ℓ}^{\underline{k}}$ satisfy the stopping criterion, while the previous iterates do not (cf. Lemma 10(iii)). This can also be ensured by the following modification of Algorithm 3(i):

(i)
Employ the iterative solver to compute iterates $u_{ℓ}^{1}, \dots, u_{ℓ}^{k}$ and $z_{ℓ}^{1}, \dots, z_{ℓ}^{k}$ together with the corresponding refinement indicators $η_{ℓ} (T, u_{ℓ}^{k})$ and $ζ_{ℓ} (T, z_{ℓ}^{k})$ for all $T \in T_{ℓ}$ , until
$\begin{matrix} | | | u_{ℓ}^{k} - u_{ℓ}^{k - 1} | | | \leq λ_{ctr} η_{ℓ} (u_{ℓ}^{k}) and | | | z_{ℓ}^{k} - z_{ℓ}^{k - 1} | | | \leq λ_{ctr} ζ_{ℓ} (z_{ℓ}^{k}) . \end{matrix}$ (35)

Note that this will lead to more solver steps, since now $k = \underline{k} (ℓ)$ (if it exists) is the smallest index for which the stopping criterion holds simultaneously for both $u_{ℓ}^{\underline{k}}$ and $z_{ℓ}^{\underline{k}}$ .

Inspecting the proof of Lemma 10 below, we see that all results hold verbatim also for this stopping criterion. Thus, we conclude linear and optimal convergence (in the sense of Theorem 6 and Theorem 8) also in this case.

Natural termination: The following stopping criterion (arguably the most natural candidate) also leads to linear convergence: Let $\underline{m} (ℓ), \underline{n} (ℓ) \in N$ be minimal with (22). If either of them does not exist, we again set $\underline{m} (ℓ) = \infty$ or $\underline{n} (ℓ) = \infty$ , respectively. Define $\underline{k} (ℓ) : = max {\underline{m} (ℓ), \underline{n} (ℓ)}$ . Then, employ the iterative solver $\underline{k} (ℓ)$ times for both the primal and the dual problem, i.e., the solver provides iterates $u_{ℓ}^{k}$ and $z_{ℓ}^{k}$ until both stopping criteria in (22) have been satisfied once (which avoids the artificial definition (23)). For instance, if $\underline{m} (ℓ) < \underline{n} (ℓ) = \underline{k} (ℓ) < \infty$ , we continue to iterate for the primal problem until $u_{ℓ}^{\underline{k}}$ is obtained (or never stop the iteration if $\underline{n} (ℓ) = \underline{k} (ℓ) = \infty$ ). If $λ_{ctr} > 0$ is sufficiently small such that $1 - \frac{q_{ctr}}{1 - q_{ctr}} C_{stab} (1 + q_{ctr}) λ_{ctr} > 0$ , then we can define

\begin{matrix} λ_{ctr} \leq λ_{ctr}^{'} : = max {1, \frac{(1 + q_{ctr}) q_{ctr}}{(1 - q_{ctr}) (1 - \frac{q_{ctr}}{1 - q_{ctr}} C_{stab} (1 + q_{ctr}) λ_{ctr})}} λ_{ctr} < \infty, \end{matrix}

and we can guarantee the stopping condition (22) with the larger constant $λ_{ctr}^{'}$ , i.e.,

\begin{matrix} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq λ_{ctr}^{'} η_{ℓ} (u_{ℓ}^{\underline{k}}) and | | | z_{ℓ}^{\underline{k}} - z_{ℓ}^{\underline{k} - 1} | | | \leq λ_{ctr}^{'} ζ_{ℓ} (z_{ℓ}^{\underline{k}}) ; \end{matrix}

see the proof below. Again, we notice that then the assumptions of Lemma 10 below are met. Hence, we conclude linear convergence (in the sense of Theorem 6) also for this stopping criterion. Moreover, optimal rates in the sense of Theorem 8 hold if $λ_{ctr}$ in (32) is replaced by $λ_{ctr}^{'}$ .
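To make the difference between the termination variants concrete, the following sketch computes the stopping indices from recorded solver data. It is a toy illustration under stated assumptions: the function names are hypothetical, the lists hold the increments $| | | u_{ℓ}^{k} - u_{ℓ}^{k - 1} | | |$ and the estimator values at each solver step $k$ (index 0 is the initial guess and is ignored), and `math.inf` means that the criterion was not met within the recorded steps.

```python
import math


def first_stop(increments, estimators, lam_ctr):
    """Smallest k >= 1 with increments[k] <= lam_ctr * estimators[k], else inf."""
    for k in range(1, len(increments)):
        if increments[k] <= lam_ctr * estimators[k]:
            return k
    return math.inf


def stopping_indices(inc_u, eta, inc_z, zeta, lam_ctr):
    """Return (m, n, k_natural, k_stronger) for one mesh level."""
    m = first_stop(inc_u, eta, lam_ctr)   # primal criterion met for the first time
    n = first_stop(inc_z, zeta, lam_ctr)  # dual criterion met for the first time
    k_natural = max(m, n)                 # natural termination: both met at least once
    k_stronger = math.inf                 # stronger termination: both met at the same step
    for k in range(1, min(len(inc_u), len(inc_z))):
        if inc_u[k] <= lam_ctr * eta[k] and inc_z[k] <= lam_ctr * zeta[k]:
            k_stronger = k
            break
    return m, n, k_natural, k_stronger
```

By construction, the index of the stronger criterion is never smaller than that of the natural one, which reflects the remark that the stronger criterion leads to more solver steps.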

Proof of (36)

Without loss of generality, let us assume that $\underline{m} (ℓ) < \underline{k} (ℓ) = \underline{n} (ℓ) < \infty$ . First, we have that

\begin{matrix} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{m}} | | | \leq | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{k}} | | | + | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | \leq (1 + q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)}) | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | . \end{matrix}

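Then, using solver contraction (9), the fact that $u_{ℓ}^{\underline{m}}$ satisfies the stopping criterion in (22), stability (A1), and the previous estimate, we get that

\begin{matrix} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | & \leq \frac{q_{ctr}}{1 - q_{ctr}} | | | u_{ℓ}^{\underline{m}} - u_{ℓ}^{\underline{m} - 1} | | | \leq \frac{q_{ctr}}{1 - q_{ctr}} λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{m}}) \\ & \leq \frac{q_{ctr}}{1 - q_{ctr}} λ_{ctr} [η_{ℓ} (u_{ℓ}^{\underline{k}}) + C_{stab} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{m}} | | |] \\ & \leq \frac{q_{ctr}}{1 - q_{ctr}} λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k}}) + \frac{C_{stab} q_{ctr}}{1 - q_{ctr}} (1 + q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)}) λ_{ctr} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | . \end{matrix}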

For $λ_{ctr} < (1 - q_{ctr}) / [C_{stab} q_{ctr} (1 + q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)})]$ we can absorb the last term to obtain

\begin{matrix} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | \leq \frac{q_{ctr}}{1 - q_{ctr}} (1 - \frac{C_{stab} q_{ctr}}{1 - q_{ctr}} (1 + q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)}) λ_{ctr})^{- 1} λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k}}) . \end{matrix}

Finally, we observe that

\begin{matrix} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq (1 + q_{ctr}) | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{k} - 1} | | | \leq (1 + q_{ctr}) q_{ctr}^{\underline{k} - \underline{m} - 1} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | . \end{matrix}

Combining the last two estimates we obtain that

\begin{matrix} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq \frac{(1 + q_{ctr}) q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)}}{(1 - q_{ctr}) (1 - \frac{q_{ctr}}{1 - q_{ctr}} C_{stab} (1 + q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)}) λ_{ctr})} λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k}}) . \end{matrix}

Hence, (36) follows with $q_{ctr}^{\underline{k} (ℓ) - \underline{m} (ℓ)} \leq q_{ctr}$ and $| | | z_{ℓ}^{\underline{k}} - z_{ℓ}^{\underline{k} - 1} | | | \leq λ_{ctr} ζ_{ℓ} (z_{ℓ}^{\underline{k}}) \leq λ_{ctr}^{'} ζ_{ℓ} (z_{ℓ}^{\underline{k}})$ . $□$
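For given constants, the enlarged parameter $λ_{ctr}^{'}$ is straightforward to evaluate. The helper below is a hedged sketch (the function name is hypothetical, and the constants $q_{ctr}$ and $C_{stab}$ are assumed to be known, which is usually not the case in practice); it raises an error if the smallness condition on $λ_{ctr}$ is violated.

```python
def lambda_ctr_prime(lam_ctr: float, q_ctr: float, c_stab: float) -> float:
    """Evaluate lambda'_ctr = max{1, factor} * lambda_ctr with
    factor = (1 + q) q / ((1 - q) (1 - q / (1 - q) * C_stab * (1 + q) * lambda_ctr))."""
    inner = 1.0 - q_ctr / (1.0 - q_ctr) * c_stab * (1.0 + q_ctr) * lam_ctr
    if inner <= 0.0:
        raise ValueError("lam_ctr is too large: smallness condition violated")
    factor = (1.0 + q_ctr) * q_ctr / ((1.0 - q_ctr) * inner)
    return max(1.0, factor) * lam_ctr
```

Since the prefactor is at least one, the returned value is never smaller than $λ_{ctr}$ itself.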

Numerical examples

In this section, we consider two numerical examples which solve the equation

\begin{matrix} \begin{matrix} - Δ u^{⋆} & = f in Ω, \\ u^{⋆} & = 0 on Γ_{D}, \\ \nabla u^{⋆} \cdot n & = ϕ on Γ_{N}, \end{matrix} \end{matrix}

where $ϕ \in L^{2} (Γ_{N})$ and $n$ is the element-wise outwards facing unit normal vector. We refer the reader to Remark 1 for a comment on the applicability of our results to this model problem. We further suppose that the goal functional is a slight variant of the one proposed in [20], i.e.,

\begin{matrix} G (v) = - \int_{ω} \nabla v \cdot g d x for v \in H_{D}^{1} (Ω), \end{matrix}

with a subset $ω \subseteq Ω$ and a fixed direction $g (x) = g_{0} \in R^{2}$ . Moreover, for error estimation, we employ standard residual error estimators, which in our case, for all $(ℓ, k) \in Q$ and all $T \in T_{ℓ}$ , read

\begin{matrix} η_{ℓ} {(T, u_{ℓ}^{k})}^{2} & : = h_{T}^{2} {‖ Δ u_{ℓ}^{k} + f ‖}_{L^{2} (T)}^{2} + h_{T} {‖ [[\nabla u_{ℓ}^{k} \cdot n]] ‖}_{L^{2} (\partial T \cap Ω)}^{2} + h_{T} {‖ \nabla u_{ℓ}^{k} \cdot n - ϕ ‖}_{L^{2} (\partial T \cap Γ_{N})}^{2}, \\ ζ_{ℓ} {(T, z_{ℓ}^{k})}^{2} & : = h_{T}^{2} {‖ div (\nabla z_{ℓ}^{k} + g) ‖}_{L^{2} (T)}^{2} + h_{T} {‖ [[(\nabla z_{ℓ}^{k} + g) \cdot n]] ‖}_{L^{2} (\partial T \cap Ω)}^{2}, \end{matrix}

where $h_{T} = {| T |}^{1 / 2}$ is the local mesh-width and $[[\cdot]]$ denotes the jump across interior edges. It is well-known [5, 14] that $η_{ℓ}$ and $ζ_{ℓ}$ satisfy the assumptions (A1)–(A4). The examples are chosen to showcase the performance of the proposed GOAFEM algorithm for different types of singularities.

Throughout this section, we solve (37) as well as the corresponding dual problem numerically using Algorithm 3, where we make the following choices:

We solve the problems on the lowest order finite element space, i.e., with polynomial degree $p = 1$ .
As initial values, we use $u_{0}^{0} = z_{0}^{0} = 0$ .
To solve the arising linear systems, we use a preconditioned conjugate gradient (PCG) method with an optimal additive Schwarz preconditioner. We refer to [8, 22] for details and, in particular, the proof that this iterative solver satisfies (9).
We use the marking criterion from Remark 2(a) and choose $M_{ℓ}$ such that it has minimal cardinality.
Unless mentioned otherwise, we use $θ = 0.5$ and $λ_{ctr} = 10^{- 5}$ .
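A set of marked elements with minimal cardinality can be obtained by sorting. The sketch below assumes that the marking criterion has the combined form $2 θ η_{ℓ}^{2} ζ_{ℓ}^{2} \leq η_{ℓ} {(M_{ℓ})}^{2} ζ_{ℓ}^{2} + η_{ℓ}^{2} ζ_{ℓ} {(M_{ℓ})}^{2}$, i.e., a Dörfler criterion for the weighted indicators $ρ_{T} = η_{T}^{2} ζ_{ℓ}^{2} + η_{ℓ}^{2} ζ_{T}^{2}$. This form, the function name, and the input layout (lists of squared element indicators) are assumptions made for illustration.

```python
def mark_minimal(eta_sq, zeta_sq, theta):
    """Return indices of a minimal-cardinality set M with
    sum_{T in M} rho_T >= theta * sum_T rho_T, where
    rho_T = eta_T^2 * zeta^2 + eta^2 * zeta_T^2 (assumed combined indicator)."""
    eta_tot = sum(eta_sq)
    zeta_tot = sum(zeta_sq)
    rho = [e * zeta_tot + eta_tot * z for e, z in zip(eta_sq, zeta_sq)]
    target = theta * sum(rho)  # equals 2 * theta * eta^2 * zeta^2
    # Greedy selection of the largest indicators yields minimal cardinality.
    order = sorted(range(len(rho)), key=lambda i: rho[i], reverse=True)
    marked, acc = [], 0.0
    for i in order:
        if acc >= target:
            break
        marked.append(i)
        acc += rho[i]
    return marked
```

Sorting costs $O (N log N)$ operations; a marking step of linear cost would require, e.g., a binning or selection strategy instead, possibly at the price of a factor $C_{mark} > 1$ in the cardinality.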

Singularity in goal functional only

In our first example, the primal problem is (37) with $f = 2 x_{1} (1 - x_{1}) + 2 x_{2} (1 - x_{2})$ on the unit square $Ω = {(0, 1)}^{2}$ , and $Γ_{D} = \partial Ω$ (and thus, $Γ_{N} = \emptyset$ ). For this problem, the exact solution reads

\begin{matrix} u^{⋆} (x) = x_{1} x_{2} (1 - x_{1}) (1 - x_{2}) . \end{matrix}
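That this $u^{⋆}$ indeed solves $- Δ u^{⋆} = f$ can be confirmed by a short computation. The sketch below uses central second differences in exact rational arithmetic; since $u^{⋆}$ is quadratic in each variable, these differences reproduce the second derivatives exactly. The function names and the step size are illustrative choices.

```python
from fractions import Fraction as Fr


def u_exact(x1, x2):
    return x1 * x2 * (1 - x1) * (1 - x2)


def f_rhs(x1, x2):
    return 2 * x1 * (1 - x1) + 2 * x2 * (1 - x2)


def neg_laplace_u(x1, x2, h=Fr(1, 10)):
    """-Laplace(u) via central second differences (exact for this polynomial)."""
    d11 = (u_exact(x1 + h, x2) - 2 * u_exact(x1, x2) + u_exact(x1 - h, x2)) / h ** 2
    d22 = (u_exact(x1, x2 + h) - 2 * u_exact(x1, x2) + u_exact(x1, x2 - h)) / h ** 2
    return -(d11 + d22)
```

For instance, `neg_laplace_u(Fr(1, 3), Fr(1, 4)) == f_rhs(Fr(1, 3), Fr(1, 4))` evaluates to `True`, and `u_exact` vanishes on the boundary of the unit square.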

The goal functional is (38) with $ω = T_{1} : = {x \in Ω : x_{1} + x_{2} \geq 3 / 2}$ and $g_{0} = (- 1, 0)$ . The exact goal value can be computed analytically to be

\begin{matrix} G (u^{⋆}) = \int_{T_{1}} \frac{\partial u^{⋆}}{\partial x_{1}} d x = - 11 / 960 . \end{matrix}

The initial mesh $T_{0}$ as well as a visualization of the set $T_{1}$ can be seen in Fig. 1.

For this setting, we compare our iterative solver to a conjugate gradient method without preconditioner in Fig. 2, where we plot the computable upper bound from (21),

\begin{matrix} Ξ_{ℓ}^{k} : = [η_{ℓ} (u_{ℓ}^{k}) + | | | u_{ℓ}^{k} - u_{ℓ}^{k - 1} | | |] [ζ_{ℓ} (z_{ℓ}^{k}) + | | | z_{ℓ}^{k} - z_{ℓ}^{k - 1} | | |] for all (ℓ, k) \in Q, \end{matrix}

over $work (ℓ, k)$ for all iterates $(ℓ, k) \in Q$ and the estimator product for the final iterates $η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}})$ over $# T_{ℓ}$ . We stress that, for $(ℓ, k) \in Q$ , the computable upper bound $Ξ_{ℓ}^{k}$ and the quasi-error product $Λ_{ℓ}^{k}$ from (27) are related by $Λ_{ℓ}^{k} ≲ Ξ_{ℓ}^{k} ≲ Λ_{ℓ}^{k - 1}$ so that linear convergence (28) with optimal rates (33) of $Λ_{ℓ}^{k}$ also yields linear convergence with optimal rates of $Ξ_{ℓ}^{k}$ . Since in our experiments $λ_{ctr} = 10^{- 5}$ is small, it is plausible to assume that the final estimates on every level approximate the exact solutions sufficiently well in the sense of estimator products, i.e., $η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}}) \approx η_{ℓ} (u_{ℓ}^{⋆}) ζ_{ℓ} (z_{ℓ}^{⋆})$ (cf. Lemma 13 below) for which [14] proves optimal convergence rates with respect to $# T_{ℓ}$ . Indeed, we see optimal rates for $η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}})$ with respect to $# T_{ℓ}$ for both solvers in Fig. 2. However, the non-preconditioned CG method fails to satisfy uniform contraction (9) and thus Theorem 8 cannot be applied. In fact, Fig. 2 shows that this method fails to drive down $Ξ_{ℓ}^{k}$ with optimal rates with respect to $work (ℓ, k)$ (cf. (26)), as opposed to the optimally preconditioned CG method.

Fig. 2 — Comparison between iterative solvers for the problem from Sect. 4.1. A conjugate gradient method without preconditioner (CG) leads to optimal rates with respect to $# T_{ℓ}$ for the final iterates where $k = \underline{k} (ℓ)$ , but not with respect to $work (ℓ, k)$ for every $(ℓ, k) \in Q$ . Our choice of the iterative solver (ML) achieves optimal rates with respect to both measures
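Convergence rates such as those shown in the figure are typically read off as slopes in a log-log plot. A minimal sketch of such a fit is given below; the function name and the least-squares approach are illustrative assumptions and not necessarily the procedure used to produce the figures.

```python
import math


def fitted_rate(work, xi):
    """Least-squares slope of log(xi) over log(work), returned as a positive rate s
    such that xi behaves like work**(-s)."""
    xs = [math.log(w) for w in work]
    ys = [math.log(v) for v in xi]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return -slope
```

Applied to the pairs $(work (ℓ, k), Ξ_{ℓ}^{k})$, this gives an empirical rate with respect to the total cost; applied to $(# T_{ℓ}, η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}}))$, it gives the rate with respect to the number of elements.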

Furthermore, we plot in Fig. 3 different error measures over $work (ℓ, k)$ for every iterate $(ℓ, k) \in Q$ . This shows that the corrector term

\begin{matrix} a (u_{ℓ}^{k}, z_{ℓ}^{k}) - F (z_{ℓ}^{k}) \end{matrix}

(which is the residual of $u_{ℓ}^{k}$ evaluated at the dual solution $z_{ℓ}^{k}$ ) in the definition of the discrete goal functional (10) is indeed necessary. We see that throughout the iteration, the goal value $G (u_{ℓ}^{k})$ oscillates strongly and, for large values of $λ_{ctr}$ , even shows a different rate than $Ξ_{ℓ}^{k}$ over $work (ℓ, k)$ . In general, we thus cannot expect the quantity $Ξ_{ℓ}^{k}$ to bound the uncorrected goal-error $| G (u^{⋆}) - G (u_{ℓ}^{k}) |$ .

Fig. 3 — Comparison between $Ξ_{ℓ}^{k}$ , discrete goal $G_{ℓ} (u_{ℓ}^{k}, z_{ℓ}^{k})$ , primal residual evaluated at the dual solution $z_{ℓ}^{k}$ , and direct evaluation of goal functional $G (u_{ℓ}^{k})$ for every iterate $(ℓ, k) \in Q$ and different values of $λ_{ctr} \in {1, 10^{- 2}, 10^{- 4}, 10^{- 6}}$ . The primal residual evaluated at the dual solution $z_{ℓ}^{k}$ is the difference between goal and discrete goal; see (10)

For the discrete goal, the corrector term compensates the oscillations of the goal functional, such that their sum decreases with the same rate as $Ξ_{ℓ}^{k}$ , as predicted by (21). Smaller values of $λ_{ctr}$ imply that on every level $ℓ$ the approximate solutions $u_{ℓ}^{k}, z_{ℓ}^{k}$ are computed more accurately, such that the corrector term becomes smaller and the effect on the rate of the goal value becomes negligible.
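The effect of the corrector term can be reproduced in a small algebraic model. The sketch below assumes the sign convention $G_{ℓ} (u, z) = G (u) - [a (u, z) - F (z)]$ for the discrete goal and replaces the PDE by a symmetric positive definite $2 \times 2$ system; all names and numbers are hypothetical. In this model, the corrected goal error equals $a (u^{⋆} - u, z^{⋆} - z)$, i.e., it is of product type.

```python
def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


def discrete_goal(A, F, G, u, z):
    # G(u) minus the primal residual a(u, z) - F(z) tested with z (assumed convention).
    return dot(G, u) - (dot(z, matvec(A, u)) - dot(F, z))


def corrected_goal_error(A, F, G, u, z):
    """Return (G(u*) - G_l(u, z), a(u* - u, z* - z)); both agree for symmetric A."""
    u_star = solve2(A, F)
    z_star = solve2(A, G)
    e_u = [s - t for s, t in zip(u_star, u)]
    e_z = [s - t for s, t in zip(z_star, z)]
    lhs = dot(G, u_star) - discrete_goal(A, F, G, u, z)
    rhs = dot(e_z, matvec(A, e_u))
    return lhs, rhs
```

The uncorrected error $G (u^{⋆}) - G (u)$, in contrast, is only linear in the primal error, which is consistent with the oscillations observed for $G (u_{ℓ}^{k})$.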

Geometrical singularity

Our second example is the classical example of a geometric singularity on the so-called Z-shape $Ω = {(- 1, 1)}^{2} \ conv {(- 1, - 1), (0, 0), (- 1, 0)}$ , where $Γ_{D}$ is only the re-entrant corner (cf. Fig. 4). The primal problem is (37) with $f = 0$ and $ϕ = \nabla u^{⋆} \cdot n$ , where the exact solution in polar coordinates r(x) and $φ (x)$ of $x \in R^{2}$ is prescribed as

\begin{matrix} u^{⋆} (x) = r {(x)}^{4 / 7} sin (\frac{4}{7} φ (x) + \frac{3 π}{7}) . \end{matrix}

The goal functional is (38) with $ω = T_{2} : = {(- 0.5, 0.5)}^{2} \cap Ω$ and $g_{0} = (- 1, - 1)$ , and the exact goal value can be computed via numerical integration to be

\begin{matrix} G (u^{⋆}) = \int_{T_{2}} (\frac{\partial u^{⋆}}{\partial x_{1}} + \frac{\partial u^{⋆}}{\partial x_{2}}) d x \approx 0.82962247157810 . \end{matrix}

In Fig. 4, the initial triangulation $T_{0}$ as well as the mesh after several iterations of Algorithm 3 can be seen. The adaptive algorithm resolves the singularity at the re-entrant corner, as well as critical points of the goal functional, which are at the corners of $T_{2}$ .

Figure 5 shows the rate of the estimator product $η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}})$ of the final iterates over $# T_{ℓ}$ as well as the rate of $Ξ_{ℓ}^{k}$ over $work (ℓ, k)$ for all $(ℓ, k) \in Q$ .

Fig. 5 — Rates of the estimator product for final iterates over $# T_{ℓ}$ and $Ξ_{ℓ}^{k}$ as well as goal error over $work (ℓ, k)$ for all $(ℓ, k) \in Q$

Proof of Theorem 6

The following core lemma extends one of the key observations of [16] to the present setting, where we stress that the nonlinear product structure of $Δ_{ℓ}^{k}$ leads to technical challenges which go well beyond [16].

Lemma 10

Suppose (A1)–(A3). Then, there exist constants $μ, C_{aux} > 0$ , and $0 < q_{aux} < 1$ , and some scalar sequence ${(R_{ℓ})}_{ℓ \in N_{0}} \subset R$ such that the quasi-error product

\begin{matrix} Δ_{ℓ}^{k} : = [| | | u_{ℓ}^{⋆} - u_{ℓ}^{k} | | | + μ η_{ℓ} (u_{ℓ}^{k})] [| | | z_{ℓ}^{⋆} - z_{ℓ}^{k} | | | + μ ζ_{ℓ} (z_{ℓ}^{k})] for all (ℓ, k) \in Q_{0} \end{matrix}

satisfies the following statements (i)–(v):

(i)
$Δ_{ℓ}^{k} \leq Δ_{ℓ}^{j}$ for all $0 \leq j \leq k \leq \underline{k} (ℓ)$ .
(ii)
$Δ_{ℓ}^{\underline{k} - 1} \leq C_{aux} Δ_{ℓ}^{\underline{k}}$ if $\underline{k} (ℓ) < \infty$ .
(iii)
$Δ_{ℓ}^{k} \leq q_{aux} Δ_{ℓ}^{k - 1}$ for all $0 < k < \underline{k} (ℓ)$ .
(iv)
$Δ_{ℓ + 1}^{0} \leq q_{aux} Δ_{ℓ}^{\underline{k} - 1} + R_{ℓ}$ for all $0 < ℓ < \underline{ℓ}$ .
(v)
$\sum_{ℓ = ℓ^{'}}^{\underline{ℓ} - 1} R_{ℓ}^{2} \leq C_{aux} {(Δ_{ℓ^{'}}^{\underline{k} - 1})}^{2}$ for all $0 \leq ℓ^{'} < \underline{ℓ} - 1$ .

The constants $μ$ , $C_{aux}$ , and $q_{aux}$ depend only on $C_{stab}$ , $q_{red}$ , $C_{rel}$ , and $q_{ctr}$ as well as on the (arbitrary) adaptivity parameters $0 < θ \leq 1$ and $λ_{ctr} > 0$ .

For the following proofs, we define

\begin{matrix} α_{ℓ}^{k} & : = | | | u_{ℓ}^{⋆} - u_{ℓ}^{k} | | |, & x_{ℓ}^{⋆} & : = | | | u_{ℓ + 1}^{⋆} - u_{ℓ}^{⋆} | | |, \\ β_{ℓ}^{k} & : = | | | z_{ℓ}^{⋆} - z_{ℓ}^{k} | | |, & y_{ℓ}^{⋆} & : = | | | z_{ℓ + 1}^{⋆} - z_{ℓ}^{⋆} | | |, \end{matrix}

such that the quasi-error product reads $Δ_{ℓ}^{k} = [α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k})] [β_{ℓ}^{k} + μ ζ_{ℓ} (z_{ℓ}^{k})]$ with a free parameter $μ > 0$ which will be fixed below.

Proof of Lemma 10(i)

Recall from (23) that $u_{ℓ}^{k} = u_{ℓ}^{\underline{m}}$ for all $\underline{m} (ℓ) < k \leq \underline{k} (ℓ)$ . Thus, we have that

\begin{matrix} α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k}) = α_{ℓ}^{\underline{m}} + μ η_{ℓ} (u_{ℓ}^{\underline{m}}) for all \underline{m} (ℓ) < k \leq \underline{k} (ℓ) . \end{matrix}

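For $0 < k < \underline{m} (ℓ)$ , on the other hand, the solution $u_{ℓ}^{k}$ is obtained by one step of the iterative solver. From stability (A1) and solver contraction (9), we have for all $0 \leq j < k \leq \underline{m} (ℓ)$ that

\begin{matrix} α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k}) & \leq α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{j}) + μ C_{stab} | | | u_{ℓ}^{k} - u_{ℓ}^{j} | | | \leq (1 + μ C_{stab}) α_{ℓ}^{k} + μ C_{stab} α_{ℓ}^{j} + μ η_{ℓ} (u_{ℓ}^{j}) \\ & \leq (q_{ctr}^{k - j} (1 + μ C_{stab}) + μ C_{stab}) α_{ℓ}^{j} + μ η_{ℓ} (u_{ℓ}^{j}) \leq (q_{ctr} + 2 μ C_{stab}) α_{ℓ}^{j} + μ η_{ℓ} (u_{ℓ}^{j}) . \end{matrix}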

If $μ$ is chosen small enough such that $q_{ctr} + 2 μ C_{stab} \leq 1$ , together with the trivial case $j = k$ , the last two equations show that

\begin{matrix} α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k}) \leq α_{ℓ}^{j} + μ η_{ℓ} (u_{ℓ}^{j}) for all 0 \leq j \leq k \leq \underline{k} (ℓ) . \end{matrix}

The same argument shows that

\begin{matrix} β_{ℓ}^{k} + μ ζ_{ℓ} (z_{ℓ}^{k}) \leq β_{ℓ}^{j} + μ ζ_{ℓ} (z_{ℓ}^{j}) for all 0 \leq j \leq k \leq \underline{k} (ℓ) . \end{matrix}

Multiplication of the last two estimates shows the assertion. $□$

Proof of Lemma 10(ii)

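Recall that for the index $\underline{k} (ℓ)$ there holds (22). From the triangle inequality, we thus get for the primal error that

\begin{matrix} α_{ℓ}^{\underline{k} - 1} \leq α_{ℓ}^{\underline{k}} + | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq α_{ℓ}^{\underline{k}} + λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k}}) . \end{matrix}

Furthermore, stability (A1) leads to

\begin{matrix} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) \leq η_{ℓ} (u_{ℓ}^{\underline{k}}) + C_{stab} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq (1 + λ_{ctr} C_{stab}) η_{ℓ} (u_{ℓ}^{\underline{k}}) . \end{matrix}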

Combining the last two estimates, we see that

\begin{matrix} α_{ℓ}^{\underline{k} - 1} + μ η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) \leq (1 + λ_{ctr} (C_{stab} + μ^{- 1})) [α_{ℓ}^{\underline{k}} + μ η_{ℓ} (u_{ℓ}^{\underline{k}})] . \end{matrix}

Together with the analogous estimate for $β_{ℓ}^{\underline{k} - 1} + μ ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1})$ , we conclude the proof with $C_{aux} = (1 + λ_{ctr} (C_{stab} + μ^{- 1}))^{2}$ . $□$

Proof of Lemma 10(iii)

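Without loss of generality, suppose that $\underline{k} (ℓ) = \underline{m} (ℓ)$ and thus $| | | u_{ℓ}^{k} - u_{ℓ}^{k - 1} | | | > λ_{ctr} η_{ℓ} (u_{ℓ}^{k})$ for all $0 < k < \underline{k} (ℓ)$ . Then, this yields that

\begin{matrix} α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k}) < α_{ℓ}^{k} + μ λ_{ctr}^{- 1} | | | u_{ℓ}^{k} - u_{ℓ}^{k - 1} | | | \leq (1 + μ λ_{ctr}^{- 1}) α_{ℓ}^{k} + μ λ_{ctr}^{- 1} α_{ℓ}^{k - 1} . \end{matrix}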

With contraction of the solver (9), this leads to

\begin{matrix} α_{ℓ}^{k} + μ η_{ℓ} (u_{ℓ}^{k}) \leq q_{ctr} α_{ℓ}^{k - 1} + μ λ_{ctr}^{- 1} (1 + q_{ctr}) α_{ℓ}^{k - 1} for all 0 < k < \underline{k} (ℓ) . \end{matrix}

From (40) for $μ$ small enough, we see that $β_{ℓ}^{k} + μ ζ_{ℓ} (z_{ℓ}^{k}) \leq β_{ℓ}^{k - 1} + μ ζ_{ℓ} (z_{ℓ}^{k - 1})$ . Together with the previous estimate, this shows that

\begin{matrix} Δ_{ℓ}^{k} \leq (q_{ctr} + μ λ_{ctr}^{- 1} (1 + q_{ctr})) Δ_{ℓ}^{k - 1} . \end{matrix}

Up to the choice of $μ$ , this concludes the proof. $□$

Proof of Lemma 10(iv)

First, we note that $η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}}) \neq 0$ , according to Algorithm 3(iii) and the assumption that $ℓ < \underline{ℓ}$ . From reduction of the solver (9) and nested iteration, we get that

\begin{matrix} \begin{matrix} α_{ℓ + 1}^{0} & = | | | u_{ℓ + 1}^{⋆} - u_{ℓ}^{\underline{k}} | | | \leq | | | u_{ℓ + 1}^{⋆} - u_{ℓ}^{⋆} | | | + q_{ctr} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{k} - 1} | | | = x_{ℓ}^{⋆} + q_{ctr} α_{ℓ}^{\underline{k} - 1}, \\ β_{ℓ + 1}^{0} & = | | | z_{ℓ + 1}^{⋆} - z_{ℓ}^{\underline{k}} | | | \leq | | | z_{ℓ + 1}^{⋆} - z_{ℓ}^{⋆} | | | + q_{ctr} | | | z_{ℓ}^{⋆} - z_{ℓ}^{\underline{k} - 1} | | | = y_{ℓ}^{⋆} + q_{ctr} β_{ℓ}^{\underline{k} - 1} \end{matrix} \end{matrix}

and thus

\begin{matrix} α_{ℓ + 1}^{0} β_{ℓ + 1}^{0} \leq q_{ctr}^{2} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1} + q_{ctr} (α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆}) + x_{ℓ}^{⋆} y_{ℓ}^{⋆} . \end{matrix}

For the estimator terms, we have with stability (A1) and reduction (A2) that

\begin{matrix} η_{ℓ + 1} {(u_{ℓ + 1}^{0})}^{2} = η_{ℓ + 1} {(u_{ℓ}^{\underline{k}})}^{2} & = η_{ℓ + 1} {(T_{ℓ + 1} \cap T_{ℓ}, u_{ℓ}^{\underline{k}})}^{2} + η_{ℓ + 1} {(T_{ℓ + 1} \ T_{ℓ}, u_{ℓ}^{\underline{k}})}^{2} \\ \leq η_{ℓ} {(T_{ℓ + 1} \cap T_{ℓ}, u_{ℓ}^{\underline{k}})}^{2} + q_{red}^{2} η_{ℓ} {(T_{ℓ} \ T_{ℓ + 1}, u_{ℓ}^{\underline{k}})}^{2} \\ = η_{ℓ} {(u_{ℓ}^{\underline{k}})}^{2} - (1 - q_{red}^{2}) η_{ℓ} {(T_{ℓ} \ T_{ℓ + 1}, u_{ℓ}^{\underline{k}})}^{2} . \end{matrix}

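On the one hand, with $C_{1} : = C_{stab} (1 + q_{ctr})$ , this implies together with stability (A1) and solver contraction (9) that

\begin{matrix} η_{ℓ + 1} (u_{ℓ + 1}^{0}) \leq η_{ℓ} (u_{ℓ}^{\underline{k}}) \leq η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) + C_{stab} | | | u_{ℓ}^{\underline{k}} - u_{ℓ}^{\underline{k} - 1} | | | \leq η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) + C_{1} α_{ℓ}^{\underline{k} - 1} . \end{matrix}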

On the other hand, with $0 < q_{θ} : = 1 - (1 - q_{red}^{2}) θ < 1$ , we get that

\begin{matrix} \frac{η_{ℓ + 1} {(u_{ℓ + 1}^{0})}^{2}}{η_{ℓ} {(u_{ℓ}^{\underline{k}})}^{2}} \leq q_{θ} + (1 - q_{red}^{2}) [θ - \frac{η_{ℓ} {(T_{ℓ} \ T_{ℓ + 1}, u_{ℓ}^{\underline{k}})}^{2}}{η_{ℓ} {(u_{ℓ}^{\underline{k}})}^{2}}] . \end{matrix}

Using (45), the corresponding estimate for the dual estimator, and the Young inequality, we obtain that

\begin{matrix} \frac{η_{ℓ + 1} (u_{ℓ + 1}^{0})}{η_{ℓ} (u_{ℓ}^{\underline{k}})} \frac{ζ_{ℓ + 1} (z_{ℓ + 1}^{0})}{ζ_{ℓ} (z_{ℓ}^{\underline{k}})} \leq q_{θ} + \frac{(1 - q_{red}^{2})}{2} [2 θ - \frac{η_{ℓ} {(T_{ℓ} \ T_{ℓ + 1}, u_{ℓ}^{\underline{k}})}^{2}}{η_{ℓ} {(u_{ℓ}^{\underline{k}})}^{2}} - \frac{ζ_{ℓ} {(T_{ℓ} \ T_{ℓ + 1}, z_{ℓ}^{\underline{k}})}^{2}}{ζ_{ℓ} {(z_{ℓ}^{\underline{k}})}^{2}}] . \end{matrix}

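The marking criterion (17), which is applicable due to $ℓ < \underline{ℓ}$ , estimates the term in brackets by zero. Thus stability (A1) leads to

\begin{matrix} η_{ℓ + 1} (u_{ℓ + 1}^{0}) ζ_{ℓ + 1} (z_{ℓ + 1}^{0}) \leq q_{θ} η_{ℓ} (u_{ℓ}^{\underline{k}}) ζ_{ℓ} (z_{ℓ}^{\underline{k}}) \leq q_{θ} [η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) + C_{1} α_{ℓ}^{\underline{k} - 1}] [ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) + C_{1} β_{ℓ}^{\underline{k} - 1}] . \end{matrix}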

For the mixed terms in $Δ_{ℓ + 1}^{0}$ , we have with (42) and (44) that

\begin{matrix} \begin{matrix} η_{ℓ + 1} (u_{ℓ + 1}^{0}) β_{ℓ + 1}^{0} & \leq [η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) + C_{1} α_{ℓ}^{\underline{k} - 1}] [y_{ℓ}^{⋆} + q_{ctr} β_{ℓ}^{\underline{k} - 1}] \\ = q_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) β_{ℓ}^{\underline{k} - 1} + η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) y_{ℓ}^{⋆} + C_{1} α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + C_{1} q_{ctr} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1} . \end{matrix} \end{matrix}

Analogously, we see that

\begin{matrix} ζ_{ℓ + 1} (z_{ℓ + 1}^{0}) α_{ℓ + 1}^{0} \leq q_{ctr} ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) α_{ℓ}^{\underline{k} - 1} + ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) x_{ℓ}^{⋆} + C_{1} β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆} + C_{1} q_{ctr} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1} . \end{matrix}

Combining (43) and (46)–(48), we get that

\begin{matrix} Δ_{ℓ + 1}^{0} & = α_{ℓ + 1}^{0} β_{ℓ + 1}^{0} + μ [η_{ℓ + 1} (u_{ℓ + 1}^{0}) β_{ℓ + 1}^{0} + ζ_{ℓ + 1} (z_{ℓ + 1}^{0}) α_{ℓ + 1}^{0}] \\ + μ^{2} η_{ℓ + 1} (u_{ℓ + 1}^{0}) ζ_{ℓ + 1} (z_{ℓ + 1}^{0}) \\ \leq q_{ctr}^{2} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1} + q_{ctr} (α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆}) + x_{ℓ}^{⋆} y_{ℓ}^{⋆} \\ + μ [q_{ctr} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) β_{ℓ}^{\underline{k} - 1} + η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) y_{ℓ}^{⋆} + C_{1} α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + C_{1} q_{ctr} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1}] \\ + μ [q_{ctr} ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) α_{ℓ}^{\underline{k} - 1} + ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) x_{ℓ}^{⋆} + C_{1} β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆} + C_{1} q_{ctr} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1}] \\ + μ^{2} [q_{θ} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) + q_{θ} C_{1} (η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) β_{ℓ}^{\underline{k} - 1} + ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) α_{ℓ}^{\underline{k} - 1}) \\ + C_{1}^{2} α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1}] . \end{matrix}

Rearranging the terms, we obtain that

\begin{matrix} \begin{matrix} Δ_{ℓ + 1}^{0} & \leq (q_{ctr}^{2} + 2 μ q_{ctr} C_{1} + μ^{2} C_{1}^{2}) α_{ℓ}^{\underline{k} - 1} β_{ℓ}^{\underline{k} - 1} \\ + μ (q_{ctr} + μ q_{θ} C_{1}) [η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) β_{ℓ}^{\underline{k} - 1} + ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) α_{ℓ}^{\underline{k} - 1}] \\ + μ^{2} q_{θ} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) + R_{ℓ}, \end{matrix} \end{matrix}

where the remainder term is defined as

\begin{matrix} R_{ℓ} : = μ [η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) y_{ℓ}^{⋆} + ζ_{ℓ} (z_{ℓ}^{\underline{k} - 1}) x_{ℓ}^{⋆}] + (q_{ctr} + μ C_{1}) [α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆}] + x_{ℓ}^{⋆} y_{ℓ}^{⋆} . \end{matrix}

Up to the choice of $μ$ , this concludes the proof. $□$

Proof of Lemma 10 (choosing $μ$ )

For Lemma 10(i), we choose $μ$ small enough such that $q_{ctr} + 2 μ C_{stab} \leq 1$ . From (41) and (49) in the proofs of Lemma 10(iii)–(iv), we see that we additionally require

\begin{matrix} q_{ctr} + μ λ_{ctr}^{- 1} (1 + q_{ctr}) < 1, q_{ctr}^{2} + 2 μ q_{ctr} C_{1} + μ^{2} C_{1}^{2} < 1, and q_{ctr} + μ q_{θ} C_{1} < 1 . \end{matrix}

Choosing $μ$ small enough, we satisfy all estimates. We define $q_{aux} < 1$ as the maximum of all terms in (51) and $q_{θ}$ . $□$
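The choice of $μ$ can be made explicit once the constants are fixed. The following sketch solves each of the four smallness conditions for $μ$ and takes a fraction of the smallest bound; the function name and the safety factor are hypothetical, and the second condition is used in the equivalent form $(q_{ctr} + μ C_{1})^{2} < 1$.

```python
def choose_mu(q_ctr, c_stab, c1, lam_ctr, q_theta, safety=0.5):
    """Return (mu, q_aux) such that all smallness conditions on mu hold."""
    bounds = [
        (1.0 - q_ctr) / (2.0 * c_stab),             # q_ctr + 2 mu C_stab <= 1
        lam_ctr * (1.0 - q_ctr) / (1.0 + q_ctr),    # q_ctr + mu (1 + q_ctr) / lam_ctr < 1
        (1.0 - q_ctr) / c1,                         # (q_ctr + mu C_1)^2 < 1
        (1.0 - q_ctr) / (q_theta * c1),             # q_ctr + mu q_theta C_1 < 1
    ]
    mu = safety * min(bounds)
    q_aux = max(
        q_ctr + mu / lam_ctr * (1.0 + q_ctr),
        (q_ctr + mu * c1) ** 2,
        q_ctr + mu * q_theta * c1,
        q_theta,
    )
    return mu, q_aux
```

Note that the second bound scales with $λ_{ctr}$, so that small values of $λ_{ctr}$ force small values of $μ$.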

Proof of Lemma 10(v)

First, we note that from stability (A1) it follows that

\begin{matrix} η_{ℓ} (u_{ℓ}^{\underline{k} - 1}) ≲ η_{ℓ} (u_{ℓ}^{⋆}) + α_{ℓ}^{\underline{k} - 1} and η_{ℓ} (u_{ℓ}^{⋆}) ζ_{ℓ} (z_{ℓ}^{⋆}) ≲ Δ_{ℓ}^{j} for all 0 \leq j \leq \underline{k} . \end{matrix}

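Furthermore, Galerkin orthogonality and reliability (A3) imply that, for all $n \in N$ with $ℓ^{'} + n < \underline{ℓ}$ ,

\begin{matrix} \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(y_{ℓ}^{⋆})}^{2} = | | | z_{ℓ^{'} + n + 1}^{⋆} - z_{ℓ^{'}}^{⋆} | | |^{2} \leq | | | z^{⋆} - z_{ℓ^{'}}^{⋆} | | |^{2} ≲ ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{⋆})}^{2} . \end{matrix}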

With (52) and (53) for $n = 1$ , we can bound the remainder term from (50) by

\begin{matrix} R_{ℓ} ≲ η_{ℓ} (u_{ℓ}^{⋆}) y_{ℓ}^{⋆} + ζ_{ℓ} (z_{ℓ}^{⋆}) x_{ℓ}^{⋆} + α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆} + β_{ℓ}^{\underline{k} - 1} x_{ℓ}^{⋆} . \end{matrix}

Next, let us recall from [5, Lemma 3.6] the quasi-monotonicity of the estimator, which follows from (A1)–(A3) and the Céa lemma, i.e., for all $ℓ^{'} \leq ℓ < \underline{ℓ}$ ,

\begin{matrix} η_{ℓ} (u_{ℓ}^{⋆}) \leq η_{ℓ^{'}} (u_{ℓ^{'}}^{⋆}) + C_{stab} | | | u_{ℓ}^{⋆} - u_{ℓ^{'}}^{⋆} | | | \leq η_{ℓ^{'}} (u_{ℓ^{'}}^{⋆}) + C_{stab} | | | u^{⋆} - u_{ℓ^{'}}^{⋆} | | | ≲ η_{ℓ^{'}} (u_{ℓ^{'}}^{⋆}) . \end{matrix}

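For $η_{ℓ} (u_{ℓ}^{⋆}) y_{ℓ}^{⋆}$ , we get by summation for all $0 \leq j \leq \underline{k} (ℓ^{'})$ and all $n \in N$ with $ℓ^{'} + n < \underline{ℓ}$ that

\begin{matrix} \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} η_{ℓ} {(u_{ℓ}^{⋆})}^{2} {(y_{ℓ}^{⋆})}^{2} ≲ η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(y_{ℓ}^{⋆})}^{2} ≲ η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{⋆})}^{2} ≲ {(Δ_{ℓ^{'}}^{j})}^{2} . \end{matrix}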

Analogously, we see that

\begin{matrix} \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(x_{ℓ}^{⋆})}^{2} ≲ η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} as well as \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} ζ_{ℓ} {(z_{ℓ}^{⋆})}^{2} {(x_{ℓ}^{⋆})}^{2} ≲ {(Δ_{ℓ^{'}}^{j})}^{2} . \end{matrix}

We proceed with $α_{ℓ}^{\underline{k} - 1} y_{ℓ}^{⋆}$ . From (42) and the Young inequality with $δ > 0$ , we see for $0 < ℓ^{'} \leq ℓ < \underline{ℓ}$ that

\begin{matrix} {(α_{ℓ}^{\underline{k} - 1})}^{2} \leq {(α_{ℓ}^{0})}^{2} \leq (1 + δ^{- 1}) {(x_{ℓ - 1}^{⋆})}^{2} + q_{ctr} (1 + δ) {(α_{ℓ - 1}^{\underline{k} - 1})}^{2} . \end{matrix}

For $δ$ small enough such that $q_{2} : = q_{ctr} (1 + δ) < 1$ and for all $0 \leq ℓ^{'} \leq ℓ < \underline{ℓ}$ , the geometric series proves that

\begin{matrix} {(α_{ℓ}^{\underline{k} - 1})}^{2} \leq (1 + δ^{- 1}) \sum_{j = ℓ^{'}}^{ℓ - 1} {(x_{j}^{⋆})}^{2} + {(α_{ℓ^{'}}^{\underline{k} - 1})}^{2} \sum_{j = 0}^{\infty} q_{2}^{j} ≲ η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} + {(α_{ℓ^{'}}^{\underline{k} - 1})}^{2} \end{matrix}

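and thus

\begin{matrix} \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(α_{ℓ}^{\underline{k} - 1})}^{2} {(y_{ℓ}^{⋆})}^{2} ≲ [η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} + {(α_{ℓ^{'}}^{\underline{k} - 1})}^{2}] \sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(y_{ℓ}^{⋆})}^{2} ≲ [η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} + {(α_{ℓ^{'}}^{\underline{k} - 1})}^{2}] ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{⋆})}^{2} ≲ {(Δ_{ℓ^{'}}^{\underline{k} - 1})}^{2} . \end{matrix}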

Analogously, we see that $\sum_{ℓ = ℓ^{'}}^{ℓ^{'} + n} {(β_{ℓ}^{\underline{k} - 1})}^{2} {(x_{ℓ}^{⋆})}^{2} ≲ {(Δ_{ℓ^{'}}^{\underline{k} - 1})}^{2}$ . Combining all estimates with

\begin{matrix} R_{ℓ}^{2} ≲ η_{ℓ} {(u_{ℓ}^{⋆})}^{2} {(y_{ℓ}^{⋆})}^{2} + ζ_{ℓ} {(z_{ℓ}^{⋆})}^{2} {(x_{ℓ}^{⋆})}^{2} + {(α_{ℓ}^{\underline{k} - 1})}^{2} {(y_{ℓ}^{⋆})}^{2} + {(β_{ℓ}^{\underline{k} - 1})}^{2} {(x_{ℓ}^{⋆})}^{2}, \end{matrix}

we conclude the proof. $□$

With the foregoing auxiliary result, we are in the position to prove linear convergence.

Proof of Theorem 6

Let $(ℓ, k) \in Q$ . We recall the quasi-error products

\begin{matrix} Λ_{ℓ}^{k} & = [| | | u_{ℓ}^{⋆} - u_{ℓ}^{k} | | | + η_{ℓ} (u_{ℓ}^{k})] [| | | z_{ℓ}^{⋆} - z_{ℓ}^{k} | | | + ζ_{ℓ} (z_{ℓ}^{k})], \\ Δ_{ℓ}^{k} & = [| | | u_{ℓ}^{⋆} - u_{ℓ}^{k} | | | + μ η_{ℓ} (u_{ℓ}^{k})] [| | | z_{ℓ}^{⋆} - z_{ℓ}^{k} | | | + μ ζ_{ℓ} (z_{ℓ}^{k})] \end{matrix}

from Theorem 6 and Lemma 10, respectively. Note that

\begin{matrix} Λ_{ℓ}^{k} \leq Δ_{ℓ}^{k} \leq μ^{2} Λ_{ℓ}^{k} if μ \geq 1, Δ_{ℓ}^{k} \leq Λ_{ℓ}^{k} \leq μ^{- 2} Δ_{ℓ}^{k} if μ < 1, \end{matrix}

which yields the equivalence

\begin{matrix} min {1, μ^{2}} Λ_{ℓ}^{k} \leq Δ_{ℓ}^{k} \leq max {1, μ^{2}} Λ_{ℓ}^{k} . \end{matrix}

We first show linear convergence of $Δ_{ℓ}^{k}$ . By Lemma 10(i), we can absorb the term $Δ_{ℓ^{'}}^{\underline{k}} \leq Δ_{ℓ^{'}}^{\underline{k} - 1}$ for all $ℓ^{'}$ . Paying attention to the possible case $k = \underline{k} (ℓ)$ , this allows us to estimate

\begin{matrix} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \geq | (ℓ, k) | \end{matrix}} {(Δ_{ℓ^{'}}^{k^{'}})}^{2} ≲ {(Δ_{ℓ}^{k})}^{2} + \sum_{k^{'} = k}^{\underline{k} (ℓ) - 1} {(Δ_{ℓ}^{k^{'}})}^{2} + \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} \sum_{k^{'} = 0}^{\underline{k} (ℓ^{'}) - 1} {(Δ_{ℓ^{'}}^{k^{'}})}^{2} . \end{matrix}

Lemma 10(iii) shows uniform reduction of the quasi-error on every level. This yields that

\begin{matrix} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \geq | (ℓ, k) | \end{matrix}} {(Δ_{ℓ^{'}}^{k^{'}})}^{2} & ≲ {(Δ_{ℓ}^{k})}^{2} \sum_{k^{'} = k}^{\underline{k} (ℓ)} q_{aux}^{2 (k^{'} - k)} + \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} {(Δ_{ℓ^{'}}^{0})}^{2} \sum_{k^{'} = 0}^{\underline{k} (ℓ^{'}) - 1} q_{aux}^{2 k^{'}} \\ ≲ {(Δ_{ℓ}^{k})}^{2} + \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} {(Δ_{ℓ^{'}}^{0})}^{2} . \end{matrix}

To estimate the sum over all levels, we use that, for the refinement step, Lemma 10(iv) shows contraction up to a remainder term. The Young inequality with $δ > 0$ and Lemma 10(i) then prove that

\begin{matrix} {(Δ_{ℓ^{'}}^{0})}^{2} & \leq q_{aux}^{2} (1 + δ) {(Δ_{ℓ^{'} - 1}^{\underline{k} - 1})}^{2} + (1 + δ^{- 1}) R_{ℓ^{'} - 1}^{2} \\ \leq q_{aux}^{2} (1 + δ) {(Δ_{ℓ^{'} - 1}^{0})}^{2} + (1 + δ^{- 1}) R_{ℓ^{'} - 1}^{2} for all 0 < ℓ^{'} \leq \underline{ℓ} . \end{matrix}

Choosing $δ$ small enough such that $q : = q_{aux}^{2} (1 + δ) < 1$ , we obtain from repeatedly applying the previous estimates that

\begin{matrix} {(Δ_{ℓ^{'}}^{0})}^{2} \leq q^{ℓ^{'} - ℓ} {(Δ_{ℓ}^{\underline{k} - 1})}^{2} + (1 + δ^{- 1}) \sum_{n = ℓ}^{ℓ^{'} - 1} q^{(ℓ^{'} - 1) - n} R_{n}^{2} for all 0 \leq ℓ < ℓ^{'} \leq \underline{ℓ} . \end{matrix}

Using this estimate and a change of summation indices, the geometric series and Lemma 10(v) uniformly bound the sum over all levels by

\begin{matrix} \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} {(Δ_{ℓ^{'}}^{0})}^{2} & ≲ \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} [q^{ℓ^{'} - ℓ} {(Δ_{ℓ}^{\underline{k} - 1})}^{2} + \sum_{n = ℓ}^{ℓ^{'} - 1} q^{(ℓ^{'} - 1) - n} R_{n}^{2}] \\ ≲ {(Δ_{ℓ}^{\underline{k} - 1})}^{2} + \sum_{n = ℓ}^{\underline{ℓ} - 1} R_{n}^{2} \sum_{i = 0}^{\infty} q^{i} ≲ {(Δ_{ℓ}^{\underline{k} - 1})}^{2} + \sum_{n = ℓ}^{\underline{ℓ} - 1} R_{n}^{2} \overset{(v)}{≲} {(Δ_{ℓ}^{\underline{k} - 1})}^{2} . \end{matrix}

Combining the estimates above, we obtain that

\begin{matrix} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \geq | (ℓ, k) | \end{matrix}} {(Δ_{ℓ^{'}}^{k^{'}})}^{2} ≲ {(Δ_{ℓ}^{k})}^{2} + \sum_{ℓ^{'} = ℓ + 1}^{\underline{ℓ}} {(Δ_{ℓ^{'}}^{0})}^{2} ≲ {(Δ_{ℓ}^{k})}^{2} + {(Δ_{ℓ}^{\underline{k} - 1})}^{2} . \end{matrix}

In the case $k < \underline{k} (ℓ)$ , Lemma 10(i) proves that

\begin{matrix} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \geq | (ℓ, k) | \end{matrix}} {(Δ_{ℓ^{'}}^{k^{'}})}^{2} \leq C {(Δ_{ℓ}^{k})}^{2} . \end{matrix}

In the case $k = \underline{k} (ℓ)$ , this follows with Lemma 10(ii). In either case, the constant $C > 0$ depends only on $C_{aux}$ and $q_{aux}$ from Lemma 10. Basic calculus then provides the existence of $C_{lin}^{'} : = {(1 + C)}^{1 / 2} > 1$ and $0 < q_{lin} : = {(1 - C^{- 1})}^{1 / 2} < 1$ such that

\begin{matrix} Δ_{ℓ^{'}}^{k^{'}} \leq C_{lin}^{'} q_{lin}^{| (ℓ^{'}, k^{'}) | - | (ℓ, k) |} Δ_{ℓ}^{k} for all (ℓ, k), (ℓ^{'}, k^{'}) \in Q with (ℓ^{'}, k^{'}) \geq (ℓ, k) ; \end{matrix}

see [5, Lemma 4.9]. Finally, the claim of Theorem 6 follows from (56) with $C_{lin} = max {μ^{- 2}, μ^{2}} C_{lin}^{'}$ . $□$
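The step from tail summability to linear convergence can be illustrated numerically. With $S_{k} : = \sum_{j \geq k} a_{j}^{2} \leq C a_{k}^{2}$, one has $S_{k + 1} = S_{k} - a_{k}^{2} \leq (1 - C^{- 1}) S_{k}$ and hence $a_{k + n}^{2} \leq S_{k + n} \leq (1 - C^{- 1})^{n} C a_{k}^{2}$. The sketch below checks the resulting bound on a finite positive sequence; the function name and the test sequence are illustrative assumptions.

```python
def linear_convergence_check(a):
    """For a finite sequence of positive numbers, compute the smallest C with
    sum_{j>=k} a_j^2 <= C a_k^2 and verify
    a_{k+n} <= (1 + C)^(1/2) * (1 - 1/C)^(n/2) * a_k for all admissible k, n."""
    n_terms = len(a)
    tails = [0.0] * (n_terms + 1)
    for k in range(n_terms - 1, -1, -1):
        tails[k] = tails[k + 1] + a[k] ** 2
    C = max(tails[k] / a[k] ** 2 for k in range(n_terms))
    q_lin = (1.0 - 1.0 / C) ** 0.5
    c_lin = (1.0 + C) ** 0.5
    ok = all(
        a[k + n] <= c_lin * q_lin ** n * a[k]
        for k in range(n_terms)
        for n in range(n_terms - k)
    )
    return C, q_lin, ok
```

The sequence need not be monotone for the bound to hold, which matches the situation of the quasi-error product across solver and refinement steps.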

Proof of Theorem 8 (optimal rates)

We recall the following comparison lemma from [12]. While [12] is concerned with point errors in boundary element computations, we stress that the proof of [12, Lemma 14] works on a completely abstract level and thus is applicable here as well.

Lemma 11

([12, Lemma 14]) The overlay estimate (14) and the axioms (A1)–(A2) and (A4) yield the existence of a constant $C_{1} > 0$ such that, given $0 < κ < 1$ , each mesh $T_{H} \in T$ admits some refinement $T_{h} \in T (T_{H})$ such that for all $s, t > 0$ , it holds that

\begin{matrix} η_{h} {(u_{h}^{⋆})}^{2} ζ_{h} {(z_{h}^{⋆})}^{2} & \leq κ^{2} η_{H} {(u_{H}^{⋆})}^{2} ζ_{H} {(z_{H}^{⋆})}^{2}, \end{matrix}

57a

\begin{matrix} # T_{h} - # T_{H} & \leq 2 (C_{1} κ^{- 1} ‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} (η_{H} (u_{H}^{⋆}) ζ_{H} (z_{H}^{⋆}))^{- 1 / (s + t)} . \end{matrix}

57b

The constant $C_{1}$ depends only on $C_{stab}$ , $q_{red}$ , and $C_{drel}$ . $□$

Note that (57a) immediately implies that

\begin{matrix} η_{h} {(u_{h}^{⋆})}^{2} \leq κ η_{H} {(u_{H}^{⋆})}^{2} or ζ_{h} {(z_{h}^{⋆})}^{2} \leq κ ζ_{H} {(z_{H}^{⋆})}^{2} . \end{matrix}

We will employ this lemma in combination with the so-called optimality of Dörfler marking from [5].

Lemma 12

([5, Proposition 4.12]) Under (A1) and (A4), for all $0 < Θ^{'} < 1 / (1 + C_{stab}^{2} C_{drel}^{2})$ , there exists $0 < κ_{Θ^{'}} < 1$ such that for all $T_{H} \in T$ and all $T_{h} \in T (T_{H})$ , (58) with $κ = κ_{Θ^{'}}$ implies that

\begin{matrix} Θ^{'} η_{H} {(u_{H}^{⋆})}^{2} \leq η_{H} {(T_{H} \setminus T_{h}, u_{H}^{⋆})}^{2} or Θ^{'} ζ_{H} {(z_{H}^{⋆})}^{2} \leq ζ_{H} {(T_{H} \setminus T_{h}, z_{H}^{⋆})}^{2} . \end{matrix}

The constant $κ_{Θ^{'}}$ depends only on $C_{stab}$ , $C_{drel}$ , and $Θ^{'}$ . $□$

The next lemma is already implicitly found in [15]. It shows that, if $λ_{ctr} > 0$ is sufficiently small, then Dörfler marking for the exact discrete solution implicitly implies Dörfler marking for the approximate discrete solution. This will turn out to be the key observation to prove optimal convergence rates. We include the proof for the convenience of the reader.

Lemma 13

Suppose (A1)–(A3). Let $0 < Θ \leq 1$ and $0 < λ_{ctr} < λ_{⋆} : = (1 - q_{ctr}) / (q_{ctr} C_{stab})$ . Define $Θ^{'} : = (\frac{\sqrt{Θ} + λ_{ctr} / λ_{⋆}}{1 - λ_{ctr} / λ_{⋆}})^{2}$ . Then, as soon as the iterative solver terminates (22), there hold the following statements (i)–(iv) for all $0 \leq ℓ < \underline{ℓ}$ and all $U_{ℓ} \subseteq T_{ℓ}$ :

(i)
$(1 - λ_{ctr} / λ_{⋆}) η_{ℓ} (u_{ℓ}^{\underline{m}}) \leq η_{ℓ} (u_{ℓ}^{⋆}) \leq (1 + λ_{ctr} / λ_{⋆}) η_{ℓ} (u_{ℓ}^{\underline{m}})$ .
(ii)
$Θ η_{ℓ} {(u_{ℓ}^{\underline{m}})}^{2} \leq η_{ℓ} {(U_{ℓ}, u_{ℓ}^{\underline{m}})}^{2}$ provided that $Θ^{'} η_{ℓ} {(u_{ℓ}^{⋆})}^{2} \leq η_{ℓ} {(U_{ℓ}, u_{ℓ}^{⋆})}^{2}$ .
(iii)
$(1 - λ_{ctr} / λ_{⋆}) ζ_{ℓ} (z_{ℓ}^{\underline{n}}) \leq ζ_{ℓ} (z_{ℓ}^{⋆}) \leq (1 + λ_{ctr} / λ_{⋆}) ζ_{ℓ} (z_{ℓ}^{\underline{n}})$ .
(iv)
$Θ ζ_{ℓ} {(z_{ℓ}^{\underline{n}})}^{2} \leq ζ_{ℓ} {(U_{ℓ}, z_{ℓ}^{\underline{n}})}^{2}$ provided that $Θ^{'} ζ_{ℓ} {(z_{ℓ}^{⋆})}^{2} \leq ζ_{ℓ} {(U_{ℓ}, z_{ℓ}^{⋆})}^{2}$ .

Proof

It holds that

\begin{matrix} η_{ℓ} (U_{ℓ}, u_{ℓ}^{⋆}) & \overset{(A1)}{\leq} η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) + C_{stab} | | | u_{ℓ}^{⋆} - u_{ℓ}^{\underline{m}} | | | \\ & \overset{(20)}{\leq} η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) + C_{stab} \frac{q_{ctr}}{1 - q_{ctr}} | | | u_{ℓ}^{\underline{m}} - u_{ℓ}^{\underline{m} - 1} | | | \\ & \overset{(22)}{\leq} η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) + C_{stab} \frac{q_{ctr}}{1 - q_{ctr}} λ_{ctr} η_{ℓ} (u_{ℓ}^{\underline{m}}) = η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) + \frac{λ_{ctr}}{λ_{⋆}} η_{ℓ} (u_{ℓ}^{\underline{m}}) . \end{matrix}

The same argument proves that

\begin{matrix} η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) \leq η_{ℓ} (U_{ℓ}, u_{ℓ}^{⋆}) + \frac{λ_{ctr}}{λ_{⋆}} η_{ℓ} (u_{ℓ}^{\underline{m}}) . \end{matrix}

For $U_{ℓ} = T_{ℓ}$ , the latter two estimates lead to

\begin{matrix} (1 - λ_{ctr} / λ_{⋆}) η_{ℓ} (u_{ℓ}^{\underline{m}}) \leq η_{ℓ} (u_{ℓ}^{⋆}) \leq (1 + λ_{ctr} / λ_{⋆}) η_{ℓ} (u_{ℓ}^{\underline{m}}) . \end{matrix}

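This concludes the proof of (i). To see (ii), we combine (i), the assumption $Θ^{'} η_{ℓ} {(u_{ℓ}^{⋆})}^{2} \leq η_{ℓ} {(U_{ℓ}, u_{ℓ}^{⋆})}^{2}$ , and the first estimate of this proof to obtain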

\begin{matrix} (1 - λ_{ctr} / λ_{⋆}) \sqrt{Θ^{'}} η_{ℓ} (u_{ℓ}^{\underline{m}}) \overset{(i)}{\leq} \sqrt{Θ^{'}} η_{ℓ} (u_{ℓ}^{⋆}) \leq η_{ℓ} (U_{ℓ}, u_{ℓ}^{⋆}) \leq η_{ℓ} (U_{ℓ}, u_{ℓ}^{\underline{m}}) + \frac{λ_{ctr}}{λ_{⋆}} η_{ℓ} (u_{ℓ}^{\underline{m}}) . \end{matrix}

Noting that $\sqrt{Θ} = (1 - λ_{ctr} / λ_{⋆}) \sqrt{Θ^{'}} - λ_{ctr} / λ_{⋆}$ , this concludes the proof of (ii). The remaining claims (iii)–(iv) follow verbatim. $□$
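The interplay of the parameters in Lemma 13 can be checked numerically. The following is a minimal sketch, not part of the paper's analysis: the function names and the sample constants are hypothetical, and in practice $q_{ctr}$ , $C_{stab}$ , and $C_{drel}$ are rarely known explicitly.

```python
import math


def lambda_star(q_ctr: float, c_stab: float) -> float:
    """Threshold lambda_star = (1 - q_ctr) / (q_ctr * C_stab) from Lemma 13."""
    if not (0.0 < q_ctr < 1.0) or c_stab <= 0.0:
        raise ValueError("need 0 < q_ctr < 1 and C_stab > 0")
    return (1.0 - q_ctr) / (q_ctr * c_stab)


def theta_prime(theta: float, lam_ctr: float, lam_star: float) -> float:
    """Theta' = ((sqrt(Theta) + r) / (1 - r))^2 with r = lam_ctr / lam_star."""
    if not (0.0 < theta <= 1.0) or not (0.0 < lam_ctr < lam_star):
        raise ValueError("need 0 < theta <= 1 and 0 < lam_ctr < lam_star")
    r = lam_ctr / lam_star
    return ((math.sqrt(theta) + r) / (1.0 - r)) ** 2


def is_admissible(theta_p: float, c_stab: float, c_drel: float) -> bool:
    """Check the bound Theta' < 1 / (1 + C_stab^2 C_drel^2) required by Lemma 12."""
    return theta_p < 1.0 / (1.0 + c_stab**2 * c_drel**2)


# Made-up constants, chosen only so that the arithmetic is easy to follow.
q_ctr, c_stab, c_drel = 0.5, 1.0, 1.0
lam_s = lambda_star(q_ctr, c_stab)          # (1 - 0.5) / (0.5 * 1) = 1
theta_p = theta_prime(0.04, 0.1, lam_s)     # ((0.2 + 0.1) / 0.9)^2 = 1/9
admissible = is_admissible(theta_p, c_stab, c_drel)
```

The sketch makes visible that $Θ^{'}$ grows as $λ_{ctr}$ approaches $λ_{⋆}$ , so that small marking parameters and small solver tolerances are needed together to stay below the admissibility bound.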

Proof of Theorem 8

By Corollary 7, it is sufficient to prove that

\begin{matrix} C_{s + t} = \sup_{(ℓ, k) \in Q} (# T_{ℓ} - # T_{0} + 1)^{s + t} Λ_{ℓ}^{k} ≲ \max \{ ‖ u^{⋆} ‖_{A_{s}} ‖ z^{⋆} ‖_{A_{t}}, Λ_{0}^{0} \} . \end{matrix}

We prove this inequality in two steps.

Step 1: In this step, we bound the number of marked elements $# M_{ℓ^{'}}$ for arbitrary $0 \leq ℓ^{'} < \underline{ℓ}$ . Let $Θ > 0$ and corresponding $Θ^{'}$ from Lemma 13 such that

\begin{matrix} Θ^{'} = (\frac{\sqrt{Θ} + λ_{ctr} / λ_{⋆}}{1 - λ_{ctr} / λ_{⋆}})^{2} < \frac{1}{1 + C_{stab}^{2} C_{drel}^{2}} . \end{matrix}

Let $T_{h (ℓ^{'})} \in T (T_{ℓ^{'}})$ be the corresponding mesh as in Lemma 11. With Lemma 12, this yields that

\begin{matrix} Θ^{'} η_{ℓ^{'}} {(u_{ℓ^{'}}^{⋆})}^{2} \leq η_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, u_{ℓ^{'}}^{⋆})}^{2} or Θ^{'} ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{⋆})}^{2} \leq ζ_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, z_{ℓ^{'}}^{⋆})}^{2} . \end{matrix}

Lemma 13 with $U_{ℓ^{'}} = T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}$ shows that

\begin{matrix} Θ η_{ℓ^{'}} {(u_{ℓ^{'}}^{\underline{m}})}^{2} \leq η_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, u_{ℓ^{'}}^{\underline{m}})}^{2} or Θ ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{\underline{n}})}^{2} \leq ζ_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, z_{ℓ^{'}}^{\underline{n}})}^{2} . \end{matrix}

We consider the marking strategies from Remark 2 separately.

For strategy (a), we have with $Θ : = 2 θ$ and assumption (32) that (60) is satisfied. Hence, (61) implies that there holds (17), i.e.,

\begin{matrix} 2 θ η_{ℓ^{'}} {(u_{ℓ^{'}}^{\underline{m}})}^{2} ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{\underline{n}})}^{2} \leq η_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, u_{ℓ^{'}}^{\underline{m}})}^{2} ζ_{ℓ^{'}} {(z_{ℓ^{'}}^{\underline{n}})}^{2} + η_{ℓ^{'}} {(u_{ℓ^{'}}^{\underline{m}})}^{2} ζ_{ℓ^{'}} {(T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}, z_{ℓ^{'}}^{\underline{n}})}^{2} . \end{matrix}

By assumption of Theorem 8, $M_{ℓ^{'}}$ is essentially minimal with (17). We infer that

\begin{matrix} # M_{ℓ^{'}} \leq C_{mark} # (T_{ℓ^{'}} \setminus T_{h (ℓ^{'})}) ≲ # T_{h (ℓ^{'})} - # T_{ℓ^{'}} . \end{matrix}
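A marked set of minimal cardinality satisfying a criterion of the form (17) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes that the squared local indicators are available as plain lists, the function names are hypothetical, and it uses a sorting-based greedy selection of log-linear cost (a linear-cost alternative is the subject of [21]). The key observation is that the weighted indicators $ρ_{T} : = η_{T}^{2} ζ^{2} + η^{2} ζ_{T}^{2}$ sum up to $2 η^{2} ζ^{2}$ , so that (17) for a set $M$ is Dörfler marking with parameter $θ$ for $ρ$ .

```python
from typing import List, Sequence


def doerfler_mark(indicators_sq: Sequence[float], theta: float) -> List[int]:
    """Greedy Doerfler marking: smallest index set M with
    sum_{T in M} rho_T >= theta * sum_T rho_T (largest indicators first)."""
    if not (0.0 < theta <= 1.0):
        raise ValueError("theta must lie in (0, 1]")
    total = sum(indicators_sq)
    order = sorted(range(len(indicators_sq)),
                   key=lambda i: indicators_sq[i], reverse=True)
    marked: List[int] = []
    acc = 0.0
    for i in order:
        if acc >= theta * total:
            break
        marked.append(i)
        acc += indicators_sq[i]
    return marked


def product_mark(eta_sq: Sequence[float], zeta_sq: Sequence[float],
                 theta: float) -> List[int]:
    """Marking for the weighted indicators rho_T = eta_T^2 zeta^2 + eta^2 zeta_T^2."""
    if len(eta_sq) != len(zeta_sq):
        raise ValueError("both estimators must live on the same mesh")
    eta_tot, zeta_tot = sum(eta_sq), sum(zeta_sq)
    rho = [e * zeta_tot + eta_tot * z for e, z in zip(eta_sq, zeta_sq)]
    return doerfler_mark(rho, theta)


# Toy data on a three-element mesh (values are made up).
eta_sq = [4.0, 1.0, 1.0]
zeta_sq = [1.0, 1.0, 6.0]
marked = product_mark(eta_sq, zeta_sq, theta=0.5)
```

For the toy data, the weighted indicators are 38, 14, and 44 with total 96, so the two largest entries are needed to reach the threshold 48.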

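For the strategies (b)–(c), we set $Θ = θ$ and note that assumption (32) (as well as the weaker assumption (34)) implies (60), and hence (61). Again, by assumption of Theorem 8, $M_{ℓ^{'}}$ is chosen essentially minimal (with an additional factor two for the strategy (c)) such that (61) holds. For all three strategies, we therefore conclude with (57b) and Lemma 13(i) and (iii) that

\begin{matrix} # M_{ℓ^{'}} ≲ # T_{h (ℓ^{'})} - # T_{ℓ^{'}} & ≲ (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} (η_{ℓ^{'}} (u_{ℓ^{'}}^{⋆}) ζ_{ℓ^{'}} (z_{ℓ^{'}}^{⋆}))^{- 1 / (s + t)} \\ & ≲ (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} (η_{ℓ^{'}} (u_{ℓ^{'}}^{\underline{m}}) ζ_{ℓ^{'}} (z_{ℓ^{'}}^{\underline{n}}))^{- 1 / (s + t)} . \end{matrix}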

Recall that () and (22) give that $η_{ℓ^{'}} (u_{ℓ^{'}}^{\underline{k}}) ζ_{ℓ^{'}} (z_{ℓ^{'}}^{\underline{k}}) ≃ Λ_{ℓ^{'}}^{\underline{k}}$ . This finally shows that

\begin{matrix} # M_{ℓ^{'}} ≲ (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} {(Λ_{ℓ^{'}}^{\underline{k}})}^{- 1 / (s + t)} . \end{matrix}

Step 2: Let $(ℓ, k) \in Q$ . First, we consider $ℓ > 0$ and thus $# T_{ℓ} > # T_{0}$ . The closure estimate and Step 1 prove that

\begin{matrix} # T_{ℓ} - # T_{0} + 1 ≃ # T_{ℓ} - # T_{0} & ≲ \sum_{ℓ^{'} = 0}^{ℓ - 1} # M_{ℓ^{'}} ≲ (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} \sum_{ℓ^{'} = 0}^{ℓ - 1} {(Λ_{ℓ^{'}}^{\underline{k}})}^{- 1 / (s + t)} \\ & \leq (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) | \end{matrix}} {(Λ_{ℓ^{'}}^{k^{'}})}^{- 1 / (s + t)} . \end{matrix}

The linear convergence from Theorem 6 further shows that

\begin{matrix} # T_{ℓ} - # T_{0} + 1 & ≲ (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} C_{lin}^{1 / (s + t)} {(Λ_{ℓ}^{k})}^{- 1 / (s + t)} \sum_{\begin{matrix} (ℓ^{'}, k^{'}) \in Q \\ | (ℓ^{'}, k^{'}) | \leq | (ℓ, k) | \end{matrix}} {(q_{lin}^{1 / (s + t)})}^{| (ℓ, k) | - | (ℓ^{'}, k^{'}) |} \\ & \leq (‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}})^{1 / (s + t)} \frac{C_{lin}^{1 / (s + t)}}{1 - q_{lin}^{1 / (s + t)}} {(Λ_{ℓ}^{k})}^{- 1 / (s + t)} . \end{matrix}

Rearranging this estimate, we see that

\begin{matrix} {(# T_{ℓ} - # T_{0} + 1)}^{s + t} Λ_{ℓ}^{k} ≲ ‖ u^{⋆} ‖_{A_{s}} {‖ z^{⋆} ‖}_{A_{t}} for all (ℓ, k) \in Q with ℓ > 0 . \end{matrix}

It remains to consider $ℓ = 0$ . By Theorem 6, we have that

\begin{matrix} {(# T_{ℓ} - # T_{0} + 1)}^{s + t} Λ_{ℓ}^{k} = Λ_{0}^{k} ≲ Λ_{0}^{0} for all (ℓ, k) \in Q with ℓ = 0 . \end{matrix}

This concludes the proof. $□$
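The geometric-series argument of Step 2 can be checked on synthetic data. The sketch below is only an illustration under stated assumptions: the sequence `lams` stands in for the quasi-errors $Λ_{ℓ}^{k}$ enumerated by the step counter $| (ℓ, k) |$ , the function name is hypothetical, and the data are made up. It verifies that linear convergence with constants $C_{lin}$ and $q_{lin}$ bounds the accumulated sum of $Λ^{- 1 / (s + t)}$ over all previous steps by a multiple of its last summand.

```python
from typing import Sequence


def check_step2_bound(lams: Sequence[float], c_lin: float,
                      q_lin: float, s_plus_t: float) -> bool:
    """Check, for every step j, that
    sum_{i <= j} lams[i]^(-p) <= c_lin^p / (1 - q_lin^p) * lams[j]^(-p),
    where p = 1 / (s + t)."""
    if not (0.0 < q_lin < 1.0) or c_lin <= 0.0 or s_plus_t <= 0.0:
        raise ValueError("need 0 < q_lin < 1, c_lin > 0, s + t > 0")
    p = 1.0 / s_plus_t
    factor = c_lin**p / (1.0 - q_lin**p)
    for j, lam_j in enumerate(lams):
        lhs = sum(lam_i ** (-p) for lam_i in lams[: j + 1])
        rhs = factor * lam_j ** (-p)
        if lhs > rhs * (1.0 + 1e-12):  # small tolerance for rounding
            return False
    return True


# Linearly convergent toy sequence lam_i = q^i (so C_lin = 1 is admissible).
q_lin = 0.5
geometric = [q_lin**i for i in range(20)]
ok_geometric = check_step2_bound(geometric, 1.0, q_lin, 2.0)

# Algebraically decaying sequence lam_i = 1 / (i + 1): no linear convergence,
# and the bound is violated for the same constants.
algebraic = [1.0 / (i + 1) for i in range(20)]
ok_algebraic = check_step2_bound(algebraic, 1.0, q_lin, 2.0)
```

The contrast between the two sequences illustrates why linear convergence (Theorem 6), and not mere convergence, is what allows the sum over all previous steps to be controlled by the current quasi-error.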

Acknowledgements

The authors thankfully acknowledge support by the Austrian Science Fund (FWF) through the doctoral school Dissipation and dispersion in nonlinear PDEs (grant W1245), the SFB Taming complexity in partial differential systems (grant SFB F65), the stand-alone project Computational nonlinear PDEs (grant P33216), and the Erwin Schrödinger Fellowship Optimal adaptivity for space-time methods (grant J4379).

Funding Information

Open access funding provided by Austrian Science Fund (FWF).

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Roland Becker, Email: roland.becker@univ-pau.fr.

Gregor Gantner, Email: g.gantner@uva.nl.

Michael Innerberger, Email: michael.innerberger@asc.tuwien.ac.at.

Dirk Praetorius, Email: dirk.praetorius@asc.tuwien.ac.at.

References

1. Binev P, Dahmen W, DeVore R. Adaptive finite element methods with convergence rates. Numer. Math. 2004;97(2):219–268. doi: 10.1007/s00211-003-0492-7.
2. Becker R, Estecahandy E, Trujillo D. Weighted marking for goal-oriented adaptive finite element methods. SIAM J. Numer. Anal. 2011;49(6):2451–2469. doi: 10.1137/100794298.
3. Becker R, Rannacher R. An optimal control approach to a posteriori error estimation in finite element methods. Acta Numer. 2001;10:1–102. doi: 10.1017/S0962492901000010.
4. Bangerth W, Rannacher R. Adaptive finite element methods for differential equations. Basel: Birkhäuser; 2003.
5. Carstensen C, Feischl M, Page M, Praetorius D. Axioms of adaptivity. Comput. Math. Appl. 2014;67(6):1195–1253. doi: 10.1016/j.camwa.2013.12.003.
6. Cascón JM, Kreuzer C, Nochetto RH, Siebert KG. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal. 2008;46(5):2524–2550. doi: 10.1137/07069047X.
7. Cascón JM, Nochetto RH. Quasioptimal cardinality of AFEM driven by nonresidual estimators. IMA J. Numer. Anal. 2012;32(1):1–29. doi: 10.1093/imanum/drr014.
8. Chen L, Nochetto RH, Xu J. Optimal multilevel methods for graded bisection grids. Numer. Math. 2012;120(1):1–34. doi: 10.1007/s00211-011-0401-4.
9. Dörfler W. A convergent adaptive algorithm for Poisson’s equation. SIAM J. Numer. Anal. 1996;33(3):1106–1124. doi: 10.1137/0733054.
10. Eriksson K, Estep D, Hansbo P, Johnson C. Introduction to adaptive methods for differential equations. Acta Numer. 1995;4:105–158. doi: 10.1017/S0962492900002531.
11. Feischl M, Führer T, Praetorius D. Adaptive FEM with optimal convergence rates for a certain class of nonsymmetric and possibly nonlinear problems. SIAM J. Numer. Anal. 2014;52(2):601–625. doi: 10.1137/120897225.
12. Feischl M, Gantner G, Haberl A, Praetorius D, Führer T. Adaptive boundary element methods for optimal convergence of point errors. Numer. Math. 2016;132(3):541–567. doi: 10.1007/s00211-015-0727-4.
13. Führer T, Praetorius D. A linear Uzawa-type FEM-BEM solver for nonlinear transmission problems. Comput. Math. Appl. 2018;75(8):2678–2697. doi: 10.1016/j.camwa.2017.12.035.
14. Feischl M, Praetorius D, van der Zee KG. An abstract analysis of optimal goal-oriented adaptivity. SIAM J. Numer. Anal. 2016;54(3):1423–1448. doi: 10.1137/15M1021982.
15. Gantner G, Haberl A, Praetorius D, Stiftner B. Rate optimal adaptive FEM with inexact solver for nonlinear operators. IMA J. Numer. Anal. 2018;38:1797–1831. doi: 10.1093/imanum/drx050.
16. Gantner G, Haberl A, Praetorius D, Schimanko S. Rate optimality of adaptive finite element methods with respect to overall computational costs. Math. Comp. 2021;90(331):2011–2040. doi: 10.1090/mcom/3654.
17. Giles M, Süli E. Adjoint methods for PDEs: a posteriori error analysis and postprocessing by duality. Acta Numer. 2002;11:145–236.
18. Karkulik M, Pavlicek D, Praetorius D. On 2D newest vertex bisection: optimality of mesh-closure and $H^{1}$-stability of $L_{2}$-projection. Constr. Approx. 2013;38(2):213–234. doi: 10.1007/s00365-013-9192-4.
19. Morin P, Nochetto RH, Siebert KG. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal. 2000;38(2):466–488. doi: 10.1137/S0036142999360044.
20. Mommer MS, Stevenson R. A goal-oriented adaptive finite element method with convergence rates. SIAM J. Numer. Anal. 2009;47(2):861–886. doi: 10.1137/060675666.
21. Pfeiler CM, Praetorius D. Dörfler marking with minimal cardinality is a linear complexity problem. Math. Comp. 2020;89(326):2735–2752. doi: 10.1090/mcom/3553.
22. Schimanko S. On rate-optimal adaptive algorithms with inexact solvers. PhD thesis, TU Wien, Institute of Analysis and Scientific Computing; 2021.
23. Stevenson R. Optimality of a standard adaptive finite element method. Found. Comput. Math. 2007;7(2):245–269. doi: 10.1007/s10208-005-0183-0.
24. Stevenson R. The completion of locally refined simplicial partitions created by bisection. Math. Comp. 2008;77(261):227–241. doi: 10.1090/S0025-5718-07-01959-X.
25. Wu J, Zheng H. Uniform convergence of multigrid methods for adaptive meshes. Appl. Numer. Math. 2017;113:109–123. doi: 10.1016/j.apnum.2016.11.005.

