Author manuscript; available in PMC: 2016 Jan 6.
Published in final edited form as: Int J Numer Methods Eng. 2014 Jul 27;99(4):290–312. doi: 10.1002/nme.4674

Classical and all-floating FETI methods for the simulation of arterial tissues

Christoph M Augustin 1,3,*, Gerhard A Holzapfel 2, Olaf Steinbach 1
PMCID: PMC4702352  EMSID: EMS66544  PMID: 26751957

Abstract

High-resolution and anatomically realistic computer models of biological soft tissues play a significant role in understanding the function of cardiovascular components in health and disease. However, the computational effort required to handle the fine grids that resolve the geometries, together with sophisticated tissue models, is very challenging. One possibility to derive a strongly scalable parallel solution algorithm is to consider finite element tearing and interconnecting (FETI) methods. In this study we propose and investigate the application of FETI methods to simulate the elastic behavior of biological soft tissues. As one particular example we choose the artery, which is, like most other biological tissues, characterized by anisotropic and nonlinear material properties. We compare two specific FETI approaches, classical and all-floating, and investigate the numerical behavior of different preconditioning techniques. In comparison to classical FETI, the all-floating approach has advantages not only concerning the implementation but in many cases also concerning the convergence of the global iterative solution method. This behavior is illustrated with numerical examples. We present results of linear elastic simulations to show convergence rates, as expected from the theory, and results from the more sophisticated nonlinear case where we apply a well-known anisotropic model to the realistic geometry of an artery. Although FETI methods are well suited to artery simulations, we also discuss some limitations concerning the dependence on material parameters.

Keywords: artery, biological soft tissues, all-floating FETI, parallel computing

1 Introduction

The modeling of hyperelastic materials is realized by using a strain-energy function Ψ. For a comprehensive overview and the mathematical theory on elastic deformations, see, e.g., [1, 2, 3, 4]. A well-established model for arterial tissues was introduced by Holzapfel et al. [5, 6]. This model was further developed and extended to collagen fiber dispersion in [6, 7, 8]; see [9] for the modeling of residual stresses in arteries, which also play an important role in tissue engineering. An adequate model for the myocardium can be found in [10]. The fine mesh structure needed to model cardiovascular organs normally results in a very large number of degrees of freedom. Combined with the high complexity of the underlying partial differential equations, this demands fast solution algorithms and, in line with current computer hardware architectures, parallel methods. One possibility to meet these requirements is to use domain decomposition (DD) methods, which have attracted a lot of attention in recent years and have resulted in the development of several overlapping as well as non-overlapping DD methods, see [11]. They all work according to the same principle: the computational domain Ω0 is subdivided into a set of (overlapping or non-overlapping) subdomains Ω0,i. DD algorithms then decompose the large global problem into a set of smaller local problems on the subdomains, with suitable transmission or interface conditions. This yields a natural parallelization of the underlying problem. In addition to well-established standard DD methods, examples of more advanced domain decomposition methods are hybrid methods [12], mortar methods [13, 14, 15] and tearing and interconnecting methods [16].

In this paper we focus on the finite element tearing and interconnecting (FETI) method, where the strategy is to decompose the computational domain into a finite number of non-overlapping subdomains. The corresponding local problems can then be handled efficiently by direct solvers. The reduced global system, which is related to discrete Lagrange multipliers on the interface, is solved with a parallel Krylov space method to obtain the desired dual solution. In the case of elasticity this is the boundary stress; subsequently, in a postprocessing step, we compute the primal unknown, i.e. the displacements, locally. For the global Krylov space method, such as the conjugate gradient (CG) or the generalized minimal residual (GMRES) method, we need a suitable preconditioning technique. Here we consider a simple lumped preconditioner and an almost optimal Dirichlet preconditioner, as proposed by Farhat et al. [17].

A variant of the classical FETI method is the all-floating tearing and interconnecting approach (AF-FETI) where, in contrast to the classical approach, the Dirichlet boundary acts as a part of the interface. It was introduced independently for the boundary element method by Steinbach and Of [18, 19] and as the Total-FETI (TFETI) method for finite elements by Dostál et al. [20]. This approach shows advantages in the implementation and, due to mapping properties of the involved operators, improves the convergence of the global iterative method for the considered problems. This behavior is illustrated with numerical examples, which are, to the best of our knowledge, the first application of the all-floating FETI method to nonlinear and anisotropic biological materials.

An essential part of FETI methods is solving the local subproblems. Challenges occur with so-called floating subdomains, which have no contribution to the Dirichlet boundary. These cases correspond to local Neumann problems, and the solutions are, in the case of elasticity, only unique up to the six rigid body modes. For classical FETI it can happen that the kernel of the local operator is non-trivial and its dimension is lower than six, and identifying these kernels reliably is difficult. One possibility to overcome this difficulty is a modification of the classical approach, the dual-primal FETI (FETI-DP) method, cf. Farhat et al. [21] and Klawonn and Widlund [22]. In this variant some specific primal degrees of freedom are fixed, which yields solvable systems for all subdomains. Choosing the primal degrees of freedom may be very sophisticated [23]. FETI-DP was already applied to model arterial tissues by Klawonn and Rheinbach [24, 25], Brands et al. [26], Brinkhues et al. [27] and Balzani et al. [28, 29]. Note that for all-floating FETI the identification of the kernel of the local operators is no problem at all, since we treat all subdomains as floating subdomains and hence have a kernel of dimension six for all local operators. Moreover, the resulting local systems are typically better conditioned than those arising in the FETI-DP approach, see Brzobohatý et al. [30]. All-floating FETI was used to model myocardial tissue in the preliminary work [31].

Both the classical FETI method, as well as all-floating FETI, need the construction of a generalized inverse matrix. This may be achieved using direct solvers with a sparsity preserving stabilization, see, e.g. [30], or stabilized iterative methods. For a mathematical analysis of FETI methods including convergence proofs for the classical one-level FETI method, see, e.g., [22, 32, 33].

2 Modeling Arterial Tissues

The deformation of a body B is described by a function ϕ : Ω0 → Ωt with the reference configuration $\Omega_0 \subset \mathbb{R}^3$ at time t = 0 and the current configuration Ωt at time t > 0. With this we introduce the displacement field U in the reference configuration and the displacement field u in the current configuration,

$$x = \phi(X) = X + U(X) \in \Omega_t, \qquad X = \phi^{-1}(x) = x - u(x) \in \Omega_0, \tag{1}$$

and the deformation gradient as, see, e.g., [2],

$$F = \operatorname{Grad}\phi(X) = I + \operatorname{Grad}U. \tag{2}$$

Moreover, we denote by J = det F > 0 the Jacobian of F and by $C = F^\top F$ the right Cauchy–Green tensor. For later use, to model the nearly incompressible behavior of biological soft tissues, we introduce the following split of the deformation gradient into a volumetric and an isochoric part, compare Flory [34], i.e.

$$F = J^{1/3}\,\bar{F}, \quad \text{with } \det\bar{F} = 1. \tag{3}$$

Consequently, this multiplicative split can be applied to other tensors such as the right Cauchy–Green tensor. Thus

$$C = J^{2/3}\,\bar{C}, \quad \text{with } \bar{C} = \bar{F}^\top\bar{F} \text{ and } \det\bar{C} = 1. \tag{4}$$
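The split (3)–(4) can be checked numerically. The following is a minimal sketch in plain Python; the deformation gradient `F` and all function names are illustrative assumptions and are not taken from the paper.

```python
# Sketch of the volumetric-isochoric split (3)-(4); the matrix F is an arbitrary example.

def det3(A):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def right_cauchy_green(F):
    """C = F^T F."""
    return [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def isochoric_split(F):
    """Return J = det F and the isochoric part F_bar = J^(-1/3) F."""
    J = det3(F)
    if J <= 0.0:
        raise ValueError("deformation gradient must satisfy det F > 0")
    scale = J ** (-1.0 / 3.0)
    return J, [[scale * a for a in row] for row in F]

F = [[1.20, 0.10, 0.00],
     [0.00, 0.90, 0.05],
     [0.00, 0.00, 1.10]]           # illustrative deformation gradient
J, F_bar = isochoric_split(F)
C = right_cauchy_green(F)
C_bar = right_cauchy_green(F_bar)  # agrees with J^(-2/3) C up to round-off
```

By construction det F_bar and det C_bar equal one up to round-off, so all volume change is carried by J alone.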

As a starting point for the modeling of biological soft tissues the stationary equilibrium equations in the current configuration are considered to find a displacement field u according to

$$\operatorname{div}\sigma(u, x) + b_t(x) = 0 \quad \text{for } x \in \Omega_t, \tag{5}$$

where σ(u, x) is the Cauchy stress tensor and bt(x) is the body force at time t.

In addition, we incorporate boundary conditions to describe displacements or normal stresses on the boundary $\Gamma_t = \partial\Omega_t$, which is decomposed into disjoint parts such that $\partial\Omega_t = \bar{\Gamma}_{t,D} \cup \bar{\Gamma}_{t,N}$. Dirichlet boundary conditions on Γt,D correspond to a given displacement field u = uD(x), while Neumann boundary conditions on Γt,N are identified physically with a given surface traction σ(u, x) nt(x) = gt(x), where nt(x) denotes the exterior normal vector at time t.

The equilibrium equations and the boundary conditions may also be formulated in terms of the reference configuration, i.e.

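$$\operatorname{Div} F S(U, X) + b_0(X) = 0 \quad \text{for } X \in \Omega_0, \tag{6}$$
$$U(X) = U_D(X) \quad \text{for } X \in \Gamma_{0,D}, \tag{7}$$
$$F S(U, X)\,N_0(X) = G_0(X) \quad \text{for } X \in \Gamma_{0,N}, \tag{8}$$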

where S is the second Piola–Kirchhoff tensor and b0(X) is the body force at time t = 0. In order to formulate the boundary conditions we introduce a prescribed displacement field UD(X), the exterior normal vector N0(X) and the surface traction G0(X) in the reference configuration.

In studying the properties of biological soft tissues we have to deal with a nonlinear relationship between stress and strain, with large deformations and with an anisotropic material. Since linear elasticity models are not adequate for treating such complex behavior, we turn to the more general concept of nonlinear elasticity.

The nonlinear stress-strain response is modeled via a constitutive equation that links the stress to a derivative of a strain-energy function Ψ, representing the elastic stored energy per unit reference volume. Derived from the Clausius–Duhem inequality, see [35, 36], we formulate the constitutive equations as

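$$\sigma = 2 J^{-1} F\,\frac{\partial\Psi(C)}{\partial C}\,F^\top \quad \text{and} \quad S = 2\,\frac{\partial\Psi(C)}{\partial C}. \tag{9}$$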

We make use of the Rivlin–Ericksen representation theorem [37] and its extension to anisotropic materials, cf. [38], to find a representation of the strain-energy function Ψ in terms of the principal invariants of C.

Arteries are vessels that transport blood from the heart to the organs. In vivo the artery is a prestretched material under an internal pressure load. Healthy arteries are highly deformable composite structures and show a nonlinear stress-strain response with a typical stiffening effect at higher pressures. The reason for this is the embedded collagen fibers, which lead to an anisotropic mechanical behavior of arterial walls. We denote by a1,0 and a2,0 the predominant collagen fiber directions in the reference configuration. An important observation is that arteries do not change their volume within the physiological range of deformation; hence they are treated as a nearly incompressible material, see, e.g., [5]. In this work we focus on the in vitro passive behavior of the healthy artery, see Fig. 1. To capture the near-incompressibility condition we recall the decomposition (3), which yields an additive split of the strain-energy function into a so-called volumetric and an isochoric part, i.e.

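$$\Psi(C) = \Psi_{\mathrm{vol}}(J) + \bar{\Psi}(\bar{C}). \tag{10}$$

This procedure leads to constitutive equations in which the stress tensors are also additively decomposed into a volumetric and an isochoric part, i.e., cf. [2],

$$\sigma = p\,I + 2 J^{-1} F\,\frac{\partial\bar{\Psi}(\bar{C})}{\partial C}\,F^\top \quad \text{and} \quad S = J p\,C^{-1} + 2\,\frac{\partial\bar{\Psi}(\bar{C})}{\partial C}. \tag{11}$$

Here, the scalar-valued hydrostatic pressure is defined as

$$p := \frac{\partial\Psi_{\mathrm{vol}}(J)}{\partial J}. \tag{12}$$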

To capture the specifics of this fiber-reinforced composite, Holzapfel and Weizsäcker [39] and Holzapfel et al. [5] proposed an additional split of the strain-energy function into an isotropic and an anisotropic part so that the complete energy function Ψ can be written as

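$$\Psi(C) = \Psi_{\mathrm{vol}}(J) + \bar{\Psi}_{\mathrm{iso}}(\bar{C}) + \bar{\Psi}_{\mathrm{aniso}}(\bar{C}, a_{1,0}) + \bar{\Psi}_{\mathrm{aniso}}(\bar{C}, a_{2,0}). \tag{13}$$

Following the classical approach we describe the volume changing part by

$$\Psi_{\mathrm{vol}}(J) = \frac{\kappa}{2}\,(J - 1)^2, \tag{14}$$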

where κ > 0, comparable to the bulk modulus in linear elasticity, serves as a penalty parameter to enforce the incompressibility constraint.
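As a short worked step, inserting the volumetric energy (14) into the definition of the hydrostatic pressure (12) gives the pressure explicitly. The numerical value of J below is only an illustration:

```latex
p = \frac{\partial \Psi_{\mathrm{vol}}(J)}{\partial J}
  = \frac{\partial}{\partial J}\left[\frac{\kappa}{2}\,(J-1)^2\right]
  = \kappa\,(J-1),
\qquad \text{e.g. } J = 1.01 \;\Rightarrow\; p = 0.01\,\kappa .
```

The pressure therefore vanishes for volume-preserving deformations (J = 1) and grows linearly with the volume change, so a larger κ penalizes deviations from incompressibility more strongly.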

Figure 1.

Diagrammatic model of the major components of a healthy elastic artery, from [5]. The intima, the innermost layer, is negligible for the modeling of healthy arteries, although it plays a very important role in the modeling of diseased arteries. The two predominant directions of the collagen fibers in the media and the adventitia are indicated with black curves.

To model the isotropic non-collagenous matrix material the classical neo-Hookean model is used [2]. Thus

$$\bar{\Psi}_{\mathrm{iso}}(\bar{C}) = \frac{c}{2}\,(\bar{I}_1 - 3), \tag{15}$$

where c > 0 is a stress-like material parameter and $\bar{I}_1 = \operatorname{tr}(\bar{C})$ is the first principal invariant of the isochoric part of the right Cauchy–Green tensor. In (13), $\bar{\Psi}_{\mathrm{aniso}}$ is associated with the deformation of the collagen fibers. According to [5], this transversely isotropic response is described by

$$\bar{\Psi}_{\mathrm{aniso}}(\bar{C}, a_{1,0}) = \frac{k_1}{2 k_2}\left\{\exp\!\left[k_2\,(\bar{I}_4 - 1)^2\right] - 1\right\}, \tag{16}$$
$$\bar{\Psi}_{\mathrm{aniso}}(\bar{C}, a_{2,0}) = \frac{k_1}{2 k_2}\left\{\exp\!\left[k_2\,(\bar{I}_6 - 1)^2\right] - 1\right\}, \tag{17}$$

with the invariants $\bar{I}_4 := a_{1,0}\cdot(\bar{C}\,a_{1,0})$, $\bar{I}_6 := a_{2,0}\cdot(\bar{C}\,a_{2,0})$ and the material parameters k1 and k2, which are both assumed to be positive. It is worth mentioning that the anisotropic responses (16) and (17) only contribute for the cases $\bar{I}_4 > 1$ or $\bar{I}_6 > 1$, respectively. This condition is explained by the wavy structure of the collagen fibers, which are regarded as not being able to support compressive stresses. Thus, the fibers are assumed to be active in tension ($\bar{I}_i > 1$) and inactive in compression ($\bar{I}_i < 1$). This assumption is not only based on physical reasons but is also essential for reasons of stability, see Holzapfel et al. [40].

The material parameters can be fitted to an experimentally observed response of the biological soft tissue. Following [5] we use the material parameters summarized in Table 1.

Table 1.

Material parameters used in the numerical experiments; parameters taken from Holzapfel et al. [5].

Parameter   Value    Unit
c           3.0      kPa
k1          2.3632   kPa
k2          0.8393   (−)

Similar models can also be used for the description of other biological materials, e.g., for the myocardium, cf. [10].
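To make the isochoric energy contributions (15)–(17) concrete, the following sketch evaluates them with the parameters of Table 1. The fiber angle, the uniaxial test deformation and all function names are illustrative assumptions, not values or code from the paper.

```python
import math

# Table 1 parameters: c and k1 in kPa, k2 dimensionless.
C_MATRIX, K1, K2 = 3.0, 2.3632, 0.8393

def psi_iso(C_bar):
    """Neo-Hookean matrix contribution (15)."""
    I1 = C_bar[0][0] + C_bar[1][1] + C_bar[2][2]
    return 0.5 * C_MATRIX * (I1 - 3.0)

def fiber_invariant(C_bar, a):
    """I_bar = a . (C_bar a) for a unit fiber direction a."""
    Ca = [sum(C_bar[i][j] * a[j] for j in range(3)) for i in range(3)]
    return sum(a[i] * Ca[i] for i in range(3))

def psi_fiber(I_bar):
    """Fiber contribution (16)/(17); fibers are inactive in compression."""
    if I_bar <= 1.0:
        return 0.0
    return K1 / (2.0 * K2) * (math.exp(K2 * (I_bar - 1.0) ** 2) - 1.0)

def psi_isochoric(C_bar, a1, a2):
    """Sum of the matrix part and both fiber families, cf. (13) without the volumetric part."""
    return (psi_iso(C_bar) + psi_fiber(fiber_invariant(C_bar, a1))
            + psi_fiber(fiber_invariant(C_bar, a2)))

def uniaxial_C_bar(stretch):
    """Isochoric uniaxial stretch along the first axis, so that det C_bar = 1."""
    return [[stretch ** 2, 0.0, 0.0],
            [0.0, 1.0 / stretch, 0.0],
            [0.0, 0.0, 1.0 / stretch]]

theta = math.radians(30.0)   # hypothetical fiber angle, not taken from the paper
a1 = [math.cos(theta), math.sin(theta), 0.0]
a2 = [math.cos(theta), -math.sin(theta), 0.0]

psi_tension = psi_isochoric(uniaxial_C_bar(1.2), a1, a2)      # fibers stretched
psi_compression = psi_isochoric(uniaxial_C_bar(0.8), a1, a2)  # fibers switched off
```

In the compressed state both fiber invariants fall below one, so only the neo-Hookean term contributes, which mirrors the tension-only assumption stated above.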

3 Finite Element Approximation

3.1 Variational formulation of nonlinear elasticity problems

In this section we consider the variational formulation of the equilibrium equations (5) and (6) with the corresponding Dirichlet and Neumann boundary conditions. In particular, using spatial coordinates, the boundary value problem (5) is formally equivalent to the variational equations

$$\langle A_t(u), v\rangle_{\Omega_t} := \int_{\Omega_t} \sigma(u) : \varepsilon(v)\,dx = \int_{\Omega_t} b_t \cdot v\,dx + \int_{\Gamma_{t,N}} g_t \cdot v\,ds_x =: \langle F, v\rangle_{\Omega_t}, \tag{18}$$

valid for a smooth enough tensor field $\sigma(u) : \bar{\Omega}_t \to \mathbb{R}^{3\times 3}$ and all smooth enough vector fields $v : \bar{\Omega}_t \to \mathbb{R}^3$, which vanish on Γt,D, see, e.g., [1, Theorem 2.4-1]. Additionally,

$$\varepsilon(v) = \frac{1}{2}\left(\operatorname{grad} v + (\operatorname{grad} v)^\top\right) \tag{19}$$

and $A_t$ is the nonlinear operator in the current configuration which is induced by the stress tensor representation (11) and by using the related duality pairing ⟨·, ·⟩Ωt. For later use, we introduce the corresponding terms in the reference configuration Ω0 as $\langle A_0(U), V\rangle_{\Omega_0}$ and $\langle F_0, V\rangle_{\Omega_0}$. Note that (18) formally corresponds to a variational formulation in linear elasticity. However, the integral and the involved terms have to be evaluated in the current configuration, which introduces the nonlinearity of the system. If the test function v is interpreted as the spatial velocity field, then ε(v) is the rate of deformation tensor, so that $\langle A_t(u), v\rangle_{\Omega_t}$ has the physical interpretation of the rate of internal mechanical work.

In terms of the reference configuration, the boundary value problem (6), (8) is formally equivalent to the variational equations

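$$\langle A_0(U), V\rangle_{\Omega_0} = \int_{\Omega_0} S(U) : \Sigma(U, V)\,dX = \int_{\Omega_0} b_0 \cdot V\,dX + \int_{\Gamma_{0,N}} G_0 \cdot V\,ds_X = \langle F_0, V\rangle_{\Omega_0}, \tag{20}$$

valid for a smooth enough tensor field $S(U) : \bar{\Omega}_0 \to \mathbb{R}^{3\times 3}$ and all smooth enough vector fields $V : \bar{\Omega}_0 \to \mathbb{R}^3$ with V = 0 on Γ0,D, see, e.g., [1, Theorem 2.6-1]. In (20) we use the definition of the directional derivative of the Green–Lagrange strain tensor, i.e.

$$\Sigma(U, V) = \frac{1}{2}\left((\operatorname{Grad} V)^\top F(U) + F^\top(U)\,\operatorname{Grad} V\right), \tag{21}$$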

which is also known as the variation or the material time derivative of the Green–Lagrange strain tensor in the literature.

It is important to note that results on existence of solutions in nonlinear elasticity can be stated given a polyconvex strain-energy function Ψ, which holds true for the anisotropic model (13) discussed in Section 2. For more details we refer to the results of Ball [41, 42], see also [1, 43] and Balzani et al. [44].

3.2 Linearization and discretization

In the following we confine ourselves to the reference configuration Ω0. The formulations in the current configuration Ωt can be deduced in an analogous way.

For the solution of the nonlinear system (20) we apply Newton’s method to obtain the recursion

$$\langle \Delta U, A_0'(U^k) V\rangle_{\Omega_0} = \langle F_0, V\rangle_{\Omega_0} - \langle A_0(U^k), V\rangle_{\Omega_0}, \qquad U^{k+1} = U^k + \Delta U, \tag{22}$$

with the tangential term $A_0'(U^k)$, the displacement field $U^k$ of the k-th Newton step, the increment ΔU and a suitable initial guess $U^0$.

For the computational domain $\Omega_0 \subset \mathbb{R}^3$ we consider an admissible decomposition into N shape-regular tetrahedral finite elements τℓ of mesh size hℓ, i.e. $\bar{\Omega}_0 = \bar{\mathcal{T}}_N = \bigcup_{\ell=1}^{N} \bar{\tau}_\ell$, and we introduce a conforming finite element space $X_h \subset [H^1(\Omega_0)]^3$, M = dim Xh, of piecewise polynomial continuous basis functions φi. Then the Galerkin finite element discretization of the linearized variational formulation (22) results in a system of algebraic equations to find $\Delta U_h \in X_h$, ΔUh = 0 on Γ0,D, such that

$$\langle \Delta U_h, A_0'(U_h^k) V_h\rangle_{\Omega_0} = \langle F_0, V_h\rangle_{\Omega_0} - \langle A_0(U_h^k), V_h\rangle_{\Omega_0}, \qquad U_h^{k+1} = U_h^k + \Delta U_h, \tag{23}$$

holds for all $V_h \in X_h$, Vh = 0 on Γ0,D. Note that the initial guess $U_h^0$ has to satisfy an approximate Dirichlet boundary condition $U_h^0 = U_{D,h}$ on Γ0,D to fulfill condition (7), where $U_{D,h} \in X_h|_{\Gamma_{0,D}}$ denotes a suitable approximation of the given displacement UD. For the computation of the tangential term $A_0'(U_h^k)$ we need to evaluate

$$\langle \Delta U_h, A_0'(U_h^k) V_h\rangle_{\Omega_0} = \int_{\Omega_0} \operatorname{Grad}(\Delta U_h)\,S(U_h^k) : \operatorname{Grad} V_h\,dX + \int_{\Omega_0} F^\top \operatorname{Grad}\Delta U_h : \mathbb{C}(U_h^k) : F^\top \operatorname{Grad}(V_h)\,dX. \tag{24}$$

For a more detailed presentation of how to compute the tangential term, in particular the fourth-order elasticity tensor $\mathbb{C}(U_h^k)$, we refer to [45, 46].

Note that the convergence rate of the Newton method is dependent on the initial guess, on the parameters used in the model and on the inhomogeneous Dirichlet and Neumann boundary conditions which influence F0.

In a time-stepping scheme we use zero for the initial guess, and the result of the k-th time step as initial solution for the next step. The initial guess may also be the solution of a modified nonlinear elasticity problem such as the solution of the same nonlinear model but with modified parameters, e.g., a reduced penalty parameter κ, or modified boundary conditions, e.g., a reduced pressure on the surface. The latter is equivalent to an incremental load stepping scheme with a parameter τ ∈ (0, 1], τ → 1, so that

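$$\langle \Delta U_h, A_0'(U_h^k) V_h\rangle_{\Omega_0} = \tau\,\langle F_0, V_h\rangle_{\Omega_0} - \langle A_0(U_h^k), V_h\rangle_{\Omega_0}, \qquad U_h^{k+1} = U_h^k + \Delta U_h. \tag{25}$$

Klawonn and Rheinbach [24] used a load stepping scheme of this kind; for more information on load stepping and global Newton methods, see [47, 48]. The standard finite element method (FEM) now yields a linear system of equations which is equivalent to the discretized variational formulation (23). Finally, we have to solve

$$K'(\underline{U}^k)\,\Delta\underline{U} = \underline{F} - \underline{K}(\underline{U}^k), \qquad \underline{U}^{k+1} = \underline{U}^k + \Delta\underline{U}, \tag{26}$$

with the solution vector $\underline{U}^k$ in the k-th Newton step and the increment $\Delta\underline{U}$. The tangent stiffness matrix K′ is calculated according to

$$K'(\underline{U}^k)[i, j] := \langle \varphi_j, A_0'(U_h^k)\,\varphi_i\rangle_{\Omega_0}, \tag{27}$$

and the terms of the right hand side are constructed by

$$\underline{F}[i] := \langle F_0, \varphi_i\rangle_{\Omega_0} \quad \text{and} \quad \underline{K}(\underline{U}^k)[i] := \langle A_0(U_h^k), \varphi_i\rangle_{\Omega_0}. \tag{28}$$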
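The interplay of the Newton update (26) with the load stepping parameter τ of (25) can be sketched on a scalar model problem. The stiffening response U + U³, the number of load steps and the function names below are hypothetical choices for illustration and do not come from the paper.

```python
def newton_load_stepping(residual_op, tangent_op, load, n_steps=4, tol=1e-10, max_iter=50):
    """Solve residual_op(U) = load by Newton's method with incremental loading."""
    U = 0.0                                    # zero initial guess
    history = []
    for step in range(1, n_steps + 1):
        tau = step / n_steps                   # load factor in (0, 1], reaching 1
        for _ in range(max_iter):
            rhs = tau * load - residual_op(U)  # right-hand side as in (25)
            if abs(rhs) <= tol:
                break
            U += rhs / tangent_op(U)           # solve the tangent system and update
        history.append((tau, U))               # converged state seeds the next load step
    return U, history

# Hypothetical stiffening response A(U) = U + U^3 with tangent A'(U) = 1 + 3 U^2.
U_final, history = newton_load_stepping(lambda U: U + U ** 3,
                                        lambda U: 1.0 + 3.0 * U ** 2,
                                        load=10.0)
```

Each load step starts from the converged state of the previous one, which is the scalar analogue of using the result of one step as the initial solution for the next.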

The additive split of the stress tensors (11) and the introduction of the hydrostatic pressure (12) leads to the additional equation

$$p - \frac{\partial\Psi_{\mathrm{vol}}(J)}{\partial J} = 0, \tag{29}$$

which has to be satisfied in a weak sense. For this we use the idea of static condensation, where this volumetric variable is eliminated element-wise, see, e.g., [46]. This may be achieved by using discontinuous basis functions; in this paper we concentrate on piecewise constants. In the case of tetrahedral elements, this approach leads to Pk-P0 elements, where k is the order of the basis functions for the displacement field. It is known that linear finite elements are very prone to volumetric locking. Hence, for nearly incompressible materials piecewise quadratic elements (k = 2) are a better choice, see Simo [49]. The resulting P2-P0 element is also the preferred choice to model nearly incompressible arterial materials in [24]–[29]. For the numerical results in this work (Section 5) we use both linear (P1-P0 element) and quadratic (P2-P0 element) ansatz functions for the displacement field and compare the results.

Note that due to the symmetry of the stress tensor S and the major and minor symmetry properties of the elasticity tensor ℂ the operator $A_0'(U^k)$ is self-adjoint. We can also show, using the positive definiteness of the elasticity tensor, see [4], and the polyconvexity of the strain-energy function (Section 3.1), that this operator is $[H_0^1(\Omega_0, \Gamma_{0,D})]^3$-elliptic and bounded, see [4, 45]. With these properties of the operator $A_0'(U_h^k)$ we can state that the linearized system (23), (24) admits a unique solution ΔUh. Furthermore, the tangent stiffness matrix K′ is symmetric and positive definite.

Simulations with large deformations and the hence required derivative of the Neumann boundary conditions (8) would yield an additional non-symmetric mass matrix on the left hand side of (26). To stay with a symmetric system we neglect this matrix but compensate for it with a surface update of the geometry after each Newton step. Thus, our whole system is symmetric and we can use the CG method as an iterative solver. Nonetheless, the FETI methods described in Section 4 also work for non-symmetric systems by using the GMRES method.

4 Finite Element Tearing and Interconnecting

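To solve the linearized equations (26) arising in the Newton method we apply the finite element tearing and interconnecting approach [16], see also [24, 50, 51], and references given therein. The derivation of the FETI system for nonlinear mechanics will be performed in the reference configuration. In an analogous way this is also valid for the formulation in the current configuration. For a bounded domain $\Omega_0 \subset \mathbb{R}^3$ we introduce a non-overlapping domain decomposition

$$\bar{\Omega}_0 = \bigcup_{i=1}^{p} \bar{\Omega}_{0,i} \quad \text{with} \quad \Omega_{0,i} \cap \Omega_{0,j} = \emptyset \ \text{ for } i \neq j, \qquad \Gamma_{0,i} = \partial\Omega_{0,i}, \tag{30}$$

see Fig. 2. The local interfaces are given by Γ0,ij := Γ0,i ∩ Γ0,j for all i < j. The skeleton of the domain decomposition (30) is denoted as

$$\Gamma_{0,C} := \bigcup_{i=1}^{p} \Gamma_{0,i} = \Gamma_0 \cup \bigcup_{i<j} \bar{\Gamma}_{0,ij}. \tag{31}$$

We assume that the finite element mesh $\mathcal{T}_N$ matches the domain decomposition (30), i.e., we can reorder the degrees of freedom to rewrite the linear system (26) as

$$\begin{pmatrix}
K'_{11}(\underline{U}_1^k) & & & K'_{1C}(\underline{U}_1^k) A_1 \\
& \ddots & & \vdots \\
& & K'_{pp}(\underline{U}_p^k) & K'_{pC}(\underline{U}_p^k) A_p \\
A_1^\top K'_{C1}(\underline{U}_1^k) & \cdots & A_p^\top K'_{Cp}(\underline{U}_p^k) & \sum_{i=1}^{p} A_i^\top K'_{CC}(\underline{U}_i^k) A_i
\end{pmatrix}
\begin{pmatrix} \Delta\underline{U}_{1,I}^k \\ \vdots \\ \Delta\underline{U}_{p,I}^k \\ \Delta\underline{U}_C^k \end{pmatrix}
= -\begin{pmatrix} \underline{K}_1(\underline{U}_1^k) \\ \vdots \\ \underline{K}_p(\underline{U}_p^k) \\ \sum_{i=1}^{p} A_i^\top \underline{K}_C(\underline{U}_i^k) \end{pmatrix}, \tag{32}$$

where the increments $\Delta\underline{U}_{i,I}^k$, the stiffness matrices $K'_{ii}(\underline{U}_i^k)$ and the terms on the right hand side $\underline{K}_i(\underline{U}_i^k)$, i = 1, …, p, are related to the local degrees of freedom within the subdomain Ω0,i. All terms with an index C correspond to degrees of freedom on the coupling boundary Γ0,C, see (31), while Ai denote simple reordering matrices taking Boolean values.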

Figure 2.

Decomposition of a domain Ω0 into four subdomains Ω0,i, i = 1, … , 4.

4.1 Classical FETI method

Starting from (32), the tearing is now carried out by

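$$\Delta\underline{U}_i = \begin{pmatrix} \Delta\underline{U}_{i,I}^k \\ A_i\,\Delta\underline{U}_C^k \end{pmatrix}, \qquad
K_i' = \begin{pmatrix} K'_{ii}(\underline{U}_i^k) & K'_{iC}(\underline{U}_i^k) \\ K'_{Ci}(\underline{U}_i^k) & K'_{CC}(\underline{U}_i^k) \end{pmatrix}, \qquad
\underline{f}_i = -\begin{pmatrix} \underline{K}_i(\underline{U}_i^k) \\ \underline{K}_C(\underline{U}_i^k) \end{pmatrix}, \tag{33}$$

where $A_i\,\Delta\underline{U}_C^k$ is related to the degrees of freedom of the subdomain Ω0,i on the coupling boundary Γ0,C. As the unknowns $\Delta\underline{U}_i$ are typically not continuous over the interfaces we have to ensure the continuity of the solution on the interface, i.e.

$$\Delta\underline{U}_i = \Delta\underline{U}_j \quad \text{on } \Gamma_{0,ij}, \quad i, j = 1, \ldots, p. \tag{34}$$

This is done by applying the interconnecting

$$\sum_{i=1}^{p} B_i\,\Delta\underline{U}_i = \underline{0}, \tag{35}$$

where the matrices Bi are constructed from {0, 1, −1} such that (34) holds. By using discrete Lagrange multipliers $\underline{\lambda}$ to enforce the constraint (35) we finally have to solve the linear system

$$\begin{pmatrix}
K_1' & & & B_1^\top \\
& \ddots & & \vdots \\
& & K_p' & B_p^\top \\
B_1 & \cdots & B_p & 0
\end{pmatrix}
\begin{pmatrix} \Delta\underline{U}_1 \\ \vdots \\ \Delta\underline{U}_p \\ \underline{\lambda} \end{pmatrix}
= \begin{pmatrix} \underline{f}_1 \\ \vdots \\ \underline{f}_p \\ \underline{0} \end{pmatrix}. \tag{36}$$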

4.2 All-floating FETI method

The idea of this special FETI method, cf., e.g., Of and Steinbach [19], is to treat all subdomains as floating subdomains, i.e. domains with no Dirichlet boundary conditions. In addition to the standard procedure of ‘gluing’ the subregions along the auxiliary interfaces, the Lagrange multipliers are now also used to impose the Dirichlet boundary conditions, see Fig. 3. This simplifies the implementation of the FETI procedure since all subdomains can be treated in the same way. In addition, some tests (Section 5) show that the approach is more efficient than classical FETI and that the asymptotic behavior improves. This is due to the mapping properties of the Steklov–Poincaré operator, see [19, Remark 1]. The drawback is an increased number of degrees of freedom and Lagrange multipliers. Compare also Dostál et al. [20] for the related Total-FETI method. If all regions are treated as floating subdomains, the Dirichlet boundary conditions are no longer satisfied automatically; they have to be enforced through the system of constraints using the slightly modified interconnecting

$$\sum_{i=1}^{p} \tilde{B}_i\,\Delta\underline{U}_i = \underline{b}, \tag{37}$$

where $\tilde{B}_i$ is a block matrix of the form $\tilde{B}_i = [B_i, B_{D,i}]$ and the vector $\underline{b}$ is of the form $\underline{b} = [\underline{0}, \underline{b}_D]$ such that $B_{D,i}[j, k] = 1$ if and only if k is the index of a Dirichlet node j of the subdomain Ω0,i, while $\underline{b}_D[j]$ equals the Dirichlet value corresponding to the vertex Xk ∈ Γ0,D, see also [19].

Figure 3.

Fully redundant classical FETI (a) and all-floating FETI (b) formulation: Ω0,i, i = 1, … , 5, denote the local subdomains, the black dots correspond to the subdomain vertices and the dashed lines correspond to the constraints (34). The gray strip indicates Dirichlet boundary conditions. Note that the number of constraints for the all-floating approach rises with the number of vertices on the Dirichlet boundary.

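For three-dimensional elasticity problems all subdomain stiffness matrices now have the same known defect, which equals six, the number of rigid body motions; this also simplifies the calculation of the generalized inverse matrices $K_i^\dagger$ needed later. For all-floating FETI we finally get the linearized system of equations

$$\begin{pmatrix}
K_1' & & & \tilde{B}_1^\top \\
& \ddots & & \vdots \\
& & K_p' & \tilde{B}_p^\top \\
\tilde{B}_1 & \cdots & \tilde{B}_p & 0
\end{pmatrix}
\begin{pmatrix} \Delta\underline{U}_1 \\ \vdots \\ \Delta\underline{U}_p \\ \underline{\lambda} \end{pmatrix}
= \begin{pmatrix} \underline{f}_1 \\ \vdots \\ \underline{f}_p \\ \underline{b} \end{pmatrix}. \tag{38}$$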

4.3 Solving the FETI system

To solve the linearized systems (36) and (38) we follow the standard approach of tearing and interconnecting methods. For convenience we outline the procedure by means of the classical FETI formulation (Section 4.1). However the modus operandi is analogous for the all-floating approach.

First, note that in the case of a floating subdomain Ω0,i, i.e. $\Gamma_{0,i} \cap \Gamma_{0,D} = \emptyset$, the local matrices $K_i'$ are not invertible. Hence, we introduce a generalized inverse $K_i^\dagger$ to represent the local solutions as

$$\Delta\underline{U}_i = K_i^\dagger\,(\underline{f}_i - B_i^\top \underline{\lambda}) + \sum_{k=1}^{6} \gamma_{k,i}\,\underline{r}_{k,i}. \tag{39}$$

Here, $\underline{r}_{k,i} \in \ker K_i'$ correspond to the rigid body motions of elasticity and γk,i are unknown constants. For floating subdomains we additionally require the solvability conditions

$$(\underline{f}_i - B_i^\top \underline{\lambda},\ \underline{r}_{k,i}) = 0 \quad \text{for } k = 1, \ldots, 6. \tag{40}$$

In the case of a non-floating subdomain, i.e. $\ker K_i' = \{0\}$, we may set $K_i^\dagger = (K_i')^{-1}$. Note that it may happen that the kernel $\ker K_i'$ is non-trivial and its dimension is lower than 6. This is the case if the set Γ0,i ∩ Γ0,D is either a vertex or an edge. For classical FETI methods this requires the implementation of an effective method to identify these kernels reliably. This is a key advantage of the all-floating FETI approach: all subdomains are treated as floating subdomains, and hence the kernel of each local operator is known, with $\dim(\ker K_i') = 6$. With these kernels the solution of the local problems to find the generalized inverse $K_i^\dagger$ can be reduced to sparse systems which are typically better conditioned than the systems arising from the FETI-DP method, see Brzobohatý et al. [30]. Section 4.2 describes the all-floating approach, in which the Dirichlet boundary conditions are also incorporated by using discrete Lagrange multipliers.

In general, the Schur complement system of (36) is constructed to obtain

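$$\sum_{i=1}^{p} B_i K_i^\dagger B_i^\top \underline{\lambda} - \sum_{i=1}^{p} \sum_{k=1}^{6} \gamma_{k,i}\,B_i\,\underline{r}_{k,i} = \sum_{i=1}^{p} B_i K_i^\dagger \underline{f}_i, \qquad (\underline{f}_i - B_i^\top \underline{\lambda},\ \underline{r}_{k,i}) = 0. \tag{41}$$

This can be expressed as

$$\begin{pmatrix} F & -G \\ G^\top & 0 \end{pmatrix} \begin{pmatrix} \underline{\lambda} \\ \underline{\gamma} \end{pmatrix} = \begin{pmatrix} \underline{d} \\ \underline{e} \end{pmatrix}, \tag{42}$$

with

$$F = \sum_{i=1}^{p} B_i K_i^\dagger B_i^\top, \qquad G = \sum_{i=1}^{p} \sum_{k=1}^{6} B_i\,\underline{r}_{k,i}, \qquad \underline{d} = \sum_{i=1}^{p} B_i K_i^\dagger \underline{f}_i, \tag{43}$$

and $\underline{e}$ is constructed using $e_{k,i} = (\underline{f}_i, \underline{r}_{k,i})$ for i = 1, …, p and k = 1, …, 6. For the solution of the linearized system (42) the projection

$$P := I - G\,(G^\top G)^{-1} G^\top \tag{44}$$

is introduced. It now remains to consider the projected system

$$P F \underline{\lambda} = P \underline{d}. \tag{45}$$

This can be solved by using a parallel iterative method with suitable preconditioning of the form

$$M^{-1} := \sum_{i=1}^{p} B_{D,i}\,Y_i\,B_{D,i}^\top, \tag{46}$$

with modified jump operators $B_{D,i}$ which are obtained by multiplicity scaling, see [24, 51]. Since the local subproblems all yield symmetric tangent stiffness matrices $K_i'$, i = 1, …, p, cf. Section 3, the matrix PF is also symmetric. This enables us to use the CG method as the global solver for (45). Be aware that the initial approximate solution $\underline{\lambda}^0$ has to satisfy the compatibility condition $G^\top \underline{\lambda}^0 = \underline{e}$. A possible choice is

$$\underline{\lambda}^0 = G\,(G^\top G)^{-1}\,\underline{e}. \tag{47}$$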
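The role of the projection (44) and of the initial guess (47) can be verified on a small example. In the sketch below the matrix G (three multipliers, two rigid body modes) and the vector e are illustrative numbers, and the helper functions are assumptions introduced only for this check.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

G = [[1.0, -1.0], [1.0, 0.0], [0.0, 1.0]]   # illustrative: 3 multipliers, 2 modes
e = [[0.5], [0.5]]                          # illustrative right-hand side (column vector)

Gt = transpose(G)
GtG_inv = inverse_2x2(matmul(Gt, G))

# P = I - G (G^T G)^{-1} G^T, cf. (44)
GGt = matmul(G, matmul(GtG_inv, Gt))
P = [[(1.0 if i == j else 0.0) - v for j, v in enumerate(row)]
     for i, row in enumerate(GGt)]

lam0 = matmul(G, matmul(GtG_inv, e))        # initial guess, cf. (47)

PG = matmul(P, G)            # vanishes: the projection removes the gamma term of (42)
PP = matmul(P, P)            # equals P: P is a projection
Gt_lam0 = matmul(Gt, lam0)   # reproduces e: compatibility condition holds
```

Because P G = 0, applying P to the first block row of (42) eliminates the unknown constants γ, which is what leads to the projected system (45).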

In a post processing we finally recover the vector of constants

$$\underline{\gamma} = (G^\top G)^{-1} G^\top (F \underline{\lambda} - \underline{d}), \tag{48}$$

and subsequently the desired solution (39).
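The whole procedure can be traced on a one-dimensional model problem. The sketch below applies the all-floating idea to −u″ = 1 on (0, 1) with u(0) = 0 and u(1) = 0.1, using two subdomains with two linear elements each. It is an illustration under several assumptions: the kernel is one-dimensional (a single translation instead of six rigid body modes), the generalized inverse is obtained by fixing one degree of freedom, and the small saddle point system is solved directly instead of by the projected CG method. With the constraint vector b of (37), the first block row of (42) has the right hand side d − b. All names are hypothetical.

```python
def solve(A, rhs):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rmatvec(A, v):
    """Transposed product A^T v."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]

h = 0.25
K = [[1 / h, -1 / h, 0.0], [-1 / h, 2 / h, -1 / h], [0.0, -1 / h, 1 / h]]  # floating
f = [h / 2, h, h / 2]          # consistent load vector for the source term 1
r = [1.0, 1.0, 1.0]            # kernel of K: rigid translation

def k_dagger(rhs):
    """Generalized inverse action: fix the first dof, solve the remaining block."""
    return [0.0] + solve([row[1:] for row in K[1:]], rhs[1:])

# Constraint rows: interface continuity, Dirichlet at x = 0, Dirichlet at x = 1.
B = [[[0, 0, 1], [1, 0, 0], [0, 0, 0]],      # constraint matrix of subdomain 1
     [[-1, 0, 0], [0, 0, 0], [0, 0, 1]]]     # constraint matrix of subdomain 2
b = [0.0, 0.0, 0.1]
m, p = len(b), len(B)

# F, G, d and e as in (43), assembled entry by entry.
F = [[0.0] * m for _ in range(m)]
for j in range(m):
    unit = [1.0 if k == j else 0.0 for k in range(m)]
    for Bi in B:
        col = matvec(Bi, k_dagger(rmatvec(Bi, unit)))
        for k in range(m):
            F[k][j] += col[k]
G = [[matvec(Bi, r)[k] for Bi in B] for k in range(m)]
d = [sum(matvec(Bi, k_dagger(f))[k] for Bi in B) for k in range(m)]
e = [sum(fk * rk for fk, rk in zip(f, r)) for _ in B]

# Saddle point system [F  -G; G^T  0] [lambda; gamma] = [d - b; e].
A = [F[k] + [-g for g in G[k]] for k in range(m)]
A += [[G[k][i] for k in range(m)] + [0.0] * p for i in range(p)]
sol = solve(A, [d[k] - b[k] for k in range(m)] + e)
lam, gamma = sol[:m], sol[m:]

# Local recovery as in (39): u_i = K^+ (f - B_i^T lambda) + gamma_i r.
u = []
for Bi, gi in zip(B, gamma):
    loc = k_dagger([fk - t for fk, t in zip(f, rmatvec(Bi, lam))])
    u.append([lk + gi * rk for lk, rk in zip(loc, r)])
```

The two local vectors in `u` hold the nodal values on [0, 0.5] and [0.5, 1]; both subdomains are handled by exactly the same code path, which is the implementation advantage of the all-floating formulation mentioned in Section 4.2.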

4.4 Preconditioning

Following Farhat et al. [17] we apply either the lumped preconditioner

$$M_L^{-1} := \sum_{i=1}^{p} B_{D,i}\,K_i'\,B_{D,i}^\top, \tag{49}$$

or the optimal Dirichlet preconditioner

$$M_D^{-1} := \sum_{i=1}^{p} B_{D,i} \begin{pmatrix} 0 & 0 \\ 0 & S_i \end{pmatrix} B_{D,i}^\top, \tag{50}$$

where

$$S_i = K'_{CC}(\underline{U}_i^k) - K'_{Ci}(\underline{U}_i^k)\,{K'_{ii}}^{-1}(\underline{U}_i^k)\,K'_{iC}(\underline{U}_i^k) \tag{51}$$

is the Schur complement of the local finite element matrix $K_i'$. Alternatively, one may also use scaled hypersingular boundary integral operator preconditioners, as proposed in [52]. For comparison we employ an identity preconditioner, which is obtained by using the identity matrix for Yi in eq. (46).
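The Schur complement (51) can be illustrated with a small dense example. The sketch below uses the stiffness matrix of a one-dimensional bar with three linear elements as a stand-in for a local matrix; this model matrix, the index sets and the function names are assumptions for illustration only.

```python
def solve(A, rhs):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def schur_complement(K, interior, interface):
    """S = K_CC - K_Ci K_ii^{-1} K_iC for index lists interior (i) and interface (C)."""
    K_ii = [[K[r][c] for c in interior] for r in interior]
    # Columns of K_ii^{-1} K_iC, one local solve per interface dof.
    cols = [solve(K_ii, [K[r][c] for r in interior]) for c in interface]
    return [[K[a][c] - sum(K[a][r] * x for r, x in zip(interior, col))
             for c, col in zip(interface, cols)]
            for a in interface]

h = 0.25
K = [[1 / h, -1 / h, 0.0, 0.0],
     [-1 / h, 2 / h, -1 / h, 0.0],
     [0.0, -1 / h, 2 / h, -1 / h],
     [0.0, 0.0, -1 / h, 1 / h]]              # 1D bar, three elements of length h
S = schur_complement(K, interior=[1, 2], interface=[0, 3])
```

For this bar the result is the stiffness of a single element of length 3h acting between the two end nodes, which shows how S condenses the interior degrees of freedom onto the interface. The lumped variant (49) avoids these interior solves and applies the full local matrix instead.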

5 Numerical Results

In this section some representative numerical examples for the finite element tearing and interconnecting approach for linear and nonlinear elasticity problems are presented. First, the FETI implementation is tested within linear elasticity. Here we are able to compare the computed results to a given exact solution. This enables us to show the efficiency of our implementation and also the convergence rates, as predicted from the theory. We compare the different preconditioning techniques and present differences between the classical FETI and the all-floating FETI approach.

Subsequently, we apply the FETI method to nonlinear elasticity problems. Thereby, we focus on the anisotropic model, as described in Section 2, and use realistic triangulations of the aorta and a common carotid artery. As in the linear elastic case, different preconditioning techniques for the all-floating and for the classical FETI method are compared. In Section 5.3, we analyze the biomechanical behavior of an aorta up to an internal pressure of 300 mmHg and plot stress and displacement evolutions as a function of the internal pressure. Finally, in Section 5.4, we analyze our computational framework with respect to strong scaling properties.

The calculations were performed by using the VSC2-cluster (http://vsc.ac.at/) in Vienna. This Linux cluster features 1314 compute nodes, each with two AMD Opteron Magny Cours 6132HE (8 Cores, 2.2 GHz) processors and 8 × 4 GB RAM. This yields a total of 21 024 available processing units. As local direct solver we use Pardiso [53, 54], included in Intel’s Math Kernel Library (MKL).

5.1 Linear elasticity

In this section of numerical benchmarks we consider a linear elastic problem with the academic example of a unit cube which is decomposed into a certain number of subcubes. Dirichlet boundary conditions are imposed all over the surface ΓD = Ω. The parameters used are Young’s modulus E = 210 GPa and Poisson’s ratio ν = 0.45. The calculated solution is compared to the fundamental solution of linear elastostatics

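$$U^*_{1k}(x, x^*) = \frac{1}{8\pi}\,\frac{1}{E}\,\frac{1+\nu}{1-\nu}\left[(3-4\nu)\,\frac{\delta_{1k}}{|x-x^*|} + \frac{(x_1-x_1^*)(x_k-x_k^*)}{|x-x^*|^3}\right], \quad k = 1, 2, 3 \tag{52}$$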

for all x ∈ Ω, where x* ∈ ℝ³ is an arbitrary point outside of the domain Ω and δij is the Kronecker delta, see [55]. We compare the different preconditioning strategies as well as the all-floating and classical FETI approaches. As global iterative method we use the CG method with a relative error reduction of ε = 10⁻⁸. We consider a linear elasticity problem discretized with linear tetrahedral elements (P1 elements) and a uniform refinement over five levels (ℓ = 1, …, 5) on a cube decomposed into 512 subdomains.

Hence, the number of degrees of freedom associated with the coarsest mesh is 9 981 for the all-floating FETI approach and 6 621 for the classical FETI approach. The difference of the numbers is due to the decoupling of the Dirichlet boundary ΓD. For the finest mesh we have 31 116 861 (all-floating) and 31 073 181 (classical) degrees of freedom. The number of Lagrange multipliers varies between 38 052 for level 1 and 2 908 692 for level 5. Again we have a higher number of Lagrange multipliers for the all-floating approach due to the decoupling of the Dirichlet boundary conditions. The computations were performed on VSC2 using 512 processing units.
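The reference solution (52) is cheap to evaluate pointwise. The following is a minimal sketch of such an evaluation, not the implementation used for the experiments; the function name `kelvin_u1k` and the choice of source point are hypothetical, and the source point only has to lie outside the unit cube.

```python
import math


def kelvin_u1k(x, x_src, k, E=210.0e9, nu=0.45):
    """Entry U*_{1k}(x, x_src) of the fundamental solution (52), k in {1, 2, 3}."""
    d = [xi - si for xi, si in zip(x, x_src)]
    r = math.sqrt(sum(di * di for di in d))
    if r == 0.0:
        raise ValueError("source point must differ from the evaluation point")
    delta_1k = 1.0 if k == 1 else 0.0
    prefactor = (1.0 / (8.0 * math.pi)) * (1.0 / E) * (1.0 + nu) / (1.0 - nu)
    return prefactor * ((3.0 - 4.0 * nu) * delta_1k / r + d[0] * d[k - 1] / r**3)


# Evaluate at the cube center for a (hypothetical) source point outside the cube.
x_center = (0.5, 0.5, 0.5)
x_source = (2.0, 0.0, 0.0)
u_ref = [kelvin_u1k(x_center, x_source, k) for k in (1, 2, 3)]
print(u_ref)
```

For x − x* along the first coordinate axis with unit distance, the bracket reduces to 4(1 − ν), so the first component equals (1 + ν)/(2πE); this gives a quick sanity check of the prefactor.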

First note in Table 2 that for all examined settings, the L2 error, i.e.

$$\|u - u_h\|_{L_2(\Omega)}, \tag{53}$$

where uh is the approximate and u the exact solution, and the estimated order of convergence

$$\mathrm{eoc} = \frac{\ln\|u - u_{h,\ell}\|_{L_2(\Omega)} - \ln\|u - u_{h,\ell+1}\|_{L_2(\Omega)}}{\ln 2} \tag{54}$$

behave as predicted by the theory, i.e. the convergence is of second order. As expected, the lowest iteration numbers were observed for the optimal Dirichlet preconditioner. Nonetheless, since the lumped preconditioner requires no additional setup time, in contrast to the more sophisticated Dirichlet preconditioner, it yields comparable computational times for each level of refinement. For comparison we also list the results of a very simple preconditioning technique, using the identity matrix for Yi in (46), for which almost no reduction of the condition numbers can be noticed.

Table 2.

Iteration numbers (it.), condition numbers (cond.) and computational time (in s) for each preconditioning technique using P1 elements; ℓ is the level of uniform refinement. For the L2 error the definition is given in (53), while for the estimated order of convergence eoc the definition is given in (54).

all-floating
     identity prec.            lumped prec.              Dirichlet prec.
ℓ    it.   cond.    time       it.   cond.    time       it.   cond.    time       L2 error   eoc
1    61    53.6     20.9 s     27    10.3     19.7 s     21    7.6      19.5 s     1.42E-04   -
2    71    70.0     19.6 s     38    19.7     18.8 s     26    10.4     18.4 s     3.71E-05   1.94
3    88    108.8    21.7 s     45    26.1     22.3 s     27    9.7      22.3 s     9.40E-06   1.98
4    119   216.8    28.8 s     62    53.2     26.4 s     32    13.1     26.6 s     2.37E-06   1.99
5    160   432.7    116.6 s    91    126.2    99.0 s     37    16.8     105.9 s    5.96E-07   1.99

classical
     identity prec.            lumped prec.              Dirichlet prec.
ℓ    it.   cond.    time       it.   cond.    time       it.   cond.    time       L2 error   eoc
1    80    98.2     7.1 s      35    14.1     5.9 s      29    10.0     5.9 s      1.47E-04   -
2    105   161.4    7.8 s      58    41.9     6.1 s      37    16.4     5.8 s      3.72E-05   1.98
3    140   295.7    9.3 s      85    105.9    7.9 s      46    25.4     7.7 s      9.41E-06   1.98
4    188   580.9    15.2 s     125   252.1    13.1 s     54    35.8     12.2 s     2.37E-06   1.99
5    251   1150.3   103.4 s    179   555.7    88.2 s     60    46.3     83.6 s     5.96E-07   1.99
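The eoc column of Table 2 follows directly from the tabulated L2 errors via (54). As a small sketch (the helper name `estimated_order` is ours and not part of the original implementation), the all-floating values can be reproduced as follows:

```python
import math


def estimated_order(errors):
    """Estimated order of convergence between consecutive refinement levels, cf. (54)."""
    return [
        (math.log(e_coarse) - math.log(e_fine)) / math.log(2.0)
        for e_coarse, e_fine in zip(errors, errors[1:])
    ]


# L2 errors of the all-floating approach for levels 1 to 5, taken from Table 2.
l2_errors = [1.42e-04, 3.71e-05, 9.40e-06, 2.37e-06, 5.96e-07]
eoc = estimated_order(l2_errors)
print([round(v, 2) for v in eoc])  # [1.94, 1.98, 1.99, 1.99]
```

Each uniform refinement halves the mesh size, which is why the logarithm of the error ratio is divided by ln 2.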

Moreover, we observe that all-floating FETI yields better condition numbers for all preconditioners, and hence better convergence rates of the global conjugate gradient method. Although the global iterative method converges in fewer iterations for this approach, the classical FETI method achieves lower computational times for the linear elastic case with P1 elements. This is mainly due to the longer setup time of the all-floating FETI system, the larger coarse matrix G⊤G, cf. (44), and the larger number of Lagrange multipliers.
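The link between condition numbers and iteration counts can be made tangible with the standard CG estimate, which bounds the energy-norm error after k steps by 2((√c − 1)/(√c + 1))^k times the initial error, where c is the condition number of the preconditioned system. The sketch below evaluates this bound for values from Table 2. It is only indicative: we assume that the tabulated condition numbers are estimates and that the stopping criterion of the experiments may be measured in a different norm, so the bound need not match the observed counts exactly.

```python
import math


def cg_iteration_bound(cond, eps=1.0e-8):
    """Smallest k with 2 * ((sqrt(cond) - 1) / (sqrt(cond) + 1))**k <= eps."""
    if cond <= 1.0:
        return 1
    q = (math.sqrt(cond) - 1.0) / (math.sqrt(cond) + 1.0)
    return math.ceil(math.log(eps / 2.0) / math.log(q))


# (condition number, observed iterations) at level 5 with the Dirichlet
# preconditioner, taken from Table 2.
level5_dirichlet = {"all-floating": (16.8, 37), "classical": (46.3, 60)}
for name, (cond, observed) in level5_dirichlet.items():
    print(name, "bound:", cg_iteration_bound(cond), "observed:", observed)
```

In both cases the observed iteration numbers stay below the bound, and the smaller condition number of the all-floating approach translates into the smaller bound.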

From level 4, with a maximum of 8 907 local degrees of freedom, to level 5, with a maximum of 66 195 local degrees of freedom, the local assembling and factorization time increases from approximately 1.8 seconds to about 13 seconds for all preconditioners. This is mainly due to the higher memory requirements of the direct solver. Note also that the factorization of the local stiffness matrices by the direct solver becomes infeasible if the number of local degrees of freedom gets too large, owing to memory limitations on the VSC2 cluster. One way to overcome this problem is the use of fast local iterative solvers, e.g., the CG method with a multigrid or a BPX preconditioner. In summary, the simple lumped preconditioner and the classical FETI approach appear to be favorable for this academic example, with very structured subdomains and the boundary ΓD = ∂Ω. The latter yields a large number of subdomains that are floating for all-floating FETI but non-floating for the classical FETI approach, and hence a much larger coarse matrix G⊤G for all-floating FETI. The inversion of this matrix is the most time-consuming part for the levels ℓ = 1, …, 4, which also explains the higher computational time for all-floating FETI in these cases.
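To get a feeling for the difference in coarse problem size, consider the following back-of-the-envelope sketch. It rests on assumptions that are not stated explicitly in the text: the 512 subdomains form a regular 8 × 8 × 8 arrangement of subcubes, a subdomain is floating in the classical approach only if it does not touch ∂Ω, and every floating subdomain contributes the six rigid body modes of three-dimensional elasticity to the coarse space.

```python
def coarse_sizes(n_per_direction=8, rigid_body_modes=6):
    """Coarse space dimensions (all-floating, classical) for a cube of subcubes."""
    n_total = n_per_direction ** 3
    # Subcubes that do not touch the boundary of the cube.
    n_interior = max(n_per_direction - 2, 0) ** 3
    return rigid_body_modes * n_total, rigid_body_modes * n_interior


all_floating_size, classical_size = coarse_sizes()
print(all_floating_size, classical_size)  # 3072 1296
```

Under these assumptions the coarse matrix of the all-floating approach has more than twice as many rows as that of the classical approach, which is in line with the higher setup cost observed above.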

Next, we consider a linear elastic problem using tetrahedral elements with quadratic ansatz functions, i.e. P2 elements, for the same mesh and parameters as above. The number of degrees of freedom now varies between 53 181 (level ℓ = 1) and 26 398 269 (level ℓ = 4) and the number of Lagrange multipliers between 77 700 and 2 908 692. Note that for all preconditioning types and for both the all-floating and the classical FETI method the L2 error with respect to the fundamental solution behaves as predicted by the theory, i.e. we obtain a cubic convergence rate, see Table 3.

Table 3.

Iteration numbers (it.), condition numbers (cond.) and computational time (in s) for each preconditioning technique using P2 elements; ℓ is the level of uniform refinement. For the L2 error the definition is given in (53), while for the estimated order of convergence eoc the definition is given in (54).

all-floating
     identity prec.            lumped prec.              Dirichlet prec.
ℓ    it.   cond.    time       it.   cond.    time       it.   cond.    time       L2 error   eoc
1    149   444.7    23.3 s     73    73.7     22.0 s     47    36.7     18.7 s     1.13E-05   -
2    129   330.8    21.9 s     75    74.3     20.8 s     43    27.7     19.3 s     1.44E-06   2.97
3    114   210.3    30.3 s     73    68.8     27.3 s     36    16.6     28.5 s     1.81E-07   2.99
4    105   167.8    99.8 s     69    65.2     93.4 s     33    14.4     90.2 s     2.26E-08   3.00

classical
     identity prec.            lumped prec.              Dirichlet prec.
ℓ    it.   cond.    time       it.   cond.    time       it.   cond.    time       L2 error   eoc
1    120   405.0    7.5 s      65    48.9     6.9 s      40    21.0     6.5 s      1.17E-05   -
2    108   302.6    7.5 s      69    57.6     6.7 s      41    20.6     7.5 s      1.46E-06   3.00
3    112   253.4    12.6 s     91    116.2    11.7 s     42    21.0     12.3 s     1.82E-07   3.01
4    136   273.1    76.3 s     128   262.8    77.3 s     48    27.7     79.1 s     2.26E-08   3.01

For all-floating FETI we have the very interesting case that the global CG iteration numbers remain almost constant for the lumped preconditioner, and there even seems to be a decrease for the identity and the Dirichlet preconditioner, if we increase the local degrees of freedom, i.e. increase the refinement level ℓ.

For the classical FETI approach the iteration numbers stay almost constant for the Dirichlet preconditioner and increase marginally for the other two preconditioning techniques. Concerning the computational time we have an analogous result as in the previous case with linear ansatz functions: the classical approach with the lumped preconditioner seems to be the best choice for this particular example.

5.2 Arterial model on a realistic mesh geometry

In this section we present examples to show the applicability of the FETI approaches for biomechanical applications, in particular the inflation of an artery segment. We consider the mesh of an aorta and the mesh of a common carotid artery, see Figs. 4 and 5. The geometries are from AneuriskWeb [56] and Gmsh [57]. The generation of the volume mesh was performed using VMTK and Gmsh [57].

Figure 4.

Mesh of an aorta seen from above showing the brachiocephalic artery, and the left common carotid and subclavian arteries. The fine mesh consists of 5 418 594 tetrahedrons and 1 055 901 vertices, while colors indicate the displacement field with an internal pressure of 1 mmHg. Additionally, the splits show the decomposition of the mesh into 480 subdomains (left). Coarser mesh consisting of 720 060 tetrahedrons and 150 725 vertices used in Section 5.3 with 5 selected vertices A–E (right); colors show the distribution of the stress magnitude σmag according to (56) with an internal pressure of 300 mmHg. For both images red indicates high and blue low values.

Figure 5.

Mesh of a segment of a common carotid artery from two different points of view. The mesh consists of 9 195 336 tetrahedrons and 1 621 365 vertices. Color indicates the distribution of the stress magnitude σmag according to (56) due to an internal pressure of 1 mmHg, red indicates high and blue low values. Additionally, the splits show the decomposition of the mesh into 512 subdomains.

The fiber directions, see Fig. 6 (right), were calculated using a method described by Bayer et al. [58] for the myocardium. To adapt this method for the artery we first solved the Laplace equation on the domain Ω0 with homogeneous Dirichlet boundary conditions on the inner surface and inhomogeneous Dirichlet boundary conditions on the outer wall. The gradient of the solution is used to define the transmural direction e^2 in each element. As a second step we repeat this procedure using homogeneous Dirichlet boundary conditions on the inlet surface and inhomogeneous boundary conditions on the outlet surfaces which yields the longitudinal direction e^1. The cross product of these two vectors eventually provides the circumferential direction e^0. With a rotation we get the two desired fiber directions a0,1 and a0,2 in the media and the adventitia, respectively. Thus,

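$$(a_{0,1}\ \ a_{0,2}\ \ \hat{e}_2) = (\hat{e}_0\ \ \hat{e}_1\ \ \hat{e}_2) \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} (\hat{e}_0\ \ \hat{e}_1\ \ \hat{e}_2)^{\top} (\hat{e}_0\ \ \hat{e}_1\ \ \hat{e}_2). \tag{55}$$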

The values of the angle α are 29° for the media and 62° for the adventitia, taken from [5].
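The construction of the local fiber frame can be sketched per element as follows. This is an illustration under explicit assumptions and not the authors' implementation: the longitudinal and transmural directions are taken as given (in practice they come from the gradients of the two Laplace solutions), the orientation of the cross product is our choice, and the two fiber families are assumed to be obtained by rotating the circumferential direction by +α and −α within the plane spanned by ê0 and ê1, as is common for fiber-reinforced arterial models. All function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / n for c in v)


def fiber_directions(e1, e2, alpha_deg):
    """Return (e0, a1, a2) from longitudinal e1, transmural e2 and fiber angle."""
    e1, e2 = normalize(e1), normalize(e2)
    e0 = normalize(cross(e1, e2))  # circumferential direction (assumed orientation)
    c = math.cos(math.radians(alpha_deg))
    s = math.sin(math.radians(alpha_deg))
    a1 = tuple(c * e0[i] + s * e1[i] for i in range(3))  # family at +alpha (assumed)
    a2 = tuple(c * e0[i] - s * e1[i] for i in range(3))  # family at -alpha (assumed)
    return e0, a1, a2


# Media (alpha = 29 degrees) for a vessel aligned with the z-axis at a point on the x-axis.
e0, a1, a2 = fiber_directions((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), 29.0)
print(e0, a1, a2)
```

With this convention both fiber directions are unit vectors, lie in the tangential plane of the vessel wall, and enclose the angle 2α.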

Figure 6.

Distribution of the stress magnitude σmag inside the aorta (left); values of high stress in red and of low stress in blue. To the right the fiber directions (black curves) and the two layers (adventitia in red and media in orange) of the carotid artery are shown.

To describe the anisotropic and nonlinear arterial tissue, we use the material model (13)–(17) with the parameters given in Table 1, while the penalty parameter κ is varied. Dirichlet boundary conditions (7) are imposed on the respective intersection areas. We perform an inflation simulation on the artery segment where the interior wall is exposed to a constant pressure p, which is realized by Neumann boundary conditions (8). If not stated otherwise, we present the results of one load step applying a rather low pressure of 1 mmHg. This is necessary to have a converging Newton method. Nonetheless, the material model as used is anisotropic. To simulate a higher pressure, an appropriate load stepping scheme, see (25), has to be used. However, this does not affect the number of local iterations significantly. As already mentioned in Section 4 we use the CG method as global iterative solver. Experiments with a standard non-symmetric nonlinear elasticity system, which requires the GMRES method as iterative solver, showed results similar to those presented in the following for the symmetric system. However, the memory requirements of the GMRES solver are much higher.

The local generalized pseudo-inverse matrices are realized with a sparsity preserving regularization by fixing nodes, see, e.g., [30], and the direct solver package Pardiso. The global nonlinear finite element system is solved by a Newton scheme, where the FETI approach is used in each Newton step. For the considered examples the Newton scheme needed four to six iterations. Due to the non-uniformity of the subdomains the efficiency of a global preconditioner becomes more important. It may happen that the decomposition of a mesh results in subdomains that have only a few points on the Dirichlet boundary. This negatively affects the convergence of the CG method using classical FETI, but does not affect the global iterative method of the all-floating approach at all. This is a major advantage of all-floating FETI since here all subdomains are treated the same, and hence all subdomains are stabilized. This behavior is observed for almost all settings of the preconditioners and the penalty parameter κ as well as for linear and quadratic ansatz functions, see Tables 4–7.

Table 4.

Iteration numbers (it.) per Newton step and computational time (in s) per Newton step for the all-floating and the classical FETI approach with linear ansatz functions comparing the three considered preconditioners. The penalty parameter κ was varied from 10 to 1000 kPa. Mesh: mesh of the aorta subdivided in 480 subdomains, computed with 480 cores.

all-floating
κ identity preconditioner lumped preconditioner Dirichlet preconditioner
10 1052 it. 57.6 s 160 it. 31.0 s 56 it. 22.8 s
100 1879 it. 94.6 s 305 it. 29.5 s 85 it. 25.4 s
1000 4122 it. 177.1 s 681 it. 48.8 s 209 it. 31.8 s

classical
κ identity preconditioner lumped preconditioner Dirichlet preconditioner
10 2056 it. 98.7 s 305 it. 35.5 s 117 it. 27.2 s
100 3711 it. 149.8 s 540 it. 35.5 s 144 it. 28.4 s
1000 8245 it. 327.8 s 1190 it. 60.9 s 263 it. 32.9 s

Table 5.

Iteration numbers (it.) per Newton step and computational time (in s) per Newton step for the all-floating and the classical FETI approach with linear ansatz functions comparing the three considered preconditioners. The penalty parameter κ was set to 1000 kPa. Mesh: mesh of the carotid artery with two layers (adventitia and media) subdivided in 512 subdomains, computed with 512 cores.

type identity preconditioner lumped preconditioner Dirichlet preconditioner
all-floating > 10000 it. − s 1084 it. 100.6 s 497 it. 85.5 s
classical 5130 it. 357 s 1794 it. 200.2 s 588 it. 97.7 s

Table 6.

Iteration numbers (it.) per Newton step and computational time (in s) per Newton step for the all-floating and the classical FETI approach with quadratic ansatz functions comparing the three considered preconditioners. The penalty parameter κ was varied from 10 to 1000 kPa. Mesh: mesh of the aorta subdivided in 480 subdomains, computed with 480 cores.

all-floating
κ identity preconditioner lumped preconditioner Dirichlet preconditioner
10 940 it. 491.1 s 283 it. 209.5 s 71 it. 157.3 s
100 1519 it. 1186.4 s 523 it. 332.0 s 105 it. 178.1 s
1000 3371 it. 2584.5 s 1372 it. 746.0 s 206 it. 282.7 s

classical
κ identity preconditioner lumped preconditioner Dirichlet preconditioner
10 1319 it. 654.2 s 333 it. 225.2 s 113 it. 188.4 s
100 2362 it. 1140.6 s 664 it. 402.6 s 110 it. 177.5 s
1000 5563 it. 4168.3 s 1742 it. 943.1 s 204 it. 280.1 s

Table 7.

Iteration numbers (it.) per Newton step and computational time (in s) per Newton step for the all-floating and the classical FETI approach with quadratic ansatz functions comparing the three considered preconditioners. The penalty parameter κ was set to 1000 kPa. Mesh: mesh of the carotid artery with two layers (adventitia and media) subdivided in 1024 subdomains, calculated with 1024 cores.

type identity preconditioner lumped preconditioner Dirichlet preconditioner
all-floating > 10000 it. − s 2163 it. 1133.9 s 674 it. 994.6 s
classical 6006 it. 2672.6 s 4798 it. 2306.8 s 764 it. 771.2 s

For example, applying all-floating FETI with the Dirichlet preconditioner to the mesh of the aorta with a penalty parameter κ = 1000 kPa, the global CG method converged in considerably fewer iterations (209) than the CG method using classical FETI (263), see Table 4. The advantage of the smaller number of iterations is not reflected as strongly in the computational time since, as in the linear case, we have higher setup times and a larger coarse system G⊤G. Nonetheless, for the considered examples all-floating FETI yields lower iteration numbers of the global systems and is also competitive or even advantageous compared with the classical approach concerning the computational time.

In contrast to the academic example in Section 5.1, the more complex Dirichlet preconditioner is the best choice for all considered settings. Especially for κ ≫ 1 the iteration numbers with the lumped and the identity preconditioner escalate. Admittedly, the numbers in Table 4 also show that the convergence of the CG method, for all FETI approaches and preconditioner settings, depends on the penalty parameter κ.

Using quadratic ansatz functions we have a total number of 23 031 620 degrees of freedom for the aorta mesh and 36 527 435 degrees of freedom for the carotid artery mesh. In order not to exceed the memory limitations on the VSC2 cluster we have to use a decomposition into 1024 subdomains (instead of 512) for the carotid artery. For the aorta it was possible to stay with 480 subdomains. The numbers of Lagrange multipliers are then 1 552 665 (aorta) and 4 585 203 (carotid artery). A comparison of the numbers in Table 6 and Table 7 shows similar results as in the case with linear ansatz functions. The Dirichlet preconditioner is preferable for all test cases and the all-floating approach is competitive with the classical FETI approach. Although quadratic ansatz functions resolve the nearly incompressible elastic behavior better than linear ansatz functions, we also notice a correlation between the global iteration numbers and the penalty parameter κ, see Table 6. Nonetheless, the iteration numbers do not increase as much as for the P1P0 element case and the values of J = det F in each element are much closer to 1 for the P2P0 elements.

5.3 Load stepping scheme

In this section we analyze the biomechanical behavior of the aorta up to an internal pressure of 300 mmHg. Higher pressures would induce damage and softening behavior which cannot be captured with the arterial model discussed in Section 2. For that purpose we consider a coarser version of the mesh of the aorta (see Fig. 4), which is subdivided into 32 subdomains, since for this mesh the all-floating FETI method is significantly advantageous. The reasons are as follows: (i) we have lower iteration numbers for the all-floating FETI approach, as already observed in Section 5.2; (ii) the matrix G⊤G in (44) is small, and hence less time is needed to compute the inverse of this coarse system, especially in comparison to the assembly time and the global solving time of the CG method.

With this mesh we simulate an arterial model with the parameters from Table 1 and with c = 6 kPa and κ = 1000 kPa using the Dirichlet preconditioner. The results of a load stepping scheme, where we applied an internal pressure up to 300 mmHg over 572 loading steps, are shown in Figs. 7 and 8. Note that the average iteration number over one time step increased from 248 to 268 for all-floating FETI and from 340 to 358 for the classical FETI approach for higher pressures and, consequently, a more anisotropic material behavior. The simulation needed four to five Newton steps and the solving times for all-floating FETI are significantly shorter, see Fig. 8.
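The idea of load stepping can be illustrated with a scalar toy problem. The sketch below is not the scheme (25) and has nothing to do with the finite element system itself: it ramps the load in 572 equal increments up to 300 mmHg and solves, for each increment, an exponentially stiffening model law c(exp(bu) − 1) = p by Newton's method, starting from the previous solution. The law, the parameter b and all function names are hypothetical; only the pressure range, the number of steps and the value c = 6 kPa are borrowed from the text.

```python
import math

MMHG_TO_KPA = 0.133322  # 1 mmHg expressed in kPa


def newton(residual, tangent, u, tol=1.0e-10, max_updates=50):
    """Scalar Newton iteration; returns the root and the number of updates."""
    for n_updates in range(max_updates + 1):
        r = residual(u)
        if abs(r) < tol:
            return u, n_updates
        u -= r / tangent(u)
    raise RuntimeError("Newton iteration did not converge")


def load_stepping(p_max, n_steps, c=6.0, b=5.0):
    """Ramp the load from 0 to p_max for the toy law c * (exp(b * u) - 1) = p."""
    u, history = 0.0, []
    for step in range(1, n_steps + 1):
        p = p_max * step / n_steps
        # Warm start: the previous solution u is the initial guess.
        u, n_updates = newton(
            lambda v: c * (math.exp(b * v) - 1.0) - p,
            lambda v: c * b * math.exp(b * v),
            u,
        )
        history.append((p, u, n_updates))
    return history


p_max = 300.0 * MMHG_TO_KPA          # roughly 40 kPa
history = load_stepping(p_max, 572)  # 572 load steps as in the text
p_end, u_end, _ = history[-1]
print(len(history), p_end, u_end)
```

Because each increment is small (about 0.52 mmHg per step), the previous solution is a good initial guess and only a few Newton updates per step are needed in this toy setting.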

Figure 7.

Stress magnitude σmag versus relative displacement urel (left) and evolution of the displacement norm unorm over the load steps up to an internal pressure p of 300 mmHg (right). The plots were generated using data at the specific points A–E, as shown in Fig. 4 (right).

Figure 8.

Comparison of all-floating FETI (gray) and classical FETI (black) for a time stepping scheme. Average iteration numbers of one time step (left) and solving times in seconds for one time step (right) over 572 load steps.

In our plots we used a stress magnitude σmag according to

$$\sigma_{\mathrm{mag}} = \sqrt{\sigma_{11}^2 + \sigma_{22}^2 + \sigma_{33}^2 + 2\sigma_{12}^2 + 2\sigma_{13}^2 + 2\sigma_{23}^2}, \tag{56}$$

used as a measure to visualize our data. For advantages and disadvantages of certain stress values concerning the analysis of rupture and failure in aortic tissues, see, e.g., [59]. Other values used in Fig. 7 are the displacement norm unorm and the relative displacement urel, i.e.

$$u_{\mathrm{norm}} = \sqrt{u_1^2 + u_2^2 + u_3^2}, \qquad u_{\mathrm{rel}} = \frac{u_{\mathrm{norm}}}{u_{\max}}, \tag{57}$$

for a point with the displacement vector u = (u1, u2, u3) at the time step t, where umax is the largest displacement norm occurring at that point over all time steps.
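The post-processing quantities (56) and (57) are straightforward to compute. The sketch below reads (56) as the Frobenius norm of the symmetric stress tensor, which is an assumption on our part, and uses hypothetical function names and made-up sample values purely for illustration.

```python
import math


def stress_magnitude(sigma):
    """Frobenius norm of a symmetric 3x3 stress tensor, cf. (56).

    Summing all nine squared entries counts each off-diagonal pair twice,
    which yields the factors 2 in front of the shear components.
    """
    return math.sqrt(sum(sigma[i][j] ** 2 for i in range(3) for j in range(3)))


def displacement_norm(u):
    """Euclidean norm of a displacement vector, cf. (57)."""
    return math.sqrt(sum(c * c for c in u))


def relative_displacements(displacements):
    """u_rel per time step for one point; u_max is taken over all time steps."""
    norms = [displacement_norm(u) for u in displacements]
    u_max = max(norms)
    if u_max == 0.0:
        raise ValueError("all displacements are zero")
    return [n / u_max for n in norms]


# Made-up sample values for a single point.
sigma = [[1.0, 2.0, 0.0], [2.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
print(stress_magnitude(sigma))
print(relative_displacements([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]))
```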

5.4 Strong scaling for nonlinear elasticity

Here we analyze our computational framework with respect to strong scaling efficiency, i.e.

$$\mathrm{eff} = \frac{I\, t_I}{P\, t_P}, \tag{58}$$

where tI is the amount of time to complete a computation with the initial number of processing units I (in our case I = 16) and tP is the amount of time to complete the same computation with P processing units. In particular, we consider the meshes of the carotid artery and the aorta as in Section 5.2, both subdivided into 512 subdomains. We apply the arterial model with the parameters from Table 1 and use κ = 100 with the lumped preconditioner and linear ansatz functions. For the aorta we used all-floating FETI and needed an average of 324 global CG iterations to reach an absolute error of ε = 10⁻⁸ and 5 Newton steps to reach an absolute error of 10⁻⁶. In the case of the carotid artery and classical FETI we needed 674 global CG iterations and also 5 Newton steps to reach the same error limits as above.

In Tables 8 and 9 we present the following quantities. The local time is the sum of all assembling and local factorization times during the solution steps. The factorization of the local problems was performed with the direct solver package Pardiso. In most cases we observed a super-linear speedup, and hence an efficiency greater than 1 for this value. This is due to memory issues, mainly so-called cache effects. For more information on this well-known phenomenon, see, e.g., [60]. The global CG time is the duration of all CG solution steps together. We see that this value scales very well up to 256 cores for the aorta and up to 128 cores for the carotid artery. The total time is the total computational time including input and output functions. It also scales acceptably well up to 256 processing units for the aorta and up to 128 cores for the carotid artery, see Tables 8 and 9, and Fig. 9. For a higher number of cores, at least for the specific examples, the speedup is rather low. Possibilities to overcome this problem are, for example, the usage of parallel solver packages such as hypre and a more efficient assembling of the coarse system of the FETI method. A more elaborate strategy for the MPI communication and the memory management is also needed. Note that at some point the subdomains get too small and the increasingly dominant MPI communication impedes further strong scaling.

Table 8.

Computational time (in s) and efficiency (eff) according to (58) for a nonlinear elastic problem on the aorta mesh using a varying number of processing units P. The time is measured for 1 time step with 5 Newton steps for all-floating FETI and the lumped preconditioner.

P local time eff global CG time eff total time eff
16 407.7 s 1.000 1311.7 s 1.000 2028.6 s 1.000
32 203.1 s 1.004 666.4 s 0.984 1054.2 s 0.962
64 101.7 s 1.002 345.4 s 0.949 562.0 s 0.902
128 50.5 s 1.009 184.7 s 0.888 316.7 s 0.801
256 25.3 s 1.007 103.8 s 0.790 192.8 s 0.658
512 12.7 s 1.000 67.6 s 0.606 161.0 s 0.394

Table 9.

Computational time (in s) and efficiency (eff) according to (58) for a nonlinear elastic problem on the carotid artery mesh using a varying number of processing units P. The time is measured for 1 time step with 5 Newton steps for classical FETI and the lumped preconditioner.

P local time eff global CG time eff total time eff
16 726.0 s 1.000 4725.8 s 1.000 6519.7 s 1.000
32 351.3 s 1.033 2368.2 s 0.998 3497.0 s 0.932
64 170.5 s 1.065 1262.9 s 0.936 1991.2 s 0.819
128 90.7 s 1.001 694.5 s 0.851 1194.1 s 0.682
256 47.3 s 0.960 443.6 s 0.666 914.4 s 0.446
512 23.9 s 0.949 297.2 s 0.497 667.4 s 0.305
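The efficiency columns can be recomputed from the measured times. The sketch below reads (58) as eff = (I tI)/(P tP) with I = 16, which reproduces the tabulated total-time efficiencies of Table 8 up to rounding; the function name is hypothetical. Small deviations in other columns are to be expected, since the tabulated times are themselves rounded.

```python
def strong_scaling_efficiency(times):
    """Efficiency relative to the smallest number of processing units, cf. (58).

    `times` maps the number of processing units P to the measured time t_P.
    """
    p_initial = min(times)
    t_initial = times[p_initial]
    return {p: (p_initial * t_initial) / (p * t) for p, t in sorted(times.items())}


# Total times in seconds for the aorta mesh, taken from Table 8.
total_time_aorta = {16: 2028.6, 32: 1054.2, 64: 562.0,
                    128: 316.7, 256: 192.8, 512: 161.0}
efficiency = strong_scaling_efficiency(total_time_aorta)
for p, eff in efficiency.items():
    print(p, f"{eff:.4f}")
```

An efficiency of 1 corresponds to ideal strong scaling, i.e. doubling the number of processing units halves the run time.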

Figure 9.

Computation times (in s) for a simulation of the anisotropic arterial model with the aorta mesh (left) and the carotid artery mesh (right) using a varying number of cores.

6 Discussion and Limitations

We have shown the application of the finite element tearing and interconnecting method to elasticity problems, in particular to the simulation of the nonlinear elastic behavior of cardiovascular tissues such as the artery. The main ideas of domain decomposition methods were summarized and the classical and the all-floating FETI approach were discussed in detail.

Illustrated by representative numerical examples we have shown certain advantages of the all-floating FETI method compared to the classical FETI approach. To the best of our knowledge the application of the all-floating approach to nonlinear anisotropic elasticity problems cannot be found in the literature. Certainly, the mentioned advantages are influenced by the mesh structure and the choice of the boundary conditions, and hence the method to choose depends on the specific problem.

We have presented and compared different techniques of preconditioning: the lumped preconditioner and the optimal Dirichlet preconditioner. Furthermore, the numerical examples exposed some instabilities of the global iterative method for nearly incompressible material parameters, i.e. for a very large penalty parameter κ. Here we were able to show, as was also shown in earlier contributions, that quadratic ansatz functions resolve the incompressible elastic behavior better than linear ansatz functions.

Acknowledgements

This work was supported by the Austrian Science Fund (FWF) and by Graz University of Technology within the SFB Mathematical Optimization and Applications in Biomedical Sciences. The authors would like to thank Dr. Günther Of, Graz University of Technology and Dr. Clemens Pechstein, Johannes Kepler University of Linz, for the fruitful cooperation and many helpful discussions.

References

  • [1].Ciarlet PG. Mathematical Elasticity. Vol. I. North-Holland; Amsterdam: 1988. ( Studies in Mathematics and its Applications, vol. 20 ). [Google Scholar]
  • [2].Holzapfel GA. Nonlinear Solid Mechanics. A Continuum Approach for Engineering. John Wiley & Sons Ltd; Chichester: 2000. [Google Scholar]
  • [3].Marsden JE, Hughes TJR. Mathematical Foundations of Elasticity. Dover; New York: 1994. [Google Scholar]
  • [4].Ogden RW. Non-Linear Elastic Deformations. Dover; New York: 1997. [Google Scholar]
  • [5].Holzapfel GA, Gasser TC, Ogden RW. A new constitutive framework for arterial wall mechanics and a comperative study of material models. J. Elasticity. 2000;61:1–48. [Google Scholar]
  • [6].Holzapfel GA, Ogden RW. Constitutive modelling of arteries. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 2010;466(2118):1551–1596. [Google Scholar]
  • [7].Gasser TC, Ogden RW, Holzapfel GA. Hyperelastic modelling of arterial layers with distibuted collagen fibre orientations. J. R. Soc. Interface. 2006;3:15–35. doi: 10.1098/rsif.2005.0073. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Holzapfel GA. Collagen in arterial walls: Biomechanical aspects. In: Fratzl P, editor. Collagen. Structure and Mechanics. Springer-Verlag; 2008. pp. 285–324. chap. 11. [Google Scholar]
  • [9].Holzapfel GA, Ogden RW. Modelling the layer-specific 3D residual stresses in arteries, with an application to the human aorta. Journal of the Royal Society Interface. 2010;7:787–799. doi: 10.1098/rsif.2009.0357. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].Holzapfel GA, Ogden RW. Constitutive modelling of passive myocardium: a structurally based framework for material characterization. Phil. Trans. R. Soc. A. 2009;367:3445–3475. doi: 10.1098/rsta.2009.0091. [DOI] [PubMed] [Google Scholar]
  • [11].Proceedings of the International Conference on Domain Decomposition Methods in Science and Engineering, I–XXI; http://www.ddm.org/ [Google Scholar]
  • [12].Steinbach O. Stability Estimates for Hybrid Coupled Domain Decomposition Methods. Springer-Verlag; Berlin: 2003. ( Lecture Notes in Mathematics, vol. 1809 ). [Google Scholar]
  • [13].Bernardi C, Maday Y, Patera AT. Nonlinear partial differential equations and their applications. Collège de France Seminar, Vol. XI (Paris, 1989–1991) Longman Sci. Tech.; Harlow: 1994. A new nonconforming approach to domain decomposition: the mortar element method; pp. 13–51. ( Pitman Res. Notes Math. Ser. vol. 299 ). [Google Scholar]
  • [14].Maday Y, Mavriplis C, Patera AT. Nonconforming mortar element methods: application to spectral discretizations. Domain decomposition methods; 1988; Los Angeles, CA. Philadelphia, PA: SIAM; 1989. pp. 392–418. [Google Scholar]
  • [15].Wohlmuth BI. A mortar finite element method using dual spaces for the Lagrange multiplier. SIAM J. Numer. Anal. 2000;38(3):989–1012. [Google Scholar]
  • [16].Farhat C, Roux FX. A method of finite element tearing and interconnecting and its parallel solution algorithm. Int. J. Numer. Methods Engrg. 1991;32:1205–1227. [Google Scholar]
  • [17].Farhat C, Mandel J, Roux FX. Optimal convergence properties of the FETI domain decomposition method. Comput. Methods Appl. Mech. Engrg. 1994;115:365–385. [Google Scholar]
  • [18].Of G. BETI-Gebietszerlegungsmethoden mit schnellen Randelementverfahren und Anwendungen. Universität Stuttgart; 2006. PhD Thesis. [Google Scholar]
  • [19].Of G, Steinbach O. The all-floating boundary element tearing and interconnecting method. J. Numer. Math. 2009;17(4):277–298. [Google Scholar]
  • [20].Dostál Z, Horák D, Kučera R. Total FETI - an easier implementable variant of the FETI method for numerical solution of elliptic PDE. Comm. Numer. Methods Engrg. 2006;22:1155–1162. [Google Scholar]
  • [21].Farhat C, Lesoinne M, LeTallec P, Pierson K, Rixen D. FETI-DP: a dual-primal unified FETI method. I. A faster alternative to the two-level FETI method. Int. J. Numer. Methods Engrg. 2001;50(7):1523–1544. [Google Scholar]
  • [22].Klawonn A, Widlund OB. FETI and Neumann-Neumann iterative substructuring methods: connections and new results. Comm. Pure Appl. Math. 2001;54(1):57–90. [Google Scholar]
  • [23].Klawonn A, Widlund OB. Domain decomposition methods in science and engineering. Springer; Berlin: 2005. Selecting constraints in dual-primal FETI methods for elasticity in three dimensions; pp. 67–81. ( Lect. Notes Comput. Sci. Eng. vol., 40 ). [Google Scholar]
  • [24].Klawonn A, Rheinbach O. Highly scalable parallel domain decomposition methods with an application to biomechanics. ZAMM Z. Angew. Math. Mech. 2010;90(1):5–32. [Google Scholar]
  • [25].Rheinbach O. Parallel iterative substructuring in structural mechanics. Arch. Comput. Methods Eng. 2009;16(4):425–463. [Google Scholar]
  • [26].Brands D, Klawonn A, Rheinbach O, Schröder J. Modelling and convergence in arterial wall simulations using a parallel feti solution strategy. Computer Methods in Biomechanics and Biomedical Engineering. 2008;11(5):569–583. doi: 10.1080/10255840801949801. [DOI] [PubMed] [Google Scholar]
  • [27].Brinkhues S, Klawonn A, Rheinbach O, Schröder J. Augmented Lagrange methods for quasi-incompressible materials – Applications to soft biological tissue. Int. J. Numer. Meth. Engrg. 2013;29(3):332–350. doi: 10.1002/cnm.2504. [DOI] [PubMed] [Google Scholar]
  • [28].Balzani D, Brands D, Klawonn A, Rheinbach O, Schröder J. On the mechanical modeling of anisotropic biological soft tissue and iterative parallel solution strategies. Arch. Appl. Mech. 2010;80(5):479–488. [Google Scholar]
  • [29].Balzani D, Böse D, Brands D, Erbel R, Klawonn A, Rheinbach O, Schröder J. Parallel simulation of patient-specific atherosclerotic arteries for the enhancement of intravascular ultrasound diagnostics. Eng. Computation. 2012;29(8):888–906. [Google Scholar]
  • [30].Brzobohatỳ T, Dostál Z, Kozubek T, Kovář P, Markopoulos A. Cholesky decomposition with fixing nodes to stable computation of a generalized inverse of the stiffness matrix of a floating structure. Int. J. Numer. Meth. Engrg. 2011;88(5):493–509. [Google Scholar]
  • [31].Augustin CM, Steinbach O. FETI methods for the simulation of biological tissues. In: Bank R, Holst M, Widlund O, Xu J, editors. Domain Decomposition Methods in Science and Engineering XX. Springer; Berlin: 2013. pp. 503–510. (Lecture Notes in Computational Science and Engineering, vol. 91).
  • [32].Klawonn A, Widlund OB. A domain decomposition method with Lagrange multipliers and inexact solvers for linear elasticity. SIAM J. Sci. Comput. 2000;22(4):1199–1219.
  • [33].Mandel J, Tezaur R. Convergence of a substructuring method with Lagrange multipliers. Numer. Math. 1996;73(4):473–487.
  • [34].Flory PJ. Thermodynamic relations for high elastic materials. Trans. Faraday Soc. 1961;57:829–838.
  • [35].Coleman BD, Noll W. The thermodynamics of elastic materials with heat conduction and viscosity. Archive for Rational Mechanics and Analysis. 1963;13:167–178.
  • [36].Truesdell C, Toupin RA. The classical field theories. In: Flügge S, editor. Encyclopedia of Physics. III/1. Springer-Verlag; Berlin: 1960. pp. 226–793.
  • [37].Rivlin RS, Ericksen JL. Stress-deformation relations for isotropic materials. J. Ration. Mech. Anal. 1955;4:323–425.
  • [38].Raoult A. Symmetry groups in nonlinear elasticity: An exercise in vintage mathematics. Communications on Pure and Applied Analysis. 2009;8(1):435–456.
  • [39].Holzapfel GA, Weizsäcker HW. Biomechanical behavior of the arterial wall and its numerical characterization. Comp. Biol. Med. 1998;28:377–392. doi: 10.1016/s0010-4825(98)00022-5.
  • [40].Holzapfel GA, Gasser TC, Ogden RW. Comparison of a multi-layer structural model for arterial walls with a Fung-type model, and issues of material stability. J. Biomech. Eng. 2004;126:264–275. doi: 10.1115/1.1695572.
  • [41].Ball JM. Constitutive inequalities and existence theorems in nonlinear elastostatics. In: Nonlinear analysis and mechanics: Heriot-Watt Symposium (Edinburgh, 1976), vol. I. Pitman; London: 1977. pp. 187–241. (Res. Notes in Math., No. 17).
  • [42].Ball JM. Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 1977;63(4):337–403.
  • [43].Dacorogna B. Direct Methods in the Calculus of Variations. Second edn. Springer; New York: 2008. (Applied Mathematical Sciences, vol. 78).
  • [44].Balzani D, Neff P, Schröder J, Holzapfel GA. A polyconvex framework for soft biological tissues. Adjustment to experimental data. Int. J. of Solids and Structures. 2006;43:6052–6070.
  • [45].Augustin CM. Classical and all-floating FETI methods with applications to biomechanical models. Graz University of Technology; 2012. PhD Thesis.
  • [46].Holzapfel GA. Structural and numerical models for the (visco)elastic response of arterial walls with residual stresses. In: Holzapfel GA, Ogden RW, editors. Biomechanics of Soft Tissue in Cardiovascular Systems. Springer; Wien, New York: 2003.
  • [47].Deuflhard P. Newton methods for nonlinear problems: Affine invariance and adaptive algorithms. Springer; 2011. (Springer Series in Computational Mathematics, vol. 35).
  • [48].Wriggers P. Nonlinear Finite Element Methods. Springer; 2008.
  • [49].Simo JC. Numerical analysis and simulation of plasticity. In: Ciarlet PG, Lions JL, editors. Numerical Methods for Solids (Part 3) Numerical Methods for Fluids (Part 1) Vol. 6. Elsevier; 1998. pp. 183–499.
  • [50].Pechstein C. Finite and Boundary Element Tearing and Interconnecting Solvers for Multiscale Problems. Springer; 2013. (Lecture Notes in Computational Science and Engineering, vol. 90).
  • [51].Toselli A, Widlund OB. Domain Decomposition Methods – Algorithms and Theory. Springer; Berlin, Heidelberg: 2005.
  • [52].Langer U, Steinbach O. Boundary element tearing and interconnecting methods. Computing. 2003;71(3):205–228.
  • [53].Schenk O, Bollhöfer M, Römer RA. On large scale diagonalization techniques for the Anderson model of localization. SIAM Review. 2008;50(1):91–112. SIGEST Paper.
  • [54].Schenk O, Wächter A, Hagemann M. Matching-based preprocessing algorithms to the solution of saddle-point problems in large-scale nonconvex interior-point optimization. Comput. Optim. Appl. 2007;36(2-3):321–341.
  • [55].Steinbach O. Numerical approximation methods for elliptic boundary value problems: Finite and boundary elements. Springer; New York: 2008. Translated from the 2003 German original.
  • [56].Aneurisk-Team. AneuriskWeb project website. 2012. http://ecm2.mathcs.emory.edu/aneuriskweb
  • [57].Geuzaine C, Remacle JF. Gmsh: a three-dimensional finite element mesh generator with built-in pre- and post-processing facilities. Int. J. Numer. Meth. Engrg. 2009;79(11):1309–1331.
  • [58].Bayer JD, Blake R, Plank G, Trayanova NA. A novel rule-based algorithm for assigning myocardial fiber orientation to computational heart models. Ann. Biomed. Eng. 2012;40(10):2243–2254. doi: 10.1007/s10439-012-0593-5.
  • [59].Humphrey JD, Holzapfel GA. Mechanics, mechanobiology, and modeling of human abdominal aorta and aneurysms. J. Biomech. 2012;45(5):804–814. doi: 10.1016/j.jbiomech.2011.11.021.
  • [60].Hennessy JL, Patterson DA. Computer Architecture: a Quantitative Approach. Elsevier; 2012.
