Author manuscript; available in PMC: 2022 Dec 15.
Published in final edited form as: J Comput Phys. 2022 Sep 8;451:111585. doi: 10.1016/j.jcp.2022.111585

On the efficient evaluation of the azimuthal Fourier components of the Green’s function for Helmholtz’s equation in cylindrical coordinates

James Garritano a,b,*, Yuval Kluger a,b, Vladimir Rokhlin c, Kirill Serkh d
PMCID: PMC9512147  NIHMSID: NIHMS1835909  PMID: 36171963

Abstract

In this paper, we develop an efficient algorithm to evaluate the azimuthal Fourier components of the Green’s function for the Helmholtz equation in cylindrical coordinates. A computationally efficient algorithm for this modal Green’s function is essential for solvers for electromagnetic scattering from bodies of revolution (e.g., radar cross sections, antennas). Current algorithms to evaluate this modal Green’s function become computationally intractable when the source and target are close or when the wavenumber is large or complex. Furthermore, most state-of-the-art methods cannot be easily parallelized. In this paper, we present an algorithm for evaluating the modal Green’s function that has performance independent of both source-to-target proximity and wavenumber, and whose cost grows as O(m), where m is the Fourier mode. Our algorithm’s performance is independent of whether the wavenumber is real or complex. Furthermore, our algorithm is embarrassingly parallelizable.

Keywords: Helmholtz equation, Modal Green’s function, Potential theory, Electromagnetics, Volume of revolution, Axisymmetric problems

1. Introduction

This paper details how to efficiently compute the azimuthal Fourier components of the Green’s function (i.e., the modal Green’s function) for the Helmholtz equation in three dimensions, given by the formula

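$$G_m(x, x') = G_m(|x - x'|) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{4\pi} \frac{e^{-ik|x - x'|}}{|x - x'|} e^{-im\theta} \, d\theta, \tag{1.1}$$

where x, x′ ∈ ℝ³, k is the wavenumber, and m is the azimuthal Fourier mode. Rewriting this equation in cylindrical coordinates, with x = (r, θ, z) and x′ = (r′, θ′, z′), and letting ϕ = θ − θ′, the formula for the mth Fourier coefficient becomes

$$G_m(r, z, r', z') = \frac{1}{8\pi^2 R_0} \int_{-\pi}^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} e^{-im\phi} \, d\phi, \tag{1.2}$$

where κ = kR0, α = 2rr′/R0², and R0² = r² + r′² + (z − z′)².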

This integral has two features which make numerical integration difficult: the integrand is oscillatory, and it is near-singular when the distance between (r, z) and (r′, z′) is small (i.e., when α is close to one). However, the integrand vanishes for sufficiently large imaginary values of κ, suggesting that Cauchy’s theorem can be used to construct a contour on which all the oscillations occur where the integrand is negligible.

When devising an appropriate contour, it is helpful to consider three cases: 1) when κ is zero and m ≥ 0, 2) when κ is arbitrary and m is small, and 3) when both κ and m are large.

Determining the appropriate contour when κ = 0 and m ≥ 0 (when the Helmholtz equation becomes the Laplace equation) is trivial because, on any vertical contour in the lower half-plane, the integrand of (1.2) decays monotonically. When κ > 0 and m = 1, the appropriate contours were derived by Gustafsson [9] via the method of steepest descent. However, Gustafsson did not analyze cases where both κ > 0 and m > 1. It would appear that steepest descent contours for the entire integrand are indeed possible (see Figure 13), suggesting that an O(1) evaluator for an arbitrary Fourier mode is possible (see Section 6.5). We observe that the resulting contours are defined implicitly as the solution to a transcendental equation, and depend on every parameter appearing in the integrand. Consequently, the relationship between the contours and the parameters is fairly complicated, and an evaluator based on these contours is challenging to implement. However, in practice, a single mode is rarely of interest, and instead the user requires a collection of M modes, where M scales with the wavenumber and the source-to-target distance (see Section 1.2.1). As it turns out, an O(1) evaluator is not necessary to evaluate M modes in O(M) time. If a single Fourier mode can be evaluated in O(m) time then, using a procedure described in Section 6.3, a collection of M modes may be computed in O(M) time. Therefore, any O(m) evaluator for Gm can be used to achieve an amortized cost of O(1) per mode.

Fig. 13: Phase-amplitude plot of the oscillatory part of the integrand and the associated steepest-descent contour.

Shown is the phase-amplitude plot of the oscillatory part of the integrand of formula (1.2) (i.e., the product of the numerator of the spherical-wave term and the Fourier exponent) for κ = 85, β = 0.49, and m = 10. The red contour beginning at −π is a steepest descent contour on which all oscillations occur where the integrand is negligible.

In this paper, we describe such an O(m) evaluator. First, observe that the integral (1.2) can be written in the form

$$G_m(r, z, r', z') = \frac{1}{4\pi^2 R_0} \int_0^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \cos(m\phi) \, d\phi. \tag{1.3}$$

We build on Gustafsson’s work by integrating along the contour on which the numerator of the spherical-wave term,

$$\frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}}, \tag{1.4}$$

decays monotonically. The part of the integrand of (1.3) that depends on the azimuthal frequency, cos(mϕ), behaves poorly and grows on this contour. To circumvent this behavior, we replace the term cos(mϕ) with a rational function approximation which does not grow in the complex plane. The growth of cos(mϕ) along the contour is subsumed in a collection of residues which must be added to the resulting integral. The resulting algorithm’s cost scales as O(m) and is completely independent of the source-to-target distance and wavenumber.

1.1. Motivation for a Fast Evaluator of the Modal Green’s Function

A boundary integral equation defined on a curve in ℝ² is easier to compute than one defined on a surface in ℝ³. For the case of a body of revolution and a Green’s function which is rotationally invariant about the axis of symmetry, by use of the decomposition described below, we can convert a problem in ℝ³ to a series of decoupled problems in ℝ², each of which utilizes the azimuthal Fourier expansion of the Green’s function evaluated at points along the boundary generating curve. Let Γ be the body of revolution generated by rotating the boundary-generating curve γ(t) about the z-axis. Consider the second-kind integral equation

$$\sigma(x) + \int_{\Gamma} G(x, x') \sigma(x') \, da(x') = f(x), \tag{1.5}$$

where x ∈ ℝ³, G is a rotationally invariant Green’s function, f is the given function, and σ is the solution. We first express x in cylindrical coordinates as (r, θ, z), then expand the solution and the right-hand side in terms of their respective Fourier series in the azimuthal direction, given by

$$\sigma(r, z, \theta) = \sum_{m=-\infty}^{\infty} \sigma_m(r, z) e^{im\theta}, \qquad f(r, z, \theta) = \sum_{m=-\infty}^{\infty} f_m(r, z) e^{im\theta}, \tag{1.6}$$

where the Fourier coefficients σm and fm are given by

$$\sigma_m(r, z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sigma(r, z, \theta) e^{-im\theta} \, d\theta, \qquad f_m(r, z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(r, z, \theta) e^{-im\theta} \, d\theta. \tag{1.7}$$

Substituting the Fourier expansions for σ and f into (1.5) and collecting terms mode-by-mode, we arrive at the decoupled integral equations

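$$\sigma_m(r, z) + \int_{\gamma} \sigma_m(r', z') \int_{-\pi}^{\pi} G(r, z, r', z', \theta - \theta') e^{-im(\theta - \theta')} r' \, d\theta' \, dr' \, dz' = f_m(r, z), \tag{1.8}$$

which simplifies to

$$\sigma_m(r, z) + 2\pi \int_{\gamma} G_m(r, z, r', z') \sigma_m(r', z') r' \, dr' \, dz' = f_m(r, z), \tag{1.9}$$

where, for all m, Gm are the Fourier modes of G, given by

$$G_m(r, z, r', z') = \frac{1}{2\pi} \int_{-\pi}^{\pi} G(r, z, r', z', \phi) e^{-im\phi} \, d\phi. \tag{1.10}$$

Observe that solving (1.9) requires numerous evaluations of Gm(r, z, r′, z′) for points along γ. Therefore, converting this problem from ℝ³ to the decoupled problems in ℝ² is only computationally efficient if the evaluation of Gm(r, z, r′, z′) is fast.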

1.2. The Modal Green’s Function for the Helmholtz Equation

The Green’s function for the Helmholtz equation in three dimensions satisfies

$$(\nabla^2 + k^2) G_k(x, x') = -\delta(x - x'), \tag{1.11}$$

where k is the wavenumber and x, x′ ∈ ℝ³. There are three basic solutions,

$$G_{0,k}(x, x') = \frac{1}{4\pi} \frac{\cos(k|x - x'|)}{|x - x'|}, \qquad G_{+,k}(x, x') = \frac{1}{4\pi} \frac{e^{ik|x - x'|}}{|x - x'|}, \qquad G_{-,k}(x, x') = \frac{1}{4\pi} \frac{e^{-ik|x - x'|}}{|x - x'|}. \tag{1.12}$$

The solution to the Helmholtz equation, when viewed as the solution to the time-harmonic wave equation, has two time-harmonic conventions, $e^{-i\omega t}$ and $e^{+i\omega t}$. With the $e^{-i\omega t}$ time-harmonic convention, these Green’s functions correspond to the stationary, outgoing, and incoming spherical waves, respectively. For a given application, the choice of the Green’s function is driven by convenience, such that it has the desired asymptotic behavior away from a volume of interest given the practitioner’s time-harmonic convention. When the function grows with distance from the origin, it is referred to as the advanced Green’s function, and when the function decays with distance from the origin, it is referred to as the retarded Green’s function. In this work, we examine the retarded Green’s function with a negative time-harmonic convention. This function corresponds to the form G−,k, which requires adopting the convention that Re k ≥ 0; for attenuating media this convention also requires that Im k ≤ 0. In other words, we require k to be in quadrant IV of the complex plane. To use our algorithm to evaluate the retarded Green’s function with the positive time convention, observe that

$$G_{+,k} = \overline{\overline{G_{+,k}}} = \overline{\frac{1}{4\pi} \frac{\overline{e^{ik|x - x'|}}}{\overline{|x - x'|}}} = \overline{\frac{1}{4\pi} \frac{e^{-i\bar{k}|x - x'|}}{|x - x'|}} = \overline{G_{-,\bar{k}}}. \tag{1.13}$$

We consider a problem with rotational symmetry (i.e., a body of revolution). Switching to cylindrical coordinates and expanding G−,k in its Fourier series, we have

$$G_{-,k}(r, z, r', z') = \sum_{m=-\infty}^{\infty} G_{-,m,k}(r, z, r', z') e^{im(\theta - \theta')}, \tag{1.14}$$

where x = (r, θ, z), x′ = (r′, θ′, z′). Let ϕ = θ − θ′ denote the difference in azimuthal angles. The formula for the mth coefficient is given by (1.10). We adopt notation consistent with the literature (see, for example, [5, 7, 21]) and omit the subscripts specifying the choice of the Green’s function and wavenumber (i.e., we omit the subscripts − and k). Expanding the representation for the mth Fourier coefficient, we have

$$G_m(r, z, r', z') = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{-ik\sqrt{r^2 + r'^2 - 2rr'\cos\phi + (z - z')^2}}}{4\pi\sqrt{r^2 + r'^2 - 2rr'\cos\phi + (z - z')^2}} e^{-im\phi} \, d\phi. \tag{1.15}$$

We then use the parameter R0 (introduced in Section 1) to rewrite (1.15), by defining κ = kR0 and α = 2rr′/R0², and apply the formula for the Fourier coefficient of an even function to obtain

$$G_m(r, z, r', z') = \frac{1}{4\pi^2 R_0} \int_0^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \cos(m\phi) \, d\phi. \tag{1.16}$$
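As a point of reference, formula (1.16) can also be evaluated by brute-force quadrature on [0, π]. The following is a minimal sketch under stated assumptions, not the contour-based algorithm developed in this paper: it uses the midpoint rule, assumes the kernel $e^{-i\kappa\sqrt{1-\alpha\cos\phi}}$ with 0 ≤ α < 1, and the function names and the panel count `n` are hypothetical choices made for illustration.

```python
import cmath
import math


def scaled_modal_green(kappa, alpha, m, n=4000):
    """Midpoint-rule estimate of the integral in (1.16), without the
    1/(4 pi^2 R0) prefactor.

    n is the number of panels; it has to grow with |kappa|, m, and
    1/(1 - alpha) for the estimate to stay accurate.
    """
    h = math.pi / n
    total = 0j
    for j in range(n):
        phi = (j + 0.5) * h
        root = math.sqrt(1.0 - alpha * math.cos(phi))
        total += cmath.exp(-1j * kappa * root) / root * math.cos(m * phi)
    return total * h


def modal_green(k, r, z, r_src, z_src, m, n=4000):
    """G_m of (1.16) for a target (r, z) and a source (r_src, z_src)."""
    R0 = math.sqrt(r * r + r_src * r_src + (z - z_src) ** 2)
    alpha = 2.0 * r * r_src / R0**2
    return scaled_modal_green(k * R0, alpha, m, n) / (4.0 * math.pi**2 * R0)
```

Because the number of panels needed grows with the wavenumber, the mode number, and the proximity of source and target, a routine of this kind is mainly useful for checking a faster evaluator on moderate parameters.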

When solving problems involving boundary integrals for vector-valued σ, there are two additional necessary modal Green’s functions which must be evaluated,

$$G_{c,m} = \frac{1}{4\pi^2 R_0} \int_0^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \cos(m\phi)\cos\phi \, d\phi, \qquad G_{s,m} = \frac{1}{4\pi^2 R_0} \int_0^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \sin(m\phi)\sin\phi \, d\phi. \tag{1.17}$$

We refer the reader to [12] for a derivation of the Fourier decomposition associated with problems with vector-valued σ and the resulting decoupled equations in terms of Gm, Gc,m, Gs,m. This work presents an algorithm for the evaluation of Gm; a straightforward substitution of the integrand results in an algorithm for Gc,m. Likewise, substitution of the integrand results in an algorithm for Gs,m, with a minor difference related to removing a singularity (see Remark 4.1).

For notational convenience, we introduce a scaled modal Green’s function,

$$G_m^s = 4\pi^2 R_0 G_m = \int_0^{\pi} \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \cos(m\phi) \, d\phi, \tag{1.18}$$

where Gm is understood to be a function of r, r′, z, and z′. In a slight abuse of notation, we will denote $G_m^s$ by Gm for the remainder of the paper. Any numerical scheme for evaluating Gm must depend on four parameters: κ, α, R0, and m. Notably, α is bounded by 0 ≤ α < 1, and determines the growth of the integrand near ϕ = 0. In Section 3.1.1, we will follow the notation of Gustafsson [9] and introduce the parameters β− and β+, defined to be

$$\beta_- = \sqrt{1/\alpha - 1}, \qquad \beta_+ = \sqrt{1/\alpha + 1}. \tag{1.19}$$

We also introduce the parameters Δ and ρ0, defined as

$$\Delta = \sqrt{(r - r')^2 + (z - z')^2}, \tag{1.20}$$
$$\rho_0 = 2rr'. \tag{1.21}$$

Note that Δ is the minimum distance between the source and the target, R0 is the maximum distance between the source and the target, and that Δ² = R0² − ρ0. Lastly, we observe that β− and β+ are also given by the formulae

$$\beta_- = \frac{\Delta}{\sqrt{\rho_0}}, \qquad \beta_+ = \frac{\sqrt{R_0^2 + \rho_0}}{\sqrt{\rho_0}}. \tag{1.22}$$

We note that numerically computing β− from α using (1.19) will result in cancellation error when α ≈ 1, so it is better to compute β− directly from formula (1.22), and α from β− using (1.19).
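The cancellation described above is easy to reproduce. The sketch below, with hypothetical function names and assuming r, r′ > 0, computes β− directly from the geometry via (1.20) to (1.22) and recovers α by inverting (1.19); the second function forms α first and then applies (1.19), which loses all accuracy once 1/α − 1 falls below the rounding level.

```python
import math


def beta_minus_stable(r, z, r_src, z_src):
    """Return (beta_minus, alpha) computed from the geometry.

    beta_minus comes from (1.20)-(1.22); alpha = 1 / (beta_minus^2 + 1)
    is (1.19) solved for alpha. Assumes r > 0 and r_src > 0.
    """
    delta = math.hypot(r - r_src, z - z_src)   # (1.20)
    rho0 = 2.0 * r * r_src                     # (1.21)
    beta_minus = delta / math.sqrt(rho0)       # (1.22)
    alpha = 1.0 / (beta_minus**2 + 1.0)
    return beta_minus, alpha


def beta_minus_naive(r, z, r_src, z_src):
    """beta_minus from (1.19) after forming alpha; inaccurate when alpha is near 1."""
    R0_sq = r * r + r_src * r_src + (z - z_src) ** 2
    alpha = 2.0 * r * r_src / R0_sq
    return math.sqrt(max(1.0 / alpha - 1.0, 0.0))
```

For a source and target separated by about 10⁻⁹ at unit radius, the first routine retains nearly full relative accuracy in β−, while the second returns a value that is wrong in its leading digit.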

A representative sample of the literature related to the evaluation of the modal Green’s function can be found in [2, 6, 7, 8, 9, 10, 12, 15, 19].

1.2.1. Number of Fourier Coefficients Needed

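Matviyenko in [15] derived an upper bound, r+, such that all Fourier modes m > r+ decay geometrically as m increases, with r+ given by

$$r_+ = \frac{\kappa}{\sqrt{2}} \sqrt{1 + \sqrt{1 - \alpha^2}}, \tag{1.23}$$

where α = ρ0/R0² and κ = kR0 (see [15], formulae (37) and (38)). When α ≈ 1, formula (1.23) simplifies to

$$r_+ \approx \frac{\kappa}{\sqrt{2}}. \tag{1.24}$$

Using Matviyenko’s formula for the decay of the modal Green’s functions (see [15], formula (40)), it can be shown that the magnitude of any Fourier coefficient m > r+ is bounded by

$$|G_m| < |G_{r_+}| \left( \sqrt{\frac{1 - \sqrt{1 - \alpha^2}}{1 + \sqrt{1 - \alpha^2}}} \right)^{m - r_+}. \tag{1.25}$$

Substituting $\alpha = 1/(\beta_-^2 + 1)$ into (1.25), this bound can be simplified to

$$|G_m| < |G_{r_+}| \left( 1 - \frac{\beta_-\sqrt{\beta_-^2 + 2}}{1 + \beta_-^2} \right)^{\frac{m - r_+}{2}} \left( 1 + \frac{\beta_-\sqrt{\beta_-^2 + 2}}{1 + \beta_-^2} \right)^{-\frac{m - r_+}{2}}, \tag{1.26}$$

where β− is the scaled source-to-target distance. When β− is small, 1 + β−² ≈ 1, and (1.26) can be approximated as

$$|G_m| \lesssim |G_{r_+}| \left( \frac{1 - \sqrt{2}\beta_-}{1 + \sqrt{2}\beta_-} \right)^{\frac{m - r_+}{2}} \approx |G_{r_+}| \left( 1 - 2\sqrt{2}\beta_- \right)^{\frac{m - r_+}{2}}, \tag{1.27}$$

where we have replaced the exponentiated term with its truncated Taylor expansion in β−. Formula (1.27) can be used to determine the Fourier mode M such that, for m > M, |Gm| < ϵ, where M is given by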

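$$M \approx \frac{2\log(\epsilon) - 2\log(|G_{r_+}|)}{\log(1 - 2\sqrt{2}\beta_-)} + r_+. \tag{1.28}$$

By substituting (1.23) into (1.28), we can characterize M as a function of β− and κ when the source and target are close (i.e., α ≥ 0.99 or, equivalently, β−² ≲ 10⁻²) as

$$M \approx \frac{2\log(\epsilon) - 2\log(|G_{r_+}|)}{\log(1 - 2\sqrt{2}\beta_-)} + \frac{\kappa}{\sqrt{2}} = \left( 1 - \frac{1}{\sqrt{2}\beta_-} \right) \left( \log(\epsilon) - \log(|G_{r_+}|) \right) + \frac{\kappa}{\sqrt{2}} + O(\beta_-) = O\left( \frac{1}{\beta_-} + \kappa \right), \tag{1.29}$$

where we have replaced the denominator of (1.28) with its Taylor expansion in β−.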

Remark 1.1.

For complex wavenumber, the rate of decay of the Fourier coefficients also decreases as |Im κ| grows.

Consider the modal Green’s function expressed as the product

$$\frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} = \frac{e^{-i\,\mathrm{Re}(\kappa)\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}} \, e^{\mathrm{Im}(\kappa)\sqrt{1 - \alpha\cos\phi}}. \tag{1.30}$$

The decay of the Fourier coefficients of the left-hand term in the product is characterized in the preceding section. When Im κ < 0, the right-hand term decreases monotonically with ϕ on [0, π], resembling a scaled Dirac function centered at 0 (i.e., δ(ϕ) when Im κ ≪ 0). Therefore, by the convolution theorem, the total number of required Fourier components increases with |Im(κ)|.

1.3. Review of the Literature

Recall from Section 1.2 that the modal Green’s function is a function of three parameters: κ, m, and α. We divide the literature on fast algorithms for evaluating the modal Green’s function into two categories: those that evaluate the general case for any combination of input parameters, and those that evaluate special cases of input parameters (e.g., when the source and target are well-separated, when m = 1, etc.).

Almost all modern fast general-case algorithms are based on the application of the fast Fourier transform (FFT) (see, for example, [7, 8, 10, 11, 12, 19, 20, 21]). In contrast, the special-case algorithms have a diverse set of methodologies which cannot easily be summarized. Because this paper’s topic is a general-case algorithm which works for all input parameters, we do not review the literature of special-case algorithms, with the exception of Gustafsson’s contour integration technique [9], which we build on extensively in this paper.

Because the FFT is inefficient for non-smooth functions, and the modal Green’s function is not very smooth for α ≈ 1, modern FFT-based algorithms utilize kernel-splitting, a technique in which the integral is split into a smooth part and a non-smooth part. The smooth part’s Fourier coefficients are evaluated with the FFT, and the non-smooth part is handled separately, often with a purpose-made recurrence. Two splittings are used in the literature, the splitting of Helsing [11] and the splitting of Gedney [8]. It can be shown that the fastest algorithm using the splitting of Gedney (presented in [19]) actually also used the splitting of Helsing (i.e., the algorithm uses both splittings), and is computationally equivalent to the fastest implementation of Helsing’s splitting (presented in [7]). Therefore, we only present the method of Epstein et al., which utilizes Helsing’s splitting.

1.3.1. Method of Epstein et al.

In Epstein et al. [7], the modal Green’s functions are computed using a fast-Fourier-transform-based method, with the kernel splitting of Helsing [11]. In the following, the definitions for m, κ, α, and R0 are identical to those used in Section 1.2.

The authors divide the evaluation of the modal Green’s functions for the Fourier modes −M, −M + 1, … , M − 1, M into two cases: one where the source and target are well-separated (0 ≤ α < 1/1.005), and one where the source and target are close (1/1.005 ≤ α < 1).

In the former case, the integrand is relatively smooth, and the modal Green’s functions are computed using an L-point FFT, obtaining near double precision accuracy when L ≥ 4|κ|.

For the near-singular case, α ≈ 1, the authors follow [11] by first rewriting (1.16) as

$$G_m(x, x') = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\cos(\kappa\sqrt{1 - \alpha\cos\phi}) - i\sin(\kappa\sqrt{1 - \alpha\cos\phi})}{4\pi R_0 \sqrt{1 - \alpha\cos\phi}} e^{-im\phi} \, d\phi \tag{1.31}$$

(see [11], Section 3, formula (9)). The integrand of (1.31) is split into a smooth sine term, Hs, and a near-singular cosine term, Hc, where Hs and Hc are given by

$$H_s(\phi; \kappa, \alpha) = \frac{\sin(\kappa\sqrt{1 - \alpha\cos\phi})}{\sqrt{1 - \alpha\cos\phi}}, \qquad H_c(\phi; \kappa, \alpha) = \frac{\cos(\kappa\sqrt{1 - \alpha\cos\phi})}{\sqrt{1 - \alpha\cos\phi}}. \tag{1.32}$$

The Fourier modes of Hc are computed as the linear convolution of the Fourier modes of $\cos(\kappa\sqrt{1 - \alpha\cos\phi})$ and the Fourier modes of $1/\sqrt{1 - \alpha\cos\phi}$. The Fourier modes of $\cos(\kappa\sqrt{1 - \alpha\cos\phi})$ are computed via the FFT, while the Fourier modes of $1/\sqrt{1 - \alpha\cos\phi}$ are known to be proportional to $Q_{m-1/2}(\chi)$ (see [5]), where $Q_{m-1/2}$ is the Legendre function of the second kind of half-integer order, with χ given by

$$\chi = \frac{r^2 + r'^2 + (z - z')^2}{2rr'} = \frac{1}{\alpha}. \tag{1.33}$$

Note that χ ≈ 1 when α ≈ 1 (i.e., when the minimum distance between the source and target is very small). The authors complete their algorithm by computing $Q_{m-1/2}(\chi)$ via a recurrence, which has a cost that grows as O(1/β−), where $\beta_- = \sqrt{1/\alpha - 1}$. Thus, their recurrence has poor performance for χ ≈ 1 (i.e., when the target and source are close). We note that a fast algorithm was recently introduced by Bremer in [3], which evaluates $Q_{m-1/2}(\chi)$ in constant run-time independent of m. Bremer’s algorithm for evaluating the Legendre function of the second kind of half-integer order [3] is useful, not only as an improvement to [7], but as an ingredient in a potential O(1) evaluator for an arbitrary mode of the Green’s function for the Laplace equation (see also Section 6.2 for an alternative algorithm).

The total computational cost of Epstein et al.’s algorithm (and of kernel splitting techniques) is summarized as follows. Let $R = \sqrt{1 - \alpha\cos\phi}$. After performing the splitting of Helsing, the sin(R)/R term is evaluated in O(L log L) time with the FFT, where L is the maximum of 4κ and M. The cos(R)/R term is evaluated as the convolution of the Fourier coefficients of 1/R (the Laplace term) and the Fourier coefficients of cos(R). The Fourier coefficients of cos(R) are evaluated in O(L log L) time, and the coefficients of the 1/R term are evaluated in O(1/β−) time, where $\beta_- = \sqrt{1/\alpha - 1}$. Lastly, the convolution of the coefficients of cos(R) and the coefficients of 1/R is evaluated in O(κM) time. Finally, we summarize Epstein et al.’s algorithm for the modal Green’s function and its cost as

$$\underbrace{\mathcal{F}\left( \cos(\kappa\sqrt{1 - \alpha\cos\phi}) \right)}_{O(L \log L)} \; \underbrace{\star}_{O(\kappa M)} \; \underbrace{\mathcal{F}\left( \frac{1}{\sqrt{1 - \alpha\cos\phi}} \right)}_{O(1/\beta_-)} + \underbrace{\mathcal{F}\left( \frac{\sin(\kappa\sqrt{1 - \alpha\cos\phi})}{\sqrt{1 - \alpha\cos\phi}} \right)}_{O(L \log L)}, \tag{1.34}$$

where ★ is the discrete convolution operator, $\mathcal{F}$ is the discrete Fourier transform (with its cost denoted by its implementation via the FFT), L = max(4κ, M), and β− is the scaled minimum source-to-target distance given by $\beta_- = \sqrt{1/\alpha - 1}$. Hence, the cost of Epstein et al.’s algorithm for the modal Green’s function is

$$O(L \log L) + O(\kappa M) + O(1/\beta_-), \tag{1.35}$$

where L = max(4κ, M) and M = O(1/β− + κ). Epstein et al.’s algorithm can be improved by the application of an O(1) evaluator for the modal Green’s function for the Laplace equation, resulting in a cost of

$$O(L \log L) + O(\kappa M). \tag{1.36}$$

Remark 1.2.

For complex κ, both the cos(κ) term and the i sin(κ) term grow exponentially with |Im κ|. However, their sum is bounded by the value of the integrand at ϕ = 0, which decays exponentially with Im κ < 0. Because the sum of two exponentially growing terms is bounded by an exponentially decaying term, for complex κ with Im κ ≪ 0, kernel splitting techniques incur catastrophic cancellation error.
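The cancellation in Remark 1.2 can be observed directly in double precision. The sketch below is an illustration only and is not code from any of the cited kernel-splitting implementations; it assumes the $e^{-i\kappa R}$ convention of (1.16), the function names are hypothetical, and the sample values κ = 10 − 40i and R = 1 are arbitrary.

```python
import cmath
import math


def spherical_wave_direct(kappa, R):
    """exp(-i kappa R) / R evaluated as a single exponential."""
    return cmath.exp(-1j * kappa * R) / R


def spherical_wave_split(kappa, R):
    """The same quantity assembled from separately computed cosine and sine parts."""
    return cmath.cos(kappa * R) / R - 1j * cmath.sin(kappa * R) / R


kappa, R = 10.0 - 40.0j, 1.0
direct = spherical_wave_direct(kappa, R)   # magnitude exp(-40)
split = spherical_wave_split(kappa, R)     # difference of two terms of size ~exp(40)/2
relative_error = abs(split - direct) / abs(direct)
```

Each of the cosine and sine terms has magnitude close to e⁴⁰/2, so their rounding errors alone exceed the true value e⁻⁴⁰ by many orders of magnitude, and `relative_error` is of order one or larger. For real κ the two evaluations agree to rounding error.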

2. Preliminaries

In this section, we review formulae necessary to evaluate the modal Green’s function via contour integration.

2.1. Chebyshev Polynomials

The Chebyshev polynomials are a collection of polynomials on the interval [−1, 1], denoted by Tn(x), which are orthogonal with respect to the weight function $1/\sqrt{1 - x^2}$. The nth Chebyshev polynomial is given by the formula

$$T_n(x) = \cos(n \arccos(x)) \tag{2.1}$$

(see [1]).

2.2. The Joukowski Transformation

The Joukowski transformation

$$J(v) = \frac{1}{2}\left( v + \frac{1}{v} \right), \tag{2.2}$$

is both a bijection from the deleted disc D ∖ {0}, where D = {v ∈ ℂ : |v| < 1}, to the region ℂ ∖ [−1, 1], and a bijection from $\mathbb{C} \setminus \overline{D}$ to ℂ ∖ [−1, 1], with the point at v = 0 mapped to the point at ∞ and the unit circle mapped to the interval [−1, 1]. The inverse transformation from ℂ ∖ [−1, 1] → D ∖ {0} is given by the formula

$$J_1^{-1}(z) = z - \sqrt{z + 1}\sqrt{z - 1}, \tag{2.3}$$

and the inverse transformation from $\mathbb{C} \setminus [-1, 1] \to \mathbb{C} \setminus \overline{D}$ is given by the formula

$$J_2^{-1}(z) = z + \sqrt{z + 1}\sqrt{z - 1}, \tag{2.4}$$

for all z ∈ ℂ ∖ [−1, 1], where the functions are all taken with respect to the principal branch.

The forward mapping is analytic on ℂ ∖ {0}, while the inverse mappings are analytic on ℂ ∖ [−1, 1], with branch cuts along [−1, 1] and square root singularities at z = ±1.
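The two inverse branches can be checked numerically. The sketch below uses hypothetical function names and relies on the fact that Python's `cmath.sqrt` returns the principal branch; writing the product as two separate square roots, rather than as a single square root of z² − 1, is what places the branch cut on [−1, 1].

```python
import cmath


def joukowski(v):
    """Forward map (2.2)."""
    return 0.5 * (v + 1.0 / v)


def joukowski_inverse_inside(z):
    """Inverse branch (2.3), mapping C \\ [-1, 1] into the punctured unit disc."""
    return z - cmath.sqrt(z + 1.0) * cmath.sqrt(z - 1.0)


def joukowski_inverse_outside(z):
    """Inverse branch (2.4), mapping C \\ [-1, 1] to the exterior of the unit disc."""
    return z + cmath.sqrt(z + 1.0) * cmath.sqrt(z - 1.0)
```

For any z off the interval [−1, 1], the two preimages are reciprocals of one another, one lies inside the unit circle, and both map back to z under `joukowski`.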

2.3. The Chebyshev Polynomials Evaluated on the Bernstein Ellipse

Recall that the mth order Chebyshev polynomial with complex argument, Tm(z), is given by

$$T_m(z) = \cos(m\theta), \tag{2.5}$$

where θ = arccos(z). An equivalent form of (2.5), often used for applications in the complex plane (see, for example, [18]), is given by

$$T_m(w) = \frac{z^m + z^{-m}}{2}, \tag{2.6}$$

where $w = \frac{1}{2}(z + z^{-1})$ and z = exp(iθ). This form can be conveniently rewritten in terms of the Joukowski transformation (2.2), so that (2.6) becomes

$$T_m(J(z)) = \frac{z^m + z^{-m}}{2}, \tag{2.7}$$

for all nonzero z ∈ ℂ.

Let Cρ denote a circle of radius ρ. The Joukowski transformations of the circles Cρ with ρ ≠ 1 have special significance in approximation theory, and are named the Bernstein ellipses, denoted Eρ, given by

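$$E_\rho(\theta) = J(C_\rho(\theta)) = J(\rho e^{i\theta}) = \frac{1}{2}\left( \rho e^{i\theta} + \rho^{-1} e^{-i\theta} \right) = \frac{1}{2}\left( \rho\cos\theta + i\rho\sin\theta + \rho^{-1}\cos\theta - i\rho^{-1}\sin\theta \right), \tag{2.8}$$

where we have used the standard parametrization of the circle, $C_\rho(\theta) = \rho e^{i\theta}$. Note that both $C_\rho$ and $C_{1/\rho}$ under the Joukowski transformation yield the same Bernstein ellipse, that is, $E_\rho = E_{1/\rho}$. We adopt the convention in the literature (see, for example, [14, 18]) of parameterizing the Bernstein ellipses by ρ > 1. Formula (2.8) can be simplified into the familiar form of an ellipse,

$$E_\rho(\theta) = a\cos\theta + ib\sin\theta, \tag{2.9}$$

where

$$a = \frac{1}{2}(\rho + \rho^{-1}), \qquad b = \frac{1}{2}(\rho - \rho^{-1}). \tag{2.10}$$

Because the Bernstein ellipses are the Joukowski transformations of circles, formula (2.7) yields a formula for the composition of a Chebyshev polynomial and the parameterization of the Bernstein ellipse, given by

$$T_m(E_\rho(\theta)) = T_m(J(C_\rho(\theta))) = \frac{\rho^m e^{im\theta} + \rho^{-m} e^{-im\theta}}{2}. \tag{2.11}$$

Formula (2.11) leads to a useful inequality,

$$\frac{1}{2}(\rho^m - \rho^{-m}) \le |T_m(E_\rho(\theta))| \le \frac{1}{2}(\rho^m + \rho^{-m}), \tag{2.12}$$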

for ρ > 1.
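Formula (2.11) and the inequality (2.12) can be verified numerically by comparing the closed form against the standard three-term recurrence $T_{k+1}(z) = 2zT_k(z) - T_{k-1}(z)$ evaluated at points of the ellipse (2.9). The following sketch uses hypothetical function names.

```python
import cmath
import math


def bernstein_point(rho, theta):
    """Point E_rho(theta) of the Bernstein ellipse, formulas (2.9)-(2.10)."""
    a = 0.5 * (rho + 1.0 / rho)
    b = 0.5 * (rho - 1.0 / rho)
    return complex(a * math.cos(theta), b * math.sin(theta))


def chebyshev_T(m, z):
    """T_m(z) for complex z via the three-term recurrence."""
    t_prev, t = 1.0 + 0j, complex(z)
    if m == 0:
        return t_prev
    for _ in range(m - 1):
        t_prev, t = t, 2.0 * z * t - t_prev
    return t


def chebyshev_T_on_ellipse(m, rho, theta):
    """Closed form (2.11) for T_m evaluated at E_rho(theta)."""
    return 0.5 * (rho**m * cmath.exp(1j * m * theta)
                  + rho ** (-m) * cmath.exp(-1j * m * theta))
```

On the ellipse, |T_m| is therefore confined to a narrow band around ρᵐ/2, which quantifies the growth of the cos(mϕ) factor away from the real axis.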

2.4. The Decay of Chebyshev Expansion Coefficients of Analytic functions

The following theorem states that, if a function f(z) can be analytically continued to the Bernstein ellipse Eρ, then the decay of the coefficients of its Chebyshev expansion can be nicely bounded. A discussion of this theorem can be found in, for example, Chapter 8 of [17], and a proof can be found in, for example, Chapter 5, §5 of [13].

Theorem 2.1.

Suppose that f(z) is an analytic function on a neighborhood of the interior of the Bernstein ellipse $E_\rho$, where it satisfies |f(z)| ≤ L for all $z \in E_\rho^o$, for some constant L > 0. Suppose further that

$$f(z) = \sum_{k=0}^{\infty} a_k T_k(z), \tag{2.13}$$

for all z ∈ [−1, 1], where Tk(z) is the Chebyshev polynomial of order k. Then its Chebyshev expansion coefficients ak satisfy

$$|a_k| \le 2L\rho^{-k}, \tag{2.14}$$

for all k ≥ 1.

2.5. The Number of Terms in the Chebyshev Expansions of Analytic Functions

The following corollary of Theorem 2.1 bounds the number of Chebyshev polynomials required to represent a function f(z) that is analytic and bounded in absolute value by L on the interior of $E_\rho$, with $\rho = M^{1/m}$, in terms of M, L, and m.

Corollary 2.2.

Suppose that M > 1, and let $\rho = M^{1/m}$, for some integer m > 1. Suppose further that f(z) is an analytic function on the interior of the Bernstein ellipse $E_\rho$, where it satisfies |f(z)| ≤ L for all $z \in E_\rho^o$ for some constant L > 0. Suppose further that

$$f(z) = \sum_{k=0}^{\infty} a_k T_k(z), \tag{2.15}$$

for all z ∈ [−1, 1], where Tk(z) is the Chebyshev polynomial of order k. Finally, let 0 < ϵ ≪ 1 be some small real number. Then, if

$$k_0 = m\left( \log(2L) - \log(\epsilon) \right) / \log(M), \tag{2.16}$$

then |ak| ≤ ϵ for all positive k ≥ k0.

Proof.

The proof follows in a straightforward way from Theorem 2.1. ■
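As an illustration of Theorem 2.1 and Corollary 2.2, the sketch below estimates the Chebyshev coefficients of f(x) = eˣ by Chebyshev-Gauss quadrature and compares them with the bound (2.14). The function names are hypothetical, and the choices ρ = 2 and L = exp((ρ + 1/ρ)/2), the maximum of |e^z| on that ellipse, are assumptions made for the example. Since ρ = M^{1/m} implies log(M)/m = log(ρ), formula (2.16) is written in terms of ρ.

```python
import math


def chebyshev_coefficients(f, n):
    """First n Chebyshev coefficients of f on [-1, 1] from n Chebyshev-Gauss nodes."""
    thetas = [math.pi * (j + 0.5) / n for j in range(n)]
    values = [f(math.cos(t)) for t in thetas]
    coeffs = []
    for k in range(n):
        s = sum(v * math.cos(k * t) for v, t in zip(values, thetas))
        coeffs.append((1.0 if k == 0 else 2.0) * s / n)
    return coeffs


def terms_needed(L, eps, rho):
    """k0 of (2.16), using log(M) / m = log(rho)."""
    return (math.log(2.0 * L) - math.log(eps)) / math.log(rho)


rho = 2.0
L = math.exp(0.5 * (rho + 1.0 / rho))          # bound for |exp(z)| inside E_rho
coeffs = chebyshev_coefficients(math.exp, 64)
k0 = terms_needed(L, 1e-10, rho)
```

Because eˣ is entire, its coefficients in fact decay faster than any fixed geometric rate, so the bound is far from tight here; it is attained more closely by functions with a singularity just outside the ellipse.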

2.6. Recurrence for a Certain Integral Involving a Monomial Divided by aτ2+b

In Gustafsson (see [9], equations (25) and (26)), a recurrence relation is given for the integral of an nth degree monomial divided by the square root of a pure quadratic,

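$$\int \frac{\tau^n}{\sqrt{a\tau^2 + b}} \, d\tau = \frac{\tau^{n-1}\sqrt{a\tau^2 + b}}{na} - \frac{(n - 1)b}{na} \int \frac{\tau^{n-2}}{\sqrt{a\tau^2 + b}} \, d\tau, \tag{2.17}$$

for n ≥ 2, with the base cases given by the formulae

$$\int \frac{1}{\sqrt{a\tau^2 + b}} \, d\tau = \frac{1}{\sqrt{a}} \ln\left( \tau\sqrt{a} + \sqrt{a\tau^2 + b} \right) \tag{2.18}$$

and

$$\int \frac{\tau}{\sqrt{a\tau^2 + b}} \, d\tau = \frac{\sqrt{a\tau^2 + b} - \sqrt{b}}{a}. \tag{2.19}$$

This recurrence can be evaluated stably when |b| < |a|.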
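The recurrence can be exercised as follows. This is a minimal sketch restricted to real a > 0 and b > 0, whereas the setting of [9] involves complex parameters; the function name is hypothetical. It steps n up by two from the appropriate base case, and a definite integral is obtained as a difference of two antiderivative values.

```python
import math


def monomial_over_sqrt_quadratic(n, a, b, tau):
    """Antiderivative of tau^n / sqrt(a tau^2 + b) at tau, via (2.17)-(2.19).

    Sketch for real a > 0, b > 0 only.
    """
    q = math.sqrt(a * tau * tau + b)
    if n % 2 == 0:
        value = math.log(tau * math.sqrt(a) + q) / math.sqrt(a)   # n = 0, (2.18)
        start = 2
    else:
        value = (q - math.sqrt(b)) / a                            # n = 1, (2.19)
        start = 3
    for k in range(start, n + 1, 2):                              # (2.17)
        value = tau ** (k - 1) * q / (k * a) - (k - 1) * b / (k * a) * value
    return value
```

For example, `monomial_over_sqrt_quadratic(4, 2.0, 1.0, 1.0) - monomial_over_sqrt_quadratic(4, 2.0, 1.0, 0.0)` gives the integral of τ⁴/√(2τ² + 1) over [0, 1].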

2.7. Recurrence for a Certain Integral Involving a Monomial Times aτ2+b

The recurrence relation for the integral of a monomial multiplied by the square root of a pure quadratic is given by the formula

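$$\int \tau^n \sqrt{a\tau^2 + b} \, d\tau = \frac{\tau^{n-1}(a\tau^2 + b)^{3/2}}{(n + 2)a} - \frac{(n - 1)b}{(n + 2)a} \int \tau^{n-2}\sqrt{a\tau^2 + b} \, d\tau, \tag{2.20}$$

with the base cases given by the formulae

$$\int \sqrt{a\tau^2 + b} \, d\tau = \frac{\tau\sqrt{a\tau^2 + b}}{2} - \frac{b}{2\sqrt{a}} \log\left( \sqrt{a\tau^2 + b} - \tau\sqrt{a} \right) \tag{2.21}$$

and

$$\int \tau\sqrt{a\tau^2 + b} \, d\tau = \frac{(a\tau^2 + b)^{3/2}}{3a}. \tag{2.22}$$

This recurrence can be evaluated stably when |b| < |a|.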

2.8. The Mapping Between a Legendre Expansion and a Taylor Series

The following standard formula relates the Legendre polynomials and their derivatives (see, for example, [1]).

Theorem 2.3.

Suppose that n ≥ 1 is an integer. Then

$$(2n + 1)P_n(x) = P'_{n+1}(x) - P'_{n-1}(x). \tag{2.23}$$

This formula can be used to spectrally differentiate a Legendre expansion, as follows. Suppose that

$$p(x) = \sum_{i=0}^{n} c_i P_i(x), \tag{2.24}$$

and that

$$p'(x) = \sum_{i=0}^{n-1} c'_i P_i(x). \tag{2.25}$$

The coefficients $c'_i$ can be computed from $c_i$ by iterating over k = n, n − 1, … , 2 and, at each iteration, assigning $c'_{k-1}$ the value $(2k - 1)c_k$, and assigning $c_{k-2}$ the value $c_{k-2} + c_k$; after the final iteration, $c'_0$ is assigned the updated value of $c_1$.

To compute the Taylor series of an n-term Legendre expansion at the point x0, it is sufficient to compute the expansion coefficients of its first n derivatives by spectral differentiation, and then to evaluate each successive derivative at x0. Since each derivative will require O(n) operations to compute, the Taylor series can be computed in O(n²) operations.
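The differentiation sweep and the resulting Taylor expansion can be sketched as follows. The function names are hypothetical, coefficients are stored as a list ordered from P₀ upward, and the constant mode of the derivative is taken from the updated P₁ coefficient, since P₁′ = P₀.

```python
def legendre_derivative_coeffs(c):
    """Legendre coefficients of p'(x), given the Legendre coefficients c of p(x)."""
    c = list(c)                      # work on a copy; entries are updated during the sweep
    n = len(c) - 1
    if n < 1:
        return []
    d = [0.0] * n
    for k in range(n, 1, -1):
        d[k - 1] = (2 * k - 1) * c[k]
        c[k - 2] += c[k]
    d[0] = c[1]                      # P_1' = P_0
    return d


def legendre_eval(c, x):
    """Evaluate sum_k c[k] P_k(x) with the three-term recurrence for P_k."""
    if not c:
        return 0.0
    total = c[0]
    p_prev, p = 1.0, x
    for k in range(1, len(c)):
        total += c[k] * p
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return total


def taylor_coefficients(c, x0):
    """Taylor coefficients p^(j)(x0) / j! of the Legendre expansion c, in O(n^2) work."""
    out, factorial, j = [], 1.0, 0
    while c:
        out.append(legendre_eval(c, x0) / factorial)
        c = legendre_derivative_coeffs(c)
        j += 1
        factorial *= j
    return out
```

For instance, with p = P₂ the derivative coefficients are [0, 3], reflecting P₂′(x) = 3x, and the Taylor coefficients at x0 = 1 are [1, 3, 1.5].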

2.9. Contour Integral of a Monomial Divided by a First Degree Polynomial

For any k ≥ 0, note the elementary indefinite integral

$$\int \frac{z^k}{z - x} \, dz = \sum_{i=0}^{k-1} \frac{z^{k-i} x^i}{k - i} + x^k \log(z - x), \tag{2.26}$$

for all x ∈ ℂ.
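Formula (2.26) follows from the polynomial division $z^k/(z - x) = \sum_{i=0}^{k-1} x^i z^{k-1-i} + x^k/(z - x)$ and can be evaluated directly. In the sketch below the function name is hypothetical, and `cmath.log` is the principal branch, so differences of antiderivative values are only meaningful along paths on which z − x does not cross the negative real axis.

```python
import cmath


def monomial_over_linear_antiderivative(k, z, x):
    """Right-hand side of (2.26): an antiderivative of z^k / (z - x)."""
    total = sum(z ** (k - i) * x**i / (k - i) for i in range(k))
    return total + x**k * cmath.log(z - x)
```

A contour integral over a segment from z0 to z1 is then the difference of the values at the two endpoints, provided the branch condition above holds along the segment.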

3. Analytical Apparatus

In this section, we review the contour integration method of Gustafsson [9] to evaluate the modal Green’s function. We demonstrate that Gustafsson’s variable substitution requires special treatment of the resulting Chebyshev polynomial, and then propose a technique to replace the Chebyshev polynomial with an approximation, based on applying a quadrature rule to Cauchy’s integral formula. Lastly, we present a geometric argument which demonstrates that the domain of integration necessary for our approximation is well-separated from Gustafsson’s contours for all m.

3.1. Steepest Descent Contour

Recall that the modal Green’s function (1.18) is the mth Fourier coefficient of the spherical wave

$$H_w(\cos\phi) = \frac{e^{-i\kappa\sqrt{1 - \alpha\cos\phi}}}{\sqrt{1 - \alpha\cos\phi}}. \tag{3.1}$$

Rewriting (1.18) using (3.1), we have

$$G_m = \int_0^{\pi} H_w(\cos\phi)\cos(m\phi) \, d\phi, \tag{3.2}$$

where we have omitted the arguments r, r′, z, z′. In formula (3.2), Gm is understood to be a function of four parameters: R0, α, κ, and m. Lastly, we denote the integrand of (3.2) by Hm, where

$$H_m(\cos\phi) = H_w(\cos\phi)\cos(m\phi). \tag{3.3}$$

This leads to an abbreviated form of Gm, given by

$$G_m = \int_0^{\pi} H_m(\cos\phi) \, d\phi. \tag{3.4}$$

When κ or m is large, Hm(cos ϕ) is highly oscillatory along the real axis. However, Hw(cos ϕ) decays to zero in quadrant IV of the complex plane for complex arguments with sufficiently large negative imaginary components, provided that 0 < Re(ϕ) < π. This suggests that contour integration may be used to avoid evaluating the oscillatory segment along the real axis. The integrand is analytic on a neighborhood of [0, π], so Cauchy’s integral theorem can be used to deform the integration contour to complex-valued ϕ.

Applying Cauchy’s integral theorem, we have

$$\oint_{\Gamma} H_m(\cos\phi) \, d\phi = 0, \tag{3.5}$$

where Γ is some closed contour passing along the interval [0, π] on the real axis, and extending into quadrant IV in the complex plane. We rearrange (3.5) into an expression for Gm, given by

$$G_m = -\int_{\Gamma \setminus [0, \pi]} H_m(\cos\phi) \, d\phi, \tag{3.6}$$

where the contour is traversed in the counterclockwise direction. Determining an appropriate contour Γ ∖ [0, π] is the subject of the subsequent section. Ideally, one would construct a contour on which Hm(cos ϕ) undergoes a finite number of oscillations where Hm(cos ϕ) is not negligible, such that the number of oscillations is independent of both κ and m. Although it is difficult to construct a contour on which Hm(cos ϕ) (given by formula (3.3)) has a finite number of such oscillations (see Figure 13 and Section 6.5), it is straightforward to construct a contour on which the numerator of the spherical-wave term (i.e., the numerator of Hw(cos ϕ)) does not oscillate, regardless of κ, α, or R0.

3.1.1. Gustafsson’s Contours

Gustafsson [9] proposed using contour integration to evaluate the modal Green’s function by constructing a contour on which the numerator of the spherical-wave term (see formula (3.1)) is non-oscillatory. In this section, we present an alternative construction of Gustafsson’s contours which, unlike Gustafsson’s construction, permits complex-valued κ. Recall that our goal is to construct a contour Γ \ [0, π] which begins at the point ϕ = 0, travels down into the complex plane sufficiently far, traverses right, and then travels up to the point ϕ = π. An adequate contour has the property that the numerator of the spherical-wave term decays (or grows) monotonically on the first and last segments, which we denote γ1 and γ2, respectively. The contour which connects γ1 and γ2, corresponds to an integral which, by design, is negligible. We denote this segment by γc for “connecting.” Because it is noncontributory, we do not derive an expression for it. We split the integral in (3.5) into

[0,π]Hm(cosϕ)dϕ+γ1Hm(cosϕ)dϕ+γcHm(cosϕ)dϕ+γ2Hm(cosϕ)dϕ=0, (3.7)

where the contours are traversed in the counterclockwise direction. We then substitute formula (3.4) into (3.7) to obtain the formula

Gm=γ1Hm(cosϕ)dϕγcHm(cosϕ)dϕγ2Hm(cosϕ)dϕ. (3.8)

We construct the contours γ1 and γ2 as follows. To construct γ1, we choose a curve which intersects the point ϕ = 0, and on which the numerator of Hw(cosϕ) does not oscillate. From formula (3.1), it is easy to see that this occurs when Re(κ1αcosϕ) is constant. Because γ1 must intersect ϕ = 0, this contour is defined by

γ1={ϕ˜:Re(κ1αcosϕ˜)=Re(κ)1α}. (3.9)

Representing κ in polar form κ=reiϕ, formula (3.9) simplifies to

γ1={ϕ˜:Re(eiϕ1αcosϕ˜)=Re(eiϕ)1α}. (3.10)

To represent this curve in parametric form, we first perform a change of variables cos ϕ = z. The curve, γ1(s), which is the solution to

eiϕ1αγ1(s)=αis+eiϕ1α, (3.11)

satisfies (3.10). With the negative time-harmonic convention, the spherical-wave term decays on γ1(s) in the s > 0 direction. Note that the constant associated with the is term in equation (3.11) is arbitrary; our choice of α is for convenience. We solve for γ1(s) by dividing both sides by eiϕ, squaring both sides, subtracting one from each side, and dividing both sides by −α, arriving at the formula

γ1(s)=e2iϕs2+2βieiϕ1ααs+1, (3.12)

where s0. Introducing the parameters β=1/α1 and β+=1/α1 (see Section 1.2), we rewrite (3.12) as

γ1(s)=e2iϕs2+2iβeiϕs+1, (3.13)

where s ≥ 0. To construct γ2, we use the same method, except that we require γ2 to intersect ϕ = π. A similar procedure used to arrive at (3.13) results in a formula for γ2(s) in the cos ϕ-plane, given by

γ2(s)=e2iϕs2+2iβ+eiϕs1, (3.14)

where $s \ge 0$. We note that our formulae for $\gamma_1(s)$ and $\gamma_2(s)$ differ from the formulae in the literature in that they contain the coefficients $e^{-i\phi}$ and $e^{-2i\phi}$. Observe that for real $\kappa$, $\phi = 0$. Lastly, for the case of real $\kappa$, making the substitution $s = t/(2\beta_-)$ for $\gamma_1$ and $s = t/(2\beta_+)$ for $\gamma_2$ yields the formulae in [9] (see [9], formula (9)), given by

$$\gamma_1(t) = \frac{t^2}{4\beta_-^2} + it + 1, \tag{3.15}$$
$$\gamma_2(t) = \frac{t^2}{4\beta_+^2} + it - 1. \tag{3.16}$$
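The defining property of $\gamma_1$ can be checked numerically. The following is a minimal sketch, assuming NumPy's principal branch of the square root and illustrative values of $\alpha$ and $\kappa$ (the function name `gamma1` and all numerical values are hypothetical, not taken from the paper). It evaluates (3.13) and confirms that $\operatorname{Re}(\kappa\sqrt{1-\alpha z})$ stays constant along the contour while the numerator $e^{-i\kappa\sqrt{1-\alpha z}}$ decays monotonically.

```python
import numpy as np

def gamma1(s, alpha, kappa):
    """Contour (3.13) in the z = cos(phi) plane, with kappa = r * exp(i * phi)."""
    phi = np.angle(kappa)                      # argument of the wavenumber
    beta_minus = np.sqrt(1.0 / alpha - 1.0)
    return (np.exp(-2j * phi) * s**2
            + 2j * beta_minus * np.exp(-1j * phi) * s + 1.0)

# Illustrative values (assumptions): 0 < alpha < 1, kappa in quadrant IV.
alpha, kappa = 0.7, 3.0 - 2.0j
s = np.linspace(0.0, 5.0, 51)
z = gamma1(s, alpha, kappa)

# Argument of the exponential in the spherical-wave term.
phase = kappa * np.sqrt(1.0 - alpha * z)
numerator = np.exp(-1j * phase)

print("spread of Re(kappa*sqrt(1 - alpha z)):", np.ptp(phase.real))
print("|numerator| at both ends:", abs(numerator[0]), abs(numerator[-1]))
```

The real part of the phase is constant to rounding error, so the numerator does not oscillate on the contour; only its modulus changes.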

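Integration on these contours requires the change of variables $z = \cos\phi$. Thus, $dz = -\sin\phi\,d\phi$ and $d\phi = -dz/\sqrt{1-z^2}$. Recalling that the Chebyshev polynomials of the first kind are given by

$$T_m(z) = \cos(m\arccos(z)), \tag{3.17}$$

the $\cos(m\phi)$ term with the above substitution becomes $T_m(z)$. Thus, the integral in formula (3.6) becomes

$$G_m = -\int_{\Gamma\setminus[0,\pi]} H_m(\cos\phi)\,d\phi = -\int_{\Gamma\setminus[-1,1]} \frac{H_m(z)}{\sqrt{1-z^2}}\,dz, \tag{3.18}$$

where the contour on the right-hand side is traversed in the counterclockwise direction.

It is not difficult to show that, for $\kappa$ in quadrant IV of the complex plane, $H_m(z)$ vanishes as $\operatorname{Im}(z) \to +\infty$ provided that $0 \le \operatorname{Re}(z) \le 1$ or, equivalently, that $0 \le \operatorname{Re}(\phi) \le \pi$. Thus, if we construct $\gamma_1$ and $\gamma_2$ to travel sufficiently high into the complex plane, we have that

$$\int_{\gamma_c} \frac{H_m(z)}{\sqrt{1-z^2}}\,dz \approx 0, \tag{3.19}$$

where $\gamma_c$ is the contour connecting $\gamma_1$ and $\gamma_2$. After this change of variables, we arrive at a formula for $G_m$ where the integrand has a non-oscillatory term corresponding to the numerator of the spherical-wave term, given by

$$G_m = -\int_{\gamma_1} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \int_{\gamma_2} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz, \tag{3.20}$$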

where we have used (3.19) to omit the integral corresponding to γc.

Our formula (3.20) departs from the form given in [9] (see [9], formula (19)) in that (3.20) is a formula for all m, while the formula appearing in [9] is for the special case where m = 1. Although the integrand in (3.20) has a spherical-wave term which monotonically decays on γ1 and γ2, the rest of the integrand oscillates and grows along γ1 and γ2. In the subsequent section, we characterize the growth, oscillation, and sign behavior of the integrand on these contours. We then demonstrate that this results in concomitant cancellation error when evaluating formula (3.20).

3.1.2. Cancellation Error on Gustafsson’s Contours

We consider the integrand in (3.20) as the product of two terms, $H_w(z)$ and $T_m(z)$, with $H_w(z)$ given as

$$H_w(z) = \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}}. \tag{3.21}$$

On both contours $\gamma_1$ and $\gamma_2$, for points distant from the real axis (i.e., points with large imaginary component), the exponential term in $H_w(z)$ decays far faster than $T_m(z)$ grows, meaning the integrand decays to zero as $\operatorname{Im}(z) \to +\infty$. However, for points on $\gamma_1$ and $\gamma_2$ near the real axis, $T_m(z)$ can be far larger than $1/H_w(z)$, meaning that the integrand takes on values with large magnitude, particularly when evaluating the modal Green’s function for large values of $m$ and small values of $\kappa$.

Being the Fourier coefficient of an analytic function, Gm exhibits geometric decay in m, but is expressed as the sum of two integrals, each of which exhibits geometric growth in m. We summarize this behavior with the formula

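$$\underbrace{G_m}_{O(a^{-m})} = \underbrace{-\int_{\gamma_1} \frac{H_w(z)\,T_m(z)}{\sqrt{1-z^2}}\,dz}_{O(a^{m})}\;\underbrace{-\int_{\gamma_2} \frac{H_w(z)\,T_m(z)}{\sqrt{1-z^2}}\,dz}_{O(a^{m})}, \tag{3.22}$$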

which is only possible if the integrals have opposite sign. Therefore, integrating the form in (3.20) incurs cancellation error which grows geometrically with m.

3.2. Rational Function Approximation of the Chebyshev Polynomial

Evaluation of (3.20) incurs cancellation error which grows geometrically in m, due to the growth of the Chebyshev polynomial away from the real axis. In this section, we characterize its growth, and then propose a rational function approximation which approximately equals the Chebyshev polynomial on the interval [−1, 1], but instead decays in the complex plane.

3.2.1. The Growth of the Chebyshev Polynomial in the Complex Plane

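It is helpful to characterize the growth of the Chebyshev polynomial in the complex plane. Recall that the formula for the Bernstein ellipse indexed by the parameter $\rho > 0$ is

$$E_\rho(\theta) = a\cos\theta + ib\sin\theta, \tag{3.23}$$

where

$$a = \tfrac{1}{2}\left(\rho + \rho^{-1}\right), \qquad b = \tfrac{1}{2}\left(\rho - \rho^{-1}\right). \tag{3.24}$$

Recall that (2.12) provides the bound

$$\tfrac{1}{2}\left(\rho^m - \rho^{-m}\right) \le \left|T_m(E_\rho(\theta))\right| \le \tfrac{1}{2}\left(\rho^m + \rho^{-m}\right), \tag{3.25}$$

characterizing the growth of $T_m(E_\rho(\theta))$. Note that the upper bound in (3.25) can be immediately extended to any point $z$ in the interior of the Bernstein ellipse $E_\rho$ by the maximum principle (the lower bound cannot, since $T_m$ has zeros on $[-1, 1]$). Thus,

$$\left|T_m(z)\right| \le \tfrac{1}{2}\left(\rho^m + \rho^{-m}\right), \tag{3.26}$$

for all $z \in E_\rho^o$, where $E_\rho^o$ denotes the interior of the region bounded by $E_\rho$.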

3.2.2. Choice of the Bernstein Ellipse Parameter ρ for an mth Order Chebyshev Polynomial

Recall that, by convention, the parameter $\rho > 1$. Hence, by (3.26),

$$\left|T_m(z)\right| \le \tfrac{1}{2}\left(\rho^m + \rho^{-m}\right) < \rho^m, \tag{3.27}$$

for all $z \in E_\rho^o$. Thus, to bound the $m$th order Chebyshev polynomial by an arbitrary constant $M$, we pick $\rho$ by the formula

$$\rho = M^{1/m}, \tag{3.28}$$

which by (3.27) gives $|T_m(z)| < M$ for $z \in E_\rho^o$.
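As an illustration of this choice of $\rho$, the sketch below evaluates $T_m$ on and inside the Bernstein ellipse with $\rho = M^{1/m}$ and compares it with the bounds quoted above. The helper names `cheb_T` and `bernstein_ellipse` and the values of $m$ and $M$ are assumptions made for the example; $T_m$ is evaluated through the identity $T_m\left(\tfrac{1}{2}(u+u^{-1})\right) = \tfrac{1}{2}(u^m + u^{-m})$.

```python
import numpy as np

def cheb_T(m, z):
    """T_m(z) for complex z, using T_m(J(u)) = (u**m + u**(-m)) / 2 with z = J(u)."""
    z = np.asarray(z, dtype=complex)
    u = z + np.sqrt(z - 1.0) * np.sqrt(z + 1.0)   # either root of J(u) = z works
    return 0.5 * (u**m + u**(-m))

def bernstein_ellipse(rho, theta):
    """Points E_rho(theta) = a cos(theta) + i b sin(theta), formula (3.23)."""
    a = 0.5 * (rho + 1.0 / rho)
    b = 0.5 * (rho - 1.0 / rho)
    return a * np.cos(theta) + 1j * b * np.sin(theta)

m, M = 20, 10.0                       # illustrative values (assumptions)
rho = M ** (1.0 / m)                  # formula (3.28)
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)

on_ellipse = np.abs(cheb_T(m, bernstein_ellipse(rho, theta)))
# A smaller ellipse with the same foci lies strictly inside E_rho.
inside = np.abs(cheb_T(m, bernstein_ellipse(1.0 + 0.5 * (rho - 1.0), theta)))

upper = 0.5 * (rho**m + rho**(-m))
lower = 0.5 * (rho**m - rho**(-m))
print("range of |T_m| on E_rho:", on_ellipse.min(), on_ellipse.max())
print("bounds:", lower, upper, " M =", M)
```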

3.2.3. Rational Function Approximation of the Chebyshev Polynomial via the Cauchy Integral Formula

In this section, we construct a rational function approximation which is approximately equal to Tm(z) on the interval [−1, 1], but, instead of exhibiting polynomial growth in the complex plane, decays.

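The Chebyshev polynomial $T_m(z)$ is analytic everywhere in the complex plane. Thus, by Cauchy’s integral formula,

$$T_m(z) = \frac{1}{2\pi i}\oint_\Gamma \frac{T_m(v)}{v - z}\,dv, \tag{3.29}$$

where $\Gamma$ is any simple closed contour, and $z$ is a point in the interior of $\Gamma$. Let $\Gamma$ be a Bernstein ellipse with parameter $\rho$, denoted by $E_\rho$. Then (3.29) is given by

$$T_m(z) = \frac{1}{2\pi i}\int_0^{2\pi} \frac{T_m(E_\rho(\theta))\,E_\rho'(\theta)}{E_\rho(\theta) - z}\,d\theta. \tag{3.30}$$

Suppose that the integral in (3.30) can be efficiently estimated with a quadrature rule, given by the nodes $\theta_1, \theta_2, \ldots, \theta_n$ and weights $w_1, w_2, \ldots, w_n$. Then $T_m(z) \approx R_m(z)$, where

$$R_m(z) = \frac{1}{2\pi i}\sum_{i=1}^{n} \frac{T_m(E_\rho(\theta_i))\,E_\rho'(\theta_i)}{E_\rho(\theta_i) - z}\,w_i. \tag{3.31}$$

Recall from Section 2.3 that

$$T_m(E_\rho(\theta)) = T_m(J(C_\rho(\theta))) = \frac{\rho^m e^{im\theta} + \rho^{-m} e^{-im\theta}}{2}. \tag{3.32}$$

Thus, we rewrite (3.31) as

$$R_m(z) = \frac{1}{2\pi i}\sum_{i=1}^{n} \frac{a_i}{v_i - z}, \tag{3.33}$$

where

$$a_i = \left(\frac{\rho^m e^{im\theta_i} + \rho^{-m} e^{-im\theta_i}}{2}\right) E_\rho'(\theta_i)\,w_i, \qquad v_i = E_\rho(\theta_i), \tag{3.34}$$

for $i = 1, 2, \ldots, n$.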
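The construction (3.33)-(3.34) can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the equispaced trapezoidal rule in $\theta$ over the whole ellipse (the paper constructs its own quadratures later), and the function names `rational_cheb` and `eval_R`, as well as the values of $m$, $M$, and $n$, are hypothetical.

```python
import numpy as np

def rational_cheb(m, rho, n):
    """Poles v_i and coefficients a_i of (3.33)-(3.34) for a trapezoidal rule."""
    theta = 2.0 * np.pi * np.arange(n) / n
    w = 2.0 * np.pi / n                            # trapezoidal weights (assumption)
    a_ax = 0.5 * (rho + 1.0 / rho)                 # semi-major axis
    b_ax = 0.5 * (rho - 1.0 / rho)                 # semi-minor axis
    v = a_ax * np.cos(theta) + 1j * b_ax * np.sin(theta)      # E_rho(theta_i)
    dv = -a_ax * np.sin(theta) + 1j * b_ax * np.cos(theta)    # E_rho'(theta_i)
    Tm = 0.5 * (rho**m * np.exp(1j * m * theta)
                + rho**(-m) * np.exp(-1j * m * theta))        # formula (3.32)
    return v, Tm * dv * w

def eval_R(z, v, a):
    """Evaluate R_m(z) = (1 / (2 pi i)) * sum_i a_i / (v_i - z)."""
    z = np.atleast_1d(np.asarray(z, dtype=complex))
    return (a[None, :] / (v[None, :] - z[:, None])).sum(axis=1) / (2j * np.pi)

m, M, n = 8, np.e, 400                 # illustrative values (assumptions)
rho = M ** (1.0 / m)
v, a = rational_cheb(m, rho, n)

x = np.linspace(-1.0, 1.0, 41)
err_interval = np.max(np.abs(eval_R(x, v, a) - np.cos(m * np.arccos(x))))
far = eval_R(np.array([0.5 + 2.0j, -0.3 + 1.0j]), v, a)

print("max |R_m - T_m| on [-1, 1]:", err_interval)
print("|R_m| at two points well outside E_rho:", np.abs(far))
```

On $[-1, 1]$ the sum reproduces $T_m$, while at points well outside the ellipse it is close to zero, in contrast to $T_m$ itself, which grows there.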

3.3. An Analytic Mapping Exchanging the Bernstein Ellipse with the Interval [–1, 1]

In order to estimate the number of quadrature nodes required for the construction of the approximation $R_m(z)$ described in the previous section, we will make use of the following map. Let $\mathbb{H}$ denote the upper half-plane, and let $E_\rho$ denote the Bernstein ellipse with parameter $\rho > 1$. From the discussions of the Joukowski transformation in Section 2.2 and the Bernstein ellipse in Section 2.3, we can construct a conformal mapping that exchanges the upper half of the Bernstein ellipse $E_\rho$ and the unit interval $[-1, 1]$ (see Figure 2).

Fig. 2: The mapping ϕ exchanging the upper half of the Bernstein ellipse with the interval [−1, 1].

The mapping $\phi$ given by formula (3.35) is a conformal mapping from $\mathbb{H} \cap E_{\rho^2}^o$ to $E_\rho^o \setminus \left((-\infty, -1] \cup [1, \infty)\right)$, where $\mathbb{H}$ denotes the upper half-plane, such that the upper half of the Bernstein ellipse $E_\rho$ and the unit interval $[-1, 1]$ are exchanged.

Lemma 3.1.

Let $J(v)$ denote the Joukowski transformation, given by formula (2.2), and let $J_2^{-1}(z)$ denote the inverse of the Joukowski transformation, given by formula (2.4). Then the mapping

$$\phi(z) = J\left(\frac{1}{\rho}\,J_2^{-1}(z)\right) \tag{3.35}$$

is a conformal mapping from $\mathbb{H} \cap E_{\rho^2}^o$ to $E_\rho^o \setminus \left((-\infty, -1] \cup [1, \infty)\right)$, and exchanges the unit interval $[-1, 1]$ with the upper half of the Bernstein ellipse $E_\rho$, in the sense that $\phi([-1, 1]) = \mathbb{H} \cap E_\rho$ and $\phi(\mathbb{H} \cap E_\rho) = [-1, 1]$.

Note that the mapping ϕ described above approaches the identity map as ρ approaches 1.

3.4. The Decay of Chebyshev Expansion Coefficients of $p(z)/(z - w)$ for $z \in [-1, 1]$, where $w \in E_\rho$

The following theorem bounds the decay of the coefficients of the Chebyshev expansion of $p(z)/(z - w)$ for $z \in [-1, 1]$, where $p(z)$ is a polynomial of order $m$ and $w \in E_\rho$. Note the similarities between this theorem and Theorem 2.1.

Theorem 3.2.

Suppose that $p(z)$ is a polynomial of order $m$ which satisfies $|p(z)| \le L$ for all $z \in E_\rho^o$, where $E_\rho^o$ denotes the interior of the Bernstein ellipse $E_\rho$, for some constant $L > 0$. Suppose further that $w \in E_\rho$, and that

$$\frac{p(z)}{z - w} = \sum_{k=0}^{\infty} a_k T_k(z), \tag{3.36}$$

for all $z \in [-1, 1]$, where $T_k(z)$ is the Chebyshev polynomial of order $k$. Finally, let $v \in C_\rho$ be the point on the circle $C_\rho$ of radius $\rho > 1$ such that $w = J(v) = \tfrac{1}{2}(v + v^{-1})$. Then the Chebyshev expansion coefficients $a_k$ satisfy

$$|a_k| \le \frac{4L\rho}{|v^2 - 1|}\,\rho^{-k}, \tag{3.37}$$

for all positive $k \ge m$.

Proof.

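We begin by observing that

$$a_k = \frac{2}{\pi}\int_{-1}^{1} \frac{p(s)}{s - w}\,\frac{T_k(s)}{\sqrt{1 - s^2}}\,ds, \tag{3.38}$$

for all $k \ge 1$. Making the change of variables $s = \tfrac{1}{2}(z + z^{-1})$ and using identity (2.7), we have that

$$a_k = -\frac{i}{\pi}\oint_C \frac{p\left(\tfrac{1}{2}(z + z^{-1})\right)}{\tfrac{1}{2}(z + z^{-1}) - w}\,\frac{z^k + z^{-k}}{2}\,\frac{dz}{z}, \tag{3.39}$$

where $C$ is the circle of radius one with the usual counterclockwise orientation (the factor $-i$ comes from $d\theta = dz/(iz)$ on $C$). Since $w \in E_\rho$, where $E_\rho$ is the Bernstein ellipse with parameter $\rho$, there exists some $v \in C_\rho$, where $C_\rho$ is the circle of radius $\rho$, such that $w = \tfrac{1}{2}(v + v^{-1})$ (see Section 2.3). Expressing $w$ in terms of $v$, formula (3.39) becomes

$$a_k = -\frac{i}{\pi}\oint_C \frac{p\left(\tfrac{1}{2}(z + z^{-1})\right)}{\tfrac{1}{2}(z + z^{-1}) - \tfrac{1}{2}(v + v^{-1})}\,\frac{z^k + z^{-k}}{2}\,\frac{dz}{z}, \tag{3.40}$$

where $v \in C_\rho$, which simplifies to

$$a_k = -\frac{i}{\pi}\oint_C \frac{p\left(\tfrac{1}{2}(z + z^{-1})\right)}{(z + z^{-1}) - (v + v^{-1})}\,\left(z^k + z^{-k}\right)\frac{dz}{z}, \tag{3.41}$$

where $v \in C_\rho$. Since

$$\frac{1}{(z + z^{-1}) - (v + v^{-1})} = \frac{1}{z - v}\cdot\frac{1}{1 - (zv)^{-1}} = \frac{1}{z - v^{-1}}\cdot\frac{1}{1 - z^{-1}v}, \tag{3.42}$$

we observe that the integrand of (3.41) has two simple poles, one at $v$ and the other at $v^{-1}$. We assume that $\rho > 1$ (recall that $E_\rho = E_{1/\rho}$), so that $|v| = \rho > 1$ and $|v^{-1}| = 1/\rho < 1$. Splitting the integral in (3.41) into two parts, we write $a_k = a_k^{(1)} + a_k^{(2)}$, where

$$a_k^{(1)} = -\frac{i}{\pi}\oint_C \frac{1}{z - v}\,p\left(\tfrac{1}{2}(z + z^{-1})\right)\frac{1}{1 - (zv)^{-1}}\,z^{-k}\,\frac{dz}{z}, \tag{3.43}$$

and

$$a_k^{(2)} = -\frac{i}{\pi}\oint_C \frac{1}{z - v^{-1}}\,p\left(\tfrac{1}{2}(z + z^{-1})\right)\frac{1}{1 - z^{-1}v}\,z^{k}\,\frac{dz}{z}. \tag{3.44}$$

We first consider $a_k^{(1)}$. Since the integrand has a simple pole at $v \in C_\rho$, by the residue theorem we can express the contour integral over $C$ as the sum of the residue at $v$ and a contour integral over $C_R$, where $R > \rho > 1$, so that

$$a_k^{(1)} = -\frac{2p\left(\tfrac{1}{2}(v + v^{-1})\right)}{v - v^{-1}}\,v^{-k} - \frac{i}{\pi}\oint_{C_R} \frac{1}{z - v}\,p\left(\tfrac{1}{2}(z + z^{-1})\right)\frac{1}{1 - (zv)^{-1}}\,z^{-k}\,\frac{dz}{z}. \tag{3.45}$$

Since $p(z) = O(z^m)$ as $|z| \to \infty$, we see that $p\left(\tfrac{1}{2}(z + z^{-1})\right) = O(z^m)$ as $|z| \to \infty$. Furthermore, since

$$\frac{1}{z - v}\cdot\frac{1}{1 - (zv)^{-1}} \sim \frac{1}{z} \tag{3.46}$$

as $|z| \to \infty$, we have that, when $k \ge m$, the integral over $C_R$ in (3.45) vanishes as $R \to \infty$. Thus,

$$a_k^{(1)} = -\frac{2p\left(\tfrac{1}{2}(v + v^{-1})\right)}{v - v^{-1}}\,v^{-k}, \tag{3.47}$$

for all $k \ge m$.

Next, we consider $a_k^{(2)}$. Since the integrand has a simple pole at $v^{-1} \in C_{\rho^{-1}}$, by the residue theorem we have that

$$a_k^{(2)} = -\frac{2p\left(\tfrac{1}{2}(v + v^{-1})\right)}{v - v^{-1}}\,v^{-k} - \frac{i}{\pi}\oint_{C_r} \frac{1}{z - v^{-1}}\,p\left(\tfrac{1}{2}(z + z^{-1})\right)\frac{1}{1 - z^{-1}v}\,z^{k}\,\frac{dz}{z}, \tag{3.48}$$

where $r < \rho^{-1} < 1$. As before, we observe that, since $p(z) = O(z^m)$ as $|z| \to \infty$, $p\left(\tfrac{1}{2}(z + z^{-1})\right) = O(z^{-m})$ as $|z| \to 0$. Since

$$\frac{1}{z - v^{-1}}\cdot\frac{1}{1 - z^{-1}v} \sim z \tag{3.49}$$

as $|z| \to 0$, we have that, when $k \ge m$, the integral over $C_r$ in (3.48) vanishes as $r \to 0$. Thus,

$$a_k^{(2)} = -\frac{2p\left(\tfrac{1}{2}(v + v^{-1})\right)}{v - v^{-1}}\,v^{-k}, \tag{3.50}$$

for all $k \ge m$.

Combining (3.47) and (3.50), we have

$$a_k = -\frac{4p\left(\tfrac{1}{2}(v + v^{-1})\right)}{v - v^{-1}}\,v^{-k}, \tag{3.51}$$

for all $k \ge m$. Since $v \in C_\rho$ and $|p(z)| \le L$ for $z \in E_\rho^o$, it is easy to see that

$$|a_k| \le \frac{4L\rho}{|v^2 - 1|}\,\rho^{-k}, \tag{3.52}$$

for all $k \ge m$, and we are done. ■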
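The closed form obtained in the proof, $a_k = -4\,p(w)\,v^{-k}/(v - v^{-1})$ for $k \ge m$, can be compared against coefficients computed by Gauss-Chebyshev quadrature. The sketch below does this for an arbitrary cubic; the polynomial, the point $v$, and the number of quadrature nodes are assumptions chosen only for illustration.

```python
import numpy as np

m, rho = 3, 1.5
p_coef = np.array([0.3, -1.0, 0.5, 2.0])   # p(z) = 0.3 - z + 0.5 z^2 + 2 z^3 (arbitrary)
p = lambda z: np.polyval(p_coef[::-1], z)  # polyval expects highest degree first

v = rho * np.exp(0.7j)                     # a point on the circle C_rho
w = 0.5 * (v + 1.0 / v)                    # w = J(v) lies on E_rho

# Gauss-Chebyshev rule: a_k ~ (2/N) * sum_j f(x_j) T_k(x_j), x_j = cos(theta_j).
N = 200
theta = (2.0 * np.arange(1, N + 1) - 1.0) * np.pi / (2.0 * N)
x = np.cos(theta)
f = p(x) / (x - w)

k = np.arange(m, 25)
a_quad = (2.0 / N) * (f[None, :] * np.cos(k[:, None] * theta[None, :])).sum(axis=1)
a_formula = -4.0 * p(w) / (v - 1.0 / v) * v ** (-k.astype(float))

print("max difference:", np.max(np.abs(a_quad - a_formula)))
print("|a_k| * rho^k (constant for k >= m):", np.abs(a_formula * rho ** k)[:3])
```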

3.5. The Number of Terms in the Chebyshev Expansions of $p(z)/(z - w)$ for $z \in [-1, 1]$, where $w \in E_\rho$

The following corollary of Theorem 3.2 bounds the number of Chebyshev polynomials required to represent a function $p(z)/(z - w)$ for $z \in [-1, 1]$, where $p(z)$ is a polynomial of order $m$ that is bounded by $L$ in $E_\rho^o$, with $w \in E_\rho$ and $\rho = M^{1/m}$, in terms of $M$, $L$, and $m$.

Corollary 3.3.

Suppose that $M > 1$, and let $\rho = M^{1/m}$, for some integer $m \ge 1$. Suppose further that $p(z)$ is a polynomial of order $m$ which satisfies $|p(z)| \le L$ for all $z \in E_\rho^o$, where $E_\rho^o$ denotes the interior of the Bernstein ellipse $E_\rho$, for some constant $L > 0$. Suppose that $w \in E_\rho$, and that

$$\frac{p(z)}{z - w} = \sum_{k=0}^{\infty} a_k T_k(z), \tag{3.53}$$

for all $z \in [-1, 1]$, where $T_k(z)$ is the Chebyshev polynomial of order $k$. Finally, let $0 < \epsilon \ll 1$ be some small real number, and let

$$k_0 = m\left(\log(2L) - \log(\epsilon)\right)/\log(M). \tag{3.54}$$

Then, when $w$ is well-separated from $\pm 1$,

$$|a_k| \lesssim \epsilon \tag{3.55}$$

for all $k \ge \max(k_0, m)$, and when $w \approx \pm 1$,

$$|a_k| \lesssim \frac{\epsilon}{\sqrt{|w^2 - 1|}}, \tag{3.56}$$

for all $k \ge \max(k_0, m)$.

Proof.

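By Theorem 3.2, the Chebyshev expansion coefficients $a_k$ satisfy

$$|a_k| \le \frac{4L\rho}{|v^2 - 1|}\,\rho^{-k}, \tag{3.57}$$

for all positive $k \ge m$. Let $v \in C_\rho$ be the point of the circle of radius $\rho > 1$ such that $w = J(v) = \tfrac{1}{2}(v + v^{-1})$. Since $\rho = M^{1/m}$, it follows that, when $m$ is large, $\rho \approx 1$. Suppose that $w$ is well-separated from $\pm 1$. In this case, it is easy to see that $|v^2 - 1| \approx 2$ (observe that $|v^2 - 1| = 2$ exactly when $v = \pm i$). Thus,

$$\frac{4L\rho}{|v^2 - 1|} \approx 2L \tag{3.58}$$

when $w$ is well-separated from $\pm 1$.

Suppose now that $w \approx \pm 1$. Since $w = J(v) = \tfrac{1}{2}(v + v^{-1})$, it follows from formula (2.4) that

$$v = J_2^{-1}(w) = w + \sqrt{w + 1}\,\sqrt{w - 1}. \tag{3.59}$$

We observe that, if $w \approx 1$, then $v \approx 1$. Subtracting one from both sides of (3.59), we have that

$$v - 1 = w - 1 + \sqrt{w + 1}\,\sqrt{w - 1}. \tag{3.60}$$

As $w \to 1$, $w - 1$ becomes much smaller than $\sqrt{w - 1}$, so we have that $|v - 1| \approx \sqrt{|w^2 - 1|}$ as $w \to 1$. Since $|v + 1| \approx 2$ as $w \to 1$, it also follows that

$$|v^2 - 1| \approx 2\sqrt{|w^2 - 1|}, \tag{3.61}$$

as $w \to 1$. By essentially the same argument, we can show that (3.61) also holds as $w \to -1$. Putting all of this together, we have that, when $w \approx \pm 1$,

$$\frac{4L\rho}{|v^2 - 1|} \approx \frac{2L}{\sqrt{|w^2 - 1|}}. \tag{3.62}$$

Letting

$$k_0 = m\left(\log(2L) - \log(\epsilon)\right)/\log(M), \tag{3.63}$$

we observe that

$$2L\rho^{-k} \le \epsilon \tag{3.64}$$

for positive $k \ge k_0$. Combining this with (3.57), (3.58), and (3.62), we find that, when $w$ is well-separated from $\pm 1$,

$$|a_k| \lesssim \epsilon \tag{3.65}$$

for all $k \ge \max(k_0, m)$, and when $w \approx \pm 1$,

$$|a_k| \lesssim \frac{\epsilon}{\sqrt{|w^2 - 1|}}, \tag{3.66}$$

for all $k \ge \max(k_0, m)$, and we are done. ■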
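For concreteness, the threshold $k_0 = m(\log(2L) - \log\epsilon)/\log M$ can be evaluated directly. The short sketch below uses illustrative values of $m$, $L$, $\epsilon$, and $M$ (all assumptions) and confirms that $2L\rho^{-k} \le \epsilon$ at the first integer $k \ge \max(k_0, m)$; the function name `k0_terms` is hypothetical.

```python
import math

def k0_terms(m, L, eps, M):
    """Threshold k_0 = m * (log(2L) - log(eps)) / log(M) from the corollary."""
    return m * (math.log(2.0 * L) - math.log(eps)) / math.log(M)

m, L, eps, M = 50, 10.0, 1e-15, 10.0      # illustrative values (assumptions)
k0 = k0_terms(m, L, eps, M)
rho = M ** (1.0 / m)
k = math.ceil(max(k0, m))                 # first admissible integer index

print("k0 =", k0, " k =", k, " 2 L rho^-k =", 2.0 * L * rho ** (-k))
```

Note that $k_0$ grows linearly in $m$ for fixed $M$, $L$, and $\epsilon$.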

3.6. The Geometry of the Bernstein Ellipse

Our approach is to compute the modal Green’s function by integrating along Gustafsson’s contours, where the spherical-wave term is non-oscillatory. However, on these contours, the Chebyshev polynomial term oscillates. To avoid the cancellation error which occurs from integrating the Chebyshev polynomial term along Gustafsson’s contours, we instead replace the Chebyshev polynomial with the rational function approximation described in Section 3.2.3. Such a function, by design, vanishes outside the Bernstein ellipse. Therefore, the behavior of our algorithm is determined entirely by the portion of the contour within the Bernstein ellipse. In this section, we describe several properties of the Bernstein ellipse in relation to the contours $\gamma_1$ and $\gamma_2$, which we will require when we describe our algorithm.

3.6.1. Approximations for the Major and Minor Axes as Functions of m

Recall from Section 3.2.2 that, when constructing the rational function approximation for the $m$th order Chebyshev polynomial, we choose the Bernstein ellipse parameter $\rho$ using the formula

$$\rho = M^{1/m}, \tag{3.67}$$

where $M > 1$ is an arbitrary constant. Recall also from Section 2.3 that the axes of the Bernstein ellipse $E_\rho$, where $\rho > 1$, are given by

$$a = \frac{1}{2}\left(\rho + \frac{1}{\rho}\right), \qquad b = \frac{1}{2}\left(\rho - \frac{1}{\rho}\right), \tag{3.68}$$

where $a$ is the semi-major axis (along the real axis) and $b$ is the semi-minor axis (along the imaginary axis). For convenience we analyze the case where $M = e$, so that $\rho = e^{1/m}$, and hence $a = \cosh(1/m)$ and $b = \sinh(1/m)$. Taking the Taylor expansions of $\rho$ and $1/\rho$, the semi-major axis of the Bernstein ellipse in terms of $m$ is

$$a = \frac{1}{2}\left(\rho + \frac{1}{\rho}\right) = 1 + \frac{1}{2m^2} + O\left(\frac{1}{m^4}\right). \tag{3.69}$$

Likewise, the semi-minor axis is

$$b = \frac{1}{2}\left(\rho - \frac{1}{\rho}\right) = \frac{1}{m} + O\left(\frac{1}{m^3}\right). \tag{3.70}$$

3.6.2. The Distances from the Points z = 1 and z = −1 to the Bernstein Ellipse as Functions of m

Recall that Gustafsson’s contours $\gamma_1$ and $\gamma_2$ begin at the points $z = 1$ and $z = -1$, respectively, which are the foci of the Bernstein ellipses (see Section 3.1.1). For each focus, we are interested in two quantities: the vertical distance from the focus to the Bernstein ellipse, and the horizontal distance from the focus to the Bernstein ellipse. We examine the distances for $z = 1$, knowing that, by symmetry, they are the same for $z = -1$. Observe that the horizontal distance from $z = 1$ to the Bernstein ellipse is equal to $a - 1$. Formula (3.69) immediately yields

$$a - 1 = \frac{1}{2m^2} + O\left(\frac{1}{m^4}\right). \tag{3.71}$$

Observe that the vertical distance from $z = 1$ to the Bernstein ellipse is the same as the $y$-coordinate of the intersection of the vertical line $x = 1$ with the Bernstein ellipse $E_\rho$ in the upper half-plane. Substituting $x = 1$ into the equation of the ellipse, $x^2/a^2 + y^2/b^2 = 1$, gives $y = b\sqrt{1 - 1/a^2}$. Inserting the Taylor series expansions of the semi-major and semi-minor axes, it can be shown that

$$y = \frac{1}{m^2}\sqrt{1 + O\left(\frac{1}{m^2}\right)}. \tag{3.72}$$

Recall that the Taylor series of $\sqrt{1 + x}$ is

$$\sqrt{1 + x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \cdots. \tag{3.73}$$

Substituting (3.73) into (3.72), we have that

$$y = \frac{1}{m^2} + O\left(\frac{1}{m^4}\right). \tag{3.74}$$

Hence, the vertical distance from both $z = 1$ and $z = -1$ to the Bernstein ellipse is approximately $1/m^2$ (see Figure 3).

Fig. 3: The distances a − 1, b, and the distance from the focus z = 1 to the intersection of the line x = 1 with the Bernstein ellipse, as functions of m.

The distance from $z = 1$ to the intersection of $E_\rho$ with the real axis ($y = 0$) is $\approx 1/(2m^2)$, and is equal to $a - 1$, where $a$ is the semi-major axis of $E_\rho$. The intersection of $x = 1$ with $E_\rho$ has a distance of $\approx 1/m^2$ to the point $z = 1$. The vertical distance from $z = 0$ to $E_\rho$ is $\approx 1/m$, and is equal to $b$, the semi-minor axis of $E_\rho$.
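The leading-order behaviour of these three distances is easy to check numerically for $\rho = e^{1/m}$. The sketch below (variable names are illustrative assumptions) prints the scaled quantities $m^2(a-1)$, $m^2 y$, and $m\,b$, each of which approaches a constant as $m$ grows.

```python
import numpy as np

m = np.array([10.0, 100.0, 1000.0])
rho = np.exp(1.0 / m)                       # the case M = e

a = 0.5 * (rho + 1.0 / rho)                 # semi-major axis
b = 0.5 * (rho - 1.0 / rho)                 # semi-minor axis

horiz = a - 1.0                             # focus z = 1 to the ellipse along the real axis
vert = b * np.sqrt(1.0 - 1.0 / a**2)        # height of the ellipse above x = 1

for mi, h, y, bi in zip(m, horiz, vert, b):
    print(f"m = {mi:6.0f}:  m^2 (a-1) = {h * mi**2:.5f},  "
          f"m^2 y = {y * mi**2:.5f},  m b = {bi * mi:.5f}")
```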

3.6.3. Geometry of the Angles of Intersection Between Gustafsson’s Contours and the Bernstein Ellipse

Recall from Section 3.1.1 that Gustafsson’s contours $\gamma_1$ and $\gamma_2$ can be parameterized as

$$\gamma_1(t) = \frac{t^2}{4\beta_-^2} + it + 1, \tag{3.75}$$
$$\gamma_2(t) = \frac{t^2}{4\beta_+^2} + it - 1, \tag{3.76}$$

for $t > 0$, where $\beta_- = \sqrt{1/\alpha - 1}$ and $\beta_+ = \sqrt{1/\alpha + 1}$, for $0 < \alpha < 1$. Consider the sets $\Gamma_1$ and $\Gamma_2$, consisting of all possible $\gamma_1$ and $\gamma_2$, respectively, defined as

$$\Gamma_1 = \{\gamma_1 : 0 < \beta_- < \infty\}, \qquad \Gamma_2 = \{\gamma_2 : 1 < \beta_+ < \infty\}. \tag{3.77}$$

The boundary of $\Gamma_1$, denoted by $\partial\Gamma_1$, is given by the union of the contours $\gamma_1$ associated with the limits $\beta_- \to 0$ and $\beta_- \to \infty$. Likewise, the boundary of $\Gamma_2$, denoted by $\partial\Gamma_2$, is given by the union of the contours $\gamma_2$ associated with the value $\beta_+ = 1$ and the limit $\beta_+ \to \infty$ (see Figure 4). We observe that, in the limit as $\beta_- \to \infty$ and $\beta_+ \to \infty$, the contours $\gamma_1$ and $\gamma_2$ become vertical lines. From the sets $\partial\Gamma_1$ and $\partial\Gamma_2$, together with the bounds from Section 3.6.2, it is easy to see that the angles that $\gamma_1$ and $\gamma_2$ make with $E_\rho$ are bounded from below. In other words, Gustafsson’s contours are never close to being tangent to the Bernstein ellipse.

Fig. 4: The set of all possible Gustafsson contours together with the Bernstein ellipse Eρ in the z = cos ϕ-plane.

Recall that Gustafsson’s contours are denoted $\gamma_1$ and $\gamma_2$, where $\gamma_1$ begins at the point $z = 1$ and $\gamma_2$ begins at the point $z = -1$. The set of all possible $\gamma_1$ is denoted by $\Gamma_1$. The set of all possible $\gamma_2$ is denoted by $\Gamma_2$.

Finally, we bound the length of Gustafsson’s contours within the Bernstein ellipse. Because the signs of the quadratic term and the linear term in Gustafsson’s contours (see (3.75) and (3.76)) are the same, the lengths of the contours within the ellipse are well-approximated by the distance from the foci to the intersections of the contours with the Bernstein ellipse. For the contour $\gamma_1$ (associated with the point $z = 1$), because the Bernstein ellipse is convex, the length is at most $O(1/m)$. For the contour $\gamma_2$ (associated with the point $z = -1$), because the angle of intersection is bounded from below, the length is also at most $O(1/m)$. Therefore, on either contour, the $m$th-order Chebyshev polynomial term oscillates at most once within the Bernstein ellipse, as does the rational approximation.

3.7. Evaluating the Modal Green’s Function

In this section, we use the rational function approximation Rm(z), described in Section 3.2, to express the modal Green’s function in terms of integrals over Gustafsson’s contours, in such a way that the integrals do not incur the large cancellation errors described in Section 3.1.2.

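Recall from Section 3.1.1 that, after the variable substitution $z = \cos\phi$, $dz = -\sin\phi\,d\phi$, the formula for the modal Green’s function, $G_m$, is

$$G_m = \int_{[-1,1]} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz. \tag{3.78}$$

Our rational function approximation, $R_m(z)$, is approximately equal to $T_m(z)$ on the interval $[-1, 1]$. Therefore, substituting $R_m(z)$ for $T_m(z)$, we arrive at a formula for $G_m$,

$$G_m \approx \int_{[-1,1]} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz. \tag{3.79}$$

The integrand of (3.79) is analytic everywhere in the complex plane except for a finite number of poles, so the integral can be deformed. By Cauchy’s residue theorem, for any closed contour $\Gamma$,

$$\oint_\Gamma \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz = 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right), \tag{3.80}$$

where $z_1, \ldots, z_n$ are the poles inside $\Gamma$. Thus, if $\Gamma$ is a closed contour which includes the interval $[-1, 1]$, we have that

$$G_m \approx -\int_{\Gamma\setminus[-1,1]} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz + 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right), \tag{3.81}$$

where $\Gamma\setminus[-1, 1]$ is a contour starting at $z = 1$ and ending at $z = -1$. We select $\Gamma\setminus[-1, 1]$ to be the Gustafsson contour $\gamma_1 + \gamma_c + \gamma_2$, which we describe in Section 3.1.1. Since the integrand vanishes over $\gamma_c$, we have that

$$G_m \approx -\int_{\gamma_1} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz - \int_{\gamma_2} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz + 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right). \tag{3.82}$$

Since $R_m(z)$, unlike $T_m(z)$, does not grow as $\operatorname{Im}(z) \to +\infty$, it follows that formula (3.82) can be evaluated without cancellation error, provided that the nodes and weights of the quadrature formula used to construct $R_m(z)$ are chosen correctly, which is described in the sequel.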

3.8. Removing the Singularity

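Recall that the integral in (3.20) corresponding to the $\gamma_1$ contour has the formula

$$\int_{\gamma_1} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz = \int_{\gamma_1} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z}\,\sqrt{1+z}}\,T_m(z)\,dz. \tag{3.83}$$

Observe that the integrand in (3.83) has square-root singularities at $z = 1$ and $z = -1$. Furthermore, when $\alpha \approx 1$, the product of the terms

$$\frac{1}{\sqrt{1-\alpha z}\,\sqrt{1-z}} \approx \frac{1}{1-z}, \tag{3.84}$$

meaning that the integrand will have a $1/z$-type singularity at $z = 1$. By careful reparameterization of the contour $\gamma_1$, the singularities in (3.83) can be removed. The variable substitutions and analysis of the singularities in this section are unchanged when $R_m(z)$ is substituted for $T_m(z)$. Recall from Section 3.1.1 that the contour $\gamma_1$ can be parameterized as

$$\gamma_1(t) = \frac{t^2}{4\beta_-^2} + it + 1, \tag{3.85}$$

for $t > 0$. We then follow [9] and perform the substitution $t = 2\beta_-\tau^2$ and reparameterize the contour $\gamma_1$ as $\tilde\gamma_1$, given by

$$\tilde\gamma_1(\tau) = \gamma_1(2\beta_-\tau^2) = \tau^4 + 2i\beta_-\tau^2 + 1. \tag{3.86}$$

Gustafsson showed (see [9], equations (15) and (16)) that, after substituting $z = \tilde\gamma_1(\tau)$, $dz = \tilde\gamma_1'(\tau)\,d\tau$,

$$dz = 4\tau\left(\tau^2 + i\beta_-\right)d\tau, \tag{3.87}$$
$$\sqrt{1-\alpha z} = -\sqrt{\alpha}\,i\left(\tau^2 + i\beta_-\right). \tag{3.88}$$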
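These two identities, and the factorization of $1 - \tilde\gamma_1(\tau)$ that removes the square-root singularity, can be verified numerically. The sketch below assumes NumPy's principal branch of the complex square root (the paper's branch convention is not restated here) and illustrative values of $\alpha$ and $\tau$.

```python
import numpy as np

alpha = 0.999                                   # source and target close (assumption)
beta_minus = np.sqrt(1.0 / alpha - 1.0)
tau = np.linspace(0.01, 2.0, 50)

z = tau**4 + 2j * beta_minus * tau**2 + 1.0     # reparameterized contour (3.86)
dz = 4.0 * tau * (tau**2 + 1j * beta_minus)     # derivative dz/dtau, formula (3.87)

# Identity (3.88): sqrt(1 - alpha z) = -i sqrt(alpha) (tau^2 + i beta_-).
lhs_88 = np.sqrt(1.0 - alpha * z)
rhs_88 = -1j * np.sqrt(alpha) * (tau**2 + 1j * beta_minus)

# Factorization 1 - z = -tau^2 (tau^2 + 2 i beta_-), on the principal branch.
lhs_fac = np.sqrt(1.0 - z)
rhs_fac = -1j * tau * np.sqrt(tau**2 + 2j * beta_minus)

# Combined Jacobian factor dz / (sqrt(1 - alpha z) sqrt(1 - z)).
jac = dz / (lhs_88 * lhs_fac)
jac_closed = -4.0 / (np.sqrt(alpha) * np.sqrt(tau**2 + 2j * beta_minus))

print(np.max(np.abs(lhs_88 - rhs_88)),
      np.max(np.abs(lhs_fac - rhs_fac)),
      np.max(np.abs(jac - jac_closed)))
```

After the substitution, the only remaining near-singular factor is $1/\sqrt{\tau^2 + 2i\beta_-}$.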

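Thus, with the parameterization $z = \tilde\gamma_1(\tau)$, formula (3.83) becomes

$$\frac{4i}{\sqrt{\alpha}}\int_0^\infty \frac{e^{-i\kappa\sqrt{1-\alpha\tilde\gamma_1(\tau)}}}{\sqrt{1-\tilde\gamma_1(\tau)}\,\sqrt{1+\tilde\gamma_1(\tau)}}\,T_m(\tilde\gamma_1(\tau))\,\tau\,d\tau, \tag{3.89}$$

where we have used (3.87) and (3.88) to cancel the $\sqrt{1-\alpha z}$ term. The integrand in (3.89) has a square-root singularity near $z = 1$. After substituting (3.86) into the $\left(1 - \tilde\gamma_1(\tau)\right)$ term in the denominator of (3.89) and factoring the radical, so that $\sqrt{1-\tilde\gamma_1(\tau)} = -i\tau\sqrt{\tau^2 + 2i\beta_-}$, formula (3.89) becomes

$$-\frac{4}{\sqrt{\alpha}}\int_0^\infty \frac{e^{-i\kappa\sqrt{1-\alpha\tilde\gamma_1(\tau)}}}{\sqrt{\tau^2 + 2i\beta_-}\,\sqrt{1+\tilde\gamma_1(\tau)}}\,T_m(\tilde\gamma_1(\tau))\,d\tau. \tag{3.90}$$

Note that the integrand of (3.90) is the product of a smooth function and the function $1/\sqrt{\tau^2 + 2i\beta_-}$. Let $F_1(\tau)$ be the smooth term, given by the formula

$$F_1(\tau) = \frac{e^{-i\kappa\sqrt{1-\alpha\tilde\gamma_1(\tau)}}}{\sqrt{1+\tilde\gamma_1(\tau)}}\,T_m(\tilde\gamma_1(\tau)). \tag{3.91}$$

We now rewrite (3.90) using (3.91), so that the integral in (3.20) corresponding to the $\gamma_1$ contour is given by

$$-\frac{4}{\sqrt{\alpha}}\int_0^\infty \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\,d\tau. \tag{3.92}$$

The variable substitutions for the integral corresponding to the $\gamma_2$ contour are similar. Recall that $\gamma_2$ can be parameterized as

$$\gamma_2(t) = \frac{t^2}{4\beta_+^2} + it - 1. \tag{3.93}$$

We reparameterize $\gamma_2(t)$ as $\tilde\gamma_2(\tau)$, given by the formula

$$\tilde\gamma_2(\tau) = \tau^4 + 2i\beta_+\tau^2 - 1. \tag{3.94}$$

By proceeding as before, we arrive at the formula for $F_2(\tau)$,

$$F_2(\tau) = \frac{e^{-i\kappa\sqrt{1-\alpha\tilde\gamma_2(\tau)}}}{\sqrt{1-\tilde\gamma_2(\tau)}}\,T_m(\tilde\gamma_2(\tau)), \tag{3.95}$$

such that the formula for the integral in (3.20) corresponding to the $\gamma_2$ contour is given by

$$\frac{4i}{\sqrt{\alpha}}\int_0^\infty \frac{F_2(\tau)}{\sqrt{\tau^2 + 2i\beta_+}}\,d\tau. \tag{3.96}$$

We combine (3.92) and (3.96) to write a formula for the $m$th modal Green’s function,

$$G_m = \frac{4}{\sqrt{\alpha}}\int_0^\infty \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\,d\tau - \frac{4i}{\sqrt{\alpha}}\int_0^\infty \frac{F_2(\tau)}{\sqrt{\tau^2 + 2i\beta_+}}\,d\tau. \tag{3.97}$$

Because $\beta_+$ is bounded from below by 1, the denominator in (3.96) always has magnitude greater than 1. In contrast, when $\alpha \approx 1$, we have that $\beta_- \approx 0$, which means that the denominator in (3.92) is approximately $\sqrt{\tau^2} = \tau$.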

4. Algorithm

Recall that kernel splitting methods (e.g., [7]) have computational cost which scales with both $|\kappa|$ and $1/\beta_-$, and cannot be easily parallelized (see Section 1.3.1). Furthermore, kernel splitting techniques experience catastrophic cancellation error for modest $|\operatorname{Im}\kappa|$. In contrast, the method of Gustafsson [9] has computational cost independent of $\kappa$ and $\beta_-$, but incurs cancellation error which grows geometrically in $m$ (see Section 3.1.2).

Our technique is to compute the modal Green’s function by integrating along Gustafsson’s contours using a rational function approximation in place of the Chebyshev polynomial. Because the spherical-wave term in the integrand monotonically decays, our algorithm’s cost is completely independent of $\kappa$. Unlike the method of Gustafsson, because the size of our rational function approximation $R_m(z)$ is bounded by our choice of Bernstein ellipse $E_\rho$, our approach does not have cancellation error which grows geometrically in $m$. This comes at the price of having to evaluate the residues of $R_m(z)$ on the boundary of the corresponding Bernstein ellipse $E_\rho$, with a cost which scales with $m$. We also use the same technique as Gustafsson to evaluate the modal Green’s function when $\beta_- \to 0$ (i.e., when the source and target are close), in time independent of $\beta_-$. Consequently, our algorithm’s computational cost depends only on $m$, is independent of both $\kappa$ and $\beta_-$, and scales as $O(m)$.

4.1. Choice of the Rational Function Approximation

Recall from Section 3.2.3 that the Chebyshev polynomial Tm(z) can be approximated on the interval [−1, 1] with a rational function, Rm(z), constructed via an application of Cauchy’s integral formula followed by the application of a quadrature rule. This rational function approximation decays quickly in the complex plane. In this section, we introduce a different approximation, also denoted Rm(z), which is the sum of a Cauchy integral and a rational function.

By Cauchy’s integral formula, $T_m(z)$ can be expressed as the contour integral

$$T_m(z) = \frac{1}{2\pi i}\oint_\Gamma \frac{T_m(v)}{v - z}\,dv, \tag{4.1}$$

where $\Gamma$ is any simple closed contour, and $z$ is a point in the interior of $\Gamma$. Similarly,

$$\frac{1}{2\pi i}\oint_\Gamma \frac{T_m(v)}{v - z}\,dv = 0, \tag{4.2}$$

for all $z$ outside $\Gamma$. Recall from Section 3.2.2 that for any $m$th order Chebyshev polynomial, if $\rho = M^{1/m}$, then, within the Bernstein ellipse $E_\rho$, $T_m(z)$ is bounded by the constant $M$. Furthermore, within the interior of $E_\rho$, the Chebyshev polynomial oscillates at most once along any possible Gustafsson contour (see Section 3.6). We also note that, to leading order in $1/m$, the Bernstein ellipse $E_{\rho^2}$ has semi-minor axis twice the length of the semi-minor axis of $E_\rho$, and its distance from the foci along the real axis is four times that of $E_\rho$ (see Section 3.6 and Figure 5).

Fig. 5: Contours of interest with respect to the function Rm(z) in the z = cos ϕ-plane.

Gustafsson’s contours are labeled as $\gamma_1$ and $\gamma_2$. The inner Bernstein ellipse is denoted by $E_\rho$. The outer ellipse is denoted by $E_{\rho^2}$. The intersection of $\gamma_1$ with $E_\rho$ is denoted by $p_1$, and the intersection of $\gamma_2$ with $E_\rho$ is denoted by $p_2$. The arc of $E_\rho$ between $p_2$ and $p_1$ is denoted by $C_\rho$. The contours highlighted in red and the region shaded in red correspond to the values of $z$ on which the quadrature in (4.4) must be accurate, in the sense of (4.6)-(4.9).

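Let $C_\rho$ denote the part of the Bernstein ellipse $E_\rho$ between the contours $\gamma_1$ and $\gamma_2$, which corresponds to the portion of $E_\rho$ between $p_1$ and $p_2$, where $p_1$ and $p_2$ are the intersection points of $\gamma_1$ and $\gamma_2$ with $E_\rho$, respectively (see Figure 5). We split the Cauchy integral into two parts,

$$T_m(z) = \frac{1}{2\pi i}\int_{C_\rho} \frac{T_m(v)}{v - z}\,dv + \frac{1}{2\pi i}\int_{E_\rho\setminus C_\rho} \frac{T_m(v)}{v - z}\,dv. \tag{4.3}$$

Now, suppose that $\theta_1, \ldots, \theta_n$, $w_1, \ldots, w_n$ are the nodes and weights of a quadrature formula such that

$$\frac{1}{2\pi i}\int_{C_\rho} \frac{T_m(v)}{v - z}\,dv \approx \frac{1}{2\pi i}\sum_{i=1}^{n} \frac{T_m(v_i)}{v_i - z}\,dv_i\,w_i, \tag{4.4}$$

where $v_i = E_\rho(\theta_i)$, $dv_i = E_\rho'(\theta_i)$, and the quadrature is accurate to precision $\epsilon > 0$ for all $z \in [-1, 1]$, $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{H}\setminus E_{\rho^2}^o$, where $E_{\rho^2}^o$ is the interior of $E_{\rho^2}$ (see Figure 5). Now, let $R_m(z)$ be defined by

$$R_m(z) = \frac{1}{2\pi i}\int_{E_\rho\setminus C_\rho} \frac{T_m(v)}{v - z}\,dv + \frac{1}{2\pi i}\sum_{i=1}^{n} \frac{T_m(v_i)}{v_i - z}\,dv_i\,w_i. \tag{4.5}$$

We observe that, due to formula (4.1), we have that

$$|T_m(z) - R_m(z)| < \epsilon, \tag{4.6}$$

for $z \in [-1, 1]$. We also observe that, due to formula (4.1), we have that

$$|T_m(z) - R_m(z)| < \epsilon, \tag{4.7}$$

for $z \in \gamma_1 \cap E_\rho^o$ and $z \in \gamma_2 \cap E_\rho^o$. Likewise, due to formula (4.2),

$$|R_m(z)| < \epsilon, \tag{4.8}$$

for $z \in \gamma_1\setminus E_\rho^o$ and $z \in \gamma_2\setminus E_\rho^o$. We also observe that, due to formula (4.2),

$$|R_m(z)| < \epsilon, \tag{4.9}$$

for $z \in \mathbb{H}\setminus E_{\rho^2}^o$.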

4.1.1. Deformation of the Contour

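Recall from Section 3.7 that, after the variable substitution $z = \cos\phi$, $dz = -\sin\phi\,d\phi$, the formula for the modal Green’s function, $G_m$, is

$$G_m = \int_{[-1,1]} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz. \tag{4.10}$$

Our approximation, $R_m(z)$, by formula (4.1), is approximately equal to $T_m(z)$ on the interval $[-1, 1]$. Therefore, substituting $R_m(z)$ for $T_m(z)$, we arrive at a formula for $G_m$,

$$G_m \approx \int_{[-1,1]} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz. \tag{4.11}$$

The integrand of (4.11) is analytic everywhere in the complex plane except for a finite number of poles, so the integral can be deformed. Recall that, for any closed contour $\Gamma$, by Cauchy’s residue theorem,

$$\oint_\Gamma \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\,dz = 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right), \tag{4.12}$$

where $z_1, \ldots, z_n$ are the poles inside $\Gamma$. For brevity, let the portion of the integrand in (4.12) corresponding to the product of the spherical-wave term and the $1/\sqrt{1-z^2}$ term be represented by the function $H(z)$, given by

$$H(z) = \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}. \tag{4.13}$$

If $\Gamma$ is a closed contour which includes the interval $[-1, 1]$, we have that

$$G_m \approx -\int_{\Gamma\setminus[-1,1]} H(z)R_m(z)\,dz + 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(H(z)R_m(z)\right), \tag{4.14}$$

where we have substituted formula (4.13) to abbreviate the integral. We select $\Gamma\setminus[-1, 1]$ to be Gustafsson’s contours within the outer ellipse, $E_{\rho^2}$, with both segments connected by a short segment $\gamma_c \subset \mathbb{H}\setminus E_{\rho^2}^o$ (see Figure 6). Substituting this choice of $\Gamma\setminus[-1, 1]$ into (4.14), we have

$$G_m \approx -\int_{\gamma_1\cap E_{\rho^2}^o} H(z)R_m(z)\,dz - \int_{\gamma_2\cap E_{\rho^2}^o} H(z)R_m(z)\,dz - \int_{\gamma_c} H(z)R_m(z)\,dz + 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(H(z)R_m(z)\right), \tag{4.15}$$

where $\gamma_1$ and $\gamma_2$ are Gustafsson’s contours as described in Section 3.1.1, and $E_{\rho^2}^o$ is the interior of the Bernstein ellipse introduced earlier (see Figure 6). We split the integral corresponding to the $\gamma_1$ contour into

$$\int_{\gamma_1\cap E_{\rho^2}^o} H(z)R_m(z)\,dz = \int_{\gamma_1\cap E_\rho^o} H(z)R_m(z)\,dz + \int_{(\gamma_1\cap E_{\rho^2}^o)\setminus E_\rho^o} H(z)R_m(z)\,dz. \tag{4.16}$$

Fig. 6: Contours used in formula (4.15) in the z = cos ϕ-plane.

The interior Bernstein ellipse is denoted by $E_\rho$ and is drawn in blue. The exterior ellipse is denoted by $E_{\rho^2}$. Gustafsson’s contours within the exterior ellipse are denoted by $\gamma_1\cap E_{\rho^2}^o$ and $\gamma_2\cap E_{\rho^2}^o$ and are drawn in red. The contour $\gamma_c \subset \mathbb{H}\setminus E_{\rho^2}^o$, connecting the $\gamma_1$ and $\gamma_2$ segments, is drawn in green. The intersection of $\gamma_1$ with $E_\rho$ is denoted by $p_1$, and the intersection of $\gamma_2$ with $E_\rho$ is denoted by $p_2$.

Recall that by formula (4.7), $R_m(z) \approx T_m(z)$ for $z \in \gamma_1\cap E_\rho^o$ and for $z \in \gamma_2\cap E_\rho^o$. Also, recall that by formula (4.8), $|R_m(z)| < \epsilon$ for $z \in \gamma_1\setminus E_\rho^o$ and for $z \in \gamma_2\setminus E_\rho^o$. Combining (4.7) and (4.8) with (4.16), we arrive at a formula for the integral over the $\gamma_1$ contour within the interior of $E_{\rho^2}$, given by

$$\int_{\gamma_1\cap E_{\rho^2}^o} H(z)R_m(z)\,dz \approx \int_{\gamma_1\cap E_\rho^o} H(z)T_m(z)\,dz. \tag{4.17}$$

Likewise, the formula for the integral over the $\gamma_2$ contour within the interior of $E_{\rho^2}$ is

$$\int_{\gamma_2\cap E_{\rho^2}^o} H(z)R_m(z)\,dz \approx \int_{\gamma_2\cap E_\rho^o} H(z)T_m(z)\,dz. \tag{4.18}$$

We also observe that, due to formula (4.9), the integral corresponding to $\gamma_c$ is negligible. We now substitute our formulae for the $\gamma_1\cap E_{\rho^2}^o$, $\gamma_2\cap E_{\rho^2}^o$, and $\gamma_c$ contour integrals into (4.15) to arrive at

$$G_m \approx -\int_{\gamma_1\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \int_{\gamma_2\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz + 2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right). \tag{4.19}$$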

4.1.2. Interpretation of the Residues in Formula (4.19) as a Quadrature Formula over the Contour Cρ

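By Cauchy’s integral theorem,

$$G_m \approx -\int_{\gamma_1\cap E_\rho^o} H(z)T_m(z)\,dz - \int_{\gamma_2\cap E_\rho^o} H(z)T_m(z)\,dz - \int_{C_\rho} H(z)T_m(z)\,dz, \tag{4.20}$$

where $\gamma_1$, $\gamma_2$, $E_\rho^o$, and $C_\rho$ are described in Section 4.1.

Subtracting (4.19) from (4.20), and rearranging, we arrive at a formula for the integral over the $C_\rho$ contour,

$$\int_{C_\rho} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz \approx -2\pi i\sum_{k=1}^{n} \operatorname*{Res}_{z=z_k}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right). \tag{4.21}$$

Recall from Section 4.1 that

$$R_m(z) = \frac{1}{2\pi i}\int_{E_\rho\setminus C_\rho} \frac{T_m(v)}{v - z}\,dv + \frac{1}{2\pi i}\sum_{i=1}^{n} \frac{T_m(v_i)}{v_i - z}\,dv_i\,w_i, \tag{4.22}$$

where $v_i = E_\rho(\theta_i)$, $dv_i = E_\rho'(\theta_i)$, and $\theta_1, \ldots, \theta_n$, $w_1, \ldots, w_n$ are nodes and weights of the quadrature formula constructed in (4.4). Thus, the residues in (4.21) at the points $z_1, \ldots, z_n$ correspond to residues at the points $v_1, \ldots, v_n$. Since the residue of $1/(v_i - z)$ at $z = v_i$ is $-1$,

$$2\pi i\sum_{i=1}^{n} \operatorname*{Res}_{z=z_i}\left(\frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,R_m(z)\right) = -\sum_{i=1}^{n} \frac{e^{-i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\,T_m(v_i)\,dv_i\,w_i. \tag{4.23}$$

Substituting (4.23) into (4.21), we have that

$$\sum_{i=1}^{n} \frac{e^{-i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\,T_m(v_i)\,dv_i\,w_i \approx \int_{C_\rho} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz, \tag{4.24}$$

which resembles a quadrature formula for the contour integral on $C_\rho$. Substituting formula (4.24) into formula (4.19), we arrive at

$$G_m \approx -\int_{\gamma_1\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \int_{\gamma_2\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \sum_{i=1}^{n} \frac{e^{-i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\,T_m(v_i)\,dv_i\,w_i, \tag{4.25}$$

where $v_i = E_\rho(\theta_i)$, $dv_i = E_\rho'(\theta_i)$, and $\theta_1, \ldots, \theta_n$, $w_1, \ldots, w_n$ are the nodes and weights of the quadrature formula constructed in (4.4).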

4.2. Evaluation of the Integral on Gustafsson’s Contours when α ≈ 1

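Recall from Section 4.1.2 that the formula for the $m$th modal Green’s function is

$$G_m \approx -\int_{\gamma_1\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \int_{\gamma_2\cap E_\rho^o} \frac{e^{-i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}\,T_m(z)\,dz - \sum_{i=1}^{n} \frac{e^{-i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\,T_m(v_i)\,dv_i\,w_i, \tag{4.26}$$

where $v_i = E_\rho(\theta_i)$, $dv_i = E_\rho'(\theta_i)$, and $\theta_1, \ldots, \theta_n$, $w_1, \ldots, w_n$ are the nodes and weights of the quadrature formula constructed in (4.4). Recall also from Section 3.8 that the integrals in (4.26) can be written as

$$G_m \approx \frac{4}{\sqrt{\alpha}}\int_0^{\tau_1} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\,d\tau - \frac{4i}{\sqrt{\alpha}}\int_0^{\tau_2} \frac{F_2(\tau)}{\sqrt{\tau^2 + 2i\beta_+}}\,d\tau - \sum_{i=1}^{n} \frac{e^{-i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\,T_m(v_i)\,dv_i\,w_i, \tag{4.27}$$

where $F_1(\tau)$ and $F_2(\tau)$ are smooth functions corresponding to the $\gamma_1$ and $\gamma_2$ contours, respectively (see Section 3.8, formulae (3.92) and (3.96)), $\tau_1$ and $\tau_2$ are positive parameters such that $\tilde\gamma_1(\tau_1)$ and $\tilde\gamma_2(\tau_2)$ are the intersections of $\gamma_1$ and $\gamma_2$ with $E_\rho$, respectively, and

$$\beta_- = \sqrt{1/\alpha - 1}, \qquad \beta_+ = \sqrt{1/\alpha + 1}. \tag{4.28}$$

When $\alpha \to 1$, the parameter $\beta_+ \to \sqrt{2}$, meaning that the integrand in (4.27) corresponding to $\gamma_2$ remains a smooth function of $\tau$ for all values of $0 < \alpha < 1$, and can be evaluated efficiently with a Gauss-Legendre quadrature. In contrast, when $\alpha \to 1$, the parameter $\beta_- \to 0$. Consequently, for $\alpha \approx 1$, the integrand in (4.27) corresponding to the $\gamma_1$ contour has a singularity resembling $1/\tau$ at $\tau = 0$.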

4.2.1. Evaluation of the Integral on the Contour γ1 when α1

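We integrate along the contour $\gamma_1$ using the following procedure. Observe that for $\tau$ sufficiently large, the integrand is smooth. Thus we split the integral into two parts,

$$\frac{4}{\sqrt{\alpha}} \int_0^{\tau_1} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau = \frac{4}{\sqrt{\alpha}} \int_0^{\tau_0} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau + \frac{4}{\sqrt{\alpha}} \int_{\tau_0}^{\tau_1} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau.$$ (4.29)

The integral corresponding to the interval $[\tau_0, \tau_1]$ can be efficiently computed using a Gauss-Legendre quadrature. The integral corresponding to the interval $[0, \tau_0]$ is evaluated with a specialized recurrence based on the technique used by Gustafsson (see [9], Section 4.2), described below. Recall that $F_1(\tau)$ is smooth, given by the formula

$$F_1(\tau) = \frac{e^{i\kappa\sqrt{1-\alpha\tilde\gamma_1(\tau)}}}{\sqrt{1+\tilde\gamma_1(\tau)}}\, T_m(\tilde\gamma_1(\tau)),$$ (4.30)

where $\tilde\gamma_1(\tau)$ is

$$\tilde\gamma_1(\tau) = \tau^4 + 2i\beta_-\tau^2 + 1.$$ (4.31)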

We expand $F_1(\tau)$ in $k$ terms of its Taylor series about the point τ = 0, given by the formula

$$F_1(\tau) \approx \sum_{n=0}^{k} a_n \tau^n.$$ (4.32)

We compute the coefficients $a_n$ by first forming a Legendre expansion of $F_1(\tau)$ on the interval $[0, \tau_0]$, and then repeatedly spectrally differentiating this expression as described in Section 2.8.
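The coefficient extraction described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' Fortran implementation: the helper name `taylor_coeffs_via_legendre`, the node count, and the use of NumPy's `Legendre.deriv` in place of the spectral differentiation of Section 2.8 are all choices made for this example.

```python
import math
import numpy as np
from numpy.polynomial import legendre as leg

def taylor_coeffs_via_legendre(f, tau0, k, n_nodes=12):
    """Approximate a_0..a_k with f(tau) ~ sum_n a_n tau^n near tau = 0.

    Hypothetical helper: builds a Legendre expansion of f on [0, tau0]
    and differentiates it repeatedly at tau = 0.
    """
    x, w = leg.leggauss(n_nodes)                    # Gauss-Legendre rule on [-1, 1]
    fx = f(0.5 * tau0 * (x + 1.0))                  # samples of f on [0, tau0]
    vander = leg.legvander(x, n_nodes - 1)          # vander[i, j] = P_j(x_i)
    norm = (2.0 * np.arange(n_nodes) + 1.0) / 2.0   # 1 / ||P_j||^2 on [-1, 1]
    coef = norm * (vander * (w * fx)[:, None]).sum(axis=0)
    series = leg.Legendre(coef, domain=[0.0, tau0])
    out = []
    for n in range(k + 1):
        d = series if n == 0 else series.deriv(n)   # n-th derivative of the expansion
        out.append(d(0.0) / math.factorial(n))      # a_n = f^(n)(0) / n!
    return np.array(out)
```

Repeated differentiation amplifies rounding error in the high-order Legendre coefficients, so in this sketch both the expansion length and the Taylor order `k` are kept small.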

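We substitute (4.32) into the integral corresponding to the interval $[0, \tau_0]$ in formula (4.29), resulting in

$$\frac{4}{\sqrt{\alpha}} \int_0^{\tau_0} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau \approx \frac{4}{\sqrt{\alpha}} \int_0^{\tau_0} \frac{\sum_{n=0}^{k} a_n \tau^n}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau = \frac{4}{\sqrt{\alpha}} \sum_{n=0}^{k} a_n \int_0^{\tau_0} \frac{\tau^n}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau.$$ (4.33)

Recall from Section 2.6 that the integral of $\tau^n$ divided by $\sqrt{a\tau^2 + b}$ has the recurrence relation

$$\int \frac{\tau^n}{\sqrt{a\tau^2 + b}}\, d\tau = \frac{\tau^{n-1}\sqrt{a\tau^2 + b}}{na} - \frac{(n-1)b}{na} \int \frac{\tau^{n-2}}{\sqrt{a\tau^2 + b}}\, d\tau,$$ (4.34)

for $n \ge 2$, where the base cases have the formulae

$$\int \frac{1}{\sqrt{a\tau^2 + b}}\, d\tau = \frac{1}{\sqrt{a}} \ln\left(\tau\sqrt{a} + \sqrt{a\tau^2 + b}\right), \qquad \int \frac{\tau}{\sqrt{a\tau^2 + b}}\, d\tau = \frac{\sqrt{a\tau^2 + b} - \sqrt{b}}{a}.$$ (4.35)

This recurrence is known to be stable when |b| < |a|. We observe that $\beta_- \ll 1$ when $\alpha \approx 1$, meaning that the recurrence given by (4.34), with a = 1 and $b = 2i\beta_-$, is stable when applied to (4.33).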
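As a hedged illustration, the following Python sketch applies the recurrence (4.34) to the definite integrals over $[0, \tau_0]$ that appear in (4.33). The function name `singular_moments` is hypothetical, the coefficient a is assumed real and positive, and principal branches of the square root and logarithm are assumed to be the appropriate ones (which holds when Im b > 0, as for $b = 2i\beta_-$).

```python
import numpy as np

def singular_moments(k, b, tau0, a=1.0):
    """Return I_n = int_0^tau0 tau^n / sqrt(a tau^2 + b) dtau for n = 0..k.

    Hypothetical helper; forward recurrence, intended for |b| < |a|.
    """
    def root(t):
        return np.sqrt(a * t * t + b + 0j)          # principal branch

    sa = np.sqrt(a)
    moments = np.zeros(k + 1, dtype=complex)
    # Base cases: antiderivatives evaluated between 0 and tau0.
    moments[0] = (np.log(tau0 * sa + root(tau0)) - np.log(root(0.0))) / sa
    if k >= 1:
        moments[1] = (root(tau0) - root(0.0)) / a
    # For n >= 2 the boundary term tau^(n-1) sqrt(...) vanishes at tau = 0.
    for n in range(2, k + 1):
        boundary = tau0 ** (n - 1) * root(tau0) / (n * a)
        moments[n] = boundary - (n - 1) * b / (n * a) * moments[n - 2]
    return moments
```

Given Taylor coefficients $a_n$ of $F_1$, the singular part of (4.29) would then be approximated by `4 / np.sqrt(alpha) * np.dot(a_coeffs, singular_moments(k, 2j * beta_minus, tau0))`, where `a_coeffs`, `alpha`, and `beta_minus` are placeholders for quantities computed elsewhere.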

Remark 4.1.

The same treatment is used to evaluate $G_{s,m}$, with one minor change. After making the variable substitution z = cos ϕ, the singular term $\sqrt{\tau^2 + 2i\beta_-}$ appears in the numerator rather than the denominator, which is handled by the recurrence described in Section 2.7.

4.3. Construction of the Quadratures to Evaluate the Integral over the Contour Cρ

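Recall from Section 4.1 that our approximation $R_m(z)$ of the mth order Chebyshev polynomial, $T_m(z)$, has the formula

$$R_m(z) = \frac{1}{2\pi i} \int_{E_\rho \setminus C_\rho} \frac{T_m(v)}{v - z}\, dv + \frac{1}{2\pi i} \sum_{i=1}^{n} \frac{T_m(v_i)}{v_i - z}\, dv_i\, w_i,$$ (4.36)

where $v_i = E_\rho(\theta_i)$, $dv_i = E_\rho'(\theta_i)$, $C_\rho$ is the region of the Bernstein ellipse $E_\rho$ between the contours $\gamma_1$ and $\gamma_2$, and $\theta_1, \ldots, \theta_n$ and $w_1, \ldots, w_n$ are the nodes and weights of a quadrature formula such that

$$\frac{1}{2\pi i} \int_{C_\rho} \frac{T_m(v)}{v - z}\, dv \approx \frac{1}{2\pi i} \sum_{i=1}^{n} \frac{T_m(v_i)}{v_i - z}\, dv_i\, w_i,$$ (4.37)

for z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$, where $E_{\rho^2}^o$ is the interior of $E_{\rho^2}$ (see Figure 5). Recall also from Section 4.1 that, by Cauchy’s integral formula,

$$T_m(z) = \frac{1}{2\pi i} \oint_{E_\rho} \frac{T_m(v)}{v - z}\, dv,$$ (4.38)

for $z \in E_\rho^o$, where $E_\rho$ is the Bernstein ellipse described in Section 4.1. Finally, recall from Section 3.2.2 that, if $\rho = M^{1/m}$, then $|T_m(z)| < M$ for all $z \in E_\rho^o$. For the sake of simplicity, we first assume that M = e, and then consider the case for general M in Sections 4.3.3 and 4.3.4.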

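We summarize the contours on which $R_m(z)$ approximates $T_m(z)$ pointwise by stating that

$$\left| \int_{C_\rho} \frac{T_m(v)}{v - z}\, dv - \sum_{i=1}^{n} \frac{T_m(v_i)\, dv_i\, w_i}{v_i - z} \right| < \epsilon,$$ (4.39)

for all z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$, where $E_{\rho^2}^o$ is the interior of $E_{\rho^2}$. Let $p_1$ and $p_2$ denote the intersections of $\gamma_1$ and $\gamma_2$ with $E_\rho$, respectively (see Figure 7). Let $\tilde C_\rho \subset C_\rho$ denote the portion of $C_\rho$ no closer than $1/m^2$ from the points $p_1$ and $p_2$, defined by

$$\tilde C_\rho = \{ z : z \in C_\rho,\ |p_1 - z| > 1/m^2,\ |p_2 - z| > 1/m^2 \}.$$ (4.40)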

Fig. 7: Splitting of the Bernstein ellipse into $\tilde C_\rho$ and $C_\rho \setminus \tilde C_\rho$ based on proximity to Gustafsson’s contours in the cos ϕ-plane.

Gustafsson’s contours are denoted as $\gamma_1$ and $\gamma_2$, and drawn in green. The points where $\gamma_1$ and $\gamma_2$ intersect $E_\rho$ are denoted as $p_1$ and $p_2$, respectively. The region of the ellipse bounded by the intersections $p_1$ and $p_2$ defines the segment $C_\rho$. The segment of $C_\rho$ not close to the points $p_1$ and $p_2$ is denoted as $\tilde C_\rho$ and drawn in red. The segments of $C_\rho$ which are close to the points $p_1$ and $p_2$ are denoted as $C_\rho \setminus \tilde C_\rho$ and drawn in blue. The remainder of the ellipse is denoted as $E_\rho \setminus C_\rho$ and drawn in black.

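We split the integral in (4.37) into integrals over $\tilde C_\rho$ and $C_\rho \setminus \tilde C_\rho$, arriving at

$$\int_{C_\rho} \frac{T_m(v)}{v - z}\, dv = \int_{\tilde C_\rho} \frac{T_m(v)}{v - z}\, dv + \int_{C_\rho \setminus \tilde C_\rho} \frac{T_m(v)}{v - z}\, dv.$$ (4.41)

The domain of integration $\tilde C_\rho$ is relatively well-separated from all values of z on which the quadrature rule in (4.39) must hold. In contrast, the domain of integration $C_\rho \setminus \tilde C_\rho$ is not well-separated.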

4.3.1. Quadratures for the Portion of Cρ Away From Gustafsson’s Contours

Recall from formula (4.40) that, by construction, $\tilde C_\rho$ is separated from $\gamma_1$ and $\gamma_2$ by $1/m^2$, and that the Bernstein ellipse $E_\rho$ is separated from the interval [−1, 1] by $1/m^2$ near ±1. If we use the smooth change of variables $\phi$ described in Section 3.3 to transform the contour $\tilde C_\rho$ into a subset [a, b] of the interval [−1, 1], and then perform an affine transformation $A$ to map [a, b] to [−1, 1], it is easy to see that any point z on [−1, 1], $\gamma_1 \cap E_{\rho^2}^o$, $\gamma_2 \cap E_{\rho^2}^o$, or $\mathbb{C} \setminus E_{\rho^2}^o$, will be mapped to a point $A \circ \phi(z)$ on or outside of the Bernstein ellipse $E_\rho$. Note that the mapping $\phi$ is smooth, and that both $\phi$ and $A$ approach the identity map for large m. It follows that

$$\int_{\tilde C_\rho} \frac{T_m(v)}{v - z}\, dv \approx \int_{-1}^{1} \frac{p(u)}{u - A \circ \phi(z)}\, du,$$ (4.42)

for some polynomial p(u) of order ≈ m, where $A \circ \phi(z)$ is on or outside of the Bernstein ellipse $E_\rho$. Thus, Corollary 3.3 tells us that the right-hand side of (4.42) is well-approximated by an O(m) point Gauss-Legendre quadrature, since the integrand of the right-hand side is well-approximated by a Chebyshev expansion with O(m) terms. From the fact that the change-of-variables mapping $u = A \circ \phi(v)$ is smooth and nearly equal to the identity for large m, it follows that

$$\int_{\tilde C_\rho} \frac{T_m(v)}{v - z}\, dv$$ (4.43)

is also well-approximated by an O(m) point Gauss-Legendre quadrature, for all z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$. Specifically, Corollary 3.3 indicates (after a mapping by $A \circ \phi$, recalling that $A \circ \phi(p_1) = 1$ and $A \circ \phi(p_2) = -1$) that there exists an O(m) point quadrature such that

$$\left| \int_{\tilde C_\rho} \frac{T_m(v)}{v - z}\, dv - \sum_{i=1}^{n} \frac{T_m(v_i)\, dv_i\, w_i}{v_i - z} \right| \lesssim \epsilon,$$ (4.44)

when z is well-separated from $p_1$ and $p_2$, and that

$$\left| \int_{\tilde C_\rho} \frac{T_m(v)}{v - z}\, dv - \sum_{i=1}^{n} \frac{T_m(v_i)\, dv_i\, w_i}{v_i - z} \right| \lesssim \frac{\epsilon}{|(z - p_1)(z - p_2)|},$$ (4.45)

when $z \approx p_1$ or $z \approx p_2$, for all z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$.

Recall that, in the integrands of (4.11) and (4.14), $R_m(z)$ is multiplied by the term H(z), which is a bounded function over the interval [−1, 1] and all of the relevant contours. Thus, for formula (4.25) to be an accurate approximation to $G_m$, the integral (4.43) only needs to be approximated to within an error ϵ in the $L^1$-sense, over z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$. From formulae (4.44) and (4.45), we see that this is indeed the case. Thus, an O(m) point quadrature for (4.43) is sufficient to compute $G_m$ with no loss of accuracy.

4.3.2. Quadratures for the Portions of Cρ Near Gustafsson’s Contours

In this section, we present the construction of a quadrature rule which approximates the contour integral

$$\int_{C_\rho \setminus \tilde C_\rho} \frac{T_m(v)}{v - z}\, dv,$$ (4.46)

for z ∈ [−1, 1], $z \in \gamma_1 \cap E_{\rho^2}^o$, $z \in \gamma_2 \cap E_{\rho^2}^o$, and $z \in \mathbb{C} \setminus E_{\rho^2}^o$, where $E_{\rho^2}^o$ is the interior of $E_{\rho^2}$. Since [−1, 1] is well-separated from $C_\rho \setminus \tilde C_\rho$, we focus only on $z \in (\gamma_1 \cap E_{\rho^2}) \cup (\gamma_2 \cap E_{\rho^2})$. Observe that $C_\rho \setminus \tilde C_\rho$ consists of two disjoint segments (see Figure 7). One segment of $C_\rho \setminus \tilde C_\rho$ contains the point $p_1$, which denotes the intersection of $\gamma_1$ (associated with z = 1) with $E_\rho$; the other segment of $C_\rho \setminus \tilde C_\rho$ contains the point $p_2$, which denotes the intersection of $\gamma_2$ (associated with z = −1) with $E_\rho$. We denote the points where $C_\rho \setminus \tilde C_\rho$ ends and $\tilde C_\rho$ begins by $\tilde p_1$ and $\tilde p_2$, where $\tilde p_1$ is the point closer to $p_1$, and $\tilde p_2$ is the point closer to $p_2$. We analyze the segment of $C_\rho \setminus \tilde C_\rho$ near $p_2$, with the understanding that the Bernstein ellipse is symmetric and an identical argument applies to the segment of $C_\rho \setminus \tilde C_\rho$ near $p_1$. We define $B_\delta$ as

$$B_\delta = \left\{ z \in \mathbb{C} : |\mathrm{Arg}(z)| \ge \frac{\pi}{6},\ |z| \le \delta \right\}.$$ (4.47)

Recall from Section 3.6 that $\gamma_2$, in the vicinity of $p_2$, always lies in $p_2 + \hat B_{1/m^2}$, where $\hat B_{1/m^2}$ is a rotated version of $B_{1/m^2}$, such that the opening in $B_{1/m^2}$ is bisected by $C_\rho$ (see Figure 8). We note that, when z is in one of the regions of interest but outside of $B_{1/m^2}$, it is sufficiently well-separated from the domain of integration $C_\rho \setminus \tilde C_\rho$, so that a Gauss-Legendre quadrature with O(1) terms accurately approximates the integral (4.46). Hence, for the remainder of this section we exclusively focus on developing a quadrature rule which approximates (4.46) for $z \in p_2 + \hat B_{1/m^2}$.

Fig. 8: Region $p_2 + \hat B_{1/m^2}$ in which the quadrature in formula (4.37) must accurately evaluate the integral over the contour $C_\rho \setminus \tilde C_\rho$ for $z \in \gamma_2$.

The values of z for which the quadrature must be accurate are the points in the interior of the shaded region denoted by $p_2 + \hat B_{1/m^2}$, whose boundary is drawn in red. Note that the angle that $\hat B_{1/m^2}$ makes with $C_\rho \setminus \tilde C_\rho$ is π/6 from above and π/6 from below. Gustafsson’s contour beginning at z = −1 is denoted $\gamma_2$, and is drawn in green. The intersection of $\gamma_2$ with the Bernstein ellipse is denoted by $p_2$. The Bernstein ellipse, $E_\rho$, is drawn as three contiguous segments. The left segment, colored black, corresponds to the part of the Bernstein ellipse which is not in $C_\rho$, and is denoted $E_\rho \setminus C_\rho$. The middle segment, denoted $C_\rho \setminus \tilde C_\rho$, is drawn in blue. The right segment, colored black, corresponds to $\tilde C_\rho$. The point where $C_\rho \setminus \tilde C_\rho$ ends and where $\tilde C_\rho$ begins is denoted by $\tilde p_2$.

For convenience, we rotate, translate, and rescale $C_\rho \setminus \tilde C_\rho$ and $p_2 + \hat B_{1/m^2}$ (see Figure 9), so that the segment $C_\rho \setminus \tilde C_\rho$ is approximated by the interval [0, 1] (i.e., it is translated by $p_2$, rotated, and scaled by a factor of $m^2$). Likewise, $\gamma_2'$ represents a similarly translated, rotated, and scaled copy of $\gamma_2$. Note that we associate $p_2$ with the point x = 0 and the point $\tilde p_2$ with x = 1 (see Figure 9). Consider a quadrature rule $x_1, \ldots, x_n$ and $w_1, \ldots, w_n$ such that

$$\left| \int_0^1 \frac{\rho(x)}{x - z}\, dx - \sum_{i=1}^{n} \frac{\rho(x_i)}{x_i - z}\, w_i \right| < \epsilon$$ (4.48)

for all $z \in B_1$, where ρ(x) is smooth. Such a quadrature rule, if used to approximate (4.46), will be accurate to precision ϵ, for all $z \in p_2 + \hat B_{1/m^2}$. However, recall that, in the integrands of (4.11) and (4.14), $R_m(z)$ is multiplied by the term H(z), which is a bounded function over the interval [−1, 1] and all of the relevant contours. Thus, for formula (4.25) to be an accurate approximation to $G_m$, the integral (4.46) only needs to be approximated to within an error ϵ in the $L^1$-sense, over $z \in \gamma_1 \cap E_{\rho^2}^o$ and $z \in \gamma_2 \cap E_{\rho^2}^o$. In the rotated and rescaled coordinates, this means that our quadrature rule for the integral in (4.48) must be accurate to within an error ϵ in the $L^1$-sense, over $z \in \gamma_2' \cap B_1$, where $\gamma_2'$ starts at $z_1 \in \partial B_1$ and ends at $z_2 \in \partial B_1$, with $|z_1| = |z_2| = 1$ (see Figure 9). Thus, the left-hand side of (4.48) has to be bounded by ϵ only in $L^1(\gamma_2' \cap B_1)$, meaning that the integral and the quadrature approximation in (4.48) can disagree on a set of measure ϵ.

Fig. 9: Rescaling and rotation of region of interest depicted in Figure 8.

Region $B_1$ is a translation, rotation, and rescaling of $\hat B_{1/m^2}$ such that $B_1$ has radius 1. The values of z for which the quadrature must be accurate are the points in the interior of the shaded region $B_1$.

This allows us to relax (4.48) to the condition

$$\left| \int_0^1 \frac{\rho(x)}{x - z}\, dx - \sum_{i=1}^{n} \frac{\rho(x_i)}{x_i - z}\, w_i \right| < \frac{\epsilon}{|z|},$$ (4.49)

for $z \in B_1$. Thus, for each δ > 0, the quadrature is accurate to within an error ϵ/δ for all $z \in \partial B_\delta$. Since the length of $\gamma_2' \cap B_\delta$ is on the order δ, the corresponding $L^1$ error in the quadrature rule is $\delta \cdot \epsilon/\delta = \epsilon$.

We can construct this quadrature by first sampling $z_i \in \partial B_1$, and then computing a generalized Gaussian quadrature (see [4]) on x ∈ [0, 1], where (4.49) is enforced on all the sampled $z_i$’s. By Cauchy’s theorem, if (4.49) holds on $\partial B_1$, then it will also hold on $B_1$. However, this still results in a quadrature rule with several hundred nodes. It turns out that far fewer nodes can be used, due to the following observation.

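Recall that, in the integrands of (4.11) and (4.14), $R_m(z)$ is multiplied by the term H(z), given by

$$H(z) = \frac{e^{i\kappa\sqrt{1-\alpha z}}}{\sqrt{1-\alpha z}\,\sqrt{1-z^2}}.$$ (4.50)

Because H(z) is smooth near $z = p_2$, we only need that

$$\left| \int_{\gamma_2' \cap B_1} \sigma(z) \int_0^1 \frac{\rho(x)}{x - z}\, dx\, dz - \int_{\gamma_2' \cap B_1} \sigma(z) \sum_{i=1}^{n} \frac{\rho(x_i)}{x_i - z}\, w_i\, dz \right| < \epsilon,$$ (4.51)

for all sufficiently smooth functions σ(z). Since σ(z) is smooth, it can be represented by a Taylor series of a small order k, so that

$$\sigma(z) \approx \sum_{j=0}^{k} a_j z^j.$$ (4.52)

Thus, inequality (4.51) becomes

$$\left| \int_{\gamma_2' \cap B_1} z^j \int_0^1 \frac{\rho(x)}{x - z}\, dx\, dz - \int_{\gamma_2' \cap B_1} z^j \sum_{i=1}^{n} \frac{\rho(x_i)}{x_i - z}\, w_i\, dz \right| < \epsilon,$$ (4.53)

for each j = 0, 1, … , k. Exchanging the order of integration,

$$\left| \int_0^1 \rho(x) \int_{\gamma_2' \cap B_1} \frac{z^j}{x - z}\, dz\, dx - \sum_{i=1}^{n} \rho(x_i) \int_{\gamma_2' \cap B_1} \frac{z^j}{x_i - z}\, dz\, w_i \right| < \epsilon.$$ (4.54)

Recall from Section 2.9 that

$$\int_{\gamma_2' \cap B_1} \frac{z^j}{x - z}\, dz = \phi(x) + \psi(x) \log\left( \frac{x - z_1}{x - z_2} \right),$$ (4.55)

where ϕ and ψ are polynomials of order j, and $z_1$ and $z_2$ are the endpoints of $\gamma_2' \cap B_1$. Due to the geometry of $B_1$, we have that

$$|z_1 - x| \ge \frac{1}{2}, \qquad |z_2 - x| \ge \frac{1}{2},$$ (4.56)

for all x ∈ [0, 1]. We also observe that the branch cut of

$$\log\left( \frac{x - z_1}{x - z_2} \right)$$ (4.57)

does not intersect the interior of [0, 1], so (4.57) is smooth on [0, 1]. Since ρ(x) is smooth and (4.57) is smooth, we observe that the integrand in (4.54), given by

$$\rho(x) \int_{\gamma_2' \cap B_1} \frac{z^j}{x - z}\, dz,$$ (4.58)

is a smooth function of x for x ∈ [0, 1]. Hence, a Gauss-Legendre quadrature with O(1) points will satisfy (4.54).

Because (4.54) is satisfied, (4.51) is satisfied, and so the contour deformation argument presented in Section 4.1.1 can be carried out without change, using a Gauss-Legendre quadrature with O(1) points on $C_\rho \setminus \tilde C_\rho$ and O(m) points on $\tilde C_\rho$ (see Section 4.3.1).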

4.3.3. The Error in the Approximation Rm(z)

In order to derive the approximation (4.25) to the Green’s function $G_m$, we approximated the Chebyshev polynomial $T_m(z)$ by the function $R_m(z)$, defined by (4.22) (see also (4.11) and (4.14)). Recall from Section 3.2.2 that, when $\rho = M^{1/m}$, we have that $|T_m(z)| \le M$ for all $z \in E_\rho$. If the formula for $R_m(z)$ is evaluated numerically, then the integrand and summand in that formula will both have size approximately M, while the sum, $R_m(z)$, will have size approximately one for z ∈ [−1, 1]. Thus, due to cancellation error, $|R_m(z) - T_m(z)| \approx M\epsilon$ for all z ∈ [−1, 1], where ϵ is equal to machine precision. This means that, for $\rho = M^{1/m}$, the approximation for $G_m$ given by formula (4.25) has an error of Mϵ.
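The cancellation effect can be observed with a small numerical experiment. The Python sketch below is an assumption-laden illustration rather than the paper's quadrature: it discretizes Cauchy’s integral formula for $T_m$ over the entire Bernstein ellipse with the periodic trapezoid rule in the ellipse parameter θ (instead of the Gauss-Legendre rule on $C_\rho$ used in the algorithm), and the function name `cauchy_chebyshev` is hypothetical. It uses the identity $T_m((w + 1/w)/2) = (w^m + w^{-m})/2$ on the ellipse.

```python
import numpy as np

def cauchy_chebyshev(z, m, M, n):
    """Approximate T_m(z) for z in [-1, 1] by Cauchy's formula on E_rho, rho = M**(1/m).

    Hypothetical helper: n-point trapezoid rule in the ellipse parameter theta.
    """
    rho = M ** (1.0 / m)
    theta = 2.0 * np.pi * np.arange(n) / n
    w = rho * np.exp(1j * theta)
    v = 0.5 * (w + 1.0 / w)               # E_rho(theta)
    dv = 0.5j * (w - 1.0 / w)             # derivative of E_rho with respect to theta
    t_m = 0.5 * (w ** m + w ** (-m))      # T_m on the ellipse, size roughly M / 2
    integral = np.sum(t_m * dv / (v - z)) * (2.0 * np.pi / n)
    return integral / (2j * np.pi)
```

Comparing `cauchy_chebyshev(z, m, M, 10 * m)` against `np.cos(m * np.arccos(z))` for several values of M gives an empirical view of how the attainable accuracy degrades as M grows, since terms of size about M are summed to produce a result of size about one.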

4.3.4. The Number of Quadrature Nodes on C˜ρ

In Section 4.3.1, we showed that O(m) nodes are required on $\tilde C_\rho$ by pointing out that the distance from $\tilde C_\rho$ to the nearest pole in the integrand of (4.43) is $1/m^2$ at its endpoints and ≈ 1/m in the middle. We then used Corollary 3.3 to state that the number of terms required to expand the integrand of (4.43) in Chebyshev polynomials is O(m), which means that O(m) nodes are needed in the corresponding quadrature formula.

In fact, Corollary 3.3 provides a quantitative bound on how many terms are required. We observe that $|T_m(z)| < M$ for all $z \in E_\rho^o$, and that the minimum attainable error in the evaluation of (4.43) is Mϵ. Replacing ϵ with Mϵ and L with M in Corollary 3.3, we find that the number of Chebyshev expansion terms required to approximate the integrand of (4.43) to precision Mϵ is

$$k_0 \approx m\, \frac{-\log(\epsilon/2)}{\log(M)}.$$ (4.59)

Thus, $O(k_0/2) = O(m)$ Gauss-Legendre nodes are required on $\tilde C_\rho$. The parameter M allows for a tradeoff between the number of quadrature nodes and the error in the approximation. This is illustrated for various values of M in Tables 1 and 2, for $\epsilon \approx 10^{-16}$ and $\epsilon \approx 10^{-34}$, respectively.

Table 1: The required number of Gauss-Legendre nodes on $\tilde C_\rho$ to approximate (4.43), in double precision.

In this table, $k_0/2$ is the required number of nodes, $\epsilon = 10^{-16}$, and $\rho = M^{1/m}$.

M          Mϵ           k0        k0/2
10         10^{-15}     16.3m     8.15m
100        10^{-14}     8.15m     4.08m
10^3       10^{-13}     5.43m     2.72m
10^6       10^{-10}     2.72m     1.36m
10^9       10^{-7}      1.81m     0.91m
10^12      10^{-4}      1.36m     0.68m

Table 2: The required number of Gauss-Legendre nodes on $\tilde C_\rho$ to approximate (4.43), in quadruple precision.

In this table, $k_0/2$ is the required number of nodes, $\epsilon = 10^{-34}$, and $\rho = M^{1/m}$.

M          Mϵ           k0        k0/2
10         10^{-33}     34.3m     17.15m
100        10^{-32}     17.15m    8.58m
10^3       10^{-31}     11.43m    5.72m
10^6       10^{-28}     5.72m     2.86m
10^9       10^{-25}     3.81m     1.91m
10^12      10^{-22}     2.86m     1.43m
10^15      10^{-19}     2.29m     1.14m

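The entries of Tables 1 and 2 follow directly from formula (4.59). As a quick check, the short Python sketch below evaluates $k_0/m$ for a given M and ϵ; the function name `chebyshev_terms_per_mode` is hypothetical and natural logarithms are assumed (the ratio is independent of the base).

```python
import math

def chebyshev_terms_per_mode(M, eps):
    """Return k0 / m from (4.59): the number of Chebyshev terms per Fourier mode."""
    return -math.log(eps / 2.0) / math.log(M)

# Rows of Table 1 (double precision); the node count per mode is k0 / (2 m).
for M in (1e1, 1e2, 1e3, 1e6, 1e9, 1e12):
    k0_over_m = chebyshev_terms_per_mode(M, 1e-16)
    print(f"M = {M:.0e}: k0 = {k0_over_m:.2f} m, k0/2 = {k0_over_m / 2:.2f} m")
```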
Remark 4.2.

In Section 4.3.2, we demonstrate that only O(1) nodes are required on Cρ\C˜ρ. We observe that, in practice, we can place a single O(m) Gauss-Legendre quadrature with k0/2 nodes on the entire contour Cρ, rather than placing two O(1) quadratures on each part of Cρ\C˜ρ and one O(m) quadrature on C˜ρ. We also note that, in practice, the minimum number of quadrature nodes required to achieve the accuracy M ϵ matches the estimates in Tables 1 and 2 very closely.

Remark 4.3.

We note that integrating (4.43) with an O(m) point Gauss-Legendre quadrature with respect to arc length requires an arc length parameterization of the contour $C_\rho$, which is a section of the Bernstein ellipse $E_\rho$. Such arc length parameterizations are given by incomplete elliptic integrals, and are not available analytically, although accurate and efficient algorithms are available. Since the precise locations of the quadrature nodes on $C_\rho$ depend on the intersection points $p_1$ and $p_2$ of Gustafsson’s contours with the Bernstein ellipse $E_\rho$, and since these points change with each choice of the parameters κ, m, and α (see Section 1), the evaluation of such elliptic integrals becomes computationally expensive even with an efficient algorithm. It turns out that it is possible to use a Gauss-Legendre quadrature with respect to the ellipse parameter θ (see formula (2.8)), instead of arc length. Applying a quadrature rule with respect to the ellipse parameter turns out to be only slightly suboptimal in terms of the required number of nodes. For M = 100, an error of $10^{-14}$ is attained by 5m quadrature nodes with respect to θ, instead of the expected 4m nodes with respect to arc length, and an error of $10^{-32}$ is attained by 11m quadrature nodes with respect to θ, instead of the expected 8.6m nodes with respect to arc length (see Tables 1 and 2).

4.4. Extension to Complex Wavenumber

In Section 4.1.2, we presented an algorithm which evaluates $G_m$ for real-valued wavenumber by summing the contributions from three components: the two steepest descent contours $\gamma_1$ and $\gamma_2$ (i.e., Gustafsson’s contours), and the connecting segment, denoted by $C_\rho$ (see formula (4.25)). Our approach ultimately sums these three contributions using O(m) quadrature nodes. To evaluate $G_m$ for complex-valued wavenumber, we replace the steepest descent contours for real wavenumber with the steepest descent contours for complex wavenumber, constructed in Section 3.1.1 (see formulae (3.13) and (3.14)). The geometry of these contours also allows us to evaluate $G_m$ with O(m) nodes. However, unlike the real case, the lengths of the three segments, $\gamma_1$, $\gamma_2$, and $C_\rho$, vary with the wavenumber’s complex argument, and, therefore, so does the allocation of the O(m) quadrature nodes among the three segments. In this section, we characterize how $\gamma_1$, $\gamma_2$, and $C_\rho$ change with complex-valued κ. We briefly comment on the intersection of these contours with the Bernstein ellipse. Lastly, we demonstrate that one of the two contours approaches a singularity when the wavenumber is almost purely imaginary. For this case, we replace the three contours with a different set of contours, which is well-separated from the singularity and can also be evaluated with O(m) nodes.

4.4.1. Behavior of the Steepest Descent Contours and Connecting Contour for Complex Wavenumber

Recall from Section 3.1.1 (see formulae (3.13) and (3.14)) that the steepest descent contours for the numerator of the spherical-wave term with complex wavenumber can be parameterized as

$$\gamma_1(s) = e^{-2i\phi} s^2 + 2i\beta_- e^{-i\phi} s + 1,$$ (4.60)
$$\gamma_2(s) = e^{-2i\phi} s^2 + 2i\beta_+ e^{-i\phi} s - 1.$$ (4.61)

Recall also from Section 1.2 that, with the negative time-harmonic convention, the wavenumber for the retarded modal Green’s function is in quadrant IV of the complex plane (i.e., $-\pi/2 \le \phi \le 0$). It is easy to see from formulae (4.60) and (4.61) that, as $\phi \to -\pi/2$, both contours are rotated counterclockwise compared to the contours associated with real wavenumber (compare Figure 1 and Figure 10). Unlike the steepest descent contours for real wavenumber, for complex wavenumber, the intersection of $\gamma_1$ with the Bernstein ellipse may occur anywhere on the Bernstein ellipse in the upper half of the complex plane. Consequently, the arclengths of $\gamma_1$ and of the connecting contour, $C_\rho$, vary with ϕ, with the arclength of $C_\rho$ approaching zero in the limit $\phi \to -\pi/2$.

Fig. 1: Phase plot of the numerator of the spherical-wave term in the z = cos ϕ-plane.

A phase plot of the function $\exp(i\kappa\sqrt{1-\alpha z})$ with the parameters β = 0.95 and κ = 45 is shown. The interval [−1, 1] and the steepest descent contours $\gamma_1$ and $\gamma_2$ are superimposed on the plot. Note that the branch cut in the term $\sqrt{1-\alpha z}$ is visible on the right of the figure. The distance from the point z = 1 to the branch cut is equal to $\beta^2$.

Fig. 10: Phase plot of the numerator of the spherical-wave term in the z = cos ϕ-plane for complex wavenumber.

A phase plot of the function $\exp(i\kappa\sqrt{1-\alpha z})$ with the parameters β = 0.95 and κ = 45 exp(−i0.44π) is shown. The interval [−1, 1] and the steepest descent contours $\gamma_1$ and $\gamma_2$ are superimposed on the plot. Note that the branch cut in the term $\sqrt{1-\alpha z}$ is visible on the right of the figure. The distance from the point z = 1 to the branch cut is equal to $\beta^2$. The angle that the branch cut makes with the real axis is equal to −2ϕ.

4.4.2. Intersection of the Steepest Descent Contours for Complex Wavenumber with the Bernstein Ellipse

For real wavenumber, we are able to solve for the steepest descent contours’ intersections with the Bernstein ellipse by using the quadratic formula. However, for general complex wavenumber, the steepest descent contours’ intersections with the Bernstein ellipse can only be found by solving a quartic equation. Rather than implement the quartic formula, we apply Newton’s algorithm to find the intersection points. An appropriate initialization can be found by approximating the steepest descent contour by a line, and then finding the intersection of this linear approximation with the Bernstein ellipse by using the quadratic formula. A convenient linearization is as follows.

When $2\beta_- < 1$, we approximate the contours (4.60) and (4.61) as

$$\gamma_1(s) \approx e^{-2i\phi} s^2 + 1,$$ (4.62)
$$\gamma_2(s) \approx e^{-2i\phi} s^2 - 1.$$ (4.63)

When $2\beta_- \ge 1$, we approximate the contours as

$$\gamma_1(s) \approx 2i\beta_- e^{-i\phi} s + 1,$$ (4.64)
$$\gamma_2(s) \approx 2i\beta_+ e^{-i\phi} s - 1.$$ (4.65)

With this initialization, Newton’s method converges in under 15 iterations for all possible choices of complex wavenumber and source-to-target distance. To solve for the intersection in terms of the Bernstein ellipse parameter, we use the inverse Joukowski transformation (see Section 2.2, formula (2.3)).
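A minimal Python sketch of this procedure is given below, under several assumptions that are not taken from the paper's code: the contours are taken in the form $e^{-2i\phi}s^2 + 2i\beta e^{-i\phi}s \pm 1$ of (4.60) and (4.61), the ellipse is written implicitly as $(x/a)^2 + (y/b)^2 = 1$ with semi-axes $a = (\rho + 1/\rho)/2$ and $b = (\rho - 1/\rho)/2$ (so that the intersection condition is a quartic in s), and the function name and stopping rule are hypothetical.

```python
import numpy as np

def intersect_with_bernstein_ellipse(beta, phi, rho, endpoint=1.0, max_iter=50):
    """Find s > 0 where exp(-2i phi) s^2 + 2i beta exp(-i phi) s + endpoint meets E_rho.

    Hypothetical helper. Use endpoint=+1 with beta_- for gamma_1 and
    endpoint=-1 with beta_+ for gamma_2. Returns (s, z, theta, |w|), where
    w is the inverse Joukowski image of z and theta = arg(w).
    """
    a = 0.5 * (rho + 1.0 / rho)                      # semi-major axis
    b = 0.5 * (rho - 1.0 / rho)                      # semi-minor axis
    quad = np.exp(-2j * phi)
    lin = 2j * beta * np.exp(-1j * phi)

    # Initial guess: keep only one term of the contour, so that z = endpoint + d*u
    # is a straight line, and intersect that line with the ellipse (quadratic in u).
    use_quadratic = 2.0 * beta < 1.0
    d = quad if use_quadratic else lin
    A = (d.real / a) ** 2 + (d.imag / b) ** 2
    B = 2.0 * endpoint * d.real / a ** 2
    C = (endpoint / a) ** 2 - 1.0                    # negative, since a > 1
    u = (-B + np.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)
    s = np.sqrt(u) if use_quadratic else u

    # Newton iteration on g(s) = (Re z / a)^2 + (Im z / b)^2 - 1.
    for _ in range(max_iter):
        z = quad * s * s + lin * s + endpoint
        dz = 2.0 * quad * s + lin
        g = (z.real / a) ** 2 + (z.imag / b) ** 2 - 1.0
        dg = 2.0 * z.real * dz.real / a ** 2 + 2.0 * z.imag * dz.imag / b ** 2
        step = g / dg
        s -= step
        if abs(step) < 1e-15:
            break

    z = quad * s * s + lin * s + endpoint
    w = z + np.sqrt(z - 1.0) * np.sqrt(z + 1.0)      # inverse Joukowski map, |w| >= 1
    return s, z, np.angle(w), abs(w)
```

A point lies on $E_\rho$ exactly when its inverse Joukowski image has modulus ρ, which gives a convenient check on the returned intersection.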

4.4.3. The Number of Quadrature Nodes Needed for Complex Wavenumber

When the wavenumber is complex, the lengths of the contours within the Bernstein ellipse change with the argument of the complex wavenumber (see formulae (4.60) and (4.61)), meaning that the number of times the integrand oscillates on each contour varies with the argument of the complex wavenumber. Thus, the number of nodes in their respective quadrature rules must also vary. We now demonstrate that the total number of quadrature nodes remains O(m) when the wavenumber is complex, and estimate the number of nodes required to resolve each contour integral in terms of the intersections of the contours with the Bernstein ellipse. Observe that, along the steepest descent contour with respect to the spherical-wave term, all oscillations in the integrand of (4.25) arise from the mth-order Chebyshev polynomial term, $T_m(z)$. Let $E_\rho(\theta_1)$ be the intersection of $\gamma_1$ with the Bernstein ellipse, and $E_\rho(\theta_2)$ be the intersection of $\gamma_2$ with the Bernstein ellipse. Consider an extension to $\gamma_1$ which includes a short segment connecting $\gamma_1(0)$ to the point $E_\rho(0)$, and an extension to $\gamma_2$ which includes a segment connecting $\gamma_2(0)$ to the point $E_\rho(\pi)$, which we denote as $\gamma_1^e$ and $\gamma_2^e$, respectively (see Figure 11). From the Taylor expansion of the major axis of the Bernstein ellipse in terms of the Fourier mode m, the point $E_\rho(0)$ can be shown to be $O(1/m^2)$ from z = 1, and by symmetry, $E_\rho(\pi)$ is $O(1/m^2)$ from z = −1. Consequently, on the segment from z = 1 to $E_\rho(0)$, $T_m(z)$ oscillates at most once, and likewise, $T_m(z)$ oscillates at most once from z = −1 to $E_\rho(\pi)$. Observe that $\gamma_1^e$ intersects the Bernstein ellipse at $E_\rho(\theta_1)$ and $E_\rho(0)$, and $\gamma_2^e$ intersects the Bernstein ellipse at $E_\rho(\theta_2)$ and $E_\rho(\pi)$. It can be shown that the number of times $T_m(z)$ oscillates on the extended contour $\gamma_1^e$ within the Bernstein ellipse is at most twice the number of oscillations of $T_m(z)$ along the Bernstein ellipse beginning and ending at the intersection points of $\gamma_1^e$ with the Bernstein ellipse (i.e., $E_\rho(\theta)$ from 0 to $\theta_1$). Thus, because $\gamma_1 \subset \gamma_1^e$, the number of oscillations on $\gamma_1$ is bounded by $2m\theta_1/\pi$; when κ is in quadrant IV, this bound can be further improved to $m\theta_1/\pi$. A similar argument shows that the number of oscillations on $\gamma_2$ is bounded by $2m(\pi - \theta_2)/\pi$, and when κ is in quadrant IV, this can be further improved to $m(\pi - \theta_2)/\pi$. Lastly, $T_m(z)$ on the connecting contour oscillates exactly $m(\theta_2 - \theta_1)/\pi$ times. Hence, the three contours $\gamma_1$, $\gamma_2$, and $C_\rho$ are resolved with a total of O(m) nodes, with nodes allocated in proportion to their respective numbers of oscillations. It turns out that when the total number of nodes over these three contours matches the required number of nodes estimated in Section 4.3.4 and Remarks 4.2–4.3, our method achieves full accuracy for complex wavenumber.
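The proportional allocation described above can be illustrated with a short Python sketch. It assumes the quadrant IV oscillation bounds $m\theta_1/\pi$, $m(\pi - \theta_2)/\pi$, and $m(\theta_2 - \theta_1)/\pi$, which sum to m; the function name `allocate_nodes`, the rounding rule, and the minimum node count are hypothetical choices rather than details taken from the paper.

```python
import math

def allocate_nodes(m, theta1, theta2, total_nodes, min_nodes=2):
    """Split total_nodes among (gamma_1, gamma_2, C_rho) in proportion to oscillation counts."""
    osc_gamma1 = m * theta1 / math.pi               # bound on oscillations along gamma_1
    osc_gamma2 = m * (math.pi - theta2) / math.pi   # bound on oscillations along gamma_2
    osc_connect = m * (theta2 - theta1) / math.pi   # oscillations along C_rho
    total_osc = osc_gamma1 + osc_gamma2 + osc_connect
    return tuple(
        max(min_nodes, round(total_nodes * osc / total_osc))
        for osc in (osc_gamma1, osc_gamma2, osc_connect)
    )
```

For example, with m = 50, intersection angles $\theta_1 = 0.1\pi$ and $\theta_2 = 0.9\pi$, and 5m = 250 nodes in total, the connecting contour receives four fifths of the nodes.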

Fig. 11: The intersections of the extended contours $\gamma_1^e$ and $\gamma_2^e$ with the Bernstein ellipse.

The extended steepest descent contours $\gamma_1^e$ and $\gamma_2^e$, for a complex wavenumber, are plotted in green, together with their intersection points with the Bernstein ellipse, $E_\rho$. The connecting contour, $C_\rho$, is plotted in blue. A circle of radius $a_\rho - 1$ centered at z = 1 is plotted with a dotted circumference.

Recall from Section 4.2 that the integrand associated with γ1 has a 1/τ singularity in the limit as β approaches zero (see formula 4.29). For real wavenumber, we address this by splitting the integral over γ1 into a singular part and a smooth part. We then evaluate the smooth part with a Gauss-Legendre quadrature rule and the singular part with a specialized recurrence used by Gustafsson [9], which altogether is accomplished with O(1) nodes. For complex wavenumber, a similar approach can be taken, except that special care is necessary to ensure that the domain of integration of the singular part has O(1) oscillations, and thus, can be evaluated with O(1) nodes.

Let $[0, \tau_0]$ be the domain of integration for the singular integral and $[\tau_0, \tau_1]$ be the domain of integration for the smooth integral, where $\tau_1$ is the value of the contour parameter such that $\gamma_1(\tau_1)$ intersects the Bernstein ellipse. The following heuristic provides a robust method to select $\tau_0$ such that $T_m(z)$ oscillates at most once on the contour $\gamma_1(\tau)$ from zero to $\tau_0$. Recall that $T_m(z)$ oscillates at most once in the disc of radius $a_\rho - 1 \approx 1/m^2$ centered at z = 1 (see Figure 11). Then, we choose $\tau_0$ such that $|\gamma_1(\tau_0) - 1| = a_\rho - 1$. This solution is easily approximated by using formulae (4.62) and (4.63) when $2\beta_- < 1$, and by using formulae (4.64) and (4.65) when $2\beta_- \ge 1$. With this choice of $\tau_0$, $T_m(z)$ oscillates at most once, meaning it is resolved with O(1) nodes.

4.4.4. Avoiding the Singularity Associated with γ1 for Nearly-Imaginary Wavenumber

Recall from Section 3.8 that, after reparameterizing Gustafsson’s contours and splitting the integral into a $\gamma_1$ term and a $\gamma_2$ term, the integral associated with $\gamma_1$ has a singularity at z = −1 (see formula (3.83)). When the wavenumber is real, $\gamma_1$ is well-separated from the singularity. In the limit when the wavenumber approaches a purely imaginary value (i.e., $\phi \to -\pi/2$), $\gamma_1$ approaches the point z = −1 (compare Figure 1 and Figure 10). Therefore, when evaluating $G_m$ for almost purely imaginary wavenumber, integration along $\gamma_1$ loses accuracy if $H_w(z)$ is not small prior to reaching the singularity. Observe that $H_w(z)|_{z=1} \sim e^{i\kappa\sqrt{1-\alpha}}$, and that $H_w(z)|_{z\approx -1} \sim e^{i\kappa\sqrt{1+\alpha}}$. Hence,

$$\frac{H_w(z)|_{z\approx -1}}{H_w(z)|_{z=1}} \sim e^{i\kappa\left(\sqrt{1+\alpha} - \sqrt{1-\alpha}\right)}.$$ (4.66)

When the right-hand side of formula (4.66) is less than machine epsilon, $H_w(z)$ near z = −1 does not contribute to the integral, so the contour can be truncated before reaching the singularity without loss of accuracy. When the right-hand side of formula (4.66) is not negligible, integrating on the steepest descent contour for complex wavenumber results in loss of accuracy due to the singularity at z = −1. However, observe that in this case $\kappa(\sqrt{1+\alpha} - \sqrt{1-\alpha})$ is small, meaning that the spherical-wave term oscillates slowly on any contour of the form in formula (4.60). Therefore, the integral along any reasonable contour which is well-separated from the singularity may be resolved with O(m) nodes. Hence, when the right-hand side of formula (4.66) is not negligible, rather than integrate on the steepest descent contour for complex wavenumber, we simply use a different contour; we choose the steepest descent contour associated with Re κ, which is well-separated from the singularity for all possible κ and β.

4.4.5. Bounding the Angle of the Intersection of the Steepest Descent Contours with the Bernstein Ellipse

The proof in Section 4.3.2, showing that only O(m) nodes are required to compute Gm, relied on the fact that the angle of intersection between the steepest descent contour and the Bernstein ellipse is bounded from below. For complex wavenumber in quadrant IV, it is also possible to show that the angle of intersection of the contours with the ellipse is bounded from below. Interestingly, for the case of complex wavenumber in quadrant I, there exist combinations of wavenumbers and Fourier modes m for which the intersection is oblique.

4.5. Summary of the Algorithm

Recall from Section 1.2 that $G_m$ is a function of κ, m, and α. Recall also that α can be determined from β (see formula (1.19)), and vice versa. We thus consider $G_m$ as a function of κ, m, and β. We compute $G_m$ as follows. Recall from Section 4.2 the formula for $G_m$,

$$G_m \approx \frac{4}{\sqrt{\alpha}} \int_0^{\tau_1} \frac{F_1(\tau)}{\sqrt{\tau^2 + 2i\beta_-}}\, d\tau - \frac{4i}{\sqrt{\alpha}} \int_0^{\tau_2} \frac{F_2(\tau)}{\sqrt{\tau^2 + 2i\beta_+}}\, d\tau - \sum_{i=1}^{n} \frac{e^{i\kappa\sqrt{1-\alpha v_i}}}{\sqrt{1-\alpha v_i}\,\sqrt{1-v_i^2}}\, T_m(v_i)\, dv_i\, w_i,$$ (4.67)

where $F_1(\tau)$ and $F_2(\tau)$ are smooth functions corresponding to the $\gamma_1$ and $\gamma_2$ contours, respectively defined by (3.91) and (3.95), $\tau_1$ and $\tau_2$ are positive parameters such that $\gamma_1(\tau_1)$ and $\gamma_2(\tau_2)$ are the respective intersections of $\gamma_1$ and $\gamma_2$ with $E_\rho$, and $T_m$ is the mth order Chebyshev polynomial. Recall from Section 3.6 that both $\gamma_1 \cap E_\rho^o$ and $\gamma_2 \cap E_\rho^o$ have length $\approx 1/m^2$. Hence, $T_m(z)$ oscillates at most once along each contour. By construction, on Gustafsson’s contours (see Section 3.1.1), the numerator of the spherical-wave term in the integrand does not oscillate. Hence, the entire integrand oscillates at most once. By the argument in Section 4.2, the integrand associated with the $\gamma_2$ contour is always smooth and hence can be evaluated with an O(1) point Gauss-Legendre quadrature.

The integrand associated with the contour $\gamma_1$ has a singularity for $\beta \to 0$ (i.e., when the source and target are close). For this case, we follow the method in Section 4.2 and evaluate the portion near the singularity by expanding the function $F_1(\tau)$ into its Taylor series, and then use the recurrence described in Section 4.2.1. Due to the smoothness of $F_1(\tau)$, this part of the integral is computed with an O(1) cost. The remainder of the integral is smooth and oscillates at most once, and hence is evaluated with an O(1) point Gauss-Legendre quadrature. Hence, both integrals in (4.67) are evaluated in O(1) operations.

The remaining term in (4.67) is a sum of residues evaluated on $C_\rho$, where $C_\rho$ denotes the portion of a Bernstein ellipse connecting $\gamma_1$ and $\gamma_2$ (see Section 4.1). We select the residues $v_1, \ldots, v_n$ and weights $w_1, \ldots, w_n$ by constructing a quadrature which approximates

$$\int_{C_\rho} \frac{T_m(v)}{v - z}\, dv,$$ (4.68)

and which holds for all values of z relevant to the evaluation of $G_m$ (see Section 4.3.2). By the argument in Section 4.3, this is accomplished using O(m) Gauss-Legendre nodes on $C_\rho$.

Therefore, the entire cost of our algorithm for Gm is O(m) and is completely independent of both κ and β. Lastly, since the algorithm is entirely quadrature-based, it is embarrassingly parallelizable.

5. Numerical Experiments

In Sections 5.15.5 we characterize the performance and accuracy of our method. Importantly, as demonstrated below, we achieve full precision for all possible ranges of β and κ, and our algorithm’s performance is completely independent of β and κ.

We use adaptive integration applied to (1.18) as the gold standard, and measure the error of our algorithm by comparing the two results. We use the change of variables ϕ=x3, dϕ=3x2dx, to ensure that adaptive integration is accurate when α1. We compute the 1αcos(ϕ) term using the double angle formula to avoid cancellation error. The error in evaluating the modal Green’s function for very large κ is not measured, as adaptive integration is too expensive and no prior method can compute the modal Green’s function for large κ.
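To illustrate why the double angle formula matters here, the following Python sketch evaluates $1 - \alpha\cos(\phi)$ without subtracting nearly equal quantities. It is a sketch under assumptions: it takes $\alpha = 1/(1 + \beta_-^2)$, which follows from (4.28), and uses $1 - \cos\phi = 2\sin^2(\phi/2)$; the function name `one_minus_alpha_cos` is hypothetical and this is not claimed to be the exact form used in the authors' reference computation.

```python
import math

def one_minus_alpha_cos(beta, phi):
    """Evaluate 1 - alpha*cos(phi), with alpha = 1 / (1 + beta**2), without cancellation.

    Uses 1 - alpha = alpha * beta**2 and 1 - cos(phi) = 2 * sin(phi / 2)**2, so that
    1 - alpha*cos(phi) = alpha * (beta**2 + 2 * sin(phi / 2)**2).
    """
    alpha = 1.0 / (1.0 + beta * beta)
    return alpha * (beta * beta + 2.0 * math.sin(0.5 * phi) ** 2)

def one_minus_alpha_cos_naive(beta, phi):
    """Direct evaluation, shown only for comparison; inaccurate for small beta and phi."""
    alpha = 1.0 / (1.0 + beta * beta)
    return 1.0 - alpha * math.cos(phi)
```

When both β and ϕ are tiny, the direct formula loses most or all significant digits, while the rewritten form retains full relative accuracy.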

An implementation of the previously described algorithm was written in Fortran 77 and wrapped in MATLAB as a MEX file. Our code is available at https://doi.org/10.5281/zenodo.7040462. In our implementation, we chose M = 100, and used 5m quadrature nodes with respect to the ellipse parameter on Cρ in double precision, and 11m quadrature nodes with respect to the ellipse parameter on Cρ in extended precision (see Section 4.3.4 and Remarks 4.2 and 4.3). The timing and performance experiments in Sections 5.1–5.4 were performed using a consumer laptop with a four-core 2.6 GHz Intel i7 processor running a timing script in MATLAB 2018b with two threads. The parallel computing experiment in Section 5.5 was run on a server with a 16-core Intel Xeon 2.9 GHz processor.

5.0.1. The Interpretation of β and κ

Recall from Section 1.2 that the modal Green’s function can be thought of as a function of four parameters: m, k, α, and R0. After the introduction of the parameters κ and β (see formula (1.16)), the R0 term exclusively appears as a 1/R0 scaling outside the integral. Hence, with this parameterization, R0 is of no independent consequence to the performance of our algorithm, so we only characterize our algorithm’s performance as a function of κ, β, and m. Recall also that β is defined as

$$\beta = \frac{\Delta}{\rho_0}, \qquad (5.1)$$

where Δ is the minimum source-to-target distance and $\rho_0 = \sqrt{2 r r'}$, with r and r′ being the radial distances of the source and target in cylindrical coordinates. Recall finally from Section 1.2 that κ is defined as

$$\kappa = k R_0. \qquad (5.2)$$
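As an illustration of how these scaled parameters follow from the geometry, the sketch below computes them from cylindrical coordinates. It takes β = Δ/ρ0 with ρ0 = √(2rr′) and κ = kR0, and it additionally assumes the usual definitions R0² = r² + r′² + (z − z′)² and α = 2rr′/R0² (these come from Section 1.2, which is not reproduced here, so they should be checked against that section). The function name `scaled_parameters` is hypothetical.

```python
import math


def scaled_parameters(r, r_prime, dz, k):
    """Return (alpha, beta, kappa, R0) for a source-target pair.

    r, r_prime : radial distances of the source and target (both > 0)
    dz         : axial offset z - z'
    k          : wavenumber (real or complex)
    """
    delta = math.hypot(r - r_prime, dz)   # minimum source-to-target distance
    rho0 = math.sqrt(2.0 * r * r_prime)
    beta = delta / rho0
    # Assumed definition: R0^2 = r^2 + r'^2 + dz^2 = delta^2 + rho0^2.
    r0 = math.hypot(delta, rho0)
    # Assumed definition: alpha = 2 r r' / R0^2 = 1 / (1 + beta^2).
    alpha = 1.0 / (1.0 + beta * beta)
    kappa = k * r0
    return alpha, beta, kappa, r0


if __name__ == "__main__":
    # Source and target on nearly the same ring: beta is tiny, alpha is close to 1.
    print(scaled_parameters(1.0, 1.0, 1e-9, 100.0))
```

Under these assumptions, 1 − α = β²/(1 + β²), so a small β corresponds to α close to 1, which is the nearly singular regime.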

5.1. Performance of the Algorithm with Varying Source-to-Target Distance

We examined the performance of our algorithm over a wide range of source-to-target distances. As shown in Table 3 and Table 4, our algorithm’s performance is independent of β.

Table 3: The evaluation of the modal Green’s function in double precision for varying β with a large wavenumber (κ = 10,000).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. For brevity, β = 10^{-18} is omitted.

| β | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{0} | 3.76×10^{-5} | 3.34×10^{-14} | 1.44×10^{-3} | 4.71×10^{-13} |
| 10^{-3} | 3.57×10^{-5} | 3.43×10^{-14} | 1.44×10^{-3} | 1.79×10^{-12} |
| 10^{-6} | 3.51×10^{-5} | 3.92×10^{-14} | 1.44×10^{-3} | 3.43×10^{-13} |
| 10^{-9} | 3.59×10^{-5} | 5.50×10^{-14} | 1.44×10^{-3} | 5.27×10^{-13} |
| 10^{-12} | 3.52×10^{-5} | 3.33×10^{-14} | 1.44×10^{-3} | 5.28×10^{-13} |
| 10^{-15} | 3.46×10^{-5} | 1.69×10^{-14} | 1.44×10^{-3} | 4.81×10^{-13} |
| 10^{-21} | 3.44×10^{-5} | 6.63×10^{-14} | 1.44×10^{-3} | 5.11×10^{-13} |

Table 4: The evaluation of the modal Green’s function in quadruple precision for varying β with a large wavenumber (κ = 10,000).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. For brevity, β = 10^{-18} is omitted.

| β | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{0} | 5.82×10^{-3} | 2.89×10^{-32} | 2.11×10^{-1} | 6.59×10^{-31} |
| 10^{-3} | 5.84×10^{-3} | 2.31×10^{-32} | 2.11×10^{-1} | 1.07×10^{-30} |
| 10^{-6} | 5.74×10^{-3} | 2.06×10^{-31} | 2.11×10^{-1} | 1.82×10^{-30} |
| 10^{-9} | 6.25×10^{-3} | 3.45×10^{-33} | 2.12×10^{-1} | 2.63×10^{-31} |
| 10^{-12} | 6.27×10^{-3} | 2.07×10^{-32} | 2.12×10^{-1} | 1.55×10^{-31} |
| 10^{-15} | 6.22×10^{-3} | 9.65×10^{-32} | 2.12×10^{-1} | 5.53×10^{-31} |
| 10^{-21} | 5.92×10^{-3} | 2.18×10^{-31} | 2.11×10^{-1} | 6.36×10^{-31} |

5.2. Performance of the Algorithm with Varying κ, for Real-Valued κ

We examined the performance of our algorithm over a wide range of real values of κ (for performance with complex-valued κ, see Section 5.3). As shown in Tables 5–8, our algorithm’s performance is independent of κ.

Table 5: The evaluation of the modal Green’s function in double precision for varying real-valued κ with large source-to-target distance (β = 1).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. Note that for κ > 10^{6}, the resource requirements of prior methods become excessive. For brevity, κ = 10^{12} and κ = 10^{15} are omitted.

| κ | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{-6} | 1.22×10^{-4} | 1.45×10^{-13} | 1.84×10^{-3} | 2.05×10^{-12} |
| 10^{-3} | 6.26×10^{-5} | 1.50×10^{-13} | 1.64×10^{-3} | 2.05×10^{-12} |
| 10^{0} | 6.08×10^{-5} | 1.61×10^{-13} | 1.66×10^{-3} | 2.02×10^{-12} |
| 10^{1} | 1.36×10^{-4} | 2.71×10^{-14} | 2.50×10^{-3} | 1.83×10^{-12} |
| 10^{2} | 1.02×10^{-4} | 4.94×10^{-15} | 2.43×10^{-3} | 2.23×10^{-12} |
| 10^{3} | 4.56×10^{-5} | 1.30×10^{-14} | 1.73×10^{-3} | 1.51×10^{-12} |
| 10^{4} | 3.89×10^{-5} | 3.34×10^{-14} | 1.69×10^{-3} | 1.03×10^{-12} |
| 10^{5} | 3.94×10^{-5} | 2.25×10^{-14} | 1.70×10^{-3} | 5.05×10^{-13} |
| 10^{6} | 4.15×10^{-5} | 2.75×10^{-13} | 1.75×10^{-3} | 3.32×10^{-13} |
| 10^{7} | 3.78×10^{-5} | | 8.39×10^{-4} | |
| 10^{8} | 3.91×10^{-5} | | 8.33×10^{-4} | |
| 10^{9} | 4.46×10^{-5} | | 8.23×10^{-4} | |
| 10^{18} | 3.98×10^{-5} | | 8.33×10^{-4} | |

Table 8: The evaluation of the modal Green’s function in quadruple precision for varying real-valued κ with small source-to-target distance (β = 10^{-12}).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. Note that for κ > 10^{6}, the resource requirements of prior methods become excessive.

| κ | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{-6} | 7.23×10^{-3} | 7.06×10^{-31} | 2.08×10^{-1} | 5.75×10^{-29} |
| 10^{-3} | 7.53×10^{-3} | 7.12×10^{-31} | 2.08×10^{-1} | 5.75×10^{-29} |
| 10^{0} | 7.16×10^{-3} | 6.54×10^{-31} | 2.10×10^{-1} | 5.70×10^{-29} |
| 10^{1} | 7.53×10^{-3} | 8.03×10^{-32} | 2.11×10^{-1} | 5.56×10^{-29} |
| 10^{2} | 7.09×10^{-3} | 5.00×10^{-32} | 2.11×10^{-1} | 4.55×10^{-29} |
| 10^{3} | 6.25×10^{-3} | 6.41×10^{-32} | 2.12×10^{-1} | 7.93×10^{-30} |
| 10^{4} | 6.23×10^{-3} | 2.07×10^{-32} | 2.11×10^{-1} | 1.55×10^{-31} |
| 10^{5} | 6.21×10^{-3} | 1.43×10^{-31} | 2.12×10^{-1} | 2.21×10^{-31} |
| 10^{6} | 6.21×10^{-3} | 2.37×10^{-31} | 2.12×10^{-1} | 3.88×10^{-31} |
| 10^{7} | 6.25×10^{-3} | | 1.52×10^{-1} | |
| 10^{8} | 6.22×10^{-3} | | 1.53×10^{-1} | |
| 10^{9} | 6.19×10^{-3} | | 1.52×10^{-1} | |
| 10^{12} | 5.78×10^{-3} | | 1.52×10^{-1} | |
| 10^{15} | 5.72×10^{-3} | | 1.52×10^{-1} | |
| 10^{18} | 5.69×10^{-3} | | 1.52×10^{-1} | |

5.3. Performance of the Algorithm with Varying κ, for Complex-Valued κ

We examined the performance of our algorithm over a wide range of complex-valued κ, β, and Fourier mode m. As shown in Tables 9–12, our algorithm’s performance is independent of κ and β.

Table 9: The evaluation of the modal Green’s function in double precision for varying complex-valued κ with large source-to-target distance (β = 1).

The error and |G0| are evaluated by using adaptive Gaussian quadrature as the gold standard for |κ| < 10^{6}. For |κ| ≥ 10^{6}, the resource requirements of prior methods become excessive, and we instead evaluate |G0| using our method.

| \|κ\| | Arg(κ) | \|G0\| | Evaluation Time (secs), m = 10 | Error Scaled by G0, m = 10 | Evaluation Time (secs), m = 1000 | Error Scaled by G0, m = 1000 |
|---|---|---|---|---|---|---|
| 10^{-3} | π/8 | 3.31 | 1.20×10^{-4} | 1.52×10^{-14} | 4.46×10^{-4} | 5.13×10^{-14} |
| 10^{-3} | π/4 | 3.31 | 9.86×10^{-5} | 1.52×10^{-14} | 4.35×10^{-4} | 5.14×10^{-14} |
| 10^{-3} | −3π/8 | 3.31 | 1.10×10^{-4} | 1.36×10^{-14} | 3.75×10^{-4} | 5.16×10^{-14} |
| 10^{-3} | π/2 | 3.31 | 7.95×10^{-5} | 1.55×10^{-14} | 2.84×10^{-4} | 5.11×10^{-14} |
| 1 | π/8 | 2.28 | 8.04×10^{-5} | 2.17×10^{-14} | 5.63×10^{-4} | 4.74×10^{-14} |
| 1 | π/4 | 1.69 | 7.66×10^{-5} | 2.24×10^{-14} | 2.82×10^{-4} | 4.97×10^{-14} |
| 1 | −3π/8 | 1.40 | 8.02×10^{-5} | 2.13×10^{-14} | 2.26×10^{-4} | 5.19×10^{-14} |
| 1 | π/2 | 1.31 | 7.60×10^{-5} | 1.79×10^{-14} | 2.19×10^{-4} | 5.40×10^{-14} |
| 10^{3} | π/8 | 2.85×10^{-119} | 7.56×10^{-5} | 3.82×10^{-14} | 2.79×10^{-4} | 3.36×10^{-13} |
| 10^{3} | π/4 | 6.72×10^{-219} | 7.73×10^{-5} | 4.72×10^{-14} | 3.24×10^{-4} | 1.25×10^{-13} |
| 10^{3} | −3π/8 | 1.81×10^{-285} | 1.29×10^{-4} | 6.83×10^{-14} | 3.76×10^{-4} | 6.44×10^{-15} |
| 10^{3} | π/2 | 7.62×10^{-309} | 1.53×10^{-4} | 3.76×10^{-14} | 7.65×10^{-4} | 5.84×10^{-15} |
| 10^{5} | −10^{-3} | 1.84×10^{-33} | 5.93×10^{-5} | 4.91×10^{-12} | 1.76×10^{-4} | 5.90×10^{-12} |
| 10^{6} | −10^{-4} | 5.82×10^{-34} | 6.16×10^{-5} | | 1.83×10^{-4} | |
| 10^{9} | −10^{-7} | 1.84×10^{-35} | 7.06×10^{-5} | | 1.58×10^{-4} | |
| 10^{18} | −10^{-16} | 5.82×10^{-40} | 7.37×10^{-5} | | 1.73×10^{-4} | |

Table 12: The evaluation of the modal Green’s function in quadruple precision for varying complex-valued κ with small source-to-target distance (β = 10^{-12}).

The error and |G0| are evaluated by using adaptive Gaussian quadrature as the gold standard for |κ| < 10^{6}. For |κ| ≥ 10^{6}, the resource requirements of prior methods become excessive, and we instead evaluate |G0| using our method.

| \|κ\| | Arg(κ) | \|G0\| | Evaluation Time (secs), m = 10 | Error Scaled by G0, m = 10 | Evaluation Time (secs), m = 1000 | Error Scaled by G0, m = 1000 |
|---|---|---|---|---|---|---|
| 10^{-3} | π/8 | 4.15×10^{1} | 1.03×10^{-2} | 1.28×10^{-32} | 2.92×10^{-2} | 1.26×10^{-31} |
| 10^{-3} | π/4 | 4.15×10^{1} | 8.83×10^{-3} | 1.19×10^{-32} | 3.28×10^{-2} | 1.26×10^{-31} |
| 10^{-3} | −3π/8 | 4.15×10^{1} | 8.38×10^{-3} | 1.62×10^{-32} | 2.85×10^{-2} | 1.25×10^{-31} |
| 10^{-3} | π/2 | 4.15×10^{1} | 8.99×10^{-3} | 1.59×10^{-32} | 3.38×10^{-2} | 1.26×10^{-31} |
| 1 | π/8 | 3.98×10^{1} | 9.18×10^{-3} | 1.43×10^{-32} | 2.90×10^{-2} | 1.30×10^{-31} |
| 1 | π/4 | 3.96×10^{1} | 8.42×10^{-3} | 1.59×10^{-32} | 2.87×10^{-2} | 1.30×10^{-31} |
| 1 | −3π/8 | 3.94×10^{1} | 8.56×10^{-3} | 1.52×10^{-32} | 3.28×10^{-2} | 1.34×10^{-31} |
| 1 | −π/2 | 3.94×10^{1} | 9.14×10^{-3} | 1.81×10^{-32} | 3.30×10^{-2} | 1.30×10^{-31} |
| 10^{3} | −π/8 | 2.95×10^{1} | 9.06×10^{-3} | 1.88×10^{-30} | 2.95×10^{-2} | 2.20×10^{-33} |
| 10^{3} | −π/4 | 2.95×10^{1} | 8.73×10^{-3} | 1.78×10^{-30} | 2.93×10^{-2} | 2.67×10^{-33} |
| 10^{3} | −3π/8 | 2.95×10^{1} | 9.01×10^{-3} | 1.69×10^{-30} | 3.62×10^{-2} | 2.68×10^{-33} |
| 10^{3} | −π/2 | 2.95×10^{1} | 7.70×10^{-3} | 1.66×10^{-30} | 4.00×10^{-2} | 1.25×10^{-33} |
| 10^{5} | −10^{-3} | 2.31×10^{1} | 7.73×10^{-3} | 7.99×10^{-31} | 3.63×10^{-2} | 7.90×10^{-31} |
| 10^{6} | −10^{-4} | 1.98×10^{1} | 7.78×10^{-3} | | 2.27×10^{-2} | |
| 10^{9} | −10^{-7} | 1.02×10^{1} | 7.74×10^{-3} | | 2.27×10^{-2} | |
| 10^{18} | −10^{-16} | 1.77×10^{-3} | 7.54×10^{-3} | | 2.24×10^{-2} | |

5.4. Performance of the Algorithm with Varying Fourier Mode (m)

We examined the performance of our algorithm over a wide range of Fourier modes (represented by the parameter m). Because the number of points in the quadrature scales linearly with m, as demonstrated by Table 13, evaluation time scales linearly with the Fourier mode. Recall from the introduction of this section that the evaluation was performed on a four-core processor using two threads.

Table 13: The evaluation time of the modal Green’s function in double precision for varying m (β = 10^{-12}, κ = 10,000).

| m | Evaluation Time (secs) |
|---|---|
| 1 | 3.88×10^{-5} |
| 10 | 5.56×10^{-5} |
| 10^{2} | 1.75×10^{-4} |
| 10^{3} | 1.46×10^{-3} |
| 10^{4} | 1.43×10^{-2} |
| 10^{5} | 1.37×10^{-1} |
| 10^{6} | 1.36×10^{0} |
| 10^{7} | 1.29×10^{1} |
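The linear scaling can be checked directly from the timings in Table 13. The sketch below fits a least-squares line to log10(time) against log10(m) for m ≥ 10^3, where the fixed O(1) overhead is negligible; the cutoff of 10^3 and the helper name `loglog_slope` are choices made here for illustration.

```python
import math

# Timings transcribed from Table 13: Fourier mode m -> evaluation time (secs).
TABLE_13 = {
    1: 3.88e-5,
    10: 5.56e-5,
    10**2: 1.75e-4,
    10**3: 1.46e-3,
    10**4: 1.43e-2,
    10**5: 1.37e-1,
    10**6: 1.36e0,
    10**7: 1.29e1,
}


def loglog_slope(timings, m_min):
    """Least-squares slope of log10(time) versus log10(m) for m >= m_min."""
    xs = [math.log10(m) for m in timings if m >= m_min]
    ys = [math.log10(t) for m, t in timings.items() if m >= m_min]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    return s_xy / s_xx


if __name__ == "__main__":
    print("fitted exponent:", loglog_slope(TABLE_13, 10**3))
    for m, t in TABLE_13.items():
        print(m, "cost per mode (secs):", t / m)
```

A fitted exponent close to 1, together with a roughly constant cost per mode for large m, is what O(m) scaling predicts.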

5.5. Parallelization of the Algorithm

The cost of our algorithm is O(m) and does not depend on κ or β (see Section 4.5). Because our algorithm is quadrature-based, it is embarrassingly parallelizable.
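The structure of this parallelism can be sketched as follows: a quadrature sum is split into contiguous blocks of nodes, each block is summed independently, and the partial sums are added at the end. This is an illustration only. It uses a midpoint rule as a stand-in for the Gauss-Legendre rules of the algorithm, the function names are hypothetical, and Python threads are used purely to show the decomposition (in CPython they do not speed up pure-Python arithmetic).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_sum(f, a, h, start, stop):
    """Midpoint-rule contribution of the nodes with indices start..stop-1."""
    return h * sum(f(a + (j + 0.5) * h) for j in range(start, stop))


def parallel_quadrature(f, a, b, n_nodes, n_workers):
    """Integrate f over [a, b] by distributing the quadrature nodes among workers.

    Each worker needs only its own block of nodes, so no communication is
    required until the final summation of the partial results.
    """
    h = (b - a) / n_nodes
    bounds = [n_nodes * w // n_workers for w in range(n_workers + 1)]
    blocks = list(zip(bounds[:-1], bounds[1:]))
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(lambda block: partial_sum(f, a, h, block[0], block[1]), blocks)
        return math.fsum(parts)


if __name__ == "__main__":
    print(parallel_quadrature(math.cos, 0.0, math.pi / 2.0, 4000, 4))
```

Because the blocks are independent, the same decomposition applies unchanged to processes or to threads in a compiled language.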

We measured the algorithm’s performance on a server with a 16-core Intel Xeon 2.9 GHz processor, where each core can run two threads, for a total of 32 threads. We vary the number of threads from 1 to 32, and report the results in Figure 12.

Fig. 12: Evaluation time of the modal Green’s function plotted against m with varying numbers of threads (β = 10^{-7}, κ = 10,000). The calculation is performed in double precision. The evaluation times corresponding to 32 threads are not plotted for small m.

Remark 5.1.

We note that, for many applications, the number of source-target interactions is greater than the number of modes, meaning that the practitioner may benefit most by parallelizing over source-target interactions rather than parallelizing over different modes.

6. Conclusions and Generalizations

We have developed an algorithm that evaluates the modal Green’s function for the Helmholtz equation in O(m) time, at a cost that is completely independent of both the wavenumber, which is permitted to be complex, and the source-to-target distance. Furthermore, our algorithm is embarrassingly parallelizable. Our method can be readily extended in several directions, described in Sections 6.1–6.4.

6.1. An O(1) Evaluator for Small Wavenumber (κ ≪ m)

Recall that our algorithm is independent of the wavenumber because we integrate along Gustafsson’s contours, which are the steepest descent contours with respect to the numerator of the spherical-wave term (see Section 3.1). When the Fourier mode m is larger than the scaled wavenumber κ, it is more efficient to integrate along a different contour. If, instead, we choose the steepest descent contour on which exp(imϕ) does not oscillate, we arrive at an alternative algorithm whose cost is O(κ) and is independent of m. When κ is extremely small, this algorithm is essentially O(1). The case where β is small (i.e., when the source and target are close) is handled in an identical fashion to the method described in Section 4.2. Thus, this alternative algorithm’s cost is completely independent of both m and β, and grows as O(κ).

6.2. An O(1) Evaluator for the Modal Green’s Functions for the Laplace Equation

The same method described in Section 6.1 can be applied to the case where κ = 0 to yield an O(1) evaluator of the modal Green’s function for the Laplace equation, whose cost is independent of β (i.e., the cost is independent of the source-to-target distance).

6.3. Extension to an O(m) Evaluator for a Collection of Modal Green’s Functions, with Amortized Cost O(1)

This paper presents an algorithm for the evaluation of a single modal Green’s function for the Helmholtz equation in O(m) time, independent of β and κ, where β is the scaled minimum source-to-target distance and κ is the scaled wavenumber. It is possible to use this algorithm to compute all of the modal Green’s functions for modes −M, −M + 1, … , M − 1, M in O(M) time using the following method. In [15], Matviyenko presents a five-term recurrence relation for the modal Green’s functions for the Helmholtz equation. He observes that the recurrence relation is stable upwards for one range of Fourier modes and stable downwards for another range of modes. Furthermore, there exists a range of modes for which the recurrence is bi-unstable. Thus, a classical Miller-type algorithm cannot be applied. However, it was recently observed in [16] that if a recurrence relation is represented as a banded matrix, then the inverse power method can be used to find a solution, even when the stability behavior is mixed in the sense just described. We thus apply the inverse power method, as described in [16], to the resulting five-diagonal matrix corresponding to Matviyenko’s recurrence relation. In this fashion, we obtain all the eigenvectors corresponding to the zero eigenvalue; only one vector in this eigenspace corresponds to the vector of modal Green’s functions. We thus use the O(m) evaluator of this paper to select the vector corresponding to the modal Green’s functions. The cost of performing the inverse power method is O(M), and the cost of the evaluation of the Mth modal Green’s function is O(M), meaning that all M Fourier coefficients are obtained in O(M) time. We note that with this scheme, the M modes are computed simultaneously rather than in parallel; however, the computation may still be parallelized over source-target interactions.
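The idea of recovering a recurrence solution as a null vector of a banded matrix can be illustrated on a simpler, well-known case. The sketch below is a stand-in, not Matviyenko's five-term recurrence: it uses the three-term Bessel recurrence n J_n(x) − (x/2)(J_{n−1}(x) + J_{n+1}(x)) = 0 for n = −N, …, N, whose truncated tridiagonal matrix has the vector of J_n(x) as an approximate null vector when N is much larger than x. The scale is fixed with the identity Σ J_n(x) = 1, which plays the role that the O(m) evaluator plays in the scheme above. A dense solver is used for brevity (a banded solver would be needed for O(M) cost), and a small shift keeps the linear systems nonsingular. All function names are hypothetical.

```python
import math


def solve_dense(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] for row in matrix]
    x = rhs[:]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        x[col], x[pivot] = x[pivot], x[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor != 0.0:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
                x[r] -= factor * x[col]
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (x[r] - tail) / a[r][r]
    return x


def bessel_by_inverse_iteration(x, n_max, shift=1e-8, iterations=3):
    """Approximate J_n(x), n = -n_max..n_max, as a null vector of the recurrence matrix."""
    orders = range(-n_max, n_max + 1)
    size = 2 * n_max + 1
    mat = [[0.0] * size for _ in range(size)]
    for i, n in enumerate(orders):
        mat[i][i] = n - shift                 # shifted diagonal of the recurrence
        if i > 0:
            mat[i][i - 1] = -0.5 * x
        if i < size - 1:
            mat[i][i + 1] = -0.5 * x
    v = [1.0] * size
    for _ in range(iterations):               # inverse power iteration
        v = solve_dense(mat, v)
        norm = math.sqrt(sum(t * t for t in v))
        v = [t / norm for t in v]
    total = sum(v)                            # normalize using sum_n J_n(x) = 1
    return {n: t / total for n, t in zip(orders, v)}


if __name__ == "__main__":
    values = bessel_by_inverse_iteration(5.0, 30)
    print(values[0], values[1])
```

As in the scheme described above, the iteration itself determines the solution only up to a scalar multiple, and one independent piece of information is needed to fix the scale.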

6.4. An Evaluator for the Partial Derivatives of the Modal Green’s Function

In Section V of [15], Matviyenko derives an identity expressing the partial derivatives of Gm in terms of Gm, Gm+1, … , Gm+5. Thus, the O(m) evaluator presented in this paper can be used to evaluate the partial derivatives of the modal Green’s function in O(m) time. Furthermore, if the method proposed in Section 6.3 is used to evaluate a collection of modal Green’s functions in amortized O(1) time, then the partial derivatives can be evaluated in amortized O(1) time as well. Finally, we note that the higher order partial derivatives of the modal Green’s function can be expressed in terms of a finite number of functions Gm, and can therefore also be evaluated in O(m) time (or O(1) amortized time) (see Remark 5.1 of [15]).

6.5. An O(1) Evaluator for an Arbitrary Mode of the Modal Green’s Function

It appears that the steepest descent contours for the entire integrand of formula (1.2) do exist, but their relationship with |κ|, Arg(κ), m, and β is quite involved. Consider, for example, Figure 13, which is a phase-amplitude plot of the product of the numerator of the spherical-wave term and the Fourier exponential term. Observe that a steepest descent contour can be constructed from −π to π, which passes through the stationary points of the integrand. The steepest descent contour passing through the point ϕ ∈ [−π, π] is the solution to

$$m\gamma(s) + \kappa\sqrt{1 - \alpha\cos(\gamma(s))} = \alpha i s + \kappa\sqrt{1 - \alpha\cos(\phi)}, \qquad (6.1)$$

which is a transcendental equation. We expect that the construction of a completely general-purpose O(1) evaluator will be fairly complicated.

Table 6: The evaluation of the modal Green’s function in quadruple precision for varying real-valued κ with large source-to-target distance (β = 1).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. Note that for κ > 10^{6}, the resource requirements of prior methods become excessive. For brevity, κ = 10^{12} and κ = 10^{15} are omitted.

| κ | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{-6} | 7.17×10^{-3} | 4.06×10^{-31} | 2.07×10^{-1} | 1.12×10^{-30} |
| 10^{-3} | 6.43×10^{-3} | 3.99×10^{-31} | 2.07×10^{-1} | 1.12×10^{-30} |
| 10^{0} | 6.92×10^{-3} | 3.94×10^{-31} | 2.09×10^{-1} | 1.20×10^{-30} |
| 10^{1} | 7.09×10^{-3} | 1.34×10^{-31} | 2.11×10^{-1} | 9.06×10^{-31} |
| 10^{2} | 6.93×10^{-3} | 7.74×10^{-34} | 2.10×10^{-1} | 1.32×10^{-30} |
| 10^{3} | 6.81×10^{-3} | 9.49×10^{-33} | 2.11×10^{-1} | 1.01×10^{-30} |
| 10^{4} | 5.79×10^{-3} | 2.89×10^{-32} | 2.12×10^{-1} | 6.59×10^{-31} |
| 10^{5} | 5.78×10^{-3} | 1.59×10^{-31} | 2.12×10^{-1} | 6.05×10^{-31} |
| 10^{6} | 5.76×10^{-3} | 2.23×10^{-31} | 2.11×10^{-1} | 4.69×10^{-31} |
| 10^{7} | 5.76×10^{-3} | | 2.11×10^{-1} | |
| 10^{8} | 5.82×10^{-3} | | 1.52×10^{-1} | |
| 10^{9} | 5.72×10^{-3} | | 1.52×10^{-1} | |
| 10^{18} | 5.70×10^{-3} | | 1.52×10^{-1} | |

Table 7: The evaluation of the modal Green’s function in double precision for varying real-valued κ with small source-to-target distance (β = 10^{-12}).

The error is evaluated by using adaptive Gaussian quadrature as the gold standard. Note that for κ > 10^{6}, the resource requirements of prior methods become excessive.

| κ | Evaluation Time (secs), m = 10 | Absolute Error, m = 10 | Evaluation Time (secs), m = 1000 | Absolute Error, m = 1000 |
|---|---|---|---|---|
| 10^{-6} | 4.49×10^{-5} | 3.08×10^{-13} | 1.34×10^{-3} | 2.90×10^{-11} |
| 10^{-3} | 4.45×10^{-5} | 2.90×10^{-13} | 1.37×10^{-3} | 2.88×10^{-11} |
| 10^{0} | 4.77×10^{-5} | 1.90×10^{-13} | 1.40×10^{-3} | 2.84×10^{-11} |
| 10^{1} | 4.79×10^{-5} | 4.35×10^{-14} | 1.41×10^{-3} | 2.74×10^{-11} |
| 10^{2} | 4.61×10^{-5} | 1.80×10^{-14} | 1.43×10^{-3} | 2.29×10^{-11} |
| 10^{3} | 3.57×10^{-5} | 1.07×10^{-14} | 1.44×10^{-3} | 4.19×10^{-12} |
| 10^{4} | 3.49×10^{-5} | 3.33×10^{-14} | 1.43×10^{-3} | 5.28×10^{-13} |
| 10^{5} | 3.46×10^{-5} | 1.50×10^{-13} | 1.44×10^{-3} | 6.58×10^{-13} |
| 10^{6} | 3.41×10^{-5} | 5.11×10^{-13} | 1.45×10^{-3} | 3.04×10^{-13} |
| 10^{7} | 3.48×10^{-5} | | 7.88×10^{-4} | |
| 10^{8} | 3.43×10^{-5} | | 7.87×10^{-4} | |
| 10^{9} | 3.38×10^{-5} | | 7.87×10^{-4} | |
| 10^{12} | 3.22×10^{-5} | | 7.85×10^{-4} | |
| 10^{15} | 3.38×10^{-5} | | 7.87×10^{-4} | |
| 10^{18} | 3.33×10^{-5} | | 7.85×10^{-4} | |

Table 10: The evaluation of the modal Green’s function in quadruple precision for varying complex-valued κ with large source-to-target distance (β = 1).

The error and |G0| are evaluated by using adaptive Gaussian quadrature as the gold standard for |κ| < 10^{6}. For |κ| ≥ 10^{6}, the resource requirements of prior methods become excessive, and we instead evaluate |G0| using our method.

| \|κ\| | Arg(κ) | \|G0\| | Evaluation Time (secs), m = 10 | Error Scaled by G0, m = 10 | Evaluation Time (secs), m = 1000 | Error Scaled by G0, m = 1000 |
|---|---|---|---|---|---|---|
| 10^{-3} | π/8 | 3.31 | 8.44×10^{-3} | 9.83×10^{-32} | 2.87×10^{-2} | 7.32×10^{-32} |
| 10^{-3} | π/4 | 3.31 | 8.07×10^{-3} | 9.89×10^{-32} | 2.82×10^{-2} | 7.36×10^{-32} |
| 10^{-3} | −3π/8 | 3.31 | 8.65×10^{-3} | 9.98×10^{-32} | 3.27×10^{-2} | 7.33×10^{-32} |
| 10^{-3} | π/2 | 3.31 | 8.82×10^{-3} | 9.88×10^{-32} | 3.25×10^{-2} | 7.36×10^{-32} |
| 1 | π/8 | 2.28 | 8.74×10^{-3} | 1.04×10^{-31} | 3.32×10^{-2} | 8.39×10^{-32} |
| 1 | π/4 | 1.69 | 8.82×10^{-3} | 1.07×10^{-31} | 3.32×10^{-2} | 8.69×10^{-32} |
| 1 | −3π/8 | 1.40 | 8.13×10^{-3} | 1.07×10^{-31} | 2.83×10^{-2} | 8.87×10^{-32} |
| 1 | π/2 | 1.31 | 8.22×10^{-3} | 1.08×10^{-31} | 2.82×10^{-2} | 8.87×10^{-32} |
| 10^{3} | π/8 | 2.85×10^{-119} | 9.16×10^{-3} | 4.22×10^{-32} | 3.30×10^{-2} | 8.28×10^{-31} |
| 10^{3} | π/4 | 6.72×10^{-219} | 8.80×10^{-3} | 4.77×10^{-32} | 3.57×10^{-2} | 1.92×10^{-31} |
| 10^{3} | −3π/8 | 1.81×10^{-285} | 8.91×10^{-3} | 3.97×10^{-32} | 3.58×10^{-2} | 1.35×10^{-32} |
| 10^{3} | π/2 | 7.62×10^{-309} | 7.66×10^{-3} | 3.98×10^{-32} | 3.99×10^{-2} | 8.62×10^{-34} |
| 10^{5} | −10^{-3} | 1.84×10^{-33} | 7.74×10^{-3} | 5.13×10^{-30} | 4.72×10^{-2} | 5.17×10^{-30} |
| 10^{6} | −10^{-4} | 5.82×10^{-34} | 7.63×10^{-3} | | 2.27×10^{-2} | |
| 10^{9} | −10^{-7} | 1.84×10^{-35} | 7.55×10^{-3} | | 2.28×10^{-2} | |
| 10^{18} | −10^{-16} | 5.82×10^{-40} | 7.51×10^{-3} | | 2.23×10^{-2} | |

Table 11: The evaluation of the modal Green’s function in double precision for varying complex-valued κ with small source-to-target distance (β = 10^{-12}).

The error and |G0| are evaluated by using adaptive Gaussian quadrature as the gold standard for |κ| < 10^{6}. For |κ| ≥ 10^{6}, the resource requirements of prior methods become excessive, and we instead evaluate |G0| using our method.

| \|κ\| | Arg(κ) | \|G0\| | Evaluation Time (secs), m = 10 | Error Scaled by G0, m = 10 | Evaluation Time (secs), m = 1000 | Error Scaled by G0, m = 1000 |
|---|---|---|---|---|---|---|
| 10^{-3} | π/8 | 4.15×10^{1} | 8.63×10^{-5} | 2.42×10^{-15} | 3.38×10^{-4} | 6.08×10^{-14} |
| 10^{-3} | π/4 | 4.15×10^{1} | 1.18×10^{-4} | 2.27×10^{-15} | 3.94×10^{-4} | 6.13×10^{-14} |
| 10^{-3} | −3π/8 | 4.15×10^{1} | 1.16×10^{-4} | 2.39×10^{-15} | 3.63×10^{-4} | 6.13×10^{-14} |
| 10^{-3} | π/2 | 4.15×10^{1} | 1.13×10^{-4} | 2.73×10^{-15} | 3.44×10^{-4} | 6.10×10^{-14} |
| 1 | π/8 | 3.98×10^{1} | 1.60×10^{-4} | 1.99×10^{-15} | 7.86×10^{-4} | 6.42×10^{-14} |
| 1 | π/4 | 3.96×10^{1} | 8.46×10^{-5} | 2.65×10^{-15} | 3.21×10^{-4} | 6.55×10^{-14} |
| 1 | −3π/8 | 3.94×10^{1} | 8.61×10^{-5} | 2.44×10^{-15} | 3.12×10^{-4} | 6.66×10^{-14} |
| 1 | π/2 | 3.94×10^{1} | 8.61×10^{-5} | 2.65×10^{-15} | 2.92×10^{-4} | 6.70×10^{-14} |
| 10^{3} | π/8 | 2.95×10^{1} | 7.49×10^{-5} | 4.31×10^{-13} | 3.04×10^{-4} | 5.83×10^{-13} |
| 10^{3} | π/4 | 2.95×10^{1} | 6.54×10^{-5} | 4.50×10^{-13} | 3.18×10^{-4} | 5.32×10^{-13} |
| 10^{3} | −3π/8 | 2.95×10^{1} | 8.33×10^{-5} | 4.01×10^{-13} | 3.52×10^{-4} | 2.98×10^{-13} |
| 10^{3} | π/2 | 2.95×10^{1} | 7.73×10^{-5} | 3.99×10^{-13} | 1.96×10^{-4} | 1.39×10^{-13} |
| 10^{5} | −10^{-3} | 2.31×10^{1} | 6.46×10^{-5} | 2.55×10^{-13} | 1.87×10^{-4} | 2.06×10^{-13} |
| 10^{6} | −10^{-4} | 1.98×10^{1} | 5.29×10^{-5} | | 1.92×10^{-4} | |
| 10^{9} | −10^{-7} | 1.02×10^{1} | 5.88×10^{-5} | | 1.46×10^{-4} | |
| 10^{18} | −10^{-16} | 1.77×10^{-3} | 5.89×10^{-5} | | 1.45×10^{-4} | |

Acknowledgments

James Garritano was supported in part by NIH F30HG011193 and by US NIH MSTP Training Grant T32GM007205. Vladimir Rokhlin was supported in part by ONR N00014-18-1-2353 and NSF DMS-1952751. Kirill Serkh was supported in part by the NSERC Discovery Grants RGPIN-2020-06022 and DGECR-2020-00356.

References

  • [1] Abramowitz Milton and Stegun Irene A. Handbook of Mathematical Functions. National Bureau of Standards, 1964.
  • [2] Andreasen M. “Scattering from bodies of revolution.” IEEE. T. Antenn. Propag 13.2 (1965): 303–310.
  • [3] Bremer James. “An algorithm for the numerical evaluation of the associated Legendre functions that runs in time independent of degree and order.” J. Comput. Phys 360 (2018): 15–38.
  • [4] Bremer J, Gimbutas Z, and Rokhlin V. “A nonlinear optimization procedure for generalized Gaussian quadratures.” SIAM J. Sci. Comput 32.4 (2010): 1761–1788.
  • [5] Cohl H and Tohline J. “A Compact Cylindrical Green’s Function expansion for the Solution of Potential Problems.” Astrophys. J 527.1 (1999): 86.
  • [6] Conway J. and Cohl HS. “Exact Fourier expansion in cylindrical coordinates for the three-dimensional Helmholtz Green function.” Z. Angew. Math. Phys 61.3 (2010): 425–443.
  • [7] Epstein C, Greengard L, and O’Neil M. “A high-order wideband direct solver for electromagnetic scattering from bodies of revolution.” J. Comput. Phys 387 (2019): 205–229.
  • [8] Gedney S and Mittra R. “The use of the FFT for the efficient solution of the problem of electromagnetic scattering by a body of revolution.” IEEE. T. Antenn. Propag (1988): 92–95.
  • [9] Gustafsson Mats. “Accurate and efficient evaluation of modal Green’s functions.” J. of Electromagnet. Waves. 24.10 (2010): 1291–1301.
  • [10] Helsing J and Holst A. “Variants of an explicit kernel-split panel based Nystrom discretization scheme for Helmholtz boundary value problems.” Adv. Comput. Math 41.3 (2015): 691–708.
  • [11] Helsing J and Karlsson A. “An explicit kernel-split panel-based Nystrom scheme for integral equations on axially symmetric surfaces.” J. Comput. Phys 272 (2014): 686–703.
  • [12] Lai J and O’Neil M. “An FFT-accelerated direct solver for electromagnetic scattering from penetrable axisymmetric objects.” J. Comput. Phys 390 (2019): 152–174.
  • [13] Lorentz GG. Approximation of Functions. Holt, Rinehart and Winston, Inc., 1966.
  • [14] Mason J. Chebyshev polynomials. CRC Press, 2002.
  • [15] Matviyenko Gregory. “On the azimuthal Fourier components of the Green’s function for the Helmholtz equation in three dimensions.” J. Math. Phys 36.9 (1995): 5159–5169.
  • [16] Osipov Andrei. “Evaluation of small elements of the eigenvectors of certain symmetric tridiagonal matrices with high relative accuracy.” Appl. Comput. Harmon. A. 43.2 (2017): 173–211.
  • [17] Trefethen N. Approximation Theory and Practice. SIAM, 2019.
  • [18] Trefethen N. Spectral methods in MATLAB. SIAM, 2000.
  • [19] Vaessen Jean-Pierre A., and van Beurden M. “Accurate and efficient computation of the modal Green’s function arising in the electric-field integral equations for a body of revolution.” IEEE T. Antenn. Propag 60.7 (2012): 3294–3304.
  • [20] Wang Peng and Xiao G. “A Note on the Singularity Extraction Technique in Solving Scattering Problems for Bodies of Revolution.” Asia Pacif. Microwave (2010): 2146–2148.
  • [21] Young P, Hao S, and Martinsson PG. “A high-order Nystrom discretization scheme for boundary integral equations defined on rotationally symmetric surfaces.” J. Comput. Phys 40.1 (2014): 4142–4159.
