Abstract
We present a fast method for evaluating expressions of the form
$$u_j = \sum_{\substack{i = 1 \\ i \ne j}}^{n} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n,$$
where αi are real numbers, and xi are points in a compact interval of ℝ. This expression can be viewed as representing the electrostatic potential generated by charges on a line in ℝ³. While fast algorithms for computing the electrostatic potential of general distributions of charges in ℝ³ exist, in a number of situations in computational physics it is useful to have a simple and extremely fast method for evaluating the potential of charges on a line; we present such a method in this paper, and report numerical results for several examples.
2010 Mathematics Subject Classification: 31C20 (primary) and 41A55, 41A50 (secondary)
Keywords: Fast multipole method, Chebyshev system, generalized Gaussian quadrature
1. Introduction and motivation
1.1. Introduction.
In this paper, we describe a simple fast algorithm for evaluating expressions of the form
(1)
$$u_j = \sum_{\substack{i = 1 \\ i \ne j}}^{n} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n,$$
where αi are real numbers, and xi are points in a compact interval of ℝ. This expression can be viewed as representing the electrostatic potential generated by charges on a line in ℝ³. We remark that fast algorithms for computing the electrostatic potential generated by general distributions of charges in ℝ³ exist, see for example the Fast Multipole Method [9], whose relation to the method presented in this paper is discussed in §1.2. However, in a number of situations in computational physics it is useful to have a simple and extremely fast method for evaluating the potential of charges on a line; we present such a method in this paper. Under mild assumptions the presented method involves O(n log n) operations and has a small constant. The method is based on writing the potential 1/r as
$$\frac{1}{r} = \int_0^{\infty} e^{-rt}\, dt.$$
We show that there exists a small set of quadrature nodes t1, … , tm and weights w1, … , wm such that for a large range of values of r we have
(2)
$$\frac{1}{r} \;\approx\; \sum_{j=1}^{m} w_j e^{-r t_j};$$
see Lemma 4.5, which is a quantitative version of (2). Numerically the nodes t1, … , tm and weights w1, … , wm are computed using a procedure for constructing generalized Gaussian quadratures, see §5.2. An advantage of representing 1/r as a sum of exponentials is that the translation operator
(3)
$$\frac{1}{r} \;\longmapsto\; \frac{1}{r + r_0}, \qquad r_0 > 0,$$
can be computed by taking an inner product of the weights (w1 , … , wm) with a diagonal transformation of the vector (e−rt1 , … , e−rtm). Indeed, we have
(4)
$$\frac{1}{r + r_0} \;\approx\; \sum_{j=1}^{m} \left( w_j e^{-r_0 t_j} \right) e^{-r t_j}.$$
The algorithm described in §3 leverages the existence of this diagonal translation operator to efficiently evaluate (1).
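To make the representation above concrete, the following short sketch (Python/NumPy; the function name and all parameter choices are ours, not the authors') builds a crude exponential-sum approximation of 1/r by applying composite Gauss–Legendre panels to the truncated integral ∫₀ᵀ e^{−rt} dt. The generalized Gaussian quadratures of §5.2 achieve the same accuracy with far fewer nodes; this sketch only illustrates (2) and the diagonal translation (4).

```python
# Minimal sketch (not the paper's code): approximate 1/r by a sum of exponentials,
# 1/r = int_0^infty exp(-r t) dt, using composite Gauss-Legendre panels on [0, T].
# The generalized Gaussian quadrature of Section 5.2 needs far fewer nodes.
import numpy as np

def exp_sum_rule(rmin, rmax, eps=1e-12, panels=60, order=16):
    """Nodes t and weights w with 1/r ~ sum_k w[k]*exp(-r*t[k]) for r in [rmin, rmax]."""
    T = np.log(2.0 / eps) / rmin                    # truncate the Laplace integral at T
    edges = np.geomspace(0.1 / rmax, T, panels + 1)
    edges[0] = 0.0                                  # first panel starts at t = 0
    xg, wg = np.polynomial.legendre.leggauss(order)
    t = np.concatenate([(b - a) / 2 * xg + (a + b) / 2 for a, b in zip(edges, edges[1:])])
    w = np.concatenate([(b - a) / 2 * wg for a, b in zip(edges, edges[1:])])
    return t, w

t, w = exp_sum_rule(1.0, 1.0e3)
r = np.geomspace(1.0, 1.0e3, 1000)
print("max error of the exponential sum:",
      np.abs(1.0 / r - np.exp(-np.outer(r, t)) @ w).max())

# Diagonal translation (4): shifting r -> r + r0 only rescales the weights.
r0 = 5.0
w_shifted = w * np.exp(-r0 * t)
print("max error after translation:",
      np.abs(1.0 / (r + r0) - np.exp(-np.outer(r, t)) @ w_shifted).max())
```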
1.2. Relation to past work.
We emphasize that fast algorithms for computing the potential generated by arbitrary distributions of charges in ℝ³ exist. An example of such an algorithm is the Fast Multipole Method that was introduced by [9] and has been extended by several authors including [7, 10, 16]. In this paper, we present a simple scheme for the special case where the charges are on a line, which occurs in a number of numerical calculations, see §1.3. The presented scheme has a much smaller runtime constant compared to general methods, and is based on the diagonal form (4) of the translation operator (3). The idea of using the diagonal form of this translation operator to accelerate numerical computations has been studied by several authors; in particular, the diagonal form is used in algorithms by Dutt, Gu and Rokhlin [6], and Yarvin and Rokhlin [22], and was subsequently studied in detail by Beylkin and Monzón [1, 2].
The current paper improves upon these past works by taking advantage of robust generalized Gaussian quadrature codes [4] that were not previously available; these codes construct a quadrature rule that is exact for functions in the linear span of a given Chebyshev system, and can be viewed as a constructive version of Lemma 4.2 of Kreĭn [13]. The resulting fast algorithm presented in §3 simplifies past approaches and has a small runtime constant; in particular, its computational cost is similar to the computational cost of 5-10 Fast Fourier Transforms on data of a similar length, see §5.
1.3. Motivation.
Expressions of the form (1) appear in a number of situations in computational physics. In particular, such expressions arise in connection with the Hilbert transform
$$(\mathcal{H} f)(x) = \frac{1}{\pi}\, \mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{f(y)}{x - y}\, dy.$$
For example, the computation of the projection Pmf of a function f onto the first m + 1 functions in a family of orthogonal polynomials can be reduced to an expression of the form (1) by using the Christoffel–Darboux formula, which is related to the Hilbert transform; we detail the reduction of Pmf to an expression of the form (1) in the following.
Let p0, p1, p2, … be a family of monic polynomials that are orthogonal with respect to the weight w(x) ≥ 0 on (a, b). Consider the projection operator
$$P_m f(x) = \sum_{k=0}^{m} \frac{p_k(x)}{h_k} \int_a^b p_k(y)\, f(y)\, w(y)\, dy,$$
where
$$h_k = \int_a^b p_k(y)^2\, w(y)\, dy.$$
Let x1 , … , xn and w1 , … , wn be the n > m/2 point Gaussian quadrature nodes and weights associated with the weight function w, and set
(5) |
By construction, the polynomial that interpolates the values u1 , … , un at the points x1 , … , xn will accurately approximate Pmf on (a, b) when f is sufficiently smooth, see for example §7.4.6 of Dahlquist and Björck [5]. Directly evaluating (5) would require Ω(n²) operations. In contrast, the algorithm of this paper together with the Christoffel–Darboux formula can be used to evaluate (5) in O(n log n) operations. The Christoffel–Darboux formula states that
(6)
$$\sum_{k=0}^{m} \frac{p_k(x)\, p_k(y)}{h_k} \;=\; \frac{1}{h_m}\, \frac{p_{m+1}(x)\, p_m(y) - p_m(x)\, p_{m+1}(y)}{x - y},$$
see §18.2(v) of [17]. Using (6) to rewrite (5) yields
(7) |
where we have used the fact that the diagonal term of the double summation is equal to f(xj)/hm. The summation in (7) can be rearranged into two expressions of the form (1), and thus the method of this paper can be used to compute a representation of Pmf in O(n log n) operations.
Remark 1.1. Analogs of the Christoffel–Darboux formula hold for many other families of functions; for example, an analogous identity holds for Bessel functions of the first kind Jν, see [21]. This formula can be used to write a projection operator related to Bessel functions in a form analogous to (7), and the algorithm of this paper can be similarly applied.
Remark 1.2. A simple modification of the algorithm presented in this paper can be used to evaluate more general expressions of the form
$$v_j = \sum_{i=1}^{n} \frac{\alpha_i}{y_j - x_i}, \qquad j = 1, \dots, m,$$
where x1 , … , xn are source points, and y1 , … , ym are target points. For simplicity, this paper focuses on the case where the source and target points are the same, which is the case in the projection application described above.
2. Main result
2.1. Main result.
Our principal analytical result is the following theorem, which provides precise accuracy and computational-complexity guarantees for the algorithm of this paper; the algorithm itself is detailed in §3.
Theorem 2.1. Let x1 < … < xn ∈ [a, b] and α1 , … , αn ∈ ℝ be given. Set
$$u_j = \sum_{\substack{i = 1 \\ i \ne j}}^{n} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n.$$
Given δ > 0 and ε > 0, the algorithm described in §3 computes values $\tilde{u}_1, \dots, \tilde{u}_n$ such that
(8)
$$\left| u_j - \tilde{u}_j \right| \;\le\; \varepsilon \sum_{i=1}^{n} |\alpha_i|, \qquad j = 1, \dots, n,$$
in O(n log(δ−1) log(ε−1) + Nδ) operations, where
(9)
$$N_\delta = \#\left\{ (i, j) : i \ne j \ \text{and} \ |x_i - x_j| < \delta(b - a) \right\}.$$
The proof of Theorem 2.1 is given in §4. Under typical conditions, the presented algorithm involves O(n log n) operations. The following corollary describes a case of interest, where the points x1, … , xn are Chebyshev nodes for a compact interval [a, b] (we define Chebyshev nodes in §4.2).
Corollary 2.1. Fix ε = 10−15, and let the points x1 , … , xn be Chebyshev nodes on [a, b]. If δ = 1/n, then the algorithm of §3 involves O(n log n) operations.
The proof of Corollary 2.1 is given in §4.5. The following corollary states that a similar result holds for uniformly random points.
Corollary 2.2. Fix ε = 10−15, and suppose that x1 , … , xn are sampled uniformly at random from [a, b]. If δ = 1/n, then the algorithm of §3 involves O(n log n) operations with high probability.
The proof of Corollary 2.2 is immediate from standard probabilistic estimates. The following remark describes an adversarial configuration of points.
Remark 2.1. Fix ε > 0, and let x1 , … , x2n be a collection of points such that x1 , … , xn and xn+1, … , x2n are evenly spaced in [0, 2−n] and [1 − 2−n, 1], respectively, that is,
$$x_j = \frac{j - 1}{n - 1}\, 2^{-n} \quad \text{and} \quad x_{n + j} = 1 - 2^{-n} + \frac{j - 1}{n - 1}\, 2^{-n}, \qquad j = 1, \dots, n.$$
We claim that Theorem 2.1 cannot guarantee a complexity better than O(n²) for this configuration of points. Indeed, if δ ≥ 2−n, then Nδ ≥ n²/2, and if δ < 2−n, then log2(δ−1) > n. In either case, n log2(δ−1) + Nδ ≥ n²/2, so the operation count guaranteed by Theorem 2.1 is at least of order n².
This complexity is indicative of the performance of the algorithm for this point configuration; the reason that the algorithm performs poorly is that structures exist at two different scales. If such a configuration were encountered in practice, it would be possible to modify the algorithm of §3 to also operate at two different scales and achieve evaluation in O(n log n) operations.
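For intuition about how Nδ behaves, the following sketch (Python/NumPy; the counting routine and its name are ours) counts the near pairs of (9) with a sliding window over the sorted points. With δ = 1/n, Chebyshev nodes produce a near-linear count, while a two-cluster configuration in the spirit of Remark 2.1 produces a quadratic one.

```python
# Sketch (our helper, not from the paper): count N_delta of (9) in
# O(n log n + N_delta) operations with a sliding window over sorted points.
import numpy as np

def count_near_pairs(x, delta):
    """Number of ordered pairs (i, j), i != j, with |x_i - x_j| < delta*(b - a)."""
    x = np.sort(x)
    cutoff = delta * (x[-1] - x[0])
    count, lo = 0, 0
    for j in range(len(x)):
        while x[j] - x[lo] >= cutoff:
            lo += 1
        count += j - lo              # points strictly within cutoff to the left of x[j]
    return 2 * count                 # count both orderings (i, j) and (j, i)

n = 4000
cheb = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
print("Chebyshev nodes:    N_delta =", count_near_pairs(cheb, 1.0 / n))

m = 30                               # two clusters of m points, as in Remark 2.1
clusters = np.concatenate([np.linspace(0.0, 2.0**-m, m),
                           np.linspace(1.0 - 2.0**-m, 1.0, m)])
print("two-cluster points: N_delta =", count_near_pairs(clusters, 2.0**-m))
```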
3. Algorithm
3.1. High level summary.
The algorithm involves passing over the points x1 , … , xn twice. First, we pass over the points in ascending order and compute
(10)
$$u_j^{+} = \sum_{i=1}^{j-1} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n,$$
and second, we pass over the points in descending order and compute
(11)
$$u_j^{-} = \sum_{i=j+1}^{n} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n.$$
Finally, for j = 1, … , n we set
$$u_j = u_j^{+} + u_j^{-}.$$
We call the computation of $u_1^{+}, \dots, u_n^{+}$ the forward pass of the algorithm, and the computation of $u_1^{-}, \dots, u_n^{-}$ the backward pass of the algorithm. The forward pass of the algorithm computes the potential generated by all points to the left of a given point, while the backward pass of the algorithm computes the potential generated by all points to the right of a given point. In §3.2 and §3.3 we give an informal and a detailed description of the forward pass of the algorithm, respectively. The backward pass of the algorithm is identical except that it considers the points in reverse order.
3.2. Informal description.
In the following, we give an informal description of the forward pass of the algorithm, which computes
$$u_j^{+} = \sum_{i=1}^{j-1} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n.$$
Assume that a small set of nodes t1, … , tm and weights w1, … , wm is given such that
(12)
$$\left| \frac{1}{r} - \sum_{k=1}^{m} w_k e^{-r t_k} \right| \;<\; \varepsilon \qquad \text{for all } r \in [\,\delta(b - a),\; b - a\,],$$
where δ > 0 and ε > 0 are given and fixed. The existence and computation of such nodes and weights are described in §4.4 and §5.2, respectively. We divide the sum defining $u_j^{+}$ into two parts:
(13)
$$u_j^{+} = \sum_{i=1}^{j_0} \frac{\alpha_i}{x_j - x_i} \;+\; \sum_{i=j_0+1}^{j-1} \frac{\alpha_i}{x_j - x_i},$$
where j0 = max {i : xj − xi > δ(b − a)}. By definition, the points x1, … , xj0 are all distance at least δ(b − a) from xj. Therefore, by (12),
$$\sum_{i=1}^{j_0} \frac{\alpha_i}{x_j - x_i} \;\approx\; \sum_{k=1}^{m} w_k\, e^{-(x_j - x_{j_0}) t_k} \sum_{i=1}^{j_0} \alpha_i\, e^{-(x_{j_0} - x_i) t_k}.$$
If we define
(14)
$$g_k(j_0) = \sum_{i=1}^{j_0} \alpha_i\, e^{-(x_{j_0} - x_i) t_k}, \qquad k = 1, \dots, m,$$
then it is straightforward to verify that
(15)
$$u_j^{+} \;\approx\; \sum_{k=1}^{m} w_k\, e^{-(x_j - x_{j_0}) t_k}\, g_k(j_0) \;+\; \sum_{i=j_0+1}^{j-1} \frac{\alpha_i}{x_j - x_i}.$$
Observe that we can update gk(j0) to gk(j0 + 1) using the following formula
(16)
$$g_k(j_0 + 1) = e^{-(x_{j_0+1} - x_{j_0}) t_k}\, g_k(j_0) + \alpha_{j_0 + 1}, \qquad k = 1, \dots, m.$$
We can now summarize the algorithm for computing $u_1^{+}, \dots, u_n^{+}$. For each j, we compute $u_j^{+}$ by the following three steps:
(1) Update g1, … , gm as necessary.
(2) Use g1, … , gm to evaluate the potential from the points xi such that xj − xi > δ(b − a).
(3) Directly evaluate the potential from the points xi such that 0 < xj − xi < δ(b − a).
By (16), each update of g1, … , gm requires O(m) operations, and we must update g1, … , gm at most n times, so we conclude that the total cost of the first step of the algorithm is O(nm) operations. For each j = 1, … , n, the second and third steps of the algorithm involve O(m) and O(#{i : 0 < xj − xi < δ(b − a)}) operations, respectively, see (15). It follows that the total cost of the second and third steps of the algorithm is O(nm + Nδ) operations, where Nδ is defined in (9). We conclude that $u_1^{+}, \dots, u_n^{+}$ can be computed in O(nm + Nδ) operations. In §4, we complete the proof of the computational complexity guarantees of Theorem 2.1 by showing that there exist m = O(log(δ−1) log(ε−1)) nodes t1, … , tm and weights w1, … , wm that satisfy (12), where ε > 0 is the approximation error in (12).
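The informal description above translates almost line for line into code. The sketch below (Python/NumPy; this is our illustration, not the authors' Fortran implementation) maintains the quantities gk of (14), applies the diagonal update (16), and evaluates (15); the quadrature nodes and weights are assumed to satisfy (12) on [δ(b − a), b − a].

```python
# Forward pass of Section 3.2 (a sketch with our naming, not the paper's code):
# maintain g_k of (14), update it with the diagonal recurrence (16), and
# evaluate each u_j^+ through the far-field sum (15) plus a direct near field.
import numpy as np

def forward_pass(x, alpha, t, w, delta):
    """u_plus[j] ~ sum_{i<j} alpha[i]/(x[j]-x[i]); x must be sorted increasingly."""
    n, m = len(x), len(t)
    cutoff = delta * (x[-1] - x[0])
    u_plus = np.zeros(n)
    g = np.zeros(m)                  # g_k(j0) of (14), centered at x[p-1]
    p = 0                            # p plays the role of j0
    for j in range(n):
        # absorb every point that lies farther than cutoff to the left of x[j]
        while p < j and x[j] - x[p] > cutoff:
            if p == 0:
                g = alpha[0] * np.ones(m)                          # g_k(1) = alpha_1
            else:
                g = np.exp(-(x[p] - x[p - 1]) * t) * g + alpha[p]  # update (16)
            p += 1
        if p > 0:                    # far field through the exponential sum, see (15)
            u_plus[j] = np.dot(w, np.exp(-(x[j] - x[p - 1]) * t) * g)
        for i in range(p, j):        # near field, evaluated directly
            u_plus[j] += alpha[i] / (x[j] - x[i])
    return u_plus
```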
3.3. Detailed description.
In the following, we give a detailed description of the forward pass of the algorithm, which computes $u_1^{+}, \dots, u_n^{+}$. Suppose that δ > 0 and ε > 0 are given and fixed. We describe the algorithm under the assumption that we are given quadrature nodes t1, … , tm and weights w1, … , wm such that
(17)
$$\left| \frac{1}{r} - \sum_{k=1}^{m} w_k e^{-r t_k} \right| \;<\; \varepsilon \qquad \text{for all } r \in [\,\delta(b - a),\; b - a\,].$$
The existence of such weights and nodes is established in §4.4, and the computation of such nodes and weights is discussed in §5.2. To simplify the description of the algorithm, we assume that x0 = −∞ is a placeholder node that does not generate a potential.
Algorithm 3.1 (forward pass).
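The pseudocode box of Algorithm 3.1 is not reproduced here; the following hedged sketch assembles the full two-pass evaluation of §3.1 from the forward_pass routine sketched after §3.2 and the exp_sum_rule helper from §1.1 (both our constructions): the backward pass is obtained by running the forward pass on the reflected points, and the result is compared against direct O(n²) evaluation.

```python
# Two-pass evaluation of Section 3.1 (sketch; reuses forward_pass and
# exp_sum_rule defined in the earlier sketches).
import numpy as np

def evaluate_potential(x, alpha, t, w, delta):
    """u[j] ~ sum_{i != j} alpha[i]/(x[j]-x[i]) for sorted x."""
    u_fwd = forward_pass(x, alpha, t, w, delta)
    # backward pass: reflecting the axis turns "points to the right" into "points to the left"
    u_bwd = -forward_pass(-x[::-1], alpha[::-1], t, w, delta)[::-1]
    return u_fwd + u_bwd

rng = np.random.default_rng(0)
n = 2000
x = np.sort(rng.uniform(1.0, 10.0, n))
alpha = rng.uniform(0.0, 1.0, n)
delta = 1.0 / n
t, w = exp_sum_rule(delta * (x[-1] - x[0]), x[-1] - x[0])  # valid on [delta*(b-a), b-a]
u = evaluate_potential(x, alpha, t, w, delta)

u_direct = np.empty(n)               # direct O(n^2) evaluation for comparison
for j in range(n):
    d = x[j] - x
    d[j] = np.inf                    # exclude the i = j term
    u_direct[j] = np.sum(alpha / d)
print("max absolute difference from direct evaluation:", np.max(np.abs(u - u_direct)))
```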
Remark 3.1. In some applications, it may be necessary to evaluate an expression of the form (1) for many different weights α1, … , αn associated with a fixed set of points x1, … , xn. For example, in the projection application described in §1.3 the weights α1, … , αn correspond to the function being projected, while the points x1, … , xn are a fixed set of quadrature nodes. In such situations, pre-computing the exponentials $e^{-(x_j - x_{j_0}) t_i}$ used in Algorithm 3.1 will significantly improve the runtime, see §5.1.
4. Proof of Main Result
4.1. Organization.
In this section we complete the proof of Theorem 2.1; the section is organized as follows. In §4.2 we give mathematical preliminaries. In § 4.3 we state and prove two technical lemmas. In §4.4 we prove Lemma 4.5, which together with the analysis in §3 establishes Theorem 2.1. In §4.5 we prove Corollary 2.1, and Corollary 2.2.
4.2. Preliminaries.
Let a < b ∈ ℝ and a positive integer n be fixed, and suppose that a function f : [a, b] → ℝ and points x1 < … < xn ∈ [a, b] are given. The interpolating polynomial P of the function f at x1, … , xn is the unique polynomial of degree at most n − 1 such that
$$P(x_j) = f(x_j), \qquad j = 1, \dots, n.$$
This interpolating polynomial P can be explicitly defined by
(18)
$$P(x) = \sum_{j=1}^{n} f(x_j)\, q_j(x),$$
where qj is the nodal polynomial for xj, that is,
(19)
$$q_j(x) = \prod_{i \ne j} \frac{x - x_i}{x_j - x_i}.$$
We say x1, … , xn are Chebyshev nodes for the interval [a, b] if
(20)
$$x_j = \frac{a + b}{2} + \frac{b - a}{2} \cos\!\left( \frac{(2j - 1)\pi}{2n} \right), \qquad j = 1, \dots, n.$$
The following lemma is a classical result in approximation theory. It says that a smooth function on a compact interval is accurately approximated by the interpolating polynomial of the function at Chebyshev nodes, see for example §4.5.2 of Dahlquist and Björck [5].
Lemma 4.1. Let f ∈ Cn([a, b]), and let x1, … , xn be Chebyshev nodes for [a, b]. If P is the interpolating polynomial for f at x1, … , xn, then
$$\max_{x \in [a, b]} \left| f(x) - P(x) \right| \;\le\; \frac{(b - a)^{n}}{2^{2n - 1}\, n!}\, M_n,$$
where
$$M_n = \max_{\xi \in [a, b]} \left| f^{(n)}(\xi) \right|.$$
In addition to Lemma 4.1, we require a result about the existence of generalized Gaussian quadratures for Chebyshev systems. In 1866, Gauss [8] established the existence of quadrature nodes x1, … , xn and weights w1, … , wn for an interval [a, b] such that
$$\int_a^b f(x)\, dx = \sum_{j=1}^{n} w_j f(x_j)$$
whenever f(x) is a polynomial of degree at most 2n − 1. This result was generalized from polynomials to Chebyshev systems by Kreĭn [13]. A collection of functions f0, … , fn on [a, b] is a Chebyshev system if every nonzero generalized polynomial
$$f(x) = \sum_{k=0}^{n} a_k f_k(x)$$
has at most n distinct zeros in [a, b]. The following result of Kreĭn says that any function in the span of a Chebyshev system of 2n functions can be integrated exactly by a quadrature with n nodes and n weights.
has at most n distinct zeros in [a, b]. The following result of Kreĭn says that any function in the span of a Chebyshev system of 2n functions can be integrated exactly by a quadrature with n nodes and n weights.
Lemma 4.2 (Kreĭn [13]). Let f0, … , f2n−1 be a Chebyshev system of continuous functions on [a, b], and let w : (a, b) → ℝ be a continuous positive weight function. Then, there exist unique nodes x1, … , xn and weights w1, …, wn such that
$$\int_a^b f(x)\, w(x)\, dx = \sum_{j=1}^{n} w_j f(x_j)$$
whenever f is in the span of f0, … , f2n−1.
4.3. Technical Lemmas.
In this section, we state and prove two technical lemmas that are involved in the proof of Theorem 2.1. We remark that a similar version of Lemma 4.3 appears in [18].
Lemma 4.3. Fix a > 0 and t ∈ [0, ∞), and let r1, … , rn be Chebyshev nodes for [a, 2a]. If Pt(r) is the interpolating polynomial for e−rt at r1, … , rn, then
$$\max_{r \in [a, 2a]} \left| e^{-rt} - P_t(r) \right| \;\le\; 4^{-n}.$$
Proof. We have
$$\max_{r \in [a, 2a]} \left| \frac{\partial^n}{\partial r^n} e^{-rt} \right| = t^n e^{-at}.$$
By writing the derivative of tne−ta as
$$\frac{d}{dt}\left( t^n e^{-ta} \right) = t^{n-1} e^{-ta}\left( n - at \right),$$
we can deduce that the maximum of tne−ta occurs at t = n/a, that is,
(21)
$$\max_{t \in [0, \infty)} t^n e^{-ta} = \left( \frac{n}{a} \right)^{n} e^{-n}.$$
By (21) and the result of Lemma 4.1, we conclude that
$$\max_{r \in [a, 2a]} \left| e^{-rt} - P_t(r) \right| \;\le\; \frac{a^n}{2^{2n-1}\, n!} \left( \frac{n}{a} \right)^{n} e^{-n} \;=\; \frac{2\, n^n e^{-n}}{4^{n}\, n!}.$$
It remains to show that 2n^n e^{−n} ≤ n!. Since ln(x) is an increasing function, we have
$$\ln(n!) = \sum_{k=2}^{n} \ln k \;\ge\; \int_1^{n} \ln x \, dx = n \ln n - n + 1.$$
Exponentiating both sides of this inequality gives e n^n e^{−n} ≤ n!, which is a classical inequality related to Stirling's approximation; since e > 2, it follows that 2n^n e^{−n} ≤ n!. This completes the proof. □
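A few lines of Python (our helper names, arbitrary test grids) confirm the 4^{−n} behavior numerically: we interpolate r ↦ e^{−rt} at n Chebyshev nodes on [a, 2a] and compare the worst observed error over a grid in r and t with the bound of Lemma 4.3.

```python
# Quick numerical check of Lemma 4.3 (sketch): Chebyshev interpolation of
# r -> exp(-r*t) on [a, 2a], compared against the bound 4^(-n).
import numpy as np

def cheb_interp_error(a, n, r_grid, t_grid):
    rk = 1.5 * a + 0.5 * a * np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
    worst = 0.0
    for t in t_grid:
        vals = np.exp(-rk * t)
        for r in r_grid:
            q = np.array([np.prod((r - np.delete(rk, k)) / (rk[k] - np.delete(rk, k)))
                          for k in range(n)])       # Lagrange cardinal polynomials
            worst = max(worst, abs(np.exp(-r * t) - q @ vals))
    return worst

a = 1.0
r_grid = np.linspace(a, 2 * a, 60)
t_grid = np.linspace(0.0, 20.0, 60)
for n in (4, 8, 12, 16):
    print(f"n = {n:2d}:  observed error = {cheb_interp_error(a, n, r_grid, t_grid):.2e}"
          f"   4^-n = {4.0**-n:.2e}")
```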
Lemma 4.4. Suppose that ε > 0 and M > 1 are given. Then, there exist
$$m \;\le\; \left( \lfloor \log_2 M \rfloor + 1 \right)\left( \lfloor \log_4 \varepsilon^{-1} \rfloor + 1 \right)$$
values r1, …, rm ∈ [1, M] such that for all r ∈ [1, M] we have
(22)
$$\left| e^{-rt} - \sum_{j=1}^{m} c_j(r)\, e^{-r_j t} \right| \;\le\; \varepsilon \qquad \text{for all } t \in [0, \infty),$$
for some choice of coefficients c1(r), … , cm(r) that depend on r.
Proof. We construct an explicit set of m := (⌊log2 M⌋ + 1)(⌊log4 ε−1⌋ + 1) points and coefficients such that (22) holds. Set n := ⌊log4 ε−1⌋ + 1. We define the points r1, … , rm by
(23)
$$r_{in + k} = 3 \cdot 2^{\,i-1} + 2^{\,i-1} \cos\!\left( \frac{(2k - 1)\pi}{2n} \right),$$
for k = 1, … , n and i = 0, …, ⌊log2 M⌋, so that r_{in+1}, … , r_{(i+1)n} are Chebyshev nodes for the interval [2^i, 2^{i+1}], and we define the coefficients c1(r), … , cm(r) by
(24)
$$c_{in + k}(r) = \begin{cases} q_{in + k}(r) & \text{if } r \in [2^{i}, 2^{i+1}), \\ 0 & \text{otherwise}, \end{cases}$$
for k = 1, …, n and i = 0, … , ⌊log2 M⌋, where q_{in+k} is the nodal polynomial, see (19), of r_{in+k} with respect to the nodes r_{in+1}, … , r_{(i+1)n}. We claim that
$$\left| e^{-rt} - \sum_{j=1}^{m} c_j(r)\, e^{-r_j t} \right| \;\le\; 4^{-n} \qquad \text{for all } r \in [1, M] \text{ and } t \in [0, \infty).$$
Indeed, fix r ∈ [1, M], and let i0 ∈ {0, … , ⌊log2 M⌋} be the unique integer such that r ∈ [2^{i0}, 2^{i0+1}). By the definition of the coefficients, see (24), we have
$$\sum_{j=1}^{m} c_j(r)\, e^{-r_j t} = \sum_{k=1}^{n} q_{i_0 n + k}(r)\, e^{-r_{i_0 n + k}\, t}.$$
We claim that the right-hand side of this equation is the interpolating polynomial P_{t,i0}(r) for e−rt at r_{i0 n + 1}, … , r_{(i0+1)n}, that is,
$$\sum_{k=1}^{n} q_{i_0 n + k}(r)\, e^{-r_{i_0 n + k}\, t} = P_{t, i_0}(r);$$
indeed, see (18) and (19). Since the points r_{i0 n + 1}, … , r_{(i0+1)n} are Chebyshev nodes for the interval [2^{i0}, 2^{i0+1}], and since i0 was chosen such that r ∈ [2^{i0}, 2^{i0+1}), it follows from Lemma 4.3 that
$$\left| e^{-rt} - P_{t, i_0}(r) \right| \;\le\; 4^{-n}.$$
Since n = ⌊log4 ε−1⌋ + 1, we have 4^{−n} ≤ ε, and the proof is complete. □
Remark 4.1. The proof of Lemma 4.4 has the additional consequence that the coefficients c1(r), … , cm(r) in (22) can be chosen so that they are bounded in absolute value by a small constant, uniformly in r. Indeed, in (24) the coefficients cj(r) are either equal to zero or equal to a nodal polynomial, see (19), for Chebyshev nodes on an interval that contains r, and the nodal polynomials for Chebyshev nodes on an interval [a, b] are bounded by a small constant on [a, b], see for example [18]. The fact that e−rt can be approximated as a linear combination of e−r1t, … , e−rmt with small coefficients means that the approximation of Lemma 4.4 can be used in finite precision environments without any unexpected catastrophic cancellation.
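The construction in the proof of Lemma 4.4 is easy to reproduce numerically. The sketch below (our variable names; ε and M are chosen arbitrarily) builds the dyadic Chebyshev grids of (23), evaluates the piecewise interpolation of (24), and reports the worst error found over random r ∈ [1, M], which by Lemma 4.3 should not exceed 4^{−n} ≤ ε.

```python
# Sketch of the dyadic construction of Lemma 4.4 (our naming): interpolate
# r -> exp(-r*t) at n Chebyshev nodes on each dyadic interval [2^i, 2^(i+1)].
import numpy as np

def dyadic_chebyshev_nodes(M, eps):
    n = int(np.floor(np.log(1.0 / eps) / np.log(4.0))) + 1
    nodes = []
    for i in range(int(np.floor(np.log2(M))) + 1):
        a, b = 2.0**i, 2.0**(i + 1)
        theta = (2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n)
        nodes.append((a + b) / 2 + (b - a) / 2 * np.cos(theta))
    return n, nodes

def interpolate_exp(r, t, n, nodes):
    """Approximate exp(-r*t) by interpolation in r on the dyadic interval containing r."""
    i = min(int(np.floor(np.log2(r))), len(nodes) - 1)
    rk = nodes[i]
    q = np.array([np.prod((r - np.delete(rk, k)) / (rk[k] - np.delete(rk, k)))
                  for k in range(n)])         # nodal polynomials q_k(r), see (19)
    return np.exp(-np.outer(t, rk)) @ q       # sum_k q_k(r) * exp(-r_k * t)

M, eps = 1024.0, 1e-10
n, nodes = dyadic_chebyshev_nodes(M, eps)
t = np.linspace(0.0, 50.0, 200)
worst = max(np.max(np.abs(np.exp(-r * t) - interpolate_exp(r, t, n, nodes)))
            for r in np.random.default_rng(1).uniform(1.0, M, 200))
print(f"{n} nodes per dyadic interval, observed error {worst:.2e} (target {eps:.0e})")
```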
4.4. Completing the proof of Theorem 2.1.
Previously in §3.2, we proved that the algorithm of §3 involves O(nm + Nδ) operations, where m is the number of quadrature nodes in (17). To complete the proof of Theorem 2.1 it remains to show that there exist m = O(log(δ−1) log(ε−1)) points t1, … , tm and weights w1, … , wm that satisfy (17); we show the existence of such nodes and weights in the following lemma, and thus complete the proof of Theorem 2.1. The computation of such nodes and weights is described in §5.2.
Lemma 4.5. Fix a < b ∈ ℝ, and let δ > 0 and ε > 0 be given. Then, there exist nodes t1, … , tm and weights w1, … , wm, with m = O(log(δ−1) log(ε−1)), such that
(25)
$$\left| \frac{1}{r} - \sum_{j=1}^{m} w_j e^{-r t_j} \right| \;<\; \varepsilon \qquad \text{for all } r \in [\,\delta(b - a),\; b - a\,].$$
Proof. Fix a < b ∈ ℝ, and let δ, ε > 0 be given. By the possibility of rescaling r, wj, and tj, we may assume that b − a = δ−1, so that we want to establish (25) for r ∈ [1, δ−1]. By Lemma 4.4, applied with accuracy ε/(2 log(2ε−1)) and M = δ−1, we can choose points r0, … , r2m−1 ∈ [1, δ−1], and coefficients c0(r), … , c2m−1(r) depending on r, such that
(26)
$$\left| e^{-rt} - \sum_{j=0}^{2m-1} c_j(r)\, e^{-r_j t} \right| \;\le\; \frac{\varepsilon}{2 \log(2\varepsilon^{-1})} \qquad \text{for all } t \in [0, \infty).$$
The collection of functions e−r0t, … , e−r2m−1t forms a Chebyshev system of continuous functions on the interval [0, log(2ε−1)], see for example [12]. Thus, by Lemma 4.2 there exist m quadrature nodes t1, … , tm and weights w1, … , wm such that
$$\int_0^{\log(2\varepsilon^{-1})} f(t)\, dt = \sum_{j=1}^{m} w_j f(t_j)$$
whenever f(t) is in the span of e−r0t, … , e−r2m−1t. By the triangle inequality
(27)
$$\left| \frac{1}{r} - \sum_{j=1}^{m} w_j e^{-r t_j} \right| \;\le\; \left| \frac{1}{r} - \int_0^{\log(2\varepsilon^{-1})} e^{-rt}\, dt \right| + \left| \int_0^{\log(2\varepsilon^{-1})} e^{-rt}\, dt - \sum_{j=1}^{m} w_j e^{-r t_j} \right|.$$
Recall that we have assumed r ∈ [1, δ−1]; in particular, r ≥ 1, so it follows that
(28)
$$\left| \frac{1}{r} - \int_0^{\log(2\varepsilon^{-1})} e^{-rt}\, dt \right| = \int_{\log(2\varepsilon^{-1})}^{\infty} e^{-rt}\, dt = \frac{e^{-r \log(2\varepsilon^{-1})}}{r} \;\le\; \frac{\varepsilon}{2}.$$
By (26), the function e−rt can be approximated to error ε/(2log(2ε−1)) in the L∞-norm on [0, log(2ε−1)] by functions in the span of e−r0t, … , e−r2m−1t. Since our quadrature is exact for these functions, we conclude that
(29)
$$\left| \int_0^{\log(2\varepsilon^{-1})} e^{-rt}\, dt - \sum_{j=1}^{m} w_j e^{-r t_j} \right| \;\le\; \frac{\varepsilon}{2}.$$
Combining (27), (28), and (29) establishes (25), which completes the proof. □
4.5. Proof of Corollary 2.1.
In this section, we prove Corollary 2.1, which states that the algorithm of §3 involves O(n log n) operations when x1, … , xn are Chebyshev nodes, ε = 10−15, and δ = 1/n.
Proof of Corollary 2.1. By rescaling the problem we may assume that [a, b] = [−1, 1], so that the Chebyshev nodes x1, … , xn are given by
$$x_j = \cos\!\left( \frac{(2j - 1)\pi}{2n} \right), \qquad j = 1, \dots, n.$$
By the result of Theorem 2.1, it suffices to show that Nδ = O(n log n), where
$$N_\delta = \#\left\{ (i, j) : i \ne j \ \text{and} \ |x_i - x_j| < \tfrac{2}{n} \right\}.$$
It is straightforward to verify that the number of Chebyshev nodes within an interval of radius 1/n around a point −1 < x < 1 is O(1 + (1 − x²)−1/2), that is,
$$\#\left\{ j : |x_j - x| < \tfrac{1}{n} \right\} \;\le\; C \left( 1 + \frac{1}{\sqrt{1 - x^2}} \right)$$
for an absolute constant C > 0. This estimate, together with the fact that the first and last Chebyshev nodes are at distance at least 1/n² from 1 and −1, respectively, gives the estimate
(30)
$$N_\delta \;\le\; C \sum_{j=1}^{n} \left( 1 + \frac{1}{\sqrt{1 - x_j^{2}}} \right).$$
Let π/2 > η > 0 be a fixed parameter; writing xj = cos θj with θj = (2j − 1)π/(2n), direct calculation yields
$$\sum_{j=1}^{n} \frac{1}{\sqrt{1 - x_j^{2}}} \;\le\; \frac{n}{\sin \eta} \;+\; 2 \sum_{j \,:\, \theta_j < \eta} \frac{1}{\sin \theta_j} \;=\; O(n \log n).$$
Combining this estimate with (30) yields Nδ = O(n log n), as was to be shown. □
5. Numerical results and implementation details
5.1. Numerical results.
We report numerical results for two different point distributions: uniformly random points in [1, 10], and Chebyshev nodes in [−1, 1]. In both cases, we choose the weights α1, … , αn uniformly at random from [0, 1], and test the algorithm for
$$n = 1000 \cdot 2^{k}, \qquad k = 0, 1, \dots, 10.$$
We time two different versions of the algorithm: a standard implementation, and an implementation that uses precomputed exponentials. Precomputing exponentials may be advantageous in situations where the expression
(31)
$$u_j = \sum_{\substack{i = 1 \\ i \ne j}}^{n} \frac{\alpha_i}{x_j - x_i}, \qquad j = 1, \dots, n,$$
must be evaluated for many different weights α1, … , αn associated with a fixed set of points x1, …, xn, see Remark 3.1. We find that using precomputed exponentials makes the algorithm approximately ten times faster, see Tables 1, 2, and 3. In addition to reporting timings, we report the absolute relative difference between the output of the algorithm of §3 and the output of direct evaluation; we define the absolute relative difference ϵr between the output of the algorithm of §3 and the output of direct calculation by
(32)
$$\epsilon_r = \max_{1 \le j \le n}\; \frac{\left| \tilde{u}_j - u_j \right|}{\sum_{i \ne j} \left| \alpha_i / (x_j - x_i) \right|},$$
where $\tilde{u}_j$ denotes the output of the algorithm of §3 and uj the directly evaluated sum.
Table 1.
Label | Definition |
---|---|
n | number of points |
tw | time of algorithm of §3 without precomputation in seconds |
tp | time of precomputing exponentials for algorithm of §3 in seconds |
tu | time of algorithm of §3 using precomputed exponentials in seconds |
td | time of direct evaluation in seconds |
ϵr | maximum absolute relative difference defined in (32) |
tf | time of FFT using precomputed exponentials (for time comparison only) |
Table 2.
n | tw | tp | tu | td | ϵr |
---|---|---|---|---|---|
1000 | 0.74 E −03 | 0.18 E −02 | 0.93 E −04 | 0.66 E −03 | 0.19 E −14 |
2000 | 0.19 E −02 | 0.31 E −02 | 0.19 E −03 | 0.25 E −02 | 0.30 E −14 |
4000 | 0.42 E −02 | 0.61 E −02 | 0.43 E −03 | 0.10 E −01 | 0.52 E −14 |
8000 | 0.85 E −02 | 0.10 E −01 | 0.89 E −03 | 0.37 E −01 | 0.72 E −14 |
16000 | 0.18 E −01 | 0.25 E −01 | 0.18 E −02 | 0.14 E +00 | 0.92 E −14 |
32000 | 0.38 E −01 | 0.49 E −01 | 0.37 E −02 | 0.59 E +00 | 0.19 E −13 |
64000 | 0.84 E −01 | 0.98 E −01 | 0.78 E −02 | 0.23 E +01 | 0.21 E −13 |
128000 | 0.16E +00 | 0.19 E +00 | 0.18 E −01 | 0.95 E +01 | 0.35 E −13 |
256000 | 0.37 E +00 | 0.53 E +00 | 0.34 E −01 | 0.40 E +02 | 0.59 E −13 |
512000 | 0.75 E +00 | 0.10 E +01 | 0.71 E −01 | 0.19 E +03 | 0.88 E −13 |
1024000 | 0.17 E +01 | 0.23 E +01 | 0.15 E +00 | 0.81 E +03 | 0.14 E −12 |
Table 3.
n | tw | tp | tu | td | ϵr |
---|---|---|---|---|---|
1000 | 0.54 E −03 | 0.12 E −02 | 0.74 E −04 | 0.60 E −03 | 0.11 E −14 |
2000 | 0.15 E −02 | 0.26 E −02 | 0.15 E −03 | 0.24 E −02 | 0.14 E −14 |
4000 | 0.38 E −02 | 0.51 E −02 | 0.37 E −03 | 0.99 E −02 | 0.39 E −14 |
8000 | 0.83 E −02 | 0.10 E −01 | 0.85 E −03 | 0.38 E −01 | 0.35 E −14 |
16000 | 0.19 E −01 | 0.23 E −01 | 0.17 E −02 | 0.14 E +00 | 0.58 E −14 |
32000 | 0.41 E −01 | 0.48 E −01 | 0.37 E −02 | 0.62 E +00 | 0.89 E −14 |
64000 | 0.98 E −01 | 0.90 E −01 | 0.82 E −02 | 0.24 E +01 | 0.12 E −13 |
128000 | 0.22 E +00 | 0.19 E +00 | 0.23 E −01 | 0.10 E +02 | 0.19 E −13 |
256000 | 0.44 E +00 | 0.47 E +00 | 0.32 E −01 | 0.40 E +02 | 0.26 E −13 |
512000 | 0.84 E +00 | 0.94 E +00 | 0.73 E −01 | 0.19 E +03 | 0.52 E −13 |
1024000 | 0.19 E +01 | 0.19 E +01 | 0.14 E +00 | 0.84 E +03 | 0.64 E −13 |
Dividing by the denominator in (32) accounts for the fact that the calculations are performed in finite precision; any remaining loss of accuracy in the numerical results is a consequence of the large number of addition and multiplication operations that are performed. All calculations are performed in double precision, and the algorithm of §3 is run with ε = 10−15. The parameter δ > 0 is set via an empirically determined heuristic. The numerical experiments were performed on a laptop with an Intel Core i5-8350U CPU and 7.7 GiB of memory; the code was written in Fortran and compiled with gfortran with standard optimization flags. The results are reported in Tables 1, 2, and 3.
To put the run time of the algorithm in context, we additionally perform a time comparison to the Fast Fourier Transform (FFT), which also has O(n log n) complexity. Specifically, we compare the run time of the algorithm of §3 on random data using precomputed exponentials with the run time of an FFT implementation from FFTPACK [20] on random data of the same length using precomputed exponentials. We report these timings in Table 4; we find that the FFT is roughly 5-10 times faster than our implementation of the algorithm of §3. We remark that no significant effort was made to optimize our implementation, and that it may be possible to improve the run time by vectorization.
Table 4.
n | tu | tf |
---|---|---|
1000 | 0.91 E − 04 | 0.16 E − 04 |
2000 | 0.28 E − 03 | 0.37 E − 04 |
4000 | 0.41 E − 03 | 0.44 E − 04 |
8000 | 0.93 E − 03 | 0.85 E − 04 |
16000 | 0.18 E − 02 | 0.24 E − 03 |
32000 | 0.38 E − 02 | 0.41 E − 03 |
64000 | 0.81 E − 02 | 0.88 E − 03 |
128000 | 0.18 E − 01 | 0.19 E − 02 |
256000 | 0.38 E − 01 | 0.59 E − 02 |
512000 | 0.71 E − 01 | 0.12 E − 01 |
1024000 | 0.14 E + 00 | 0.25 E − 01 |
5.2. Computing nodes and weights.
The algorithm of §3 is described under the assumption that nodes t1, … , tm and weights w1, … , wm are given such that
(33)
$$\left| \frac{1}{r} - \sum_{j=1}^{m} w_j e^{-r t_j} \right| \;<\; \varepsilon \qquad \text{for all } r \in [\,\delta(b - a),\; b - a\,],$$
where ε > 0 and δ > 0 are fixed parameters. As in the proof of Lemma 4.5 we note that by rescaling r it suffices to find nodes and weights satisfying
(34)
$$\left| \frac{1}{r} - \sum_{j=1}^{m} w_j e^{-r t_j} \right| \;<\; \varepsilon \qquad \text{for all } r \in [1, M].$$
Indeed, if the nodes t1, … , tm and weights w1, … , wm satisfy (34) with M = δ−1, then rescaling the nodes and weights as in the proof of Lemma 4.5 yields nodes and weights satisfying (33). Thus, in order to implement the algorithm of §3 it suffices to tabulate nodes and weights that are valid for r ∈ [1, M] for various values of M. In the implementation used in the numerical experiments in this paper, we tabulated nodes and weights valid for r ∈ [1, M] for several values of M.
For example, in Tables 5 and 6 we list m = 33 nodes t1, … , t33 and weights w1, … , w33 for which the approximation (34) holds for all r ∈ [1, 1024].
Table 5.
0.2273983006898589D−03, 0.1206524521003404D−02, 0.3003171636661616D−02, |
0.5681878572654425D−02, 0.9344657316017281D−02, 0.1414265501822061D−01, |
0.2029260691940998D−01, 0.2809891134697047D−01, 0.3798133147119762D−01, |
0.5050795277167632D−01, 0.6643372693847560D−01, 0.8674681067847460D−01, |
0.1127269233505314D+00, 0.1460210820252656D+00, 0.1887424688689547D+00, |
0.2435986924712581D+00, 0.3140569015209982D+00, 0.4045552087678740D+00, |
0.5207726670656921D+00, 0.6699737362118449D+00, 0.8614482005965975D+00, |
0.1107074709906516D+01, 0.1422047253849542D+01, 0.1825822499573290D+01, |
0.2343379511131976D+01, 0.3006948272874077D+01, 0.3858496861353812D+01, |
0.4953559345813267D+01, 0.6367677940017810D+01, 0.8208553424367139D+01, |
0.1064261195532074D+02, 0.1396688222191633D+02, 0.1889449184151398D+02 |
Table 6.
0.5845245927410881D−03, 0.1379782337905140D−02, 0.2224121503815854D−02, |
0.3150105276431181D−02, 0.4200370923383030D−02, 0.5431379037435571D−02, |
0.6918794756934398D−02, 0.8763225538492927D−02, 0.1109565843047196D−01, |
0.1408264766413004D−01, 0.1793263393523491D−01, 0.2290557147478609D−01, |
0.2932752351846237D−01, 0.3761087060298772D−01,0.4828044150885936D−01, |
0.6200636888239893D−01, 0.7964527252809662D−01, 0.1022921587521237D+00, |
0.1313462348178323D+00, 0.1685948994092301D+00, 0.2163218289369589D+00, |
0.2774479391081561D+00, 0.3557192797195578D+00, 0.4559662159666857D+00, |
0.5844792718191478D+00, 0.7495918095861060D+00, 0.9626599456939077D+00, |
0.1239869481076760D+01, 0.1605927580173348D+01, 0.2102583514906888D+01, |
0.2811829220697454D+01, 0.3937959064316012D+01, 0.6294697335695096D+01 |
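Assuming that Table 5 lists the nodes t1, … , t33 and Table 6 the corresponding weights w1, … , w33 (the text above introduces them in that order), the following few lines of Python evaluate the exponential sum against 1/r over r ∈ [1, 1024]; the Fortran-style D exponents of the tables become e exponents in Python.

```python
# Check of the tabulated 33-node rule (assuming Table 5 = nodes, Table 6 = weights).
import numpy as np

t = np.array([  # nodes t_1, ..., t_33 from Table 5
    0.2273983006898589e-03, 0.1206524521003404e-02, 0.3003171636661616e-02,
    0.5681878572654425e-02, 0.9344657316017281e-02, 0.1414265501822061e-01,
    0.2029260691940998e-01, 0.2809891134697047e-01, 0.3798133147119762e-01,
    0.5050795277167632e-01, 0.6643372693847560e-01, 0.8674681067847460e-01,
    0.1127269233505314e+00, 0.1460210820252656e+00, 0.1887424688689547e+00,
    0.2435986924712581e+00, 0.3140569015209982e+00, 0.4045552087678740e+00,
    0.5207726670656921e+00, 0.6699737362118449e+00, 0.8614482005965975e+00,
    0.1107074709906516e+01, 0.1422047253849542e+01, 0.1825822499573290e+01,
    0.2343379511131976e+01, 0.3006948272874077e+01, 0.3858496861353812e+01,
    0.4953559345813267e+01, 0.6367677940017810e+01, 0.8208553424367139e+01,
    0.1064261195532074e+02, 0.1396688222191633e+02, 0.1889449184151398e+02])
w = np.array([  # weights w_1, ..., w_33 from Table 6
    0.5845245927410881e-03, 0.1379782337905140e-02, 0.2224121503815854e-02,
    0.3150105276431181e-02, 0.4200370923383030e-02, 0.5431379037435571e-02,
    0.6918794756934398e-02, 0.8763225538492927e-02, 0.1109565843047196e-01,
    0.1408264766413004e-01, 0.1793263393523491e-01, 0.2290557147478609e-01,
    0.2932752351846237e-01, 0.3761087060298772e-01, 0.4828044150885936e-01,
    0.6200636888239893e-01, 0.7964527252809662e-01, 0.1022921587521237e+00,
    0.1313462348178323e+00, 0.1685948994092301e+00, 0.2163218289369589e+00,
    0.2774479391081561e+00, 0.3557192797195578e+00, 0.4559662159666857e+00,
    0.5844792718191478e+00, 0.7495918095861060e+00, 0.9626599456939077e+00,
    0.1239869481076760e+01, 0.1605927580173348e+01, 0.2102583514906888e+01,
    0.2811829220697454e+01, 0.3937959064316012e+01, 0.6294697335695096e+01])

r = np.geomspace(1.0, 1024.0, 5000)
err = np.abs(1.0 / r - np.exp(-np.outer(r, t)) @ w)
print("max absolute error:", err.max(), "  max relative error:", (err * r).max())
```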
The nodes and weights satisfying (34) can be computed by using a procedure for generating generalized Gaussian quadratures for Chebyshev systems together with the proof of Lemma 4.4. Indeed, the construction of Lemma 4.5 is constructive with the exception of the step that invokes Lemma 4.2 of Kreĭn. The procedure described in [4] is a constructive version of Lemma 4.2: given a Chebyshev system of functions, it generates the corresponding quadrature nodes and weights. We remark that generalized Gaussian quadrature generation codes are powerful tools for numerical computation with a wide range of applications. The quadrature generation code used in this paper was an optimized version of [4] recently developed by Serkh for [19].
Acknowledgements.
The authors would like to thank Jeremy Hoskins for many useful discussions. Certain commercial equipment is identified in this paper to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that equipment identified is necessarily the best available for the purpose.
N.F.M. was supported in part by NSF DMS-1903015.
V.R. was supported in part by AFOSR FA9550-16-1-0175 and ONR N00014-14-1-0797.
Contributor Information
ZYDRUNAS GIMBUTAS, National Institute of Standards and Technology, Boulder, CO 80305, USA.
NICHOLAS F. MARSHALL, Department of Mathematics, Princeton University, Princeton, NJ 08540, USA
VLADIMIR ROKHLIN, Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA.
References
- [1].Beylkin Gregory and Monzón Lucas, Approximation by exponential sums revisited, Appl. Comput. Harmon. Anal 28 (2010), no. 2, 131–149. MR2595881 [Google Scholar]
- [2].Beylkin Gregory and Monzón Lucas, On approximation of functions by exponential sums, Appl. Comput. Harmon. Anal 19 (2005), no. 1, 17–48. MR2147060 [Google Scholar]
- [3].Braess Dietrich, Nonlinear approximation theory, Springer Series in Computational Mathematics, vol. 7, Springer-Verlag, Berlin, 1986. MR866667 [Google Scholar]
- [4].Bremer James, Gimbutas Zydrunas, and Rokhlin Vladimir, A nonlinear optimization procedure for generalized Gaussian quadratures, SIAM J. Sci. Comput 32 (2010), no. 4, 1761–1788. MR2671296 [Google Scholar]
- [5].Dahlquist Germund and Björck Åke, Numerical methods, Dover Publications, Inc., Mineola, NY, 2003, Translated from the Swedish by Ned Anderson, Reprint of the 1974 English translation. MR1978058 [Google Scholar]
- [6].Dutt A, Gu M, and Rokhlin V, Fast algorithms for polynomial interpolation, integration, and differentiation, SIAM J. Numer. Anal 33 (1996), no. 5, 1689–1711. MR1411845 [Google Scholar]
- [7].Fong William and Darve Eric, The black-box fast multipole method, J. Comput. Phys 228 (2009), no. 23, 8712–8725. MR2558773 [Google Scholar]
- [8].Gauss CF. Methodus nova integralium valores per approximationem inveniendi, Werke, 3 (1866), 163–196. [Google Scholar]
- [9].Greengard Leslie, The rapid evaluation of potential fields in particle systems, ACM Distinguished Dissertations, MIT Press, Cambridge, MA, 1988. MR936632 [Google Scholar]
- [10].Greengard Leslie and Rokhlin Vladimir, A new version of the fast multipole method for the Laplace equation in three dimensions, Acta numerica, 1997, Acta Numer., vol. 6, Cambridge Univ. Press, Cambridge, 1997, pp. 229–269. MR1489257 [Google Scholar]
- [11].Jakob-Chien Rüdiger and Alpert Bradley K., A fast spherical filter with uniform resolution, Journal of Computational Physics 136 (1997), no. 2, 580–584. [Google Scholar]
- [12].Karlin Samuel and Studden William J., Tchebycheff systems: With applications in analysis and statistics, Pure and Applied Mathematics, Vol. XV, Interscience Publishers John Wiley & Sons, New York-London-Sydney, 1966. MR0204922 [Google Scholar]
- [13].Kreĭn MG, The ideas of P. L. Čebyšev and A. A. Markov in the theory of limiting values of integrals and their further development, Amer. Math. Soc. Transl. (2) 12 (1959), 1–121. MR0113106 [Google Scholar]
- [14].Ma J, Rokhlin V, and Wandzura S, Generalized Gaussian quadrature rules for systems of arbitrary functions, SIAM J. Numer. Anal 33 (1996), no. 3, 971–996. MR1393898 [Google Scholar]
- [15].Martinsson Per-Gunnar, Rokhlin Vladimir, and Tygert Mark, On interpolation and integration in finite-dimensional spaces of bounded functions, Commun. Appl. Math. Comput. Sci 1 (2006), 133–142. MR2244272 [Google Scholar]
- [16].Nabors K, Korsmeyer FT, Leighton FT, and White J, Preconditioned, adaptive, multipole-accelerated iterative methods for three-dimensional first-kind integral equations of potential theory, SIAM J. Sci. Comput 15 (1994), no. 3, 713–735, Iterative methods in numerical linear algebra (Copper Mountain Resort, CO, 1992). MR1273161 [Google Scholar]
- [17].NIST Digital Library of Mathematical Functions. http://dlmf.nist.gov/, Release 1.0.22 of 2019-03-15. Olver FWJ, Olde Daalhuis AB, Lozier DW, Schneider BI, Boisvert RF, Clark CW, Miller BR and Saunders BV, eds
- [18].Rokhlin V, A fast algorithm for the discrete Laplace transformation, J. Complexity 4 (1988), no. 1, 12–32. MR939693 [Google Scholar]
- [19].Serkh Kirill, On the Solution of Elliptic Partial Differential Equations on Regions with Corners, ProQuest LLC, Ann Arbor, MI, 2016, Thesis (Ph.D.)–Yale University. MR3564124 [Google Scholar]
- [20].Swarztrauber PN, Vectorizing the FFTs, Parallel Computations (Rodrigue G, ed.), Academic Press, 1982, pp. 51–83. [Google Scholar]
- [21].Tygert M. Analogues for Bessel Functions of the Christoffel-Darboux Identity. Yale Tech. Rep (2016). [Google Scholar]
- [22].Yarvin Norman and Rokhlin Vladimir, An improved fast multipole algorithm for potential fields on the line, SIAM J. Numer. Anal 36 (1999), no. 2, 629–666. MR1675269 [Google Scholar]
- [23].Yarvin N and Rokhlin V, Generalized Gaussian quadratures and singular value decompositions of integral operators, SIAM J. Sci. Comput 20 (1998), no. 2, 699–718. MR1642612 [Google Scholar]