Published in final edited form as: SIAM J Imaging Sci. 2022 May 26;15(2):670–700. doi: 10.1137/21m143529x

Steerable Near-Quadrature Filter Pairs in Three Dimensions

Tommy M Tang , Hemant D Tagare

Abstract

Steerable filter pairs that are near quadrature have many image processing applications. This paper proposes a new methodology for designing such filters. The key idea is to design steerable filters by minimizing a departure-from-quadrature function. These minimizing filter pairs are almost exactly in quadrature. The polar part of the filters is nonnegative, monotonic, and highly focused around an axis, and asymptotically the filters achieve exact quadrature. These results are established by exploiting a relation between the filters and generalized Hilbert matrices. These near-quadrature filters closely approximate three dimensional Gabor filters. We experimentally verify the asymptotic mathematical results and further demonstrate the use of these filter pairs by efficient calculation of local Fourier shell correlation of cryogenic electron microscopy.

Keywords: steerable filters, quadrature, filters, quadratic programming, Hilbert matrices

Keywords: 68U10, 90C20

1. Introduction.

Many image and video processing applications benefit from evaluating local directional properties of images. For example, local spatio-temporal directional change is useful in understanding motion in video sequences [6]. Directional space-frequency analysis is also useful in understanding texture and other features of images [10].

Local directional properties of an image can be computationally difficult to calculate, because they have to be calculated at every pixel and for every direction. Steerable filters were introduced in [16] to simplify this calculation. Loosely speaking, a steerable filter is a directional (i.e., an anisotropic) filter with the property that it can be oriented along any direction by simply taking a linear combination of a finite set of its orientations (the precise definition is given in section 2). Thus, by first convolving the image with the finite set of filters, local properties of an image along any additional direction can be calculated simply by taking a linear combination of the finite convolutions. Computational complexity is reduced because a smaller number of convolutions are involved [16, 17, 29].

Besides steerability, other properties are also desirable in filters used for local directional analysis. Ideally, a quadrature filter pair is required for accurate representation. The filters should also be directionally focused for high orientation sensitivity; a narrower filter will yield a higher angular resolution analysis of orientation [15]. And the filter should have a small space-frequency uncertainty for localization [9]. As we will show, all of these properties cannot be obtained simultaneously, and this raises the question of whether some of the properties can be obtained exactly, while others are obtained approximately.

In this paper, we approach this problem by designing directional, steerable filter pairs that are near-quadrature. The main idea behind our approach is to explicitly minimize a departure-from-quadrature objective function over a set of steerable filters. The results of this minimization are surprising. It turns out that in three dimensions, there are steerable filters that are almost exactly in quadrature. Moreover, as the complexity of these filters increases, they converge to exact quadrature, while the angular width of the filters converges to zero. These filters also have low space-frequency uncertainty. Thus these filters appear to be well-behaved in terms of all four properties mentioned above.

Below, we establish several properties of these filters mathematically, using the Karush–Kuhn–Tucker (KKT) conditions of the departure-from-quadrature minimization problem, and by exploiting a close relation between the minimizers and generalized Hilbert matrices.

An important application of steerable filters is the approximation of Gabor filters. Gabor filters achieve minimum space-frequency uncertainty and are commonly used in texture and motion analysis [9]. However, Gabor filters are not steerable. In section 6.1, we show that the steerable filters we derive can closely approximate Gabor filters. And because these filters are steerable, they can be used to approximate Gabor filters oriented in any direction. Thus they may be used in many of the myriad imaging tasks in which Gabor filters are desirable, with the advantage that local properties may be calculated with decreased computational complexity.

A significant use of approximating Gabor filters by steerable filters is the analysis of local frequency content in cryogenic electron microscopy (cryo-EM) [37]. Cryo-EM is a technique for reconstructing the three dimensional shape of biological macromolecules, usually proteins or protein complexes [14]. In section 6.2, we show how our steerable filters can be used in cryo-EM. Cryo-EM and various terms associated with it are explained in detail in section 6.2.

1.1. Organization of the paper.

The rest of the paper is organized as follows: A literature review is provided immediately below in section 1.2. Section 2 introduces steerable filters and related terminology. Section 3 formulates and solves the design of near-quadrature filters. section 4 establishes asymptotic properties of the solution. Section 5 contains results of numerical experiments confirming these properties. Section 6 contains applications of the steerable filters: In Section 6.1 we show how steerable filters can approximate Gabor filters. Section 6.2 applies this result to cryo-EM. Section 7 contains comments and a brief discussion. Section 8 concludes the paper. The proofs of the theorems in section 4 are relegated to the appendix.

1.2. Literature review.

As mentioned above, axially symmetric steerable filters were introduced in [16], which defined steerability, established basic properties of steerable filters, and demonstrated their use in measuring orientation, angularly adaptive filtering, and contour detection. Steerable filters efficiently compute local properties of signals and functions [17, 29]. Steerable 3 dimensional (3D) wavelets based on the Riesz transform are proposed in [7]. These wavelets form a frame, but are not highly directional.

There is a close connection between steerable filters and representations of Lie groups SO(2) and SO(3) [22, 35]. Representation theory is used in [35] to describe the relationship between the steering angles and the method of producing steering coefficients with which to steer the basis. Formulas connecting the representations of the group with steering coefficients are also found in [22]. The formulas in [22] use properties of spherical harmonics.

3D steerable filters are used in a number of applications. For example, texture (such as hair, or certain kinds of biological tissue) and other image features often have directional orientation in an image. Finding the orientation of these features can be made computationally efficient with steerable filters [19]. Examples of the use of steerable filters for this purpose can be found in biomedical image analysis in 2 dimensions, e.g., [10], as well as in 3 dimensions, e.g. [1, 5].

3D steerable filters are also used for motion analysis in video sequences [6, 20, 40, 41]. Here the idea is to stack the sequence of 2 dimensions (2D) video frames as a 3D image, and then to determine the motion using the orientation of a 3D steerable filter. In another application to video analysis, 3D steerable filters are used to colorize video [28].

Steerable filters are used in [36] to analyze the quality of edges for calculating 3D shape-from-focus from a video sequence. As an object moves close to the focal plane of a lens, its edges become sharper in the 2D image that it forms. Thus the quality of edges, and changes thereof, provide monocular cues to the 3D depth of an object. Steerable 3D filters are used in the 2D × time stack to analyze how the quality of edges changes through the video sequence [36].

More recently, steerable functions have been used in convolutional neural networks (CNN). Classical CNNs use convolutions in the first layer, and hence are translationally equivariant (translation of the data causes translation of the output of the first layer). Additional 3D rotational equivariance is obtained for the CNNs if the convolution kernels are obtained as linear combination of steerable functions [39]. Equivariant CNNs [23] are also used for texture classification [2]. CNNs that use 3D steerable filters can also be arranged to provide 3D rotational invariance [3]. As mentioned above, steerability is closely related to group representations. It has been argued that group representation methods, which make machine learning models equivariant, are critical for applications in physical sciences, where symmetry plays a key role [33].

Note that none of the 3D steerable filters mentioned above are near-quadrature; in fact, many are not even filter pairs (in the sense of quadrature).

Gabor filters achieve the lower bound of the space-frequency uncertainty described by the Heisenberg uncertainty principle and are therefore considered optimal for use in many imaging tasks [9]. However, Gabor filters cannot be steered exactly [21] and several approximations have been proposed. These include approximately steering a Gabor filter [21] or constructing approximate Gabor filters or approximate Hilbert pairs that are exactly steerable [16, 21]. These methods are mostly focused on 2, rather than 3 dimensions.

2. Background.

This section defines the technical terms used in the paper.

2.1. Axial symmetry.

We begin by defining radial-polar separable functions, and axially symmetric functions in R3.

Definition 2.1.

A function $f : \mathbb{R}^3 \to \mathbb{R}$ is radial-polar separable if $f(x, y, z) = w(r)\,\phi(u, v, w)$ for $(x, y, z) \neq 0$, where $r = \sqrt{x^2 + y^2 + z^2}$ and $(u, v, w) = (x/r, y/r, z/r)$. The radial part of the function is $w$, and the polar part of the function is $\phi$.

The arguments of the polar part $\phi$ are points $(u, v, w)$ which lie on the unit sphere. That is, the polar part is a function from the unit sphere to the real line, $\phi : S^2 \to \mathbb{R}$.

Let ⟨u, v⟩ denote the usual inner product of vectors u, vR3.

Definition 2.2.

A radial-polar function is axially symmetric with respect to a unit-length axis a = (α, β, γ) if the polar part can be written as

$$\phi(u, v, w) = p\big(\langle (u, v, w), a \rangle\big), \qquad (2.1)$$

where $(u, v, w) \in S^2$ and $p : [-1, 1] \to \mathbb{R}$ is a real-valued function.

By convention, we always write the axis of an axially symmetric function as a unit vector. Below, we explicitly denote this axis as a superscript of the function. That is, we write an axially symmetric function f with the axis a as fa.

An axially symmetric function can be rotated so that its axis of symmetry is any axis we desire. We adopt a canonical way of describing such rotated functions: we define the original function using the positive z-axis as its axis, and then replace this axis by an axis a for a rotated copy of the function. Thus the original function is specified as f(0,0,1):

$$f^{(0,0,1)}(x, y, z) = w(r)\, p\big(\langle (x/r, y/r, z/r), (0, 0, 1) \rangle\big) = w(r)\, p(z/r). \qquad (2.2)$$

The rotated version of $f^{(0,0,1)}$ is $f^{a}$. Note that the polar part $p(z/r)$ of $f^{(0,0,1)}$ is a function of the $z$-coordinate of points on the unit sphere. This coordinate takes values in $[-1, 1]$.

2.2. Steerable axially symmetric functions.

Definition 2.3.

An axially symmetric function $f$ is steerable if there exists a finite set of axes $a_1, \ldots, a_M \in \mathbb{R}^3$ such that for any axis $a \in \mathbb{R}^3$

$$f^{a} = \sum_{j=1}^{M} k_j(a)\, f^{a_j}, \qquad (2.3)$$

where kj are real-valued functions of a.

In the above definition, the functions $k_j$ are steering functions, and their values $k_j(a)$ are steering coefficients. The functions $f^{a_j}$ are the steering basis.

How can we find steerable axially symmetric functions? Theorem 2.4 below, established in [11, 16], gives a recipe. The theorem uses the finite set of axes a1, . . . , aM, where the coordinates of the axis aj are aj = (αj, βj, γj). The theorem requires the p function in (2.2) to be a polynomial.

Theorem 2.4.

For f(0,0,1) as defined in (2.2), let p : [−1, 1] → R be an odd or even polynomial of z/r of order N. Then f(0,0,1) is steerable, i.e.,

$$f^{a} = \sum_{j=1}^{M} k_j(a)\, f^{a_j} \qquad (2.4)$$

for $a = (\alpha, \beta, \gamma)$ and axes $a_1 = (\alpha_1, \beta_1, \gamma_1), \ldots, a_M = (\alpha_M, \beta_M, \gamma_M)$ if and only if $M \geq \binom{N+2}{2}$, and the $k_j$ satisfy:

$$\begin{pmatrix} \alpha^N \\ \alpha^{N-1}\beta \\ \alpha^{N-1}\gamma \\ \alpha^{N-2}\beta^2 \\ \alpha^{N-2}\beta\gamma \\ \vdots \\ \gamma^N \end{pmatrix} = \begin{pmatrix} \alpha_1^N & \alpha_2^N & \cdots & \alpha_M^N \\ \alpha_1^{N-1}\beta_1 & \alpha_2^{N-1}\beta_2 & \cdots & \alpha_M^{N-1}\beta_M \\ \alpha_1^{N-1}\gamma_1 & \alpha_2^{N-1}\gamma_2 & \cdots & \alpha_M^{N-1}\gamma_M \\ \alpha_1^{N-2}\beta_1^2 & \alpha_2^{N-2}\beta_2^2 & \cdots & \alpha_M^{N-2}\beta_M^2 \\ \alpha_1^{N-2}\beta_1\gamma_1 & \alpha_2^{N-2}\beta_2\gamma_2 & \cdots & \alpha_M^{N-2}\beta_M\gamma_M \\ \vdots & \vdots & & \vdots \\ \gamma_1^N & \gamma_2^N & \cdots & \gamma_M^N \end{pmatrix} \begin{pmatrix} k_1(a) \\ k_2(a) \\ k_3(a) \\ \vdots \\ k_M(a) \end{pmatrix}. \qquad (2.5)$$

In other words, the recipe for creating a steerable axially symmetric function is the following: Choose an odd or even Nth order polynomial p and a radial function w to define f(0,0,1). Also choose a fixed set of axes a1, . . . , aM. To steer f(0,0,1) so that its axis is a, solve (2.5) to get the steering coefficients kj(a). Then, use the linear sum in (2.4).
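To make the recipe concrete, the following minimal sketch (Python/NumPy; the helper names monomials and steering_coefficients are ours, not from the paper) solves the linear system (2.5) for the steering coefficients and checks the steering identity (2.4) on the polar part p(t) = t^N. It assumes the basis axes are chosen so that the monomial matrix has full rank (random unit vectors generically suffice).

```python
import numpy as np
from itertools import combinations_with_replacement

def monomials(axis, N):
    """All degree-N monomials alpha^n1 * beta^n2 * gamma^n3 of a unit axis,
    enumerated in a fixed order; there are (N+2 choose 2) of them."""
    vals = []
    for combo in combinations_with_replacement(range(3), N):
        m = 1.0
        for idx in combo:
            m *= axis[idx]
        vals.append(m)
    return np.array(vals)

def steering_coefficients(a, basis_axes, N):
    """Solve (2.5): stack the monomial vectors of the basis axes as columns and
    match the monomial vector of the target axis a (least squares)."""
    A = np.column_stack([monomials(b, N) for b in basis_axes])
    k, *_ = np.linalg.lstsq(A, monomials(a, N), rcond=None)
    return k

# Quick check of (2.4) for p(t) = t^N with N = 2 and M = 6 = C(4, 2) basis axes.
rng = np.random.default_rng(0)
N = 2
basis = [v / np.linalg.norm(v) for v in rng.normal(size=(6, 3))]
a = np.array([0.0, 0.0, 1.0])
k = steering_coefficients(a, basis, N)
u = rng.normal(size=3); u /= np.linalg.norm(u)       # arbitrary direction on the sphere
lhs = np.dot(u, a) ** N
rhs = sum(kj * np.dot(u, b) ** N for kj, b in zip(k, basis))
print(abs(lhs - rhs))                                # ~0 up to round-off
```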

Although the above definitions allow the function p to be a constant or a zero function, the interesting cases are where p is neither.

2.3. From functions to filters.

In image processing, when an image is convolved with a function, the function is called a filter, and its Fourier transform is called the frequency response of the filter. To distinguish a function from its Fourier transform, we will denote the former by an ordinary font and the latter by a calligraphic font. Thus the function $V : \mathbb{R}^3 \to \mathbb{R}$ has a Fourier transform $\mathcal{V} : \mathbb{R}^3 \to \mathbb{C}$.

2.4. Quadrature.

Let $p_e$ and $p_o$ be even and odd functions (not necessarily polynomials) defined on $[-1, 1]$, and define axially symmetric even and odd functions $f_e^{(0,0,1)}(x,y,z) = w(r)\, p_e(z/r)$ and $f_o^{(0,0,1)}(x,y,z) = w(r)\, p_o(z/r)$ for some nonzero radial function $w(r)$. Then, as long as $f_e^{(0,0,1)}$ and $f_o^{(0,0,1)}$ have finite $L^2$ norm, we may take $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ to be frequency responses of a pair of axially symmetric filters. Referring to $f_e^{(0,0,1)}$ or $i f_o^{(0,0,1)}$ as the "frequency response of a filter" is a bit cumbersome. In the interest of simple terminology, with a slight abuse, we will refer to $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ themselves as filters.

The notion of quadrature filters is well defined for 1 dimensional (1D) signals. Two 1D filters are in quadrature if they are Hilbert transforms of each other [27]. In other words, their Fourier transforms, say $\mathcal{F}_1$ and $\mathcal{F}_2$, are related by

$$\mathcal{F}_2(\omega) = \begin{cases} i\,\mathcal{F}_1(\omega) & \text{for } \omega > 0, \\ 0 & \text{for } \omega = 0, \\ -i\,\mathcal{F}_1(\omega) & \text{for } \omega < 0. \end{cases} \qquad (2.6)$$

If we assume that $\mathcal{F}_1$ is even, then $\mathcal{F}_2$ is odd. This notion of quadrature filters does not generalize easily to 2 or higher dimensions [12]. However, for the purposes of this paper, we do not need a complete generalization. We only need a notion of quadrature for $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ (viewed as frequency responses of filters). Following (2.6) we say that $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ (viewed as $\mathcal{F}_1$ and $\mathcal{F}_2$) are in quadrature if

$$i f_o^{(0,0,1)}(\omega) = \begin{cases} i\, f_e^{(0,0,1)}(\omega) & \text{for } \omega^T (0,0,1)^T > 0, \\ 0 & \text{for } \omega^T (0,0,1)^T = 0, \\ -i\, f_e^{(0,0,1)}(\omega) & \text{for } \omega^T (0,0,1)^T < 0. \end{cases} \qquad (2.7)$$

Observe that with this definition, if we restrict the two 3D functions $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ to any line through the origin, the restricted 1D functions satisfy (2.6), the definition of quadrature in one dimension. Now, recognizing that the radial parts $w(r)$ are identical for both filters, and that the polar parts are functions on the sphere, we have the following.

Definition 2.5.

The filters $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ are in quadrature if the functions $p_e$ and $p_o$ satisfy $p_e = p_o > 0$ on the north hemisphere and $p_e = p_o = 0$ on the equator of the sphere.

If $w(r)$ is chosen to peak around $\omega_0$ and to be close to zero away from $\omega_0$, then $f_e^{(0,0,1)}$ and $f_o^{(0,0,1)}$ can be said to measure the local frequency content at $(0, 0, \pm\omega_0)$. Further, if $f_e^{(0,0,1)}$ and $f_o^{(0,0,1)}$ can be steered, then the frequency content local to any direction, at distance $\omega_0$, can be measured. This is appealing, and it raises the question: can the axially symmetric filters $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ be designed to be steerable and also in quadrature? Unfortunately, the answer is no. To see this, recall that the polar parts of axially symmetric functions oriented along the z-axis must be functions of the z-coordinate of points on the unit sphere. And for such functions to have a finite steering basis, the polar parts must in fact be polynomials [25, 32]. That is, the polar parts of the filter pair $f_e^{(0,0,1)}$ and $i f_o^{(0,0,1)}$ must consist of an even polynomial and an odd polynomial.

But then Definition 2.5 requires that the polar parts of our two filters agree entirely on the interval (0, 1]. Requiring an odd polynomial and even polynomial to agree on (0, 1] is impossible if both polynomials are nonzero. Thus, steerable filters designed according to Theorem 2.4 cannot be in exact quadrature.

However we can try to get these filters approximately in quadrature by minimizing a departure-from-quadrature (DQ) objective function, as shown in the next section.

3. Departure from quadrature (DQ).

To define DQ, let z : S2 → [−1, 1] denote the z-axis coordinate function on the unit sphere. Then, the DQ of pe and po is

$$\int_{S^2_+} \big(p_e(z(a)) - p_o(z(a))\big)^2\, da = \int_0^{\pi/2} \big(p_e(\cos\theta) - p_o(\cos\theta)\big)^2 \sin\theta\, d\theta = \int_0^1 \big(p_e(z) - p_o(z)\big)^2\, dz, \qquad (3.1)$$

where $S^2_+$ is the north hemisphere, $da$ is the infinitesimal area on the unit sphere, and $\theta$ is the zenith angle.

Below, we minimize this DQ measure while imposing two natural constraints on pe and po:

$$p_e(0) = p_o(0) = 0 \quad \text{and} \quad p_e(1) = p_o(1) = 1. \qquad (3.2)$$

The first constraint is the equator constraint in the definition of quadrature. The second constraint is the north hemisphere constraint of the quadrature evaluated at the north pole. Together, the constraints guarantee that the polynomials are not constant.

3.1. Minimizing DQ.

Minimizing the DQ of (3.1) subject to the constraints of (3.2) reduces to a standard quadratic program: Set $p(z) = p_e(z) - p_o(z) = a_N z^N + a_{N-1} z^{N-1} + \cdots + a_0$. Then, the DQ objective function is $\int_0^1 p^2(z)\, dz$. In addition, the constraint $p_e(0) = p_o(0) = 0$ translates to $a_0 = 0$. Setting $a_0 = 0$ to satisfy this constraint, and letting $a = (a_1, \ldots, a_N)$ be the vector of the remaining coefficients, the DQ objective function is

$$\int_0^1 p(z)^2\, dz = \sum_{i,j=1}^{N} a_i a_j \int_0^1 z^{i+j}\, dz = a^T H a, \qquad (3.3)$$

where $H_{ij} = \frac{1}{i+j+1}$. The constraint $p_e(1) = p_o(1) = 1$ translates to $\sum_{i\ \mathrm{even}} a_i = 1$ and $\sum_{i\ \mathrm{odd}} a_i = -1$. Setting $B^T = \begin{pmatrix} 1 & 0 & 1 & 0 & \cdots \\ 0 & 1 & 0 & 1 & \cdots \end{pmatrix}$, this constraint can be written as $B^T a = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Thus the minimization problem can be put into the following form.

Problem 3.1.

$$\hat{a} = \underset{a \in \mathbb{R}^N}{\arg\min}\ \tfrac{1}{2}\, a^T H a \quad \text{subject to} \quad B^T a = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \qquad (3.4)$$

In this form, the first set of constraints in (3.2) is implicit in the objective function, while the second set of constraints is explicitly imposed. The matrix H is an instance of a generalized Hilbert matrix (with p = 2).

Definition 3.2.

Let $p \geq 0$ be an integer. Then, a generalized Hilbert matrix is a matrix $H \in M_N(\mathbb{R})$ with entries $H_{ij} = \frac{1}{i+j-1+p}$.

All generalized Hilbert matrices are positive-definite [8]. Thus, Problem 3.1 is a standard quadratic program.

The Lagrangian for the quadratic program is

$$L(a, \lambda) = \tfrac{1}{2}\, a^T H a - \lambda^T \left(B^T a - \begin{pmatrix} -1 \\ 1 \end{pmatrix}\right),$$

where $\lambda = (\lambda_1\ \lambda_2)^T$ is the Lagrange multiplier. According to KKT theory [18] the optimal $\hat{a}$ is obtained by setting $\frac{\partial L}{\partial a} = \frac{\partial L}{\partial \lambda} = 0$, which gives

$$\hat{a} = H^{-1} B \lambda, \quad \text{and} \quad \lambda = K^{-1} \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \qquad (3.5)$$

where K = BTH−1B. Thus,

$$\hat{a} = H^{-1} B K^{-1} \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \qquad (3.6)$$

The H matrix is invertible because it is positive-definite. Moreover, because the columns of B (rows of BT) are linearly independent, K is also positive-definite, and hence invertible. Thus the solution in (3.6) is well-defined.

The optimal $\hat{a}$ gives the optimizing polynomial $p_{\hat{a}}(z) = \hat{a}_N z^N + \cdots + \hat{a}_1 z$, from which the optimal even and odd polynomials are available as $p_{e,\hat{a}}(z) = \sum_{i\ \mathrm{even}} \hat{a}_i z^i$ and $p_{o,\hat{a}}(z) = -\sum_{i\ \mathrm{odd}} \hat{a}_i z^i$. Note the change in the sign of the coefficients for the odd polynomial. The optimal polynomial of higher order in the pair is either the odd or the even polynomial, depending on whether N is odd or even.
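For reference, a short numerical sketch of this construction (Python/NumPy; the helper names are ours, not code from the paper). It builds H and B for a given N, evaluates the closed-form solution (3.6), and splits â into even and odd polynomial coefficients; note that H is ill-conditioned for large N, so high orders may require extended-precision arithmetic.

```python
import numpy as np

def optimal_pair(N):
    """Solve Problem 3.1: minimize (1/2) a^T H a subject to B^T a = (-1, 1)^T,
    where H_ij = 1/(i+j+1) for i, j = 1..N (generalized Hilbert matrix, p = 2)."""
    idx = np.arange(1, N + 1)
    H = 1.0 / (idx[:, None] + idx[None, :] + 1.0)
    B = np.zeros((N, 2))
    B[0::2, 0] = 1.0                      # rows with odd index i = 1, 3, ...
    B[1::2, 1] = 1.0                      # rows with even index i = 2, 4, ...
    Hinv_B = np.linalg.solve(H, B)
    lam = np.linalg.solve(B.T @ Hinv_B, np.array([-1.0, 1.0]))
    a_hat = Hinv_B @ lam                  # equation (3.6)
    even_coeffs = {i: a for i, a in zip(idx, a_hat) if i % 2 == 0}   # p_e coefficients
    odd_coeffs = {i: -a for i, a in zip(idx, a_hat) if i % 2 == 1}   # p_o (sign flipped)
    dq = a_hat @ H @ a_hat                # departure from quadrature (3.1)
    return a_hat, even_coeffs, odd_coeffs, dq

for N in (3, 4, 5, 6):
    print(N, optimal_pair(N)[3])          # DQ decreases with N (cf. Figure 1)
```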

4. Properties of the optimal solution.

The minimizing polynomial and the resulting even and odd optimal polynomials have a number of interesting properties, expressed as various theorems below. These theorems address the behavior we explained as being desirable in steerable filters: The theorems explore angular concentration, closeness to quadrature, and polar monotonicity. The proofs of these theorems are postponed to section A of the appendix. The proofs depend critically on H being a generalized Hilbert matrix.

4.1. Positivity and shape of the polynomial.

We begin by characterizing the signs of the odd and even components of the minimizing a^ of (3.6). The ith component of a^ is denoted a^i, i = 1, . . . , N.

Theorem 4.1.

The even components of the optimum vector a^ are positive and the odd components are negative, i.e., a^i>0 for i even, and a^i<0 for i odd.

An immediate consequence of the signs of the odd and even components is the following.

Corollary 4.2.

The optimal even and odd polynomials pe,a^, po,a^ have positive coefficients. Consequently, they are positive and have positive derivatives on (0, 1].

The filters fe(0,0,1) and ifo(0,0,1) use these polynomials for the polar part. And the positivity of the polynomials gives a single lobe to the filters in the north and south hemispheres; there are no sidelobes. That is, fe(0,0,1) and ifo(0,0,1) take high values along the axis (0, 0, 1) and their values decrease monotonically as the angle from the axis increases. That is, the filters fe(0,0,1) and ifo(0,0,1) are axially “concentrated” or “focused.”

The next theorem shows that the concentration around the axis (0, 0, 1) increases with N, the order of the polynomial. Asymptotically as N → ∞, the support of the filters shrinks to just the axis.

Theorem 4.3.

  1. For any fixed i, the ith component $\hat{a}_i \to 0$ as N → ∞.

  2. The optimal polynomials $p_{e,\hat{a}}$, $p_{o,\hat{a}}$ converge pointwise to 0 in the interval [0, 1) as N → ∞, while $p_{e,\hat{a}}(1) = p_{o,\hat{a}}(1) = 1$. That is, the optimal polynomials converge to the indicator function of 1 on [0, 1].

Thus the optimal polynomials become more and more concentrated towards the endpoint z = 1 as N increases. Because the polynomials converge pointwise to 0 in [0, 1), they converge pointwise to 0 in (−1, 1) (the polynomials are odd and even). Hence their L1 and L2 norms converge to 0 on [−1, 1].

Since the optimal odd and even polynomials are the polar parts of steerable filters, the steerable filters become more concentrated around their axes as N increases.

4.2. The asymptotic behavior of the DQ.

The optimal polynomials (and hence the filters) are designed to minimize the DQ. The following theorem shows that the DQ vanishes as N, the order of the polynomial increases.

Theorem 4.4.

The minimum value of the DQ approaches 0 as N → ∞, i.e., $\tfrac{1}{2}\hat{a}^T H \hat{a} \to 0$ as N → ∞.

In other words, exact quadrature can be approached as closely as desired by increasing the order of the polynomials. In fact, numerical calculations in section 5 show that the DQ appears to decrease exponentially with N.

The theorems above show that filters designed by using the minimizing polynomials have the desired properties: The filters are steerable. The filters are concentrated around their axis, and do not change sign in the north and south hemispheres. Moreover, the filters are close to quadrature, and the concentration around the axis and closeness to quadrature can be arbitrarily increased by increasing N.

5. Numerical results.

We now present numerical results about the optimal polynomials and the resulting filters.

The theorems in section 4 gave asymptotics for several quantities. To gain additional insight, we numerically investigate the behavior of these quantities for large but finite N. We especially investigate the rate at which these quantities converge.

Below we plot the DQ (3.1) and other approximation errors of the filters. We report these errors as fractions, dividing the errors by the L2 norm of the filter or of some other quantity. In each case, we indicate what the denominators of these normalized errors are.

5.1. Optimal polynomials.

Figure 1 shows the even and odd optimal polynomials for N = 3, 4, 5, 6 in the interval [−1, 1]. The positivity and monotonic behavior on (0, 1] is quite clear from the plots. The increasing concentration near 1 with increasing N is also clear. Figure 1 contains a table which shows the DQ and the normalized DQ values for these polynomials. DQ is normalized by dividing by the L2 norm of the polynomial over the interval [−1, 1]. Note the rapid decrease of the DQ and the normalized DQ with increasing N, even for these relatively low order polynomials.

Figure 1. Even and odd optimal pairs of polynomials plotted in the interval [−1, 1] for N = 3, 4, 5, 6. The even polynomial is displayed in red and the odd polynomial in blue. The table shows the DQ and normalized DQ values for the polynomial of higher order in each pair (labeled by N).

Figure 2 shows the relation between the DQ, the normalized DQ, and N in greater detail. Figure 2(a) shows the logarithm of the DQ as a function of N for N = 3, . . . , 40. The figure also shows a linear fit to the calculated values. The linear fit (fit to the data from N = 7 up to N = 40) suggests that the DQ decreases with N as exp(−0.115N). Figure 2(b) shows the logarithm of the normalized DQ as a function of N. The normalized DQ is defined as the ratio of the DQ to the L2 norm squared of the highest order polynomial in the optimal even/odd pair. The linear fit (fit to the data from N = 7 up to N = 40) suggests that the normalized DQ decreases with N as exp(−0.079N). Figure 2 provides support for Theorem 4.4.

Figure 2. (a) The logarithm of the DQ loss versus N with a linear fit. (b) The logarithm of the normalized DQ loss versus N with a linear fit. The normalized DQ is defined as the DQ divided by the L2 norm squared of the highest order optimal even/odd polynomial pair of order N.

Figure 3 provides support for Theorem 4.3, which says that the polynomials converge to 0 on [0, 1) (hence on (−1, 1)). Figure 3 shows the L2 norm squared of the highest order polynomial (amongst the even/odd optimal polynomials) as a function of N for N = 3, . . . , 40. Note the monotonic decrease in the L2 norm. This is consistent with Theorem 4.3(b).

Figure 3. L2 norm squared of the optimal polynomial of higher order versus N.

6. Applications.

In this section, we focus on two applications of steerable filters. The first is that of approximating 3D Gabor filter pairs. This is discussed below in section 6.1. The second application, presented in section 6.2, uses the 3D Gabor filter approximations to understand the local Fourier shell correlation in 3D reconstructions of the structure of protein molecules.

6.1. Approximating 3D Gabor filter pairs.

Steerable filters using the optimal polynomials can approximate 3D Gabor filter pairs. Some notation is necessary to describe the approximation succinctly: When we fix N and obtain the optimal polynomial pair for a given maximum degree, we are simultaneously determining the order of both polynomials in the pair: the higher order polynomial will be of degree N and the other will be of degree N − 1. For example, if we set N = 5, then our higher order polynomial will be of degree 5 and therefore odd, whereas the even polynomial in the pair will have degree 4. Below, we denote the higher order optimal polynomial as $p_N$ and the other polynomial as $p_{N-1}$, regardless of whether N is even or odd.

Our approach to approximating 3D Gabor filter pairs is the following: We construct a steerable filter of the form

$$\hat{\mathcal{G}}(x, y, z) = e^{-a_2 (r - r_0)^2 - a_1 (r - r_0) - a_0}\, p_N(z/r),$$

where $x, y, z$ are coordinates in the Fourier domain, $r = \sqrt{x^2 + y^2 + z^2}$, and $r_0$ is the frequency at which the radial part of the function has a maximum. The coefficients $a_0, a_1, a_2$ are parameters to be adjusted. This filter is even if N is even, else it is odd. And regardless of whether it is even or odd, it is axially symmetric about the z-axis. Note that the first exponential term is the radial part of the filter (i.e., the term $w(r)$), whereas the polynomial term is the polar part of the filter.

With our method, the polar part of the filter is determined entirely by the choice of N; the radial part depends on $a_2, a_1, a_0$. Given N, we adjust $a_2, a_1, a_0$ such that the degree-N filter in our pair best approximates its counterpart in the frequency response of the Gabor filter pair. We then use the same radial function for the filter of degree N − 1. For example, if N = 17, then we choose $a_2, a_1, a_0$ so that $\hat{\mathcal{G}}(x, y, z)$ best approximates the imaginary part (the odd part) of the frequency response of the Gabor filter pair, and then use those same coefficients for our even filter of degree 16, which approximates the real part of the frequency response of the Gabor filter pair.

Let $\mathcal{G}(x, y, z)$ be the frequency response of the chosen Gabor filter with $r_0$ as the peak frequency. Then $\mathcal{G}$ is approximated by $\hat{\mathcal{G}}$ by numerically minimizing the $L^2$ norm of $\mathcal{G} - \hat{\mathcal{G}}$ with respect to $a_0, a_1, a_2$, using MATLAB's fminsearch. Finally, we take

$$e^{-a_2 (r - r_0)^2 - a_1 (r - r_0) - a_0}\, p_{N-1}(z/r)$$

as an approximation of the other Gabor filter in the Gabor filter pair.
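A rough sketch of this fitting step is shown below (Python, with scipy.optimize.minimize as a stand-in for MATLAB's fminsearch; the helper name fit_radial and the initial guess are ours). It assumes the optimal polynomial coefficients of order N are already available, e.g. from the quadratic program of section 3, and that the target Gabor frequency response has been sampled on the same centered grid.

```python
import numpy as np
from scipy.optimize import minimize

def fit_radial(poly_coeffs, gabor_freq_response, r0=32.0, grid=128):
    """Fit a2, a1, a0 so that exp(-a2(r-r0)^2 - a1(r-r0) - a0) * p_N(z/r)
    best matches the sampled Gabor frequency response in the L2 sense."""
    ax = np.arange(grid) - grid // 2
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
    R = np.sqrt(X**2 + Y**2 + Z**2).astype(float)
    R[R == 0] = 1e-9                         # avoid division by zero at the origin
    P = np.polyval(poly_coeffs, Z / R)       # polar part p_N(z/r), highest degree first

    def loss(params):
        a2, a1, a0 = params
        W = np.exp(-a2 * (R - r0) ** 2 - a1 * (R - r0) - a0)   # radial part
        return np.sum((W * P - gabor_freq_response) ** 2)

    result = minimize(loss, x0=np.array([0.002, 0.0, 0.5]), method="Nelder-Mead")
    return result.x                          # fitted (a2, a1, a0)
```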

To illustrate this idea with a concrete example, we take the Fourier transform of a 3D Gabor filter pair sampled on a grid of 128 × 128 × 128 voxels. We let the frequency peak of the Fourier transform of the filter be located at 32 voxels from the zero frequency in the x, z plane, and we let the width (standard deviation) of the peak be 9.697 voxels [26]. To be precise, the exact formulas for our Gabor filter pairs are given by

$$\mathcal{G}_{\mathrm{odd}}(x, y, z) = \frac{1}{2}\exp\left\{-\frac{x^2 + y^2 + (z - 32)^2}{2\,(9.697)^2}\right\} - \frac{1}{2}\exp\left\{-\frac{x^2 + y^2 + (z + 32)^2}{2\,(9.697)^2}\right\},$$
$$\mathcal{G}_{\mathrm{even}}(x, y, z) = \frac{1}{2}\exp\left\{-\frac{x^2 + y^2 + (z - 32)^2}{2\,(9.697)^2}\right\} + \frac{1}{2}\exp\left\{-\frac{x^2 + y^2 + (z + 32)^2}{2\,(9.697)^2}\right\}.$$

Next we perform the optimization as described above to obtain steerable approximator filters for these filters. We take the filters giving the minimum $L^2$ norm of $\mathcal{G} - \hat{\mathcal{G}}$ for a range of N. As shown in Figure 5, the minimum $L^2$ norm varies with N, with the smallest minimum achieved at N = 23. For this value of N, the optimal radial part is

$$W(r) = \exp\left\{-0.0021\,(r - 32)^2 - 0.0102\,(r - 32) - 0.6492\right\}.$$

Figure 5. The L2 norm of $\mathcal{G} - \hat{\mathcal{G}}$ divided by the L2 norm of $\mathcal{G}$ for the even and odd Gabor filter approximations.

The accompanying polynomials are

$$p_{\mathrm{odd}}(z') = 0.0661 z'^{23} + 0.085 z'^{21} + 0.098 z'^{19} + 0.1042 z'^{17} + 0.1038 z'^{15} + 0.0979 z'^{13} + 0.0903 z'^{11} + 0.089 z'^{9} + 0.1026 z'^{7} + 0.1101 z'^{5} + 0.0522 z'^{3} + 0.0008 z',$$
$$p_{\mathrm{even}}(z') = 0.1175 z'^{22} + 0.1014 z'^{20} + 0.0918 z'^{18} + 0.0889 z'^{16} + 0.0924 z'^{14} + 0.0999 z'^{12} + 0.1056 z'^{10} + 0.0999 z'^{8} + 0.0823 z'^{6} + 0.1095 z'^{4} + 0.0109 z'^{2},$$

where z′ = z/r.

The first two rows of the left column of Figure 4 show the frequency responses of the even and odd parts of the Gabor filter pair in the x, z plane in the Fourier domain. The first two rows of the right column of Figure 4 show the frequency responses of the approximating steerable filters for N = 23. The bottom two rows of the figure show the steering behavior of the approximating filters: the left column shows brute-force rotation of the even and odd Gabor pair, and the right column shows the steered filters calculated according to (2.4).

Figure 4. The real and imaginary parts of the Fourier transforms of the Gabor filters, and their corresponding optimal approximators. The original filters are shown in the first two rows; the rotated filters are shown in the bottom two rows, with the Gabor filters rotated via brute force and the approximators steered using a steering basis.

Figure 5 shows the normalized $L^2$ norm of the difference of the even and odd Gabor filters and their steering approximations, i.e., $\mathcal{G} - \hat{\mathcal{G}}$, as a function of N. For normalization, the $L^2$ norm of $\mathcal{G} - \hat{\mathcal{G}}$ is divided by the $L^2$ norm of $\mathcal{G}$. The minimum value that this normalized norm takes is 0.1107, attained at N = 23.

Gabor filters are useful because they have minimum space-frequency uncertainty. It is useful to compare the space-frequency uncertainty of our steerable approximations for each N with the space-frequency uncertainty of the Gabor filters that they are approximating. For any function $f : \mathbb{R}^3 \to \mathbb{R}$, the space-frequency uncertainty is the product

$$\left(\int_{\mathbb{R}^3} |x|^2\, |f(x)|^2\, dx\right)\left(\int_{\mathbb{R}^3} |\xi|^2\, |\mathcal{F}(\xi)|^2\, d\xi\right),$$

where $\mathcal{F}$ denotes the Fourier transform of the function $f$. For all $f$ with unit $L^2$ norm, this product has a lower bound, which is achieved when $\mathcal{F}$ is the odd or even part of the Gabor Fourier transform. We computed the space-frequency uncertainty for the even approximator filter for different N and compared it with the space-frequency uncertainty of the even part of the Gabor Fourier transform (the uncertainty of the odd part is identical to that of the even part). Figure 6 plots the ratio of the steerable filter uncertainty to the Gabor filter uncertainty as a function of N. The ratio has a minimum at N = 17, where its value is 1.0747. Note that in the range N = 10 to N = 25 the steerable filter uncertainty is within 10% of the Gabor filter uncertainty.
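The uncertainty ratios in Figure 6 can be reproduced, up to grid effects, with a simple discrete estimate of the two second moments. The sketch below (Python; the helper is ours, working in grid units, which cancel when ratios of uncertainties are compared) assumes the filter is supplied as a centered, sampled frequency response.

```python
import numpy as np

def uncertainty_product(freq_response):
    """Estimate (int |x|^2 |f|^2 dx)(int |xi|^2 |F|^2 dxi) on a centered grid.
    freq_response: 3D array holding the sampled frequency response F."""
    F = np.asarray(freq_response, dtype=complex)
    f = np.fft.fftshift(np.fft.ifftn(np.fft.ifftshift(F)))   # spatial filter, centered
    n = F.shape[0]
    ax = np.arange(n) - n // 2
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
    r2 = (X**2 + Y**2 + Z**2).astype(float)
    spatial = np.sum(r2 * np.abs(f) ** 2) / np.sum(np.abs(f) ** 2)   # normalized 2nd moment
    spectral = np.sum(r2 * np.abs(F) ** 2) / np.sum(np.abs(F) ** 2)
    return spatial * spectral   # compare as a ratio between two filters (cf. Figure 6)
```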

Figure 6. The ratios obtained by dividing the uncertainty products of our approximators by the uncertainty product of the Gabor filter. Note that the minimum value of 1.0747 is achieved at N = 17 and the low values are maintained in the range N = 10 to N = 25.

6.1.1. Choice of steering directions.

To select steering directions, we used the ParticleSampleSphere from the S2 Sampling Toolbox [31]. This function gives approximately uniform M points on a sphere. For m steering directions, we let M = 2m and select the directions represented in the upper half-sphere (when M is even, the points in the lower hemisphere lie on the same directional axes as the points in the upper hemisphere).
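ParticleSampleSphere is a MATLAB routine; as a rough stand-in (not the construction used in the paper), the sketch below draws approximately uniform points from a Fibonacci lattice and keeps those in the upper hemisphere, one per directional axis.

```python
import numpy as np

def upper_hemisphere_directions(m):
    """Approximately uniform directions: 2*m Fibonacci-lattice points on the
    sphere, keeping the roughly m points with z >= 0."""
    M = 2 * m
    k = np.arange(M) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / M)
    azimuth = np.pi * (1.0 + np.sqrt(5.0)) * k        # golden-angle spacing
    pts = np.stack([np.sin(polar) * np.cos(azimuth),
                    np.sin(polar) * np.sin(azimuth),
                    np.cos(polar)], axis=1)
    return pts[pts[:, 2] >= 0.0]
```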

6.2. Local Fourier shell correlation (FSC) in Cryo-EM.

As mentioned in section 1, cryo-EM is a modern method for reconstructing the 3D molecular structure of proteins [14]. In brief, cryo-EM works as follows: Protein molecules are purified from biological material and frozen in a thin layer of vitreous ice. This sample is imaged with a transmission electron microscope. The resulting image contains many tomographic images of the protein at random orientations. Several thousand such images are used to reconstruct the 3D structure of the protein. Mathematically, each reconstructed structure is a map from $\mathbb{R}^3$ to $\mathbb{R}$. Figures 7(a)–(b) show one example; this is the reconstructed structure of the capsaicin receptor TRPV1 obtained from the Electron Microscopy Data Bank (EMDB entry 5778, https://www.emdataresource.org) [24]. The background has been masked in the figure.

Figure 7. (a)–(b) Structure of the capsaicin receptor TRPV1. (c) FSC of the structure. The structure has a resolution of 3.275 Å as reported in [24].

Cryo-EM images are extremely noisy, and this noise percolates into the 3D reconstruction. This raises the possibility that artifacts due to noise may be misinterpreted as valid features of the structure. To avoid this, a form of cross validation is used. The set of all images is split into two subsets with an equal number of images, and a 3D structure is reconstructed from each subset.

Suppose that the two reconstructions are $V_1 : \mathbb{R}^3 \to \mathbb{R}$ and $V_2 : \mathbb{R}^3 \to \mathbb{R}$ with Fourier transforms $\mathcal{V}_1$ and $\mathcal{V}_2$. Let $S_\omega$ be the surface of a sphere of radius $\omega$ in the Fourier domain. This surface contains all frequencies which have the same wavelength but different orientations. Define the FSC of the two volumes restricted to this surface as

$$\mathrm{FSC}(\omega) = \mathrm{Re}\, \frac{\int_{S_\omega} \mathcal{V}_1(u)\, \mathcal{V}_2^*(u)\, du}{\sqrt{\int_{S_\omega} \mathcal{V}_1(u)\, \mathcal{V}_1^*(u)\, du \int_{S_\omega} \mathcal{V}_2(u)\, \mathcal{V}_2^*(u)\, du}}, \qquad (6.1)$$

where u is a point in Sω and du is a differential area in Sω. In other words, FSC(ω) is the correlation of the Fourier transforms of the two volumes on Sω. High FSC values indicate a good signal-to-noise ratio for frequencies in Sω, and low FSC values indicate poor signal-to-noise ratio for frequencies in Sω.

It is standard practice in cryo-EM to plot FSC(ω) as a function of the radius ω (the radius is converted to a wavelength, which is expressed in angstroms). See Figure 7(c) for the FSC of the TRPV1 structure (the EMDB provides the two volumes with which to calculate the FSC; these volumes, masked to suppress the background, were used to calculate the FSC in the figure). Typically, the value of ω, translated from frequency to wavelength, at which the FSC equals 0.143 is taken as the resolution of the reconstruction [30]. It is the highest spatial frequency which can be reliably interpreted in the structure. Features of the structure at finer spatial scales are suspect. All of this is standard practice in cryo-EM [14]. A cryo-EM structure cannot be published or included in the EMDB without FSC analysis and a measurement of resolution. The resolution of the TRPV1 receptor shown in Figure 7 is 3.275 Å.

One problem with the FSC is that it characterizes the volume as a whole. But often, even simple visual inspection shows that the resolution of most reconstructed volumes is not globally uniform; it can vary quite a bit locally. To deal with this, various local resolutions are defined in cryo-EM, and reviewed in [37]. The idea behind local resolution is to replace the Fourier transforms used in (6.1) by local versions of the Fourier transforms.

By analogy with (6.1), a local FSC (LFSC) at a point xR3 in the volume can be defined as

$$\mathrm{LFSC}(x, \omega) = \mathrm{Re}\, \frac{\int_{S_\omega} \ddot{V}_1(x, u)\, \ddot{V}_2^*(x, u)\, du}{\sqrt{\int_{S_\omega} \ddot{V}_1(x, u)\, \ddot{V}_1^*(x, u)\, du \int_{S_\omega} \ddot{V}_2(x, u)\, \ddot{V}_2^*(x, u)\, du}}, \qquad (6.2)$$

where $\ddot{V}_1(x, u) = \mathcal{F}^{-1}\big(\mathcal{V}_1\, (\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}})\big)(x)$ and $\ddot{V}_2(x, u) = \mathcal{F}^{-1}\big(\mathcal{V}_2\, (\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}})\big)(x)$, and $\mathcal{F}^{-1}$ is the inverse Fourier transform. In addition, $\mathcal{G}^u_{\mathrm{even}}$ and $\mathcal{G}^u_{\mathrm{odd}}$ are the Fourier transforms of the even and odd Gabor filters with center frequency $u$. The idea is that $\mathcal{F}^{-1}\big(\mathcal{V}_1\, (\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}})\big)(x)$ is the convolution of $V_1$ and $\mathcal{F}^{-1}\big(\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}}\big)$ evaluated at $x$. Because $\mathcal{F}^{-1}\big(\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}}\big)$ is a complex function whose real and imaginary parts are Gaussian-enveloped cosine and sine functions, this convolution measures the local frequency-$u$ content in $V_1$ at $x$. Similarly $\mathcal{F}^{-1}\big(\mathcal{V}_2\, (\mathcal{G}^u_{\mathrm{even}} + i\, \mathcal{G}^u_{\mathrm{odd}})\big)(x)$ measures the local frequency-$u$ content of $V_2$. Thus, (6.2) is a local FSC.

In practice, the LFSC is calculated by approximating the integrals in the numerator and denominator of (6.2) by Riemann sums [38]. That is, a finite number of vectors $u$ are chosen to lie uniformly on the surface $S_\omega$, and the integrals are approximated by Riemann sums involving these $u$'s. Each term in the Riemann sum requires a convolution with $\mathcal{G}^u_{\mathrm{even}}$ and $\mathcal{G}^u_{\mathrm{odd}}$, which is computationally expensive.
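For orientation, here is a heavily simplified sketch of the brute-force calculation (Python; all helper names are ours, the Gabor frequency responses are modeled as the Gaussian bumps of section 6.1, and random directions stand in for a uniform covering of $S_\omega$). The steerable-filter method described next replaces the per-direction filters with linear combinations of a fixed steerable basis.

```python
import numpy as np

def gabor_pair_freq(shape, u, omega, sigma):
    """Even/odd Gabor frequency responses: Gaussian bumps at +/- omega*u."""
    ax = [np.arange(n) - n // 2 for n in shape]
    X = np.stack(np.meshgrid(*ax, indexing="ij"), axis=-1).astype(float)
    gp = np.exp(-np.sum((X - omega * np.asarray(u)) ** 2, axis=-1) / (2 * sigma**2))
    gm = np.exp(-np.sum((X + omega * np.asarray(u)) ** 2, axis=-1) / (2 * sigma**2))
    return 0.5 * (gp + gm), 0.5 * (gp - gm)            # (G_even, G_odd)

def lfsc_bruteforce(V1, V2, omega, sigma, n_dirs=300, seed=0):
    """Local FSC (6.2) by a Riemann sum over n_dirs directions u on S_omega."""
    F1 = np.fft.fftshift(np.fft.fftn(V1))
    F2 = np.fft.fftshift(np.fft.fftn(V2))
    rng = np.random.default_rng(seed)
    us = rng.normal(size=(n_dirs, 3))
    us /= np.linalg.norm(us, axis=1, keepdims=True)    # stand-in for uniform sampling
    num = np.zeros(V1.shape)
    den1 = np.zeros(V1.shape)
    den2 = np.zeros(V1.shape)
    for u in us:
        ge, go = gabor_pair_freq(V1.shape, u, omega, sigma)
        filt = ge + 1j * go
        v1 = np.fft.ifftn(np.fft.ifftshift(F1 * filt))  # local frequency-u content of V1
        v2 = np.fft.ifftn(np.fft.ifftshift(F2 * filt))
        num += np.real(v1 * np.conj(v2))
        den1 += np.abs(v1) ** 2
        den2 += np.abs(v2) ** 2
    return num / np.sqrt(den1 * den2 + 1e-30)
```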

But Gabor filters aligned along different $u \in S_\omega$ are merely 3D rotations of each other. Thus we can use the optimal approximation of the Gabor filter pairs with the steerable filters of section 6.1 to advantage. The idea is to optimally approximate the even and odd Gabor filters $\mathcal{G}^u_{\mathrm{even}}$ and $\mathcal{G}^u_{\mathrm{odd}}$ with the corresponding 3D steerable filters $p^u_{\mathrm{even}}$ and $p^u_{\mathrm{odd}}$ aligned along $u$. Then, we need only evaluate the convolutions with the steerable bases of these filters. Appropriate linear sums of the convolutions with the steerable bases give the convolutions with $p^u_{\mathrm{even}}$ and $p^u_{\mathrm{odd}}$ at the desired $u$'s. These convolutions can be taken to be $\ddot{V}_1$ and $\ddot{V}_2$ in the LFSC. This saves computation, because there are fewer steerable basis functions than $u$'s.

As noted above, the FSC curve in Figure 7(c) reaches a value of 0.143 at 3.275 Å. We take $S_\omega$ to be the surface of the sphere which corresponds to 3D frequencies whose wavelength is 3.275 Å. To calculate the LFSC, we approximated the integrals in the numerator and denominator with Riemann sums. The sums are evaluated at 300 $u$'s that are uniformly spaced on $S_\omega$. We first calculated the LFSC using Gabor filters oriented along these $u$'s. The Gabor filters have a peak frequency exactly at the frequency corresponding to $S_\omega$. The width of the Gabor filter frequency response was set to 1.094 Å (20.7059 voxels), so that the entire Gaussian envelope of the frequency response of the Gabor filter could be contained in the FFT of the volume. Of course, calculating the LFSC using this brute force approach required 300 × 2 convolutions with the Gabor filters for each volume, for a total of 1200 convolutions.

Figures 8(a)–(b) show the LFSC calculated using Gabor filter convolutions. The TRPV1 receptor is rendered in color in the figure. The color denotes the LFSC at every voxel. The color bar in Figures 8(a)–(b) indicates how LFSC values are mapped to color. The figure clearly shows that the LFSC values change considerably, illustrating that the reconstruction has a locally variable signal-to-noise ratio, and hence has more or less structural information in different parts. This is the benefit of using the LFSC.

Figure 8. (a)–(b) LFSC calculated using Gabor filters. (c)–(d) LFSC calculated using steerable filters. In (a)–(d), the volume is rendered to express the LFSC at each voxel according to the colorbar to the right of each subfigure. (e) Histogram of LFSC with Gabor filters. (f) Histogram of LFSC with steerable filters. (g) Surface and contour plots of the 2D kernel density estimate of the joint scatter of Gabor LFSC and steerable filter LFSC. The contour plot also shows a 45 degree line.

Next, we optimally approximated the odd and even Gabor filters with steerable 3D filters using the procedure of the previous section. The approximation used a pair of polynomials of order 17 and order 16, whose steerable bases required 171 and 153 functions, respectively. The bases were convolved with each volume for a total of 324 × 2 = 648 convolutions. The results of these 648 convolutions were then used to approximate the 300 × 4 = 1200 Gabor convolutions in the Riemann sums of the numerator and denominator of the LFSC.

Figures 8(c)–(d) show the LFSC calculated using the steerable approximations. The color bar in Figures 8(c)–(d) is identical to the color bar of Figures 8(a)–(b). Qualitatively, the steerable filter LFSC is quite similar to the Gabor filter LFSC. Figures 8(e)–(f) show the histograms of the LFSCs. Note that both histograms are similar and that the left peak in both histograms is close to 0.143, the value of the (global) FSC on $S_\omega$. To investigate the relative values of the two LFSCs further, for every voxel in the volume we took the Gabor LFSC value and the steerable filter LFSC value and made a scatter plot of these pairs of values. Figure 8(g) shows a plot of the kernel density estimate of the scatter (the scatter plot itself was too dense to convey where most of the scatter mass lies). A surface plot as well as a contour plot are shown. Note in the surface plot that the scatter is highly concentrated around the diagonal line. This is even more apparent in the contour plot, which also shows a 45 degree line. Finally, we calculated the correlation coefficient between the Gabor LFSC and the steerable filter LFSC. The correlation coefficient was 0.97, showing that the steerable filter LFSC closely approximates the Gabor LFSC. And the steerable LFSC approximation required only about half the number of convolutions of the Gabor LFSC calculation. Of course, the number of convolutions required for the steerable approximation method does not grow with the number of u's used in the Riemann sums.

This example illustrates the utility of having steerable almost-conjugate 3D filters which can closely approximate Gabor filter pairs.

7. Discussion.

The optimal filter pairs proposed in this paper have surprising properties. Even for low order polynomials (Figure 1), the filter pair is close to quadrature, monotonic, and highly directional. And it is steerable in 3D. The filter pair also approximates Gabor filters quite well.

We emphasize that our goal is not to find steerable filters per se; our interest is in finding filter pairs, especially pairs which are directional, and which are close to quadrature. Thus we exclude more complicated steerable filters (Y or star shaped filters, for example) from consideration. These filters are not directional, and quadrature is difficult to define for them.

Finally, we note that proofs of various claims in section 4 of the paper are in the appendix. The key elements of the proofs exploit properties of the inverse Hilbert matrix, especially row-sums and column-sums of the matrix. Hilbert matrices are a subclass of both the more general Cauchy and Hankel matrices [13, 4]. It is likely that this more general class of matrices may also be used to prove the same theorems.

8. Conclusion.

3D steerable filters can be easily designed to be near-quadrature, especially when their polar parts are polynomial functions. These filters are sharply directional, and their directional as well as quadrature properties improve with the order of the polynomial function. These properties are closely related to the properties of the generalized Hilbert matrix. These filters also provide a practically useful approximation to the even and odd 3D Gabor filter pairs.

Funding:

The work of the authors was supported by NIH Grant R01GM125769.

Appendix A. Proofs of claims.

A.1. Properties of generalized Hilbert matrix inverse and notation.

It is clear from our solution in (3.6) as well as our rewritten constrained objective function (3.4) that our calculations are intimately intertwined with the inverse H−1. In this section we define the class of matrices to which H belongs and give important properties that we will use in subsequent proofs.

A.1.1. Definitions and notations.

We will show that the fact that our H in (3.4) is a generalized Hilbert matrix (of order 2) is sufficient to prove the results we claim about the behavior of the solutions a^ as well as the asymptotic results.

Note first that $H^{-1}B$ is an $N \times 2$ matrix whose two columns contain, for each row of $H^{-1}$, the sum of its odd-indexed entries and the sum of its even-indexed entries, respectively. Consequently, it will be useful to adopt some notation to refer to the inverse of the Hilbert matrix, its entries, and certain specialized sums, such as the sum of the entries of a row (termed row-sum) or the sum of all odd-indexed entries.

Definition A.1.

Let G = H−1, where H is a generalized Hilbert matrix of size N for constant p > 0. Let Gi,j denote the entries of G (we write Gij when unambiguous). Furthermore, let

$$g_{oo} = \sum_{i,j\ \mathrm{odd}}^{N} G_{i,j}, \qquad g_{ee} = \sum_{i,j\ \mathrm{even}}^{N} G_{i,j},$$
$$g_{oe} = \sum_{i\ \mathrm{odd},\ j\ \mathrm{even}}^{N} G_{i,j}, \qquad g_{eo} = \sum_{i\ \mathrm{even},\ j\ \mathrm{odd}}^{N} G_{i,j}$$

be the sums of all entries of G in, for example, odd-indexed rows and odd-indexed columns. Note in particular that since H is symmetric, G is also symmetric and therefore goe = geo (when G has even dimension).

Now, let 1 ≤ iN. Then define

$$g_{io} = \sum_{j\ \mathrm{odd}}^{N} G_{i,j}, \qquad g_{ie} = \sum_{j\ \mathrm{even}}^{N} G_{i,j}, \qquad r_i = g_i = \sum_{j=1}^{N} G_{i,j}$$

as the sums of odd-indexed entries, the even-indexed entries, and all entries of a particular ith row, respectively. Finally define

$$g_o = \sum_{i\ \mathrm{odd}}^{N} r_i, \qquad g_e = \sum_{i\ \mathrm{even}}^{N} r_i$$

to be the sum of all entries in odd-numbered or even-numbered rows, respectively.

Definition A.2.

For each definition above, we adopt a notation to indicate its absolute value. We let $s_o = |g_o|$, $s_e = |g_e|$, $s_{ke} = |g_{ke}|$, etc.

In particular, note that $s_{ke} + s_{ko}$ is the sum of the absolute values of the entries of a particular row. These values will prove particularly useful.

A.1.2. Exact expression for entries and sum of entries.

Proposition A.3.

Every generalized Hilbert matrix H is positive definite.

Proof. This has been proven in [8].

Proposition A.4.

Let G = H−1 be the inverse of a generalized Hilbert matrix of size N and constant p > 0. Then the entries of G are given by

$$G_{ij} = \frac{(-1)^{i+j}}{p+i+j-1}\left(\frac{\prod_{k=0}^{N-1}(p+i+k)(p+j+k)}{(i-1)!\,(N-i)!\,(j-1)!\,(N-j)!}\right). \qquad (A.1)$$

Proof. This has been proven as well in [8].

Proposition A.5.

Let G = H−1 be the inverse of a generalized Hilbert matrix of size N and constant p > 0. Fix some i ∈ {1, …, N}. Then the sum of the entries of row i is given by

$$\sum_{j=1}^{N} G_{ij} = (-1)^{N+i}\, \frac{\prod_{k=0}^{N-1}(p+i+k)}{(i-1)!\,(N-i)!} \qquad (A.2)$$
$$= (-1)^{N+i}\, i\, \binom{N-1+p+i}{N} \binom{N}{i}. \qquad (A.3)$$

Proof. See [34] for the first equality. The second equality is easily verified.

Proposition A.6.

Let G = H−1 be the inverse of a generalized Hilbert matrix of size N and constant p > 0. Then the sum of all entries of the matrix is given by

$$\sum_{i,j=1}^{N} G_{ij} = N(p+N). \qquad (A.4)$$

Proof. See [34] for the proof.
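These closed forms are easy to verify numerically for moderate N (for large N the matrix is too ill-conditioned to invert reliably in double precision). A small sketch, assuming scipy is available for the binomial coefficients:

```python
import numpy as np
from scipy.special import comb

def check_sums(N=6, p=2):
    """Compare Propositions A.5/A.6 against a direct numerical inverse."""
    i = np.arange(1, N + 1)
    H = 1.0 / (i[:, None] + i[None, :] - 1.0 + p)     # generalized Hilbert matrix
    G = np.linalg.inv(H)
    row_sums = G.sum(axis=1)
    predicted = (-1.0) ** (N + i) * i * comb(N - 1 + p + i, N) * comb(N, i)   # (A.3)
    total_err = abs(G.sum() - N * (p + N))                                    # (A.4)
    return np.max(np.abs(row_sums - predicted)), total_err

print(check_sums())   # both values should be ~0 up to round-off
```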

Corollary A.7.

We obtain the following expression for products of rows:

$$r_{i_1} r_{i_2} = (p + i_1 + i_2 - 1)\, G_{i_1 i_2}.$$

Proof. Comparing the expressions of (A.1), (A.2), we see that

$$r_{i_1} r_{i_2} = \left(\sum_{j=1}^{N} G_{i_1 j}\right)\left(\sum_{j=1}^{N} G_{i_2 j}\right) = (-1)^{2N+i_1+i_2}\, \frac{\prod_{k=0}^{N-1}(p+i_1+k)\prod_{k=0}^{N-1}(p+i_2+k)}{(i_1-1)!\,(N-i_1)!\,(i_2-1)!\,(N-i_2)!} = (-1)^{i_1+i_2}\, \frac{\prod_{k=0}^{N-1}(p+i_1+k)(p+i_2+k)}{(i_1-1)!\,(N-i_1)!\,(i_2-1)!\,(N-i_2)!} = (p+i_1+i_2-1)\, G_{i_1 i_2}$$

as claimed.

A.1.3. Notes on magnitudes and signs of important sums.

Corollary A.8.

Let N be even. Then for every k, |gko|<|gke|. Furthermore |goo| < |goe| and |geo| < |gee|. If N is odd then the inequalities are reversed.

Proof. This follows quickly from Propositions A.5 and A.4.

Corollary A.9.

Let N be even. Then go < 0 < ge. If N is odd then the inequalities are reversed.

Proposition A.10.

The quantity googeegoegeo is positive.

Proof. Note that the quantity $g_{oo}g_{ee} - g_{oe}g_{eo}$ is the determinant of the matrix $K = B^T H^{-1} B = \begin{pmatrix} g_{oo} & g_{oe} \\ g_{eo} & g_{ee} \end{pmatrix}$. Thus it suffices to show that K is positive definite. But H is positive definite, and therefore $G = H^{-1}$ is positive definite. Furthermore, since B has full column rank, for $x \in \mathbb{R}^2$ the product $Bx = 0$ only if $x$ is zero. Therefore for all nonzero $x \in \mathbb{R}^2$, $x^T K x = (Bx)^T G (Bx) > 0$, and therefore K is positive definite. Hence $\det K = \det B^T G B = g_{oo}g_{ee} - g_{oe}g_{eo} > 0$.

A.2. Proof of Theorem 4.1: Optimal coefficients alternate in sign.

Our goal is to prove Theorem 4.1. First we will rewrite our expressions for the solution from KKT theory. Recall from (3.6) that we have

$$\hat{a} = H^{-1} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_1 \\ \lambda_2 \\ \vdots \end{pmatrix}, \quad \text{where} \quad \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix} = K^{-1} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$

But using the notation we defined in Section (A.1.1), we see that we can write explicitly:

$$K = B^T H^{-1} B = \begin{pmatrix} g_{oo} & g_{oe} \\ g_{eo} & g_{ee} \end{pmatrix}.$$

Thus our formula for lambda can be rewritten as

$$\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix} = K^{-1} \begin{pmatrix} -1 \\ 1 \end{pmatrix} \qquad (A.5)$$
$$= \frac{1}{g_{oo}g_{ee} - g_{oe}g_{eo}} \begin{pmatrix} g_{ee} & -g_{oe} \\ -g_{eo} & g_{oo} \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} \qquad (A.6)$$
$$= \frac{1}{g_{oo}g_{ee} - g_{oe}g_{eo}} \begin{pmatrix} -g_{ee} - g_{oe} \\ g_{eo} + g_{oo} \end{pmatrix} \qquad (A.7)$$
$$= \frac{1}{g_{oo}g_{ee} - g_{oe}g_{eo}} \begin{pmatrix} -g_{e} \\ g_{o} \end{pmatrix}. \qquad (A.8)$$

The following lemma will be used to prove our theorem.

Lemma A.11.

Let $s_o$, $s_e$, $s_{ko}$, $s_{ke}$ be as in Definition A.2. Furthermore, if N is even, let $s_+ = s_o + s_e$, $s_- = s_e - s_o$, and for any $k \in \{1, \ldots, N\}$, let $s_{k+} = s_{ko} + s_{ke}$ and $s_{k-} = s_{ke} - s_{ko}$. Otherwise, let $s_- = s_o - s_e$ and $s_{k-} = s_{ko} - s_{ke}$. Then:

$$\frac{s_+}{s_-} < \frac{s_{k+}}{s_{k-}}. \qquad (A.9)$$

Note that $s_{k+}$ is simply the sum of the absolute values of the even-indexed and odd-indexed entries of row k, while $s_{k-}$ is the absolute value of their sum. Both $s_-$ and $s_{k-}$ are familiar values: modulo an absolute value sign, they are the total sum of all elements and the sum of the entries in a row, respectively. The $s_+$, $s_{k+}$ values are absolute value analogues, achieving far larger magnitudes. The point of Lemma A.11 is to take ratios of these magnitudes and relate them directly to the coefficients of the optimal vector $\hat{a}$. Using the lemma, we prove the theorem first, relegating the proof of the lemma for later.

Proof of Theorem 4.1. We will show that inequality (A.9) implies our Theorem 4.1.

Note

$$\frac{s_+}{s_-} < \frac{s_{k+}}{s_{k-}} \iff \frac{1 - s_-/s_+}{1 + s_-/s_+} < \frac{1 - s_{k-}/s_{k+}}{1 + s_{k-}/s_{k+}} \iff \frac{s_+ - s_-}{s_+ + s_-} < \frac{s_{k+} - s_{k-}}{s_{k+} + s_{k-}},$$

where we have taken advantage of the fact that 0<ss+,sksk+<1.

When N is even our last inequality becomes directly $\frac{s_o}{s_e} < \frac{s_{ko}}{s_{ke}}$. When N is odd, the inequality becomes $\frac{s_e}{s_o} < \frac{s_{ke}}{s_{ko}}$, and taking the reciprocal of both sides we obtain $\frac{s_o}{s_e} > \frac{s_{ko}}{s_{ke}}$. Now, let k be the index of any row. Then the expressions $\frac{g_o}{g_e}$, $\frac{g_{ko}}{g_{ke}}$ each contain exactly one negative entry, and

$$\frac{s_o}{s_e} < \frac{s_{ko}}{s_{ke}} \iff \frac{|g_o|}{|g_e|} < \frac{|g_{ko}|}{|g_{ke}|} \iff \frac{g_o}{g_e} > \frac{g_{ko}}{g_{ke}}.$$

(The inequality is reversed for odd N.) If N is even, then $g_e$ is positive; otherwise $g_e$ is negative. In either case, if k is even, our previous inequality is equivalent to

$$g_o g_{ke} > g_e g_{ko} \qquad (A.10)$$
$$\iff (g_{oo}g_{ee} - g_{oe}g_{eo})^{-1}(-g_e g_{ko} + g_o g_{ke}) > 0 \qquad (A.11)$$
$$\iff \lambda_1 g_{ko} + \lambda_2 g_{ke} > 0 \qquad (A.12)$$

by (A.8). If k is odd, this inequality is

$$\lambda_1 g_{ko} + \lambda_2 g_{ke} < 0. \qquad (A.13)$$

But

$$B \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_1 \\ \lambda_2 \\ \vdots \end{pmatrix},$$

so the kth element of $\hat{a}$ is given by

$$\lambda_1 g_{ko} + \lambda_2 g_{ke}, \qquad (A.14)$$

so in fact our two inequalities (A.12), (A.13) together give the statement of our main Theorem 4.1.

Proof of Lemma A.11. Now onto the inequality. We can rewrite the inequality (A.9) as

$$\frac{s_+}{s_-} < \frac{s_{k+}}{s_{k-}} \iff \frac{\sum_{i=1}^{N} |r_i|}{\left|\sum_{i=1}^{N} r_i\right|} < \frac{\sum_{j=1}^{N} |G_{kj}|}{\left|\sum_{j=1}^{N} G_{kj}\right|} \iff \frac{\sum_{i=1}^{N} |r_i|}{N(p+N)} < \frac{\sum_{j=1}^{N} |G_{kj}|}{|r_k|} \iff |r_k| \sum_{i=1}^{N} |r_i| < N(p+N) \sum_{j=1}^{N} |G_{kj}| \iff \sum_{j=1}^{N} (p+k+j-1)\,|G_{kj}| < N(p+N) \sum_{j=1}^{N} |G_{kj}|,$$

where the last equivalence uses Corollary A.7. But $p+k+j-1 \leq 2N-1+p$. When $N \geq 2$, $2N-1 < N^2$, and so $p+k+j-1 < N^2 + p \leq N(p+N)$; the inequality therefore holds term by term, and thus our inequality is proven.

A.3. Proof of concentration and pointwise convergence in optimal polynomial pairs.

Here we prove Theorem 4.3, which relates the asymptotic decay of the polynomial coefficients to the asymptotic concentration of the optimal polynomials near z = 1.

Note that, on [0, 1], any polynomial of degree at most N with nonnegative coefficients that add up to 1 is bounded from below by $x^N$. We will exploit this fact to show the first part of Theorem 4.3, that is, that the coefficient of $x^k$ for a fixed k in the optimal polynomial $p_N$ vanishes as N → ∞. We will then show that this implies that the optimal polynomial pairs derived from minimizing quadrature loss converge pointwise to an indicator function at 1; that is, the polynomials converge pointwise to 0 (except at x = 1).

We will require the following computational lemma, whose proof will be given in section A.5.

Lemma A.12.

Let N be sufficiently large, say larger than both p and 2, and fix $j \leq N$. Then the expression

$$\big(N(p+N) - (p+i+j-1)\big)\,|G_{ij}| \qquad (A.15)$$

is increasing in i on the interval [1, N/2].

In fact, it will be convenient to use the shorthand $D_{ij} = \big(N(p+N) - (p+i+j-1)\big)\,G_{ij}$. Our shorthand does not place the entry $G_{ij}$ within absolute value signs, but this will typically not matter. Now we prove the part of the theorem that claims that the coefficients of a fixed order vanish asymptotically.

Proof of Theorem 4.3(a). Recall from earlier that ak is given by

$$\hat{a}_k = \lambda_1 g_{ko} + \lambda_2 g_{ke} \qquad (A.16)$$
$$= \lambda_G\,(-g_e g_{ko} + g_o g_{ke}) \qquad (A.17)$$
$$= \frac{-g_e g_{ko} + g_o g_{ke}}{g_{oo}g_{ee} - (g_{oe})^2}. \qquad (A.18)$$

Recall that we may write our denominator

$$g_{oo}g_{ee} - (g_{oe})^2 = N(p+N)\, g_{oo} - d^2,$$

where d is the absolute value of the sum of all elements in all odd rows. Using Corollary A.7 we may rewrite the denominator of (A.18) as

where d was the absolute value of the sums of all elements in all odd rows. Using Corollary A.7 we may rewrite the denominator of A.18 as

$$g_{oo}g_{ee} - (g_{oe})^2 = N(p+N)\sum_{i,j\ \mathrm{odd}} G_{ij} - \sum_{i,j\ \mathrm{odd}} r_i r_j = N(p+N)\sum_{i,j\ \mathrm{odd}} G_{ij} - \sum_{i,j\ \mathrm{odd}} (p+i+j-1)\, G_{ij} = \sum_{i,j\ \mathrm{odd}} \big(N(p+N) - (p+i+j-1)\big)\, G_{ij} = \sum_{i,j\ \mathrm{odd}} D_{ij}.$$

Now consider the numerator of (A.18). Noting that $g_{ke} = r_k - g_{ko}$, we can rewrite the numerator as

$$-g_e g_{ko} + g_o g_{ke} = -g_e g_{ko} + g_o (r_k - g_{ko}) = -g_{ko}(g_e + g_o) + g_o r_k = -N(p+N)\, g_{ko} + g_o r_k = -N(p+N)\sum_{j\ \mathrm{odd}} G_{kj} + \Big(\sum_{j\ \mathrm{odd}} r_j\Big) r_k = \sum_{j\ \mathrm{odd}} \big(p+k+j-1 - N(p+N)\big)\, G_{kj}.$$

Now, the sign of summands is dependent entirely on k. Since j ranges over odd values, Gkj is negative if k is even, and positive when k is odd. But the summands have the same sign, so since we are concerned only with magnitude we may flip the sign for convenience. In other words it suffices to show that

$$(-1)^k\, \hat{a}_k = (-1)^k\, \frac{-g_e g_{ko} + g_o g_{ke}}{g_{oo}g_{ee} - (g_{oe})^2} = \frac{\big|\sum_{j\ \mathrm{odd}} D_{kj}\big|}{\sum_{i,j\ \mathrm{odd}} D_{ij}} \qquad (A.19)$$

decays to 0 as N → ∞. Take N > 4k, and let $S = \{i\ \mathrm{odd} : N/4 \leq i \leq N/2\}$. First, note that this implies N > 4, and that |S| is approximately N/8. Certainly, for large N we may say that |S| > N/16. Furthermore, since k < N/4, by Lemma A.12 we have that for any $i \in S$,

$$\Big|\sum_{j\ \mathrm{odd}} D_{ij}\Big| > \Big|\sum_{j\ \mathrm{odd}} D_{kj}\Big|. \qquad (A.20)$$

Thus from (A.19), we may write

$$\frac{\big|\sum_{j\ \mathrm{odd}} D_{kj}\big|}{\sum_{i,j\ \mathrm{odd}} D_{ij}} < \frac{\big|\sum_{j\ \mathrm{odd}} D_{kj}\big|}{\sum_{i \in S,\ j\ \mathrm{odd}} D_{ij}} < \frac{\big|\sum_{j\ \mathrm{odd}} D_{kj}\big|}{\frac{N}{16}\big|\sum_{j\ \mathrm{odd}} D_{kj}\big|} = \frac{16}{N},$$

which clearly decays to 0 as N → ∞.

We now prove that as we increase the degree N of our polynomials, the optimal pairs converge to a delta function at 1.

Proof of Theorem 4.3(b). Fix $x \in [0, 1)$ and let $\epsilon > 0$. First, there is some M such that $x^M < \epsilon/4$. Recall that any $\hat{a}^{(N)}$ corresponds to a pair of polynomials with coefficients $\hat{a}^{(N)}(k)$. It suffices to show that the polynomial given by $q_N(x) = \sum_{k=1}^{N} |\hat{a}^{(N)}(k)|\, x^k$ converges pointwise to 0 at every $x \neq 1$, since the optimal odd/even polynomials are bounded by this polynomial. By Theorem 4.3(a), there is some L such that for every N > max(M, L), $\hat{a}^{(N)}$ satisfies $\max(|a_{N,1}|, |a_{N,2}|, \ldots, |a_{N,M-1}|) < \frac{\epsilon(1-x)}{2x}$. Selecting such an M and N, let $c = \max(|a_{N,1}|, |a_{N,2}|, \ldots, |a_{N,M-1}|)$. Now, using the facts above and the fact that $\sum_{k=1}^{N} |\hat{a}^{(N)}(k)| = 2$,

$$q_N(x) = \sum_{k=1}^{M-1} |a_{N,k}|\, x^k + \sum_{k=M}^{N} |a_{N,k}|\, x^k < c \sum_{k=1}^{M-1} x^k + \sum_{k=M}^{N} |a_{N,k}|\, x^M < \frac{\epsilon(1-x)}{2x} \sum_{k=1}^{\infty} x^k + x^M \sum_{k=M}^{N} |a_{N,k}| < \frac{\epsilon(1-x)}{2x} \cdot \frac{x}{1-x} + 2 x^M < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$$

as needed.

A.4. Proof of Theorem 4.4: Asymptotic conjugacy loss vanishes.

Let us fix some N and simply write a^ as the corresponding solution. We proceed by deriving an alternative expression for the objective function using our previous results and then prove some facts about the growth. Recall

$$\tfrac{1}{2}\, \hat{a}^T H \hat{a} = \tfrac{1}{2}\, \lambda^T B^T H^{-1} B \lambda = \tfrac{1}{2}\, \lambda^T K \lambda, \quad \text{where} \quad K = \begin{pmatrix} g_{oo} & g_{oe} \\ g_{eo} & g_{ee} \end{pmatrix}.$$

But $\lambda = K^{-1}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \lambda_G \begin{pmatrix} -g_e \\ g_o \end{pmatrix}$, where $\lambda_G = \frac{1}{g_{oo}g_{ee} - g_{eo}g_{oe}} > 0$, so noting the symmetry of K, we have

$$\tfrac{1}{2}\, \hat{a}^T H \hat{a} = \tfrac{1}{2}\begin{pmatrix} -1 & 1 \end{pmatrix} K^{-1} K \lambda = \frac{\lambda_G}{2}\begin{pmatrix} -1 & 1 \end{pmatrix}\begin{pmatrix} -g_e \\ g_o \end{pmatrix} = \frac{g_e + g_o}{2(g_{oo}g_{ee} - g_{eo}g_{oe})} = \frac{N(p+N)}{2(g_{oo}g_{ee} - (g_{oe})^2)}.$$

Now, let $d = |g_{oe}| - |g_{oo}|$. Then since $N(p+N) = g_{ee} + g_{oo} + g_{oe} + g_{eo} = g_{oo} + g_{ee} - 2|g_{oe}|$, we may write $g_{oo} = |g_{oe}| - d$ and $g_{ee} = |g_{oe}| + d + N(p+N)$. Thus

$$g_{oo}g_{ee} - g_{eo}g_{oe} = \big(|g_{oe}| + d + N(p+N)\big)\big(|g_{oe}| - d\big) - |g_{oe}|^2 = \big(|g_{oe}| + d\big)\big(|g_{oe}| - d\big) + N(p+N)\, g_{oo} - |g_{oe}|^2 = N(p+N)\, g_{oo} - d^2.$$

We now examine the growth of goo and d2. The value d is simply a sign change times the sum of the odd rows. We have

$$d = (-1)^N \sum_{i\ \mathrm{odd}}^{N} \frac{\prod_{k=0}^{N-1}(p+i+k)}{(i-1)!\,(N-i)!}$$

and thus we may write

$$d^2 = \sum_{i,j\ \mathrm{odd}}^{N} \frac{\prod_{k=0}^{N-1}(p+i+k)(p+j+k)}{(i-1)!\,(N-i)!\,(j-1)!\,(N-j)!} = \sum_{i,j\ \mathrm{odd}}^{N} (p+i+j-1)\,|G_{ij}|.$$

On the other hand, using the fact that goo is a sum of only positive entries, we see that

$$N(p+N)\, g_{oo} = \sum_{i,j\ \mathrm{odd}}^{N} N(p+N)\, G_{ij}, \qquad g_{oo} = \sum_{i,j\ \mathrm{odd}}^{N} \frac{1}{p+i+j-1}\, \frac{\prod_{k=0}^{N-1}(p+i+k)(p+j+k)}{(i-1)!\,(N-i)!\,(j-1)!\,(N-j)!}.$$

So we can write the denominator of our objective function value as

$$2\big(g_{oo}g_{ee} - (g_{oe})^2\big) = 2\big(N(p+N)\, g_{oo} - d^2\big) = 2\sum_{i,j\ \mathrm{odd}}^{N} \big(N(p+N) - (p+i+j-1)\big)\,|G_{ij}| = \sum_{i,j\ \mathrm{odd}}^{N} \left(\frac{2N(p+N)}{p+i+j-1} - 2\right)(p+i+j-1)\,|G_{ij}|$$
$$\geq \sum_{i,j\ \mathrm{odd}}^{N} (N-1)(p+i+j-1)\,|G_{ij}| = \sum_{i,j\ \mathrm{odd}}^{N} (N-1)\, \frac{\prod_{k=0}^{N-1}(p+i+k)(p+j+k)}{(i-1)!\,(N-i)!\,(j-1)!\,(N-j)!} \geq \sum_{i,j\ \mathrm{odd}}^{N} (N-1)\, \frac{\big((N-1)!\big)^2}{(i-1)!\,(N-i)!\,(j-1)!\,(N-j)!}$$
$$= \sum_{i,j\ \mathrm{odd}}^{N} (N-1) \binom{N-1}{i-1}\binom{N-1}{j-1} = (N-1)\left(\sum_{i\ \mathrm{even}}^{N-1}\binom{N-1}{i}\right)\left(\sum_{j\ \mathrm{even}}^{N-1}\binom{N-1}{j}\right) = (N-1)\, 2^{N-2}\, 2^{N-2} = (N-1)\, 2^{2N-4}.$$

Since our numerator grows like $O(N^2)$, we are done; we see clearly that our fraction vanishes as $N \to \infty$.
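
The vanishing can be observed numerically as well. The sketch below is illustrative only (assuming $p = 0$); it prints the value $N(p+N)/\big(2(g^{oo}g^{ee}-(g^{oe})^2)\big)$ for increasing $N$, together with the denominator and the $(N-1)2^{2N-4}$ lower bound derived above.

```python
# Illustrative check of the vanishing objective value, not part of the original
# argument; p = 0 as before.  The printed quantities are the objective value
# N(p+N) / (2 (g^{oo} g^{ee} - (g^{oe})^2)), the denominator itself, and the
# lower bound (N-1) 2^(2N-4) derived above.
import numpy as np
from scipy.linalg import invhilbert

p = 0
for N in (4, 6, 8, 10, 12):
    G = invhilbert(N, exact=True).astype(float)
    idx = np.arange(1, N + 1)
    odd, even = idx % 2 == 1, idx % 2 == 0
    goo = G[np.ix_(odd, odd)].sum()
    gee = G[np.ix_(even, even)].sum()
    goe = G[np.ix_(odd, even)].sum()
    denom = 2 * (goo * gee - goe ** 2)
    print(N, N * (p + N) / denom, denom, (N - 1) * 2 ** (2 * N - 4))
```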

A.5. Proof of lemmas.

In this section we prove Lemma A.12. First we require the following auxiliary lemma.

Lemma A.13.

Let $G$ be the inverse of a generalized Hilbert matrix of any order $p$ and of size $N$. Fix some $0 \le j \le N$. Then for $i \in [1, N/2]$,

$$\frac{|G_{i+1,j}|}{|G_{i,j}|} \ \ge\ \left(\frac{N+p+i}{j+p+i}\right)\left(\frac{p+i+j-1}{p+i}\right). \tag{A.21}$$

Furthermore, both terms on the right-hand side of (A.21) are greater than or equal to 1, and for at least one of them, the numerator is larger than the denominator by at least 1.

Proof. First we express $|G_{ij}|$ as a function of $i$. Using (A.1) and gathering together the terms that do not depend on $i$ into an absorbing coefficient $C(N, j, p)$, we see that

$$|G_{ij}| = C(N,j,p)\,\frac{1}{p+i+j-1}\left(\frac{\prod_{k=0}^{N-1}(p+i+k)}{(i-1)!\,(N-i)!}\right)$$

and therefore

$$\frac{|G_{i+1,j}|}{|G_{i,j}|} = \frac{p+i+j-1}{p+i+j}\left(\frac{\prod_{k=0}^{N-1}(p+i+1+k)}{i!\,(N-i-1)!}\right)\left(\frac{\prod_{k=0}^{N-1}(p+i+k)}{(i-1)!\,(N-i)!}\right)^{-1} = \frac{(p+i+j-1)\,(N-i)!\,(i-1)!\,\prod_{k=0}^{N-1}(p+i+k+1)}{(p+i+j)\,i!\,(N-i-1)!\,\prod_{k=0}^{N-1}(p+i+k)} = \frac{(p+i+j-1)(p+i+N)(N-i)}{(p+i+j)(p+i)\,i} \ \ge\ \left(\frac{p+i+N}{p+i+j}\right)\left(\frac{p+i+j-1}{p+i}\right)$$

since for $i \le N/2$, $N - i \ge i \ge 1$. Now, clearly, both fractions in our last product are at least 1. Moreover, if $j < N$ then the numerator in the first fraction is greater than its denominator by at least 1. But if $j = N$ then $p+i+j-1 = p+i+N-1$, which exceeds $p+i$ by at least 1, and so we have shown all claims in Lemma A.13.
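
The bound (A.21) can also be spot-checked numerically; the following sketch is illustrative only (assuming $p = 0$) and checks it over all valid $i$ and $j$ for a few small $N$.

```python
# Numerical spot-check of (A.21), not part of the original argument; p = 0.
# The bound is checked for all 1 <= i <= N/2 and 1 <= j <= N.
import numpy as np
from scipy.linalg import invhilbert

p = 0
for N in (4, 6, 8, 10):
    absG = np.abs(invhilbert(N, exact=True).astype(float))
    for i in range(1, N // 2 + 1):
        for j in range(1, N + 1):
            lhs = absG[i, j - 1] / absG[i - 1, j - 1]   # |G_{i+1,j}| / |G_{i,j}|
            rhs = (N + p + i) * (p + i + j - 1) / ((j + p + i) * (p + i))
            assert lhs >= rhs * (1 - 1e-12)             # tolerance for exact ties
print("bound verified")
```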

Now we may prove our main lemma, which shows that our expression is increasing.

Proof of Lemma A.12. Let us write (A.15) as a function of i, with

$$f(i) = \big(N(p+N)-(p+i+j-1)\big)\,|G_{ij}|.$$

We need to show that for $i \in \{1, \ldots, \lfloor N/2\rfloor\}$, $\frac{f(i+1)}{f(i)} > 1$. For notational purposes, let $A = N(p+N) - (p+i+j) = f(i+1)/|G_{i+1,j}|$. Now, using inequality (A.21) from Lemma A.13, we may compute

$$\frac{f(i+1)}{f(i)} = \frac{A}{A+1}\cdot\frac{|G_{i+1,j}|}{|G_{ij}|} \ \ge\ \frac{A}{A+1}\cdot\frac{(N+p+i)(p+i+j-1)}{(j+p+i)(p+i)}.$$

Lemma A.13 also states that at least one of $\frac{N+p+i}{j+p+i}$, $\frac{p+i+j-1}{p+i}$ is strictly larger than 1; that is, its numerator is greater than its denominator by at least 1. Let $B$ be this fraction's denominator (or, if both fractions satisfy this, pick either). Now, $A+1 = N(p+N) - (p+i+j-1) > j + p + i$ if and only if $N^2 + Np > 2p + 2i + 2j - 1$. But $2i + 2j - 1 < 4N$, so if $N > 4$ then $N^2 + Np > 4N + 4p > 2i + 2j - 1 + 2p$. Similarly, $N(p+N) - (p+i+j-1) > p + i$ under weak conditions on $N$, and so in all cases $A + 1 > B$; thus we can rewrite our inequality as

$$\frac{f(i+1)}{f(i)} \ \ge\ \frac{A}{A+1}\cdot\frac{(N+p+i)(p+i+j-1)}{(j+p+i)(p+i)} \ \ge\ \frac{A}{A+1}\cdot\frac{B+1}{B} > 1$$

as needed.
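
Lemma A.12 itself is also easy to spot-check numerically; the sketch below is illustrative only (assuming $p = 0$ and the definition of $f(i)$ above) and verifies the monotonicity for a few small $N$.

```python
# Numerical spot-check of Lemma A.12, not part of the original argument; p = 0.
# f(i) = (N(p+N) - (p+i+j-1)) |G_{ij}| should be increasing in i for i <= N/2.
import numpy as np
from scipy.linalg import invhilbert

p = 0
for N in (4, 6, 8, 10):
    absG = np.abs(invhilbert(N, exact=True).astype(float))
    idx = np.arange(1, N + 1)
    for j in range(1, N + 1):
        f = (N * (p + N) - (p + idx + j - 1)) * absG[:, j - 1]   # f(1), ..., f(N)
        for i in range(1, N // 2 + 1):
            assert f[i] > f[i - 1]                               # f(i+1) > f(i)
print("monotonicity verified")
```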
