Published in final edited form as: Inverse Probl. 2020 Apr 29;36(6):064002. doi: 10.1088/1361-6420/ab7d2c

NON-UNIQUE GAMES OVER COMPACT GROUPS AND ORIENTATION ESTIMATION IN CRYO-EM

Afonso S. Bandeira (ETH Zürich), Yutong Chen, Roy R. Lederman, Amit Singer

Abstract

Let $\mathcal{G}$ be a compact group and let $f_{ij} \in C(\mathcal{G})$. We define the Non-Unique Games (NUG) problem as finding $g_1, \ldots, g_n \in \mathcal{G}$ to minimize $\sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1})$. We introduce a convex relaxation of the NUG problem to a semidefinite program (SDP) by taking the Fourier transform of $f_{ij}$ over $\mathcal{G}$. The NUG framework can be seen as a generalization of the little Grothendieck problem over the orthogonal group and of the Unique Games problem, and includes many practically relevant problems, such as the maximum likelihood estimator for registering bandlimited functions over the unit sphere in $d$ dimensions and orientation estimation of noisy cryo-Electron Microscopy (cryo-EM) projection images. We implement an SDP solver for the NUG cryo-EM problem using the alternating direction method of multipliers (ADMM). Numerical studies with synthetic datasets indicate that while our ADMM solver is slower than existing methods, it estimates the rotations more accurately, especially at low signal-to-noise ratio (SNR).

1991 Mathematics Subject Classification. Primary: 00A69; Secondary: 90C34, 20C40

Key words and phrases. Computer vision, pattern recognition, algorithms, optimization, cryo-EM

1. Introduction

We consider problems of the following form:

$\underset{g_1,\ldots,g_n}{\text{minimize}} \;\; \sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1}) \qquad \text{subject to } g_i \in \mathcal{G},$ (1.1)

where $\mathcal{G}$ is a compact group and $f_{ij}: \mathcal{G} \to \mathbb{R}$ are suitable functions. We refer to such a problem as a Non-Unique Games (NUG) problem over $\mathcal{G}$.

Note that the solution to the NUG problem is not unique. If $g_1, \ldots, g_n$ is a solution to (1.1), then so is $g_1 g, \ldots, g_n g$ for any $g \in \mathcal{G}$. That is, we can solve (1.1) only up to a global shift $g \in \mathcal{G}$.

Many inverse problems, in which the goal is to estimate multiple group elements from information about group offsets, can be formulated as (1.1). A simple example is angular synchronization [40], where one is tasked with estimating angles $\theta_1, \ldots, \theta_n$ from information about their offsets $\theta_i - \theta_j \bmod 2\pi$. The problem of estimating the angles can then be formulated as an optimization problem depending on the offsets, and thus written in the form (1.1). In this case, $\mathcal{G} \cong SO(2)$.

One of the simplest instances of (1.1) is the Max-Cut problem, where the objective is to partition the vertices of a graph so as to maximize the number of edges (the cut) between the two sets. In this case, $\mathcal{G} \cong \mathbb{Z}_2$, the group of two elements $\{\pm 1\}$, and $f_{ij}$ is zero if $(i,j)$ is not an edge of the graph and

$f_{ij}(1) = 0 \quad \text{and} \quad f_{ij}(-1) = -1,$

if (i,j) is an edge. In fact, we take a semidefinite programming based approach towards (1.1) that is inspired by — and can be seen as a generalization of — the semidefinite relaxation for the Max-Cut problem by Goemans and Williamson [21].
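For concreteness, the following is a minimal sketch of the Goemans–Williamson semidefinite relaxation referenced here, written in Python under the assumption that numpy and the cvxpy modeling package are available; it is only an illustration of the relaxation, not the solver used later in this paper.

```python
# Goemans-Williamson SDP relaxation for Max-Cut: relax x_i in {+1, -1}
# to a PSD matrix X with unit diagonal and maximize the relaxed cut value.
import numpy as np
import cvxpy as cp

def maxcut_sdp(W):
    """W: symmetric (n, n) nonnegative weight matrix of the graph."""
    n = W.shape[0]
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0, cp.diag(X) == 1]
    # Relaxed cut value: (1/4) * sum_{ij} W_ij * (1 - X_ij)
    objective = cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - X)))
    cp.Problem(objective, constraints).solve()
    return X.value
```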

Another important source of inspiration is the semidefinite relaxation of Max-2-Lin($\mathbb{Z}_L$), proposed in [15], for the Unique Games problem, a central problem in theoretical computer science [26, 27]. Given integers $n$ and $L$, a Unique-Games instance is a system of linear equations over $\mathbb{Z}_L$ on $n$ variables $\{x_i\}_{i=1}^{n}$. Each equation constrains the difference of two variables. More precisely, for each $(i,j)$ in a subset of the pairs, we associate a constraint

$x_i - x_j = b_{ij} \bmod L.$

The objective is then to find $\{x_i\}_{i=1}^{n}$ in $\mathbb{Z}_L$ that satisfy as many equations as possible. This can be easily described within our framework by taking, for each constraint,

$f_{ij}(g) = -\,\delta_{\{g = b_{ij}\}},$

and $f_{ij} = 0$ for pairs not corresponding to constraints. The term 'unique' derives from the fact that the constraints have this special structure, where the offset can take only one value to satisfy the constraint, and all other values have the same score. This motivated our choice of nomenclature for the framework treated in this paper. The semidefinite relaxation for the Unique Games problem proposed in [15] was investigated in [8] in the context of the signal alignment problem, where the $f_{ij}$ are not forced to have a special structure (but $\mathcal{G} \cong \mathbb{Z}_L$). The NUG framework presented in this paper can be seen as a generalization of the approach in [8] to other compact groups $\mathcal{G}$. We emphasize that, unlike [8], which was limited to the case of a finite cyclic group, here we consider compact groups that are possibly infinite and non-commutative.

Besides the signal alignment problem treated in [8], the semidefinite relaxation to the NUG problem we develop generalizes several other effective relaxations. When $\mathcal{G} \cong \mathbb{Z}_2$, it coincides with the semidefinite relaxations for Max-Cut [21], the little Grothendieck problem over $\mathbb{Z}_2$ [3, 32], recovery in the stochastic block model [2, 7], and synchronization over $\mathbb{Z}_2$ [1, 7, 18]. When $\mathcal{G} \cong SO(2)$ and the functions $f_{ij}$ are linear with respect to the representation $\rho_1: SO(2) \to \mathbb{C}$ given by $\rho_1(\theta) = e^{i\theta}$, it coincides with the semidefinite relaxation for angular synchronization [40]. Similarly, when $\mathcal{G} \cong O(d)$ and the functions are linear with respect to the natural $d$-dimensional representation, the NUG problem essentially coincides with the little Grothendieck problem over the orthogonal group [9, 31]. Other examples include the shape matching problem in computer graphics, for which $\mathcal{G}$ is the permutation group (see [24, 16]). In addition, it has been shown in [13] that the NUG formulation and the algorithms presented in this paper can be extended to simultaneous alignment and classification of a mixture of different signals.

1.1. Orientation estimation in cryo-Electron Microscopy.

A particularly important application of this framework is the orientation estimation problem in cryo-Electron Microscopy [39].

Cryo-EM is a technique used to determine the 3-dimensional structure of biological macromolecules. The molecules are rapidly frozen in a thin layer of ice and imaged with an electron microscope, which gives noisy 2-dimensional projections. One of the main difficulties with this imaging process is that these molecules are imaged at different unknown orientations in the sheet of ice and each molecule can only be imaged once (due to the destructive nature of the imaging process). More precisely, each measurement consists of a tomographic projection of a rotated (by an unknown rotation) copy of the molecule. The task is then to reconstruct the molecule density from many such noisy measurements. Although in principle it is possible to reconstruct the 3-dimensional density directly from the noisy images without estimation of the rotations [25], or by treating rotations as nuisance parameters [47, 6] here we consider the problem of estimating the rotations directly from the noisy images. In Section 2, we describe how this problem can be formulated in the form (1.1).

2. Multireference Alignment

In classical linear inverse problems, one is tasked with recovering an unknown element x𝒳 from a noisy measurement of the form 𝒫(x)+ϵ, where ϵ represents the measurement error and 𝒫 is a linear observation operator. There are, however, many problems where an additional difficulty is present; one class of such problems includes non-linear inverse problems in which an unknown transformation acts on x prior to the linear measurement. Specifically, let 𝒳 be a vector space and 𝒢 be a group acting on 𝒳. Suppose we have n measurements of the form

$y_i = \mathcal{P}(g_i \cdot x) + \epsilon_i, \quad i = 1, \ldots, n$ (2.1)

where

  • $x$ is a fixed but unknown element of $\mathcal{X}$,

  • $g_1, \ldots, g_n$ are unknown elements of $\mathcal{G}$,

  • $\cdot$ is the action of $\mathcal{G}$ on $\mathcal{X}$,

  • $\mathcal{P}: \mathcal{X} \to Y$ is a linear operator,

  • $Y$ is the (finite-dimensional) measurement space,

  • the $\epsilon_i$'s are independent noise terms.

If the gi’s were known, then the task of recovering x would reduce to a classical linear inverse problem, for which many effective techniques exist. While in many situations it is possible to estimate x directly without estimating g1,,gn, or by treating these as nuisance parameters, here we focus on the problem of estimating the group elements g1,,gn.

There are several common approaches for inverse problems of the form (2.1). One is motivated by the observation that estimating $x$ knowing the $g_i$'s and estimating the $g_i$'s knowing $x$ are both considerably easier tasks. This suggests an alternating minimization approach where each estimate is updated iteratively. Besides a lack of theoretical guarantees, convergence may also depend on the initial guess. Another approach, which we refer to as pairwise comparisons [40], consists in determining, from pairs of observations $(y_i, y_j)$, the most likely value for $g_i g_j^{-1}$. Although the problem of estimating the $g_i$'s from these pairwise guesses is fairly well understood [40, 10, 43], enjoying efficient algorithms and performance guarantees, this method suffers from a loss of information: not all of the information in the problem is captured by the most likely value of $g_i g_j^{-1}$, and thus this approach tends to fail at low signal-to-noise ratio.

In contrast, the Maximum Likelihood Estimator (MLE) leverages all information. Assuming that the ϵi’s are i.i.d. Gaussian, the MLE for the observation model (2.1) is given by the following optimization problem:

$\underset{g_1,\ldots,g_n,\,x}{\text{minimize}} \;\; \sum_{i=1}^{n} \left\| y_i - \mathcal{P}(g_i \cdot x) \right\|_2^2 \qquad \text{subject to } g_i \in \mathcal{G},\; x \in \mathcal{X}$ (2.2)

We refer to (2.2) as the Multireference Alignment (MRA) problem. Let us denote the ground truth signal and group elements by $x^0$ and $g_1^0, \ldots, g_n^0$, and the solution to the optimization problem by $x^*$ and $g_1^*, \ldots, g_n^*$, which we will also refer to as $x^{\mathrm{MLE}}$ and $g_1^{\mathrm{MLE}}, \ldots, g_n^{\mathrm{MLE}}$. Unfortunately, the exponentially large search space and nonconvex nature of (2.2) often render it computationally intractable. However, for several problems of interest, we formulate (2.2) as an instance of an NUG for which we develop computationally tractable approximations.

Notice that although the MLE typically enjoys several desirable theoretical properties, the underlying technical conditions do not hold in this case. Specifically, the number of parameters to be estimated is not fixed but rather grows indefinitely with the sample size $n$: for each sample $y_i$ there is a group element $g_i$ that needs to be estimated. As a result, the MLE may not be consistent in this case. In other words, even in the limit $n \to \infty$ the estimator $x^{\mathrm{MLE}}$ may not converge to the ground truth $x^0$. Similarly, the estimated group elements will not converge to their true values. A different version of the MLE, not considered in this paper, in which the group elements are treated as nuisance parameters and are marginalized, would enjoy the nice theoretical properties.

2.1. Registration of signals on the sphere.

Consider the problem of estimating a bandlimited signal on the circle $x: S^1 \to \mathbb{C}$ from noisy, rotated, discretely sampled copies of it. In this problem, $\mathcal{X} = \mathrm{span}\{e^{ik\theta}\}_{k=-t}^{t}$ is the space of bandlimited functions up to degree $t$ on $S^1$, $\mathcal{G} = SO(2)$, and the group action is

$g \cdot x = \sum_{k=-t}^{t} \alpha_k e^{ik(\theta - \theta_g)},$

where $x = \sum_{k=-t}^{t} \alpha_k e^{ik\theta}$ and we identify $g \in SO(2)$ with $\theta_g \in [0, 2\pi]$.

The measurements are of the form

$y_i = \mathcal{P}(g_i \cdot x) + \epsilon_i, \quad i = 1, \ldots, n$

where

  • $x \in \mathcal{X}$,

  • $g_i \in SO(2)$,

  • $\mathcal{P}: \mathcal{X} \to \mathbb{C}^L$ samples the function at $L$ equally spaced points on $S^1$,

  • $\epsilon_i \sim \mathcal{N}(0, \sigma^2 I_{L \times L})$ $(i = 1, \ldots, n)$ are independent Gaussians.

Our objective is to estimate $g_1, \ldots, g_n$ and $x$. Since estimating $x$ knowing the group elements $g_i$ is considerably easier, we will focus on estimating $g_1, \ldots, g_n$. As shown below, this essentially reduces to the problem of aligning (or registering) the observations $y_1, \ldots, y_n$.

In the absence of noise, the problem of finding the $g_i$'s is trivial (cf. first column of Figure 2.1). With noise, if $x$ is known (as it is in some applications), then the problem of determining the $g_i$'s can be solved by matched filtering (cf. second column of Figure 2.1). However, $x$ is unknown in general. This, together with the high level of noise, renders the problem significantly more difficult (cf. last two columns of Figure 2.1).

Figure 2.1.

Illustration of the registration problem in $S^1$. The first column consists of a noiseless signal at three different shifts; the second column represents an instance in which the template $x$ is known and matched filtering is effective for estimating the shifts. However, in the examples we are interested in, the template is unknown (last two columns), rendering the problem of estimating the shifts significantly harder.

We now define the problem of registration in $d$ dimensions in general. $\mathcal{X} = \mathrm{span}\{p_k\}_{k \in \mathcal{A}_t}$ is the space of bandlimited functions up to degree $t$ on $S^d$, where the $p_k$'s are orthonormal polynomials on $S^d$, $\mathcal{A}_t$ indexes all $p_k$ up to degree $t$, and $\mathcal{G} = SO(d+1)$.

The measurements are of the form

$y_i = \mathcal{P}(g_i \cdot x) + \epsilon_i, \quad i = 1, \ldots, n$ (2.3)

where

  • $x \in \mathcal{X}$,

  • $g_i \in SO(d+1)$,

  • $\mathcal{P}: \mathcal{X} \to \mathbb{C}^L$ samples the function at $L$ points on $S^d$,

  • $\epsilon_i \sim \mathcal{N}(0, \sigma^2 I_{L \times L})$ $(i = 1, \ldots, n)$ are independent Gaussians.

Again, our objective is to estimate g1,,gn and x. We would like the sampling operator 𝒫 to be ‘uniform’. One possible sampling scheme is spherical designs surveyed in [11]. An illustration of signals on a sphere, sampled at such points, is provided in Figure 2.2.

Figure 2.2.

An illustration of registration in 2-dimensions. The left four spheres provide examples of clean signals yi and the right four spheres are of noisy observations. Note that the images are generated using a quantization of the sphere.

The MRA solution for registration in $d$ dimensions is given by

$\underset{g_1,\ldots,g_n,\,x}{\text{minimize}} \;\; \sum_{i=1}^{n} \left\| y_i - \mathcal{P}(g_i \cdot x) \right\|_2^2 \qquad \text{subject to } g_i \in SO(d+1),\; x \in \mathcal{X}$ (2.4)

We now remove $x$ from (2.4). Let $\mathcal{Q}: \mathbb{C}^L \to \mathcal{X}$ be the adjoint of $\mathcal{P}$. $\mathcal{Q}$ is also an approximate inverse of $\mathcal{P}$ (up to normalization), because the points are sampled from a $t$-design, which has the property of exactly integrating polynomials on the sphere. Then $\| y_i - \mathcal{P}(g_i \cdot x) \|_2^2 \approx \| \mathcal{Q}(y_i) - g_i \cdot x \|_2^2$ (up to normalization), and the approximation error decreases as $L$ increases. Since $g_i$ preserves the $\ell_2$ norm, it follows that (2.4) is equivalent to

$\underset{g_1,\ldots,g_n,\,x}{\text{minimize}} \;\; \sum_{i=1}^{n} \left\| g_i^{-1} \cdot \mathcal{Q}(y_i) - x \right\|_2^2 \qquad \text{subject to } g_i \in SO(d+1),\; x \in \mathcal{X}.$ (2.5)

Since the minimizer $x$ with fixed $g_i$'s is the average, (2.5) is equivalent to

$\underset{g_1,\ldots,g_n}{\text{minimize}} \;\; \sum_{i,j=1}^{n} \left\| g_i^{-1} \cdot \mathcal{Q}(y_i) - g_j^{-1} \cdot \mathcal{Q}(y_j) \right\|_2^2 \qquad \text{subject to } g_i \in SO(d+1).$ (2.6)

Since $g_i$ preserves the $\ell_2$ norm, (2.6) is equivalent to

$\underset{g_1,\ldots,g_n}{\text{minimize}} \;\; \sum_{i,j=1}^{n} \left\| \mathcal{Q}(y_i) - g_i g_j^{-1} \cdot \mathcal{Q}(y_j) \right\|_2^2 \qquad \text{subject to } g_i \in SO(d+1).$ (2.7)

In summary, (2.4) can be approximated by (2.7), which is an instance of (1.1).

2.2. Orientation estimation in cryo-EM.

The task here is to reconstruct the molecule density from many noisy tomographic projection images (see the right column of Figure 1.1 for an idealized density and measurement dataset). We assume the molecule does not have any non-trivial point group symmetry. The linear inverse problem of recovering the molecule density given the rotations fits in the framework of classical computerized tomography for which effective methods exist. Thus, we focus on the non-linear inverse problem of estimating the unknown rotations and the underlying density.

Figure 1.1.

Illustration of the cryo-EM imaging process: A molecule is imaged after being frozen at a random (unknown) rotation and a tomographic 2-dimensional projection is captured. Given a number of tomographic projections taken at unknown rotations, we are interested in determining such rotations with the objective of reconstructing the molecule density. Images courtesy of Amit Singer and Yoel Shkolnisky [42].

An added difficulty is the high level of noise in the images. In fact, it is already non-trivial to distinguish whether a molecule is present in an image or if the image consists only of noise (see Figure 2.3 for a subset of an experimental dataset). On the other hand, these datasets consist of many projection images which renders reconstruction possible.

Figure 2.3.

Sample images from the E. coli 50S ribosomal subunit, generously provided by Dr. Fred Sigworth at the Yale Medical School.

We formulate the problem of orientation estimation in cryo-EM. Let $\mathcal{X}$ be the space of bandlimited functions that are also essentially compactly supported in $\mathbb{R}^3$, and $\mathcal{G} = SO(3)$. For perfectly centered images, and ignoring the effect of the microscope's contrast transfer function, the measurements are of the form

$I_i(x, y) = \mathcal{P}(g_i \cdot \phi) + \epsilon_i, \quad i = 1, \ldots, n$ (2.8)

where

  • $\phi \in \mathcal{X}$,

  • $g_i \in SO(3)$,

  • $\mathcal{P}(\phi)$ samples $\int_{-\infty}^{\infty} \phi(x, y, z)\, dz$ ($\mathcal{P}$ is called the discrete X-ray transform),

  • the $\epsilon_i$'s are i.i.d. Gaussians representing noise.

Our objective is to find $g_1, \ldots, g_n$ and $\phi$.

The operator 𝒫 in the orientation estimation problem is different than in the registration problem. Specifically, 𝒫 is a composition of tomographic projection and sampling. To write the objective function for the orientation estimation problem, we will use the Fourier slice theorem [30].

The Fourier slice theorem states that the 2-dimensional Fourier transform of a tomographic projection of a molecular density ϕ coincides with the restriction to a plane normal to the projection direction, a slice, of the 3-dimensional Fourier transform of the density ϕ. See Figure 2.4.

Figure 2.4.

An illustration of the use of the Fourier slice theorem and the common lines approach to the orientation estimation problem in cryo-EM. Image courtesy of Amit Singer and Yoel Shkolnisky [42].

Let $\hat{I}_i(r, \theta)$ be the Fourier transform of $I_i$ in polar coordinates. We identify $\hat{I}_i$ and $\hat{I}_j$ with the $xy$-plane in $\mathbb{R}^3$, and apply $g_i^{-1}$ and $g_j^{-1}$ to $\hat{I}_i$ and $\hat{I}_j$, respectively. Then, the directions of the lines of intersection on $\hat{I}_i$ and $\hat{I}_j$ are given, respectively, by the unit vectors

$c_{ij}(g_i g_j^{-1}) = \dfrac{g_i\left( g_i^{-1} e_3 \times g_j^{-1} e_3 \right)}{\left\| g_i\left( g_i^{-1} e_3 \times g_j^{-1} e_3 \right) \right\|_2} = \dfrac{e_3 \times g_i g_j^{-1} e_3}{\left\| e_3 \times g_i g_j^{-1} e_3 \right\|_2},$ (2.9)

$c_{ji}(g_i g_j^{-1}) = \dfrac{g_j\left( g_i^{-1} e_3 \times g_j^{-1} e_3 \right)}{\left\| g_j\left( g_i^{-1} e_3 \times g_j^{-1} e_3 \right) \right\|_2} = \dfrac{(g_i g_j^{-1})^{-1} e_3 \times e_3}{\left\| (g_i g_j^{-1})^{-1} e_3 \times e_3 \right\|_2},$ (2.10)

where $e_3 \equiv (0, 0, 1)^T$. See [42] for details.

Since the noiseless images should agree on their common lines, we consider the following MRA-like cost function:

$\underset{g_1,\ldots,g_n}{\text{minimize}} \;\; \sum_{i,j=1}^{n} \left\| \hat{I}_i\!\left( \cdot, c_{ij}(g_i g_j^{-1}) \right) - \hat{I}_j\!\left( \cdot, c_{ji}(g_i g_j^{-1}) \right) \right\|_2^2 \qquad \text{subject to } g_i \in SO(3),$ (2.11)

where, with a minor abuse of notation, we identify the vector $c_{ij}$ with the angle $\theta_{ij}$ of a common line in the Fourier transform of an image $\hat{I}_i$. Equation (2.11) is an instance of (1.1). Note that we could also use the $L^1$ norm or a weighted $L^2$ norm in the cost function.

Note that for $n = 2$ images, there is always a degree of freedom along the line of intersection. In other words, we cannot recover the true relative orientation between $\hat{I}_1$ and $\hat{I}_2$. However, for $n \geq 3$, this degree of freedom is eliminated. It is also worth mentioning several important references in the context of angular reconstitution [22, 45]. In general, the measurement system suffers from a handedness ambiguity on the reconstruction (see, for example, [42]); this will be discussed in detail later in the paper.

3. Linearization via Fourier expansion

Let us consider the objective function in the general form

$\sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1}).$ (3.1)

Note that each $f_{ij}$ in (3.1) can be nonlinear and nonconvex. However, since $\mathcal{G}$ is compact (and since $f_{ij} \in C(\mathcal{G})$), we can expand each $f_{ij}$ in a Fourier series. More precisely, given the unitary irreducible representations $\rho_k$ of $\mathcal{G}$, we can write

$f_{ij}(g_i g_j^{-1}) = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \hat{f}_{ij}(k)\, \rho_k(g_i g_j^{-1}) \right] = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \hat{f}_{ij}(k)\, \rho_k(g_i)\, \rho_k^*(g_j) \right],$ (3.2)

where the $\hat{f}_{ij}(k)$ are the Fourier coefficients of $f_{ij}$ and can be computed from $f_{ij}$ via the Fourier transform

$\hat{f}_{ij}(k) \equiv \int_{\mathcal{G}} f_{ij}(g)\, \rho_k(g^{-1})\, dg = \int_{\mathcal{G}} f_{ij}(g)\, \rho_k^*(g)\, dg.$ (3.3)

Above, $dg$ denotes the Haar measure on $\mathcal{G}$ and $d_k$ the dimension of the representation $\rho_k$. See [17] for an introduction to the representations of compact groups.

We express the objective function (3.1) as

$\sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1}) = \sum_{i,j=1}^{n} \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \hat{f}_{ij}(k)\, \rho_k(g_i)\, \rho_k^*(g_j) \right] = \sum_{k=0}^{\infty} \sum_{i,j=1}^{n} d_k \, \mathrm{tr}\!\left[ \hat{f}_{ij}(k)\, \rho_k(g_i)\, \rho_k^*(g_j) \right],$

which is linear in $\rho_k(g_i)\rho_k^*(g_j)$. This motivates writing (1.1) as a linear optimization problem over the variables

$X^{(k)} \equiv \begin{bmatrix} \rho_k(g_1) \\ \vdots \\ \rho_k(g_n) \end{bmatrix} \begin{bmatrix} \rho_k(g_1) \\ \vdots \\ \rho_k(g_n) \end{bmatrix}^{*}.$

In other words,

$\sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1}) = \sum_{k=0}^{\infty} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right],$

where the coefficient matrices are given by

$C^{(k)} \equiv d_k \begin{bmatrix} \hat{f}_{11}(k) & \hat{f}_{21}(k) & \cdots & \hat{f}_{n1}(k) \\ \hat{f}_{12}(k) & \hat{f}_{22}(k) & \cdots & \hat{f}_{n2}(k) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{f}_{1n}(k) & \hat{f}_{2n}(k) & \cdots & \hat{f}_{nn}(k) \end{bmatrix}.$

We refer to the $d_k \times d_k$ block of $X^{(k)}$ corresponding to $\rho_k(g_i)\rho_k^*(g_j) = \rho_k(g_i g_j^{-1})$ as $X_{ij}^{(k)}$. We now turn our attention to constraints on the variables $\{X^{(k)}\}_{k=0}^{\infty}$. It is easy to see that:

$X^{(k)} \succeq 0, \quad \forall k$ (3.4)
$X_{ii}^{(k)} = I_{d_k \times d_k}, \quad \forall k, i,$ (3.5)
$\mathrm{rank}\, X^{(k)} = d_k, \quad \forall k,$ (3.6)
$X_{ij}^{(k)} \in \mathrm{Im}\, \rho_k, \quad \forall k, i, j.$ (3.7)

Constraints (3.4), (3.5) and (3.6) ensure that $X^{(k)}$ is of the form

$X^{(k)} = \begin{bmatrix} X_1^{(k)} \\ X_2^{(k)} \\ \vdots \\ X_n^{(k)} \end{bmatrix} \begin{bmatrix} X_1^{(k)} \\ X_2^{(k)} \\ \vdots \\ X_n^{(k)} \end{bmatrix}^{*},$

for some unitary $d_k \times d_k$ matrices $X_i^{(k)}$. Constraint (3.7) attempts to ensure that $X_i^{(k)}$ is in the image of the representation of $\mathcal{G}$. Notably, none of these constraints ensures that, for different values of $k$, the $X_{ij}^{(k)}$ correspond to the same group element $g_i g_j^{-1}$. Adding such a constraint would yield

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{\infty} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \mathrm{rank}\, X^{(k)} = d_k \\ & X_{ij}^{(k)} = \rho_k(g_i g_j^{-1}), \end{aligned}$ (3.8)

where $g_i$ and $g_j$ are elements of $\mathcal{G}$.

Unfortunately, both the rank constraint and the last constraint in (3.8) are, in general, nonconvex. We will relax (3.8) by dropping the rank requirement and replacing the last constraint by positivity constraints that couple the different $X^{(k)}$'s. We achieve this by considering the Dirac delta function on $\mathcal{G}$. Notice that the Dirac delta function $\delta(g)$ at the identity $e \in \mathcal{G}$ can be expanded as

$\delta(g) = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \hat{\delta}(k)\, \rho_k(g) \right] = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \left( \int_{\mathcal{G}} \delta(h)\, \rho_k^*(h)\, dh \right) \rho_k(g) \right] = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k(g) \right].$

If we replace $g$ with $g^{-1} g_i g_j^{-1}$, then we get

$\delta(g^{-1} g_i g_j^{-1}) = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k(g^{-1})\, \rho_k(g_i g_j^{-1}) \right] = \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right].$

To arrive at a convex program, we consider the following convex constraints, which form a natural convex relaxation for Dirac deltas:

$\sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] \geq 0 \quad \forall g \in \mathcal{G},$ (3.9)
$\int_{\mathcal{G}} \left( \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] \right) dg = 1.$ (3.10)

This suggests relaxing (3.8) to

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{\infty} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] \geq 0 \quad \forall g \in \mathcal{G} \\ & \int_{\mathcal{G}} \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] dg = 1. \end{aligned}$ (3.11)

For a nontrivial irreducible representation $\rho_k$, we have $\int_{\mathcal{G}} \rho_k(g)\, dg = 0$. This means that the integral constraint in (3.11) is equivalent to the constraint

$X_{ij}^{(0)} = 1, \quad \forall i, j.$

Thus, we focus on the optimization problem

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{\infty} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] \geq 0 \quad \forall g \in \mathcal{G} \\ & X_{ij}^{(0)} = 1. \end{aligned}$ (3.12)

When $\mathcal{G}$ is a finite group, it has only a finite number of irreducible representations. This means that (3.12) is a semidefinite program and can be solved, to arbitrary precision, in polynomial time [46]. In fact, when $\mathcal{G} \cong \mathbb{Z}_L$, a suitable change of basis shows that (3.12) is equivalent to the semidefinite programming relaxation proposed in [8] for the signal alignment problem.
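To make the finite-group case concrete, the following is a minimal sketch of (3.12) for $\mathcal{G} = \mathbb{Z}_L$ (where $d_k = 1$ and $\rho_k(g) = e^{2\pi i k g / L}$), written in Python under the assumption that numpy and the cvxpy modeling package are available. It is an illustration of the formulation only, not the solver used in our experiments.

```python
# NUG SDP (3.12) over the cyclic group Z_L.
# f: (n, n, L) array with f[i, j, g] = f_ij(g).
import numpy as np
import cvxpy as cp

def nug_sdp_cyclic(f):
    n, _, L = f.shape
    # Fourier coefficients over Z_L: fhat[i, j, k] = (1/L) sum_g f_ij(g) exp(-2*pi*i*k*g/L).
    fhat = np.fft.fft(f, axis=2) / L
    # One Hermitian n x n block per irreducible representation.
    X = [cp.Variable((n, n), hermitian=True) for _ in range(L)]
    constraints = [Xk >> 0 for Xk in X]
    constraints += [cp.diag(Xk) == 1 for Xk in X]
    constraints += [X[0] == np.ones((n, n))]              # X^(0)_ij = 1
    # Relaxed delta positivity at every group element g (constraint (3.9)).
    for g in range(L):
        rho_conj = np.exp(-2j * np.pi * np.arange(L) * g / L)
        constraints.append(cp.real(sum(rho_conj[k] * X[k] for k in range(L))) >= 0)
    # Objective: sum_k tr(C^(k) X^(k)) with C^(k)_{ij} = fhat_{ji}(k).
    cost = cp.real(sum(cp.trace(fhat[:, :, k].T @ X[k]) for k in range(L)))
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return [Xk.value for Xk in X]
```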

Unfortunately, many of the applications of interest involve infinite groups. This creates two obstacles to solving (3.12). One is due to the infinite sum in the objective function and the other due to the infinite number of positivity constraints. In the next section, we address these two obstacles for the groups SO(2) and SO(3).

4. Finite truncations for SO(2) and SO(3) via Fejér kernels

The objective of this section is to replace (3.12) by an optimization problem depending only on finitely many variables $X^{(k)}$. The objective function in (3.12) is converted from an infinite sum to a finite sum by truncating at degree $t$. That is, we fix a $t$ and set $C^{(k)} = 0$ for $k > t$. This amounts to truncating the Fourier series of $\sum_{i,j=1}^{n} f_{ij}(g_i g_j^{-1})$. Unfortunately, constraint (3.9), given by

$\sum_{k=0}^{\infty} d_k \, \mathrm{tr}\!\left[ \rho_k^*(g)\, X_{ij}^{(k)} \right] \geq 0 \quad \forall g \in \mathcal{G},$

still involves infinitely many variables Xij(k) and consists of infinitely many linear constraints.

We now address this issue for the groups SO(2) and SO(3).

4.1. Truncation for SO(2).

Since we truncated the objective function at degree $t$, it is natural to truncate the infinite sum in constraint (3.9) also at $t$. If we truncated below $t$, then some variables (such as $X^{(t)}$) would not be constrained; and if we truncated above $t$, then some variables (such as $X^{(t+1)}$) would not affect the cost function. The irreducible representations of $SO(2)$ are $e^{ik\theta}$, and $d_k = 1$ for all $k$. Let us identify $g \in SO(2)$ with $\theta_g \in [0, 2\pi]$. This straightforward truncation corresponds to approximating the Dirac delta with

$\delta(g) \approx \sum_{k=-t}^{t} e^{ik\theta_g}.$

This approximation is known as the Dirichlet kernel, which we denote by

$D_t(\theta) \equiv \sum_{k=-t}^{t} e^{ik\theta}.$

However, the Dirichlet kernel does not inherit all the desirable properties of the delta function. In fact, Dt(θ) is negative for some values of θ.

Instead, we use the Fejér kernel, which is a non-negative kernel, to approximate the Dirac delta. The Fejér kernel is defined as

$F_t(\theta) \equiv \frac{1}{t} \sum_{k=0}^{t-1} D_k(\theta) = \sum_{k=-t}^{t} \left( 1 - \frac{|k|}{t} \right) e^{ik\theta},$

which is the first-order Cesàro mean of the Dirichlet kernel.
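As a quick numerical illustration of the non-negativity claim, the following minimal numpy sketch evaluates both kernels on a fine grid; it is only a sanity check of the definitions above.

```python
# Dirichlet kernel D_t can dip below zero, while the Fejér kernel F_t stays non-negative.
import numpy as np

t = 8
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
k = np.arange(-t, t + 1)

# D_t(theta) = sum_{k=-t}^{t} exp(i k theta)
dirichlet = np.real(np.exp(1j * np.outer(theta, k)).sum(axis=1))
# F_t(theta) = sum_{k=-t}^{t} (1 - |k|/t) exp(i k theta)
fejer = np.real(np.exp(1j * np.outer(theta, k)) @ (1.0 - np.abs(k) / t))

print("min of Dirichlet kernel:", dirichlet.min())   # strictly negative
print("min of Fejér kernel:   ", fejer.min())        # >= 0 up to rounding error
```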

This motivates us to replace constraint (3.9) with

$\sum_{k=-t}^{t} \left( 1 - \frac{|k|}{t} \right) e^{-ik\theta}\, X_{ij}^{(k)} \geq 0 \quad \forall \theta \in [0, 2\pi],$

where, for $k > 0$, $X_{ij}^{(-k)}$ denotes $X_{ij}^{(k)*}$.

This suggests considering

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{t} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=-t}^{t} \left( 1 - \tfrac{|k|}{t} \right) e^{-ik\theta}\, X_{ij}^{(k)} \geq 0 \quad \forall \theta \in [0, 2\pi] \\ & X_{ij}^{(0)} = 1, \end{aligned}$

which depends only on the variables $X_{ij}^{(k)}$ for $k = 0, \ldots, t$.

Unfortunately, the condition that the trigonometric polynomial $\sum_{k=-t}^{t} \left( 1 - \frac{|k|}{t} \right) e^{-ik\theta} X_{ij}^{(k)}$ is always non-negative still involves an infinite number of linear inequalities. Interestingly, due to the Fejér-Riesz factorization theorem (see [19]), this condition can be replaced by an equivalent condition involving a positive semidefinite matrix: it turns out that every nonnegative trigonometric polynomial is a square, meaning that the so-called sum-of-squares relaxation [33, 34] is exact. However, while such a formulation would still be an SDP and thus solvable, up to arbitrary precision, in polynomial time, it would involve a positive semidefinite variable for every pair $(i,j)$, rendering it computationally challenging. For this reason we relax the non-negativity constraint by asking that $\sum_{k=-t}^{t} \left( 1 - \frac{|k|}{t} \right) e^{-ik\theta} X_{ij}^{(k)}$ be non-negative on a finite set $\Omega_t \subset SO(2)$. This yields the following optimization problem:

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{t} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=-t}^{t} \left( 1 - \tfrac{|k|}{t} \right) e^{-ik\theta}\, X_{ij}^{(k)} \geq 0 \quad \forall \theta \in \Omega_t \\ & X_{ij}^{(0)} = 1. \end{aligned}$ (4.1)

4.2. Truncation for SO(3).

The irreducible representations of $SO(3)$ are the Wigner-D matrices $W^{(k)}(\alpha, \beta, \gamma)$, and $d_k = 2k+1$. See [48] for an introduction to Wigner-D matrices. Let us associate $g \in SO(3)$ with the Euler (Z-Y-Z) angles $(\alpha, \beta, \gamma) \in [0, 2\pi] \times [0, \pi] \times [0, 2\pi]$. A straightforward truncation yields the approximation

$\delta(g) \approx \sum_{k=0}^{t} (2k+1)\, \mathrm{tr}\!\left[ W^{(k)}(\alpha, \beta, \gamma) \right].$

Observe that the trace is invariant under conjugation. The matrix $W^{(k)}$ can be decomposed as

$W^{(k)}(\alpha, \beta, \gamma) = R\, \Lambda^{(k)}(\theta)\, R^*$

with an R such that

$\Lambda^{(k)}(\theta) = \mathrm{diag}\!\left( e^{-ik\theta}, \ldots, 1, \ldots, e^{ik\theta} \right).$

We can think of $R$ as a change of basis and of $\Lambda^{(k)}(\theta)$ as a rotation from $SO(2)$ in the basis $R$. It follows that

$\mathrm{tr}\!\left[ W^{(k)}(\alpha, \beta, \gamma) \right] = \mathrm{tr}\!\left[ \Lambda^{(k)}(\theta) \right] = \sum_{m=-k}^{k} e^{im\theta} = D_k(\theta).$

The relationship between $\theta$ and $(\alpha, \beta, \gamma)$ is

$\theta = 2 \arccos\!\left( \cos\frac{\beta}{2}\, \cos\frac{\alpha + \gamma}{2} \right).$

This relationship can be obtained by directly evaluating $\mathrm{tr}\, W^{(1)}(\alpha, \beta, \gamma)$ using the Wigner-d matrix $w^{(1)}$:

$\mathrm{tr}\!\left[ W^{(1)}(\alpha, \beta, \gamma) \right] = \sum_{m=-1}^{1} W^{(1)}_{m,m}(\alpha, \beta, \gamma) = \sum_{m=-1}^{1} e^{im(\alpha+\gamma)}\, w^{(1)}_{m,m}(\beta) = \cos\beta\left( 1 + \cos(\alpha+\gamma) \right) + \cos(\alpha+\gamma).$
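As a quick numerical sanity check of this relation, the trace of the $3 \times 3$ rotation matrix with Z-Y-Z Euler angles $(\alpha, \beta, \gamma)$ must equal $1 + 2\cos\theta$. The following minimal sketch assumes numpy and scipy are available and is purely illustrative.

```python
# Verify theta = 2*arccos(cos(beta/2)*cos((alpha+gamma)/2)) via the rotation-matrix trace.
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)
for _ in range(5):
    alpha, gamma = rng.uniform(0, 2 * np.pi, size=2)
    beta = rng.uniform(0, np.pi)
    R = Rotation.from_euler("ZYZ", [alpha, beta, gamma]).as_matrix()
    theta = 2.0 * np.arccos(np.cos(beta / 2) * np.cos((alpha + gamma) / 2))
    print(np.trace(R), 1 + 2 * np.cos(theta))   # the two values agree
```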

See [48] also for an introduction to the Wigner-d matrices. This straightforward truncation at $t$ yields

$\delta(g) \approx \sum_{k=0}^{t} (2k+1)\, D_k(\theta),$

which, again, inherits the undesirable property that the approximation can be negative for some $\theta$. Recall that we circumvented this property in the 1-dimensional case by taking the first-order Cesàro mean of the Dirichlet kernel. In the 2-dimensional case, we will need the second-order Cesàro mean. Notice that

$D_k(\theta) = \frac{\sin\!\left( (2k+1)\frac{\theta}{2} \right)}{\sin\frac{\theta}{2}}.$

Fejér proved that [4]

$\sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \frac{1}{2} \right) \sin\!\left( (2k+1)\frac{\theta}{2} \right) \geq 0, \quad 0 \leq \theta \leq \pi,$

where $\frac{(3)_{t-k}}{(t-k)!} = \frac{1}{2}(t-k+2)(t-k+1)$. It follows that

$\sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \frac{1}{2} \right) D_k(\theta) = \sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \frac{1}{2} \right) \frac{\sin\!\left[ (2k+1)\frac{\theta}{2} \right]}{\sin\frac{\theta}{2}} \geq 0, \quad 0 \leq \theta \leq \pi.$

Let us define

$F_t(g) = F_t(\alpha, \beta, \gamma) \equiv \sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \frac{1}{2} \right) \frac{\sin\!\left( (2k+1)\frac{\theta_g}{2} \right)}{\sin\frac{\theta_g}{2}},$

where $\theta_g = 2\arccos\!\left( \cos\frac{\beta}{2}\, \cos\frac{\alpha+\gamma}{2} \right)$.

We replace constraint (3.9) with

$F_t(\alpha, \beta, \gamma) \geq 0 \quad \forall (\alpha, \beta, \gamma) \in [0, 2\pi] \times [0, \pi] \times [0, 2\pi].$

Secondly, we discretize the group $SO(3)$ to obtain a finite number of constraints. We consider a suitable finite subset $\Omega_t \subset SO(3)$. In our implementation, we use a Hopf fibration [49] to discretize $SO(3)$. The quotient space $SO(3)/SO(2)$ is equivalent to $S^2$. We take a uniform discretization of $S^1 \cong SO(2)$ and a spherical design [11] on $S^2$. It is possible to find a spherical design on $S^2$ with $\mathcal{O}(r^2)$ points [11]. Following [49], we use $\mathcal{O}(r)$ points to discretize $SO(2)$. The size of our $SO(3)$ discretization is therefore $\mathcal{O}(r^3)$, so we have to enforce $\mathcal{O}(r^3)$ inequality constraints. The choice of $r$ is up to the user, striking a balance between computational speed and accuracy. We can then relax the non-negativity constraint, yielding the following semidefinite program:

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{t} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \tfrac{1}{2} \right) \mathrm{tr}\!\left[ W^{(k)}(\alpha, \beta, \gamma)^*\, X_{ij}^{(k)} \right] \geq 0 \quad \forall (\alpha, \beta, \gamma) \in \Omega_t \\ & X_{ij}^{(0)} = 1. \end{aligned}$ (4.2)

4.3. An additional constraint on X(1).

In this section, we discuss an additional constraint on $X^{(1)}$, which uses properties of quaternions to constrain each block $X_{ij}^{(1)}$ of $X^{(1)}$ to the convex hull of $SO(3)$ more directly.

We consider the standard rotation matrix $R$ and the unit quaternion $q = q_r + q_i \mathbf{i} + q_j \mathbf{j} + q_k \mathbf{k}$ that represent the same rotation as the block $X_{ij}^{(1)}$ (whose representation is associated with spherical harmonics), and we consider the outer product $Q_{ij}$ of the unit quaternion $q$:

$Q_{ij} = q^T q.$ (4.3)

The standard rotation matrix R is related to the block Xij(1) by the formula

$R^* = M^* T^* X_{ij}^{(1)} T M,$

where,

$T \equiv \begin{bmatrix} -\frac{i}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ -\frac{i}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{bmatrix},$

and

$M \equiv \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0 \end{bmatrix}.$

Indeed, one can verify that

$T^* X_{ij}^{(1)} T = \begin{bmatrix} R_{[22]} & R_{[32]} & -R_{[12]} \\ R_{[23]} & R_{[33]} & -R_{[13]} \\ -R_{[21]} & -R_{[31]} & R_{[11]} \end{bmatrix},$

where,

$R = \begin{bmatrix} R_{[11]} & R_{[12]} & R_{[13]} \\ R_{[21]} & R_{[22]} & R_{[23]} \\ R_{[31]} & R_{[32]} & R_{[33]} \end{bmatrix},$

and $M$ is used simply to rearrange the elements of $T^* X_{ij}^{(1)} T$.

Next, R is mapped to Qij by Rodrigues’ rotation formula (see [37]):

$Q_{ij} = \frac{1}{4}\begin{bmatrix} 1 + R_{11} + R_{22} + R_{33} & R_{32} - R_{23} & R_{13} - R_{31} & R_{21} - R_{12} \\ R_{32} - R_{23} & 1 + R_{11} - R_{22} - R_{33} & R_{21} + R_{12} & R_{31} + R_{13} \\ R_{13} - R_{31} & R_{21} + R_{12} & 1 + R_{22} - R_{11} - R_{33} & R_{23} + R_{32} \\ R_{21} - R_{12} & R_{31} + R_{13} & R_{23} + R_{32} & 1 + R_{33} - R_{11} - R_{22} \end{bmatrix}.$
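The following minimal sketch (assuming numpy and scipy are available) checks numerically that the $4 \times 4$ matrix built from $R$ by this formula equals the outer product of the corresponding unit quaternion; it is only a verification of the identity, not part of the solver.

```python
# Check: Q built from a rotation matrix R equals q q^T for the unit quaternion
# q = (q_r, q_i, q_j, q_k) representing the same rotation.
import numpy as np
from scipy.spatial.transform import Rotation

def quaternion_outer_from_R(R):
    r11, r12, r13 = R[0]
    r21, r22, r23 = R[1]
    r31, r32, r33 = R[2]
    return 0.25 * np.array([
        [1 + r11 + r22 + r33, r32 - r23,           r13 - r31,           r21 - r12],
        [r32 - r23,           1 + r11 - r22 - r33, r21 + r12,           r31 + r13],
        [r13 - r31,           r21 + r12,           1 + r22 - r11 - r33, r23 + r32],
        [r21 - r12,           r31 + r13,           r23 + r32,           1 + r33 - r11 - r22],
    ])

rot = Rotation.random(random_state=0)
R = rot.as_matrix()
x, y, z, w = rot.as_quat()             # scipy stores the scalar part last
q = np.array([w, x, y, z])             # reorder to (q_r, q_i, q_j, q_k)
print(np.allclose(quaternion_outer_from_R(R), np.outer(q, q)))  # True (q and -q give the same Q)
```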

In summary, the mapping from $X_{ij}^{(1)}$ to $Q_{ij}$, which we denote by $\mathcal{A}_{Eq}$, is obtained by composing the change of basis $R^* = M^* T^* X_{ij}^{(1)} T M$ with the formula above:

$Q_{ij} = \mathcal{A}_{Eq}\!\left( X_{ij}^{(1)} \right).$

For each Qij, we wish to impose the constraints implied by Equation (4.3):

$Q_{ij} \succeq 0, \quad \mathrm{tr}\, Q_{ij} = 1, \quad \mathrm{rank}\, Q_{ij} = 1.$

But since the constraint $\mathrm{rank}\, Q_{ij} = 1$ is not convex, we drop it, giving us the following SDP in place of (4.2):

$\begin{aligned} \underset{\{X^{(k)}\}}{\text{minimize}} \quad & \sum_{k=0}^{t} \mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] \\ \text{subject to} \quad & X^{(k)} \succeq 0 \\ & X_{ii}^{(k)} = I_{d_k \times d_k} \\ & \sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \tfrac{1}{2} \right) \mathrm{tr}\!\left[ W^{(k)}(\alpha, \beta, \gamma)^*\, X_{ij}^{(k)} \right] \geq 0 \quad \forall (\alpha, \beta, \gamma) \in \Omega_t \\ & X_{ij}^{(0)} = 1 \\ & Q_{ij} = \mathcal{A}_{Eq}\!\left( X_{ij}^{(1)} \right), \quad Q_{ij} \succeq 0, \quad \mathrm{tr}\, Q_{ij} = 1. \end{aligned}$ (4.4)

5. Applications

In this section, we consider the application of (4.1) to the problem of registration over the unit circle and the application of (4.4) to registration over the unit sphere and to orientation estimation in cryo-EM. To solve the SDP for each problem, the only parameters we need to determine are the coefficient matrices $C^{(k)}$ and the truncation parameter $t$. The calculations of the $C^{(k)}$ are detailed in the following pages. As for $t$, we experimented over a range of values for each problem and chose a value that balances computational time with accuracy in the estimated rotations. The SDP outputs the relative transformations $g_{ij} = g_i g_j^{-1}$, while we need $g_1, \ldots, g_n$. For each problem, we describe a rounding procedure to recover $g_1, \ldots, g_n$ from the $g_{ij}$'s.

5.1. Registration in 1-dimension.

Recall that $\mathcal{X}$ is the space of bandlimited functions up to degree $t$ on $S^1$. That is, for $x \in \mathcal{X}$, we can express

$x(\omega) = \sum_{l=-t}^{t} \alpha_l e^{il\omega}.$

Again, the irreducible representations of $SO(2)$ are $e^{ik\theta}$, and $d_k = 1$ for all $k$. Let us identify $g \in SO(2)$ with $\theta_g \in [0, 2\pi]$; then

$(g \cdot x)(\omega) = \sum_{l=-t}^{t} e^{-il\theta_g}\, \alpha_l\, e^{il\omega}.$

Let $\mathcal{P}$ sample the underlying signal $x$ at $L = 2t+1$ distinct points. This way, we can determine all the $\alpha_l$'s associated with $x$.

Since $y_i = \mathcal{P}(g_i \cdot x) + \epsilon_i$, for the adjoint $\mathcal{Q}$ we have

$\mathcal{Q}(y_i)(\omega) = \sum_{l=-t}^{t} \alpha_l^{(i)} e^{il\omega}.$

Let us identify $g_i g_j^{-1}$ with $\theta_{ij} \in [0, 2\pi]$. Then, we can express $f_{ij}$ in terms of $\alpha_l^{(i)}, \alpha_l^{(j)}$ and $\theta_{ij}$:

$f_{ij}(g_i g_j^{-1}) = \left\| \mathcal{Q}(y_i) - g_i g_j^{-1} \cdot \mathcal{Q}(y_j) \right\|_2^2 = \int_{S^1} \left| \sum_{l=-t}^{t} \left( \alpha_l^{(i)} - \alpha_l^{(j)} e^{-il\theta_{ij}} \right) e^{il\omega} \right|^2 d\omega = \sum_{l=-t}^{t} \left| \alpha_l^{(i)} - \alpha_l^{(j)} e^{-il\theta_{ij}} \right|^2.$

The Fourier coefficients of fij are

$\hat{f}_{ij}(k) = \int_0^{2\pi} \sum_{l=-t}^{t} \left| \alpha_l^{(i)} - \alpha_l^{(j)} e^{-il\theta_{ij}} \right|^2 e^{-ik\theta_{ij}}\, d\theta_{ij} = \int_0^{2\pi} \sum_{l=-t}^{t} \left( \left( |\alpha_l^{(i)}|^2 + |\alpha_l^{(j)}|^2 \right) e^{-ik\theta_{ij}} - \alpha_l^{(i)} \overline{\alpha_l^{(j)}}\, e^{i(l-k)\theta_{ij}} - \overline{\alpha_l^{(i)}} \alpha_l^{(j)}\, e^{-i(l+k)\theta_{ij}} \right) d\theta_{ij} = 2\pi \begin{cases} \sum_{l=-t}^{t} \left( |\alpha_l^{(i)}|^2 + |\alpha_l^{(j)}|^2 \right) - \alpha_0^{(i)} \overline{\alpha_0^{(j)}} - \overline{\alpha_0^{(i)}} \alpha_0^{(j)}, & k = 0 \\ -\alpha_k^{(i)} \overline{\alpha_k^{(j)}} - \overline{\alpha_{-k}^{(i)}} \alpha_{-k}^{(j)}, & k \neq 0 \end{cases}$

Note that we re-indexed the coefficients $\hat{f}_{ij}(k) \mapsto \hat{f}_{ij}(k - (t+1))$.

5.1.1. Rounding.

(4.1) gives us the X(k)’s. From the X(k)’s, we want to extract each θi[0,2π] (up to a global transformation). Let us consider X(1). We want X(1) to be of the form

eiθ1eiθ2eiθneiθ1eiθ2eiθn*.

Although X(1) is not guaranteed to be rank 1, we will simply take the top eigenvector of X(1) as our estimate of eiθ1,,eiθn. And from eiθ1,,eiθn, we can recover θ1,,θn. See [40] for the reasoning behind this approach. In practice, we find using the top eigenvector of X(1) is a sufficient estimate of θ1,,θn. We do not need to run additional comparisons against X(2),,X(t).
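A minimal numpy sketch of this rounding step is given below; it is an illustration of the spectral rounding just described, under the assumption that $X^{(1)}$ is available as a dense Hermitian matrix.

```python
# Estimate theta_1, ..., theta_n (up to a global shift) from the top eigenvector of X^(1).
import numpy as np

def round_so2(X1):
    """X1: (n, n) Hermitian matrix returned by the SDP for k = 1."""
    eigvals, eigvecs = np.linalg.eigh(X1)     # ascending eigenvalues
    v = eigvecs[:, -1]                        # top eigenvector
    v = v / np.abs(v)                         # project entries onto the unit circle
    return np.mod(np.angle(v), 2 * np.pi)     # angles, defined up to a global shift
```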

5.2. Registration in 2-dimension.

Recall that $\mathcal{X}$ is the space of bandlimited functions up to degree $t$ on $S^2$. That is, for $x \in \mathcal{X}$, we can express

$x(\omega) = \sum_{l=0}^{t} \sum_{m=-l}^{l} \alpha_{l,m}\, Y_{l,m}(\omega),$

where the $Y_{l,m}$ are the spherical harmonics. Again, the irreducible representations of $SO(3)$ are the Wigner D-matrices $W^{(k)}(\alpha, \beta, \gamma)$, and $d_k = 2k+1$. Let us associate $g \in SO(3)$ with the Euler (Z-Y-Z) angles $(\alpha, \beta, \gamma) \in [0, 2\pi] \times [0, \pi] \times [0, 2\pi]$; then

$(g \cdot x)(\omega) = \sum_{l=0}^{t} \sum_{m, m'=-l}^{l} W^{(l)}_{m, m'}(\alpha, \beta, \gamma)\, \alpha_{l, m'}\, Y_{l, m}(\omega).$

Let $\mathcal{P}$ sample the underlying signal $x$ at $L = (t+1)^2$ points. This way, we can determine all the $\alpha_{l,m}$'s associated with $x$.

Again, for the adjoint $\mathcal{Q}$ we have

$\mathcal{Q}(y_i)(\omega) = \sum_{l=0}^{t} \sum_{m=-l}^{l} \alpha_{l,m}^{(i)}\, Y_{l,m}(\omega).$

Let us identify $g_i g_j^{-1} \in SO(3)$ with the Euler (Z-Y-Z) angles $(\alpha_{ij}, \beta_{ij}, \gamma_{ij}) \in [0, 2\pi] \times [0, \pi] \times [0, 2\pi]$. Then, we can express $f_{ij}$ in terms of $\alpha_{l,m}^{(i)}, \alpha_{l,m}^{(j)}$ and $(\alpha_{ij}, \beta_{ij}, \gamma_{ij})$:

$f_{ij}(g_i g_j^{-1}) = \left\| \mathcal{Q}(y_i) - g_i g_j^{-1} \cdot \mathcal{Q}(y_j) \right\|_2^2 = \sum_{l=0}^{t} \sum_{m=-l}^{l} \left( |\alpha_{l,m}^{(i)}|^2 + |\alpha_{l,m}^{(j)}|^2 \right) - \sum_{l=0}^{t} \sum_{m, m'=-l}^{l} \alpha_{l,m}^{(i)}\, \overline{W^{(l)}_{m,m'}(\alpha_{ij}, \beta_{ij}, \gamma_{ij})}\, \overline{\alpha_{l,m'}^{(j)}} - \sum_{l=0}^{t} \sum_{m, m'=-l}^{l} \overline{\alpha_{l,m}^{(i)}}\, W^{(l)}_{m,m'}(\alpha_{ij}, \beta_{ij}, \gamma_{ij})\, \alpha_{l,m'}^{(j)}.$

The Fourier coefficients are given by

$\hat{f}_{ij}(k) = \int_{SO(3)} f_{ij}(g) \left( W^{(k)}(\alpha, \beta, \gamma) \right)^{*} dg = \frac{8\pi^2}{2k+1} \begin{cases} \sum_{l=0}^{t} \sum_{m=-l}^{l} \left( |\alpha_{l,m}^{(i)}|^2 + |\alpha_{l,m}^{(j)}|^2 \right) - \alpha_{0,0}^{(i)} \overline{\alpha_{0,0}^{(j)}} - \overline{\alpha_{0,0}^{(i)}} \alpha_{0,0}^{(j)}, & k = 0 \\ \left( -(-1)^{m-m'}\, \alpha_{k,m}^{(i)} \overline{\alpha_{k,m'}^{(j)}} - \overline{\alpha_{k,m}^{(i)}}\, \alpha_{k,m'}^{(j)} \right)_{m, m'=-k}^{k}, & k \neq 0 \end{cases}$

Here, we used the orthogonality relationship

$\int_{SO(3)} \overline{W^{(k')}_{m_1', m_2'}(\alpha, \beta, \gamma)}\, W^{(k)}_{m_1, m_2}(\alpha, \beta, \gamma)\, dg = \frac{8\pi^2}{2k+1}\, \delta_{k, k'}\, \delta_{m_1, m_1'}\, \delta_{m_2, m_2'},$

and the property

$W^{(k)}_{m, m'}(\alpha, \beta, \gamma) = (-1)^{m - m'}\, W^{(k)}_{-m, -m'}(-\alpha, \beta, -\gamma).$

5.2.1. Rounding.

Again, (4.4) gives us the X(k)’s. From the X(k)’s, we want to extract each (α,β,γ)[0,2π]×[0,π]×[0,2π] (up to a global transformation). Let us consider X(1). We want X(1) to be of the form

W(1)α1,β1,γ1W(1)α2,β2,γ2W(1)αn,βn,γnW(1)α1,β1,γ1W(1)α2,β2,γ2W(1)αn,βn,γn*,

where each W(1)αi,βi,γi is a 3 × 3 matrix. Similarly, X(1) is not guaranteed to be rank 3, but we will simply take the top 3 eigenvector of X(1) as our estimate of W(1)α1,β1,γ1,,W(1)αn,βn,γn. And from W(1)α1,β1,γ1,,W(1)αn,βn,γn, we can recover α1,β1,γ1,,αn,βn,γn.
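For illustration, the following minimal numpy sketch shows this style of spectral rounding for the case in which $X^{(1)}$ is expressed in a real basis, so that each $3 \times 3$ block should be a rotation matrix. Projecting each block onto $SO(3)$ by an SVD is one standard choice and is only a sketch; the cryo-EM pipeline in Section 5.3.2 instead feeds $X^{(1)}$ into a synchronization routine.

```python
# Spectral rounding over SO(3): top-3 eigenvectors of X1 (3n x 3n, symmetric PSD),
# then project each 3x3 block onto the nearest rotation matrix.
import numpy as np

def round_so3(X1, n):
    eigvals, eigvecs = np.linalg.eigh(X1)
    V = eigvecs[:, -3:] * np.sqrt(np.maximum(eigvals[-3:], 0))   # (3n, 3) top-3 component
    rotations = []
    for i in range(n):
        B = V[3 * i: 3 * i + 3, :]                # i-th 3x3 block
        U, _, Vt = np.linalg.svd(B)
        R = U @ Vt
        if np.linalg.det(R) < 0:                  # enforce det = +1
            R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
        rotations.append(R)
    return rotations
```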

5.3. Orientation estimation in cryo-EM.

We refer to [50] to expand the objective function. We emphasize that the theory holds for an arbitrary basis of the space containing the $\hat{I}_i$'s. We choose to construct the $C^{(k)}$'s using coefficients and parameters from the Fourier-Bessel expansion. The projection $\hat{I}_i$ can be expanded in a Fourier-Bessel series as

$\hat{I}_i(r, \theta) = \sum_{k=-\infty}^{\infty} \sum_{q=1}^{\infty} \alpha_{kq}^{(i)}\, \psi_{kq}^{(c)}(r, \theta),$

where

$\psi_{kq}^{(c)}(r, \theta) = \begin{cases} N_{kq}\, J_k\!\left( R_{kq} \frac{r}{c} \right) e^{ik\theta}, & r \leq c, \\ 0, & r > c. \end{cases}$

The parameters above are defined as follows:

  • c is the radius of the disc containing the support of Iˆi,

  • Jk is the Bessel function of integer order k,

  • Rkq is the qth root of Jk,

  • Nkq=1cπJk+1Rkq is a normalization factor.

To avoid aliasing, we truncate the Fourier-Bessel expansion as follows.

$\hat{I}_i(r, \theta) \approx \sum_{k=-k_{\max}}^{k_{\max}} \sum_{q=1}^{p_k} \alpha_{kq}^{(i)}\, \psi_{kq}^{(c)}(r, \theta).$

See [50] for a discussion of $k_{\max}$ and $p_k$. For the purpose of this section, let us assume we have $\{\alpha_{kq}^{(i)} : -k_{\max} \leq k \leq k_{\max},\ 1 \leq q \leq p_k\}$ for each $\hat{I}_i$. (These can be computed from the Cartesian-grid sampled images.)

We shall determine the relationship between $\hat{I}_i(r, \theta_i)$ and $\hat{I}_j(r, \theta_j)$ and the lines of intersection between $g_i^{-1} \cdot \hat{I}_i$ and $g_j^{-1} \cdot \hat{I}_j$ embedded in $\mathbb{R}^3$. Recall from (2.9) and (2.10) that the directions of the lines of intersection between $g_i^{-1} \cdot \hat{I}_i$ and $g_j^{-1} \cdot \hat{I}_j$ are given, respectively, by the unit vectors

$c_{ij}(g_i g_j^{-1}) = \dfrac{e_3 \times g_i g_j^{-1} e_3}{\left\| e_3 \times g_i g_j^{-1} e_3 \right\|_2},$
$c_{ji}(g_i g_j^{-1}) = \dfrac{(g_i g_j^{-1})^{-1} e_3 \times e_3}{\left\| (g_i g_j^{-1})^{-1} e_3 \times e_3 \right\|_2}.$

Let us associate $g_i g_j^{-1} \in SO(3)$ with the Euler (Z-Y-Z) angles $(\alpha_{ij}, \beta_{ij}, \gamma_{ij}) \in [0, 2\pi] \times [0, \pi] \times [0, 2\pi]$. Then

$e_3 \times g_i g_j^{-1} e_3 = \begin{bmatrix} -\sin\gamma_{ij} \sin\beta_{ij} \\ \cos\gamma_{ij} \sin\beta_{ij} \\ 0 \end{bmatrix},$
$(g_i g_j^{-1})^{-1} e_3 \times e_3 = \begin{bmatrix} \sin\alpha_{ij} \sin\beta_{ij} \\ \cos\alpha_{ij} \sin\beta_{ij} \\ 0 \end{bmatrix},$

under the rotation matrix $R_Z(\gamma_{ij}) R_Y(\beta_{ij}) R_Z(\alpha_{ij})$. The lines of intersection in $\hat{I}_i$ and $\hat{I}_j$ under $g_i g_j^{-1}$ are, respectively, in the directions

$\theta_i = \arctan\!\left( \sin\gamma_{ij},\, -\cos\gamma_{ij} \right) = \gamma_{ij} - \frac{\pi}{2},$
$\theta_j = \arctan\!\left( -\sin\alpha_{ij},\, -\cos\alpha_{ij} \right) = -\alpha_{ij} - \frac{\pi}{2}.$

We express the fij’s in terms of αkq(i),αkq(j), and θi and θj:

fij(θi,θj)fij(gigj1)=k=kmaxkmaxq=1pk(αkq(i)ψkq(c)(r,θi)αkq(j)ψkq(c)(r,θj))L22=k,k,q,qcNkqNkq(αkq(i)eikθiαkq(j)eikθj)(αkq(i)eikθiαkq(j)eikθj)*01Jk(Rkqr)Jk(Rkqr)dr.

For each $(k, k', q, q')$, we approximate the integral

$\int_0^1 J_k(R_{kq} r)\, J_{k'}(R_{k'q'} r)\, dr$

with a Gaussian quadrature.

Using the approximation above, we have

$f_{ij}(\theta_i, \theta_j) \approx \sum_{k, q, k', q'} b_{k,q,k',q'} \left( \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(i)}}\, e^{i(k-k')\theta_i} + \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(j)}}\, e^{i(k-k')\theta_j} - \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(j)}}\, e^{i(k\theta_i - k'\theta_j)} - \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(i)}}\, e^{i(k\theta_j - k'\theta_i)} \right),$

where

$b_{k,q,k',q'} = c\, N_{kq} N_{k'q'} \sum_{i} w_i\, J_k(R_{kq} r_i)\, J_{k'}(R_{k'q'} r_i).$

In terms of the Euler (Z-Y-Z) angles,

$f_{ij}(\alpha_{ij}, \gamma_{ij}) \equiv f_{ij}(\theta_i, \theta_j) \approx \sum_{k, q, k', q'} b_{k,q,k',q'}\, e^{-i\frac{\pi}{2}(k-k')} \left( \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(i)}}\, e^{i(k-k')\gamma_{ij}} + \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(j)}}\, e^{-i(k-k')\alpha_{ij}} - \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(j)}}\, e^{ik\gamma_{ij} + ik'\alpha_{ij}} - \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(i)}}\, e^{-ik\alpha_{ij} - ik'\gamma_{ij}} \right).$

The Fourier coefficients are given by

$\hat{f}_{ij}(k) = \int_{SO(3)} f_{ij}(\alpha, \gamma) \left( W^{(k)}(\alpha, \beta, \gamma) \right)^{*} dg = \int_0^{2\pi}\!\!\int_0^{2\pi} f_{ij}(\alpha, \gamma) \left( \int_0^{\pi} W^{(k)}(\gamma + \pi, \beta, \alpha + \pi)\, \sin\beta\, d\beta \right) d\alpha\, d\gamma.$

Note that

$\left[ \int_0^{\pi} W^{(k)}(\alpha, \beta, \gamma)\, \sin\beta\, d\beta \right]_{m, m'} = \int_0^{\pi} e^{im\alpha}\, w^{(k)}_{m, m'}(\beta)\, e^{im'\gamma}\, \sin\beta\, d\beta,$

where w(k) is the Wigner d-matrix. Let us define

$H_k(m, m') \equiv (-1)^{m+m'} \int_0^{\pi} w^{(k)}_{m, m'}(\beta)\, \sin\beta\, d\beta = (-1)^{m+m'} \int_0^{\pi} i^{m-m'} \sum_{l=-k}^{k} w^{(k)}_{l, m}(\pi/2)\, e^{-il\beta}\, w^{(k)}_{l, m'}(\pi/2)\, \sin\beta\, d\beta = (-1)^{m+m'}\, i^{m-m'} \left[ 2\, w^{(k)}_{0, m}(\pi/2)\, w^{(k)}_{0, m'}(\pi/2) - \frac{i\pi}{2}\, w^{(k)}_{1, m}(\pi/2)\, w^{(k)}_{1, m'}(\pi/2) + \frac{i\pi}{2}\, w^{(k)}_{-1, m}(\pi/2)\, w^{(k)}_{-1, m'}(\pi/2) + \sum_{2 \leq |l| \leq k} \frac{1 + e^{-il\pi}}{1 - l^2}\, w^{(k)}_{l, m}(\pi/2)\, w^{(k)}_{l, m'}(\pi/2) \right].$

The $(m, m')$-th entry of $\hat{f}_{ij}(k)$ is approximated by

$\left[ \hat{f}_{ij}(k) \right]_{m, m'} = H_k(m, m') \int_0^{2\pi}\!\!\int_0^{2\pi} f_{ij}(\alpha, \gamma)\, e^{im\gamma}\, e^{im'\alpha}\, d\alpha\, d\gamma = 4\pi^2 H_k(m, m') \sum_{k, q, k', q'} b_{k,q,k',q'} \left( \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(i)}}\, \delta_{\{m = k - k'\}}\, \delta_{\{m' = 0\}} + \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(j)}}\, \delta_{\{m = 0\}}\, \delta_{\{m' = k - k'\}} - \alpha_{kq}^{(i)} \overline{\alpha_{k'q'}^{(j)}}\, \delta_{\{m = k\}}\, \delta_{\{m' = k'\}} - \alpha_{kq}^{(j)} \overline{\alpha_{k'q'}^{(i)}}\, \delta_{\{m = -k\}}\, \delta_{\{m' = -k'\}} \right).$

Here, $\delta$ is the Kronecker delta, and $b_{k,q,k',q'}$ has absorbed the factor $e^{-i\frac{\pi}{2}(k-k')}$.

5.3.1. Handedness ambiguity in cryo-EM.

There exists one additional issue specifically for the cryo-EM problem arising from the handedness ambiguity. Suppose an image was projected from some molecular density ϕ and orientation R. Let J be the reflection operator across the imaging plane. The molecular density Jϕ under orientation JRJ* would produce the same projection. In other words, the set of projection images can belong to different molecular densities ϕ and Jϕ under different rotations.

We will formalize and deal with the handedness ambiguity in terms of the Wigner D-matrices. Recall that the Wigner D-matrix $W^{(k)}(g)$ corresponding to $(\alpha, \beta, \gamma) \in SO(3)$ is

$W^{(k)}(g) = \left[ e^{im\alpha}\, w^{(k)}_{m, m'}(\beta)\, e^{im'\gamma} \right]_{m, m' = -k}^{k}.$

Let J(k) be the following (2k+1)×(2k+1) diagonal matrix:

$J^{(k)} \equiv \mathrm{diag}(-1, 1, -1, \ldots).$

(The diagonal alternates between $-1$ and $+1$.) Due to the handedness ambiguity, if $\{W^{(k)}(g_i)\, W^{(k)}(g_j)^{-1}\}_k$ is a solution to (4.4), then $\{J^{(k)}\, W^{(k)}(g_i)\, W^{(k)}(g_j)^{-1}\, J^{(k)}\}_k$ is also a valid solution to (4.4). In fact, for any $h \in [0, 1]$,

$\left\{ h\, W^{(k)}(g_i)\, W^{(k)}(g_j)^{-1} + (1-h)\, J^{(k)}\, W^{(k)}(g_i)\, W^{(k)}(g_j)^{-1}\, J^{(k)} \right\}_k$

is a valid solution to (4.4).

Let us remove this extra degree of freedom $h$. Observe that for $h = \frac{1}{2}$, the $(m, m')$ entry satisfies

$\left[ \frac{1}{2} W^{(k)}(g) + \frac{1}{2} J^{(k)} W^{(k)}(g) J^{(k)} \right]_{m, m'} = \begin{cases} e^{im\alpha}\, w^{(k)}_{m, m'}(\beta)\, e^{im'\gamma}, & m + m' \equiv 0 \bmod 2, \\ 0, & \text{otherwise}. \end{cases}$

That is, the entries with $m + m'$ odd are 0. For example, in the case of $k = 1$,

$\frac{1}{2} W^{(1)}(g) + \frac{1}{2} J^{(1)} W^{(1)}(g) J^{(1)} = \begin{bmatrix} e^{-i\alpha} w^{(1)}_{-1,-1}(\beta) e^{-i\gamma} & 0 & e^{-i\alpha} w^{(1)}_{-1,1}(\beta) e^{i\gamma} \\ 0 & w^{(1)}_{0,0}(\beta) & 0 \\ e^{i\alpha} w^{(1)}_{1,-1}(\beta) e^{-i\gamma} & 0 & e^{i\alpha} w^{(1)}_{1,1}(\beta) e^{i\gamma} \end{bmatrix},$

and in the case of k=2,

$\frac{1}{2} W^{(2)}(g) + \frac{1}{2} J^{(2)} W^{(2)}(g) J^{(2)} = \begin{bmatrix} e^{-2i\alpha} w^{(2)}_{-2,-2}(\beta) e^{-2i\gamma} & 0 & e^{-2i\alpha} w^{(2)}_{-2,0}(\beta) & 0 & e^{-2i\alpha} w^{(2)}_{-2,2}(\beta) e^{2i\gamma} \\ 0 & e^{-i\alpha} w^{(2)}_{-1,-1}(\beta) e^{-i\gamma} & 0 & e^{-i\alpha} w^{(2)}_{-1,1}(\beta) e^{i\gamma} & 0 \\ w^{(2)}_{0,-2}(\beta) e^{-2i\gamma} & 0 & w^{(2)}_{0,0}(\beta) & 0 & w^{(2)}_{0,2}(\beta) e^{2i\gamma} \\ 0 & e^{i\alpha} w^{(2)}_{1,-1}(\beta) e^{-i\gamma} & 0 & e^{i\alpha} w^{(2)}_{1,1}(\beta) e^{i\gamma} & 0 \\ e^{2i\alpha} w^{(2)}_{2,-2}(\beta) e^{-2i\gamma} & 0 & e^{2i\alpha} w^{(2)}_{2,0}(\beta) & 0 & e^{2i\alpha} w^{(2)}_{2,2}(\beta) e^{2i\gamma} \end{bmatrix}.$

We constrain the entries of the $X_{ij}^{(k)}$'s with $m + m'$ odd to be 0, so that the SDP finds the solution with $h = \frac{1}{2}$. Note that, in practice, we do not explicitly add this constraint. Instead, we permute each $X_{ij}^{(k)}$ into two disjoint diagonal blocks. For example, in the case of $k = 1$, after permuting the even and odd indices,

$\frac{1}{2} W^{(1)}(g) + \frac{1}{2} J^{(1)} W^{(1)}(g) J^{(1)} = \begin{bmatrix} w^{(1)}_{0,0}(\beta) & 0 & 0 \\ 0 & e^{-i\alpha} w^{(1)}_{-1,-1}(\beta) e^{-i\gamma} & e^{-i\alpha} w^{(1)}_{-1,1}(\beta) e^{i\gamma} \\ 0 & e^{i\alpha} w^{(1)}_{1,-1}(\beta) e^{-i\gamma} & e^{i\alpha} w^{(1)}_{1,1}(\beta) e^{i\gamma} \end{bmatrix},$

and in the case of k=2,

$\frac{1}{2} W^{(2)}(g) + \frac{1}{2} J^{(2)} W^{(2)}(g) J^{(2)} = \begin{bmatrix} e^{-i\alpha} w^{(2)}_{-1,-1}(\beta) e^{-i\gamma} & e^{-i\alpha} w^{(2)}_{-1,1}(\beta) e^{i\gamma} & 0 & 0 & 0 \\ e^{i\alpha} w^{(2)}_{1,-1}(\beta) e^{-i\gamma} & e^{i\alpha} w^{(2)}_{1,1}(\beta) e^{i\gamma} & 0 & 0 & 0 \\ 0 & 0 & e^{-2i\alpha} w^{(2)}_{-2,-2}(\beta) e^{-2i\gamma} & e^{-2i\alpha} w^{(2)}_{-2,0}(\beta) & e^{-2i\alpha} w^{(2)}_{-2,2}(\beta) e^{2i\gamma} \\ 0 & 0 & w^{(2)}_{0,-2}(\beta) e^{-2i\gamma} & w^{(2)}_{0,0}(\beta) & w^{(2)}_{0,2}(\beta) e^{2i\gamma} \\ 0 & 0 & e^{2i\alpha} w^{(2)}_{2,-2}(\beta) e^{-2i\gamma} & e^{2i\alpha} w^{(2)}_{2,0}(\beta) & e^{2i\alpha} w^{(2)}_{2,2}(\beta) e^{2i\gamma} \end{bmatrix}.$

We can conjugate each $X_{ij}^{(k)}$ in (4.4) by a permutation and get

$X_{ij}^{(k)} = \begin{bmatrix} X_{ij}^{(k,0)} & 0 \\ 0 & X_{ij}^{(k,1)} \end{bmatrix}.$

Similarly, we can conjugate $X^{(k)}$ by a permutation and get

$X^{(k)} = \begin{bmatrix} X^{(k,0)} & 0 \\ 0 & X^{(k,1)} \end{bmatrix},$

where

$X^{(k,0)} = \begin{bmatrix} X_{11}^{(k,0)} & \cdots & X_{1n}^{(k,0)} \\ \vdots & \ddots & \vdots \\ X_{n1}^{(k,0)} & \cdots & X_{nn}^{(k,0)} \end{bmatrix}, \qquad X^{(k,1)} = \begin{bmatrix} X_{11}^{(k,1)} & \cdots & X_{1n}^{(k,1)} \\ \vdots & \ddots & \vdots \\ X_{n1}^{(k,1)} & \cdots & X_{nn}^{(k,1)} \end{bmatrix}.$

Let us denote the above permutation by $\Pi_k$. The objective function in (4.4) is preserved if we conjugate both the $C^{(k)}$'s and the $X^{(k)}$'s by $\Pi_k$, i.e.,

$\mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] = \mathrm{tr}\!\left[ \Pi_k C^{(k)} \Pi_k^T\, \Pi_k X^{(k)} \Pi_k^T \right] = \mathrm{tr}\!\left[ C^{(k,0)} X^{(k,0)} \right] + \mathrm{tr}\!\left[ C^{(k,1)} X^{(k,1)} \right],$

where $C^{(k,0)}$ and $C^{(k,1)}$ are the blocks corresponding to $X^{(k,0)}$ and $X^{(k,1)}$, respectively. We apply the same permutation to the $X^{(k)}$'s in the constraints of (4.4) and reduce (4.4) to

$\begin{aligned} \underset{\{X^{(k,0)},\, X^{(k,1)}\}}{\text{minimize}} \quad & \sum_{k=0}^{t} \mathrm{tr}\!\left[ C^{(k,0)} X^{(k,0)} \right] + \mathrm{tr}\!\left[ C^{(k,1)} X^{(k,1)} \right] \\ \text{subject to} \quad & X^{(k,0)} \succeq 0, \quad X^{(k,1)} \succeq 0 \\ & X_{ii}^{(k,0)} = I_{k \times k}, \quad X_{ii}^{(k,1)} = I_{(k+1) \times (k+1)} \\ & \sum_{k=0}^{t} \frac{(3)_{t-k}}{(t-k)!} \left( k + \tfrac{1}{2} \right) \left( \mathrm{tr}\!\left[ W^{(k,0)}(\alpha, \beta, \gamma)^* X_{ij}^{(k,0)} \right] + \mathrm{tr}\!\left[ W^{(k,1)}(\alpha, \beta, \gamma)^* X_{ij}^{(k,1)} \right] \right) \geq 0 \quad \forall (\alpha, \beta, \gamma) \in \Omega_t \\ & X_{ij}^{(0,1)} = 1 \\ & Q_{ij} = \mathcal{A}_{Eq}\!\left( X_{ij}^{(1,0)}, X_{ij}^{(1,1)} \right), \quad Q_{ij} \succeq 0, \quad \mathrm{tr}\, Q_{ij} = 1, \end{aligned}$ (5.1)

where $\mathcal{A}_{Eq}$ defines the linear relationship between $Q_{ij}$ and $(X^{(1,0)}, X^{(1,1)})$ as

$Y^{(ij)} = \begin{bmatrix} \frac{i}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ \frac{i}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{bmatrix}^{*} \begin{bmatrix} X_{ij}^{(1,1)}(1,1) & 0 & X_{ij}^{(1,1)}(1,2) \\ 0 & X_{ij}^{(1,0)} & 0 \\ X_{ij}^{(1,1)}(2,1) & 0 & X_{ij}^{(1,1)}(2,2) \end{bmatrix} \begin{bmatrix} \frac{i}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ \frac{i}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{bmatrix},$
$Q_{ij} = \frac{1}{4} \begin{bmatrix} 1 + Y_{11}^{(ij)} + Y_{22}^{(ij)} + Y_{33}^{(ij)} & 0 & 0 & Y_{31}^{(ij)} - Y_{13}^{(ij)} \\ 0 & 1 - Y_{11}^{(ij)} - Y_{22}^{(ij)} + Y_{33}^{(ij)} & Y_{31}^{(ij)} + Y_{13}^{(ij)} & 0 \\ 0 & Y_{31}^{(ij)} + Y_{13}^{(ij)} & 1 + Y_{11}^{(ij)} - Y_{22}^{(ij)} - Y_{33}^{(ij)} & 0 \\ Y_{31}^{(ij)} - Y_{13}^{(ij)} & 0 & 0 & 1 - Y_{11}^{(ij)} + Y_{22}^{(ij)} - Y_{33}^{(ij)} \end{bmatrix}.$

5.3.2. Rounding.

(5.1) gives us the X(k,0)’s and the X(k,1)’s. We aggregate X(1,0) and X(1,1) into the 3n×3n matrix X(1). From the X(1)’s, we build the synchronization matrix S described in [39] and apply the cryo_syncrotations function on S from the ASPIRE software package [5] to recover α1,β1,γ1,,αn,βn,γn. Note that we do not truncate the eigenvectors and eigenvalues from X(1).

6. Implementation and results for synthetic cryo-EM datasets

In this section, we give a brief history of methods used for determining orientations of cryo-EM projection images, and describe where we stand relative to the current state of the art.

In 1986, Vainshtein and Goncharov developed a common-lines based method for ab-initio modeling in cryo-EM [45]. In 1987, M. Van Heel also independently discovered the same method, and coined it angular reconstitution [22]. Recall that by the Fourier slice theorem, two cryo-EM projection images (in Fourier space) must intersect along a common line. Given three cryo-EM images from different viewing directions, their common lines must uniquely determine their relative orientations up to handedness. (See Figure 6.1.) The orientations of the rest of the images are determined by common lines with the first three images. This is angular reconstitution in a nutshell.

Figure 6.1.

Three cryo-EM images uniquely determine their orientations up to handedness. Image courtesy of Singer et al. [41].

In 1992, Farrow and Ottensmeyer expanded upon angular reconstitution by developing a method to sequentially add images via least squares [20]. One major drawback of sequentially assigning orientations to cryo-EM images is the propagation of error due to false common-line detection. In 1996, Penczek, Zhu, and Frank tried to circumvent the issue via a brute-force search for a global energy minimizer [35]. However, the search space is simply too big for that method to be applicable. In 2006, Mallick et al. introduced a Bayesian method in which common lines between pairs of images are determined by their common lines with different projection triplets [29]. In the method by Mallick et al., at least seven common lines must be correctly and simultaneously determined, which can be problematic. In 2010, Singer et al. lowered this requirement so that only two common lines need to be correctly and simultaneously determined [41]. In 2011, Singer and Shkolnisky built upon [41] by adding a global consistency condition among the orientations [39]. This method is called synchronization, and it is regarded as the current state of the art.

Note that all available methods for orientation determination, including the NUG approach proposed in this paper, cannot be directly applied to raw experimental images. We explain the factors preventing us from doing so in the next subsection. We therefore numerically validate and evaluate the method using synthetic data.

6.1. Shifts, CTF and contrast.

In comparison to the simplistic forward model (2.8), there are three major imperfections in the experimental cryo-EM datasets. First, the images are not centered, and the common lines will not correspond exactly even under their true orientations. Second, the images are subject to the contrast transfer function (CTF) of the electron microscope. A CTF, as a function of radial frequency, is shown in Figure 6.2. Making matters mathematically more challenging, a CTF is typically estimated per micrograph and each micrograph would have a different CTF. Thus, all images from the same micrograph are typically assigned the same CTF. We also say those images belong to the same defocus group. However, as shown in Table 2 of Section 6.4, shifts and CTFs are not detrimental to NUG’s performance. This brings us to the third obstacle, which does prevent us from directly applying NUG on raw experimental images. The ice layer in which the molecules are frozen is not uniform. It can be thicker/thinner where different projections are taken, and this effect is equivalent to scaling the projections by a factor γ>0. So, various projections have various contrasts. Those effects are typically mitigated in class averages. Class averages are formed by in-plane aligning and averaging raw images that are estimated to have similar viewing directions. It is possible to apply NUG on class averages instead of the original raw experimental images. However, the quality of the results then depends crucially on the specific class averaging procedure being used and does not provide much insight into the performance of NUG itself.

Figure 6.2.

Example created by Jiang and Chiu available at http://jiang.bio.purdue.edu/software/ctf/ctfapplet.html

Table 2.

Effects of shifts, CTFs and contrast on the MSE (defined in (6.8)) of the NUG SDP cryo-EM. Shifts are sampled from 𝒰(-5,5), CTFs are drawn uniformly from 4 defocus groups and contrasts are sampled from 𝒰(0.5,1.5). We used 500 simulated projections of size 64 × 64.

SNR benchmark CTF CTF and shifts CTF and contrast
1/4 0.0285 0.0381 0.3382 0.7342
1/8 0.0333 0.1450 0.5741 > 2.0
1/16 0.0698 0.7631 1.3391 > 2.0
1/32 0.4387 1.9199 > 2.0 > 2.0
1/64 1.8587 > 2.0 > 2.0 > 2.0

6.2. ADMM implementation.

There are two parts that are particularly challenging for obtaining a numerical solution to the NUG SDP (4.4) and (5.1):

  • implementing an SDP solver that is scalable to real-world problems such as orientation estimation in cryo-EM,

  • computing the coefficient matrix for generic objective function fij’s.

For the cryo-EM problem, the NUG SDP is simply too big for solvers based on techniques like interior point methods. Interior point methods are known for their accuracy. However, their accuracy is achieved at the expense of computational complexity. In essence, interior point methods solve a system of linear equations that attempts to satisfy both primal and dual feasibility [14]. That is, solvers based on interior point methods have to invert a matrix that contains both primal and dual variables. The number and size of the signals, coupled with the number of inequality constraints in (4.4) and (5.1), make this inversion impractical. Instead, we use the alternating direction method of multipliers, or ADMM. As the name suggests, ADMM alternates between the primal and the different sets of dual variables. More importantly, the steps in ADMM are linear, with the exception of an eigendecomposition of the primal variable. In practice, the eigendecomposition of the primal variable is manageable.

In general, it is not possible to obtain a closed-form expression for the coefficient matrices C(k)’s. Sometimes, we need to employ numerical integration schemes over the group 𝒢 to find the C(k)’s.

6.2.1. The alternating direction method of multipliers.

In short, ADMM solves the augmented Lagrangian via iterative partial updating. We solve an unconstrained optimization problem over the objective variable, and enforce the constraints via dual variables. We will express the NUG SDP in a more general form, and derive ADMM updates in the more general setting. (4.4) and (5.1) are SDPs of the following form:

$\begin{aligned} \underset{X}{\text{minimize}} \quad & \langle C, X \rangle \\ \text{subject to} \quad & X \succeq 0 \\ & \mathcal{A}_E(X) = b_E \\ & \mathcal{A}_I(X) \leq b_I. \end{aligned}$ (6.1)

We can think of C and X as the following block matrices:

$C = \begin{bmatrix} C^{(0)} & & & \\ & C^{(1)} & & \\ & & \ddots & \\ & & & C^{(t)} \end{bmatrix}, \qquad X = \begin{bmatrix} X^{(0)} & & & \\ & X^{(1)} & & \\ & & \ddots & \\ & & & X^{(t)} \end{bmatrix}.$

Note that going from (4.4) and (5.1) to (6.1), we used the fact that

$\mathrm{tr}\!\left[ C^{(k)} X^{(k)} \right] = \left\langle C^{(k)}, X^{(k)} \right\rangle$

because the $C^{(k)}$'s are Hermitian. $\mathcal{A}_E$ and $\mathcal{A}_I$ are linear operators that encapsulate the equality and inequality constraints, respectively. For example, we can think of $\mathcal{A}_I$ as

$\mathcal{A}_I(X) = \begin{bmatrix} \langle A_I^{(1)}, X \rangle \\ \vdots \\ \langle A_I^{(m)}, X \rangle \end{bmatrix},$

where, for the constraint $\sum_{k=0}^{t} b_k\, \mathrm{tr}\!\left[ \rho_k^*(g_m)\, X_{ij}^{(k)} \right] \geq 0$, $A_I^{(m)}$ is the matrix containing $b_k \rho_k(g_m)$ at the position of $X_{ij}^{(k)}$ and 0 everywhere else.

More concretely, $\mathcal{A}(X) = b$ is equivalent to $A\, \mathrm{vec}(X) = b$, where

$A = \begin{bmatrix} \mathrm{vec}(A^{(1)}) & \cdots & \mathrm{vec}(A^{(m)}) \end{bmatrix}^{T}$

and $\mathrm{vec}$ vectorizes the matrix $A^{(m)}$ along the columns. The adjoint operator of $\mathcal{A}$ is given by

$\mathcal{A}^*(z) = \mathrm{mat}\!\left( A^T z \right),$

where mat is the reverse operator of vec. Furthermore, we can verify

  • $\langle \mathcal{A}(X), z \rangle = \langle A\, \mathrm{vec}(X), z \rangle = \langle \mathrm{mat}(A^T z), X \rangle = \langle \mathcal{A}^*(z), X \rangle$,

  • $\mathcal{A}(\mathcal{A}^*(z)) = A A^T z$,

  • $\mathcal{A}^*(\mathcal{A}(X)) = \mathrm{mat}\!\left( A^T A\, \mathrm{vec}(X) \right)$.

Now, we describe the ADMM solver outlined in [44] for SDP (6.1). ADMM is essentially a series of partial iterative updates based on the augmented Lagrangian. So, we will define the dual variables and write out the augmented Lagrangian. Then, we derive the updates by setting the gradient of the augmented Lagrangian, with respect to specific variables, to 0.

The dual variables corresponding to (6.1) are

$X \succeq 0 \;\leftrightarrow\; S \succeq 0, \qquad b_E - \mathcal{A}_E(X) = 0 \;\leftrightarrow\; y_E, \qquad b_I - \mathcal{A}_I(X) \geq 0 \;\leftrightarrow\; y_I \geq 0.$

The Lagrangian for (6.1) is

$L(X, S, y_E, y_I) \equiv -\left\langle (b_E, b_I), (y_E, y_I) \right\rangle + \left\langle X,\; S + \mathcal{A}_E^*(y_E) + \mathcal{A}_I^*(y_I) - C \right\rangle.$ (6.2)

The motivation for defining (6.2) is so that the primal variable X satisfies the constraints using the dual variables via

$\underset{S \succeq 0,\; y_E,\; y_I \geq 0}{\text{minimize}} \;\; \underset{X}{\text{maximize}} \;\; L(X, S, y_E, y_I).$ (6.3)

Notice that (6.3) is an unconstrained optimization problem over $X$. If $X$ is not PSD, then there exists $S \succeq 0$ such that $L(X, S, y_E, y_I) = -\infty$. Thus, due to the inner maximization over $X$, (6.3) produces a solution satisfying $X \succeq 0$. The equality and inequality constraints are enforced in a similar manner. The maximizing $X$ must satisfy

$\nabla_X L = 0 \;\Longleftrightarrow\; S + \mathcal{A}_E^*(y_E) + \mathcal{A}_I^*(y_I) = C.$

In fact, this gives us the dual problem to (6.1)

$\begin{aligned} \underset{S,\, y_E,\, y_I}{\text{minimize}} \quad & \delta_{\succeq 0}(S) + \delta_{\geq 0}(y_I) - \left\langle (b_E, b_I), (y_E, y_I) \right\rangle \\ \text{subject to} \quad & S + \mathcal{A}_E^*(y_E) + \mathcal{A}_I^*(y_I) = C. \end{aligned}$

The augmented Lagrangian is the Lagrangian with the dual variables regularized by the Frobenius norm:

$L_\rho(X, S, y_E, y_I) \equiv -\left\langle (b_E, b_I), (y_E, y_I) \right\rangle + \left\langle X,\; S + \mathcal{A}_E^*(y_E) + \mathcal{A}_I^*(y_I) - C \right\rangle + \frac{\rho}{2} \left\| S + \mathcal{A}_E^*(y_E) + \mathcal{A}_I^*(y_I) - C \right\|_{\mathrm{Fro}}^2.$ (6.4)

This regularization term is crucial for numerical convergence of the solver.

The updates for the dual variables are given by their individual optimality conditions.

  • Solving $0 = \nabla_S L_\rho$ for $S$ and applying the PSD projection, we get
    $S^{(k+1)} = \left[ C - \mathcal{A}_E^*(y_E^{(k)}) - \mathcal{A}_I^*(y_I^{(k)}) - \frac{1}{\rho} X^{(k)} \right]_{\succeq 0}.$
  • Solving $0 = \nabla_{y_E} L_\rho$ for $y_E$, we get
    $y_E^{(k+1)} = \left( \mathcal{A}_E \mathcal{A}_E^* \right)^{-1} \left( \frac{1}{\rho} \left( b_E - \mathcal{A}_E(X^{(k)}) \right) - \mathcal{A}_E\!\left( \mathcal{A}_I^*(y_I^{(k)}) + S^{(k)} - C \right) \right).$
    Note that for the NUG SDP, $\mathcal{A}_E \mathcal{A}_E^* = I$ because each equality constraint is on a single entry of $X$. So this update simplifies to
    $y_E^{(k+1)} = \frac{1}{\rho} \left( b_E - \mathcal{A}_E(X^{(k)}) \right) - \mathcal{A}_E\!\left( \mathcal{A}_I^*(y_I^{(k)}) + S^{(k)} - C \right).$
  • Note that if we solve $0 = \nabla_{y_I} L_\rho$ for $y_I$, we have to invert the operator $\mathcal{A}_I \mathcal{A}_I^*$, which is extremely computationally challenging. Instead, we add an additional regularization term $\frac{\rho}{2} \left\| y_I - y_I^{(k)} \right\|_{\lambda I - \mathcal{A}_I \mathcal{A}_I^*}^2$, where $\lambda$ is the largest eigenvalue of $\mathcal{A}_I \mathcal{A}_I^*$ and $y_I^{(k)}$ is the element from the previous $y_I$-update. Solving $0 = \nabla_{y_I} \left( L_\rho + \frac{\rho}{2} \| y_I - y_I^{(k)} \|_{\lambda I - \mathcal{A}_I \mathcal{A}_I^*}^2 \right)$ for $y_I$ and applying the nonnegativity projection, we get
    $y_I^{(k+1)} = \left[ \frac{1}{\rho \lambda} \left( b_I - \mathcal{A}_I(X^{(k)}) \right) + \frac{1}{\lambda} \mathcal{A}_I\!\left( C - \mathcal{A}_E^*(y_E^{(k)}) - \mathcal{A}_I^*(y_I^{(k)}) - S^{(k)} \right) + y_I^{(k)} \right]_{\geq 0}.$

The update for the primal variable is simply a gradient descent given by

$X^{(k+1)} = X^{(k)} + \rho \left( S^{(k)} + \mathcal{A}_E^*(y_E^{(k)}) + \mathcal{A}_I^*(y_I^{(k)}) - C \right).$

We initialize the variables as the following:

  • $X^{(0)} = I$,

  • $y_I^{(0)} = 0$,

  • $S^{(0)} = 0$,

  • $y_E^{(0)} = \frac{1}{2} \mathcal{A}_E\!\left( C - \mathcal{A}_I^*(y_I^{(0)}) - S^{(0)} \right)$.

We apply the updates in the following order:

  1. $S^{(k+1)} = \left[ C - \mathcal{A}_E^*(y_E^{(k)}) - \mathcal{A}_I^*(y_I^{(k)}) - \frac{1}{\rho} X^{(k)} \right]_{\succeq 0}$,

  2. $y_E^{(k+1/2)} = \mathcal{A}_E\!\left( C - \mathcal{A}_I^*(y_I^{(k)}) - S^{(k+1)} \right)$,

  3. $y_I^{(k+1)} = \left[ \frac{1}{\rho \lambda} \left( b_I - \mathcal{A}_I(X^{(k)}) \right) + \frac{1}{\lambda} \mathcal{A}_I\!\left( C - \mathcal{A}_E^*(y_E^{(k+1/2)}) - \mathcal{A}_I^*(y_I^{(k)}) - S^{(k+1)} \right) + y_I^{(k)} \right]_{\geq 0}$,

  4. $y_E^{(k+1)} = \mathcal{A}_E\!\left( C - \mathcal{A}_I^*(y_I^{(k+1)}) - S^{(k+1)} \right)$,

  5. $X^{(k+1)} = X^{(k)} + \rho \left( S^{(k+1)} + \mathcal{A}_E^*(y_E^{(k+1)}) + \mathcal{A}_I^*(y_I^{(k+1)}) - C \right)$.

Note that we update $y_E$ twice, which guarantees the solver's convergence to the optimizer [44].
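For illustration, the following is a minimal dense-matrix sketch (numpy only) of the update order listed above for a small real SDP of the form (6.1), with $\mathcal{A}_E$ and $\mathcal{A}_I$ given as lists of symmetric constraint matrices. It is a toy sketch, not the large-scale solver used for the cryo-EM experiments, and it assumes, as stated above for the NUG SDP, that $\mathcal{A}_E \mathcal{A}_E^* = I$.

```python
import numpy as np

def proj_psd(M):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.maximum(w, 0)) @ V.T

def admm_sdp(C, AE, bE, AI, bI, rho=1.0, iters=500):
    n = C.shape[0]
    op = lambda A, X: np.array([np.vdot(Am, X).real for Am in A])   # A(X)
    adj = lambda A, y: sum(ym * Am for ym, Am in zip(y, A))          # A*(y)
    # lam: largest eigenvalue of A_I A_I^*, used to regularize the y_I update.
    G = np.array([[np.vdot(Ai, Aj).real for Aj in AI] for Ai in AI])
    lam = max(np.linalg.eigvalsh(G).max(), 1e-12)

    X, S = np.eye(n), np.zeros((n, n))
    yI = np.zeros(len(AI))
    yE = 0.5 * op(AE, C - adj(AI, yI) - S)                            # initialization
    for _ in range(iters):
        S = proj_psd(C - adj(AE, yE) - adj(AI, yI) - X / rho)                     # step 1
        yE_half = op(AE, C - adj(AI, yI) - S)                                     # step 2
        yI = np.maximum(
            (bI - op(AI, X)) / (rho * lam)
            + op(AI, C - adj(AE, yE_half) - adj(AI, yI) - S) / lam
            + yI, 0.0)                                                            # step 3
        yE = op(AE, C - adj(AI, yI) - S)                                          # step 4
        X = X + rho * (S + adj(AE, yE) + adj(AI, yI) - C)                         # step 5
    return X
```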

6.2.2. Fourier coefficients for bandlimited functions on SO(3).

The coefficient matrix in (3.12) is composed of

$\hat{f}_{ij}(k) = \int_{\mathcal{G}} f_{ij}(g)\, \rho_k^*(g)\, dg.$

We obtained closed-form expressions for the registration problem (2.4) under the assumption that the noise in the underlying signal is Gaussian. For the cryo-EM problem (2.11), we require a single approximation of an integral using Gaussian quadrature. However, this is not always the case. For example, the noise in cryo-EM projections is better modeled by the Poisson distribution [36]. We describe the numerical integration scheme in [28] that can be used to obtain the desired Fourier coefficients $\hat{f}_{ij}(k)$. We need to make the assumption that the $f_{ij}$'s are bandlimited by $T$. The method described in [28] has a better computational complexity than straightforward numerical integration. We will define the quadrature and then outline the evaluation over the quadrature.

For a function $f$ with bandlimit $T$, we have the equality

$\hat{f}(k) = \frac{1}{(2T)^2} \sum_{j_1=0}^{2T-1} \sum_{j_2=0}^{2T-1} \sum_{l=0}^{2T-1} b_T(l)\, f(\alpha_{j_1}, \beta_l, \gamma_{j_2})\, \rho_k^*(\alpha_{j_1}, \beta_l, \gamma_{j_2}),$ (6.5)

where

$b_T(l) = \frac{2}{T} \sin\!\left( \frac{\pi(2l+1)}{4T} \right) \sum_{m=0}^{T-1} \frac{1}{2m+1} \sin\!\left( \frac{\pi(2l+1)(2m+1)}{4T} \right),$

and

$\alpha_{j_1} = \frac{2\pi j_1}{2T}, \quad \beta_l = \frac{\pi(2l+1)}{4T}, \quad \gamma_{j_2} = \frac{2\pi j_2}{2T}, \qquad 0 \leq j_1, j_2, l < 2T.$

Recall that

$\left[ \rho_k(\alpha, \beta, \gamma) \right]_{m, m'} = e^{im\alpha}\, w^{(k)}_{m, m'}(\beta)\, e^{im'\gamma}.$

We can re-write the entries in (6.5) as

$\left[ \hat{f}(k) \right]_{m, m'} = \frac{1}{(2T)^2} \sum_{l=0}^{2T-1} b_T(l)\, w^{(k)}_{m, m'}(\beta_l) \sum_{j_2=0}^{2T-1} e^{-im'\gamma_{j_2}} \sum_{j_1=0}^{2T-1} e^{-im\alpha_{j_1}}\, f(\alpha_{j_1}, \beta_l, \gamma_{j_2}).$ (6.6)

By rearranging the terms in (6.6), it becomes obvious that we should compute fˆ(k) in the following order:

  1. for all $0 \leq j_2, l < 2T$ and $-T \leq m \leq T$, compute
    $S_1(l, j_2, m) = \frac{1}{2T} \sum_{j_1=0}^{2T-1} e^{-im\alpha_{j_1}}\, f(\alpha_{j_1}, \beta_l, \gamma_{j_2}),$
  2. for all $0 \leq l < 2T$ and $-T \leq m, m' \leq T$, compute
    $S_2(l, m, m') = \frac{1}{2T} \sum_{j_2=0}^{2T-1} e^{-im'\gamma_{j_2}}\, S_1(l, j_2, m),$
  3. for all $-T \leq m, m' \leq T$, compute
    $\left[ \hat{f}(k) \right]_{m, m'} = \sum_{l=0}^{2T-1} b_T(l)\, w^{(k)}_{m, m'}(\beta_l)\, S_2(l, m, m').$

The complexity of computing $[\hat{f}(k)]_{m, m'}$ for all $m, m'$ and $k$ is $\mathcal{O}(T^4)$, while the complexity of the straightforward evaluation in (6.5) is $\mathcal{O}(T^6)$ [28].
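The following minimal numpy sketch of this three-step evaluation assumes the function samples f_samples[j1, l, j2] = $f(\alpha_{j_1}, \beta_l, \gamma_{j_2})$ on the $(2T)^3$ quadrature grid, and a routine wigner_d_small(k, beta), hypothetical here, returning the $(2k+1) \times (2k+1)$ Wigner-d matrix (any spherical-harmonics library could supply it). It is only an illustration of the bookkeeping above.

```python
import numpy as np

def so3_fourier_coefficient(f_samples, k, T, wigner_d_small):
    N = 2 * T
    j = np.arange(N)
    m = np.arange(-T, T + 1)
    alpha = 2 * np.pi * j / N                      # alpha_{j1} and gamma_{j2} grid
    beta = np.pi * (2 * j + 1) / (4 * T)           # beta_l grid
    # Quadrature weights b_T(l).
    mm = np.arange(T)
    bT = (2.0 / T) * np.sin(np.pi * (2 * j + 1) / (4 * T)) * (
        np.sin(np.outer(np.pi * (2 * j + 1) / (4 * T), 2 * mm + 1)) / (2 * mm + 1)).sum(axis=1)
    E = np.exp(-1j * np.outer(m, alpha)) / N       # E[m, j] = e^{-i m alpha_j} / (2T)
    # Step 1: S1[l, j2, m] = (1/2T) sum_{j1} e^{-i m alpha_{j1}} f[j1, l, j2]
    S1 = np.einsum("mj,jlq->lqm", E, f_samples)
    # Step 2: S2[l, m, m'] = (1/2T) sum_{j2} e^{-i m' gamma_{j2}} S1[l, j2, m]
    S2 = np.einsum("nq,lqm->lmn", E, S1)
    # Step 3: [fhat(k)]_{m,m'} = sum_l b_T(l) * d^k_{m,m'}(beta_l) * S2[l, m, m']
    sl = slice(T - k, T + k + 1)                   # restrict m, m' to -k..k
    return sum(bT[l] * wigner_d_small(k, beta[l]) * S2[l][sl, sl] for l in range(N))
```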

6.2.3. Simplification for cryo-EM.

Recall that the objective function for cryo-EM does not depend on β. I.e.,

$f(\alpha, \gamma) = f(\alpha, \beta_1, \gamma) = f(\alpha, \beta_2, \gamma) = \cdots.$

(6.6) reduces to

$\left[ \hat{f}(k) \right]_{m, m'} = \left( \sum_{l=0}^{2T-1} b_T(l)\, w^{(k)}_{m, m'}(\beta_l) \right) \left( \frac{1}{(2T)^2} \sum_{j_2=0}^{2T-1} e^{-im'\gamma_{j_2}} \sum_{j_1=0}^{2T-1} e^{-im\alpha_{j_1}}\, f(\alpha_{j_1}, \gamma_{j_2}) \right).$ (6.7)

We compute (6.7) in the following order:

  1. compute
    $B_T(k) = \sum_{l=0}^{2T-1} b_T(l)\, w^{(k)}_{m, m'}(\beta_l),$

  2. for all $0 \leq j_2 < 2T$ and $-T \leq m \leq T$, compute
    $S_1(j_2, m) = \frac{1}{2T} \sum_{j_1=0}^{2T-1} e^{-im\alpha_{j_1}}\, f(\alpha_{j_1}, \gamma_{j_2}),$

  3. for all $-T \leq m, m' \leq T$, compute
    $S_2(m, m') = \frac{1}{2T} \sum_{j_2=0}^{2T-1} e^{-im'\gamma_{j_2}}\, S_1(j_2, m),$

  4. for all $-T \leq m, m' \leq T$, compute
    $\left[ \hat{f}(k) \right]_{m, m'} = B_T(k)\, S_2(m, m').$

The complexity of computing $[\hat{f}(k)]_{m, m'}$ for all $m, m'$ and $k$ is $\mathcal{O}(T^3)$.

6.3. Rotation MSE and FSC resolution.

We assess the performance of orientation estimation methods using the mean squared error (MSE), defined as follows:

$\mathrm{MSE} \equiv \frac{1}{n} \sum_{i=1}^{n} \left\| \hat{R}_i - R_i \right\|_{\mathrm{Fro}}^2.$ (6.8)

Here the $R_i$ are the true rotations (which are known in the simulation setting) and the $\hat{R}_i$ are the estimated rotations (note that we previously used the hat notation for the Fourier transform, but here it is used for estimators). Since the estimation is up to a global rotation and possibly handedness, the two sets of rotations are aligned prior to computing the MSE.
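A minimal numpy sketch of this alignment-then-MSE computation is given below; the global rotation is obtained from an SVD of $\sum_i \hat{R}_i^T R_i$ (a standard orthogonal-Procrustes step). Handling the handedness ambiguity would additionally compare against the reflected set, which is omitted here for brevity.

```python
import numpy as np

def rotation_mse(R_true, R_est):
    """R_true, R_est: arrays of shape (n, 3, 3)."""
    # Global alignment: find Q in SO(3) minimizing sum_i ||R_est[i] @ Q - R_true[i]||_F^2.
    M = sum(Re.T @ Rt for Re, Rt in zip(R_est, R_true))
    U, _, Vt = np.linalg.svd(M)
    Q = U @ Vt
    if np.linalg.det(Q) < 0:                      # enforce det = +1
        Q = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
    aligned = [Re @ Q for Re in R_est]
    return np.mean([np.linalg.norm(A - Rt, "fro") ** 2 for A, Rt in zip(aligned, R_true)])
```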

In addition, we will include the Fourier Shell Correlation (FSC) of the reconstructed structure against the known structure. We reconstruct the 3-dimensional structure (in Fourier space) using the estimated orientations to get $\hat{\phi}$, and compare it against the known structure $\phi$ (in Fourier space). For each spatial frequency $r_k$, we calculate the FSC

$\mathrm{FSC}(r_k) \equiv \dfrac{ \sum_{\|v\|_2 = r_k} \hat{\phi}(v)\, \phi^*(v) }{ \sqrt{ \sum_{\|v\|_2 = r_k} |\hat{\phi}(v)|^2 }\, \sqrt{ \sum_{\|v\|_2 = r_k} |\phi(v)|^2 } }.$

We derive the resolution (in Angstroms) by interpolating the FSC until we reach an $r_k$ yielding $\mathrm{FSC}(r_k) = 0.5$. See [23] for a discussion of the FSC and the cutoff value.

Note that the rotation MSE is the most direct assessment of orientation estimation methods. After all, the stated objective of orientation estimation in cryo-EM is to estimate the orientations. In addition, the quality of the 3-dimensional reconstruction depends on several other factors such as the signal-to-noise ratio of the images, the number of images, the distribution of the viewing directions and the CTF of the images. We focus on the rotation MSE because the other factors mentioned are independent of the algorithm being used.

6.4. Simulated data.

With the ASPIRE software package [5], we generate a set of 100 and a set of 500 projection images of size 129 × 129 from the Plasmodium falciparum 80S ribosome (see https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10028/). We add Gaussian noise to the simulated projections, and then apply a circular mask to zero out the noise outside of the mask radius. Table 1 shows a comparison of NUG and the synchronization method for different noise levels. In particular, the estimation error of NUG at SNR = 1/64 with 500 images is sufficiently low (MSE = 0.05) to result in a meaningful 3-D ab-initio model (with < 30 Angstrom resolution). Numerical experiments were conducted on a cluster of Intel(R) Xeon(R) CPU E7–8880v3 @ 2.30GHz totaling 144 cores and 792GB of RAM. The synchronization method is roughly two orders of magnitude faster than NUG, but NUG gives more accurate estimates at low SNR.

Table 1.

The resolutions shown are in Angstroms. At high SNR, synchronization is accurate to more decimal places. This is not surprising since we have compromised on various truncations and discretizations for computational speed, etc. However, as the noise increases, we see NUG outperforming synchronization.

100 images: SNR NUG MSE sync. MSE NUG res. (Å) sync. res. (Å)
1/1 0.0153 1.2759e-04 24.6 20.8
1/2 0.0155 1.3593e-04 21.5 20.3
1/4 0.0174 3.6615e-04 27.0 22.4
1/8 0.0192 0.0037 28.7 25.8
1/16 0.0227 0.0300 29.8 30.9
1/32 0.0298 0.1572 35.6 45.8
1/64 0.1559 2.7818 45.3 174.1
1/128 2.1239 4.1492 97.9 175.3
500 images: SNR NUG MSE sync. MSE NUG res. (Å) sync. res. (Å)
1/1 0.0125 4.1412e-05 18.1 14.6
1/2 0.0130 5.1825e-05 20.8 18.2
1/4 0.0134 1.5268e-04 21.7 17.0
1/8 0.0143 0.0018 16.4 18.1
1/16 0.0175 0.0189 20.8 17.7
1/32 0.0195 0.1559 24.6 30.7
1/64 0.0460 2.2496 29.3 71.1
1/128 1.6060 3.1661 64.2 107.6

To put the numbers in Table 1 into perspective, an MSE of 0.1, for example, can be small enough to yield a good reconstruction in one setting but too large in another. The 3-dimensional reconstruction from 100 images at low SNR (e.g., SNR = 1/128) looks poor even when the true orientations are used (i.e., MSE = 0). On the other hand, the 3-dimensional reconstruction from 500 images at moderate SNR (e.g., SNR = 1/32) looks quite decent at MSE = 0.1; this case is shown in Figure 7 of [39], where SNR = 1/32 and the MSE is slightly above 0.1. The resolution measure quantifies the quality of the 3-D ab initio model. An idea of how the quality of the 3-dimensional reconstruction varies with the MSE can also be obtained from Table 3 and Figure 8 of [42]. We point out that, for the same computational cost, synchronization can process more images than NUG and may therefore yield better 3-dimensional reconstructions. Beyond its theoretical interest, however, NUG may still be useful when the number of images is limited (e.g., when only a handful of class averages are available, or in electron tomography).

We also illustrate the effects of contrast, the CTF and shifts on the performance of NUG in the numerical experiment reported in Table 2. In this experiment, phase flipping was applied to correct for the phases of the CTF but not its magnitude. Shifts are simply ignored in the NUG SDP; in the future, (4.4) can be extended to include shifts and thereby improve the estimates. See Section 6.1 for a discussion of the effects of contrast, the CTF and shifts.
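As a reminder of what phase flipping does, here is a minimal sketch (our own illustration; the centered CTF array and the function name phase_flip are assumptions): the image's Fourier transform is multiplied by the sign of the CTF, so the CTF's phase is corrected while its magnitude is left untouched.

    import numpy as np

    def phase_flip(img, ctf):
        # img: (p, p) projection image; ctf: (p, p) CTF sampled on the same (centered) Fourier grid
        F = np.fft.fft2(img)
        F *= np.sign(np.fft.ifftshift(ctf))   # flip the sign where the CTF is negative
        return np.real(np.fft.ifft2(F))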

The distribution of the contrast was based on the following numerical experiment. We used the publicly available experimental dataset EMPIAR-10028 (https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10028/). Using the known 3-dimensional structure of the molecule, we construct 10,000 simulated clean projections of size 64 × 64 at orientations sampled uniformly over SO(3). For each image $\hat I$ in the experimental dataset, we solve for $\gamma$ in

$$\underset{\gamma,\, I_i}{\text{minimize}} \;\; \left\| \gamma \hat I - I_i \right\|_{\mathrm{Fro}}^2,$$

where the $I_i$'s are simulated projections generated from the known structure. Figure 6.3 shows the empirical distribution of $\gamma$.
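A minimal sketch of this contrast fit follows (the variable and function names are ours): for each experimental image, the closed-form least-squares $\gamma$ is computed against every simulated projection and the best fit is retained.

    import numpy as np

    def fit_contrast(I_hat, sims):
        # I_hat: experimental image (p, p); sims: array (m, p, p) of clean simulated projections
        best_gamma, best_err = None, np.inf
        x = I_hat.ravel()
        for I in sims:
            y = I.ravel()
            gamma = np.dot(x, y) / np.dot(x, x)       # argmin_gamma ||gamma * I_hat - I||_Fro^2
            err = np.sum((gamma * x - y) ** 2)
            if err < best_err:
                best_gamma, best_err = gamma, err
        return best_gamma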

Figure 6.3. Distribution of contrast in the experimental dataset EMPIAR-10028.

7. Summary

The NUG problem consists of minimizing a sum of pairwise cost functions defined over ratios of group elements, for an arbitrary compact group. This corresponds to the simultaneous multi-alignment of many data points (e.g. signals, images, or molecular densities) with respect to transformations given by an action of the corresponding group. We presented a methodology for solving this problem, consisting of a relaxation to an SDP together with an ADMM algorithm for the resulting SDP.

In this paper we focused mainly on alignment over SO(2) and SO(3). A notable example is finding a consistent set of pairwise rotations of many different functions defined on the sphere that globally minimizes the disagreement between pairs of functions.

The NUG problem arises in cryo-EM, where the task is to recover the relative orientations of many noisy 2-dimensional projection images of a 3-dimensional object, each obtained from a different unknown viewing direction. Once good alignments are estimated, the 3-dimensional object can be reconstructed using standard tomography algorithms. In this paper we formulated the problem of image alignment in cryo-EM as an instance of NUG and demonstrated its applicability on simulated datasets.

Computational and numerical considerations require truncating both the cost functions and the group representations; a general approach is proposed as part of the SDP formulation, and additional methods tailored to the case of SO(3) and to the special properties of the cryo-EM problem are presented.

It is noteworthy that, compared to previous work on related alignment problems, the SDP formulation can provide a certificate of optimality in some cases: whenever the solution of the SDP also satisfies the nonconvex constraints that were relaxed, this certifies that optimality has been achieved. In numerical work not reported in this paper, we observed optimality of the NUG SDP for some instances of MRA, but not for orientation estimation in cryo-EM. A better theoretical understanding of when the NUG SDP achieves optimality is an interesting open problem. In the context of cryo-EM, like other common-line based approaches, NUG does not require an initial guess. These properties make NUG a candidate for future work on robust ab initio alignment that could provide an initialization for refinement algorithms.

Acknowledgments

The authors thank the anonymous reviewers for their valuable comments that helped improve the paper.

A. Singer was partially supported by NIGMS Award Number R01GM090200, AFOSR FA9550-17-1-0291, a Simons Investigator Award, the Moore Foundation Data-Driven Discovery Investigator Award, and NSF BIGDATA Award IIS-1837992. These awards also partially supported A. S. Bandeira, Y. Chen, and R. R. Lederman at Princeton University.

Part of this manuscript was prepared while A. S. Bandeira was with the Courant Institute of Mathematical Sciences and the Center for Data Science at New York University, partially supported by National Science Foundation grants DMS-1317308, DMS-1712730, and DMS-1719545, and by a grant from the Sloan Foundation.

The authors performed most of the research at Princeton University.

Footnotes

1. Similarly to SO(2), it is possible that the non-negativity constraint may be replaced by an SDP or sums-of-squares constraint [38].

Contributor Information

Afonso S. Bandeira, Department of Mathematics, ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland.

Yutong Chen, Systems Trading, Tudor Investment Corporation, New York, NY USA.

Roy R. Lederman, Department of Statistics and Data Science, Yale University, 24 Hillhouse Avenue, New Haven, CT 06511 USA

Amit Singer, Department of Mathematics, Program in Applied and Computational Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544 USA.

References

  • [1]. Abbe E, Bandeira AS, Bracher A, and Singer A. Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery. IEEE Transactions on Network Science and Engineering, 1(1):10–22, Jan 2014.
  • [2]. Abbe E, Bandeira AS, and Hall G. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2016.
  • [3]. Alon N and Naor A. Approximating the cut-norm via Grothendieck's inequality. In Proc. of the 36th ACM STOC, pages 72–80. ACM Press, 2004.
  • [4]. Askey R. Orthogonal Polynomials and Special Functions. SIAM, 4th edition, 1975.
  • [5]. ASPIRE – algorithms for single particle reconstruction software package. http://spr.math.princeton.edu/.
  • [6]. Bai XC, McMullan G, and Scheres SH. How cryo-EM is revolutionizing structural biology. Trends in Biochemical Sciences, 40(1):49–57, 2015.
  • [7]. Bandeira AS. Random Laplacian matrices and convex relaxations. Foundations of Computational Mathematics, 18(2):345–379, 2018.
  • [8]. Bandeira AS, Charikar M, Singer A, and Zhu A. Multireference alignment using semidefinite programming. 5th Innovations in Theoretical Computer Science (ITCS 2014), 2014.
  • [9]. Bandeira AS, Kennedy C, and Singer A. Approximating the little Grothendieck problem over the orthogonal and unitary groups. Math. Program., 160:433–475, 2016.
  • [10]. Bandeira AS, Singer A, and Spielman DA. A Cheeger inequality for the graph connection Laplacian. SIAM J. Matrix Anal. Appl., 34(4):1611–1630, 2013.
  • [11]. Bannai E and Bannai E. A survey on spherical designs and algebraic combinatorics on spheres. European Journal of Combinatorics, 30(6):1392–1425, 2009.
  • [12]. Bhamre T, Zhao Z, and Singer A. Mahalanobis distance for class averaging of cryo-EM images. 2017 IEEE 14th International Symposium on Biomedical Imaging, pages 654–658, 2017.
  • [13]. Lederman RR and Singer A. A representation theory perspective on simultaneous alignment and classification. Applied and Computational Harmonic Analysis, 2019.
  • [14]. Boyd S and Vandenberghe L. Convex Optimization. Cambridge University Press, 2004.
  • [15]. Charikar M, Makarychev K, and Makarychev Y. Near-optimal algorithms for unique games. Proceedings of the 38th ACM Symposium on Theory of Computing, 2006.
  • [16]. Chen Y, Huang Q-X, and Guibas L. Near-optimal joint object matching via convex relaxation. Proceedings of the 31st International Conference on Machine Learning, 32(2):100–108, 2014.
  • [17]. Coifman RR and Weiss G. Representations of compact groups and spherical harmonics. L'Enseignement Mathématique, Vol. 14, 1968.
  • [18]. Cucuringu M. Synchronization over Z2 and community detection in signed multiplex networks with constraints. Journal of Complex Networks, 2015.
  • [19]. Dumitrescu B. Positive Trigonometric Polynomials and Signal Processing Applications. Springer, 2007.
  • [20]. Farrow NA and Ottensmeyer EP. A posteriori determination of relative projection directions of arbitrarily oriented macromolecules. Journal of the Optical Society of America A, 9(10):1749–1760, 1992.
  • [21]. Goemans MX and Williamson DP. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6):1115–1145, 1995.
  • [22]. Van Heel M. Angular reconstitution: a posteriori assignment of projection directions for 3D reconstruction. Ultramicroscopy, 21:111–123, 1987.
  • [23]. Van Heel M and Schatz M. Fourier shell correlation threshold criteria. Journal of Structural Biology, 151:250–262, 2005.
  • [24]. Huang Q-X and Guibas L. Consistent shape maps via semidefinite programming. Computer Graphics Forum, 32(5):177–186, 2013.
  • [25]. Kam Z. The reconstruction of structure from electron micrographs of randomly oriented particles. Journal of Theoretical Biology, 82(1):15–39, 1980.
  • [26]. Khot S. On the power of unique 2-prover 1-round games. Thirty-fourth Annual ACM Symposium on Theory of Computing, 2002.
  • [27]. Khot S. On the unique games conjecture (invited survey). In Proceedings of the 2010 IEEE 25th Annual Conference on Computational Complexity, CCC '10, pages 99–121, Washington, DC, USA, 2010. IEEE Computer Society.
  • [28]. Kostelec PJ and Rockmore DN. FFTs on the rotation group. Journal of Fourier Analysis and Applications, 14:145–179, 2008.
  • [29]. Mallick SP, Agarwal S, Kriegman DJ, Belongie SJ, Carragher B, and Potter CS. Structure and view estimation for tomographic reconstruction: A Bayesian approach. IEEE Conference on Computer Vision and Pattern Recognition, 2:2253–2260, 2006.
  • [30]. Natterer F. The Mathematics of Computerized Tomography. Classics in Applied Mathematics, SIAM, 2001.
  • [31]. Naor A, Regev O, and Vidick T. Efficient rounding for the noncommutative Grothendieck inequality. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC '13, pages 71–80, New York, NY, USA, 2013. ACM.
  • [32]. Nesterov Y. Semidefinite relaxation and nonconvex quadratic optimization. Optimization Methods and Software, 9(1–3):141–160, 1998.
  • [33]. Parrilo PA. Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. Thesis, California Institute of Technology, 2000.
  • [34]. Blekherman G, Parrilo PA, and Thomas RR. Semidefinite Optimization and Convex Algebraic Geometry. MOS-SIAM Series on Optimization, 2013.
  • [35]. Penczek PA, Zhu J, and Frank J. A common-lines based method for determining orientations for n > 3 particle projections simultaneously. Ultramicroscopy, 63:205–218, 1996.
  • [36]. Prucnal PR and Saleh BEA. Transformation of image-signal-dependent noise into image-signal-independent noise. Optics Letters, 6:316–318, 1981.
  • [37]. Rodrigues O. Des lois géométriques qui régissent les déplacements d'un système solide dans l'espace, et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. J. Math. Pures Appl., 5:380–440, 1840.
  • [38]. Saunderson J, Parrilo PA, and Willsky AS. Semidefinite descriptions of the convex hull of rotation matrices. SIAM J. Optim., 25(3):1314–1343, 2015.
  • [39]. Shkolnisky Y and Singer A. Viewing direction estimation in cryo-EM using synchronization. SIAM J. Imaging Sciences, 5(3):1088–1110, 2012.
  • [40]. Singer A. Angular synchronization by eigenvectors and semidefinite programming. Applied and Computational Harmonic Analysis, 30(1):20–36, 2011.
  • [41]. Singer A, Coifman RR, Sigworth FJ, Chester DW, and Shkolnisky Y. Detecting consistent common lines in cryo-EM by voting. Journal of Structural Biology, 169(3):312–322, 2010.
  • [42]. Singer A and Shkolnisky Y. Three-dimensional structure determination from common lines in cryo-EM by eigenvectors and semidefinite programming. SIAM J. Imaging Sciences, 4(2):543–572, 2011.
  • [43]. Singer A, Zhao Z, Shkolnisky Y, and Hadani R. Viewing angle classification of cryo-electron microscopy images using eigenvectors. SIAM Journal on Imaging Sciences, 4:723–759, 2011.
  • [44]. Sun D, Toh KC, and Yang L. A convergent 3-block semiproximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim., 25(2):882–915, 2015.
  • [45]. Vainshtein BK and Goncharov AB. Determination of the spatial orientation of arbitrarily arranged identical particles of unknown structure from their projections. 1986.
  • [46]. Vandenberghe L and Boyd S. Semidefinite programming. SIAM Review, 38:49–95, 1996.
  • [47]. Wang L and Sigworth FJ. Cryo-EM and single particles. Physiology, 21:13–18, 2006.
  • [48]. Wigner EP. Group Theory and its Application to the Quantum Mechanics of Atomic Spectra. New York: Academic Press, 1959.
  • [49]. Yershova A, Jain S, LaValle SM, and Mitchell JC. Generating uniform incremental grids on SO(3) using the Hopf fibration. The International Journal of Robotics Research, 29(7):801–812, 2010.
  • [50]. Zhao Z, Shkolnisky Y, and Singer A. Fast steerable principal component analysis. IEEE Transactions on Computational Imaging, 2:1–12, 2016.
  • [51]. Zhao Z and Singer A. Rotationally invariant image representation for viewing direction classification in cryo-EM. Journal of Structural Biology, 186(1):153–166, 2014.
