Author manuscript; available in PMC: 2016 Aug 1.
Published in final edited form as: J Struct Biol. 2015 Jun 4;191(2):245–262. doi: 10.1016/j.jsb.2015.05.007

Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images

Hemant D Tagare a,b,*, Alp Kucukelbir c,d, Fred J Sigworth e,b, Hongwei Wang f, Murali Rao g
PMCID: PMC4536832  NIHMSID: NIHMS705308  PMID: 26049077

Abstract

Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP.

Keywords: Single Particle Reconstruction, Heterogeneity, Maximum-likelihood, EM Algorithm, Principal Components

1. Introduction

The three-dimensional principal components of heterogeneous particles are, loosely speaking, the primary “modes” of structural change in those particles. Principal components are biologically quite relevant. Each principal component informs us about parts of the structure that vary together in a coordinated manner. A key question in single particle electron cryo-microscopy (cryo-EM) is whether the principal components of heterogeneous three-dimensional structures can be reconstructed directly from the two-dimensional cryo-EM images. The goal of this article is to address this question from a theoretical as well as a practical and algorithmic point of view.

Classical cryo-EM reconstruction methods can be used to obtain principal components somewhat indirectly: These methods are used to reconstruct a number of different structures from the cryo-EM images. Then, the covariance of the reconstructed structures is taken as an estimate of the true three-dimensional covariance of the heterogeneous particle, and principal components are calculated as eigenvectors of the covariance. The difference between various reported methods lies in the reconstruction step. One approach assumes that the heterogeneous sample is a mixture of particles with a finite number of different structures. The particles in the mixture are recovered using the expectation-maximization algorithm (the EM algorithm). This approach is employed by several cryo-EM packages, e.g. Xmipp (Scheres 2007), RELION (Scheres 2012a; Scheres 2012b), and FREALIGN (Lyumkis 2013). Another approach employs the bootstrap (Penczek 2011). It samples the cryo-EM images with replacement, and reconstructs a large number of three-dimensional structures from the bootstrapped samples.

A more recent method to understand heterogeneity uses Laplacian eigenmaps to organize cryo-EM images into a low dimensional manifold from which an energy landscape is obtained (Dashti 2014). 2D movies of the heterogeneity are created along a trajectory in the energy landscape. These movies are generated for paths corresponding to different orientations and patch information from different orientations is compiled into a 3D movie.

A different approach to understanding heterogeneity bypasses the reconstruction step and directly models and estimates the covariance of the structures. In (Zeng 2012; Wang 2013), for example, this approach is used to estimate the covariance matrix of the structure, assuming that the covariance matrix has a diagonal form. This gives the voxel-wise variance of the structures, but not the principal components. Another approach attempts to reconstruct the covariance structure by a form of interpolation (Katsevich 2015; Anden). Because the covariance matrix is quite large, this approach is limited to small volumes.

Heterogeneity can also be investigated via normal mode analysis (Brooks and Karplus 1985; Chacon 2003). Normal modes are eigenvectors of the Hessian of the potential function of the atomic displacements of a molecule. Normal modes, especially the low-spatial-frequency normal modes, provide insight into possible heterogeneity of the particle due to bending and rotation of different parts of the molecule. Recent work (Jin 2014) has shown how normal modes can be used to understand heterogeneity in images. Normal mode analysis is useful in its own right, but in the context of principal components it can provide very informative priors. In the future, it may be possible to combine the strengths of both approaches into a unified whole.

In this paper, we consider the problem of directly and sequentially reconstructing the principal components from cryo-EM images. By “directly” we mean that the principal components are recovered without the intermediate step of reconstructing multiple structures or their covariances. By “sequentially” we mean that the principal components are reconstructed one at a time. This has the dual advantage of efficient memory utilization, because large covariance matrices are not needed, and of computational efficiency, because the principal components are recovered one at a time. Our approach is based on a generative model, and various complications of cryo-EM imaging such as variable image noise, different number of images in different projection directions, and even the contrast transfer function (CTF) can be incorporated into the model.

We also discuss a fundamental problem in covariance and principal component estimation that is often overlooked in the concern over algorithms and practical results. A priori, it is not clear at all whether, and how much of, the three-dimensional covariance (and hence the principal components) of a heterogeneous particle can be recovered from the two-dimensional cryo-EM projection images. Potentially, some information may be lost because the relation between images that are projected in different directions is not available in cryo-EM. However, it does turn out that the three-dimensional covariance can be recovered exactly without the knowledge of this relation. There is a Fourier slice theorem for covariances, which shows how the three-dimensional covariance of a structure can be recovered exactly from two-dimensional covariances of images. In Section 3, we present and explain this theorem. This theorem is as fundamental to heterogeneous particle reconstruction as the usual Fourier slice theorem is to single particle reconstruction. A similar result has also been reported in (Katsevich 2015).

A direct implementation of the Fourier slice theorem for covariances turns out to be cumbersome; the three-dimensional covariance is too large a data structure to calculate and hold in computer memory. A more practical alternative is to directly calculate the principal components and principal values of the covariance. The second part of this article contains our algorithm for directly recovering the principal components of the three-dimensional covariance from cryo-EM images. Simulations and experiments with real cryo-EM data show that the algorithm performs well with noisy data. Because the algorithm recovers principal components rather than a discrete set of structures, continuously variable structures are represented well by the method.

The theory and the algorithm in this paper are based on the following assumptions: We assume that if a single structure is reconstructed from a heterogeneous sample using classical reconstruction algorithms, then that structure is the mean (average) structure of the heterogeneous sample. We also assume that the heterogeneity is not excessive, so that during reconstruction each particle image is associated with the correct projection direction at the correct alignment. The latter assumption is not strict. Some mismatches and misalignments are not detrimental to the algorithm. Note that these two assumptions are also commonly made in other attempts to characterize structural heterogeneity (e.g. (Penczek 2011; Katsevich 2015)). Finally, we assume that the cryo-EM images can be CTF-corrected after alignment. This assumption is made only for conceptual simplicity; we want to address the heart of the problem - estimating the principal components - without the added complexity introduced by the CTF. The assumption is very easily relaxed.

We begin in Section 2 with a brief discussion of continuous space and discrete space models of heterogeneity. Section 3 addresses the question of whether the three-dimensional covariance can be recovered at all from cryo-EM images. Here, we explain how the Fourier slice theorem extends to covariance functions. A more mathematical explanation is in the Appendix. Section 4 formulates the problem of directly estimating the principal components. Section 5 proposes a practical algorithm for the problem. As shown in that section, a version of the classic Expectation Maximization algorithm (the EM algorithm) can be used to directly reconstruct the principal components from cryo-EM images. Section 6 contains results of using this algorithm to recover principal components from simulated and real cryo-EM data sets. Section 7 contains a discussion and concludes the paper.

2. Models for Heterogeneous Particles

Classic single particle reconstruction uses a “continuous space” and a “discrete space” model of the particle; the former is used to establish the Fourier slice theorem and the latter to derive practical algorithms. The continuous space model regards the structure (density) of the particle as a function s on a continuous three-dimensional space. The discrete space model takes the structure s to be a set of numbers on the vertices of a V × V × V lattice. The discrete structure can also be thought of as a V³ × 1 vector.

The continuous space model for a heterogeneous particle is a random process s defined on 3d space. That is, s(u) is a random variable for any point u in 3d. The mean μs is a deterministic function in 3d taking value μs(u) = E[s(u)] at u, and the covariance Σs is a deterministic function of any pair of points u, υ in 3d, with Σs(u, υ) = E [(s(u) − μs(u))(s(υ) − μs(υ))]. Samples drawn from s represent heterogeneous particles.

The discrete space model for a heterogeneous particle is a V × V × V or a V³ × 1 valued random variable s. Its mean μs = E[s] is a V × V × V or a V³ × 1 vector, and its covariance Σs is a V³ × V³ matrix. The principal components and principal values of s are the eigenvectors and eigenvalues of Σs.

Bayesian and likelihood approaches to principal component analysis use the following generative model for s (Tipping 1999; Basilevsky 1994):

$$s = \mu_s + z_1\mu_1 + z_2\mu_2 + \cdots + z_n\mu_n, \tag{1}$$

where μ1, ⋯, μn are orthogonal vectors, i.e. $\mu_i^T\mu_j = 0$ for $i \neq j$. The norm (length) of μk is the square root of the kth principal value λk of s ($\|\mu_k\| = \sqrt{\lambda_k}$), and μk/‖μk‖ = ek, the kth principal component of s. Finally, z1, ⋯, zn are independent scalar random variables that are normally distributed with density 𝒩(0, 1). This model can be rewritten with simplified notation by defining a V³ × n matrix μ whose columns are μk:

$$\mu = [\mu_1 \;\; \mu_2 \;\; \cdots \;\; \mu_n],$$

and a vector valued random variable z whose components are zk,

$$z = [z_1 \;\; z_2 \;\; \cdots \;\; z_n]^T,$$

and which has density 𝒩(0, I), where I is the identity matrix, so that

$$s = \mu_s + \mu z. \tag{2}$$

This model is used in section 5 to derive the EM algorithm for estimating principal components.
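The generative model of equation (2) can be illustrated with a small NumPy sketch. The lattice size, the two column lengths, and the function names below are hypothetical choices made only for illustration; the sketch draws samples s = μs + μz and reads the principal values and components off the columns of μ.

```python
import numpy as np

def sample_structures(mu_s, mu, num_samples, rng):
    """Draw samples s = mu_s + mu @ z with z ~ N(0, I), as in equation (2)."""
    n = mu.shape[1]
    z = rng.standard_normal((n, num_samples))
    return mu_s[:, None] + mu @ z  # shape (V^3, num_samples)

def principal_components(mu):
    """Principal values and components implied by a matrix mu with orthogonal columns.

    The model covariance is mu @ mu.T, so its nonzero eigenvalues are the
    squared column norms and its eigenvectors are the normalized columns.
    """
    lam = np.sum(mu**2, axis=0)
    e = mu / np.sqrt(lam)
    return lam, e

rng = np.random.default_rng(0)
V = 4  # toy lattice size (assumption)
# Toy mu: orthonormal columns scaled to lengths sqrt(9) and sqrt(4).
q, _ = np.linalg.qr(rng.standard_normal((V**3, 2)))
mu = q * np.sqrt(np.array([9.0, 4.0]))
mu_s = rng.standard_normal(V**3)

samples = sample_structures(mu_s, mu, 1000, rng)
lam, e = principal_components(mu)
```

In this toy case the V³ × V³ covariance is never needed to read off the components, which is the property the sequential algorithm relies on.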

3. The Fourier Slice Theorem for Covariances

Having defined the basic quantities that we will use for describing heterogeneous particles, we turn to describing the Fourier slice theorem for covariances. This theorem mirrors the classic Fourier slice theorem in single particle reconstruction. We adopt the “continuous space” point-of-view of the classic theorem. Detailed calculations and justifications for all of the formulae in this section are given in the Appendix.

Let the heterogeneous particle s be a random process in three dimensions. Project s tomographically onto a two-dimensional plane as follows: With the understanding that the north hemisphere of a unit sphere includes the equator, pick a point in the north hemisphere. This point defines a unit length vector n. If Πn is the two-dimensional plane perpendicular to n containing the origin, then for any point a ∈ Πn,

$$y_n(a) = \int s(a + n\sigma)\, d\sigma \tag{3}$$

is the line integral along the normal ray through a, with σ being the distance along the ray. This makes yn a two-dimensional stochastic process defined on Πn. It is the random image generated by the tomographic projection of s. Its mean and covariance are easily shown to be (see Appendix):

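$$\text{Mean:}\quad \mu_{y_n}(a) = E[y_n(a)] = \int \mu_s(a + \sigma n)\, d\sigma$$
$$\text{Covariance:}\quad \Sigma_{y_n}(a, b) = E[(y_n(a) - \mu_{y_n}(a))(y_n(b) - \mu_{y_n}(b))] = \iint \Sigma_s(a + \sigma_1 n, b + \sigma_2 n)\, d\sigma_1\, d\sigma_2.$$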

We will call Σyn the projected covariance function.

As with the classic Fourier slice theorem we will assume that projection images are available in all projection directions. That is, μyn and Σyn are available for all projection directions n.

Below, we will need Fourier transforms of the covariance functions. They are:

$$\mathcal{F}_s(\omega_1, \omega_2) = \iint e^{-i(\omega_1^T u_1 + \omega_2^T u_2)}\, \Sigma_s(u_1, u_2)\, du_1\, du_2, \quad \text{and} \tag{4}$$
$$\tilde{\mathcal{F}}_{y_n}(\nu_1, \nu_2) = \iint e^{-i(\nu_1^T \upsilon_1 + \nu_2^T \upsilon_2)}\, \Sigma_{y_n}(\upsilon_1, \upsilon_2)\, d\upsilon_1\, d\upsilon_2. \tag{5}$$

In the above equations ω1, ω2 are a pair of three-dimensional frequencies while ν1, ν2 are a pair of two-dimensional frequencies. The terms $\omega_1^T u_1$, $\omega_2^T u_2$, $\nu_1^T \upsilon_1$, $\nu_2^T \upsilon_2$ are inner products. The integral in equation (4) is six-dimensional and the terms du1 and du2 are the differential volumes. Similarly, the integral in equation (5) is four-dimensional, and the terms dυ1 and dυ2 are differential areas. If the Fourier transform ℱs(ω1, ω2) is known for every pair of three-dimensional frequencies ω1, ω2, then the transform can be inverted to recover the covariance function Σs.

Notice that when the Fourier transform is written as in equations (4) and (5), the frequencies themselves can be interpreted as vectors in the spatial domain. To be clear, this means that we treat any ω simultaneously as a vector in the Fourier domain as well as in the spatial domain. Thus, given any non-zero ω, there is a plane in the Fourier domain perpendicular to ω. Similarly, there is also a plane in the spatial domain perpendicular to ω (defined by the set of all points u such that ωᵀu = 0). These two planes are parallel to each other.

Now recall the classic Fourier slice theorem: Let n be a vector in the three-dimensional Fourier domain and let Π̃n be the plane in the three-dimensional Fourier domain perpendicular to n. Similarly, let Πn be the plane in the spatial domain perpendicular to n (figure 1a). If ω is a three-dimensional frequency in Π̃n, then ω is also contained in Πn, and we may think of ω as a two-dimensional frequency vector in Πn. The classic Fourier slice theorem shows that if we tomographically project a three-dimensional function onto Πn, then the Fourier coefficient of the projection at any (two-dimensional) frequency ω is equal to the Fourier coefficient of the three-dimensional function at the (three-dimensional) frequency ω in Π̃n. That is, all Fourier coefficients of the tomographic projection on Πn are equal to the corresponding coefficients of the three-dimensional Fourier transform in the plane Π̃n.

Figure 1.

The classic Fourier slice theorem and the Fourier slice theorem for covariances. (a) The planes Π̃n (Fourier domain) and Πn (spatial domain) are perpendicular to n. Thus a three-dimensional frequency ω in Π̃n can be thought of as a two-dimensional frequency. The classic Fourier slice theorem says that the Fourier transform of the tomographic projection of a function on Πn at (the two-dimensional) frequency ω is equal to the Fourier transform of the function at the three-dimensional frequency ω in Π̃n. (b) The covariance Fourier slice theorem says that the Fourier transform of a projected covariance function at the pair of (two-dimensional) frequencies ω1, ω2 in Πn is equal to the Fourier transform of the unprojected covariance function at the pair of (three-dimensional) frequencies ω1, ω2 in Π̃n.

The Fourier slice theorem for covariances shows that a similar argument holds for covariance functions. If ω1, ω2 are a pair of vectors in the three-dimensional Fourier space, then there is at least one vector n in the north hemisphere in the Fourier domain (figure 1b) such that the Fourier-domain plane Π̃n perpendicular to n contains ω1, ω2. Let Πn be the plane in the spatial domain perpendicular to this n. Then ω1, ω2 can be thought of as two-dimensional frequencies in Πn. It turns out that the Fourier coefficient of the projected covariance function in Πn evaluated at the (two-dimensional) frequency pair ω1, ω2 is identical to the Fourier coefficient of the three-dimensional (unprojected) covariance function at the (three-dimensional) frequency pair ω1, ω2 in Π̃n, i.e.

$$\tilde{\mathcal{F}}_{y_n}(\omega_1, \omega_2) = \mathcal{F}_s(\omega_1, \omega_2). \tag{6}$$

A proof of this claim is available in the Appendix.

This theorem shows how the three-dimensional covariance can be recovered exactly from projected two-dimensional covariances. For any ω1, ω2 in the three-dimensional Fourier domain find the vector n and evaluate ℱs(ω1, ω2) = ℱ̃yn(ω1, ω2). Since this can be done for any, hence every, ω1, ω2, the entire Fourier transform ℱs can be calculated. Taking the inverse Fourier transform of ℱs gives the covariance Σs. Thus, the covariance, and therefore the principal components, of s can be recovered from the covariances of the images.
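A discrete analogue of the covariance slice relation can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a tiny periodic lattice, replaces integrals by sums and the continuous transforms by DFTs, builds a toy covariance of the form μμᵀ, and checks a single axis-aligned projection direction rather than arbitrary n.

```python
import numpy as np

rng = np.random.default_rng(0)
V, n = 4, 2  # toy lattice size and number of components (assumptions)

# Toy covariance Sigma_s = mu mu^T, viewed as a function of two lattice
# points u1 = (x1, y1, z1) and u2 = (x2, y2, z2).
mu = rng.standard_normal((V**3, n))
sigma_s = (mu @ mu.T).reshape(V, V, V, V, V, V)

# Projected covariance for n along the first lattice axis: sum over the
# ray coordinate of both arguments (discrete form of the double integral).
sigma_y = sigma_s.sum(axis=(0, 3))  # shape (V, V, V, V)

# Discrete Fourier transforms over all arguments.
F_s = np.fft.fftn(sigma_s)  # six-dimensional transform
F_y = np.fft.fftn(sigma_y)  # four-dimensional transform

# Frequency pairs in the plane perpendicular to n have zero component along
# the first axis, i.e. index 0 on axes 0 and 3 of F_s.
slice_of_F_s = F_s[0, :, :, 0, :, :]

print(np.allclose(F_y, slice_of_F_s))
```

The agreement is exact for the DFT because setting the frequency along the projection axis to zero turns the transform along that axis into a plain sum.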

The Fourier slice theorem for covariances makes a number of idealizations that have to be relaxed to develop a practical algorithm: (1) The images are not available in continuous space, but rather on a two-dimensional lattice. (2) The images are not available for all projection directions, but only for finitely many directions. (3) The number of images varies quite widely for different projection directions, and for some projection directions the number of images may not be large enough to get a good estimate of the image covariance. (4) The images are quite noisy, as is typical of cryo-EM. (5) The amount of noise may vary from image to image, since it depends on the quality of the micrograph that the image comes from.

These factors suggest developing an algorithm with a discrete space image model, which does not assume that a reliable image covariance is available and which explicitly takes image noise into account.

4. The Estimation Problem

Returning to the generative discrete space model of equation (2) of section 2, suppose that the random structure s is tomographically projected onto a plane by a tomographic projection operator A. The result is a random image (size V × V or V² × 1) given by I = As + ε, where ε is additive noise. Samples of this random variable are images that are present along the projection direction corresponding to A.

More generally, several images Ij, j = 1, ⋯,N are available and are obtained from s by projections from different directions. Image Ij has a corresponding projection operator Aj. Assuming that the images Ij are aligned with the correct projection direction at the correct rotation and translation,

$$I_j = A_j s + \epsilon_j = A_j(\mu_s + \mu z_j) + \epsilon_j, \quad j = 1, \cdots, N, \tag{7}$$

where zj are i.i.d. random variables, each zj distributed identically to z, and εj is additive noise with density 𝒩(0, σj²I) (each image can have a different noise level).

Since μs and Aj are known, equation (7) can be simplified further as

$$I_j - A_j\mu_s = A_j(\mu z_j) + \epsilon_j, \quad j = 1, \cdots, N. \tag{8}$$

The term Ajμs is just the projection of the mean structure by Aj, and Ij − Ajμs is the image formed by subtracting out the projected mean structure from Ij. Since μs and Aj are known this is straightforward to do. From now on, all images are assumed to be mean-subtracted so that equation (8) can be written simply as

$$I_j = A_j(\mu z_j) + \epsilon_j, \quad j = 1, \cdots, N. \tag{9}$$

The images Ij are indexed by j, but later it will be convenient to use a double index. Assuming that there are a limited number of projection directions 1, ⋯, R and that after alignment the rth direction has Nr images, the images can be indexed by the joint index r, t with r = 1, ⋯, R and t = 1, ⋯, Nr. Thus the Ij can also be referred to as Ir,t.

Our problem is to use the information in equations (2) and (9) to estimate the μ’s. We solve this problem sequentially. That is, we first estimate μ1, then μ2, followed by μ3, and so on. The estimate for any μk assumes that μ1, ⋯, μk−1 are known.

To reduce the propagation of noise into the estimate of μn, we will assume the standard smoothness prior for μn. This prior has the log density

$$\log p(\mu_n) \propto -\gamma \|\nabla \mu_n\|^2, \tag{10}$$

where ∇ is the discrete three-dimensional finite-difference gradient operator and ‖·‖ is the norm. The constant γ > 0 is the regularization constant. In practice, γ is determined by cross validation.

The random variables zj are the latent variables of the problem; there is no interest in estimating them. The EM algorithm (McLachlan and Krishnan 2008) can be used to estimate μn while marginalizing (i.e. integrating out) these latent variables. The EM algorithm uses several conditional densities, and these are calculated now.

Recall that we are assuming that μ1, ⋯, μn−1 are known. We take μn and σ1, ⋯, σN (the image noise standard deviations) as the parameters to be estimated and set Θ = (μn, σ1, ⋯, σN). Then, using the fact that image noise and zj are normally distributed:

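$$p(I_j \mid z_j, \Theta) = \frac{1}{(\sqrt{2\pi})^{V^2} \sigma_j^{V^2}} \exp\left(-\frac{\|I_j - A_j\mu z_j\|^2}{2\sigma_j^2}\right), \quad \text{and} \tag{11}$$
$$p(I_j, z_j \mid \Theta) = p(I_j \mid z_j, \Theta)\, p(z_j \mid \Theta) = p(I_j \mid z_j, \Theta)\, p(z_j) = \frac{1}{(\sqrt{2\pi})^{V^2} \sigma_j^{V^2}} \exp\left(-\frac{\|I_j - A_j\mu z_j\|^2}{2\sigma_j^2}\right) \times \frac{1}{(\sqrt{2\pi})^{n}} \exp\left(-\frac{\|z_j\|^2}{2}\right). \tag{12}$$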

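A straightforward but tedious calculation shows that the posterior p(zj | Ij, Θ) is the normal density 𝒩(ρ̂j, Σ̂j), where

$$\hat{\rho}_j = \frac{1}{\sigma_j^2}\left(I + \frac{1}{\sigma_j^2}\mu^T A_j^T A_j \mu\right)^{-1} \mu^T A_j^T I_j, \quad \text{and} \tag{13}$$
$$\hat{\Sigma}_j = I - \frac{1}{\sigma_j^2}\left(I + \frac{1}{\sigma_j^2}\mu^T A_j^T A_j \mu\right)^{-1} \mu^T A_j^T A_j \mu. \tag{14}$$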

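The matrix $I + \frac{1}{\sigma_j^2}\mu^T A_j^T A_j \mu$ is n × n. Typically n is not bigger than 5, so that the matrix inverse is tractable. Also note that ρ̂j is an n × 1 vector, which by definition is

$$\hat{\rho}_j = E[z_j] = \begin{bmatrix} E[z_j^1] \\ E[z_j^2] \\ \vdots \\ E[z_j^n] \end{bmatrix}, \tag{15}$$

where E[·] is the expectation with respect to zj | Ij, Θ. Similarly, by definition Σ̂j is the covariance matrix

$$\hat{\Sigma}_j = \begin{bmatrix} E[z_j^1 z_j^1] & \cdots & E[z_j^1 z_j^n] \\ E[z_j^2 z_j^1] & \cdots & E[z_j^2 z_j^n] \\ \vdots & \ddots & \vdots \\ E[z_j^n z_j^1] & \cdots & E[z_j^n z_j^n] \end{bmatrix}. \tag{16}$$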
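The posterior quantities of equations (13) and (14) only require an n × n inverse per image. A minimal NumPy sketch follows, assuming a dense projection matrix for a tiny lattice and a toy axis-aligned projection; the function name `posterior_z` and all sizes are hypothetical and serve only to make the formulas concrete.

```python
import numpy as np

def posterior_z(image, A, mu, sigma):
    """Posterior mean and covariance of z_j given I_j (equations (13) and (14)).

    image: (V^2,) mean-subtracted image; A: (V^2, V^3) projection matrix;
    mu: (V^3, n) matrix of components; sigma: noise standard deviation.
    """
    M = A @ mu                                  # projected components, (V^2, n)
    n = M.shape[1]
    B = (M.T @ M) / sigma**2                    # n x n
    inv = np.linalg.inv(np.eye(n) + B)          # small n x n inverse
    rho_hat = inv @ (M.T @ image) / sigma**2    # equation (13)
    sigma_hat = np.eye(n) - inv @ B             # equation (14)
    return rho_hat, sigma_hat

# Toy setup (assumption): V = 3, projection along the first lattice axis.
rng = np.random.default_rng(0)
V, n, sigma = 3, 2, 0.5
A = np.kron(np.ones((1, V)), np.eye(V * V))     # sums a C-ordered volume over axis 0
mu = rng.standard_normal((V**3, n))
image = A @ (mu @ rng.standard_normal(n)) + sigma * rng.standard_normal(V * V)

rho_hat, sigma_hat = posterior_z(image, A, mu, sigma)
```

Algebraically, the expression in equation (14) equals the inverse of the n × n matrix itself, which is the usual form of a Gaussian posterior covariance.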

5. The EM Algorithm

Using the conditional densities of the previous section, the Q function (McLachlan and Krishnan 2008) of the EM algorithm is:

$$Q(\Theta, \Theta^{[k]}) = \sum_{j=1}^{N} E[\log p(I_j, z_j \mid \Theta)] - \gamma\|\nabla\mu_n\|^2, \tag{17}$$

where the expectation is with respect to zj | Ij, Θ[k], and the second term is the log prior for μn. Using equation (12) and dropping all terms that do not depend on Θ gives

$$Q(\Theta, \Theta^{[k]}) = \sum_{j=1}^{N} E\left[-\frac{\|I_j - A_j\mu z_j\|^2}{2\sigma_j^2} - V^2\log\sigma_j - \frac{\|z_j\|^2}{2}\right] - \gamma\|\nabla\mu_n\|^2. \tag{18}$$

The EM algorithm proceeds by alternately maximizing Q with respect to μn subject to the constraint that μn is orthogonal to μ1, ⋯, μn−1, and then with respect to σ1, ⋯, σN.

5.1. Maximization with respect to μn

Simplifying the Q function by dropping terms that do not depend on μn gives (after some algebraic manipulations):

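$$Q(\Theta, \Theta^{[k]}) = \sum_{j=1}^{N} \left(\frac{1}{\sigma_j^2}\right)\left\{ (A_j^T I_j)^T \mu_n E[z_j^n] - \sum_{i<n} \mu_i^T A_j^T A_j \mu_n E[z_j^i z_j^n] - \frac{1}{2}\mu_n^T A_j^T A_j \mu_n E[(z_j^n)^2] \right\} - \gamma\|\nabla\mu_n\|^2. \tag{19}$$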

Notice that the Q function is quadratic with respect to μn. The M-step requires us to maximize the Q function with respect to μn subject to the constraint that μn is orthogonal to μ1, ⋯, μn−1. This maximization is straightforward to carry out: simply minimize −Q with the conjugate gradient algorithm using the negative of the gradient of Q with respect to μn projected on the subspace orthogonal to span(μ1, ⋯, μn−1).

Taking the gradient of Q with respect to μn,

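$$\nabla_{\mu_n} Q = \sum_{j=1}^{N} \frac{1}{\sigma_j^2} A_j^T\left(I_j E[z_j^n] - \sum_{i=1}^{n} A_j \mu_i E[z_j^i z_j^n]\right) + 2\gamma\nabla^2\mu_n, \tag{20}$$

where ∇² is the three-dimensional finite-difference Laplacian operator (the negative of ∇ᵀ∇). The projection of the gradient on the subspace orthogonal to span(μ1, ⋯, μn−1) is

$$P(\nabla_{\mu_n} Q) = \nabla_{\mu_n} Q - \sum_{i=1}^{n-1} \frac{\mu_i^T \nabla_{\mu_n} Q}{\mu_i^T \mu_i}\,\mu_i. \tag{21}$$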
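The projection of equation (21) amounts to subtracting from the gradient its components along the already-estimated μ’s. The sketch below is one way to write it, assuming the known μi are stored as mutually orthogonal columns; because ‖μi‖ = √λi is generally not 1, each component is normalized by μiᵀμi. The function name is hypothetical.

```python
import numpy as np

def project_gradient(grad, mu_known):
    """Remove from grad its components along the columns of mu_known.

    mu_known: (V^3, n-1) matrix whose columns mu_1..mu_{n-1} are assumed
    mutually orthogonal (not necessarily of unit length).
    """
    coeffs = (mu_known.T @ grad) / np.sum(mu_known**2, axis=0)
    return grad - mu_known @ coeffs

# Toy check with two orthogonal columns of lengths 3 and 2 (assumption).
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((27, 2)))
mu_known = q * np.array([3.0, 2.0])
grad = rng.standard_normal(27)

projected = project_gradient(grad, mu_known)
```

Applying the same function a second time leaves the result unchanged, as expected of a projection.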

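There is one last simplification using the dual indexing scheme for images: Recall that if there are r = 1, ⋯, R projection directions with Nr images aligned to the rth projection direction, then the image index j can be replaced by the double index r, t. Since Aj depends only on r,

$$\nabla_{\mu_n} Q = \sum_{j=1}^{N} \frac{1}{\sigma_j^2} A_j^T\left(I_j E[z_j^n] - \sum_{i=1}^{n} A_j\mu_i E[z_j^i z_j^n]\right) + 2\gamma\nabla^2\mu_n = \sum_{r=1}^{R} A_r^T\left(\tilde{I}_r - \sum_{i=1}^{n} A_r\mu_i\beta_{i,r}\right) + 2\gamma\nabla^2\mu_n, \quad \text{where} \tag{22}$$
$$\tilde{I}_r = \sum_{t=1}^{N_r} \frac{1}{\sigma_{r,t}^2} I_{r,t} E[z_{r,t}^n], \tag{23}$$
$$\beta_{i,r} = \sum_{t=1}^{N_r} \frac{1}{\sigma_{r,t}^2} E[z_{r,t}^i z_{r,t}^n]. \tag{24}$$

$E[z_{r,t}^n]$ is available as the nth component of ρ̂j (equations (13) and (15)) and $E[z_{r,t}^i z_{r,t}^n]$ as the ith component of the last column of Σ̂j (equations (14) and (16)).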

The gradient calculation of equation (22) has an intuitively appealing interpretation. The term Ĩr is the weighted average of all images aligned to projection direction r. The term $\sum_{i=1}^{n} A_r\mu_i\beta_{i,r}$ is the weighted sum of the projections of all μ’s, which represent the current guess of the principal components and values. Thus, $(\tilde{I}_r - \sum_{i=1}^{n} A_r\mu_i\beta_{i,r})$ is the information in the image that is not yet explained by the projections of the principal components and values. The operator $A_r^T$ back projects this information, and the sum of all the back projections gives the gradient, i.e. the direction, along which the current estimate of μn should be updated to incorporate this information.

In summary, the maximization of Q with respect to μn is done with a conjugate gradient algorithm using the projected gradient as follows:

Calculation of Projected Gradient

  1. Using the given μ1, ⋯, μn−1 and the current estimate of μn, calculate for every image index j the values of ρ̂j and Σ̂j according to equation (13) and equation (14).

  2. For every projection direction index r, calculate Ĩr according to equation (23) and βi,r according to equation (24).

  3. Calculate the gradient according to equation (22), and project this gradient according to equation (21). Use −P(∇μnQ) as the gradient for the conjugate gradient step.
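The benefit of grouping images by projection direction in step 2 can be seen in a small NumPy sketch. It is an illustration under several assumptions: only the data term of the gradient is computed (the regularization term is omitted), the projection operators are dense matrices for three axis-aligned directions on a tiny lattice, and the expectations E[zⁿ] and E[zⁱzⁿ] are random placeholder arrays rather than outputs of the E-step. All function names are hypothetical.

```python
import numpy as np

def axis_projection_matrix(V, axis):
    """Dense matrix that sums a V x V x V volume (C order) along one axis."""
    basis = np.eye(V**3).reshape(V**3, V, V, V)
    return basis.sum(axis=axis + 1).reshape(V**3, V * V).T  # (V^2, V^3)

def data_gradient_per_image(images, dirs, A, mu, sigma, Ez_n, Ezz_n):
    """Data term summed image by image: one back projection per image."""
    grad = np.zeros(mu.shape[0])
    for j, r in enumerate(dirs):
        resid = images[j] * Ez_n[j] - A[r] @ (mu @ Ezz_n[j])
        grad += A[r].T @ resid / sigma[j]**2
    return grad

def data_gradient_per_direction(images, dirs, A, mu, sigma, Ez_n, Ezz_n):
    """Data term grouped by direction: one back projection per direction."""
    grad = np.zeros(mu.shape[0])
    w = 1.0 / sigma**2
    for r in range(len(A)):
        sel = dirs == r
        I_tilde = (w[sel] * Ez_n[sel]) @ images[sel]  # equation (23)
        beta = w[sel] @ Ezz_n[sel]                    # equation (24), all i at once
        grad += A[r].T @ (I_tilde - A[r] @ (mu @ beta))
    return grad

rng = np.random.default_rng(0)
V, n, N = 3, 2, 12
A = [axis_projection_matrix(V, ax) for ax in range(3)]
dirs = rng.integers(0, 3, size=N)          # direction index of each image
images = rng.standard_normal((N, V * V))   # placeholder mean-subtracted images
mu = rng.standard_normal((V**3, n))
sigma = rng.uniform(0.5, 2.0, size=N)
Ez_n = rng.standard_normal(N)              # placeholder for E[z_j^n]
Ezz_n = rng.standard_normal((N, n))        # placeholder for E[z_j^i z_j^n]

g_image = data_gradient_per_image(images, dirs, A, mu, sigma, Ez_n, Ezz_n)
g_dir = data_gradient_per_direction(images, dirs, A, mu, sigma, Ez_n, Ezz_n)
print(np.allclose(g_image, g_dir))
```

Both routines give the same vector, but the grouped form needs R forward and back projections instead of N, which matters because N is typically much larger than R.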

5.2. Maximization with respect to σj

The maximization of Q with respect to σj has the closed form solution:

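$$\sigma_j = \sqrt{\frac{1}{V^2}\left(I_j^T I_j - 2 I_j^T A_j \mu E[z_j] + \sum_{i_1=1}^{n}\sum_{i_2=1}^{n} \mu_{i_1}^T A_j^T A_j \mu_{i_2} E[z_j^{i_1} z_j^{i_2}]\right)}. \tag{25}$$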
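The closed-form noise update can be sketched as follows. This is a hedged illustration: it assumes that σj denotes a standard deviation (consistent with the σj² appearing in the likelihood), so the update is the square root of the expected squared residual per pixel, and it assumes the terms E[z^{i1} z^{i2}] are posterior second moments, which for a Gaussian posterior equal the covariance plus the outer product of the mean. The function name and toy sizes are hypothetical.

```python
import numpy as np

def update_sigma(image, A, mu, rho_hat, second_moment):
    """Noise level update for one image, written in the expanded form.

    second_moment: (n, n) matrix of E[z^{i1} z^{i2}] under the posterior.
    """
    M = A @ mu                                   # projected components, (V^2, n)
    expected_sq_residual = (
        image @ image
        - 2.0 * image @ (M @ rho_hat)
        + np.sum((M.T @ M) * second_moment)      # double sum over i1, i2
    )
    return np.sqrt(expected_sq_residual / image.size)

# Toy setup (assumption): V = 3, projection along the first lattice axis.
rng = np.random.default_rng(0)
V, n = 3, 2
A = np.kron(np.ones((1, V)), np.eye(V * V))
mu = rng.standard_normal((V**3, n))
image = rng.standard_normal(V * V)
rho = rng.standard_normal(n)                     # placeholder posterior mean
C = rng.standard_normal((n, n))
S = C @ C.T + np.eye(n)                          # placeholder posterior covariance

sigma_new = update_sigma(image, A, mu, rho, S + np.outer(rho, rho))
```

With these second moments the expanded sum equals ‖Ij − Ajμρ̂j‖² plus the trace of (Ajμ)Σ̂j(Ajμ)ᵀ, i.e. the expected squared residual under the posterior.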

The Complete Algorithm

Thus, the EM algorithm for estimating μn and σ1, ⋯, σN is:

  1. Input: Mean subtracted images Ij, initial value of μn, σ1, ⋯, σN.

  2. Iterate: Iterate the following until convergence
    • Maximize w.r.t. μn using conjugate gradient minimization with the projected gradient.
    • Maximize w.r.t. σj, j = 1, ⋯,N using equation (25).

The above assumes that the value of the regularization constant γ is known. In practice, it is determined by a cross-validation method, a procedure that is analogous to the Fourier Shell Correlation method for determining resolution. First, the set of images is split into two halves. Then a set of values of γ is chosen. In practice, it is sufficient to estimate γ up to its order of magnitude, so this set typically contains only a few values. For every value of γ, the vector μ1 of both halves of the set of images is determined independently. When the images have a high signal-to-noise ratio, we expect the two μ1’s to be almost equal. As the noise increases, some noise propagates into the estimate of the μ1’s and they deviate from equality. The closeness of the two μ1’s can be determined by calculating the magnitude of the component of each μ1 along the other, and summing the magnitudes. The γ at which this sum is the maximum is the γ at which the two μ1’s are most similar and is taken as the estimate of γ. This estimate is then used to reconstruct the principal components from the entire set of images.
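The selection rule for γ can be sketched in a few lines. The score below is one reading of "the magnitude of the component of each μ1 along the other, summed"; the function names are hypothetical, and `estimate_mu1` stands for a caller-supplied routine that runs the EM algorithm on one half of the images (it is not defined here).

```python
import numpy as np

def agreement(mu_a, mu_b):
    """Sum of the magnitudes of the component of each vector along the other."""
    dot = abs(mu_a @ mu_b)
    return dot / np.linalg.norm(mu_b) + dot / np.linalg.norm(mu_a)

def select_gamma(images_a, images_b, gammas, estimate_mu1):
    """Pick the gamma whose two half-set estimates of mu_1 agree best.

    estimate_mu1(images, gamma) is a caller-supplied (hypothetical) function
    returning the first component estimated from one half of the data.
    """
    scores = [agreement(estimate_mu1(images_a, g), estimate_mu1(images_b, g))
              for g in gammas]
    return gammas[int(np.argmax(scores))], scores

# Two extreme cases for the score: identical and orthogonal estimates.
mu_half_1 = np.array([3.0, 0.0, 0.0])
mu_half_2 = np.array([0.0, 4.0, 0.0])
same = agreement(mu_half_1, mu_half_1)
orthogonal = agreement(mu_half_1, mu_half_2)
```

Since only the order of magnitude of γ matters in practice, `gammas` would typically be a short list such as a few powers of ten.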

5.3. Comments

Now that we have described the model and the algorithm, we comment on both:

  1. The image formation model (equation (9)) does not assume that the noise in every image has the same variance. This flexibility accounts for the fact that different micrographs do not have identical noise.

  2. The algorithm directly uses every image to calculate the principal components. This is in contrast to an approach which relies on the covariance of the images. The latter approach requires a sufficient number of images along every projection direction to reliably calculate the image covariance. The EM approach does not.

  3. In many problems, the EM algorithm can get trapped in local minima. For some problems, the EM algorithm has to be run from multiple initializations to get a good estimate of the parameters. Our experience with simulated and real data is that the algorithm appears to converge reliably from a single random initialization.

  4. Two practical issues arise when the EM algorithm is applied to real cryo-EM data: First, some noise from the background inevitably propagates into the estimate of the principal component. To prevent this, a mask can be created loosely around the mean structure and the 3D principal components estimated only in the mask. Second, contrast between the particle and the solvent can be different in different images, and these give rise to a “contrast principal component”. This principal component usually appears as a change in the amplitude of the mean structure. To avoid this component, the reconstructed principal components can be constrained to be orthogonal to the mean structure. This is easily done by modifying equation (21) to
    $$P(\nabla_{\mu_n} Q) = \nabla_{\mu_n} Q - \sum_{i=0}^{n-1} \frac{\mu_i^T \nabla_{\mu_n} Q}{\mu_i^T\mu_i}\,\mu_i, \tag{26}$$
    where μ0 = μs is the mean structure.

6. Experimental Results

The performance of the EM algorithm for determining principal components was evaluated with simulations and with two real cryo-EM datasets. In all cases, first, the regularization constant was determined by the cross-validation procedure, and then the EM algorithm was used to estimate the principal components. The algorithm was implemented in MATLAB and run on a single desktop computer. The forward and back projection operations were parallelized, i.e. Aj and Ajᵀ were implemented in parallel, with 6 MATLAB workers. The rest of the algorithm was not parallelized. The execution times per principal component for the three data sets are shown in Table 1.

Table 1.

Execution times for the algorithm per principal component. The algorithm was implemented in MATLAB, with the forward and back projection operators implemented in the spatial domain in parallel by 6 MATLAB workers. The rest of the algorithm was not parallelized.

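| Data Set          | Num. Images | Image Size | Num. Proj. Dirs | Time per component |
|-------------------|-------------|------------|-----------------|--------------------|
| Simulation (3VG9) | 19264       | 70 × 70    | 301             | 4.7 min.           |
| 70S Ribosome      | 10000       | 130 × 130  | 309             | 40.9 min.          |
| RdRP              | 30000       | 128 × 128  | 1126            | 96.1 min.          |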

We adopt the following convention to present results of the EM algorithm: The structures mean structure ± 2√λk ek are ±2 standard deviations away from the mean structure along the kth principal component. We present all estimated principal components as this pair of structures in a figure. In addition, principal components that are estimated from real cryo-EM data are visualized by creating a movie that cyclically morphs from mean structure + 2√λk ek to mean structure − 2√λk ek in a linear fashion. The morph tool in Chimera (Pettersen 2004; CHIMERA) can be easily used to do this. The movie is available as supplementary information.
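The pair of display structures, and intermediate frames for a linear morph, follow directly from an estimated μk. Because ‖μk‖ = √λk and ek = μk/‖μk‖, the displacement √λk ek is μk itself. The sketch below is a minimal illustration with hypothetical names and toy data; it produces volumes only and does not reproduce the Chimera morph tool.

```python
import numpy as np

def morph_frames(mean, mu_k, num_frames=9):
    """Volumes moving linearly from mean + 2*mu_k to mean - 2*mu_k.

    mean and mu_k are arrays of the same shape; mu_k has norm sqrt(lambda_k),
    so 2*mu_k is two standard deviations along the kth principal component.
    """
    weights = np.linspace(2.0, -2.0, num_frames)
    return [mean + w * mu_k for w in weights]

# Toy volumes (assumption): random 4 x 4 x 4 arrays.
rng = np.random.default_rng(0)
mean = rng.standard_normal((4, 4, 4))
mu_k = rng.standard_normal((4, 4, 4))

frames = morph_frames(mean, mu_k)
```

Playing the frames forward and then backward gives the cyclic movie described above.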

6.1. Principal Components of Simulated Data

For simulations, the atomic structure (3VG9) of the human adenosine A2A receptor with an allosteric inverse-agonist antibody was downloaded from the PDB (Hino 2012). Systematic changes were made to the structure to simulate heterogeneity. These changes are described in detail below. The changes to the structure are not biologically motivated, but instead are meant to represent uncorrelated stoichiometric and conformational changes. Our hope is to recover the uncorrelated changes as distinct principal components.

The structure of 3VG9 is shown in figure 2a. It contains a receptor (A) and an antibody fragment (B) which are approximately separated by a plane (P). Conformational heterogeneity was simulated by rotating the antibody fragment B around the rotational axis shown in figure 2a. Eight rotations, uniformly spaced in the range ±20 degrees, were applied to the atomic coordinates of the antibody fragment. A density map was created for each volume by using a simulator of solvated protein (Shang 2012). Then the volumes were low pass filtered to simulate limited resolution (voxel size: 2.5 Å cubed). Finally, the d.c. component of the volume was eliminated by high pass filtering to mimic solvent contrast. This resulted in eight volumes, each of size 70 × 70 × 70, representing purely conformational change. To model stoichiometric change, a volume was masked out of the rigid receptor structure. This volume was shifted and added near the receptor to a copy of the eight volumes. The entire set of the original eight volumes without the extra density plus the new eight volumes with the extra density were taken as sixteen volumes of a heterogeneous particle. Note that in this collection, there is a pair of volumes for every rotation of the antibody fragment - one volume without the extra density and one volume with the extra density. That is, the stoichiometric and conformational changes in this set are uncorrelated.

Figure 2.


The structure 3VG9 used in the simulation. (a) The ribbon structure of 3VG9. “A” is the receptor, “B” is the antibody fragment. The two are separated by the plane “P”. (b) The simulated density of 3VG9 as surface rendering and slices. The density is obtained by simulating a solvation model, followed by low pass filtering to simulate a 2.5Å cubed voxel, and then high pass filtering to simulate contrast with the solvent. (c) The antibody fragment B rotated around the axis shown in the figure by ±20 degrees in eight steps (only the extreme positions are shown). The eight densities obtained thus were duplicated and (d) an extra density added to the duplicated densities. The resulting 16 volumes were used in the simulation.

Figure 2 b–d illustrates the volumes. Figure 2b shows the original 3VG9 volume after applying the solvation model, low pass filtering, and solvent contrast filtering. Figure 2c shows the extreme ±20 degree rotations of the antibody fragment, and Figure 2d shows the same volumes with the extra density near the top.

The sixteen volumes were projected along 301 projection directions in the northern hemisphere to produce 70 × 70 images of pixel size 2.5 Å. Assuming a voltage of 300 kV, four CTFs were simulated with defocus values of 1.2, 1.45, 1.7, and 2.0 μm respectively. Each CTF was applied to every projection image. Equal amounts of noise were added to the projection images before and after the CTF was applied. Figure 3a shows a typical spectrum of noise in one of the images. The spectrum clearly shows a CTF-filtered noise component on top of a white noise component (Zeng 2007). A number of simulations were carried out at different signal-to-noise ratios. For brevity, here we report only the results of the lowest signal-to-noise ratio. This signal-to-noise ratio is SNR = 0.03 × 1/2, where the “× 1/2” term represents the fact that noise (corresponding to SNR = 0.03) was added before applying the CTF as well as after applying the CTF. Figure 3 b–c show typical noisy low- and high-defocus images. Figure 3 d–e show the underlying noise-free CTF-filtered images. These images are included only to visually assess the amount of noise in Figure 3 b–c. The entire simulation resulted in 16 (volumes) × 301 (projections) × 4 (CTFs) = 19264 images.

Figure 3.


The images used in the simulation. (a) Typical radially-averaged noise spectral density in the simulation. Noise was added before and after applying the CTF. (b)–(c) typical low and high defocus noisy images in the simulation. (d)–(e) are the underlying noise-free images corresponding to (b) and (c).

Finally, the mean volume and the “ground-truth” three-dimensional principal components were calculated from the sixteen volumes (without noise). Figure 4a shows the mean volume. Figures 4 b–c show the mean volume ±2λ1e1 where λ1 and e1 are the first principal value and component. Figures 4 d–e show the mean volume ±2λ2e2 where λ2 and e2 are the second principal value and component. The first principal component captures the presence/absence of the extra density without capturing any associated rotation of the antibody fragment. The second principal component captures the rotation of the antibody fragment without any change in the extra density. Thus the stoichiometric and conformational changes are captured in separate principal components. Figures 4 f–g show slices through the volumes of the first and second principal components. The slices are arranged raster-wise with the top left being the top slice and the bottom right being the bottom slice.
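The ground-truth calculation described above can be sketched as follows. This is a minimal illustration rather than the code used in the paper: the function name `principal_components`, the random stand-in volumes, and the 1/N normalization of the principal values (so that λk is the standard deviation along ek) are all assumptions.

```python
import numpy as np

def principal_components(volumes, n_components=2):
    """Mean, principal values, and principal components of a stack of volumes.

    volumes: array of shape (N, nx, ny, nz).
    """
    n = volumes.shape[0]
    flat = volumes.reshape(n, -1).astype(float)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # SVD of the centered data; the rows of vt are unit-norm principal components.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Assumption: principal value = standard deviation along the component,
    # computed with the 1/N (population) normalization.
    lam = s / np.sqrt(n)
    shape = volumes.shape[1:]
    comps = vt[:n_components].reshape((n_components,) + shape)
    return mean.reshape(shape), lam[:n_components], comps

# Hypothetical stand-in for the sixteen noise-free volumes (random data).
rng = np.random.default_rng(0)
vols = rng.normal(size=(16, 8, 8, 8))
mu, lam, e = principal_components(vols)

# The pair of structures displayed for the first component: mean ± 2 λ1 e1.
plus = mu + 2 * lam[0] * e[0]
minus = mu - 2 * lam[0] * e[0]
```

With real volumes, `plus` and `minus` would be the two surfaces rendered side by side for each component.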

Figure 4.


The “ground truth” from the simulated volumes. (a) The mean of sixteen volumes, μ. (b) The mean volume −2λ1e1. (c) The mean volume +2λ1e1. (b) and (c) show that the first principal component captures the presence/absence of the extra density without capturing any associated rotation of the antibody fragment. (d) The mean volume −2λ2e2. (e) The mean volume +2λ2e2. (d) and (e) show that the second principal component captures the rotation of the antibody fragment without any change in the extra density. (f) and (g) are slices through the volumes of e1 and e2 respectively.

Next, the 19264 noisy images were CTF-corrected using Wiener filtering, and the CTF-corrected images were used with the EM algorithm to estimate the three-dimensional principal components and principal values. Table 2 shows the principal values estimated by the algorithm as well as the “ground truth” principal values. Loosely speaking, a principal value is the amount by which heterogeneity extends along the corresponding principal component. Figures 5 a–b show the mean volume ±2λ̂1ê1 where λ̂1 and ê1 are the estimated first principal value and component. Similarly, figures 5 c–d show the mean volume ±2λ̂2ê2 where λ̂2 and ê2 are the estimated second principal value and component. Figures 5 e–f show slices through ê1 and ê2 respectively. Comparing the estimated principal component slices with those in figure 4 f–g reveals that the estimated components correspond very well with the ground truth components, although some noise has propagated into the “background” in spite of the regularization term. This is expected, since the regularization term reduces noise propagation but does not eliminate it entirely.

Table 2.

Ground Truth and Estimated Principal Values.

| | 1st Prin. Val. | 2nd Prin. Val. |
| --- | --- | --- |
| Ground Truth | 14.45 | 7.20 |
| Estimated | 12.61 | 9.95 |

Figure 5.


Estimates from the EM algorithm. The estimated principal values and components are λ̂k and êk for k = 1, 2. (a)–(b) show the mean volume ±λ̂1ê1. Similar to the ground truth, this estimated first principal component captures the presence/absence of the extra density without any motion of the antibody fragment. (c)–(d) show the mean volume ±λ̂2ê2. The estimated second principal component captures the rotation of the antibody fragment without any change in the extra density. (e)–(f) show the slices through the estimated first and second principal components respectively. These are similar to the slices shown in figure 4 f–g; however, some noise has propagated into all of the slices.

How similar an estimated principal component is to the “ground truth” principal component can be evaluated by simply calculating the absolute value of the inner product between the two. That is, if êk is the estimated principal component and ek the “ground truth” principal component, then their similarity is measured by |e_k^T êk|. Since êk and ek are unit norm, the inner product is just the cosine of the angle between them. Furthermore, the absolute value compensates for the sign ambiguity of the principal components. The performance measure |e_k^T êk| takes values between 0 and 1, with high values reflecting greater similarity.

The first row of Table 3, labeled “Raw”, shows the above performance measure for the estimated principal components. The noisy backgrounds of the estimated principal components reduce their similarity to ground truth. To demonstrate this, we calculated the performance measure after eliminating the noisy background as follows: the ground truth principal component was thresholded to produce a mask whose voxels took the value 1 when the absolute value of the corresponding voxel of the component was greater than 5% of the largest absolute value over all voxels, and 0 otherwise (that is, the mask m has value m(u) = 1 at voxel u if the ground truth principal component satisfies |ek(u)| > 0.05 × max_u |ek(u)|, else m(u) = 0). This produced a binary mask in which the background was completely suppressed. The mask was then applied to the ground truth and to the reconstructed principal components. The masked components were scaled to have unit norm, and the performance measure was recalculated with these components. The second row of Table 3, labeled “Masked”, shows the masked performance measures. It is clear that with the background masked out, the components are much more similar, with the absolute value of the inner product close to 1.
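The raw and masked performance measures can be sketched as below. The helper names `similarity` and `masked_similarity` and the synthetic volumes are illustrative assumptions; only the 5% threshold and the renormalization step come from the text.

```python
import numpy as np

def similarity(a, b):
    """|a^T b| after scaling both volumes to unit norm."""
    a = a.ravel() / np.linalg.norm(a)
    b = b.ravel() / np.linalg.norm(b)
    return abs(float(a @ b))

def masked_similarity(truth, estimate, frac=0.05):
    """Similarity restricted to voxels where |truth| > frac * max|truth|."""
    mask = np.abs(truth) > frac * np.abs(truth).max()
    # similarity() rescales the masked volumes to unit norm.
    return similarity(truth * mask, estimate * mask)

# Hypothetical example: a compact "foreground" block as ground truth, and an
# estimate with flipped sign plus noise spread over the whole volume.
rng = np.random.default_rng(1)
truth = np.zeros((16, 16, 16))
truth[6:10, 6:10, 6:10] = 1.0
truth /= np.linalg.norm(truth)
estimate = -(truth + 0.01 * rng.normal(size=truth.shape))

raw = similarity(truth, estimate)
masked = masked_similarity(truth, estimate)
```

In this toy case the sign flip does not affect either measure, and masking raises the value because the background noise no longer contributes to the norm of the estimate.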

Table 3.

Absolute values of inner product between ground truth and estimated principal components

| | 1st Prin. Comp. | 2nd Prin. Comp. |
| --- | --- | --- |
| Raw | 0.91 | 0.89 |
| Masked | 0.95 | 0.95 |

Next, we carried out simulations in which we investigated the sensitivity of the estimated principal components to the amount of heterogeneity. Recall that in the simulation discussed above, the antibody fragment density was rotated by ±20 degrees. Now, we created three additional volume sets (sixteen volumes per set) with antibody fragment rotation in the ranges ±10, ±5, and ±2.5 degrees respectively. The extra density was added in an uncorrelated manner as above. Thus, these sets of volumes contain different amounts of conformational heterogeneity with a fixed amount of stoichiometric heterogeneity. Noisy images at the four CTFs mentioned above were generated from each volume set at SNR = 0.03 × 1/2, also as above. The principal components were recovered using the EM algorithm for each set. Figure 6a plots the absolute value of the inner product of the estimated rotation principal component and the ground truth rotation principal component vs. the extent of antibody fragment rotation. Raw and masked inner products are shown. Results from the ±20 degree rotations, which are available from the simulation discussed above, are also added to the figure. The masked results in figure 6a clearly show that the algorithm is able to recover the relevant principal component even as the conformational change becomes smaller.

Figure 6.


Sensitivity of the estimated principal components to the amount of heterogeneity. (a) Absolute value of the inner product between the estimated principal component and the ground truth principal component as a function of the extent of antibody fragment rotation. (b) Absolute value of the inner product between the estimated principal component and the ground truth principal component as a function of the percentage of extra density. The antibody fragment rotation is in the range ±20 degrees.

Finally, we created volume sets in which the antibody fragment rotation was fixed in the range ±20 degrees, but the extent of extra density was reduced to 50% and 25% of that in the simulation above. Again, noisy images at the four CTFs mentioned above were generated at SNR = 0.03 × 1/2, and principal components were recovered using the EM algorithm. Figure 6b shows the absolute value of the inner product between the recovered principal component that modeled the extra density and the ground truth principal component as a function of the percentage of extra density. Raw and masked inner products are shown. The 100% extra density results are from the simulation above. Figure 6b suggests that the algorithm is able to recover the relevant principal component as the mass change becomes smaller.

6.2. Principal Components of the 70S Ribosome

Next we evaluated the performance of the algorithm using real cryo-EM data. The data set used in this experiment is a subset of the data reported in (Agrawal 1999) and is publicly available. The subset contained 10000 images, half of which were cryo-EM images of the 70S Ribosome with Elongation Factor-G (EF-G) and half without Elongation Factor-G. The images, which are 130 × 130 pixels, were pre-processed in two steps. First, they were Wiener-filtered to obtain CTF-corrected images. Next, the images were low-pass filtered at 15Å to improve the SNR. This cutoff frequency was similar to the reported resolution of 17.5 − 18.4Å in (Agrawal 1999).
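The two pre-processing steps can be sketched as follows. This is only an illustration under stated assumptions: the CTF is taken as a given array sampled on the FFT grid, the Wiener filter uses a single scalar SNR, the low-pass filter is a hard cutoff, and the pixel size in the example is a hypothetical value. The paper does not specify these details, and the function names are invented for this sketch.

```python
import numpy as np

def wiener_ctf_correct(image, ctf, snr):
    """Wiener-filter an image given its CTF sampled on the FFT grid."""
    spectrum = np.fft.fft2(image)
    # Standard Wiener filter: conj(H) / (|H|^2 + 1/SNR).
    w = np.conj(ctf) / (np.abs(ctf) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft2(w * spectrum))

def low_pass(image, pixel_size, cutoff):
    """Zero all spatial frequencies above 1/cutoff (pixel_size, cutoff in Å)."""
    fy = np.fft.fftfreq(image.shape[0], d=pixel_size)
    fx = np.fft.fftfreq(image.shape[1], d=pixel_size)
    radius = np.hypot(fy[:, None], fx[None, :])
    keep = radius <= 1.0 / cutoff  # assumption: hard (ideal) cutoff
    return np.real(np.fft.ifft2(np.fft.fft2(image) * keep))

# Hypothetical usage on a random 130 x 130 image with a placeholder CTF.
rng = np.random.default_rng(0)
img = rng.normal(size=(130, 130))
ctf = np.ones((130, 130))          # placeholder; a real CTF depends on defocus
corrected = wiener_ctf_correct(img, ctf, snr=0.1)
filtered = low_pass(corrected, pixel_size=3.0, cutoff=15.0)  # 3.0 Å is assumed
```

A smoother roll-off (for example a Gaussian or cosine edge) would reduce ringing compared with the hard cutoff used here.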

To start, two 3D reconstructions were obtained from the images using the cryo-EM reconstruction package SPIDER (Shaikh 2008). The first 3D reconstruction was from the 5000 images of the ribosome with EF-G and the second reconstruction was from the 5000 images of the ribosome without EF-G. The two reconstructions are shown in figure 7. The presence and absence of EF-G is clearly visible, as is the ratcheting of the 30S. These two reconstructions are used to evaluate the results of the principal component algorithm.

Figure 7.


Two separate reconstructions from images containing the 70S Ribosome with and without EF-G. The volumes are displayed with the stalk in the vertical position and the 30S in the plane of the paper. In this orientation, (a) the presence and (b) the absence of the EF-G is clearly visible. The ratcheting of the 30S is also visible. Cross hairs are added to fixed pixel locations to make it easier to visualize the ratcheting. The motion of the L1 is also visible. The difference between these two reconstructions is likely to be similar to the 1st reconstructed principal component.

Next, the 10000 images were pooled together and a single particle was reconstructed using 309 projection directions in the northern hemisphere. Then, the class mean for each projection direction was subtracted from the images aligned to that direction, and the mean-subtracted, aligned images were used with the EM algorithm. Initial experimentation with reconstructing the first five principal components showed that only the first two components were meaningful; the remaining components showed no biologically plausible changes and were discarded. The first two principal components are discussed in detail below.
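The class-mean subtraction step can be sketched as below, assuming the images have already been aligned and each carries an integer label for its assigned projection direction. The function name, array layout, and random data are assumptions made for illustration.

```python
import numpy as np

def subtract_class_means(images, direction_ids):
    """Subtract, from each image, the mean of all images assigned to its direction.

    images: (N, h, w) aligned images; direction_ids: (N,) integer labels.
    """
    out = images.astype(float).copy()
    for d in np.unique(direction_ids):
        idx = direction_ids == d
        out[idx] -= out[idx].mean(axis=0)
    return out

# Hypothetical example: 60 random images spread over 3 projection directions.
rng = np.random.default_rng(0)
imgs = rng.normal(loc=5.0, size=(60, 8, 8))
dirs = np.repeat(np.arange(3), 20)
residuals = subtract_class_means(imgs, dirs)
```

After this step each direction's residual images average to zero, so only the deviation from the mean structure remains as input to the principal component estimation.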

Almost all of the structural changes captured by the first two principal components are best visualized by orienting the ribosome such that its stalk is vertically upwards, and then rotating the ribosome around the stalk. This is how the results are presented below.

The first principal component

Figure 8a–b shows the mean ±2λ̂1ê1 structures. The figure clearly shows that the first principal component captures the binding/unbinding of the EF-G. Notice the close similarity of the structures in figure 8a–b to the structures in figure 7.

Figure 8.


Estimated first principal component from the 70S data shown as mean ±2λ̂1ê1. This principal component clearly captures the binding and unbinding of the EF-G. The principal component also captures associated conformational changes. They are presented below in fig. 9 and fig. 10.

In addition to the binding/unbinding of the EF-G, the first principal component also captures associated conformational changes of the ribosome. The following conformational changes are clearly apparent:

  1. The most significant conformational change is the ratcheting of the 30S subunit with respect to the 50S subunit. Figure 9a–b shows the mean ±2λ̂1ê1 structures from an angle that visualizes the 30S subunit straight on. Cross-hairs are added to the figures in fixed spatial position. The density immediately below the cross-hairs corresponds to the 30S subunit, so that the 30S motion can be assessed with reference to the cross-hairs. Arrows are also added to the figures illustrating the apparent movement of the 30S from figure 9a to figure 9b. The 30S seems to rotate clockwise around an axis that is perpendicular to the page, and which passes approximately through the center of the particle. Figure 9c–d assesses the movement of the 50S subunit. The viewing angle in figure 9c–d is the polar opposite of that of figure 9a–b. In effect, if figure 9 a–b are the “front” images, then figure 9 c–d are the “back” images of the mean ±2λ̂1ê1 densities. Here too, cross-hairs are added to the figure to aid comparison, and the density immediately under the cross-hairs is the 50S subunit. When compared with the cross-hairs, it is clear that the 50S subunit does not exhibit any rotation.

    Thus, the ratcheting motion of the 30S subunit with respect to the 50S subunit is captured by the first principal component.

  2. An equally significant conformational change is the motion of the L1 subunit towards and away from the main body of the ribosome. Figure 10 shows that this change is also captured by the first principal component. This motion of the L1 subunit is also apparent in the separate reconstructions of figure 7(a)–(b).

  3. Finally, there is a thinning-thickening of the ribosome stalk as the EF-G binds to and unbinds from the ribosome. The first principal component captures this clearly, as also seen in figure 9 a–b.

Figure 9.


The ratcheting of the 30S subunit with respect to the 50S subunit is captured by the first principal component. (a) and (b) show the mean ±2λ̂1ê1 densities from an angle that visualizes the 30S subunit straight on. The cross-hairs in (a) and (b) are in fixed spatial positions and can be used to assess the motion of the underlying density. (b) also contains arrows suggesting the apparent direction of motion of the densities under the cross-hairs. (c) and (d) show the mean ±2λ̂1ê1 densities from an angle that visualizes the 50S subunit straight on. The 50S subunit is apparently stationary. Thus the first principal component captures the ratcheting of the 30S subunit with respect to the 50S subunit.

Figure 10.


The motion of the L1 subunit captured by the first principal component. (a) and (b) show the mean ±2λ̂1ê1 densities from an angle that visualizes the L1 protein subunit (which is contained in the red circle in (a) and (b)). The cross-hairs are spatially fixed, and the translational motion of the L1 is quite apparent when compared to the cross-hairs. The particle is viewed after the rotation as shown from the positions in figure 8.

All of these conformational changes are known to be associated with the binding-unbinding of the EF-G to the ribosome (Agrawal 1999).

The above observations can be made more precise and quantitative. Recall that the two reconstructions, figure 7(a)–(b), are obtained from images with and without EF-G. The difference between these two reconstructions should capture the binding-unbinding of the EF-G as well as any conformational changes accompanying it. Comparing the first principal component with this difference should reveal how much of the difference is captured by the component. Figure 11a shows slices through the difference between the two reconstructions, and figure 11b shows slices through the first principal component. The similarity between the two is quite clear in the figure.

Figure 11.


Comparing the difference between the two reconstructions in figure 7 with the principal component. (a) shows slices through the difference between the two reconstructions in figure 7. (b) shows slices through the first principal component. (c) is the plot of the absolute value of the inner product between the masked difference in the reconstructions and the masked principal component, both normalized to have unit norm, as a function of the mask threshold t. The mask is obtained by thresholding the difference volume at t times the maximum absolute voxel value. (d) The mask at the threshold 0.15.

If the difference between the reconstructions of figure 7(a)–(b) is scaled to have a unit norm, then the absolute value of the inner product between it and the principal component should evaluate their similarity in a quantitative manner. As with the simulated data above, there is the problem of the noise in the background, but unlike the simulation it is now unclear what threshold to choose for masking out the background. The strategy we adopt is the following: We first scale the difference density to have a unit norm. The absolute values of the voxels of the scaled difference are thresholded at t times the maximum absolute value of all voxels. This mask is applied to the difference density as well as to the principal component, and the masked volumes are rescaled to have a unit norm. The absolute value of the inner product between the masked and rescaled volumes is plotted as a function of t. The plot is shown in figure 11c. The plot reveals that the absolute value of the inner product is higher than 0.9 for t > 0.15. Slices through the mask at t = 0.15 are shown in figure 11d. The mask captures most of the “foreground” in the difference image of figure 11a while suppressing much, but not all, of the background. This analysis provides strong evidence that most of the difference density is captured very well by the first principal component.
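The threshold sweep described above can be sketched as follows. The function name, the synthetic difference volume, and the synthetic component are assumptions; the real analysis used the difference of the two SPIDER reconstructions and the estimated first principal component.

```python
import numpy as np

def masked_inner_product_curve(diff, comp, thresholds):
    """|<masked diff, masked comp>| with both rescaled to unit norm, for each t."""
    diff = diff / np.linalg.norm(diff)
    peak = np.abs(diff).max()
    values = []
    for t in thresholds:
        mask = np.abs(diff) > t * peak
        d, c = diff * mask, comp * mask
        nd, nc = np.linalg.norm(d), np.linalg.norm(c)
        if nd == 0 or nc == 0:
            values.append(np.nan)  # mask removed everything
        else:
            values.append(abs(float(np.sum(d * c))) / (nd * nc))
    return np.array(values)

# Hypothetical data: a foreground block with weak background in the difference
# volume, and a noisier "principal component" containing the same block.
rng = np.random.default_rng(2)
block = np.zeros((16, 16, 16))
block[6:10, 6:10, 6:10] = 1.0
diff = block + 0.02 * rng.normal(size=block.shape)
comp = block + 0.10 * rng.normal(size=block.shape)

ts = np.linspace(0.0, 0.5, 11)
curve = masked_inner_product_curve(diff, comp, ts)
```

Plotting `curve` against `ts` gives the analogue of figure 11c for this toy example: the value rises as the threshold removes background voxels and then levels off once only the foreground remains.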

The second principal component

Figure 12a–b show mean ±2λ̂2ê2 densities along the second principal component. There is no real ratcheting apparent in this principal component. Instead this component seems to capture contrast variations that have some residual correlation with the binding/unbinding of the EF-G: Densities in fig. 12b seem to be “thicker” than densities in fig. 12a. Figure 12c shows slices through the second principal component. Note the dissimilarity to the slices of the difference density and to the slices through the first principal component (figure 11a–b).

Figure 12.


The estimated second principal component shown as (a) mean +2λ̂2ê2 and (b) mean −2λ̂2ê2. This component seems to capture the residual correlation between the contrast variation and the EF-G binding/unbinding. (c) shows slices through the principal component. Note the dissimilarity with the slices through the first principal component (fig. 11b).

In summary, it appears that for this dataset the first principal component captures the stoichiometric change of the EF-G binding/unbinding to the ribosome as well as correlated conformational changes in the ribosome.

6.3. Principal components of the Influenza Virus RNA Polymerase Complex

The second cryo-EM data set contains images of the influenza A RNA-dependent RNA polymerase (RdRP) (Chang 2015). RdRP is a hetero-trimer with a molecular weight of 250 kDa. In (Chang 2015), a tetrameric assembly state of the hetero-trimer was revealed to adopt a squarish shape with an approximate size of 180 × 150 × 70 Å³ having an empty space in the center. A data set, containing images with defocus values from 1.0 to 3.0 μm, was used in that study to create five classes using 3D classification in RELION. Details of the five classes are available in figure S1 of the supplemental information of (Chang 2015).

We investigated the principal components of the RdRP by analyzing a subset of images belonging to the class labelled III in the supplemental information of (Chang 2015). The subset we chose had 30036 images with defocus values between 1.0 and 2.0 μm. The images were downsampled from 256 × 256 pixels (pixel size 1.32 Å) to 128 × 128 pixels and CTF-corrected by Wiener filtering. A single mean volume was reconstructed from the 30036 images using 1126 projection directions, which were approximately uniformly distributed on a sphere (approx. 6 degree spacing). A gold standard FSC analysis suggested that our reconstructed mean volume had a resolution of 14.7 Å. The class mean for each projection direction was subtracted from the images aligned to that direction, and the mean-subtracted, aligned images were used with the EM algorithm.

The mean volume is shown in figure 13 and is similar to a higher resolution version of the structure reported in (Chang 2015). The four subcomplexes of the tetramer are labeled A,B,C,D in figure 13. The empty space in the middle of the tetramer is clearly visible. A groove, located in each subcomplex towards the tetramer center is visible in figure 13 for subcomplexes A and D. This groove is also present for subcomplexes B and C in the back (not shown). A hole in the center of the subcomplexes A,B,C,D is also visible. The pairs of subcomplexes AB and CD are referred to as dimers in (Chang 2015). The length of each dimer is larger than the distance between dimers.

Figure 13.


The mean density of the RdRP at gold standard FSC resolution 14.7 Å. The RdRP has four subcomplexes labeled A, B, C, D. AB and CD are dimers. A groove and a hole are present in each subcomplex. The groove in the subcomplexes B and C is in the back and is not shown. There is evidence in (Chang 2015) that the RdRP has conformational heterogeneity. The two dimers rock with respect to each other by changing the angle between their long axes.

Before proceeding, it is useful to comment on the five RELION classes reported in the supplementary information of (Chang 2015). Classes I through IV very clearly have a tetramer structure similar to that of figure 13 (see figure S1 in the supplement to (Chang 2015)). One key difference between the classes I–IV is that the long axes of the two dimers are at different angles with respect to each other. The angle is schematically illustrated on the right in Figure 13. That is, the two dimers appear to rock with respect to each other. The presence of dimer rocking in classes I – IV strongly suggests that the RdRP is heterogeneous. If this heterogeneity is “continuous”, i.e. if the relative angle between the dimer long axes varies continuously, then it is likely that this heterogeneity is also present in just the images of class III and principal component analysis should reveal it. Class V is qualitatively different from classes I – IV. The density for class V also has the form of a tetramer, but the density of one of the dimers is dramatically reduced. This suggests that there were some dimers in the sample preparation that had not assembled into a tetramer, and that class V captured many of the dimers along with a few tetramers. If class III images contained any such unassembled dimers, then perhaps a principal component might reveal density change in one of the two dimers in the tetramer. By coincidence, this situation is similar to the simulation of section 6.1; the dimer rocking is a continuous conformational heterogeneity, the tetramer-dimer mixture is a stoichiometric heterogeneity.

Preliminary analysis of the selected class III images with the EM algorithm revealed that the background noise (noise in the solvent region of the images) and particle-solvent contrast changes had a strong influence on the principal components. To reduce these effects, we created a loose soft mask around the mean volume and reconstructed the principal components using the soft mask and the variant of the EM algorithm discussed in comment 4 of section 5.3. The first two principal components found by the EM algorithm appear to be biologically interpretable and are discussed below. Continuous morphing of 3D densities along these principal components is contained in the movie referred to in section 6.2.

The first principal component

Figure 14 a–b shows the first principal component as mean volume ±2λ̂1ê1 densities. Cross-hairs are added to the figures to aid visual comparison. The rocking of the two dimers with respect to each other is clearly evident in the side view of the particle in Figure 14. Also evident in the figure are changes to the groove geometry, as well as changes in the holes in the center of the subcomplexes. Slices through the first principal component are shown in figure 14c. Note the lack of noise propagation into the background because of soft masking.

Figure 14.


The first principal component of RdRP. (a)–(b) Mean ±2λ̂1ê1. The cross-hairs are in stationary position in the figure. In the side view of the particle, the front dimer rotates clockwise when going from mean −2λ̂1ê1 to mean +2λ̂1ê1. The dimer in the back rotates counterclockwise. (c) Slices through the first principal component.

The first principal component clearly appears to capture the rocking of the dimers and suggests that this may be a continuous conformational change.

The second principal component

Figure 15 shows the second principal component as mean volume ±2λ̂2ê2 densities. The most significant change in the principal component is the relative thinning of the density of one dimer (the one on the right) as indicated in the figure. A comparison of the two dimers in the side view of the particle shows that there is no rocking of the two dimers. Therefore, the second principal component appears to capture the dimer presence in the sample. However, the relative thinning of the dimer in this component does not appear to be as dramatic as that in class V of (Chang 2015). This suggests that most of the dimer images are captured in class V, with comparatively far fewer dimer images in class III, the class analyzed here.

Figure 15.


The second principal component of RdRP. (a)–(b) Mean ±2λ̂2ê2. The density of the dimer on the right appears to thicken and thin along this principal component, while the density of the dimer on the left is relatively stable. The change in density is visually apparent in the locations that the arrows point to. Further, there is no rocking of the dimers apparent in the side view. This principal component appears to capture the tetramer-dimer mixing in the sample. (c) Slices through ê2.

In summary, the first two principal components of the RdRP reconstructed by the EM algorithm suggest a conformational change due to the rocking of the dimers and a stoichiometric change due to the presence of unassembled dimers in the tetramer sample.

7. Discussion and Conclusions

The processing of simulated and real cryo-EM data with the proposed algorithm suggests that the algorithm can reconstruct principal components of macromolecules from noisy cryo-EM data. The signal in the principal components is typically weaker than the signal in the mean structure, and some amount of noise inevitably percolates into the principal component estimate. The estimation technique could further benefit from selective suppression of the background noise in the component. One possibility is to use an adaptive basis technique such as in (Kucukelbir 2012). Another challenge is to account for contrast variation in the images. This is particularly important when negative staining is used. The range of contrast variation for negative staining is larger than the range for cryo-EM, and incorporating a more sophisticated contrast model in the generative model for principal components is likely to help with negative staining.

The principal components that we found so far have been biologically meaningful. But, occasionally, the interpretation of principal components can be tricky. Then, a rotation of the principal components within the subspace spanned by the principal components can be helpful. The classical principal component analysis literature contains many criteria for rotating principal components. Most of these criteria attempt to rotate the principal components so that they contain large and small loadings (loadings are the values of the coordinates of the principal components), making it easier to interpret the components. The popular varimax criterion, for example, achieves this by maximizing the sum of variances of the squared loadings. A detailed discussion of the different criteria, and their advantages and limitations, can be found in chapter 11 of (Jolliffe 2002). It would be interesting to explore this idea in the context of heterogeneous particles. For example, if a set of principal components turn out to be biologically difficult to interpret, perhaps they could be rotated in a way that their interpretation becomes easier.
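For reference, the varimax criterion mentioned above can be written out explicitly. The notation here is an assumption introduced for illustration: l_{jk} denotes the loading of coordinate j on the kth rotated component, p is the number of coordinates (voxels), and m is the number of components being rotated. This is the raw (unnormalized) form of the criterion, which is maximized over rotations within the subspace spanned by the components.

```latex
V \;=\; \sum_{k=1}^{m} \left[ \frac{1}{p} \sum_{j=1}^{p} l_{jk}^{4}
      \;-\; \left( \frac{1}{p} \sum_{j=1}^{p} l_{jk}^{2} \right)^{2} \right]
```

Each bracketed term is the variance of the squared loadings of one component, so V is large when every component has a few large loadings and many near-zero ones.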

There are also alternatives to principal component analysis. For example, independent component analysis finds components along which the data are independent rather than just uncorrelated (Hyvarinen 2001). Sparse approximations to the covariance structure are also tractable (Bien 2010). In spite of these more sophisticated alternatives, principal component analysis is often the first choice of method to understand covariance in the data.

Supplementary Material

1
2
Download video file (20.2MB, mp4)

Acknowledgements

This research was supported by the NIH grant 1R01GM095658. We would also like to thank Victoria Rudakova, Nicha Dvornek, Yunho Kim, and Lisa Berlinger for their help.

Appendix

The Fourier Slice Theorem for Covariances

This appendix contains the mathematical details of the Fourier slice theorem for Covariances.

Recall that s is a random process in three dimensions with mean $\mu_s$ and covariance function $\Sigma_s$. The projection of s onto the plane $\Pi_n$ is

$$y_n(a) = \int s(a + \sigma n)\, d\sigma, \tag{27}$$

that is, the line integral of s along the ray through a in the direction of the normal n. This makes $y_n$ a two-dimensional stochastic process defined on $\Pi_n$. The mean and covariance of $y_n$ are

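Mean:

$$\mu_{y_n}(a) = E[y_n(a)] = E\left[\int s(a + \sigma n)\, d\sigma\right] = \int E[s(a + \sigma n)]\, d\sigma = \int \mu_s(a + \sigma n)\, d\sigma, \quad \text{and}$$

Covariance:

$$\begin{aligned}
\Sigma_{y_n}(a, b) &= E\left[\left(y_n(a) - \mu_{y_n}(a)\right)\left(y_n(b) - \mu_{y_n}(b)\right)\right] \\
&= E\left[\left(\int s(a + \sigma_1 n)\, d\sigma_1 - \int \mu_s(a + \sigma_1 n)\, d\sigma_1\right) \times \left(\int s(b + \sigma_2 n)\, d\sigma_2 - \int \mu_s(b + \sigma_2 n)\, d\sigma_2\right)\right] \\
&= E\left[\left(\int \left(s(a + \sigma_1 n) - \mu_s(a + \sigma_1 n)\right) d\sigma_1\right)\left(\int \left(s(b + \sigma_2 n) - \mu_s(b + \sigma_2 n)\right) d\sigma_2\right)\right] \\
&= \iint E\left[\left(s(a + \sigma_1 n) - \mu_s(a + \sigma_1 n)\right)\left(s(b + \sigma_2 n) - \mu_s(b + \sigma_2 n)\right)\right] d\sigma_1\, d\sigma_2 \\
&= \iint \Sigma_s(a + \sigma_1 n,\, b + \sigma_2 n)\, d\sigma_1\, d\sigma_2.
\end{aligned}$$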
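The two identities above can be checked numerically on a small voxel grid. The sketch below is only an illustration under stated assumptions: the particles are drawn from a hypothetical model (a random mean plus two random modes), the projection direction n is taken along the third grid axis so that the line integral becomes a plain sum, and the helper names `simulate_volumes` and `sample_covariance` are invented for the example. Because projection is linear, the identities hold for sample means and sample covariances up to rounding error.

```python
import numpy as np


def simulate_volumes(n_grid=4, n_particles=200, n_modes=2, seed=0):
    """Hypothetical heterogeneous particles: mean + random mix of modes."""
    rng = np.random.default_rng(seed)
    mean = rng.normal(size=(n_grid,) * 3)
    modes = rng.normal(size=(n_modes,) + (n_grid,) * 3)
    coeffs = rng.normal(size=(n_particles, n_modes))
    # Result has shape (n_particles, n_grid, n_grid, n_grid).
    return mean + np.tensordot(coeffs, modes, axes=1)


def sample_covariance(samples):
    """Sample covariance over axis 0, each sample flattened to a vector."""
    flat = samples.reshape(samples.shape[0], -1)
    centered = flat - flat.mean(axis=0)
    return centered.T @ centered / (flat.shape[0] - 1)


n_grid = 4
volumes = simulate_volumes(n_grid=n_grid)

# Discrete analogue of eq. (27) with n along the third axis.
projections = volumes.sum(axis=3)

# Mean of the projections versus projection of the mean.
mean_y = projections.mean(axis=0)
mean_y_from_s = volumes.mean(axis=0).sum(axis=2)

# Covariance of the projections versus the double sum of the
# volume covariance over sigma_1 and sigma_2.
cov_s = sample_covariance(volumes).reshape((n_grid,) * 6)
cov_y = sample_covariance(projections).reshape((n_grid,) * 4)
cov_y_from_s = cov_s.sum(axis=(2, 5))

print("means agree:      ", np.allclose(mean_y, mean_y_from_s))
print("covariances agree:", np.allclose(cov_y, cov_y_from_s))
```

The six axes of `cov_s` are ordered as the three coordinates of the first point followed by the three coordinates of the second point, so summing over axes 2 and 5 integrates out $\sigma_1$ and $\sigma_2$.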

The Fourier transforms of the covariance functions are:

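$$\tilde{\Sigma}_s(\omega_1, \omega_2) = \iint e^{-i(\omega_1^T u_1 + \omega_2^T u_2)}\, \Sigma_s(u_1, u_2)\, du_1\, du_2, \quad \text{and}$$

$$\tilde{\Sigma}_{y_n}(\nu_1, \nu_2) = \iint e^{-i(\nu_1^T \upsilon_1 + \nu_2^T \upsilon_2)}\, \Sigma_{y_n}(\upsilon_1, \upsilon_2)\, d\upsilon_1\, d\upsilon_2.$$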

Let ω1, ω2 be two frequencies in the three-dimensional Fourier domain and n be a unit vector perpendicular to ω1, ω2. The plane through the origin of the Fourier domain that is perpendicular to n then contains ω1, ω2 (fig. 1b). Further, let Πn be a plane perpendicular to n in the spatial domain, and let υ1 and υ2 be two points in Πn. Set u1 = υ1 + σ1n and u2 = υ2 + σ2n, and set the differential volumes du1 and du2 to du1 = dυ1dσ1 and du2 = dυ2dσ2. Then,

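$$\begin{aligned}
\tilde{\Sigma}_s(\omega_1, \omega_2) &= \iint e^{-i(\omega_1^T u_1 + \omega_2^T u_2)}\, \Sigma_s(u_1, u_2)\, du_1\, du_2 \\
&= \iiiint e^{-i\left(\omega_1^T(\upsilon_1 + \sigma_1 n) + \omega_2^T(\upsilon_2 + \sigma_2 n)\right)}\, \Sigma_s(\upsilon_1 + \sigma_1 n,\, \upsilon_2 + \sigma_2 n)\, d\upsilon_1\, d\sigma_1\, d\upsilon_2\, d\sigma_2 \\
&= \iiiint e^{-i(\omega_1^T \upsilon_1 + \omega_2^T \upsilon_2)}\, e^{-i(\sigma_1 \omega_1^T n + \sigma_2 \omega_2^T n)}\, \Sigma_s(\upsilon_1 + \sigma_1 n,\, \upsilon_2 + \sigma_2 n)\, d\upsilon_1\, d\sigma_1\, d\upsilon_2\, d\sigma_2.
\end{aligned}$$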

But $\omega_1^T n = 0$ and $\omega_2^T n = 0$ because n is orthogonal to ω1 and ω2, giving

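$$\begin{aligned}
\tilde{\Sigma}_s(\omega_1, \omega_2) &= \iint e^{-i(\omega_1^T \upsilon_1 + \omega_2^T \upsilon_2)} \left\{ \iint \Sigma_s(\upsilon_1 + \sigma_1 n,\, \upsilon_2 + \sigma_2 n)\, d\sigma_1\, d\sigma_2 \right\} d\upsilon_1\, d\upsilon_2 \\
&= \iint e^{-i(\omega_1^T \upsilon_1 + \omega_2^T \upsilon_2)}\, \Sigma_{y_n}(\upsilon_1, \upsilon_2)\, d\upsilon_1\, d\upsilon_2 \\
&= \tilde{\Sigma}_{y_n}(\omega_1, \omega_2).
\end{aligned}$$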

This establishes:

Theorem 1

(Fourier slice theorem for Covariances) Let ω1, ω2 be any two points in three-dimensional Fourier space. If n is a unit length vector in the northern hemisphere perpendicular to ω1 and ω2, then

$$\tilde{\Sigma}_{y_n}(\omega_1, \omega_2) = \tilde{\Sigma}_s(\omega_1, \omega_2).$$
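A discrete version of the theorem can be verified numerically. The sketch below is an illustration under several assumptions, not a general implementation: the covariance is a hypothetical low-rank one built from two random modes, the projection direction n is the third grid axis (for which the discrete identity is exact and no interpolation is needed), and the discrete Fourier transform with kernel $e^{-i(\cdot)}$ stands in for the continuous transform. The function name `covariance_slice_check` is invented for the example.

```python
import numpy as np


def covariance_slice_check(n_grid=4, n_modes=2, seed=0):
    """Compare the Fourier transform of the projection covariance with
    the central slice of the Fourier transform of the volume covariance,
    for projection along the third grid axis."""
    rng = np.random.default_rng(seed)
    modes = rng.normal(size=(n_modes,) + (n_grid,) * 3)

    # Hypothetical covariance: Sigma_s(u1, u2) = sum_k v_k(u1) v_k(u2).
    # Axes are (u1_x, u1_y, u1_z, u2_x, u2_y, u2_z).
    cov_s = np.einsum("kabc,kdef->abcdef", modes, modes)

    # Projection covariance: sum over sigma_1 (axis 2) and sigma_2 (axis 5).
    cov_y = cov_s.sum(axis=(2, 5))

    # Fourier transforms over every spatial axis of each covariance.
    ft_s = np.fft.fftn(cov_s)
    ft_y = np.fft.fftn(cov_y)

    # Frequencies perpendicular to n have zero third component, so the
    # relevant slice fixes both z-frequency indices at zero.
    central_slice = ft_s[:, :, 0, :, :, 0]
    return ft_y, central_slice


ft_y, central_slice = covariance_slice_check()
print("slice theorem holds:", np.allclose(ft_y, central_slice))
```

For projection directions that are not aligned with a grid axis, the same comparison would require interpolating the three-dimensional transform onto the slice, which this sketch does not attempt.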

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

The vector n is unique if ω1, ω2 are linearly independent. If ω1, ω2 are linearly dependent, then there is more than one such n. All that is required for the theorem is that there exist at least one such n.

References

  1. Scheres SHW, Gao H, Valle M, Herman GT, Eggermont PPB, Frank J, Carazo JM. Disentangling Conformational States of Macromolecules in 3D-EM through Likelihood Optimization. Nat. Methods. 2007;4:27–29. doi: 10.1038/nmeth992.
  2. Scheres SHW. A Bayesian View on Cryo-EM Structure Determination. Journ. Mol. Biol. 2012;415:406–418. doi: 10.1016/j.jmb.2011.11.010.
  3. Scheres SHW. RELION: Implementation of a Bayesian Approach to Cryo-EM Structure Determination. Journ. Struct. Biol. 2012;180:519–530. doi: 10.1016/j.jsb.2012.09.006.
  4. Lyumkis D, Brilot AF, Theobald DL, Grigorieff N. Likelihood-based Classification of cryo-EM images using FREALIGN. Journ. Struct. Biol. 2013;183:377–388. doi: 10.1016/j.jsb.2013.07.005.
  5. Penczek PA, Kimmel M, Spahn CMT. Identifying Conformational States of Macromolecules by Eigen-analysis of Resampled cryo-EM Images. Structure. 2011;19(11):1582–1590. doi: 10.1016/j.str.2011.10.003.
  6. Dashti A, Schwander P, Longlois R, et al. Trajectories of the Ribosome as a Brownian Nanomachine. Proc. Natl. Acad. Sci. 2014 Dec;111(49):17492–17497. doi: 10.1073/pnas.1419276111.
  7. Zeng Y, Wang Q, Doerschuk PC. Three dimensional reconstruction of the statistics of heterogeneous objects from a collection of one projection image of each object. Journ. Opt. Soc. Am. A. 2012;29(6):959–970. doi: 10.1364/JOSAA.29.000959.
  8. Wang Q, Matsui T, Domitrovic T, Zeng Y, Doerschuk PC, Johnson JE. Dynamics in cryo EM reconstructions visualized with maximum-likelihood derived variance maps. Journ. Struct. Biol. 2013;181:195–206. doi: 10.1016/j.jsb.2012.11.005.
  9. Katsevich E, Katsevich A, Singer A. Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem. SIAM Journal on Imaging Sciences. 2015;8(1):126–185. doi: 10.1137/130935434.
  10. Anden J, Katsevich E, Singer A. Covariance estimation using conjugate gradient for 3D classification in Cryo-EM. doi: 10.1109/ISBI.2015.7163849. arXiv:1412.0985v2.
  11. Brooks B, Karplus M. Normal Modes for Specific Motions of Macromolecules: Application to the Hinge-Bending Mode of Lysozyme. Proc. Natl. Acad. Sci. 1985;82:4995–4999. doi: 10.1073/pnas.82.15.4995.
  12. Chacon P, Tama F, Wriggers W. Mega-Dalton Biomolecular Motion Captured from Electron Microscopy Reconstructions. J. Mol. Biol. 2003;326:485–492. doi: 10.1016/s0022-2836(02)01426-2.
  13. Jin Q, Sorzano COS, Rosa-Trevin JM, Bilbao-Castro JR, Nunez-Ramirez R, Llorca O, Tama F, Jonic S. Iterative Elastic 3D-to-2D Alignment Method Using Normal Modes for Studying Structural Dynamics of Large Molecular Complexes. Structure. 2014;22:496–506. doi: 10.1016/j.str.2014.01.004.
  14. Jolliffe IT. Principal Component Analysis. Springer Series in Statistics; 2002.
  15. Tipping ME, Bishop CM. Probabilistic Principal Component Analysis. Journ. Roy. Stat. Soc., Series B. 1999;61(3):611–622.
  16. Basilevsky A. Statistical Factor Analysis and Related Methods: Theory and Applications. Wiley; 1994.
  17. McLachlan G, Krishnan T. The EM Algorithm and Extensions. Wiley-Interscience; 2008.
  18. Hino T, Arakawa T, Iwanari H, et al. G-protein-coupled Receptor Inactivation by an Allosteric Inverse-agonist Antibody. Nature. 2012;482(7384):237–240. doi: 10.1038/nature10750.
  19. Shang Z, Sigworth FJ. Hydration-layer Models for Cryo-EM Simulation. J Struct Biol. 2012 Oct;180(1):10–16. doi: 10.1016/j.jsb.2012.04.021.
  20. Zeng X, Stahlberg H, Grigorieff N. A Maximum likelihood Approach to Two-dimensional Crystals. J Struct Biol. 2007;160:362–374. doi: 10.1016/j.jsb.2007.09.013.
  21. Agrawal RK, Heagle AB, Penczek P, Grassucci RA, Frank J. EF-G-dependent GTP hydrolysis induces translocation accompanied by large conformational changes in the 70S ribosome. Nature Structural Biology. 1999 Jul;6(7):643–647. doi: 10.1038/10695.
  22. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera a visualization system for exploratory research and analysis. J Comput Chem. 2004 Oct;25(13):1605–1612. doi: 10.1002/jcc.20084.
  23. http://www.cgl.ucsf.edu/chimera.
  24. Shaikh TR, Gao H, Baxter WT, Asturias FJ, Boisset N, Leith A, Frank J. SPIDER image processing for single-particle reconstruction of biological macromolecules from electron micrographs. Nat. Protoc. 2008;3(12):1941–1974. doi: 10.1038/nprot.2008.156.
  25. Chang S, Sun D, Liang H, et al. Cryo-EM Structure of Influenza Virus RNA Polymerase Complex at 4.3Å Resolution. Molecular Cell. 2015 Mar 5;57:925–935. doi: 10.1016/j.molcel.2014.12.031.
  26. Kucukelbir A, Sigworth FJ, Tagare HD. A Bayesian adaptive basis algorithm for single particle reconstruction. Journ. Struct. Biol. 2012;179(1):56–67. doi: 10.1016/j.jsb.2012.04.012.
  27. Hyvarinen A, Karhunen J, Oja E. Independent Component Analysis: Algorithms and Applications. Wiley; 2001.
  28. Bien J, Tibshirani R. Sparse Estimation of a Covariance Matrix. Biometrika. 2010. doi: 10.1093/biomet/asr054.
