Author manuscript; available in PMC: 2010 Jan 10.
Published in final edited form as: J Microsc. 2009 Mar;233(3):391–403. doi: 10.1111/j.1365-2818.2009.03137.x

Estimating contrast transfer function and associated parameters by constrained non-linear optimization

C YANG *, W JIANG , D -H CHEN , U ADIGA *, E G NG *, W CHIU
PMCID: PMC2804061  NIHMSID: NIHMS164704  PMID: 19250460

Summary

The three-dimensional reconstruction of macromolecules from two-dimensional single-particle electron images requires determination and correction of the contrast transfer function (CTF) and envelope function. A computational algorithm based on constrained non-linear optimization is developed to estimate the essential parameters in the CTF and envelope function model simultaneously and automatically. The application of this estimation method is demonstrated with focal series images of amorphous carbon film as well as images of ice-embedded icosahedral virus particles suspended across holes.

Keywords: Contrast transfer function, cryo-electron microscopy, parameter estimation

Introduction

The two-dimensional (2D) images collected from an electron microscope (EM) are not perfect 2D projections of the three-dimensional (3D) structure. Each experimentally collected image can be treated as a modulated projection with noise. The modulation of the image is determined by a number of factors that are related to the EM settings and imaging conditions (Chiu, 1978; Saad et al., 2001). The modulation process has been modelled mathematically as the contrast transfer function (CTF) (Erickson & Klug, 1970; Thon, 1971) and the envelope function (Hanszen, 1967). Each of these functions contains a number of parameters affecting the image contrast and quality. In this work, we assume that the CTF modulation is invariant across the entire micrograph. This assumption is generally valid for single-particle cryo-EM studies in which the grid plane is normal to the incident electron beam. However, the CTF must be considered position-dependent in tomography studies, where the sample grid is purposely tilted, and even in some single-particle studies if the grid is severely bent.

The CTF parameter estimation problem is essentially a non-linear curve-fitting problem. A number of schemes have been proposed to solve this problem for single-particle imaging (Zhu et al., 1997; Conway & Steven, 1999; Ludtke et al., 1999; Huang et al., 2003; Sander et al., 2003; Velazquez-Muriel et al., 2003; Mallick et al., 2005). However, most of these schemes involve some ad hoc or manual fitting steps rather than state-of-the-art numerical optimization algorithms, which can fit the parameters objectively and accurately. As a result, parameter determination becomes difficult, especially when the experimental images are collected near focus, where only one or two CTF rings are apparent.

This paper describes the use of efficient and accurate numerical optimization techniques to estimate these parameters by treating the estimation problem as a constrained non-linear optimization problem. Such an approach was perceived as infeasible or too computationally demanding in the past. Our experimental results demonstrate that this can be done reliably, efficiently and automatically.

Problem formulation

A thin biological specimen, consisting mostly of elements with low atomic number (C, N and O), can be approximated as a weak-phase object for transmission electron microscopy. For weak-phase objects, the mathematical model that describes the relationship between the object potential function and the observed image has been well established (Erickson & Klug, 1970; Hanszen & Trepte, 1971; Thon, 1971). In order to demonstrate our approach, we will describe both the well-known and the derived formulations.

Mathematical model for image contrast

In the image contrast theory, the 2D Fourier transform of an image, which we denote by I(s), can be related linearly to the structure factor of the specimen, F(s), through the expression

$$I(\mathbf{s}) = F(\mathbf{s})\,H(\mathbf{s}) + N(\mathbf{s}), \tag{1}$$

where H(s) is the modulation function characterizing the instrument and experimental conditions, and N(s) is the noise function originating from various sources including surrounding buffer, electron inelastic scattering and recording media. Here, the bold-faced s denotes a 2D frequency vector. This is to be distinguished from the non-bold-faced s, which denotes a one-dimensional (1D) spatial frequency scalar.

Note that I(s), F(s) and N(s) are all complex-valued functions. In this paper, we assume that the microscope optics are well aligned during image acquisition so that H(s) is a real-valued function. The computational problem to be solved is to construct H(s) and N(s), given I(s) and F(s). Analytical expressions for H(s) exist (Erickson & Klug, 1970; Hanszen & Trepte, 1971; Thon, 1971). These expressions contain a number of unknown parameters that can be determined through a numerical fitting procedure (Saad et al., 2001). If one assumes that the projection image is not correlated with the background noise, then it follows from Eq. (1) that

$$I_2(\mathbf{s}) = F_2(\mathbf{s})\,H^2(\mathbf{s}) + N_2(\mathbf{s}), \tag{2}$$

where I2(s), F2(s) and N2(s) are the power spectra of the observed projection image, the structure factor and the background noise, respectively. Here, the power spectrum of an image is defined as the expectation value of the Fourier intensity of the image. The subscript 2 in Eq. (2) is used to indicate that functions I2(s), F2(s) and N2(s) describe mappings from the 2D frequency space to the set of real numbers. These functions are to be distinguished from the functions I(s), F(s) and N(s) that are defined in subsequent sections to describe mappings from 1D frequency to real numbers.
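The step from Eq. (1) to Eq. (2) can be spelled out as follows. This is only a sketch: it assumes that H is real and deterministic and that the noise is uncorrelated with the signal in the sense that the cross term averages to zero. The angle brackets for the expectation value and the overbar for complex conjugation are notation introduced here for illustration.

```latex
\begin{aligned}
|I(\mathbf{s})|^2
  &= |F(\mathbf{s})H(\mathbf{s}) + N(\mathbf{s})|^2 \\
  &= |F(\mathbf{s})|^2 H^2(\mathbf{s}) + |N(\mathbf{s})|^2
     + 2H(\mathbf{s})\,\mathrm{Re}\!\left[F(\mathbf{s})\,\overline{N(\mathbf{s})}\right], \\
\langle |I(\mathbf{s})|^2 \rangle
  &= \langle |F(\mathbf{s})|^2 \rangle H^2(\mathbf{s})
     + \langle |N(\mathbf{s})|^2 \rangle
     + 2H(\mathbf{s})\,\mathrm{Re}\,\langle F(\mathbf{s})\,\overline{N(\mathbf{s})} \rangle \\
  &= F_2(\mathbf{s})\,H^2(\mathbf{s}) + N_2(\mathbf{s})
  \qquad \text{when } \langle F\,\overline{N} \rangle = 0 .
\end{aligned}
```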

Equation (2) can be written in polar coordinates as

$$I_2(s,\theta) = F_2(s)\,H^2(s,\theta) + N_2(s,\theta). \tag{3}$$

Note that F2(s), which corresponds to the rotationally averaged value of the structure factor associated with the specimen, is a 1D rotationally invariant function. Such a rotationally averaged 1D structure factor can be measured in an X-ray scattering experiment on a solution suspension of the specimen (Schmid et al., 1999; Thuman-Commike et al., 1999; Saad et al., 2001). Alternatively, the structure factor can be estimated at low resolution directly from particle images and at high resolution from a model (Ludtke et al., 1999). When F2(s) is known, the parameter estimation problem becomes well defined.

The analytical function used to describe the background noise term N2(s, θ) in Eq. (3) is somewhat arbitrary and less well defined in the image contrast theory. In this paper, we extend the model defined previously (Saad et al., 2001) by including the azimuthal dependence of the functions, that is, we set

$$N_2(s,\theta) = n_3(\theta)\,e^{-n_4(\theta)s^2 - n_2(\theta)s - n_1(\theta)\sqrt{s}}, \tag{4}$$

where ni(θ)(i = 1, 2, 3, 4) are unknown parameters to be determined.

The modulation function H2(s, θ) can be defined (Erickson & Klug, 1970; Hanszen & Trepte, 1971; Thon, 1971) as

$$H^2(s,\theta) = \alpha^2\,\mathrm{CTF}^2(s,\theta)\cdot E^2(s,\theta), \tag{5}$$

where

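$$\mathrm{CTF}(s,\theta) = -\left(\sqrt{1-Q^2}\,\sin\gamma(s,\theta) + Q\cos\gamma(s,\theta)\right), \tag{6}$$
$$\gamma(s,\theta) = 2\pi\left(-\frac{C_s\lambda^3 s^4}{4} + \frac{\Delta z(\theta)\,\lambda s^2}{2}\right). \tag{7}$$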

The sin[γ(s, θ)] and cos[γ(s, θ)] terms in the CTF function in Eq. (6) are known as the phase CTF and the amplitude CTF, respectively (Erickson & Klug, 1970). The wavelength (λ) and the spherical aberration (Cs) are known constants. The unknown parameters to be estimated in Eq. (5) are the defocus (Δz), the amplitude contrast ratio (Q), the amplitude coefficient (α) and the envelope function (E (s, θ)). We should point out that Q is, in principle, dependent on the spatial frequency and atomic composition of the specimen. However, for weak-phase objects, the variation of Q with respect to these factors is so small that it can be considered as a constant parameter. The defocus Δz(θ) is anisotropic in general. It can be represented by

$$\Delta z(\theta) = \Delta z_0 + \Delta z_1 \sin\big(2(\theta - \theta_0)\big), \tag{8}$$

where Δz0 is the mean defocus of the sample, Δz1 is the focal difference due to axial astigmatism and θ0 represents the reference angle of axial astigmatism (Thon, 1971). When astigmatism is present, the power spectrum often exhibits elliptically shaped CTF rings.
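To make the roles of the parameters concrete, the following minimal Python sketch evaluates the astigmatic defocus of Eq. (8) and the CTF of Eqs. (6) and (7). It is an illustration, not the authors' implementation. The function names, the numerical values of the wavelength and spherical aberration, and the unit choice (all lengths in ångströms) are assumptions made here, as is the sign convention in which positive Δz means underfocus.

```python
import numpy as np

def defocus(theta, dz0, dz1, theta0):
    """Eq. (8): defocus as a function of the azimuthal angle theta (radians)."""
    return dz0 + dz1 * np.sin(2.0 * (theta - theta0))

def gamma(s, dz, cs, lam):
    """Eq. (7) phase shift. Assumed convention: positive dz is underfocus."""
    return 2.0 * np.pi * (-cs * lam**3 * s**4 / 4.0 + dz * lam * s**2 / 2.0)

def ctf(s, dz, q, cs, lam):
    """Eq. (6): phase CTF (sin term) and amplitude CTF (cos term) mixed by q."""
    g = gamma(s, dz, cs, lam)
    return -(np.sqrt(1.0 - q**2) * np.sin(g) + q * np.cos(g))

# Illustrative values only (assumptions, not taken from the paper).
LAM = 0.0251   # electron wavelength near 200 kV, in angstroms
CS = 2.0e7     # spherical aberration of 2 mm, in angstroms

s = np.linspace(0.0, 0.2, 5)                 # spatial frequency, 1/angstrom
dz = defocus(np.pi / 3, dz0=1.0e4, dz1=300.0, theta0=0.2)
values = ctf(s, dz, q=0.1, cs=CS, lam=LAM)   # CTF along one azimuth
```

Because only CTF² enters the power spectrum in Eq. (5), the overall sign of the CTF does not affect the fit; the relative sign of the two terms in γ does.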

The envelope function E (s, θ) in Eq. (5) is used to account for the spatial and temporal coherence effects, specimen drift and other signal decay factors such as the modulation function of the recording medium in H2(s, θ). Analytical expressions for some of these factors in the envelope function have been described previously (Hanszen, 1967, 1971; Frank, 1969, 1976). In practice, it has been empirically observed that the envelope function for images of ice-embedded particles with sub-nanometre resolution (<6 Å) data from most of the EMs can be approximated by a single Gaussian function of the form

$$E(s,\theta) = e^{-B(s,\theta)s^2}, \tag{9}$$

where the non-negative parameter B(s, θ) has been called the experimental B factor (Saad et al., 2001). The techniques we employ to solve the parameter estimation problem allow alternative formulations of the envelope function. In particular, we have experimented with using a more general envelope function of the form

$$E(s,\theta) = e^{-B_1(s,\theta)s - B_2(s,\theta)s^2 - B_3(s,\theta)s^3}$$

to model decay of the power spectrum from low to high frequencies for some data sets.

Parameter estimation via constrained non-linear optimization

We measure the discrepancy between the analytical model in Eq. (3) and the experimentally measured power spectrum Î2(s, θ) by the residual function r2(s, θ; x) = I2(s, θ; x) − Î2(s, θ), where x = (α, Δz0, Δz1, θ0, B, Q, n1, n2, n3, n4). To determine the optimal value of x, we propose to minimize the non-linear objective function

$$\rho_2(x) = \|r_2(s,\theta;x)\|^2, \tag{10}$$

where ||r2(s, θ; x)|| is defined as the standard two-norm form of r2(s, θ; x), that is,

$$\|r_2(s,\theta;x)\| = \sqrt{\int_{s_{\min}}^{s_{\max}}\int_0^{2\pi} r_2^2(s,\theta;x)\,d\theta\,ds}, \tag{11}$$

for some low and high cutoff frequencies smin and smax.

When CTF(s, θ), E(s, θ) and N2(s, θ) are independent of θ (i.e. astigmatism and drift are ignored, which is the case for good-quality experimental images), we can simplify the notation to obtain

$$I(s) = \alpha^2 F(s)\,E^2(s)\,\mathrm{CTF}^2(s) + N(s). \tag{12}$$

When the structure factor F (s) is available, one can determine the parameters in x = (α, Δz0, B, Q, n1, n2, n3, n4) by minimizing the function

$$\rho(x) = \|r(s;x)\|^2, \tag{13}$$

where r(s; x) is the 1D residual function that measures the discrepancy between the 1D analytical model in Eq. (12) and the rotationally averaged power spectrum Î(s). The norm of r(s; x) is defined as

$$\|r(s;x)\| = \sqrt{\int_{s_{\min}}^{s_{\max}} r^2(s;x)\,ds} \approx \sqrt{\sum_{j=j_{\min}}^{j_{\max}} r^2(s_j;x)}. \tag{14}$$

Note that the objective function in Eq. (14) is evaluated on the interval [smin, smax]. The reason for imposing such a restriction is to eliminate the unreliable and noisy data at both low and high frequencies.
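As an illustration of how the 1D model of Eq. (12) and the discretized objective of Eqs. (13) and (14) fit together, here is a minimal Python sketch. The function names, the parameter ordering, the unit choice (ångströms) and the form assumed for the background, N(s) = n3 exp(−n4 s² − n2 s − n1 √s), are assumptions made for this example; the flat structure factor in the usage lines is a placeholder, not measured data.

```python
import numpy as np

def noise(s, n1, n2, n3, n4):
    """Assumed background model N(s) = n3 * exp(-n4 s^2 - n2 s - n1 sqrt(s))."""
    return n3 * np.exp(-n4 * s**2 - n2 * s - n1 * np.sqrt(s))

def model_intensity(s, x, f, cs, lam):
    """Eq. (12) with x = (alpha, dz, B, Q, n1, n2, n3, n4); lengths in angstroms."""
    alpha, dz, b, q, n1, n2, n3, n4 = x
    g = 2.0 * np.pi * (-cs * lam**3 * s**4 / 4.0 + dz * lam * s**2 / 2.0)
    ctf = -(np.sqrt(1.0 - q**2) * np.sin(g) + q * np.cos(g))
    envelope = np.exp(-b * s**2)
    return alpha**2 * f * envelope**2 * ctf**2 + noise(s, n1, n2, n3, n4)

def rho(x, s, i_hat, f, cs, lam, s_min, s_max):
    """Eqs. (13)-(14): sum of squared residuals restricted to [s_min, s_max]."""
    keep = (s >= s_min) & (s <= s_max)
    r = model_intensity(s[keep], x, f[keep], cs, lam) - i_hat[keep]
    return float(np.sum(r**2))

# Usage with synthetic data (all numbers are illustrative assumptions).
LAM, CS = 0.0251, 2.0e7
s = np.linspace(0.01, 0.3, 100)
f = np.ones_like(s)                      # placeholder structure factor
x_true = (5.0, 1.0e4, 100.0, 0.05, 1.0, 1.0, 2.0, 1.0)
i_hat = model_intensity(s, x_true, f, CS, LAM)
misfit = rho(x_true, s, i_hat, f, CS, LAM, s_min=0.02, s_max=0.2)
```

Samples outside the cutoff band do not contribute to the objective, which is how the unreliable low- and high-frequency data are excluded.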

In general, the objective function defined in Eq. (14) has many local minima. To narrow the search range and avoid being trapped at an undesirable local minimum, we impose explicit constraints. In most cases, the defocus value of Δz can be estimated from the experimentally intended imaging conditions to be within [Δzmin, Δzmax]. The valid values for Q are between 0 and 1 (Erickson & Klug, 1970). However, in practice, the upper bound for Q is generally believed to be much smaller than 1 (Toyoshima et al., 1993). Because the experimental B factor is non-negative, as defined in Saad et al. (2001), an inequality of the type 0 ≤ B ≤ Bmax, for some constant Bmax, is a natural constraint. Similarly, to ensure that the intensity of the background noise never falls below 0, we impose n3 ≥ 0.

In addition to these bound constraints, we also impose a set of non-linear inequality constraints in the form of N(sj) ≤ Î(sj), for j = jmin, …, jmax. These constraints are developed to ensure that the noise background term N(s) is always less than Î(s).

Because the intensity of the background signal typically decreases from low to high spatial frequencies, it is desirable to include constraints of the following type:

$$\frac{\partial N(s_j)}{\partial s} \le 0 \quad \text{for } j = j_{\min},\ldots,j_{\max}. \tag{15}$$

In summary, when CTF(s, θ), E(s, θ) and N2(s, θ) are independent of θ, as found in many experimental cases (e.g. Jiang et al., 2003, 2008; Ludtke et al., 2004, 2008; Liu et al., 2007), we can estimate the unknown parameters for the CTF function that characterize the image modulation process by solving the following constrained non-linear optimization problem:

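$$\min_x\ \rho(x), \tag{16}$$

subject to

$$\Delta z_{\min} \le \Delta z \le \Delta z_{\max}, \tag{17}$$
$$0 \le B \le B_{\max}, \tag{18}$$
$$Q_{\min} \le Q \le Q_{\max}, \tag{19}$$
$$0 \le n_3, \tag{20}$$
$$N(s_j) \le \hat I(s_j) \quad \text{for } j = j_{\min},\ldots,j_{\max}, \tag{21}$$
$$\frac{\partial N(s_j)}{\partial s} \le 0 \quad \text{for } j = j_{\min},\ldots,j_{\max}, \tag{22}$$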

where ρ(x) is defined in Eq. (13).
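The problem in Eqs. (16)–(22) can be posed to a general-purpose SQP-type solver. The sketch below uses SciPy's `minimize` with `method="SLSQP"` as a stand-in; it is not the solver used in this work, and it makes several assumptions for illustration: synthetic noise-free data, a flat placeholder structure factor, assumed values for the wavelength and spherical aberration, an assumed background model N(s) = n3 exp(−n4 s² − n2 s − n1 √s), and finite bounds on α and the noise parameters so that every function evaluation stays finite. Gradients are left to the solver's finite differences.

```python
import numpy as np
from scipy.optimize import minimize

LAM, CS = 0.0251, 2.0e7   # wavelength and Cs in angstroms (assumed values)

def noise(s, n1, n2, n3, n4):
    """Assumed background model without angular dependence."""
    return n3 * np.exp(-n4 * s**2 - n2 * s - n1 * np.sqrt(s))

def noise_slope(s, n1, n2, n3, n4):
    """Analytic dN/ds, used for the monotonicity constraint (22)."""
    return noise(s, n1, n2, n3, n4) * (-2.0 * n4 * s - n2 - 0.5 * n1 / np.sqrt(s))

def intensity(x, s, f):
    """Eq. (12); x = (alpha, dz in micrometres, B, Q, n1, n2, n3, n4)."""
    alpha, dz_um, b, q = x[:4]
    dz = dz_um * 1.0e4                    # micrometres to angstroms
    g = 2.0 * np.pi * (-CS * LAM**3 * s**4 / 4.0 + dz * LAM * s**2 / 2.0)
    ctf = -(np.sqrt(1.0 - q**2) * np.sin(g) + q * np.cos(g))
    return alpha**2 * f * np.exp(-2.0 * b * s**2) * ctf**2 + noise(s, *x[4:])

def rho(x, s, f, i_hat):
    """Objective of Eq. (13) on the sampled frequencies."""
    return float(np.sum((intensity(x, s, f) - i_hat) ** 2))

def q_constraints(x, s, i_hat):
    """Constraints (21) and (22) stacked as q(x) >= 0."""
    return np.concatenate([i_hat - noise(s, *x[4:]), -noise_slope(s, *x[4:])])

s = np.linspace(0.02, 0.2, 60)            # already restricted to [s_min, s_max]
f = np.ones_like(s)                       # placeholder structure factor
x_true = np.array([5.0, 1.0, 100.0, 0.05, 1.0, 1.0, 2.0, 1.0])
i_hat = intensity(x_true, s, f)           # synthetic "measured" spectrum

# Bound constraints (17)-(20); the finite limits are assumptions for this sketch.
bounds = [(0.0, 50.0), (0.0, 9.0), (0.0, 500.0), (0.0, 0.2),
          (-50.0, 50.0), (-50.0, 50.0), (0.0, 1.0e3), (-50.0, 50.0)]
x0 = np.array([4.0, 1.1, 80.0, 0.1, 1.5, 0.5, 1.5, 1.5])

result = minimize(
    rho, x0, args=(s, f, i_hat), method="SLSQP", bounds=bounds,
    constraints=[{"type": "ineq", "fun": q_constraints, "args": (s, i_hat)}],
    options={"maxiter": 200},
)
```

Whether a single run reaches the desired minimum depends on the starting guess, so `result.success` and the final objective value should be inspected rather than assumed.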

In practice, the magnitude of I(s) may vary by several orders of magnitude between the low and the high frequencies, as seen in the X-ray solution scattering of the single-particle suspension (Thuman-Commike et al., 1999; Ludtke et al., 2001). In this case, applying a non-linear optimization solver to Eqs. (16)–(22) directly may result in an approximate solution that fits the low frequencies more accurately at the expense of a severe misfit in the intermediate- and high-frequency ranges of the power spectrum. Because the defocus parameter Δz, the most important parameter in the CTF model, is largely determined by the intermediate-to-high-frequency part of the power spectrum, such a misfit is likely to be detrimental in subsequent computations.

To overcome this problem, one may introduce a set of weights ωj in Eq. (13) that vary with respect to the frequency sj. That is, one can define the objective function in Eq. (13) to be

$$\rho(x) = \sum_{s_{\min} \le s_j \le s_{\max}} \left[\hat I(s_j) - I(s_j)\right]^2 \omega_j. \tag{23}$$

However, choosing a set of appropriate weights is not a trivial task.

An alternative strategy for mitigating problems associated with the large magnitude variation in Î(s) is to estimate the desired parameters by fitting log(Î(s)) instead. Because the log function is monotonically increasing on (0, ∞), minimizing Eq. (13) is equivalent to minimizing

$$\eta(x) = \left\|\log(\hat I(s)) - \log(I(s))\right\|^2. \tag{24}$$

Note that I(s) also depends on the parameter x to be estimated. In this formulation, we may need to impose additional constraints

$$I(s_j) > 0, \quad \text{for } s_{\min} \le s_j \le s_{\max}, \tag{25}$$

to ensure that the second logarithmic term in Eq. (24) is well defined.
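The effect of the two alternatives, the weighted objective of Eq. (23) and the logarithmic objective of Eq. (24), can be seen on a toy spectrum. The following Python sketch is illustrative only: the function names, the three-sample spectrum and the choice of weights 1/Î² are assumptions made here, not values from this work.

```python
import numpy as np

def weighted_objective(i_model, i_hat, weights):
    """Eq. (23): weighted sum of squared residuals."""
    return float(np.sum(weights * (i_hat - i_model) ** 2))

def log_objective(i_model, i_hat):
    """Eq. (24): squared two-norm of the log residual.

    Both inputs must be strictly positive, which is the role of Eq. (25).
    """
    if np.any(i_model <= 0.0) or np.any(i_hat <= 0.0):
        raise ValueError("log objective requires strictly positive intensities")
    return float(np.sum((np.log(i_hat) - np.log(i_model)) ** 2))

# Toy spectrum spanning four orders of magnitude, with a uniform 10% misfit.
i_hat = np.array([1.0e4, 1.0e2, 1.0])
i_model = 1.1 * i_hat

plain_terms = (i_hat - i_model) ** 2                   # low frequency dominates
log_terms = (np.log(i_hat) - np.log(i_model)) ** 2     # every sample counts equally
relative = weighted_objective(i_model, i_hat, 1.0 / i_hat**2)
logged = log_objective(i_model, i_hat)
```

In the unweighted sum, the first sample outweighs the last by a factor of about 10⁸, whereas in the logarithmic form each sample contributes the same amount for the same relative misfit.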

The use of the objective function in Eq. (13) is not appropriate when astigmatism and drift are present in the micrograph. When CTF(s, θ), the envelope function E(s, θ) and the background noise N2(s, θ) all vary with respect to θ, one must resort to the most general form of the objective function defined in Eq. (10). Because the seven parameters α, B, Q and ni (i = 1, 2, 3, 4) are all assumed to have angular dependency in this case, and the defocus is now parameterized by the three parameters Δz0, Δz1 and θ0 that appear in Eq. (8), the number of unknown parameters to be estimated becomes 7mθ + 3, where mθ is the number of angular samples used in the evaluation of Eq. (10). The angular dependency of the parameters to be estimated also introduces angular dependency in the constraints defined by Eqs. (18)–(22). As a result, the total number of non-linear constraints (Eqs. (21) and (22)) in the constrained non-linear optimization model will increase by a factor of mθ. The increased number of unknowns and constraints makes the optimization problem much more difficult to solve. Hence, we need to seek other alternatives that are computationally more efficient.

When the angular dependency of Eq. (3) is caused solely by astigmatism, that is, the defocus Δz(θ) is the only parameter that varies with respect to θ, integrating the right-hand side of Eq. (5) with respect to θ yields a closed-form expression, which we will not show here. Such an expression allows us to again reduce the 2D fitting problem to a 1D fitting problem. Unfortunately, the residual norm in Eq. (13) associated with this 1D fitting problem has far too many local minima within the domain defined by the constraints in Eqs. (18)–(22). Hence, it is difficult to compute the optimal estimate of the desired parameters in practice.

When CTF(s, θ), E(s, θ) and N2(s, θ) vary slowly with respect to θ, which is the case for good-quality images, a simple and practical strategy for reducing the complexity of the computation is to divide the 2D power spectrum evenly into k angular sectors for some relatively small k (e.g. between 8 and 10). This strategy is discussed in Frank (1996) and implemented in Huang et al. (2003). All parameters are assumed to be rotationally invariant within each sector. Rotational averaging of the power spectrum is performed within each sector to produce k averaged 1D profiles. The unknown parameters associated with Eq. (3) are estimated separately within each sector by solving Eqs. (16)–(22) within that sector. This procedure returns k defocus values Δz(j), j = 1, 2, …, k. These defocus values can be used to estimate the parameters Δz0, Δz1 and θ0 by solving a constrained non-linear least squares (NLSQ) problem

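$$\min_{\Delta z_0,\,\Delta z_1,\,\theta_0}\ \frac{1}{2}\sum_{j=1}^{k}\left[\Delta z_0 + \Delta z_1 \sin\!\left(2j\frac{2\pi}{k} - 2\theta_0\right) - \Delta z^{(j)}\right]^2, \tag{26}$$

subject to

$$\Delta z_0^{\min} \le \Delta z_0 \le \Delta z_0^{\max}, \tag{27}$$
$$\Delta z_1^{\min} \le \Delta z_1 \le \Delta z_1^{\max}, \tag{28}$$
$$\theta_0^{\min} \le \theta_0 \le \theta_0^{\max}. \tag{29}$$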

We will demonstrate that this strategy works very well for images that contain a modest level of astigmatism.
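The sector-based astigmatism fit of Eqs. (26)–(29) is small enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in this work: the function names are hypothetical, SciPy's `least_squares` stands in for the bounded NLSQ solver, and the starting guess is obtained from a linear least squares fit, which is a device added here (the model is linear in Δz0, Δz1 cos 2θ0 and −Δz1 sin 2θ0).

```python
import numpy as np
from scipy.optimize import least_squares

def sector_angles(k):
    """Sector angles 2*pi*j/k for j = 1..k, matching the argument in Eq. (26)."""
    return 2.0 * np.pi * np.arange(1, k + 1) / k

def linear_start(dz_sectors):
    """Starting guess from a linear fit to dz0 + a*sin(2t) + b*cos(2t)."""
    th = sector_angles(len(dz_sectors))
    design = np.column_stack([np.ones_like(th), np.sin(2 * th), np.cos(2 * th)])
    dz0, a, b = np.linalg.lstsq(design, dz_sectors, rcond=None)[0]
    # a = dz1*cos(2*theta0), b = -dz1*sin(2*theta0)
    return np.array([dz0, np.hypot(a, b), (0.5 * np.arctan2(-b, a)) % np.pi])

def fit_astigmatism(dz_sectors, lower, upper):
    """Bounded least squares fit of (dz0, dz1, theta0), Eqs. (26)-(29)."""
    th = sector_angles(len(dz_sectors))

    def residual(p):
        return p[0] + p[1] * np.sin(2 * th - 2 * p[2]) - dz_sectors

    start = np.clip(linear_start(dz_sectors), lower, upper)
    return least_squares(residual, start, bounds=(lower, upper)).x

# Synthetic per-sector defocus values in micrometres (illustrative numbers).
k = 8
true_params = np.array([2.0, 0.1, 0.6])
dz_sectors = true_params[0] + true_params[1] * np.sin(
    2 * sector_angles(k) - 2 * true_params[2])
estimate = fit_astigmatism(dz_sectors,
                           lower=np.array([0.0, 0.0, 0.0]),
                           upper=np.array([10.0, 1.0, np.pi]))
```

Restricting Δz1 to be non-negative and θ0 to an interval of length π removes the ambiguity between (Δz1, θ0) and (−Δz1, θ0 + π/2), which describe the same ellipse.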

When the experimental images contain a significant amount of astigmatism, one may need to divide the power spectrum of each image into a larger number of angular sectors in order to accurately determine the astigmatism parameters. The potential pitfall of this approach is that the signal-to-noise ratio (SNR) associated with the 1D rotationally averaged power spectrum within each sector is likely to be very low; hence, the defocus, the experimental B factor and other parameters associated with each sector may not be reliably estimated. We argue that in this case, the collected images should be discarded anyway. However, such a decision calls for an image analysis tool that can automatically make a distinction between images that contain mild astigmatism and images that are too distorted to be useful. We have developed such a tool based on the active contour model (ACM) algorithm (Blake & Isard, 1998). The main idea behind the ACM algorithm is to use a special contour-tracing technique to identify concentric Thon rings in the power spectrum of each image. The ratio between the radii associated with the major and minor axes of these elliptically shaped rings is estimated through a least squares procedure. When the estimated ratio exceeds a threshold close to 1 (e.g. 1.1), the image is excluded from subsequent image processing and reconstruction.

Numerical methods

In this section, we describe numerical algorithms and software that we used to tackle the constrained non-linear minimization problem formulated above. We focus on the 1D curve-fitting formulation shown in Eqs. (16)–(22), which can be used directly to estimate the unknown defocus and the parameters for the envelope and noise functions when astigmatism and drift are negligible. When both astigmatism and anisotropic experimental B factor are present in the data, we divide the power spectrum into several angular sectors and perform separate 1D curve fittings within each sector.

The constrained minimization problem described by Eqs. (16)–(22) can be solved in a number of ways. Algorithms for solving general constrained non-linear optimization problems include the quadratic penalty method, the log barrier method, the augmented Lagrangian method and the sequential quadratic programming (SQP) method (Nocedal & Wright, 1999). We have chosen the SQP method because recent studies (Gould et al., 2004) indicate that the SQP method is the most effective one for small-to-medium-sized problems, that is, problems with fewer than a thousand variables and constraints.

To simplify the notation in the discussion that follows, we denote the set of non-linear constraint functions in Eqs. (21) and (22) by q (x), where x is a column vector representation of the unknown parameters to be estimated, that is:

$$x = (\alpha, B, \Delta z, Q, n_1, n_2, n_3, n_4). \tag{30}$$

In this notation, all non-linear constraints in Eqs. (21) and (22) can be conveniently represented by a single vector inequality q (x) ≥ 0.

The SQP algorithm searches for an optimal solution to Eqs. (16)–(22) iteratively. In SQP, the approximate solution xk is updated, at each step, by

$$x_{k+1} \leftarrow x_k + \tau_k p_k, \tag{31}$$

where the search direction pk is obtained by solving a quadratic minimization problem of the form

$$\min_{p_k}\ \frac{1}{2}\,p_k^T H_k\, p_k + \nabla\rho(x_k)^T p_k, \tag{32}$$

subject to the same bound constraints as defined in Eqs. (17)–(20) and also the linearized constraint

$$\nabla q_j(x_k)^T p_k + q_j(x_k) \ge 0. \tag{33}$$

The matrix Hk in Eq. (32) is an approximate Hessian of the Lagrangian function

$$L(x,\mu) = \rho(x) - \sum_{j=1}^{n} \mu_j\, q_j(x), \tag{34}$$

evaluated at the kth iterate xk, and μ = (μ1, μ2, …, μn) denotes a set of Lagrangian multipliers associated with the non-linear constraints qj(x). The step length τk in Eq. (31) is chosen to minimize some merit function while keeping the approximate solution xk+1 within the bound constraints (Nocedal & Wright, 1999).

To use a constrained non-linear optimization solver, one must provide procedures for calculating the objective function in Eq. (13) or (24) and the constraint function q(x). One may also provide a procedure for computing the gradient of the objective function and the constraints with respect to the unknown parameters in x. Because the derivatives of the objective and constraint functions with respect to the unknown parameters are easy to compute for the problem defined in Eqs. (16)–(22), we carry out these operations explicitly. If the procedures for the gradient calculation are not supplied by the user, most software packages can approximate the gradient by finite differences.
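As an example of the explicit derivative calculation, the sketch below writes out the partial derivatives of the background term with respect to its four parameters and checks them against central finite differences. It assumes the background model N(s) = n3 exp(−n4 s² − n2 s − n1 √s); the function names and test values are hypothetical, and the same pattern would apply to the remaining parameters.

```python
import numpy as np

def noise(s, n):
    """Assumed background model; n = (n1, n2, n3, n4)."""
    n1, n2, n3, n4 = n
    return n3 * np.exp(-n4 * s**2 - n2 * s - n1 * np.sqrt(s))

def noise_jacobian(s, n):
    """Analytic derivatives of N(s_j) with respect to (n1, n2, n3, n4).

    Returns an array with one row per frequency sample and one column
    per parameter.
    """
    n1, n2, n3, n4 = n
    expo = np.exp(-n4 * s**2 - n2 * s - n1 * np.sqrt(s))
    val = n3 * expo
    return np.column_stack([-np.sqrt(s) * val, -s * val, expo, -s**2 * val])

def central_difference(s, n, h=1e-6):
    """Finite-difference approximation of the same Jacobian, for checking."""
    n = np.asarray(n, dtype=float)
    cols = []
    for i in range(n.size):
        step = np.zeros_like(n)
        step[i] = h
        cols.append((noise(s, n + step) - noise(s, n - step)) / (2.0 * h))
    return np.column_stack(cols)

s = np.linspace(0.02, 0.2, 10)
n = np.array([1.0, 1.0, 2.0, 1.0])
max_error = np.max(np.abs(noise_jacobian(s, n) - central_difference(s, n)))
```

A check of this kind is a cheap safeguard before handing analytic gradients to a solver, since a wrong derivative typically shows up as slow or failed convergence rather than as an error message.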

Several software packages have been developed to solve non-linear constrained optimization problems using SQP. Among the most well known are NPSOL (Stanford Business Software, Inc. Palo Alto, CA, USA) (Gill et al., 1986, 2002; Schittkowski, 1986) and the fmincon function in MATLAB (MathWorks, 2004). We use the MATLAB fmincon function for our implementation.

Results and discussion

Image data

Two types of data sets were generated to demonstrate the applicability of the proposed method for the microscope parameter determination. One data set was the focal series images of an amorphous carbon film, which was evaporated on freshly cleaved mica surface and then transferred onto a holey grid. The images were recorded at 200 kV in a JEM2010F EM (JEOL Ltd. Inc., Tokyo, Japan) onto the Gatan 4k × 4k CCD camera (US4000) (Gatan Inc., Pleasanton, CA) at an effective magnification of 110 400×. To assess the reliability and accuracy of our computational estimation scheme, we collected images of the carbon film in a broad range of defocus settings from 0.2- to 5.0-μm underfocus.

The carbon film was chosen as the test specimen for our algorithm because of the ease of detecting the CTF rings in the power spectra of the images. The images were taken with a pre-determined defocus so that we can assess the accuracy of the proposed computational procedure. Figure 1 shows a focal series of carbon film images and their corresponding power spectra. In this case, the astigmatism was well adjusted to a negligible level prior to the data collection. These represent the best type of data that could be recorded. To test the capability of astigmatism estimation, we purposely introduced a mild level of astigmatism in the image, the power spectrum of which is shown in Fig. 2. The structure factor for the carbon film data set was estimated from the electron diffraction pattern of the carbon film (courtesy of Dr. Jaap Brink).

Fig. 1.

200-kV CCD frames of carbon film (a–d) and their corresponding power spectra (e–h) taken under different defocus settings: (a) 0.5 μm, (b) 1 μm, (c) 3 μm and (d) 4 μm.

Fig. 2.

Power spectrum of a carbon film image with a mild astigmatism (CTF ring eccentricity = 0.03).

The second data set consisted of images of P22 mature phage particles recorded onto photographic films (Kodak SO-163, Kodak Co., Rochester, NY) in the JEM3000SFF EM (JEOL Ltd. Inc.) operated at 300 kV and a specimen temperature of 4.2 K. Images between 0.5- and 3-μm underfocus were used in this test (e.g. Fig. 3). In this data set, no carbon film was used to determine the CTF and associated parameters because the ice-embedded virus particles are suspended across holes with no support film. The data were digitized with a Nikon scanner (Nikon Inc., Melville, NY) at a scanning interval of 1.06 Å/pixel. The 1D scattering curve associated with the P22 mature phage was obtained from the modified X-ray solution scattering experiment (Thuman-Commike et al., 1999) to yield the best fit to the cryo-EM data over a broad range of spatial frequencies.

Fig. 3.

300-kV CCD frames of P22 mature phage (a and b) taken under different defocus settings and the corresponding power spectra (c and d). The estimated defocus is 0.41 μm for (a) and 1.14 μm for (b).

CTF and associated parameter estimation of carbon film images with negligible astigmatism

Using the estimated structure factor, we applied the constrained non-linear minimization algorithm (the MATLAB fmincon function) discussed in ‘Numerical methods’ to each individual power spectrum image shown in Fig. 1. The bound constraints for each of the parameters are listed in Table 1. The cutoff frequencies defined in Eq. (11) were chosen to be smin = 0.02 Å−1 and smax = 0.2 Å−1. Our initial guesses for the B factor, amplitude contrast ratio and noise parameters were set to B = 100, Q = 0.1 and n1 = n2 = n3 = n4 = 1.0, respectively. Because our data set contains images taken under a wide range of defocus settings, we tried five different starting guesses for the defocus value (Δz = 1.0, 3.0, 5.0, 7.0, 9.0 μm) for each of the runs. The CTF, envelope function and noise parameters associated with the minimum final objective function value in Eq. (13) among the five runs were chosen to be our optimal estimate of the parameters. Note that all these procedures are implemented as part of the fitting process; there is no need for the user to provide initial guesses for different micrographs and repeat the runs.
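The multi-start procedure described above can be sketched as a simple loop. To keep the example short and self-contained, the Python code below fits only the defocus of a synthetic CTF² curve and uses SciPy's bound-constrained L-BFGS-B method in place of a full SQP solve; both simplifications, along with the function names and numerical values, are assumptions made here for illustration. The sketch shows the selection logic only and makes no claim about which start reaches the global minimum.

```python
import numpy as np
from scipy.optimize import minimize

LAM, CS = 0.0251, 2.0e7   # wavelength and Cs in angstroms (assumed values)

def ctf_squared(s, dz_um, q=0.05):
    """CTF^2 for a defocus given in micrometres (positive means underfocus)."""
    dz = dz_um * 1.0e4
    g = 2.0 * np.pi * (-CS * LAM**3 * s**4 / 4.0 + dz * LAM * s**2 / 2.0)
    return (np.sqrt(1.0 - q**2) * np.sin(g) + q * np.cos(g)) ** 2

s = np.linspace(0.02, 0.2, 200)
i_hat = ctf_squared(s, 1.3)          # synthetic data with a defocus of 1.3 um

def objective(p):
    """Sum of squared residuals as a function of the defocus alone."""
    return float(np.sum((ctf_squared(s, p[0]) - i_hat) ** 2))

# Five starting guesses, as in the text; the bounds follow the carbon-film range.
starts = [1.0, 3.0, 5.0, 7.0, 9.0]
runs = [minimize(objective, [dz], method="L-BFGS-B", bounds=[(0.0, 9.0)])
        for dz in starts]

# Keep the run with the smallest final objective value.
best = min(runs, key=lambda r: r.fun)
best_defocus = float(best.x[0])
```

Because each run only finds a local minimum near its start, the spacing of the starting guesses has to be chosen with the expected defocus range in mind.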

Table 1.

The bound constraints for the CTF, envelope function and noise parameters to be estimated.

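| Parameter | Lower bound (carbon film) | Upper bound (carbon film) | Lower bound (P22 phage) | Upper bound (P22 phage) |
|---|---|---|---|---|
| Δz (μm) | 0 | 9.0 | 0 | 4.0 |
| B | 0 | | 0 | |
| α | 0 | | 0 | |
| Q | 0 | 0.2 | 0 | 0.1 |
| n1 | −∞ | | −∞ | |
| n2 | −∞ | | −∞ | |
| n3 | 0.0 | | 0.0 | |
| n4 | −∞ | | −∞ | |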

Table 2 shows a typical convergence history associated with each fmincon run. The first column of this table lists the iteration number. Column 2 gives the total number of function evaluations performed up to the kth iteration. The progress of the convergence is measured by the value of the objective function (column 3), the magnitude of the directional derivative along the search direction (column 4) and the norm of the Lagrangian gradient (column 5), which measures how closely the first-order necessary optimality condition for the constrained non-linear optimization problem defined in Eq. (16) is satisfied. The minimization procedure was terminated when the norm of the Lagrangian gradient fell below 0.05. The final objective function attains the value of 0.08, indicating a good match between the computational model defined by the estimated parameters and the power spectrum data. In Fig. 4, we plot both the 1D rotationally averaged power spectrum (the red curve) and the intensity curve defined by the function in Eq. (12) using the optimal parameters returned from the constrained minimization procedure (the blue curve). It is apparent that the difference between the experimental data (the red curve) and the fitted data (the blue curve) is negligible in the frequency domain of interest. This suggests that our constrained optimization procedure successfully identified the global minimum of the objective function defined in Eq. (16).

Table 2.

A typical convergence history of fmincon when it is applied to CTF parameter estimation of a carbon film.

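| k | f-count | $\eta(\alpha, B, \Delta z, Q, \{n_i\})$ | $\nabla\eta^T s_k$ | $\|\nabla L\|$ |
|---|---|---|---|---|
| 1 | 9 | 453.9 | −1440 | 3160 |
| 11 | 126 | 32.82 | −0.33 | 14.4 |
| 21 | 232 | 28.71 | 0.002 | 6.02 |
| 31 | 341 | 28.34 | −0.06 | 4.77 |
| 41 | 453 | 24.00 | −0.2 | 18.4 |
| 51 | 559 | 21.12 | −0.39 | 25.8 |
| 61 | 662 | 3.538 | 1.07 | 85.4 |
| 71 | 762 | 0.09 | −0.02 | 3.63 |
| 76 | 812 | 0.08 | −7e−7 | 0.04 |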

The first column gives the iteration number. The second column gives the total number of function evaluations at the end of the kth iteration. The third column lists the relative norm of the residual. The fourth column gives the directional derivative at the kth iteration. The last column gives the first-order optimality of the constrained optimization problem.

Fig. 4.

Comparing the 1D rotationally averaged power spectrum data (the red dots) with the CTF fitting curves (the solid blue curves) generated by the constrained non-linear minimization in a series of carbon film images shown in Fig. 1(a)–(d). The dash-dotted curves in (a)–(d) show the noise background estimated by solving Eqs. (16)–(22).

In Table 3, we list the optimal parameters associated with these carbon film images. The first row of this table gives the intended defocus values during the data collection, where the defocus of the first image was determined using the DigitalMicrograph software (Gatan, Inc.) and the defocus values of the remaining images were digitally set using the JAMES software (Booth et al., 2004; Marsh et al., 2007). Clearly, our estimates of the defocus values (the second row) match the intended values very well. This suggests that our fitting procedure can be used reliably to estimate the CTF parameters associated with images taken under a wide range of defocus settings.

Table 3.

The CTF, envelope function and noise background parameters returned from the constrained non-linear minimization procedure (the MATLAB fmincon function) for carbon film images taken at different defocus settings.

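| Intended Δz (μm) | 0.2 | 0.5 | 1.0 | 1.5 | 2.0 | 3.0 | 4.0 | 5.0 | 9.0 |
|---|---|---|---|---|---|---|---|---|---|
| Determined Δz (μm) | 0.23 | 0.54 | 1.04 | 1.55 | 2.05 | 3.06 | 4.10 | 5.13 | 9.36 |
| B (Å²) | 119 | 104 | 104 | 104 | 106 | 110 | 116 | 126 | 186 |
| α | 12.0 | 8.14 | 6.66 | 5.91 | 5.43 | 4.99 | 4.72 | 4.63 | 4.39 |
| Q | 0.03 | 0.0 | 0.01 | 0.02 | 0.03 | 0.03 | 0.03 | 0.03 | 0.02 |
| n1 | −2.26 | 26.0 | 16.4 | 13.5 | 23.2 | 31.6 | 33.0 | 27.9 | 33.5 |
| n2 | 17.0 | −33.6 | −10.1 | −2.8 | −22.5 | −39.8 | −4.2 | −27.9 | −34.7 |
| n3 | 5.8 | 343 | 86.3 | 56.9 | 197 | 547 | 719 | 435 | 1402 |
| n4 | −0.7 | 4.6 | −2.5 | −1.3 | 3.4 | 4.5 | 4.6 | −3.4 | 3.6 |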

The first row (bold-faced numbers) of the table shows the intended defocus under which each image is taken.

The importance of constraints

We emphasize the importance of constraints in the formulation of the minimization problem in Eqs. (16)–(22). Removing the bound and/or non-linear constraints from the problem formulation turns the CTF parameter estimation problem into a standard NLSQ problem that can be solved efficiently using a Gauss–Newton-type method (Dennis et al., 1981; More et al., 1984; Nocedal & Wright, 1999). However, unless the starting guess used by an NLSQ solver is sufficiently close to the optimal solution, one may obtain a solution that is physically wrong.

To illustrate this point, we use the power spectrum associated with the carbon film micrograph as an example. The image is taken under roughly 1.0-μm defocus (Fig. 1b). We plot the contour of the function

$$\zeta(\Delta z, B) = \rho(\bar\alpha, B, \Delta z, \bar Q, \bar n_1, \bar n_2, \bar n_3, \bar n_4), \tag{35}$$

where ρ is the objective function defined in Eq. (13), and ᾱ, Q̄ and n̄i are fixed at the optimal values obtained from a manual fit. This function is the restriction of ρ to a 2D subspace (spanned by B and Δz), with α = ᾱ, Q = Q̄ and ni = n̄i, for i = 1, 2, 3, 4.

The contour plot shown in Fig. 5 indicates that ζ(Δz, B) has two local minima within [−2, 2] × [0, 300]. The desired local minimum is marked by a plus sign on the right half of the figure. If the update of the approximate minimizer in an NLSQ solver is not restricted to the underfocus range, the optimization procedure may converge to a local minimum that is entirely infeasible. Figure 5 also shows that the convergence of the optimization algorithm is less sensitive to the starting guess for B because ζ(Δz, B) appears to be convex in the direction of B within the neighbourhood of interest.

Fig. 5.

The contour of ζ(Δz, B) associated with the image of carbon film taken under roughly 1.0-μm defocus. This function has two local minima. The desired local minimum is marked by a plus sign on the right half of the contour plot.

To demonstrate the importance of the non-linear constraints, we applied an NLSQ solver to Eq. (16) alone, without additional constraints, using a starting guess close to the optimal solution. Figure 6 shows that without the non-linear constraints, the NLSQ solver converged to an infeasible solution in which the background term N2(s) in Eq. (12) becomes larger than the measured power spectrum at the second and the third CTF zeros. The quality of the fitting curve is considerably worse than that obtained from constrained non-linear optimization shown in Fig. 9(b).
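The kind of infeasibility described above can be detected with a simple check. The following is a minimal sketch that assumes the requirement is only that the fitted background stays at or below the measured power spectrum at the sampled frequencies; the exact form of the constraints in Eqs. (21) and (22) is not reproduced here. The function name `background_violations` and all sample values are hypothetical and chosen for illustration only.

```python
def background_violations(freqs, spectrum, background, tol=0.0):
    """Return the frequencies where the fitted background exceeds the data.

    freqs, spectrum and background are equal-length sequences sampled on the
    same spatial-frequency grid. An empty result means the fit is feasible
    in the sense assumed here (background <= measured spectrum everywhere).
    """
    return [s for s, p, n in zip(freqs, spectrum, background) if n > p + tol]


# Made-up 1D samples (arbitrary units); not taken from the paper's data.
freqs = [0.02, 0.06, 0.10, 0.14, 0.18]      # spatial frequency, 1/Angstrom
spectrum = [5.0, 1.2, 2.0, 0.6, 0.9]        # measured power spectrum
background = [2.0, 1.5, 1.0, 0.8, 0.5]      # fitted background curve

violations = background_violations(freqs, spectrum, background)
print("background exceeds data at:", violations)
```

A non-empty list indicates that the fitted background rises above the measured spectrum somewhere, which is the situation the non-linear constraints are meant to rule out.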

Fig. 6.

Without imposing the non-linear constraints in Eqs. (21) and (22), applying an NLSQ fitting procedure to the P22 mature phage particle image shown in Fig. 3(b) returns a solution in which the background term N(s) in Eq. (12) (the black dash-dotted curve) is larger than the power spectrum (the red dots) near 0.06 Å⁻¹ and 0.14 Å⁻¹.

Fig. 9.

Comparing the 1D rotationally averaged power spectrum data (the red dots) with the CTF fitting curves (the solid blue curves) generated by the constrained non-linear minimization on the P22 mature phage images shown in Fig. 3(a) and (b). The dash-dotted curves in panels (a) and (b) show the noise background estimated by solving Eqs. (16)–(22).

Multiple starting guesses for the defocus parameter

The bound and non-linear constraints established in Eqs. (17)–(22) do not completely remove all undesirable local minima of Eq. (16). When particle images are collected under a broad range of defocus settings (i.e., the difference between Δzmin and Δzmax is large), the objective function in Eq. (16) may still have multiple local minima within the defocus range of interest. Figure 7 shows the change of the objective function in Eq. (16) with respect to different defocus values along the line segment defined by x = (ᾱ, B̄, Δz, Q̄, n̄1, n̄2, n̄3, n̄4), where ᾱ, B̄, Q̄ and n̄i (i = 1, 2, 3, 4) are optimal parameters determined in advance and Δz ∈ [0, 10] μm. Clearly, the objective function ρ(x) contains two local minima within this interval. The global minimum (marked by the circle) is located at Δz = 0.53 μm, which is the desired defocus value associated with this particular data set. However, if one chooses the initial guess of the defocus to be around 8.0 μm, for example, SQP may converge to an incorrect defocus value that corresponds to the undesirable local minimum located near 7.8 μm.

Fig. 7.

Variation of the objective function in Eq. (16) with respect to the defocus values along the line segment defined by x = (ᾱ, B̄, Δz, Q̄, n̄1, n̄2, n̄3, n̄4), where ᾱ, B̄, Q̄ and n̄i (i = 1, 2, 3, 4) are optimal parameters determined in advance. Clearly, Eq. (16) has two local minima in [0, 10] μm. The desired global minimum is marked by a circle near 0.5 μm.

To prevent SQP from converging to the wrong local minima within the defocus range of interest, we solve Eqs. (16)–(22) with multiple starting guesses evenly distributed between Δzmin and Δzmax. Because the number of local minima within the defocus range of interest is typically small, we normally need to try only three to five different starting guesses.
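The multi-start strategy can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the implementation used in this work: the 1D objective is a made-up double well whose minima are placed at 0.53 μm and 7.8 μm to mimic Fig. 7 (the well depths and widths are arbitrary), and a simple bounded pattern search stands in for the SQP solver. The names `objective`, `local_minimize` and `multistart` are hypothetical.

```python
import math


def objective(dz):
    """Toy stand-in for the objective along the defocus axis (dz in micrometres)."""
    def well(centre, depth, width=0.8):
        return depth * math.exp(-((dz - centre) ** 2) / (2.0 * width ** 2))
    # Global minimum near 0.53, shallower local minimum near 7.8.
    return 1.0 - well(0.53, 1.0) - well(7.8, 0.6)


def local_minimize(f, x0, lo, hi, step=0.5, tol=1e-8, max_iter=10000):
    """Bounded local descent: a simple stand-in for an SQP solver.

    Tries a step to the left and to the right (clipped to [lo, hi]), moves
    if the objective decreases, and halves the step otherwise.
    """
    x = min(max(x0, lo), hi)
    fx = f(x)
    for _ in range(max_iter):
        if step <= tol:
            break
        for cand in (x - step, x + step):
            cand = min(max(cand, lo), hi)
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            step *= 0.5  # neither neighbour improved: refine the step
    return x, fx


def multistart(f, lo, hi, n_starts=5):
    """Run the local solver from evenly spaced starts and keep the best result."""
    starts = [lo + i * (hi - lo) / (n_starts - 1) for i in range(n_starts)]
    results = [local_minimize(f, x0, lo, hi) for x0 in starts]
    return min(results, key=lambda r: r[1])


best_dz, best_val = multistart(objective, 0.0, 10.0)
single_dz, single_val = local_minimize(objective, 8.0, 0.0, 10.0)
print("multi-start defocus:", round(best_dz, 3))
print("single start at 8.0:", round(single_dz, 3))
```

With a single start at 8.0 μm the local search settles in the shallower well, whereas comparing the objective values reached from several evenly spaced starts selects the deeper one.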

Astigmatism estimation

When images contain a mild level of astigmatism, such as the one shown in Fig. 2, we apply the practical procedure discussed earlier to estimate all angular dependent parameters. To test this procedure on the power spectrum shown in Fig. 2, we divided the power spectrum evenly into eight angular sectors. Each sector was rotationally averaged to produce a 1D curve to be fitted with the constrained non-linear model described in Eqs. (16)–(22). Figure 8(a) shows the 1D curves generated from different angular sectors of the power spectrum shown in Fig. 2. These curves differ slightly in the positions of their peaks and valleys, implying the variation of the defocus along different radial directions.
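The sector-wise averaging can be sketched as follows. This is a minimal illustration that assumes the 2D power spectrum is a square list of lists with the zero frequency at the central pixel and that radii are binned to the nearest integer pixel; the function name `sector_profiles` and the isotropic toy input are hypothetical.

```python
import math


def sector_profiles(power, n_sectors=8):
    """Average a square 2D power spectrum over radius within each angular sector.

    Returns (bisectors, profiles): bisectors[k] is the angle (radians) of the
    bisector of sector k, and profiles[k][r] is the mean of all pixels in
    sector k whose rounded distance from the centre is r (NaN if none).
    """
    size = len(power)
    centre = size // 2
    n_radii = centre
    width = 2.0 * math.pi / n_sectors
    sums = [[0.0] * n_radii for _ in range(n_sectors)]
    counts = [[0] * n_radii for _ in range(n_sectors)]
    for y, row in enumerate(power):
        for x, value in enumerate(row):
            dx, dy = x - centre, y - centre
            r = int(round(math.hypot(dx, dy)))
            if r == 0 or r >= n_radii:
                continue  # skip the origin and the incomplete outer rings
            angle = math.atan2(dy, dx) % (2.0 * math.pi)
            k = min(int(angle / width), n_sectors - 1)
            sums[k][r] += value
            counts[k][r] += 1
    profiles = [
        [s / c if c else float("nan") for s, c in zip(srow, crow)]
        for srow, crow in zip(sums, counts)
    ]
    bisectors = [(k + 0.5) * width for k in range(n_sectors)]
    return bisectors, profiles


# Isotropic toy spectrum: each pixel stores its own rounded radius, so every
# sector should return the same 1D profile.
size = 65
toy = [
    [float(round(math.hypot(x - size // 2, y - size // 2))) for x in range(size)]
    for y in range(size)
]
bisectors, profiles = sector_profiles(toy)
```

For an astigmatic spectrum the eight profiles would differ slightly in the positions of their peaks and valleys, and each could then be fitted separately.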

Fig. 8.

(a) Variation of the 1D power spectra obtained from rotationally averaging the 2D power spectrum shown in Fig. 2 among eight evenly divided angular sectors. (b) Variation of the defocus along different radial directions. The circles represent the defocus value estimated from each angular sector of the power spectrum. The curve corresponds to the function Δz(θ) = Δz0 + Δz1sin(2(θ − θ0)), where Δz0, Δz1 and θ0 are estimated by a non-linear least squares fitting procedure.

In Fig. 8(b), we plot the defocus values derived from the constrained non-linear minimization procedure (applied to each angular sector) against θ̂k, where θ̂k is the angle formed by the bisector of the kth angular sector and the horizontal axis. The estimated defocus values are marked by circles. An NLSQ algorithm was used to fit these defocus values to the analytical expression Δz(θ) = Δz0 + Δz1 sin(2(θ − θ0)). The fitting procedure produces Δz0 = 0.546 μm, Δz1 = 0.017 μm and θ0 = 0.059. At this level of astigmatism, the power spectrum exhibits obvious elliptically shaped Thon rings. The level of eccentricity (e = Δz1/Δz0) of the Thon rings is 0.03 based on the above fitted defocus parameters.
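The sinusoidal fit and the eccentricity calculation can be illustrated with a short sketch. Instead of a general NLSQ solver, the example uses the identity Δz1 sin(2(θ − θ0)) = a sin 2θ + b cos 2θ with a = Δz1 cos 2θ0 and b = −Δz1 sin 2θ0, which makes the fit linear; this substitution is a simplification chosen here and is not claimed to be the procedure used for Fig. 8(b). It assumes eight sector bisectors evenly spaced over a full turn and θ0 expressed in radians (the unit is not stated above). The input defocus values are synthetic, generated from the fitted parameters quoted in the text, and `fit_astigmatism` is a hypothetical name.

```python
import math


def fit_astigmatism(thetas, defocus):
    """Least-squares fit of dz(theta) = dz0 + dz1 * sin(2 * (theta - theta0)).

    Assumes the angles are evenly spaced over a full turn (as for eight
    sector bisectors), so that 1, sin(2*theta) and cos(2*theta) are mutually
    orthogonal and the coefficients follow from simple projections.
    """
    n = len(thetas)
    dz0 = sum(defocus) / n
    a = 2.0 / n * sum(d * math.sin(2.0 * t) for t, d in zip(thetas, defocus))
    b = 2.0 / n * sum(d * math.cos(2.0 * t) for t, d in zip(thetas, defocus))
    dz1 = math.hypot(a, b)
    theta0 = 0.5 * math.atan2(-b, a)
    return dz0, dz1, theta0


# Bisector angles of eight evenly divided sectors.
thetas = [(k + 0.5) * math.pi / 4.0 for k in range(8)]
# Synthetic defocus values (micrometres) built from the fitted parameters
# reported in the text; these are not the measured per-sector values.
defocus = [0.546 + 0.017 * math.sin(2.0 * (t - 0.059)) for t in thetas]

dz0, dz1, theta0 = fit_astigmatism(thetas, defocus)
eccentricity = dz1 / dz0
print(round(dz0, 3), round(dz1, 3), round(theta0, 3), round(eccentricity, 2))
```

With the reported values, Δz1/Δz0 = 0.017/0.546 ≈ 0.031, which rounds to the eccentricity of 0.03 quoted above.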

We should point out that multiple starting guesses are typically required to solve the constrained NLSQ problem in Eqs. (16)–(22) in at least one of the angular sectors. Once the parameter estimation problem has been solved for that angular sector, the estimated parameters associated with that particular angular sector can be used as the starting guesses for the minimization procedure applied to other angular sectors. Because the variation of the CTF, envelope function and noise background parameters is typically small for images that contain mild astigmatism and drift, the use of this starting guess often enables the optimization routine to converge in a few iterations.

It is worth pointing out that typical cryo-EM images used for image reconstruction have a negligible level of astigmatism (Thon ring eccentricity <0.01), much smaller than that shown in this test case (Thon ring eccentricity = 0.03). In fact, single-particle cryo-EM studies have obtained near-atomic-resolution (~4 Å) 3D reconstructions without needing to consider astigmatism in the images (Jiang et al., 2008; Ludtke et al., 2008). Although we have shown here that this method can successfully handle images with a significant level of astigmatism, this functionality is rarely necessary in practice for single-particle cryo-EM studies.

Parameter estimation with images of ice-embedded particles

Often the ice-embedded particles are suspended across holes without any carbon substrate, as shown in Fig. 3(a) and (b). The incoherent average of the power spectrum of the boxed-out particles has been used to determine the CTF and associated parameters by a manual fitting procedure (Saad et al., 2001). Figure 9 shows that the fitting curves produced by the optimization procedure match the 1D rotationally averaged power spectra of the micrographs extremely well. Furthermore, the determined parameters compare well with those determined manually. These data show that even without the carbon support film, our fitting method works equally well with authentic ice-embedded particle images.

CPU Requirements

The average amount of processing time required to fit a micrograph of 240-MB size on a 1.8-GHz Pentium 4 laptop is under a minute. With a careful choice of the convergence tolerance and maximum iteration number, the computational time can be further reduced. In practice, the range of defocus values associated with the experimental data is often much less than 10 μm. Thus, we may either tighten the bound constraints associated with the defocus or reduce the number of initial guesses to further speed up the computation.

General accessibility of the software

The algorithms and techniques described here have been implemented as a stand-alone Python script, which is available on the NCMI website (http://ncmi.bcm.edu/software/fitctf). It runs on all the major computer platforms (Linux, Windows and MacOS X). It allows the user to perform CTF estimation in a fully automated fashion once the 2D single particles have been boxed out from the micrograph. The resulting parameters are formatted to be compatible with the single-particle image-processing software package EMAN (Ludtke et al., 1999).

Conclusions

An accurate determination of the CTF and associated parameters is essential in 3D structural determination of biological samples. Although manual fitting methods for these determinations have been used successfully, they are time-consuming and subject to human error. The proposed constrained non-linear minimization algorithm provides a protocol that is not only objective and accurate but also automated. The examples shown here demonstrate its utility not only for images of carbon film but also for images of ice-embedded biological particles. A unique feature of this algorithm is its ability to determine the CTF of images taken at smaller defocus (i.e. 0.5 μm), which is desirable for high-resolution structure determination (Liu et al., 2007; Jiang et al., 2008). For such a small-defocus image, it is generally difficult to estimate the CTF with confidence by a manual fitting method. This algorithm has been successfully applied to determine the CTF and associated parameters in images used for 3D reconstruction in a broad range of resolutions (e.g. Chang et al., 2006; Jiang et al., 2006, 2008).

Although the extension (Fig. 8) of our non-linear optimization-based fitting procedure can handle images with astigmatism and anisotropic B factor, the accuracy of their determination may not be very high because the power spectrum must be divided into multiple sectors, resulting in a poor SNR. By contrast, many structures of single particles have been determined to a 4- to 9-Å resolution without considering astigmatism and anisotropic B factor by excluding those images with apparent astigmatism and/or drift (Zhou et al., 2001; Jiang et al., 2003; Ludtke et al., 2004, 2005). More recently, the structure of ε15 phage has been solved to 4.5 Å (Jiang et al., 2008) using the CTF and associated parameters determined with this procedure. Therefore, our proposed algorithm is of immediate use for data up to this resolution range, in which the resulting structure is interpretable in terms of a protein backbone trace.

Acknowledgments

This research has been supported by NIH grants (P01GM064692, P41RR02250 and R01GM070557). We thank Dr. Robert M. Glaeser at University of California, Berkeley for helpful discussions.

References

  1. Blake A, Isard M. Active Contours. Springer Verlag; New York: 1998.
  2. Booth CR, Jiang W, Baker ML, Zhou ZH, Ludtke SJ, Chiu W. A 9 angstroms single particle reconstruction from CCD captured images on a 200 kV electron cryomicroscope. J Struct Biol. 2004;147:116–127. doi: 10.1016/j.jsb.2004.02.004.
  3. Chang J, Weigele P, King J, Chiu W, Jiang W. Cryo-EM asymmetric reconstruction of bacteriophage P22 reveals organization of its DNA packaging and infecting machinery. Structure. 2006;14:1073–1083. doi: 10.1016/j.str.2006.05.007.
  4. Chiu W. Factors in high resolution biological structure analysis by conventional transmission electron microscopy. Scanning Electr Microsc. 1978;1:569–580.
  5. Conway JF, Steven AC. Methods for reconstructing density maps of “single” particles from cryoelectron micrographs to subnanometer resolution. J Struct Biol. 1999;128:106–118. doi: 10.1006/jsbi.1999.4168.
  6. Dennis JE, Gay DM, Welsch RE. NL2SOL – an adaptive nonlinear least squares algorithm. ACM Trans Math Software. 1981;7:369–383.
  7. Erickson HP, Klug A. The Fourier transform of an electron micrograph: effects of defocussing and aberrations, and implications for the use of underfocus contrast enhancement. Phil Trans Roy Soc Lond B. 1970;261:105–118.
  8. Frank J. Nachweis von objektbewegungen im lichtoptischen diffraktogramm von elektronenmikroskopischen aufnahmen. Optik. 1969;30:171–180.
  9. Frank J. Determination of source size and energy spread from electron micrographs using the method of Young’s fringes. Optik. 1976;44:379–391.
  10. Frank J. Three-Dimensional Electron Microscopy of Macromolecular Assemblies. Academic Press; San Diego: 1996.
  11. Gill PE, Murray W, Saunders MA, Wright MH. User’s guide for NPSOL version 4.0: a FORTRAN package for nonlinear programming. Stanford University; Palo Alto: 1986.
  12. Gill PE, Murray W, Saunders MA. SNOPT: an SQP algorithm for large-scale constrained optimization. SIAM J Optim. 2002;12:979–1006.
  13. Gould NIM, Orban D, Toint PL. GALAHAD, a library of thread-safe FORTRAN 90 packages for large-scale nonlinear optimization. ACM Trans Math Software. 2004;29:353–372.
  14. Hanszen KJ. New knowledge on resolution and contrast in the electron microscope image. Naturwissenschaften. 1967;54:125–133. doi: 10.1007/BF00625103.
  15. Hanszen KJ. The optical transfer theory of the electron microscope: fundamental principles and applications. In: Barer R, Cosslett VE, editors. Advances in Optical and Electron Microscopy. Academic Press; New York: 1971. pp. 1–84.
  16. Hanszen KJ, Trepte L. Die kontrastübertragung im elektronenmikroskop bei partiell kohärenter beleuchtung. Optik. 1971;33:166–182.
  17. Huang Z, Baldwin PR, Mullapudi S, Penczek PA. Automated determination of parameters describing power spectra of micrograph images in electron microscopy. J Struct Biol. 2003;144:79–94. doi: 10.1016/j.jsb.2003.10.011.
  18. Jiang W, Chang J, Jakana J, Weigele P, King J, Chiu W. Structure of epsilon 15 bacteriophage reveals genome organization and DNA packaging/injection apparatus. Nature. 2006;439:612–616. doi: 10.1038/nature04487.
  19. Jiang W, Li Z, Zhang Z, Baker ML, Prevelige PE, Chiu W. Coat protein fold and maturation transition of bacteriophage P22 seen at sub-nanometer resolutions. Nat Struct Biol. 2003;10:131–135. doi: 10.1038/nsb891.
  20. Jiang W, Baker ML, Jakana J, Weigele PR, King J, Chiu W. Backbone structure of the infectious epsilon15 virus capsid revealed by electron cryomicroscopy. Nature. 2008;451:1130–1134. doi: 10.1038/nature06665.
  21. Liu X, Jiang W, Jakana J, Chiu W. Averaging tens to hundreds of icosahedral particle images to resolve protein secondary structure elements using a multi-path simulated annealing optimization algorithm. J Struct Biol. 2007;160:11–27. doi: 10.1016/j.jsb.2007.06.009.
  22. Ludtke SJ, Baldwin PR, Chiu W. EMAN: semiautomated software for high-resolution single-particle reconstructions. J Struct Biol. 1999;128:82–97. doi: 10.1006/jsbi.1999.4174.
  23. Ludtke SJ, Jakana J, Song JL, Chuang DT, Chiu W. A 11.5 Å single particle reconstruction of GroEL using EMAN. J Mol Biol. 2001;314:253–262. doi: 10.1006/jmbi.2001.5133.
  24. Ludtke SJ, Chen DH, Song JL, Chuang DT, Chiu W. Seeing GroEL at 6 Å resolution by single particle electron cryomicroscopy. Structure. 2004;12:1129–1136. doi: 10.1016/j.str.2004.05.006.
  25. Ludtke SJ, Serysheva II, Hamilton SL, Chiu W. The pore structure of the closed RyR1 channel. Structure. 2005;13:1203–1211. doi: 10.1016/j.str.2005.06.005.
  26. Ludtke SJ, Baker ML, Chen DH, Song JL, Chuang DT, Chiu W. De novo backbone trace of GroEL from single particle electron cryomicroscopy. Structure. 2008;16:441–448. doi: 10.1016/j.str.2008.02.007.
  27. Mallick SP, Carragher B, Potter CS, Kriegman DJ. ACE: automated CTF estimation. Ultramicroscopy. 2005;104:8–29. doi: 10.1016/j.ultramic.2005.02.004.
  28. Marsh MP, Chang JT, Booth CR, Liang NC, Schmid MF, Chiu W. Modular Software platform for low-dose electron microscopy and tomography. J Microsc. 2007;228:384–389. doi: 10.1111/j.1365-2818.2007.01856.x.
  29. MathWorks. Optimization Toolbox User’s Guide. The MathWorks, Inc; Natick: 2004.
  30. More JJ, Sorensen DC, Hillstrom KE, Garbow BS. The MINPACK project. In: Cowell WJ, editor. Sources and Development of Mathematical Software. Prentice Hall; New Jersey: 1984. pp. 88–111.
  31. Nocedal J, Wright SJ. Numerical Optimization. Springer; New York: 1999.
  32. Saad A, Ludtke SJ, Jakana J, Rixon FJ, Tsuruta H, Chiu W. Fourier amplitude decay of electron cryomicroscopic images of single particles and effects on structure determination. J Struct Biol. 2001;133:32–42. doi: 10.1006/jsbi.2001.4330.
  33. Sander B, Golas MM, Stark H. Automatic CTF correction for single particles based upon multivariate statistical analysis of individual power spectra. J Struct Biol. 2003;142:392–401. doi: 10.1016/s1047-8477(03)00072-8.
  34. Schittkowski K. NLPQL: a FORTRAN subroutine for solving constrained nonlinear programming problems. Ann Oper Res. 1986;5:485–500.
  35. Schmid MF, Sherman MB, Matsudaira P, Tsuruta H, Chiu W. Scaling structure factor amplitudes in electron cryomicroscopy using X-ray solution scattering. J Struct Biol. 1999;128:51–57. doi: 10.1006/jsbi.1999.4173.
  36. Thon F. Phase Contrast Electron Microscopy. Academic Press; New York: 1971.
  37. Thuman-Commike PA, Tsuruta H, Greene B, Prevelige PE, Jr, King J, Chiu W. Solution X-ray scattering-based estimation of electron cryomicroscopy imaging parameters for reconstruction of virus particles. Biophys J. 1999;76:2249–2261. doi: 10.1016/S0006-3495(99)77381-9.
  38. Toyoshima C, Yonekura K, Sasabe H. Contrast transfer for frozen-hydrated specimens II. Amplitude contrast at very low frequencies. Ultramicroscopy. 1993;48:165–176.
  39. Velazquez-Muriel JA, Sorzano CO, Fernandez JJ, Carazo JM. A method for estimating the CTF in electron microscopy based on ARMA models and parameter adjustment. Ultramicroscopy. 2003;96:17–35. doi: 10.1016/S0304-3991(02)00377-7.
  40. Zhou ZH, Baker ML, Jiang W, et al. Electron cryomicroscopy and bioinformatics suggest protein fold models for rice dwarf virus. Nat Struct Biol. 2001;8:868–873. doi: 10.1038/nsb1001-868.
  41. Zhu J, Penczek PA, Schroder R, Frank J. Three-dimensional reconstruction with contrast transfer function correction from energy-filtered cryoelectron micrographs: procedure and application to the 70S Escherichia coli ribosome. J Struct Biol. 1997;118:197–219. doi: 10.1006/jsbi.1997.3845.
