Philosophical Transactions of the Royal Society B: Biological Sciences. 2014 Jul 17;369(1647):20130329. doi: 10.1098/rstb.2013.0329

The correlation of single-particle diffraction patterns as a continuous function of particle orientation

Andrew V Martin
PMCID: PMC4052865  PMID: 24914156

Abstract

A statistical model for X-ray scattering of a non-periodic sample to high angles is introduced. It is used to calculate analytically the correlation of distinct diffraction measurements of a particle as a continuous function of particle orientation. Diffraction measurements with shot-noise are also considered. This theory provides a general framework for a deeper understanding of single particle imaging techniques used at X-ray free-electron lasers. Many of these techniques use correlations as a measure of diffraction-pattern similarity in order to determine properties of the sample, such as particle orientation.

Keywords: X-ray diffraction, X-ray free-electron laser, coherent diffractive imaging

1. Introduction

X-ray free-electron lasers (XFELs) can produce pulses of sufficient peak brightness to probe individual viruses, nanoparticles and large biological molecules [1]. With XFEL sources, coherent diffractive imaging techniques can be used to extract structural information about the sample from a diffraction pattern. The study of non-periodic samples is known as ‘single particle imaging’ to distinguish it from crystallography, the study of periodic samples.

Single particle imaging experiments at X-ray laser facilities involve taking a large number of diffraction snapshots of individual particles, typically injected into the path of the X-ray pulses in solution or as an aerosol [2]. A large number of measurements is required, because X-rays diffract very weakly from an individual particle, even with the high intensities of XFEL pulses. It is already possible to collect this data, because X-ray lasers typically have high repetition rates (10 Hz at SACLA [3], 120 Hz at LCLS [4] and in the future 27 kHz at the European XFEL [5]). However, the rapid injection of particles makes it very difficult to measure the individual orientation or conformation of a particle at the time of measurement. There are methods that can determine these parameters from the measured X-ray diffraction as part of the data analysis.

Analysis methods for single particle diffraction often search for information that is common to many different noisy diffraction measurements. For example, pattern-to-pattern correlations are used to classify and average similar diffraction patterns to improve signal to noise before determining orientations by the common arcs method [6]. Bayesian methods use likelihood measures to compare noisy data to a three-dimensional intensity model [7] or a manifold [8]. Graph-theoretic methods [9,10] and geodesic methods [11] map networks formed with Euclidean distances. By treating the dataset as a whole, Bayesian methods and graph-based methods are perhaps the most promising for treating the very low signals expected from individual protein molecules (10²–10³ photons). Notably, however, combined correlation–classification/common-arc orientation methods have also produced good results in low-signal, high-resolution simulations and continue to be actively pursued [12,13]. Bayesian and correlation-based methods have also been combined [14].

Although a diverse array of algorithms has been developed, experimental demonstrations are few and at very low resolution [11,15]. It is still unknown how these algorithms will perform under realistic conditions, at very low signals and at high resolution. While Poisson noise is frequently addressed in simulations, the effects of varying beam parameters, background noise, structural changes or radiation damage have yet to be studied in detail and remain outstanding issues. The further development of algorithms with more realistic simulations is hampered by the time- and memory-intensive nature of high-resolution simulations with a full dataset (10⁵–10⁶ patterns). In many simulation studies, resolution [7] or the range of orientations [8] is restricted.

An alternative to large-scale simulations is to use statistical models to calculate results analytically. This approach was pioneered in an early study of pattern-to-pattern correlations in a diffraction classification scheme to improve the signal-to-noise of molecular diffraction [6]. The statistical foundations come from well-known results in the theory of crystallographic diffraction [16]. Huldt et al. [6] applied these results using a coarse-grained angular approximation sufficient for the limit of perfect alignment and the limit of a large misalignment angle, but lacking the continuity of single particle diffraction. Because single particle diffraction is inherently continuous, it is hard to apply statistical models further without addressing the issue of continuity.

In this study, we present a statistical model for continuous diffraction and continuous changes of particle orientation. We consider the mean pattern-to-pattern correlation as a continuous function of relative particle orientation with and without shot noise. These results compare favourably with simulation studies of virus diffraction that predict a Gaussian-dependence with a width characterized by the size of the virus, but not its internal structure [17]. Our approach provides a common framework to derive, and unify, the results of Huldt et al. [6] and Ziaja et al. [17]. We then use our approach to derive new results concerning the standard deviation of the correlation, with and without shot noise.

There are many different analysis algorithms, but the measures of diffraction similarity they use are much fewer, as explained above. Although we consider only correlations in this paper, we envisage that the statistical tools presented here could be used in future to study Euclidean distances and Bayesian likelihood estimates, so that statistical models can be applied more widely.

2. Theoretical model of single particle diffraction

The intensity scattered by a particle is given by the formula

$$I(\mathbf{q}) = \phi\, r_e^{2}\, |F(\mathbf{q})|^{2}\, \Delta\Omega \qquad (2.1)$$

where ϕ is the incident fluence, r_e is the classical electron radius, ΔΩ is the solid angle and F(q) is the structure factor of the particle. To model the structure factor, consider the smallest parallelepiped that encloses the particle, defined by three vectors a, b and c. The Fourier transform of the solid parallelepiped is

2. 2.2

where a*, b* and c* are reciprocal vectors and A, B, C are the lengths of a, b and c. An orthonormal basis for the Fourier transform of any object enclosed by the parallelepiped can be defined by functions S(q − q_hkl), where q_hkl = ha* + kb* + lc*.

The function S(q) has the following useful properties

2. 2.3
2. 2.4
2. 2.5

The first two relations are properties of the sinc functions in equation (2.2). The last property can be proved by first translating all the functions S(q − q_hkl) by a constant vector q_c to define a new orthonormal basis

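$$S(\mathbf{q} - \mathbf{q}'_{hkl}), \qquad \mathbf{q}'_{hkl} = \mathbf{q}_{hkl} + \mathbf{q}_c \qquad (2.6)$$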

The new basis can be chosen such that q coincides with one of the translated sampling points. In the new basis, there is one term in the sum over hkl that takes the value 1, and all other terms contribute 0 (using equations (2.3) and (2.4)). Therefore, equation (2.5) holds for all q values.
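The properties above can be checked numerically in one dimension. The following is a minimal sketch, assuming a unit-width box so that the shape transform is the normalized sinc function and the sampling points are the integers; the names `S`, `n` and `x` are illustrative only. It evaluates both the plain sum and the sum of squares over shifted copies, without asserting which of the two forms equation (2.5) takes.

```python
import numpy as np

def S(x):
    """1D shape transform of a unit-width box: sin(pi x) / (pi x)."""
    return np.sinc(x)  # NumPy's sinc is the normalized sinc

n = np.arange(-20000, 20001)   # truncated set of integer sampling points
x = 0.3                        # arbitrary off-grid position (assumption)

value_at_origin = S(0.0)                          # equals 1
max_at_other_nodes = np.abs(S(n[n != 0])).max()   # zero up to rounding

# Sums over all shifted copies, evaluated at an off-grid point.
sum_S = S(x - n).sum()
sum_S2 = (S(x - n) ** 2).sum()

# Translating the grid by x puts one sampling point exactly at x:
# one term equals 1 and every other term vanishes.
shifted_terms = S(x - (n + x))

print(value_at_origin, max_at_other_nodes)
print(sum_S, sum_S2, shifted_terms.sum())
```

Both truncated sums approach 1 as the index range grows, which is the one-dimensional analogue of the translation argument given in the text.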

The structure factor of a single non-periodic particle can be written as

$$F(\mathbf{q}) = \sum_{hkl} F_{hkl}\, S(\mathbf{q} - \mathbf{q}_{hkl}) \qquad (2.7)$$

The terms Fhkl are samples of the Fourier transform of the particle's electron density. At high scattering angles, the terms Fhkl can be described statistically by assuming that all atoms are randomly located [16]. The real and imaginary parts of Fhkl are then drawn randomly from the following distribution

2. 2.8

where ⟨I⟩ is the mean intensity at a pixel. Table 1 summarizes some useful quantities that can be calculated using this distribution. We assume that the variables F_hkl are statistically independent, such that

2. 2.9

This is exact if the object fills the parallelepiped used to construct S(q − q_hkl) and the atomic positions are truly uncorrelated; otherwise, it is an approximation. By defining the parallelepiped to be the smallest that can enclose the object, the error from making this approximation is minimized.
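The random-atom picture behind this statistical model can be illustrated with a short Monte Carlo sketch. It assumes point-like scatterers of unit strength placed uniformly in a unit cube, evaluated at a few arbitrary non-zero integer indices; the atom count, trial count and index choices are hypothetical. Under these assumptions the mean of F_hkl is close to zero, the mean of |F_hkl|² is close to the number of atoms, the fourth moment is close to twice the squared second moment, and coefficients at different indices are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_trials = 30, 5000          # hypothetical sizes
hkl = np.array([[3, 1, 2], [5, 0, 4], [2, 7, 1], [4, 4, 5]])  # arbitrary indices

# Fractional coordinates of unit point scatterers, uniform in the box.
xyz = rng.random((n_trials, n_atoms, 3))

# F_hkl = sum_j exp(2 pi i (h x_j + k y_j + l z_j)) for each trial and index.
phase = 2j * np.pi * np.einsum('tja,ra->trj', xyz, hkl)
F = np.exp(phase).sum(axis=-1)        # shape (n_trials, n_hkl)

intensity = np.abs(F) ** 2
mean_F = F.mean()                                      # expected near 0
mean_I = intensity.mean()                              # expected near n_atoms
moment_ratio = (intensity ** 2).mean() / mean_I ** 2   # expected near 2
cross = (F[:, 0] * np.conj(F[:, 1])).mean()            # expected near 0

print(abs(mean_F), mean_I, moment_ratio, abs(cross))
```

For a finite number of atoms the moment ratio is slightly below 2; it tends to 2 as the atom count grows, which is the regime the Gaussian model describes.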

Table 1.

Averages of the structure factor using the statistical model.

Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
0 Inline graphic 0 Inline graphic 0 Inline graphic 0 Inline graphic

To account for centrosymmetry, we need to exclude from equation (2.9) the case where the two sets of indices are related by inversion (a Friedel pair). This term contributes only in special cases, such as a rotation by π around the beam axis at low resolution. In order to present simple and concise derivations in what follows, we ignore the centrosymmetric contributions, which are not difficult to include if required.

3. Evaluating correlations

The intensity measured by a pixel is denoted by I(q), and the intensity measured by the same pixel after the molecule has been rotated is denoted by I(q + qα). A general three-dimensional rotation of the molecule can be specified by three Euler angles. However, it is more convenient to label the displacement vector qα by the angular distance between the two q-space points sampled by the pixel. For a general rotation, qα is not the same for all pixels and α is not necessarily equal to any of the Euler angles, though for each pixel it can be calculated from them. The mean value of the correlation between I(q) and I(q + qα) is then given by

3. 3.1

To simplify the notation, the three subscripts hkl have been replaced by a single subscript n, such that S_n ≡ S(q − q_hkl), and we have defined Inline graphic. A simplification can be made with a judicious change of basis via equation (2.6), such that q = q_0. Then, m = n = 0 and equation (3.1) simplifies to

3. 3.2

In order to calculate the mean and standard deviation of the correlation, we can use the fact that many terms in equation (3.2) are statistically independent. If two random variables, A and B, are statistically independent, then the following relation holds

$$\langle AB \rangle = \langle A \rangle \langle B \rangle \qquad (3.3)$$

Combining this relation with the fact that Inline graphic, we find that there are only two sets of values of r and s for which Inline graphic is not zero. The first case occurs when r = s = 0 and contributes the following term

3. 3.4

The second case arises when r = s ≠ 0 and contributes

3.

Using the results from table 1 and combining the two non-zero terms, we find that

3. 3.6

The predicted angular-dependence of the correlation is approximately Gaussian for small angles α:

3. 3.7

where a resolution shell at a magnitude q has been considered, such that |q_α| = qα. The width of the angular dependence is close to, but not exactly, that given by the simulations reported in [17]. In that paper, the half-width at half maximum value was found to be 1/4qR, whereas equation (3.7) predicts a narrower width of Inline graphic. The discrepancy is around 20% and is due to the fact that the statistical independence of F_hkl (equation (2.9)) is only approximately true for the icosahedral virus particle used in the simulations for reference [17]. Nevertheless, the theory presented here does account for the most significant contribution to the angular dependence of the correlation.
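The dependence of the mean correlation on the displacement in q-space can be illustrated with a one-dimensional Monte Carlo sketch. It assumes independent complex Gaussian coefficients F_n on an integer grid, a normalized sinc basis, and a truncated index range; all names and sizes are hypothetical. The simulated correlation, normalized by ⟨I⟩², is compared with 1 + S²(Δ), the standard result for a circular Gaussian field. That closed form is an assumption of this sketch and is not quoted from equation (3.6).

```python
import numpy as np

rng = np.random.default_rng(1)
n = np.arange(-40, 41)        # truncated set of sampling indices
n_trials = 20000              # number of random "particles"

# Independent complex Gaussian coefficients with unit mean intensity.
F_n = (rng.normal(size=(n_trials, n.size))
       + 1j * rng.normal(size=(n_trials, n.size))) / np.sqrt(2)

def intensity(q):
    """Intensity |F(q)|^2 with F(q) = sum_n F_n sinc(q - n), for every trial."""
    basis = np.sinc(q - n)
    return np.abs(F_n @ basis) ** 2

q = 0.37                                          # arbitrary off-grid point
shifts = np.array([0.0, 0.25, 0.5, 1.0, 3.5])     # displacements in grid units
I0 = intensity(q)

simulated = np.array([(I0 * intensity(q + d)).mean() for d in shifts])
simulated /= I0.mean() ** 2
expected = 1.0 + np.sinc(shifts) ** 2

for d, s, e in zip(shifts, simulated, expected):
    print(f"shift={d:4.2f}  simulated={s:5.3f}  gaussian-field={e:5.3f}")
```

The normalized correlation falls from about 2 at zero displacement to about 1 once the displacement exceeds roughly one sampling interval, so the width is set by the size of the enclosing box rather than by the internal structure.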

4. Derivation of the standard deviation

The standard deviation of the correlation Inline graphic is calculated from

4. 4.1

We again use equation (2.6) to choose a set of basis functions such that q is located at the position q0. Therefore, we can write

4. 4.2

Using the results from table 1, it can be shown that only terms with m = s, n = t or m = t, n = s are non-zero. When these restrictions have been applied, there are only six non-zero terms to evaluate

4. 4.3
4. 4.4
4.
4. 4.5

and

4.
4. 4.6

Taking the sum of these equations, we obtain the result

4. 4.7

The standard deviation is

4. 4.8

For the limiting case of perfect alignment, α = 0, we have Inline graphic. For the case of large α, we have Inline graphic. Both these limits agree with those given by Huldt et al. [6].
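The two limits can be checked numerically with a minimal sketch. It assumes that the per-pixel intensity is exponentially distributed (which follows when the real and imaginary parts of the structure factor are Gaussian), that intensities are in units of ⟨I⟩, and that the large-angle limit corresponds to two statistically independent intensities. Under these assumptions the variance of the product is 20 for perfect alignment and 3 for unrelated orientations.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples = 1_000_000

# Exponentially distributed intensities with unit mean (assumption).
I1 = rng.exponential(1.0, n_samples)
I2 = rng.exponential(1.0, n_samples)

aligned = I1 * I1       # same intensity measured twice (alpha = 0)
unrelated = I1 * I2     # independent intensities (large alpha)

mean_aligned, var_aligned = aligned.mean(), aligned.var()
mean_unrelated, var_unrelated = unrelated.mean(), unrelated.var()

print(mean_aligned, np.sqrt(var_aligned))      # about 2 and sqrt(20)
print(mean_unrelated, np.sqrt(var_unrelated))  # about 1 and sqrt(3)
```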

5. Poisson noise

To determine the expected correlation when the measurement contains shot noise, we follow the method of Huldt et al. [6]. We define P(K, I) to be the probability of measuring a photon count K and having an expected intensity of I. The probability P(K, I) can be written as

$$P(K, I) = P(K \mid I)\, P(I) \qquad (5.1)$$

where P(K|I) is the conditional probability of measuring K when the expected intensity is known to be I. The conditional probability is given by the Poisson distribution:

$$P(K \mid I) = \frac{I^{K} e^{-I}}{K!} \qquad (5.2)$$

This distribution can be used to show the following relations

5. 5.3

and

5. 5.4

A correlation of the photon count recorded at a detector pixel is given by

5. 5.5

This shows that the mean correlation is not changed by the introduction of Poisson noise. The variance is calculated by a similar derivation and turns out to be

5. 5.6

In the limit of large α, this agrees with Huldt et al. [6]. However, in that paper, there was a mistake for the α = 0 case. The correct limiting result is Inline graphic.
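The statement that shot noise leaves the mean correlation unchanged can be illustrated with a short simulation. This is a sketch under stated assumptions: exponentially distributed expected intensities with a hypothetical mean of 5 photons per pixel, and two distinct measurements whose Poisson noise is independent given the expected intensities.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples = 500_000
mean_intensity = 5.0    # hypothetical mean photon count per pixel

I1 = rng.exponential(mean_intensity, n_samples)
I2 = rng.exponential(mean_intensity, n_samples)

def noisy_product(Ia, Ib):
    """Product of two independent Poisson counts with expected values Ia, Ib."""
    Ka = rng.poisson(Ia)
    Kb = rng.poisson(Ib)
    return Ka * Kb

# Perfect alignment: both measurements share the same expected intensity.
clean_aligned = I1 * I1
noisy_aligned = noisy_product(I1, I1)

# Large misalignment: the expected intensities are independent.
clean_unrelated = I1 * I2
noisy_unrelated = noisy_product(I1, I2)

print(clean_aligned.mean(), noisy_aligned.mean())
print(clean_unrelated.mean(), noisy_unrelated.mean())
print(clean_aligned.var(), noisy_aligned.var())
```

The means agree with and without noise in both limits, while the variance of the noisy product is larger, which is why the noise enters the standard deviation but not the mean.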

6. Pattern-to-pattern correlations

The results presented so far refer to the correlation of two diffraction measurements at a point (i.e. a particular pixel on the detector). A correlation of two diffraction patterns involves taking the sum over all the pixels. For an arbitrary rotation of the molecule in three dimensions, the difference of the q coordinate of a pixel, q_α, will be different for each pixel. The mean and standard deviation of the pattern-to-pattern correlation involve an integral over the point-to-point correlation (equation (3.6)) with a distribution of values for q_α generated by the rotation. The intensity values on neighbouring pixels are not independent. As discussed in [6], the pattern-to-pattern correlation is calculated by scaling the mean pixel correlation by the number of independent variables needed to describe the intensity. The number of independent variables needed to describe the intensity is determined by sampling along the a*, b* and c* directions at twice the rate used for the structure factor in equation (2.7).

A simple case occurs when the molecule only rotates around the beam axis, so that qα is a function of resolution shell, but not the polar angular coordinate of the pixel. To provide a simple illustration of the results we have obtained, we show an example of this special case rather than show a general three-dimensional rotation. The number of independent sampling points in the resolution shell was set to 200, which corresponds to a resolution 32 times smaller than the particle, e.g. the diffraction from a 9.6 nm particle at 0.3 nm resolution. The mean correlation as a function of angle is plotted in figure 1. The range of correlation values within three standard deviations of the mean is also shown, indicating how the distribution narrows as α increases. As a consequence of this change of distribution, the mean correlation is not a perfect indicator of the most probable angle that gives rise to that correlation value. This can be seen in figure 2, which shows that the mean correlation of perfectly aligned particles, Inline graphic, can arise from a range of orientations with close to equal probability. The most likely angle is around 1/4qR. Values of the correlation higher than the mean value are needed to ensure selecting the aligned case with the greatest probability.
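The narrowing of the distribution with increasing α can be made concrete with a small arithmetic sketch. It assumes that the 200 sampling points are statistically independent, so that means and variances add, and that the per-point moments are those of exponentially distributed intensities in units of ⟨I⟩ (mean 2 and variance 20 when aligned, mean 1 and variance 3 when unrelated). It is not intended to reproduce figure 1, only the two limiting ends of such a curve.

```python
import math

n_points = 200    # independent sampling points in the shell (from the text)

# Per-point moments in units of <I>^2, assuming exponential statistics.
mean_aligned, var_aligned = 2.0, 20.0
mean_unrelated, var_unrelated = 1.0, 3.0

def pattern_stats(mean_point, var_point, n):
    """Mean and standard deviation of a sum over n independent points."""
    return n * mean_point, math.sqrt(n * var_point)

norm = n_points * mean_aligned    # normalize the correlation to 1 at alpha = 0

m_a, s_a = pattern_stats(mean_aligned, var_aligned, n_points)
m_u, s_u = pattern_stats(mean_unrelated, var_unrelated, n_points)

aligned_mean, aligned_sigma = m_a / norm, s_a / norm
unrelated_mean, unrelated_sigma = m_u / norm, s_u / norm

print(f"aligned:   {aligned_mean:.3f} +/- {3 * aligned_sigma:.3f} (3 sigma)")
print(f"unrelated: {unrelated_mean:.3f} +/- {3 * unrelated_sigma:.3f} (3 sigma)")
```

Under these assumptions the 3σ band around the aligned value is more than twice as wide as the band around the unrelated value, and the two bands overlap, which is consistent with the observation that a correlation equal to the aligned mean does not by itself identify the aligned case.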

Figure 1.

The mean correlation of the intensity in a resolution shell as a function of relative molecular orientation for a rotation around the beam axis for the noise-free case. It applies to a resolution shell of q = 32/2R, where R is the particle's radius. The mean correlation is shown as the solid line and the upper and lower bounds 3σ from the mean are indicated by the dashed lines. The correlation has been normalized to 1 at α = 0. Its actual value in an experiment will be dependent on the number of incident photons and the composition of the sample.

Figure 2.

The probability of obtaining particular correlation values as a function of relative particle orientation. Note that the height of the Inline graphic distribution has been scaled to improve visibility and is illustrative of the shape of the distribution only.

7. Conclusion

By addressing the issue of continuity in statistical models, we aim to improve their suitability for the analysis of single particle diffraction, which is inherently continuous. We view statistical models as potentially complementary to numerical simulations in ongoing efforts to address the outstanding analysis issues for single particle imaging, particularly for low-signal, high-resolution applications to individual molecules. Although not all single-particle analysis algorithms are based on correlations, the methods we have presented could potentially be applied in future to Euclidean distances and likelihood functions to broaden the applicability of statistical models to Bayesian methods, graph-theoretic methods and manifold techniques.

Acknowledgements

This research was supported by the Australian Research Council through its Centres of Excellence programme.

References

  • 1. Neutze R, Wouts R, van der Spoel D, Weckert E, Hajdu J. 2000. Potential for biomolecular imaging with femtosecond X-ray pulses. Nature 406, 752–757. (10.1038/35021099)
  • 2. Seibert MM, et al. 2011. Single mimivirus particles intercepted and imaged with an X-ray laser. Nature 470, 78–81. (10.1038/nature09748)
  • 3. Ishikawa T, et al. 2012. A compact X-ray free-electron laser emitting in the sub-ångström region. Nat. Photon. 6, 540–544. (10.1038/nphoton.2012.141)
  • 4. Emma P, et al. 2010. First lasing and operation of an Ångstrom-wavelength free-electron laser. Nat. Photon. 4, 641–647. (10.1038/nphoton.2010.176)
  • 5. European XFEL. See http://www.xfel.eu.
  • 6. Huldt G, Szoke A, Hajdu J. 2003. Diffraction imaging of single particles and biomolecules. J. Struct. Biol. 144, 219–227. (10.1016/j.jsb.2003.09.025)
  • 7. Loh NTD, Elser V. 2009. Reconstruction algorithm for single-particle diffraction imaging experiments. Phys. Rev. E 80, 026705. (10.1103/PhysRevE.80.026705)
  • 8. Fung R, Shneerson V, Saldin DK, Ourmazd A. 2009. Structure from fleeting illumination of faint spinning objects in flight. Nat. Phys. 5, 64–67. (10.1038/nphys1129)
  • 9. Giannakis D, Schwander P, Ourmazd A. 2012. The symmetries of image formation by scattering. I. Theoretical framework. Opt. Express 20, 12 799–12 826. (10.1364/OE.20.012799)
  • 10. Schwander P, Giannakis D, Yoon CH, Ourmazd A. 2012. The symmetries of image formation by scattering. II. Applications. Opt. Express 20, 12 827–12 849. (10.1364/OE.20.012827)
  • 11. Kassemeyer S, Jafarpour A, Lomb L, Steinbrener J, Martin AV, Schlichting I. 2013. Optimal mapping of X-ray laser diffraction patterns into three dimensions using routing algorithms. Phys. Rev. E 88, 042710. (10.1103/PhysRevE.88.042710)
  • 12. Bortel G, Faigel G, Tegze M. 2009. Classification and averaging of random orientation single macromolecular diffraction patterns at atomic resolution. J. Struct. Biol. 166, 226–233. (10.1016/j.jsb.2009.01.005)
  • 13. Bortel G, Tegze M. 2011. Common arc method for diffraction pattern orientation. Acta Crystallogr. A 67, 533–543. (10.1107/S0108767311036269)
  • 14. Tegze M, Bortel G. 2012. Atomic structure of a large biomolecule from diffraction patterns of random orientations. J. Struct. Biol. 179, 41–45. (10.1016/j.jsb.2012.04.014)
  • 15. Loh ND, et al. 2010. Cryptotomography: reconstructing 3D Fourier intensities from randomly oriented single-shot diffraction patterns. Phys. Rev. Lett. 104, 225501. (10.1103/PhysRevLett.104.225501)
  • 16. Shmueli U, Wilson AJC. 2006. International tables for crystallography, vol. B, ch. 2.1.5. IUCR.
  • 17. Ziaja B, Martin AV, Wang F, Chapman HN, Weckert E. 2011. Theoretical estimation for correlations of diffraction patterns from objects differently oriented in space. Ultramicroscopy 111, 793–797. (10.1016/j.ultramic.2010.12.021)

Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
