Abstract
Time-resolved serial femtosecond crystallography (TR-SFX) provides access to protein dynamics on sub-picosecond timescales with atomic resolution. Due to the nature of the experiment, these datasets are often highly incomplete, and the measured diffracted intensities are affected by partiality. To tackle these issues, one established procedure is to split the data into time bins and average the multiple measurements of equivalent reflections within each bin. This binning and averaging often involves a loss of information. Here, we propose an alternative approach, which we call low-pass spectral analysis (LPSA). In this method, the data are projected onto the subspace defined by a set of trigonometric functions with frequencies up to a certain cutoff. This approach attenuates undesirable high-frequency features and facilitates retrieving the underlying dynamics. A time-lagged embedding step can be included prior to subspace projection to improve the stability of the results with respect to the parameters involved. Subsequent modal decomposition produces a low-rank description of the system's evolution. Using a synthetic time-evolving model with incomplete and partial observations, we analyze the quality of the signal retrieved by LPSA as a function of the parameters involved. We compare the performance of LPSA to that of a range of other sophisticated data analysis techniques. We show that LPSA achieves excellent dynamics reconstruction at modest computational cost. Finally, we demonstrate the superiority of dynamics retrieval by LPSA compared to time binning and merging, which is, to date, the most commonly used method to extract dynamical information from TR-SFX data.
I. INTRODUCTION
Time-resolved serial femtosecond crystallography (TR-SFX) has emerged as a prominent technique for investigating the dynamics of light-sensitive macromolecules with atomic spatial resolution at ultrafast timescales.1–4 In a typical experiment, a laser pulse pumps the molecules into an excited state. An x-ray pulse from an X-ray Free-Electron Laser (X-FEL) probes the system at a certain time delay after photo-excitation, typically in the femtosecond to picosecond range. Microcrystals of the protein of interest, embedded in a viscous medium, are delivered into the interaction region through a continuous flow. The experiment is carried out in a serial fashion. Since the interaction with the X-FEL beam is destructive, each microcrystal gives rise to at most one diffraction pattern, from a specific, random orientation (Fig. 1).
FIG. 1.
Schematic representation of the time-resolved serial crystallography experiment. The microcrystals are brought into the interaction region by a continuous flow of a viscous medium. Individual crystals are probed by an X-FEL pulse at a certain time delay after optical pumping. The diffraction condition, in which the scattering vector equals a reciprocal lattice vector, is represented graphically by the diffraction sphere construction.10 A dataset is composed of an ensemble of frames, each recording a diffraction pattern from an individual crystal in a random orientation.
Under typical experimental conditions, only a small fraction of the reciprocal lattice points within the accessible resolution range fulfill the elastic scattering condition for a specific crystal orientation (Fig. 1). The diffraction signal recorded in one frame is therefore highly incomplete [Fig. 2(a)]. Because crystals have a finite size and some extent of lattice disorder, and the spectral distribution of the X-FEL beam is limited, the recorded intensities are in general smaller (by orientation-dependent factors) than the corresponding diffraction intensities from an infinite and perfectly ordered crystal [Fig. 2(b)]. This effect is commonly referred to as partiality.5 Variations in crystal size and beam fluence distribution can be accounted for by estimating and applying frame-related scale factors. Such estimates can be computed using standard SFX software.6 The uncertainty in timing between pump and probe pulses (timing jitter)7,8 and photon counting errors9 are other factors that affect time-resolved serial crystallography data.
FIG. 2.
Schematic representation of data incompleteness and partiality. (a) The diffraction condition is satisfied at the intersection between the reciprocal lattice, whose orientation is determined by that of the crystal, and the diffraction sphere, whose radius is set by the inverse of the x-ray wavelength. Most reciprocal lattice points are unmeasured in an individual frame. This determines the incompleteness of the data. (b) Due to the finite size of the crystal and lattice disorder, the reciprocal lattice regions that can give rise to constructive interference are not vanishingly small, but can rather be modeled as three-dimensional ellipsoids. The spectral distribution of the incident X-FEL beam can be accounted for by attributing a finite thickness to the scattering sphere (scattering shell). In this model, diffraction arises from the volumes at the intersection between reciprocal space ellipsoids and the scattering shell,11 so that the resulting partial intensities are, in general, smaller than and not representative of the full intensities that would arise from an infinite and perfect crystal.
To retrieve a complete set of reciprocal-space intensities and mitigate the effects of partiality, a binning-and-merging procedure is routinely adopted (see, for example, Refs. 12–18). This approach involves dividing the pump–probe delay window into time bins and merging the measurements by averaging the equivalent reflections within each bin. The number of frames required in the time-binning approach is commonly on the order of tens of thousands for each time point, often leading to broad bins and a consequent deterioration of the timing information.
Here, we analyze alternative strategies to the binning-and-merging approach, with the purpose of improving the quality of the reconstructed dynamics and limiting information losses. We specifically tackle the issues of data incompleteness and partiality. To demonstrate our findings while isolating these effects, we employ a synthetic dataset. We present a new method, which we call low-pass spectral analysis (LPSA), to extract accurate dynamics from extremely incomplete and partial data. We compare the performance of LPSA, in terms of reconstruction quality and computational effort, to that of time binning as well as a range of other dynamics-retrieval techniques.
II. OUTLINE OF THE PROBLEM
Consider $m$ reciprocal lattice points in the resolution range of interest. Let $x_i$ be the diffraction intensity related to the lattice point $i$. The $m$-tuple $\mathbf{x} = (x_1, \ldots, x_m)$, for a given set of non-negative numbers $x_i$, may be viewed as a possible diffraction pattern. If the physical system giving rise to the diffraction pattern undergoes dynamical evolution, then, in the absence of stochasticity, the dynamics are governed by a differential equation with respect to time. This implies that the associated diffraction pattern as a function of time, $\mathbf{x}(t)$, must be at least singly differentiable with respect to $t$. Hence, in the immediate vicinity of any time point $t_0$, we can approximate $\mathbf{x}(t)$ as follows:
| $$\mathbf{x}(t) \approx \mathbf{x}(t_0) + \left.\frac{d\mathbf{x}}{dt}\right|_{t_0} (t - t_0).$$ | (1) |
This means that locally, the points $\mathbf{x}(t)$ lie on a straight line. Local linearity renders $\{\mathbf{x}(t)\}$ a one-dimensional manifold.19
For a given orthonormal basis of $\mathbb{R}^m$, $\{\mathbf{e}_k\}_{k=1}^{m}$, the system's trajectory in reciprocal space can be expanded as follows:
| $$\mathbf{x}(t) = \sum_{k=1}^{m} c_k(t)\, \mathbf{e}_k,$$ | (2) |
with the expansion coefficients,
| $$c_k(t) = \mathbf{e}_k^{\mathsf{T}}\, \mathbf{x}(t).$$ | (3) |
The basis vectors associated with nonnegligible expansion coefficients span the linear subspace of $\mathbb{R}^m$ explored by the trajectory $\mathbf{x}(t)$. Under experimental conditions, the one-dimensional manifold that underlies the system's dynamical evolution may be completely unrecognizable. In practice, the stochastic effects of incompleteness and partiality alter the trajectory of the system in data space and artificially increase the apparent dimension of the subspace explored by the dynamics.
The task at hand is then twofold. First, we have to retrieve the one-dimensional manifold described by the dynamical system, that is, mitigate the stochastic effects introduced by data incompleteness and partiality. Second, we need to identify the linear space of minimal dimension in which the recovered manifold can be embedded. There are various strategies to accomplish the first of our tasks, which will be detailed in Secs. III and V. Subsequent singular value decomposition (SVD)20,21 then identifies the linear subspace explored by the system.
Since the recovered dynamics are expected to describe a locally linear manifold, to help the analysis of our results, we introduce a measure of the deviation from local linearity. We denote with the sequence of time-ordered data vectors related to the time points , with . From any pair of temporally neighboring points, and , we can construct a local linear approximation to , which we call ,
| (4) |
Local linearity implies that the two immediate temporal neighbors of and , i.e., and , lie close to the points and , respectively. We, therefore, define
| (5) |
The average over all represents our measure of deviation from local linearity .
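To make the idea concrete, the following is a minimal sketch of one plausible way to compute such a measure. The exact functional form is an assumption made here for illustration (squared Euclidean residuals of the two outer neighbors, extrapolated linearly from each interior pair and then averaged); the function name `local_linearity_deviation` and its arguments are hypothetical.

```python
import numpy as np

def local_linearity_deviation(x, t):
    """Average deviation from local linearity (illustrative form, assumed).

    x : array of shape (m, s), one time-ordered data vector per column
    t : array of shape (s,), strictly increasing time points
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    s = x.shape[1]
    deviations = []
    # Each interior pair (j, j+1) defines a straight line in data space.
    for j in range(1, s - 2):
        slope = (x[:, j + 1] - x[:, j]) / (t[j + 1] - t[j])
        # Extrapolate the line to the two outer temporal neighbors.
        pred_prev = x[:, j] + slope * (t[j - 1] - t[j])
        pred_next = x[:, j] + slope * (t[j + 2] - t[j])
        # Assumed residual: sum of squared distances to the extrapolations.
        deviations.append(
            np.sum((x[:, j - 1] - pred_prev) ** 2)
            + np.sum((x[:, j + 2] - pred_next) ** 2)
        )
    return float(np.mean(deviations))
```

For a trajectory that is exactly a straight line the measure vanishes up to rounding, while added noise increases it, which is the behavior exploited when choosing the number of modes.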
III. LOW-PASS SPECTRAL ANALYSIS
We map the incomplete data to a sparse-matrix representation, that is, we set every unmeasured component of the data to zero.22 With this choice, the fundamental issue that we need to address and mitigate is the stochastic weighting (by factors between zero and one) of the underlying intensities, introduced by sparsity and partiality. To alleviate the effects of randomness, we project the data onto the subspace spanned by a set of trigonometric functions. The frequencies of these functions are integer multiples of the first harmonic, which corresponds to one period of oscillation in the time range covered by the (time-lagged embedded) data points, up to a certain cutoff. This procedure effectively removes undesired high-frequency features and recovers the system's underlying trajectory in data space. Time-lagged embedding of the data23–25 can be performed prior to subspace projection to improve the stability of the reconstructed signal, at an increased computational cost (Secs. IV and VI). Subsequent modal decomposition represents the dynamical evolution of the system in the subspace of minimal dimension. We present the details of the method hereafter.
A. Time-lagged embedding
For sampled time points, the columns of give a discretized representation of the trajectory of interest . Hence, for a given time point , , where is the th column of . With the values of missing entries set equal to zero, is typically highly sparse. The time-lagged embedding procedure, with concatenation parameter , consists in the delayed-coordinate mapping defined by
| (6) |
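The delayed-coordinate mapping can be sketched as follows. This is an illustrative implementation under an assumed ordering convention (each supervector stacks the most recent snapshot first, followed by the preceding ones); the function name `time_lagged_embedding` and the dense-array representation are choices made here, and a sparse format would be preferable for realistic data sizes.

```python
import numpy as np

def time_lagged_embedding(X, q):
    """Concatenate q consecutive columns of X into supervectors.

    X : array of shape (m, S), one data vector per column (time ordered)
    q : concatenation parameter (number of stacked snapshots)

    Returns an array of shape (q * m, S - q + 1). Column j stacks
    X[:, j+q-1], X[:, j+q-2], ..., X[:, j] (assumed ordering).
    """
    m, S = X.shape
    if not 1 <= q <= S:
        raise ValueError("q must satisfy 1 <= q <= number of columns")
    s = S - q + 1  # number of embedded points
    blocks = [X[:, q - 1 - k : q - 1 - k + s] for k in range(q)]
    return np.vstack(blocks)
```

With `q = 1` the mapping returns the data unchanged, so the embedded and original representations coincide in that case.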
B. Low-pass filtering
With the purpose of denoising the data by removal of the high-frequency components, and given the (time-lagged embedded) data , with , we define a time-domain projector , with . We consider the fundamental oscillation period corresponding to the time window spanned by the ensemble of the time-lagged embedded points. We define , where is the th column of . The matrix entries are obtained by sampling a series of trigonometric functions at discrete time points as follows:
| (7) |
| (8) |
for ; and is a constant vector. The columns of , are obtained by orthonormalization of the vectors to fulfill the condition,
| (9) |
with . Only if equals ,
| (10) |
holds. Typical choices of satisfy to (i) remove the undesired high-frequency features and (ii) make the subsequent calculations more affordable. The low-pass filtered data retain frequency components up to the cutoff .
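A minimal sketch of the construction of such a basis is given below. It assumes uniformly spaced sample times and a fundamental period equal to the number of samples, and it uses a QR factorization as the orthonormalization step; the names `lowpass_basis`, `s`, and `jmax` are introduced here for illustration only.

```python
import numpy as np

def lowpass_basis(s, jmax):
    """Orthonormal basis of a constant plus cos/sin harmonics up to jmax.

    s    : number of (time-lagged embedded) sample points
    jmax : highest retained harmonic of the fundamental period
    Returns Phi of shape (s, 2 * jmax + 1) with Phi.T @ Phi = identity.
    """
    if 2 * jmax + 1 > s:
        raise ValueError("basis size 2*jmax+1 cannot exceed s")
    k = np.arange(s)
    columns = [np.ones(s)]  # constant vector
    for j in range(1, jmax + 1):
        phase = 2.0 * np.pi * j * k / s  # assumed uniform sampling
        columns.append(np.cos(phase))
        columns.append(np.sin(phase))
    F = np.column_stack(columns)
    # Reduced QR orthonormalizes the sampled functions column by column.
    Phi, _ = np.linalg.qr(F)
    return Phi
```

Applying `Phi @ Phi.T` to a time series then keeps harmonics up to `jmax` and suppresses those above it; the projector reduces to the identity only when the basis spans the full space.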
C. Modal decomposition
The linear subspace of explored by the system's dynamics can be identified by modal decomposition of the linear mapping given by the subspace projection,
| (11) |
The SVD of is
| (12) |
with , and is the rank of . Using , and , with and columns of and , respectively, the SVD gives the modal decomposition,
| (13) |
D. Reconstruction in time-lagged embedding space
The reconstructed time-lagged embedded data points are
| (14) |
where only dominant modes are retained.
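The projection, modal decomposition, and truncated reconstruction can be sketched together as follows. The sketch assumes that the projected matrix is the product of the (time-lagged embedded) data matrix with the orthonormal basis, and that the reconstruction maps the retained right singular vectors back to the time domain through the same basis; the function and variable names are hypothetical.

```python
import numpy as np

def lpsa_reconstruct(X, Phi, n_modes):
    """Project, decompose by SVD, and rebuild with the leading modes.

    X       : array (n, s), data vectors as columns (time ordered)
    Phi     : array (s, r), orthonormal time-domain basis (Phi.T @ Phi = I)
    n_modes : number of dominant modes to retain
    Returns the reconstruction (n, s) and the full singular value spectrum.
    """
    A = X @ Phi  # (n, r): far smaller than X when r << s
    U, sing, Vt = np.linalg.svd(A, full_matrices=False)
    U_p = U[:, :n_modes]
    s_p = sing[:n_modes]
    Vt_p = Vt[:n_modes]
    # Each mode is sigma_k * u_k * (Phi v_k)^T, summed over retained modes.
    X_rec = (U_p * s_p) @ (Vt_p @ Phi.T)
    return X_rec, sing
```

When all modes are retained, the result coincides with the low-pass filtered data; truncating to fewer modes additionally discards the low-weight terms of the decomposition.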
E. Signal retrieval in data space
The reconstructed time-lagged embedded points have the structure described in Eq. (6). Hence, the data point can be retrieved by averaging the reconstructed copies extracted from time-lagged embedded vectors in the range from to .
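The averaging step can be illustrated with the short sketch below. It assumes the supervector ordering in which block k of embedded column j holds the snapshot with index j + q - 1 - k, and it averages every available copy of each snapshot, including those near the edges of the time range where fewer copies exist; the name `unembed_by_averaging` is hypothetical.

```python
import numpy as np

def unembed_by_averaging(E, m, q):
    """Recover data-space vectors from time-lagged embedded vectors.

    E : array (q * m, s), reconstructed embedded vectors as columns
    m : dimension of the original data vectors
    q : concatenation parameter used for the embedding
    Returns an array (m, s + q - 1) of averaged copies.
    """
    s = E.shape[1]
    S = s + q - 1
    total = np.zeros((m, S))
    count = np.zeros(S)
    for k in range(q):
        block = E[k * m : (k + 1) * m, :]  # copies of snapshots j + q - 1 - k
        start = q - 1 - k
        total[:, start : start + s] += block
        count[start : start + s] += 1
    # Every snapshot appears in at least one embedded vector, so count >= 1.
    return total / count
```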
IV. RESULTS OF LPSA
To investigate the capabilities of LPSA and compare it to other dynamics retrieval methods, we employ the synthetic model,
| (15) |
shown in Fig. 3(a), with corresponding to the middle of the time interval considered, , , , , and noncollinear vectors , with components,
| (16) |
for and . Fundamental molecular physics dictates that, particularly on ultrafast time scales, structural dynamics are dominated by vibrations and, occasionally, quasi-irreversible transitions between local minima of the respective potential energy surfaces. Our model [Eq. (15)] is designed to reflect both of these effects. To mimic the extent of incompleteness affecting TR-SFX datasets, we set 98.2% of the values, chosen at random, to zero (matching the incompleteness of the dataset in Ref. 16). In addition, to model data partiality, we multiply each data point by a random number drawn from a uniform distribution between zero and one [Fig. 3(b)]. Because the signal is generated by a linear combination of six linearly independent vectors, the dimension of the linear subspace of explored by the underlying dynamics is six. However, the dimension of the subspace of explored by the time-lagged embedded data manifold is not constrained to six, which rather represents a lower limit.
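The degradation applied to the synthetic signal can be sketched as follows. The sketch covers only the sparsification and partiality steps, not the model of Eq. (15) itself; the function name `degrade`, the seed, and the returned mask are illustrative choices.

```python
import numpy as np

def degrade(X, missing_fraction=0.982, seed=0):
    """Apply random incompleteness and partiality to a clean signal.

    X : array (m, S) of underlying intensities
    Returns the sparse, partial data and the boolean mask of measured entries.
    """
    rng = np.random.default_rng(seed)
    # An entry is kept with probability 1 - missing_fraction.
    measured = rng.random(X.shape) >= missing_fraction
    # Partiality: independent uniform weights between zero and one.
    partiality = rng.uniform(0.0, 1.0, size=X.shape)
    Y = np.where(measured, partiality * X, 0.0)
    return Y, measured
```

Because the mask is drawn entry by entry, the realized fraction of missing values fluctuates slightly around the requested one.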
FIG. 3.
(a) Underlying dynamics [Eq. (15)], with the th component of data vector . (b) Incomplete and partial input data. Missing entries are assigned to zero values generating a sparse input data matrix.
We process the sparse and partial dataset by LPSA and measure the quality of the retrieved signal as a function of the number of modes employed in the reconstruction [Eq. (14)]. We examine the evolution of the results as we vary the two parameters involved: the concatenation parameter and the cutoff frequency . The quality of the reconstructed signal is quantified by calculating the linear correlation coefficient to the underlying dynamics from Eq. (15). Since this metric is generally unavailable, we analyze the corresponding evolution of two indicators that can be used to guide the choice of the number of modes to be employed in the reconstruction. The singular value spectrum of the matrix shows the relative weight of the terms of the modal decomposition in Eq. (13). We expect noise terms to have a relatively low weight. In addition, we compute a measure of the deviation from local linearity of the reconstructed signal, which we call (see Sec. II). We expect the retrieved dynamics to deviate significantly from local linearity when the number of modes employed exceeds the optimal one, and noise from the input data is reintroduced in the reconstructed signal.
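As a simple illustration of the quality metric, the linear correlation coefficient can be computed as below. Taking the Pearson coefficient over all matrix elements at once is an assumption made here; the function name is hypothetical.

```python
import numpy as np

def correlation_to_reference(x_rec, x_ref):
    """Pearson correlation between a reconstruction and a reference signal.

    Both inputs are arrays of identical shape; they are flattened so that
    every component at every time point contributes one sample.
    """
    a = np.ravel(np.asarray(x_rec, dtype=float))
    b = np.ravel(np.asarray(x_ref, dtype=float))
    return float(np.corrcoef(a, b)[0, 1])
```

Note that this coefficient is insensitive to a global rescaling or offset of the reconstruction, so it measures the agreement in shape rather than in absolute scale.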
Figure 4 shows the evolution of the quantities mentioned above with varying , for various values of and fixed . We observe that the best linear correlation of the reconstructed signal to the ground truth is obtained with six modes and = 1, matching our expectation that the dimension of the subspace explored by the data manifold is six [Fig. 4(a)]. We also notice that for = 1 the correlation coefficient does not converge to its maximal value with increasing number of modes, but rather deteriorates progressively when increases beyond six. The choice of the number of modes is then critical to the quality of the results. Such a choice can be guided by identifying the end of the plateau section in [Fig. 4(b)]. This shows that considerable noise is added to the reconstructed signal as is increased beyond six modes. A concomitant sharp decline in the singular value spectrum is observed [Fig. 4(c)]. With large values of the concatenation parameter, , a robust convergence of the correlation coefficient with increasing is observed, at the cost of higher computational effort. The minimal number of modes required to obtain the maximal correlation is ten, in accordance with our expectation that in time-lagged embedding space, the dimension of the explored subspace can be larger than six.
FIG. 4.
LPSA of the sparse and partial dataset shown in Fig. 3(b), for various values of , and with (a) Linear correlation coefficient between the reconstructed signal and the underlying dynamics. (b) Measure of deviation from local linearity. (c) Singular value spectrum.
We now analyze the evolution of the results with varying , for = 1 (Fig. 5) and = 4000 (Fig. 6). The results converge toward the optimal reconstruction with increasing , with = 1 and ; and with = 4000 and . We observe a degradation of the reconstruction quality as the number of modes exceeds the optimal one for = 1, but a robust convergence with increasing number of modes for = 4000. The measure of deviation from local linearity [Figs. 5(b) and 6(b)] and the singular value spectrum [Figs. 5(c) and 6(c)] can guide the choice of . Relatively high noise levels in the reconstructed signal and relatively low singular values are observed when the optimal number of modes is exceeded. In particular, with = 4000 and sufficiently high , the local linearity measure reveals a sharp increase in reconstructed noise beyond 10 modes, in agreement with the abrupt decline of the singular values.
FIG. 5.
LPSA of the sparse and partial dataset shown in Fig. 3(b), for various values of and with = 1. (a) Linear correlation coefficient between the reconstructed signal and the underlying dynamics. (b) Measure of deviation from local linearity. (c) Singular value spectrum.
FIG. 6.
LPSA of the sparse and partial dataset shown in Fig. 3(b), for various values of and with = 4000. (a) Linear correlation coefficient between the reconstructed signal and the underlying dynamics. (b) Measure of deviation from local linearity. (c) Singular value spectrum.
V. COMPARISON WITH OTHER DYNAMICS RETRIEVAL METHODS
A. Singular spectrum analysis
Time-lagged embedding followed by modal decomposition by SVD is known as singular spectrum analysis (SSA).26 With concatenation parameter = 1, SSA is equivalent to SVD.
B. Nonlinear Laplacian spectral analysis
Nonlinear Laplacian spectral analysis (NLSA) was introduced in Ref. 27 and used in the context of time-resolved experiments in Refs. 22 and 28. Similar to LPSA, the overarching framework is that of a subspace projection preceded by time-lagged embedding and followed by modal decomposition. The difference between the two methods resides in the choice of the subspace basis set. In NLSA, a data-driven basis set is employed, specifically a set of functions derived from the diffusion map algorithm.29 In this work, we propose a modified version of the diffusion map, which yields a set of orthonormal vectors that can be used directly as a subspace basis set. We also consider two different formulations of NLSA. In the standard version (E-NLSA), each data point's Euclidean nearest neighbors are considered. With the purpose of using all available information, we propose a procedure whereby time nearest neighbors are retained instead (T-NLSA). Here, timing information is used to guide the nearest-neighbor selection. As described in Sec. III, missing observations are set equal to zero.
1. Distance calculation
Euclidean distances between highly sparse time-lagged embedded vectors are calculated by retaining only common terms (i.e., the set of reflections that are present in both time-lagged embedded vectors), and are normalized by the number of retained components. This approach appears to better represent underlying distances (those pertaining to the underlying dynamics), compared to distances calculated by including all components.
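A minimal sketch of such a distance calculation is shown below. Whether the normalization is applied to the squared distance or to the distance itself is not specified above; the sketch assumes the former (a mean squared difference over common terms), and the function name and the explicit boolean masks are illustrative.

```python
import numpy as np

def common_term_sq_distance(a, b, measured_a, measured_b):
    """Squared Euclidean distance over commonly measured components.

    a, b                   : 1D arrays (sparse vectors, zeros where unmeasured)
    measured_a, measured_b : boolean masks of measured components
    Returns the squared distance divided by the number of common terms,
    or NaN when the two vectors share no measured component.
    """
    common = measured_a & measured_b
    n_common = int(common.sum())
    if n_common == 0:
        return float("nan")
    diff = a[common] - b[common]
    return float(diff @ diff) / n_common
```

Keeping explicit masks, rather than testing for zero values, avoids discarding reflections that were measured with a genuinely vanishing intensity.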
2. Diffusion map
We use the diffusion map kernel,
| (17) |
as a measure of similarity between time-lagged embedded points. In this expression, are Euclidean distances in and refers to the extent of the local neighborhood. An estimate of this parameter is obtained as described in Ref. 30. For each time-lagged embedded point, nearest neighbors are considered. In standard NLSA, Euclidean nearest neighbors are retained (E-NLSA). Alternatively, we consider nearest neighbors in time (T-NLSA). After symmetrization, the diffusion kernel is normalized to consider local densities:31
| (18) |
With and , we define the diagonal and positive-definite matrix . We solve the eigenvalue problem,
| (19) |
for the real and symmetric matrix,
| (20) |
The orthonormal eigenvectors are closely related to the (in general, nonorthogonal) eigenvectors of the probability matrix,
| (21) |
obtained by row-normalization of .
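The relation between the symmetric eigenproblem and the probability matrix can be illustrated with a standard diffusion-map construction. The sketch below is not the exact formulation of Eqs. (17)–(21): the Gaussian kernel form, the density normalization by the outer product of the row sums, and the dense (non-truncated) neighborhood are assumptions, and all names are hypothetical.

```python
import numpy as np

def diffusion_basis(sq_dist, sigma, n_vectors):
    """Orthonormal data-driven basis from a density-normalized kernel.

    sq_dist   : array (s, s) of squared distances between embedded points
    sigma     : length scale of the local neighborhood (assumed kernel form)
    n_vectors : number of eigenvectors to retain
    Returns eigenvalues, orthonormal eigenvectors (s, n_vectors),
    the row sums d, and the normalized kernel W_tilde.
    """
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    W = 0.5 * (W + W.T)                      # symmetrization
    q = W.sum(axis=1)
    W_tilde = W / np.outer(q, q)             # local-density normalization
    d = W_tilde.sum(axis=1)                  # diagonal of D (positive)
    S = W_tilde / np.sqrt(np.outer(d, d))    # D^{-1/2} W_tilde D^{-1/2}
    evals, evecs = np.linalg.eigh(S)         # real symmetric eigenproblem
    order = np.argsort(-np.abs(evals))[:n_vectors]
    return evals[order], evecs[:, order], d, W_tilde
```

In this construction, if `phi` is an eigenvector of the symmetric matrix, then `phi / sqrt(d)` is an eigenvector of the row-normalized probability matrix with the same eigenvalue, which is the sense in which the two sets of eigenvectors are related. For large datasets a sparse nearest-neighbor kernel and an iterative eigensolver would be needed instead of the dense arrays used here.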
3. Modal decomposition and reconstruction
We use a subset of the orthonormal eigenvectors to construct a data-driven projector to a subspace of . A number of eigenvectors, related to the eigenvalues with largest absolute value, are retained and used as a basis set for the subspace projection, analogous to Eq. (11). Modal decomposition and signal reconstruction are carried out as described in Sec. III.
C. Time binning
To compare the dynamics-retrieval results to those from time binning and merging, we compute the running average of for various values of the time bin size.
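A running average over time-ordered frames can be sketched as follows. The sketch assumes that, within each window, only measured entries contribute to the average of a given reflection, which mimics merging of equivalent observations; if unmeasured zeros were instead included, the result would differ by a window-dependent factor. The function name and the mask argument are illustrative.

```python
import numpy as np

def running_average(X, measured, bin_size):
    """Sliding-window merge of sparse, time-ordered data.

    X        : array (m, S), zeros where unmeasured
    measured : boolean array (m, S), True where an entry was observed
    bin_size : number of consecutive frames per window
    Returns an array (m, S - bin_size + 1); NaN where a window holds
    no observation of the given reflection.
    """
    m, S = X.shape
    values = np.where(measured, X, 0.0)
    zero = np.zeros((m, 1))
    csum = np.concatenate([zero, np.cumsum(values, axis=1)], axis=1)
    ccnt = np.concatenate([zero, np.cumsum(measured, axis=1)], axis=1)
    # Differences of cumulative sums give per-window sums and counts.
    win_sum = csum[:, bin_size:] - csum[:, :-bin_size]
    win_cnt = ccnt[:, bin_size:] - ccnt[:, :-bin_size]
    out = np.full(win_sum.shape, np.nan)
    np.divide(win_sum, win_cnt, out=out, where=win_cnt > 0)
    return out
```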
VI. DISCUSSION
We compare the performance of LPSA to that of the methods presented in Sec. V on the task of reconstructing the synthetic signal presented in Sec. IV [Eq. (15)], from input data with extreme incompleteness and partiality. Figure 7 shows the evolution of the linear correlation between the recovered signal and the underlying dynamics, as a function of the number of modes employed, for the various techniques considered.
FIG. 7.
Comparison between SVD, SSA ( ), LPSA ( ), E-NLSA ( ). Linear correlation between the reconstructed dynamics from sparse and partial input data and the underlying dynamics.
With an increasing number of modes, the reconstructed signal from pure SVD reproduces more and more closely the sparse and partial input data. A maximum in the correlation between the reconstructed signal and the underlying dynamics is observed with four modes. As the number of modes exceeds four, the correlation deteriorates. The maximal correlation achievable by SVD lies well below that from concatenation-based or projection-based methods. By including a time-lagged embedding step (SSA), the reconstruction achieves optimal quality with ten modes, and shows a robust convergence with increasing number of modes. The drawback resides in the large size of the matrix to be singular-value decomposed.
NLSA produces excellent reconstruction results, similar to SSA. The subspace projection reduces the size of the matrix to be singular-value decomposed ( , with ). However, the computation of data-driven subspace basis vectors is expensive, as it involves the calculation of (in E-NLSA) Euclidean distances between tuples in , and the eigendecomposition of the large (but sparse) matrix . A specific drawback of NLSA is that the parameter space to be considered is four dimensional .
We compare the results from E-NLSA and T-NLSA, for = 1500 and = 3000. Figure 8 shows that the best reconstruction is obtained with T-NLSA and = 1500. This is due to the fact that T-NLSA effectively uses the time order of the input data as prior knowledge to guide the choice of the nearest neighbors. In addition, compared to E-NLSA, T-NLSA presents the advantage that only , rather than , Euclidean distances must be computed. The use of a data-driven subspace basis may be important when dealing with chaotic dynamical systems, whereby the underlying dynamics explores a truly high-dimensional subspace of . NLSA appears in this case to effectively provide a low-rank representation of the dynamics, where SSA fails to do so.27 Data-driven basis functions were found to approximate a set of periodic functions at large values of the concatenation parameter.32
FIG. 8.
Comparison between E-NLSA and T-NLSA. Linear correlation between the reconstructed signal and the underlying dynamics as a function of the number of modes employed. The NLSA was computed with concatenation parameter , subspace dimension , neighborhood size , nearest-neighbor number (a) and (b) .
The dynamics retrieval from LPSA is excellent, as shown in Fig. 7. Similar to NLSA, the subspace projection reduces the size of the matrix to be singular-value decomposed ( , with ). In contrast to NLSA, the computation of the subspace basis set is inexpensive in LPSA. In addition, LPSA only involves two parameters , which represents a major practical advantage compared to NLSA. LPSA involves particularly cheap computations when no time-lagged embedding is performed ( ). While excellent results can be achieved in this case, it is important to consider that the number of employed modes plays a major role in determining the quality of the reconstruction, as convergence with respect to the number of modes cannot be assured. In this case, the singular value spectrum and the deviation from local linearity measured by the indicator can be used to guide the choice of the number of modes.
Typical values in TR-SFX applications are , and . In this context, SSA mandates the SVD of a $10^{13}$-element matrix. By contrast, projection-based methods involve the SVD of a much smaller matrix. With the number of basis vectors typically ranging between $10^{1}$ and $10^{2}$, is substantially smaller than in both NLSA and LPSA TR-SFX applications. However, the calculation of data-driven basis vectors for the NLSA subspace projection is computationally expensive. In particular, the calculation of distances between time-lagged embedded vectors is challenging. To give an example, the analysis of a $10^{5}$-frame dataset, with and , involves the computation of Euclidean distances between $10^{8}$-element tuples. In addition, to obtain data-driven subspace basis vectors, a (sparse) $10^{10}$-element matrix must be eigendecomposed. In this respect, LPSA presents the practical advantage that the subspace basis vectors can be readily computed as a set of orthonormalized trigonometric functions.
Finally, we compare our results to those from time binning and merging, which is to date the most commonly used technique to analyze TR-SFX data. Figure 9 shows the linear correlation between the binned and merged signal and the underlying dynamics, as a function of the size of the time bins. The maximal correlation achieved is 0.952, below the value of 0.987 obtained by LPSA. The difference in the quality of the reconstruction can be appreciated by comparing the binned and merged signal in Fig. 10(c) to the LPSA results in Fig. 10(b). It should be emphasized that, in the standard binning-and-merging approach, there is no intrinsic way to verify whether the width of each bin has been selected optimally. Here, we present the best-case scenario, in which we choose the optimal bin size by maximizing the correlation with the benchmark, which is generally not available. By contrast, LPSA parameters can be optimized based on the deviation from local linearity and the singular value spectrum, i.e., indicators that do not depend on a priori knowledge of the ground truth.
FIG. 9.
Linear correlation between the binned and merged signal obtained by computing the running average of , and the underlying dynamics, as a function of the bin size employed.
FIG. 10.
(a) Underlying dynamics [Eq. (15)]. (b) Reconstruction by LPSA with , and 10 modes. The correlation coefficient to the underlying signal is 0.9869. (c) Reconstruction by time-binning and merging with bins comprising 1201 frames. The correlation coefficient to the underlying signal is 0.9519.
VII. CONCLUSIONS
We have presented a new approach to retrieving dynamical information from highly incomplete and partial data, of the type obtained by TR-SFX experiments. This approach, which we call LPSA, allows an improved signal reconstruction, compared to time binning and merging, which is to date the most common procedure to gain dynamical insight from TR-SFX data. We have also shown that, while achieving the same reconstruction quality as other sophisticated dynamics retrieval techniques (SSA, NLSA), LPSA presents practical advantages, in particular concerning the computational cost of the algorithm, and the number of parameters to be optimized. While being developed in the context of TR-SFX data analysis, LPSA is a general tool for the analysis of time series affected by stochastic weighting and incompleteness, which could be employed in a diverse range of applications in science and engineering.
ACKNOWLEDGMENTS
G.F.X.S. acknowledges Swiss National Science Foundation Grant Nos. 173335, 192760, and 192780. A.H. and A.O. were supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0002164. R.S. acknowledges support from the Cluster of Excellence “CUI: Advanced Imaging of Matter” of the Deutsche Forschungsgemeinschaft (DFG)-EXC 2056-project ID 390715994.
Contributor Information
Cecilia M. Casadei, Email: cecilia.casadei@psi.ch.
Robin Santra, Email: robin.santra@cfel.de.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Cecilia Maria Casadei: Formal analysis (lead); Investigation (equal); Methodology (supporting); Software (lead); Visualization (lead); Writing – original draft (lead); Writing – review and editing (equal). Ahmad Hosseinizadeh: Methodology (supporting); Writing – review and editing (supporting). Gebhard F. X. Schertler: Conceptualization (equal); Funding acquisition (lead); Investigation (supporting); Project administration (lead); Resources (lead); Supervision (equal); Writing – review and editing (equal). Abbas Ourmazd: Conceptualization (supporting); Investigation (supporting); Methodology (supporting); Writing – review and editing (supporting). Robin Santra: Conceptualization (equal); Formal analysis (supporting); Investigation (equal); Methodology (lead); Supervision (equal); Writing – original draft (supporting); Writing – review and editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article.
References
- 1. Schlichting I., “Serial femtosecond crystallography: The first five years,” IUCrJ 2, 246–255 (2015). 10.1107/S205225251402702X
- 2. Fromme P., “XFELs open a new era in structural chemical biology,” Nat. Chem. Biol. 11, 895–899 (2015). 10.1038/nchembio.1968
- 3. Spence J. C. H., “XFELs for structure and dynamics in biology,” IUCrJ 4, 322–339 (2017). 10.1107/S2052252517005760
- 4. Chapman H. N., “X-ray free-electron lasers for the structure and dynamics of macromolecules,” Annu. Rev. Biochem. 88, 35–58 (2019). 10.1146/annurev-biochem-013118-110744
- 5. Ginn H. M., Brewster A. S., Hattne J., Evans G., Wagner A., Grimes J. M., Sauter N. K., Sutton G., and Stuart D. I., “A revised partiality model and post-refinement algorithm for X-ray free-electron laser data,” Acta Crystallogr., Sect. D 71, 1400–1410 (2015). 10.1107/S1399004715006902
- 6. White T. A., Mariani V., Brehm W., Yefanov O., Barty A., Beyerlein K. R., Chervinskii F., Galli L., Gati C., Nakane T., Tolstikova A., Yamashita K., Yoon C. H., Diederichs K., and Chapman H. N., “Recent developments in CrystFEL,” J. Appl. Crystallogr. 49, 680–689 (2016). 10.1107/S1600576716004751
- 7. Bionta M. R., Lemke H. T., Cryan J. P., Glownia J. M., Bostedt C., Cammarata M., Castagna J.-C., Ding Y., Fritz D. M., Fry A. R., Krzywinski J., Messerschmidt M., Schorb S., Swiggers M. L., and Coffee R. N., “Spectral encoding of x-ray/optical relative delay,” Opt. Express 19, 21855–21865 (2011). 10.1364/OE.19.021855
- 8. Glownia J. M., Gumerlock K., Lemke H. T., Sato T., Zhu D., and Chollet M., “Pump–probe experimental methodology at the Linac Coherent Light Source,” J. Synchrotron Radiat. 26, 685–691 (2019). 10.1107/S160057751900225X
- 9. Borek D., Minor W., and Otwinowski Z., “Measurement errors and their consequences in protein crystallography,” Acta Crystallogr., Sect. D 59, 2031–2038 (2003). 10.1107/S0907444903020924
- 10. Ewald P., “Zur Theorie der Interferenzen der Röntgenstrahlen in Kristallen,” Phys. Z. 14, 465 (1913).
- 11. Sherwood D. and Cooper J., Crystals, X-Rays, and Proteins: Comprehensive Protein Crystallography (Oxford University Press, 2015).
- 12. Tenboer J., Basu S., Zatsepin N., Pande K., Milathianaki D., Frank M., Hunter M., Boutet S., Williams G. J., Koglin J. E., Oberthuer D., Heymann M., Kupitz C., Conrad C., Coe J., Roy-Chowdhury S., Weierstall U., James D., Wang D., Grant T., Barty A., Yefanov O., Scales J., Gati C., Seuring C., Srajer V., Henning R., Schwander P., Fromme R., Ourmazd A., Moffat K., Thor J. J. V., Spence J. C. H., Fromme P., Chapman H. N., and Schmidt M., “Time-resolved serial crystallography captures high-resolution intermediates of photoactive yellow protein,” Science 346, 1242–1246 (2014). 10.1126/science.1259357
- 13. Barends T. R. M., Foucar L., Ardevol A., Nass K., Aquila A., Botha S., Doak R. B., Falahati K., Hartmann E., Hilpert M., Heinz M., Hoffmann M. C., Köfinger J., Koglin J. E., Kovacsova G., Liang M., Milathianaki D., Lemke H. T., Reinstein J., Roome C. M., Shoeman R. L., Williams G. J., Burghardt I., Hummer G., Boutet S., and Schlichting I., “Direct observation of ultrafast collective motions in CO myoglobin upon ligand dissociation,” Science 350, 445–450 (2015). 10.1126/science.aac5492
- 14. Nango E., Royant A., Kubo M., Nakane T., Wickstrand C., Kimura T., Tanaka T., Tono K., Song C., Tanaka R., Arima T., Yamashita A., Kobayashi J., Hosaka T., Mizohata E., Nogly P., Sugahara M., Nam D., Nomura T., Shimamura T., Im D., Fujiwara T., Yamanaka Y., Jeon B., Nishizawa T., Oda K., Fukuda M., Andersson R., Båth P., Dods R., Davidsson J., Matsuoka S., Kawatake S., Murata M., Nureki O., Owada S., Kameshima T., Hatsui T., Joti Y., Schertler G., Yabashi M., Bondar A.-N., Standfuss J., Neutze R., and Iwata S., “A three-dimensional movie of structural changes in bacteriorhodopsin,” Science 354, 1552–1557 (2016). 10.1126/science.aah3497
- 15. Pande K., Hutchison C. D. M., Groenhof G., Aquila A., Robinson J. S., Tenboer J., Basu S., Boutet S., DePonte D. P., Liang M., White T. A., Zatsepin N. A., Yefanov O., Morozov D., Oberthuer D., Gati C., Subramanian G., James D., Zhao Y., Koralek J., Brayshaw J., Kupitz C., Conrad C., Roy-Chowdhury S., Coe J. D., Metz M., Xavier P. L., Grant T. D., Koglin J. E., Ketawala G., Fromme R., Šrajer V., Henning R., Spence J. C. H., Ourmazd A., Schwander P., Weierstall U., Frank M., Fromme P., Barty A., Chapman H. N., Moffat K., van Thor J. J., and Schmidt M., “Femtosecond structural dynamics drives the trans/cis isomerization in photoactive yellow protein,” Science 352, 725–729 (2016). 10.1126/science.aad5081
- 16. Nogly P., Weinert T., James D., Carbajo S., Ozerov D., Furrer A., Gashi D., Borin V., Skopintsev P., Jaeger K., Nass K., Båth P., Bosman R., Koglin J., Seaberg M., Lane T., Kekilli D., Brünle S., Tanaka T., Wu W., Milne C., White T., Barty A., Weierstall U., Panneels V., Nango E., Iwata S., Hunter M., Schapiro I., Schertler G., Neutze R., and Standfuss J., “Retinal isomerization in bacteriorhodopsin captured by a femtosecond x-ray laser,” Science 361, eaat0094 (2018). 10.1126/science.aat0094
- 17. Coquelle N., Sliwa M., Woodhouse J., Schirò G., Adam V., Aquila A., Barends T. R. M., Boutet S., Byrdin M., Carbajo S., De la Mora E., Doak R. B., Feliks M., Fieschi F., Foucar L., Guillon V., Hilpert M., Hunter M. S., Jakobs S., Koglin J. E., Kovacsova G., Lane T. J., Lévy B., Liang M., Nass K., Ridard J., Robinson J. S., Roome C. M., Ruckebusch C., Seaberg M., Thepaut M., Cammarata M., Demachy I., Field M., Shoeman R. L., Bourgeois D., Colletier J.-P., Schlichting I., and Weik M., “Chromophore twisting in the excited state of a photoswitchable fluorescent protein captured by time-resolved serial femtosecond crystallography,” Nat. Chem. 10, 31–37 (2018). 10.1038/nchem.2853
- 18. Skopintsev P., Ehrenberg D., Weinert T., James D., Kar R. K., Johnson P. J. M., Ozerov D., Furrer A., Martiel I., Dworkowski F., Nass K., Knopp G., Cirelli C., Arrell C., Gashi D., Mous S., Wranik M., Gruhl T., Kekilli D., Brünle S., Deupi X., Schertler G. F. X., Benoit R. M., Panneels V., Nogly P., Schapiro I., Milne C., Heberle J., and Standfuss J., “Femtosecond-to-millisecond structural changes in a light-driven sodium pump,” Nature 583, 314–318 (2020). 10.1038/s41586-020-2307-8
- 19. Willmore T., An Introduction to Differential Geometry, Dover Books on Mathematics Series (Dover Publications, 2012).
- 20. Aubry N., Guyonnet R., and Lima R., “Spatiotemporal analysis of complex signals: Theory and applications,” J. Stat. Phys. 64, 683–739 (1991). 10.1007/BF01048312
- 21. Golub G. H. and Van Loan C. F., Matrix Computations, 3rd ed. (The Johns Hopkins University Press, 1996).
- 22. Hosseinizadeh A., Breckwoldt N., Fung R., Sepehr R., Schmidt M., Schwander P., Santra R., and Ourmazd A., “Few-fs resolution of a photoactive protein traversing a conical intersection,” Nature 599, 697–701 (2021). 10.1038/s41586-021-04050-9
- 23. Packard N. H., Crutchfield J. P., Farmer J. D., and Shaw R. S., “Geometry from a time series,” Phys. Rev. Lett. 45, 712–716 (1980). 10.1103/PhysRevLett.45.712
- 24. Takens F., “Detecting strange attractors in turbulence,” in Dynamical Systems and Turbulence, Warwick 1980, edited by Rand D. and Young L.-S. (Springer, Berlin, Heidelberg, 1981), pp. 366–381.
- 25. Sauer T., Yorke J. A., and Casdagli M., “Embedology,” J. Stat. Phys. 65, 579–616 (1991). 10.1007/BF01053745
- 26. Vautard R. and Ghil M., “Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series,” Physica D 35, 395–424 (1989). 10.1016/0167-2789(89)90077-8
- 27. Giannakis D. and Majda A. J., “Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability,” Proc. Natl. Acad. Sci. 109, 2222–2227 (2012). 10.1073/pnas.1118984109
- 28. Fung R., Hanna A. M., Vendrell O., Ramakrishna S., Seideman T., Santra R., and Ourmazd A., “Dynamics from noisy data with extreme timing uncertainty,” Nature 532, 471–475 (2016). 10.1038/nature17627
- 29. Coifman R. R., Lafon S., Lee A. B., Maggioni M., Nadler B., Warner F., and Zucker S. W., “Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps,” Proc. Natl. Acad. Sci. 102, 7426–7431 (2005). 10.1073/pnas.0500334102
- 30. Coifman R. R., Shkolnisky Y., Sigworth F. J., and Singer A., “Graph Laplacian tomography from unknown random projections,” IEEE Trans. Image Process. 17, 1891–1899 (2008). 10.1109/TIP.2008.2002305
- 31. Coifman R. R. and Lafon S., “Diffusion maps,” Appl. Comput. Harmonic Anal. 21, 5–30 (2006). 10.1016/j.acha.2006.04.006
- 32. Giannakis D., “Delay-coordinate maps, coherence, and approximate spectra of evolution operators,” Res. Math. Sci. 8, 8 (2021). 10.1007/s40687-020-00239-y