Author manuscript; available in PMC: 2022 Nov 18.
Published in final edited form as: J Phys Chem B. 2021 Nov 8;125(45):12401–12412. doi: 10.1021/acs.jpcb.1c05820

Resolving Dynamics in the Ensemble: Finding Paths through Intermediate States and Disordered Protein Structures

Adam K Nijhawan 1,§, Arnold M Chan 2,§, Darren J Hsu 3, Lin X Chen 4, Kevin L Kohlstedt 5
PMCID: PMC9096987  NIHMSID: NIHMS1803768  PMID: 34748336

Abstract

Proteins have been found to inhabit a diverse set of three-dimensional structures. The dynamics that govern protein interconversion between structures happen over a wide range of time scales—picoseconds to seconds. Our understanding of protein functions and dynamics is largely reliant upon our ability to elucidate physically populated structures. From an experimental structural characterization perspective, we are often limited to measuring the ensemble-averaged structure both in the steady-state and time-resolved regimes. Generating kinetic models and understanding protein structure–function relationships require atomistic knowledge of the populated states in the ensemble. In this Perspective, we present ensemble refinement methodologies that integrate time-resolved experimental signals with molecular dynamics models. We first discuss integration of experimental structural restraints to molecular models in disordered protein systems that adhere to the principle of maximum entropy for creating a complete set of ensemble structures. We then propose strategies to find kinetic pathways between the refined structures, using time-resolved inputs to guide molecular dynamics trajectories and the use of inference to generate tailored stimuli to prepare a desired ensemble of protein states.

1. INTRODUCTION

The set of accessible protein structures that perform physiological functions is now known to be much more heterogeneous than originally thought due to intrinsic disorder within the proteome.1–4 Resolving ensembles of heterogeneous structures has been a major challenge in the protein community, despite the many achievements in structural characterization at thermodynamic equilibrium. However, in vivo, proteins are subjected to various environmental stimuli that drive structures away from equilibrium, such as changes in pH, temperature, or ion concentrations that can perturb or tune desired cellular functions.5,6 Understanding protein structural dynamics in response to these environmental signals is paramount to elucidating functional mechanisms. Beyond the knowledge gap in our understanding of non-native state conformations or intrinsically disordered proteins (IDPs), the connections between non-native structures and their kinetic pathways are essential for improving the efficacy of drug delivery,7 preventing diseases,8 and guiding discovery of new treatments.9 To gain insight into these structures and the kinetic pathways connecting them, molecular models and simulations are useful to bridge gaps in experimental structural characterizations.

The refinement of experimentally populated protein structures has been a major triumph of computer-aided molecular models over the past few decades.7,10 Key to the molecular model’s ability to predict three-dimensional protein structures has been the inclusion of diverse experimental structural characterizations to guide and refine a model’s predicted structure toward experimentally relevant equilibrium conformations.11–13 These inputs include force field parameters,14 structural restraints,15 metainference refinements,16 and data-driven structural refinement models.17 The use of these models has been especially successful in structural characterizations of well-folded proteins at both the atomic level and coarse-grained level.18

The next major milestone in characterizing protein structure is the prediction of non-native intermediate structures and/or competing populated states and their kinetic relationships to folded structures.19,20 Understanding the conformational transition kinetics will help elucidate the structure–function relationships between proteins that mediate important roles like cellular signaling.21–23 To fully understand the structure–function relationship, it is important to understand the kinetic pathway between relevant conformations with atomistic resolution. IDPs or intrinsically disordered regions (IDRs) of proteins are particularly appealing, as they exist in multiple conformations, serving as an ideal testing ground to accurately model the kinetic pathway and energetics for protein dynamics.24 Moreover, intermediates along folding pathways of small proteins and oligomerization are only transiently populated, often for less than milliseconds.25,26

In this article, we focus on the challenges and progress made in characterizing and understanding IDP structures, intermediate states of small proteins, and intermediate states of oligomeric proteins during association and dissociation. We emphasize the ensemble refinement problem and how weighting the predicted transient structures is a barrier to the prediction of the kinetic pathways between them. There has been a significant effort to utilize various experimental and theoretical techniques to understand the function and dynamics of proteins from each of the aforementioned categories.27,28 The generalization of the maximum entropy principle to nonequilibrium pathways is a notable example, but its use over a broad set of proteins has yet to be examined.14 Studying protein structural dynamics in response to environmental stimuli requires time-resolved experimental techniques capable of spanning a variety of time scales.

Protein structural dynamics occur over multiple time scales, spanning from femtoseconds to seconds or hours, and on multiple spatial scales from sub-angstroms to tens of nanometers. As such, techniques with a broad dynamic range of time scales and complementary spatial resolutions are required to understand complete mechanisms and functions. There are a number of experimental probes that allow for time-resolved observation of structural changes in biological systems on the relevant biological time scales. Infrared spectroscopy (IR and two-dimensional IR (2D-IR)), nuclear magnetic resonance (NMR), fluorescence resonance energy transfer (FRET), X-ray transient absorption (XTA) spectroscopy, and time-resolved X-ray solution scattering (TRXSS) are among the experimental probes that have been successful in tracking changes in protein structure to elucidate kinetic information.29–33 However, extracting holistic 3-D information along a nonequilibrium transient conformational pathway while distinguishing the corresponding populated ensemble from experimental techniques alone is challenging. There exists no single methodology capable of obtaining an accurate atomic-level description across the spectrum of temporal changes in structure. Additionally, some experimental techniques involve isotopic labeling, large sample consumption, or ensemble averages for changes in structure, which can be an impediment for broad applications. Molecular dynamics (MD) simulations offer an atomic-level description but are unsuitable for slower dynamical processes as a result of the computational cost and complexity (Figure 1).21–23,26,32,34–54 Structurally heterogeneous proteins like IDPs and proteins with IDRs are also highly dynamic and intrinsically sample a multitude of diverse structures, thus providing a robust data set for testing structural refinement methodologies and kinetic pathway sampling using time-resolved techniques with molecular models.

Figure 1.

Time scales for protein dynamics grouped by technique.21–23,26,32,34–54 Relevant protein time scales are plotted vs the number of residues for the system of interest for studies done using MD (purple), TRXSS (green), IR (blue), and Trp fluorescence (orange). MD studies are primarily located around smaller proteins and shorter time scales. The shaded gray region between 10 and 120 residues indicates the region primarily consisting of IDPs.

Recently, we have incorporated experimental inputs to drive MD simulations.4,55 Specifically, we have developed an X-ray solution scattering restraint simulation package that uses time-dependent small-angle X-ray scattering (SAXS) signatures to restrain the MD trajectories to result in structures consistent with the experimental solution scattering signal (Figure 2). The challenge now becomes using experimentally driven MD to extract accurate information about the kinetics of intermediary or non-native states to form a complete understanding of the functional dynamics.

Figure 2.

A flowchart for the SAXS-guided MD following an environmental perturbation. TRXSS difference curves can be used as an experimental input to biased-MD simulations toward sampling experimentally relevant conformations. The scattering profile of the MD structure is calculated on-the-fly, and a harmonic constraint is used to drive the structure toward the experimental input.

This article is organized into three main sections. The first discusses experimental probes as a means to unlock kinetic information. We show how XTA, IR, NMR, and TRXSS have been successful in elucidating transient intermediate states for model systems. Additionally, we contextualize the pros and cons of various experimental techniques and the information that can be extracted to guide MD. Next, we focus on how one can generate physically populated structures along kinetic pathways from guided MD. We will discuss the challenges in determining non-native disordered structures for IDPs, oligomerization, and small proteins. We will also discuss important factors and approaches for elucidating reaction rates/pathways between disordered structures. Finally, we will discuss a new way of framing excited state probes through tailored stimuli for time-resolved measurements. This will incorporate machine learning to gain insight into stimuli parameters such as the effects of T-jumps, pH, and mixed stimuli.

2. TIME-RESOLVED CHARACTERIZATIONS FOR MD STRUCTURAL REFINEMENT

To capture biomolecular structural dynamics, one must choose from a variety of biophysical methods. For equilibrium studies, to analyze the dynamics of different conformations, nuclear magnetic resonance (NMR) spectroscopy has been a powerful tool on various protein length scales. Particularly, key advancements in the amyloid binding mechanisms were revealed with NMR spectroscopy, while modern advancements have combined the technique with X-ray diffraction and electron microscopy to characterize both nuanced atomic-level oligomerization and fibril morphology.56,57 However, to capture kinetically trapped intermediates in environmentally perturbed nonequilibrium systems, an experimental design that will synchronously initiate a reaction and track its time evolution is required. Rapid mixing technology has been demonstrated for many biomolecular dynamic studies and is particularly useful in ligand binding systems.58 However, mixing times as short as hundreds of microseconds can still limit the observation of fundamental IDP transitions and fast-folding peptide behavior. An alternative approach to access higher time resolution is through light activation: using a laser pulse to trigger the structural dynamics directly or indirectly. This methodology stems from the field of ultrafast photochemical dynamics, where an excitation laser pulse (pump) photoexcites a molecule and subsequent measurement pulses (probes) are time-delayed relative to the initiation for stroboscopic observation.

2.1. Photodissociation-Induced Metalloprotein Structural Reorganization.

An obvious application of pump–probe approaches is toward photoactive proteins. For example, the structural dynamics of heme proteins (myoglobin and cytochrome c oxidase) have been widely studied with laser-triggered photolysis of a ligand to observe subsequent heme-doming and transient intermediates, critical in active site activation and oxidase reaction mechanisms.52,59–62 Cytochrome c (cyt c), a 104-residue electron transport chain protein, is a long-established model system for heme protein biophysics. In parallel time-resolved spectroscopy and scattering experiments, the folding mechanisms, including transient intermediates, of cyt c were observed upon the photodissociation of a CO-bound ground state. In less than 1 ps, the optical photoexcitation of the heme changes the Fe oxidation state and triggers cyt c folding toward its native state (Figure 3a). Leveraging the technology of modern synchrotron and free electron laser light sources, many have used X-ray spectroscopy and scattering methods to characterize biomolecular structure. The near-edge X-ray absorption spectra (XANES) can track the changes in oxidation state of the iron active site, which differentiates between neighboring ligands along the folding pathway. Additionally, the extended X-ray absorption fine structure (EXAFS) spectra provided the specific iron-to-ligand bond distances, revealing an early-time water-bound first intermediate conformation. The Fe electronic structure, heme structure, and secondary protein structure were all cross-correlated and interpreted together to establish a map of the folding mechanism from several nanoseconds to 100 ms. Comparing the signal time evolution in XTA and TRXSS gives a stitched temporal picture of cyt c conformational shifts, as shown in Figure 3b. Singular value decomposition (SVD) is used to determine the number of linearly independent scattering signatures in the time series data. Using the chosen SVD components as parameters of a kinetic model, one can derive species-associated difference (SAD) patterns by global fitting, where each SAD curve corresponds to a defined state in the kinetic model. In Figure 3c, the SAD scattering signatures and their corresponding time evolution show that photoexcited cyt c clearly underwent a folding pathway traversing two different transient intermediates. The initial photolyzed state was found to return to the folded state (FM) via the Met80-bound unfolding intermediate (UM) or the slower His26/33-bound unfolding intermediate (UH).
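The SVD step described above can be illustrated with a minimal sketch. The synthetic data, the two species shapes, the 100 microsecond lifetime, and the 5% singular-value cutoff are all assumptions made for illustration; they are not taken from the cyt c measurements, and in practice the cutoff is usually chosen by inspecting the singular values and the autocorrelation of the singular vectors.

```python
import numpy as np

def count_significant_components(data, rel_threshold=0.05):
    """Return (n_components, singular_values) for a (n_q, n_t) data matrix.

    A component counts as significant when its singular value exceeds
    rel_threshold times the largest one (an illustrative cutoff).
    """
    singular_values = np.linalg.svd(data, compute_uv=False)
    n_components = int(np.sum(singular_values > rel_threshold * singular_values[0]))
    return n_components, singular_values

# Synthetic stand-in for time-resolved difference curves (not real data).
rng = np.random.default_rng(0)
q = np.linspace(0.02, 1.0, 200)        # scattering vector magnitudes
t = np.logspace(-6, -1, 40)            # time delays in seconds
shape_a = np.exp(-(q / 0.1) ** 2)      # hypothetical signature of species A
shape_b = np.sin(8 * q) * np.exp(-q)   # hypothetical signature of species B
pop_a = np.exp(-t / 1e-4)              # A decays with an assumed 100 us lifetime
pop_b = 1.0 - pop_a                    # B grows in as A decays
data = np.outer(shape_a, pop_a) + np.outer(shape_b, pop_b)
data += 1e-3 * rng.standard_normal(data.shape)   # small measurement noise

n_components, singular_values = count_significant_components(data)
print(n_components, singular_values[:4])
```

The number of retained components sets how many species the subsequent global kinetic fit needs to include; the fit itself, which converts the retained singular vectors into SAD curves and populations, is not shown here.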

Figure 3.

Experimentally tracking photolyzed ligation of heme protein. (a) Scheme of the cytochrome c heme active site. XTA tracks the local heme structural dynamics, and TRXSS tracks the global conformational changes. (b) Difference signal kinetics from XTA (edge and post-edge) overlaid with those from TRXSS (SAXS and WAXS). The post-edge and WAXS signatures correspond to the same process, bridging the spatial and temporal extents of the experiment. (c) Species-associated difference (SAD) signals on the left and their corresponding relative measured populations on the right for the two intermediates, UM and UH, and the final folded state, FM. Reproduced from ref 60 with permission from the Royal Society of Chemistry.

Since the EXAFS structural characterization and wide-angle X-ray scattering (WAXS) were well correlated to describe the heme and nearby ligand dynamics, the two parallel observations were stitched together to distinguish more transient species than either experiment could individually. The X-ray scattering differences at longer time delays tracked the propagation of heme-doming motions into conformational changes in the outer protein, revealed in the species-associated difference signals in the SAXS regime.60 A similar experiment on cyt c was also conducted to monitor structural changes upon electron transport, initiated by the photoexcitation of NADH.63 Although this folding pathway did not reveal disordered intermediate states, the experimental data were modeled with an ensemble of MD simulated structures. However, without incorporating experimental data into the MD sampling, it is challenging to validate the predictive ability of the simulation. Metalloproteins undergo diverse folding pathways, which require careful experimental observation with tailored MD simulations to reveal atomistic structure.

2.2. Structural Probes Following a Perturbation: T-Jump and pH-Jump.

Although direct photoexcitation for redox or ligand dissociation triggered reactions is relevant for many photoactive enzymes, other indirect perturbation methods are emerging to expand the scope of protein folding/unfolding studies. Many proteins, photoactive or not, populate non-native states strongly correlated to the surrounding temperature. To leverage this general physical property, the temperature-jump (T-jump), usually induced by a laser pulse that excites the overtone stretch mode of water in aqueous solution, can trigger a temperature increase within nanoseconds that lasts up to milliseconds.4,22,64 This indirect light activated pump–probe approach has been widely used to observe disordered protein regions and oligomerization mechanisms.4,22

Utilizing computational refinements for time-resolved processes requires both a time-resolved experimental input and MD trajectories that efficiently sample structures away from the stationary state. In MD, the time-resolved structural changes are visible, but it is especially challenging to ensure that sampled configurations are consistent with the experimental ensemble after a perturbation such as a T-jump, pH-jump, or another stroboscopic triggering event. Techniques such as replica exchange molecular dynamics (REMD) and umbrella sampling/metadynamics are useful for increasing the distribution of structures sampled during a simulation, but there remains the question as to the physical significance and relevance of each structure observed.13 Bayesian weighting and collective variable analysis can be used to extract physically populated structures, but a direct link between experimentally observed changes and an atomic-level understanding of structural change is still lacking.15 Using a structural restraint, like TRXSS, along with a maximum entropy sampling technique, like REMD, offers a solution toward generating a diverse set of structures and corresponding kinetic transitions.65

2.2.1. Insulin Dimer Dissociation.

One widely studied system is the protein hormone insulin, since it exists in a variety of oligomeric states (dimers, tetramers, or hexamers), and its dissociation is a prerequisite for in vivo cellular signaling processes. Insulin dimer dissociation can be induced and observed in vitro with temperature-jump time-resolved 2D-IR spectroscopy, where key intermolecular residues were isotopically labeled such that their spatial and temporal evolution could be tracked by amide vibrational dipole–dipole interaction signatures in the IR region. Since the insulin dimer is formed by two monomers docked at their C-terminal β-sheets, this targeted amide I infrared spectroscopic approach described the local transient structures well.21 At this point, the dissociation of insulin could be broadly described by fast solvent heating and hydrogen bond weakening in the first few nanoseconds, followed by overlapping states corresponding to monomer conformational changes and β-sheet dissociation spanning hundreds of microseconds.21 In comparison, a T-jump TRXSS study was able to identify a clear intermediate corresponding to the reorganization of the monomer and construct a mechanism with more structural evidence than the initial 2D-IR study.22 One reason for the complexity of the insulin dimer dissociation mechanism is that insulin is a transiently disordered protein. Such mechanisms are not uniquely characterized with experimental methods alone; rather, they should be modeled with transient intermediates fed in as transitional benchmarks in the pathway. To adequately sample rare events, like dissociation, simulation studies of insulin have utilized biased-MD approaches. These results showed two limiting pathways within the ensemble of intermediates: one initial pathway dominant toward the α-helical dissociation and another dominant toward β-sheet dissociation.27

2.2.2. Intrinsically Disordered Proteins.

As previously demonstrated, global structural characterization gives direct information about transient states without the limitation of non-native labels. By incorporating SVD and global analysis into the kinetic modeling of measured data, the global structural dynamics of independent species can be obtained from TRXSS experiments. Although X-ray crystallographic methods give the highest spatial resolution, they can seldom be applied to IDPs or IDRs. Disordered proteins are either impossible to crystallize or undergo large conformational changes that ruin the crystalline lattice.29 Electron microscopy methods can spatially resolve neither smaller model secondary structures nor representative tertiary structure domains that change on sub-microsecond time scales.66 Furthermore, there are cases when the IDRs are unknown a priori and cannot be identified without systematic trial and error.67,68 Thus, the preferred approach to capture global structural dynamics of disordered biological systems in response to environmental perturbations is TRXSS.

The Bayesian incorporation of experimental data to guide MD simulations is a growing method of refining the conformational space of IDPs. Steady-state experimental characterization techniques (like NMR and SAXS) have been used to improve structural selection.17,19,67,68 However, limitations still exist in calculating the simulated experimental results and in selecting an accurate prior for a largely underdetermined problem.69 Kinetic information extracted from time-resolved experiments should further refine IDP structures because deterministic transition pathways will narrow the set of possible conformations. Recently, a combined approach of TRXSS and MD simulations of transient conformationally disordered states was applied to a model IDP system. Calcium-bound bovine α-lactalbumin (BLA) is known to populate molten globule states upon temperature perturbation. The transient conformational species-associated signatures were used to bias subsequent MD simulations to generate an atomic resolution picture of the intermediate and unfolded states. Specifically, a disordered region in the secondary structure of the molten globule state was identified as an intermediate state on the pathway toward further unfolded states.4 The combination of indirect pump–probe measurements and tailored MD simulations demonstrates a more deterministic way of illustrating temperature-dependent IDP behavior.

In addition to temperature perturbations, chemical stimuli can induce conformational changes in IDPs. Adaptive biomaterials are inspired by fundamental rearrangements of pH-sensitive peptides; one example is poly-L-glutamic acid (PGA), which collapses from a disordered polypeptide at high pH to a helical configuration at low pH. A pH-jump can be triggered upon photoexcitation of o-nitrobenzaldehyde (o-NBA), a photoacid that releases a proton in its excited state, lowering the pH of the solution. Analogous to the indirect temperature-jump, the pH change can be tuned by adjusting the “pump” laser power. Transient pH-jump IR spectroscopy studies have shown more transient conformational states of PGA with larger PGA systems.70 A pH-jump TRXSS experiment showed a transient charged intermediate in the longest (200-residue) PGA.51 Both studies point toward a nucleation-propagation folding mechanism, triggered by a sudden pH decrease.

2.3. Generating Structures along Kinetic Pathways from Guided MD.

Transient structural characterization techniques are useful for observing a change in the structural conformation of a protein or change in the ensemble conformational distribution, following an environmental perturbation. However, complications arise when assigning atomic-level structures to changes in the conformational ensemble. Furthermore, the kinetic pathway in between structures requires additional insight. Beyond accurate structural and energetic insight of the two states, a multitude of plausible transition structures are required along the different kinetic pathways. Finally, analysis of plausible pathways and chemical intuition are required to propose a feasible kinetic pathway. There are different approaches to the structural analysis of possible kinetic pathways, which we discuss in this section.

One major barrier with using MD simulations for generating ensembles of non-native structures necessary to find the kinetic pathways is the appropriate sampling of a representative set of structures that conforms to the maximum entropy principle.65 While MD simulations offer an atomic-level description of protein structural dynamics, the inability to widely sample the potential energy surface leads to missing some structures and/or over-representation of others. The maximum entropy principle, in the context of sampling protein structures, sets a criterion to prevent under-sampling of structural ensembles. The implementation of maximum entropy methods falls into two broad categories: experimental restraints to the MD Hamiltonian via back-calculated observables to match an experimental input or a posteriori reweighting of structures to match an experimental observable. There has been a lot of effort in both categories to adhere to the maximum entropy principle, and a majority of the progress has centered around using Bayesian inference to account for errors in the priors and experimental inputs when matching predicted MD structures to a set of experimental observables.71
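The second category, a posteriori reweighting, can be sketched for the simplest case of a single ensemble-averaged observable. Under the maximum entropy principle, the weights that match a target average while staying as close as possible to the prior take the form w_i ∝ w0_i exp(−λ f_i), with one Lagrange multiplier λ. The sketch below is an illustration under stated assumptions: the function name, the use of radius of gyration as the observable, the synthetic values, and the bisection solver are all hypothetical choices, and experimental error is ignored (Bayesian variants relax the exact constraint).

```python
import numpy as np

def maxent_reweight(obs, target, prior=None, lam_bounds=(-50.0, 50.0), n_iter=200):
    """Maximum entropy weights w_i ~ prior_i * exp(-lam * obs_i) whose
    weighted average of obs equals target (single observable, no error model)."""
    obs = np.asarray(obs, dtype=float)
    if prior is None:
        prior = np.full(obs.size, 1.0 / obs.size)
    prior = np.asarray(prior, dtype=float) / np.sum(prior)

    def weights(lam):
        log_w = np.log(prior) - lam * obs
        log_w -= log_w.max()               # avoid overflow in exp
        w = np.exp(log_w)
        return w / w.sum()

    # The weighted average decreases monotonically with lam, so bisection works.
    lo, hi = lam_bounds
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.dot(weights(mid), obs) > target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return weights(lam), lam

# Hypothetical per-frame radii of gyration (angstroms) from an unrestrained run.
rng = np.random.default_rng(0)
rg = rng.normal(15.0, 2.0, 5000)
w, lam = maxent_reweight(rg, target=16.0)  # assumed "experimental" average
print(np.dot(w, rg), lam)
```

The target must lie within the range of the sampled observable for a solution to exist; if the simulation never visits structures consistent with the experiment, reweighting cannot recover them, which is the motivation for the restraint-based category.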

First, experimental information is required to develop a preliminary understanding of the number of unique components involved and a kinetic model for the system. In TRXSS, SAD signals represent the scattering profile of a single chemical species from the observed ensemble. However, especially in the case of IDPs, it is possible that a single SAD pattern represents multiple configurations. Global kinetic modeling has also been applied to time-resolved vibrational spectroscopy, XTA, etc., for the same purpose, but these experimental techniques do not provide direct global structural insight.

Structural interpretations of SAD signatures, along the corresponding kinetic mechanism, can be achieved with a guided MD protocol. SAXS-guided MD can be used to generate structures consistent with experimental scattering profiles, while metadynamics and Markov state models can be used to analyze plausible kinetic pathways. SAXS-guided MD uses experimentally obtained SADs to bias the simulation toward sampling measured conformations. SAXS-guided MD requires a complete three-dimensional starting structure for the protein and the SAD curves for the system of interest. It is also essential to account for solvent shell interactions during the simulation, as they affect the calculated scattering profile. For a given guided MD simulation, a SAD curve serves as the target scattering profile for the simulation. A harmonic constraint is implemented such that the scattering profile of the starting structure is driven toward that of the SAD curve. The final MD structure will have a scattering profile consistent with the SAD. However, because the SAD is a one-dimensional curve, there is no guarantee that the resultant structure describes the observed signal, as a number of three-dimensional structures can produce the same scattering pattern.72 Furthermore, there is an additional concern where the SAD represents the average across a large number of diverse structures, none of whose individual scattering pattern matches the SAD. Strategies to increase the distribution of sampled structures during simulations have been developed; notably, those that use the maximum entropy principle have been successful at generating structures consistent with an experimental restraint or Bayesian reweighting scheme.28,55,71 At the same time, increasing the sampling distribution can also be accomplished using replicas of the system.
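The harmonic restraint on the scattering profile described above can be sketched as an energy penalty. This is a simplified illustration and not the authors' simulation package: it assumes unit form factors for every scatterer, evaluates the profile with the Debye formula, ignores the solvent shell that the text notes is essential, and restrains an absolute profile rather than a difference curve. The function names, the force constant k, and the bead coordinates are hypothetical.

```python
import numpy as np

def debye_profile(coords, q):
    """Scattering intensity I(q) from the Debye formula with unit form factors."""
    r = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this evaluates sin(q r) / (q r)
    return np.sinc(q[:, None, None] * r[None, :, :] / np.pi).sum(axis=(1, 2))

def harmonic_restraint_energy(coords, q, target_profile, k=1.0):
    """E = (k / 2) * sum over q of (I_calc(q) - I_target(q))^2."""
    deviation = debye_profile(coords, q) - target_profile
    return 0.5 * k * np.sum(deviation ** 2)

rng = np.random.default_rng(1)
q = np.linspace(0.0, 0.5, 50)                         # inverse angstroms
target_coords = rng.normal(scale=5.0, size=(20, 3))   # hypothetical target beads
start_coords = 1.5 * target_coords                    # expanded starting structure
target_profile = debye_profile(target_coords, q)

e_start = harmonic_restraint_energy(start_coords, q, target_profile)
e_target = harmonic_restraint_energy(target_coords, q, target_profile)
print(e_start, e_target)
```

In a guided simulation the gradient of this penalty with respect to the coordinates would be added to the force field forces at each step. The penalty vanishes for any structure that reproduces the target curve, not only the true one, which is the degeneracy the paragraph above warns about.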

REMD has successfully been implemented to increase the distribution of sampled structures. Sampling the complete configuration space is especially important for IDPs that can have a large number of potential energy minima, indicative of multiple competing populated states. For IDPs, increasing the number of replicas for the biased-MD simulations broadens the distribution of sampled structures, such that the ergodicity nearly resembles that of a free-equilibrium simulation after the appropriate number of replicas have been introduced, adhering to the principle of maximum entropy (Figure 4).65 An accurate structural ensemble is essential for characterizing non-native or competing populated states, which is especially important for using biased-MD to determine non-native, disordered structures following an environmental perturbation.

Figure 4.

(a, b) Overlay of trajectories sampled during (a) unbiased and (b) four-replica refinement MD simulations. (c, d) Distribution of radii of gyration sampled during MD simulations. The divergence in structures is quantified by Kullback–Leibler (KL) (red) and Jensen–Shannon (JSD) (orange) divergences. Adapted with permission from ref 65. Copyright 2019 American Chemical Society.
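The divergences named in the Figure 4 caption can be computed from radius-of-gyration histograms as in the sketch below. The two samples are synthetic stand-ins, not the published distributions, and the bin edges and the small regularization constant eps are assumptions made so that empty bins do not produce infinite values.

```python
import numpy as np

def _normalize(p, eps=1e-12):
    """Convert counts to probabilities and pad empty bins with eps."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum() + eps
    return p / p.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    p, q = _normalize(p), _normalize(q)
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    p, q = _normalize(p), _normalize(q)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical Rg samples (angstroms) from a reference and a restrained run.
rng = np.random.default_rng(3)
rg_reference = rng.normal(15.0, 1.5, 10000)
rg_restrained = rng.normal(15.5, 1.0, 10000)
bins = np.linspace(8.0, 22.0, 57)          # shared bin edges for both histograms
hist_ref, _ = np.histogram(rg_reference, bins=bins)
hist_res, _ = np.histogram(rg_restrained, bins=bins)

kl = kl_divergence(hist_res, hist_ref)
jsd = js_divergence(hist_res, hist_ref)
print(kl, jsd)
```

The KL divergence is asymmetric and grows without bound when one distribution misses regions the other populates, whereas the JSD stays between 0 and ln 2, which makes it easier to compare across different numbers of replicas.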

Biased-MD can be used to determine relevant structures along the kinetic pathway for small model proteins, oligomerization processes, and IDPs. SAXS-guided MD successfully modeled the three transient states for the unfolding process of the molten globule protein, bovine α-lactalbumin (BLA). By inputting the relevant SAD curves, determined through global analysis and SVD, ensemble-weighted structures were assigned to each state in the unfolding process. These structures were determined without the kinetic time scale information extracted from TRXSS. However, the results, in combination with the experimentally determined kinetic profile, offer a complete understanding of the BLA unfolding process.

For oligomerization processes, such as insulin association and dissociation dynamics, guided MD, in conjunction with metadynamics, has been successful for determining structures along the kinetic pathway (Figure 5). A free energy landscape for the reaction can be constructed using metadynamics by selecting the appropriate collective variables (CVs). Appropriate CVs should be able to distinguish between the stable intermediary states identified experimentally and should be among or representative of the slowest degrees of freedom for the transitions of interest. While the use of CVs for a variety of proteins has been useful for disentangling relaxation degrees of freedom in constructing free energy surfaces, it is not guaranteed that selected CVs will be descriptive nor that we will have a priori knowledge of them. Methods to generate CVs or relaxation pathways such as VAMPNets,73 machine-learned CVs,74,75 and TICA predicted Markov states76,77 have recently been shown to be useful for IDPs and oligomeric protein dissociation.78,20

Figure 5.

Insulin dimer MSM analysis. (a) Overview of native and twisted insulin dimer conformations. (b) Network plot color-coded by non-hydrogen RMSD with respect to the crystal structure. (c) Network plot for the identified states native (N), twisted (T), and intermediates (18, 99, 80, 45, 16) with darker lines indicating a higher transition probability based on the transition matrix. (d, e) Calculated kinetics for the exchange between native and twisted states, initially starting from the twisted state. Adapted with permission from ref 20. Copyright 2021 American Chemical Society.

Tokmakoff, Dinner, and co-workers demonstrated how experiment and MD can be combined to offer atomic-level insight into dissociation of the insulin dimer into two monomers.27 They constructed various free energy diagrams as a function of CVs, selected based on interacting residue pairs with a large difference in solvent accessible surface area between dimer and monomer states. The minimum energy path from the free energy diagram can be used to approximate the kinetic pathway and reaction energetics. While such an analysis is useful, it is often challenging to select appropriate CVs. In the last section of this Perspective, we discuss the possibility of incorporating machine learning to select appropriate CVs.

Markov state models (MSMs) are particularly useful for discerning kinetic pathways between two known structures.79,80 Building an MSM begins with generating a set of MD trajectories. The sampled conformations are then grouped into discrete “states”, with the aim of selecting the minimum number of states required to encompass the structural variety sampled. Next, based on the transitions observed between states in the MD simulations, a transition probability matrix can be estimated that describes the likelihood of transitioning between structures. Recently, Feng et al. examined the dissociation and dynamics of the insulin dimer (Figure 5a) using MSMs and simulated 2D-IR spectra.20 They developed multiple network representations of the dimer MSM for various CVs (Figure 5b) based on a series of unbiased-MD simulations totaling 1.71 ms. Finally, a network plot between the native and twisted states was analyzed based on the transition probability matrix, and kinetic analysis between states was performed (Figure 5c–e). MSMs are a powerful tool for analyzing dynamical systems. In the last section, we discuss how we envision combining TRXSS, MSMs, SAXS-guided MD, and machine learning (ML) to gain insight into stimuli parameters and atomic-level descriptions for relevant events.
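
The MSM construction outlined above can be sketched in a few lines of Python. This is a simplified illustration, assuming the trajectories have already been discretized into integer state labels and using plain row-normalized transition counts at a fixed lag time; production MSM software typically adds reversible estimation and validation steps that are omitted here. The function names are hypothetical.

```python
def estimate_transition_matrix(discrete_trajs, n_states, lag=1):
    """Row-normalize transition counts observed at a fixed lag time."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for traj in discrete_trajs:
        # Pair each frame with the frame `lag` steps later.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i][j] += 1.0

    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total > 0 else 0.0 for c in row])
    return matrix


def propagate(populations, matrix, n_steps):
    """Evolve state populations by repeated application of the matrix."""
    n = len(matrix)
    for _ in range(n_steps):
        populations = [
            sum(populations[i] * matrix[i][j] for i in range(n))
            for j in range(n)
        ]
    return populations
```

Starting `propagate` from a population vector concentrated in one state gives relaxation curves of the kind shown in Figure 5d,e, with each step corresponding to one lag time.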

2.4. Tailored Stimuli for Time-Resolved Measurements.

While modern empirical force fields have been successful in reproducing experimental results for globular proteins, there exists a disconnect between existing force fields and reproducing experimental observables for IDPs.17 IDPs do not adopt well-defined structures in solution. Rather, they exist in a multitude of partially or fully disordered states. They have a complicated potential energy surface, often with many local minima. As a result, it is computationally very demanding to survey the entire free energy surface. Techniques like TRXSS are ideal for studying IDP dynamics, as they report on changes in the average conformational ensemble of structures. However, because the conformational ensemble for IDPs often is associated with many structures, it is important to complement TRXSS with simulation-based methods to elucidate the underlying structures of the conformational ensemble. In this section, we discuss how ML can be used to gain insight into simulation parameters.

TRXSS requires an environmental perturbation to initiate the dynamics, often in the form of a pH-jump, T-jump, or photoexcitation event. Currently, we use the SADs, extracted from analysis of the time-resolved experimental data, to perform biased-MD without regard for the environmental perturbations preceding the observed dynamics. ML can be used to provide insight into the environmental perturbations and sort through feasible structures and kinetic pathways. Our proposed improvements focus on accurate incorporation of the experimental data to achieve consistent structures with biased-MD.

However, the inclusion of experimental data in MD simulations brings an additional challenge: accurately simulating the transient effects of the perturbation, rather than defining an assumed starting configuration. It is not yet clear how best to generate meaningful, diverse, and accurate starting structures that are both consistent with the scattering signal and account for effects of the environmental perturbation. For example, in the case of a T-jump, the solution is rapidly heated to induce unfolding of the protein, and the resulting dynamics are monitored as the system cools and returns to equilibrium. Currently, we do not have a way to characterize and incorporate effects of the increased thermal energy on the generated structures. For convenience, the SAD curves are matched to the initial temperature of the simulation, which does not take into account effects of the perturbation, even though the SAD curves represent ensemble information that includes those effects. The challenge arises in quantifying the extent of the perturbation so that simulated structural ensembles are consistent with experiment. The distribution of structures before the environmental perturbation will differ from that observed after it. REMD can be used to sample conformations more readily accessible at higher temperatures, but there is no guarantee that those structures are consistent with the post-T-jump structures. The incorporation of REMD will help in the commitment to the principle of maximum entropy and will generate a diverse set of structures for the heterogeneous ensembles, but it will not directly account for effects of the perturbation. To generate a diverse set of structures that is consistent with experiment and accounts for the input parameters and their uncertainty, Bayesian inference or ML can be used.

Bayesian inference has been successful in conformational sampling for proteins, but it requires prior knowledge of the structure as well as experimental parameters.81,82 On the other hand, machine-learned neural networks are capable of determining correlations between degrees of freedom and therefore offer the ability to reduce the parameter space of the system.83 This dimension reduction is especially important for relatively undefined parameters such as system responses to perturbative stimuli. We aim to understand the impact of perturbative stimuli on the system and use this information to improve MD simulations. Understanding the perturbative effects will also allow us to tailor the stimuli used in time-resolved measurements.
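
For readers less familiar with REMD, the Python sketch below shows the standard Metropolis criterion used to decide whether two replicas at different temperatures exchange configurations, which is how conformations accessible at higher temperatures reach the lower-temperature ensemble. It is a generic textbook form and not specific to the simulations discussed here; the function names and the energy units (kcal/mol) are assumptions made for illustration.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for exchanging replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), where E_i is the
    potential energy of the configuration currently held at temp_i.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the exchange is accepted for this attempt."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)
```

The criterion depends only on the replica temperatures and potential energies, which is why, as noted above, REMD by itself carries no information about the experimental perturbation.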

ML can be used to gain insight into stimuli and examine the best set of input parameters for MD simulations. For example, ML can classify the degree of pH or T-jump. One opportunity for using ML to analyze a range of parameters (T-jump, pH-jump, etc.) would be to learn which input parameters are most critical for generating structural ensembles consistent with experimental perturbations. Hydration shell parameters for different types of atoms are a complicated function of multiple inputs, including effects of nearby solute atoms, solvent atoms, and temperature.84 ML can be used to optimize the hydration shell parameters following a T-jump by exploring the effects of varying such parameters on the structural ensembles and using experimental results to determine the best fit. A fundamental understanding of the extent of the perturbations on the structural ensembles will provide insight into the experimental settings. By accounting for these settings (temperature, pH, etc.), we can use biased-MD to generate an ensemble of structures at the same settings. Additionally, as we learn more about quantifying the physical effects of the perturbations through our parameters obtained from ML, we can potentially improve future TRXSS experimental designs. For example, by examining the structural ensembles across multiple varying parameters (hydration shell, temperature, pH, etc.), we can begin to understand their effect on the structural ensembles. In doing so, we can tailor stimuli for our TRXSS experiments to attain structural ensembles predicted by the parameter search. Our overall goal is to improve integrated data streams to achieve the most accurate structural insight.
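
The simplest form of such a parameter search is a scan that scores each candidate value against the experimental curve. The Python sketch below illustrates only that scoring step, assuming an error-weighted chi-square per data point as the figure of merit; an ML approach would replace the exhaustive scan. The forward model `toy_difference_signal`, the q grid, and the parameter name `shell_contrast` are hypothetical placeholders, not a real scattering calculation.

```python
def chi_square_per_point(predicted, observed, sigma):
    """Error-weighted mean squared deviation between two curves."""
    terms = [((p - o) / s) ** 2 for p, o, s in zip(predicted, observed, sigma)]
    return sum(terms) / len(terms)


def select_parameter(candidates, forward_model, observed, sigma):
    """Return (chi2, value) for the candidate that best matches experiment."""
    scored = [
        (chi_square_per_point(forward_model(value), observed, sigma), value)
        for value in candidates
    ]
    return min(scored, key=lambda pair: pair[0])


# Hypothetical stand-in for computing a difference scattering curve from a
# simulated ensemble; a real calculation would replace this function.
Q_VALUES = [0.05, 0.10, 0.15, 0.20]


def toy_difference_signal(shell_contrast):
    return [shell_contrast * q * q for q in Q_VALUES]
```

In practice `forward_model` would wrap an ensemble-averaged scattering calculation, and several parameters (hydration shell, temperature, pH) would be varied together rather than one at a time.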

As mentioned in section 2.3, it is also possible to incorporate ML into our analysis of the structures generated from the biased-MD simulations. Once a set of structures consistent with each SAD curve has been obtained, we propose using Markov state models, similar to Dinner et al.,27 to analyze transition pathways between structures. ML could be used to analyze the generated structures and group them into similar categories through unsupervised clustering of structures based on CV values.85 Brute-force sorting of structures requires significant prior knowledge of the relevant collective variables that change significantly between structural groups. By implementing ML, the structures could be sorted efficiently into groups that share values of collective variables within a specified threshold. From here, a Markov state analysis using the structural groups and a transition probability matrix would provide transition pathways between structures. For IDPs and IDRs that exhibit significant structural variety, implementation of an ML algorithm to sort structures would be highly beneficial.
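
One common form of unsupervised clustering on CV values is k-means, sketched below in plain Python. It is shown as an example of the general idea and is not necessarily the algorithm of the cited work; it groups structures by proximity to a cluster center rather than by an explicit threshold. Each structure is assumed to be represented by a tuple of CV values, and the number of clusters `k` is assumed to be chosen by the user.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two CV vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm: returns (labels, centers) for CV vectors."""
    rng = random.Random(seed)
    # Initialize centers from k distinct randomly chosen structures.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each structure joins its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(coords) / len(members) for coords in zip(*members)]
    return labels, centers
```

The resulting labels can serve directly as the discrete states of a Markov state analysis, with the transition probability matrix estimated from how the labels change along each trajectory.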

3. CONCLUSIONS

Our understanding of proteins’ functions relies on a complete understanding of their dynamics. Approximately half of all proteins are IDPs or contain IDRs. Experimental techniques provide ensemble averages for such proteins’ structures, but the challenge emerges with accurately weighting the structures that contribute to the heterogeneous ensemble. Here, we propose using TRXSS to probe structural changes of the ensemble after an environmental perturbation (pH, T-jump, photodissociation, etc.) and to understand relevant time scales. Next, we propose using SVD and global analysis to extract SADs to be used for biased-MD. A series of biased-MD simulations, with a commitment to the principle of maximum entropy, can be used to generate a multitude of structures consistent with the experimental input. Machine learning can be implemented to sort through the simulations and group similar structures. From here, MSMs can be used to elucidate plausible transition pathways between structures. We believe this process provides a complete picture of protein dynamics, with the time-scale information coming from TRXSS and kinetic and structural insight achieved through biased-MD and MSMs. Additionally, machine learning will be used to gain additional insight into stimuli parameters. In doing so, experimental setups can be tailored to maximize environmental perturbation and MD can better reflect the perturbed system. Understanding the dynamics and structures of IDPs and IDRs will assist in addressing fundamental questions related to human health and drug delivery and development. Additionally, we will continue to address and improve how experimental inputs can accurately be incorporated into MD to generate physically measured structures and insight into transition pathways.

ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health (NIH), under contract no. R01-GM115761. A.M.C. was funded by a training grant supported by the NIGMS/NIH (no. T32GM140995). This research used resources of the Advanced Photon Source, sponsored by the U.S. DOE Office of Science and operated by Argonne National Laboratory under contract no. DE-AC02-06CH11357.

Biographies

Adam K. Nijhawan is a third year graduate student in the chemistry department at Northwestern University. He earned his undergraduate degree in chemistry from Carleton College while working in Prof. Daniela Kohen’s research group. Adam’s research interests involve using experimental signals as input for molecular dynamics simulations to study protein folding dynamics at the atomic level. He is interested in using time-resolved X-ray solution scattering and advanced molecular dynamics sampling techniques to characterize kinetic pathways for intrinsically disordered proteins. Outside of chemistry, Adam is passionate about teaching, soccer, and jazz music.

Arnold M. Chan is a doctoral candidate in the Department of Chemistry at Northwestern University. He received his undergraduate degree in chemistry from the University of California at Berkeley as a Regents’ and Chancellor’s Scholar in 2018. In the same year, he began his graduate studies, advised by Prof. Lin X. Chen, studying protein and nucleic acid structural responses to environmental perturbations with time-resolved X-ray scattering methods. He was a trainee in the NIH-sponsored Molecular Biophysics Training Program at Northwestern University. Arnold has broad research interests in biophysical chemistry, which include understanding energy transport in photosynthetic light-harvesting proteins and investigating the structural dynamics of biomolecules with mixed experimental and computational approaches.

Darren J. Hsu received his B.Sc. in Chemistry in 2015 from National Taiwan University and Ph.D. in Chemistry in 2020 from Northwestern University, focusing on the instrumentation of time-resolved X-ray scattering experiments and integration of such data into molecular dynamics simulations. He is currently a postdoctoral research associate at the National Center for Computational Science at Oak Ridge National Laboratory. His research interest is in the development of experimental data-driven simulations, high-throughput methods for sampling ligand binding poses, and efficient machine-learned configuration and structure generators.

Lin X. Chen is a Professor of Chemistry at Northwestern University and a Distinguished Fellow at Argonne National Laboratory. She received her Ph.D. in physical chemistry from the University of Chicago and did postdoctoral research at the University of California. Using ultrafast laser and X-ray spectroscopies and X-ray scattering, she studies fundamental light–matter interactions of different solar energy conversion platforms and excited state functional structural dynamics of transition metal complexes and biomacromolecules. She has served as Senior Editor of ACS Energy Letters, as Associate Editor of Chemical Science (RSC), and on the Basic Energy Science Advisory Committee, Basic Energy Science, US Department of Energy. She is an AAAS and RSC Fellow. Her group website is http://chemgroups.northwestern.edu/chen_group/.

Kevin L. Kohlstedt is a Research Assistant Professor of Chemistry at Northwestern University. He earned his undergraduate degree in engineering physics from the University of Kansas and a Ph.D. in chemical engineering from Northwestern University. He was a postdoctoral fellow at the University of Michigan. Kevin joined Northwestern University in 2015 as a research faculty in chemistry. Kevin’s research interests involve describing mesoscale phenomena such as charge transport, photonic properties of self-assembled nanostructures, and protein structure and dynamics, using computational and theoretical frameworks. He uses a variety of approaches to study not only the phenomena of interest but also the energetics and kinetics of the molecular structures. Kevin has numerous publications across a variety of journals. His awards include the DOE Computational Science Graduate Fellowship, Nature Publication Group’s best presentation at the Soft Matter Conference (2012), and the Air Force Summer Faculty Fellowship (2019). More information can be found at his website: http://sites.northwestern.edu/kevkohls.

Footnotes

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.1c05820.

The full data reported in Figure 1 (PDF)

The authors declare no competing financial interest.

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jpcb.1c05820

Contributor Information

Adam K. Nijhawan, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States.

Arnold M. Chan, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States.

Darren J. Hsu, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States.

Lin X. Chen, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States; Chemical Sciences and Engineering Division, Argonne National Laboratory, Argonne, Illinois 60439, United States.

Kevin L. Kohlstedt, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States.

REFERENCES

  • (1) Liu Y; Wang X; Liu B A Comprehensive Review and Comparison of Existing Computational Methods for Intrinsically Disordered Protein and Region Prediction. Briefings Bioinf. 2019, 20, 330–346.
  • (2) DeForte S; Uversky VN Order, Disorder, and Everything in Between. Molecules 2016, 21, 1090.
  • (3) Michaelis M; Hildebrand N; Meißner RH; Wurzler N; Li Z; Hirst JD; Micsonai A; Kardos J; Delle Piane M; Colombi Ciacchi L Impact of the Conformational Variability of Oligopeptides on the Computational Prediction of Their CD Spectra. J. Phys. Chem. B 2019, 123, 6694–6704.
  • (4) Hsu DJ; Leshchev D; Kosheleva I; Kohlstedt KL; Chen LX Unfolding Bovine α-Lactalbumin with T-jump: Characterizing Disordered Intermediates Via Time-Resolved X-Ray Solution Scattering and Molecular Dynamics Simulations. J. Chem. Phys. 2021, 154, 105101.
  • (5) Davis CM; Reddish MJ; Dyer RB Dual Time-Resolved Temperature-Jump Fluorescence and Infrared Spectroscopy for the Study of Fast Protein Dynamics. Spectrochim. Acta, Part A 2017, 178, 185–191.
  • (6) Jeong B-S; Dyer RB Proton Transport Mechanism of M2 Proton Channel Studied by Laser-Induced pH Jump. J. Am. Chem. Soc. 2017, 139, 6621–6628.
  • (7) Poroikov VV Computer-Aided Drug Design: From Discovery of Novel Pharmaceutical Agents to Systems Pharmacology. Biochemistry (Moscow), Supplement Series B: Biomedical Chemistry 2020, 14, 216–227.
  • (8) Marinko JT; Huang H; Penn WD; Capra JA; Schlebach JP; Sanders CR Folding and Misfolding of Human Membrane Proteins in Health and Disease: From Single Molecules to Cellular Proteostasis. Chem. Rev. 2019, 119, 5537–5606.
  • (9) Hu G; Doruker P; Li H; Demet Akten E Editorial: Understanding Protein Dynamics, Binding and Allostery for Drug Design. Frontiers in Molecular Biosciences 2021, 8, 681364.
  • (10) Khan FI; Wei D-Q; Gu K-R; Hassan MI; Tabrez S Current Updates on Computer Aided Protein Modeling and Designing. Int. J. Biol. Macromol. 2016, 85, 48–62.
  • (11) Paissoni C; Jussupow A; Camilloni C Determination of Protein Structural Ensembles by Hybrid-Resolution SAXS Restrained Molecular Dynamics. J. Chem. Theory Comput. 2020, 16, 2825–2834.
  • (12) Bottaro S; Bengtsen T; Lindorff-Larsen K Integrating Molecular Simulation and Experimental Data: A Bayesian/Maximum Entropy Reweighting Approach. In Structural Bioinformatics: Methods and Protocols; Gáspári Z, Ed.; Springer US: New York, 2020; pp 219–240.
  • (13) Orioli S; Larsen AH; Bottaro S; Lindorff-Larsen K Chapter Three - How to Learn from Inconsistencies: Integrating Molecular Simulations with Experimental Data. In Progress in Molecular Biology and Translational Science; Strodel B, Barz B, Eds.; Academic Press: 2020; Vol. 170, pp 123–176.
  • (14) Latham AP; Zhang B Maximum Entropy Optimized Force Field for Intrinsically Disordered Proteins. J. Chem. Theory Comput. 2020, 16, 773–781.
  • (15) Pitera JW; Chodera JD On the Use of Experimental Observations to Bias Simulated Ensembles. J. Chem. Theory Comput. 2012, 8, 3445–3451.
  • (16) Pietrek LM; Stelzl LS; Hummer G Hierarchical Ensembles of Intrinsically Disordered Proteins at Atomic Resolution in Molecular Dynamics Simulations. J. Chem. Theory Comput. 2020, 16, 725–737.
  • (17) Demerdash O; Shrestha UR; Petridis L; Smith JC; Mitchell JC; Ramanathan A Using Small-Angle Scattering Data and Parametric Machine Learning to Optimize Force Field Parameters for Intrinsically Disordered Proteins. Frontiers in Molecular Biosciences 2019, 6, 64.
  • (18) Karczyńska AS; Mozolewska MA; Krupa P; Giełdoń A; Liwo A; Czaplewski C Prediction of Protein Structure with the Coarse-Grained UNRES Force Field Assisted by Small X-Ray Scattering Data and Knowledge-Based Information. Proteins: Struct., Funct., Genet. 2018, 86, 228–239.
  • (19) Brotzakis ZF; Vendruscolo M; Bolhuis PG A Method of Incorporating Rate Constants as Kinetic Constraints in Molecular Dynamics Simulations. Proc. Natl. Acad. Sci. U. S. A. 2021, 118, e2012423118.
  • (20) Feng C-J; Sinitskiy A; Pande V; Tokmakoff A Computational IR Spectroscopy of Insulin Dimer Structure and Conformational Heterogeneity. J. Phys. Chem. B 2021, 125, 4620–4633.
  • (21) Zhang X-X; Jones KC; Fitzpatrick A; Peng CS; Feng C-J; Baiz CR; Tokmakoff A Studying Protein–Protein Binding through T-Jump Induced Dissociation: Transient 2D IR Spectroscopy of Insulin Dimer. J. Phys. Chem. B 2016, 120, 5134–5145.
  • (22) Rimmerman D; Leshchev D; Hsu DJ; Hong J; Kosheleva I; Chen LX Direct Observation of Insulin Association Dynamics with Time-Resolved X-Ray Scattering. J. Phys. Chem. Lett. 2017, 8, 4413–4418.
  • (23) Chung HS; Ganim Z; Jones KC; Tokmakoff A Transient 2D IR Spectroscopy of Ubiquitin Unfolding Dynamics. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 14237.
  • (24) Bhowmick A; et al. Finding Our Way in the Dark Proteome. J. Am. Chem. Soc. 2016, 138, 9730–9742.
  • (25) Jones CM; Henry ER; Hu Y; Chan CK; Luck SD; Bhuyan A; Roder H; Hofrichter J; Eaton WA Fast Events in Protein Folding Initiated by Nanosecond Laser Photolysis. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 11860.
  • (26) Piana S; Lindorff-Larsen K; Shaw DE Atomic-Level Description of Ubiquitin Folding. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 5915.
  • (27) Antoszewski A; Feng C-J; Vani BP; Thiede EH; Hong L; Weare J; Tokmakoff A; Dinner AR Insulin Dissociates by Diverse Mechanisms of Coupled Unfolding and Unbinding. J. Phys. Chem. B 2020, 124, 5571–5587.
  • (28) Lincoff J; Haghighatlari M; Krzeminski M; Teixeira JMC; Gomes G-NW; Gradinaru CC; Forman-Kay JD; Head-Gordon T Extended Experimental Inferential Structure Determination Method in Determining the Structural Ensembles of Disordered Protein States. Communications Chemistry 2020, 3, 74.
  • (29) Levantino M; Yorke BA; Monteiro DCF; Cammarata M; Pearson AR Using Synchrotrons and XFELs for Time-Resolved X-Ray Crystallography and Solution Scattering Experiments on Biomolecules. Curr. Opin. Struct. Biol. 2015, 35, 41–48.
  • (30) LeBlanc SJ; Kulkarni P; Weninger KR Single Molecule FRET: A Powerful Tool to Study Intrinsically Disordered Proteins. Biomolecules 2018, 8, 140.
  • (31) Xie M; Yu L; Bruschweiler-Li L; Xiang X; Hansen AL; Brüschweiler R Functional Protein Dynamics on Uncharted Time Scales Detected by Nanoparticle-Assisted NMR Spin Relaxation. Science Advances 2019, 5, eaax5560.
  • (32) Thompson MC; Barad BA; Wolff AM; Sun Cho H; Schotte F; Schwarz DMC; Anfinrud P; Fraser JS Temperature-Jump Solution X-Ray Scattering Reveals Distinct Motions in a Dynamic Enzyme. Nat. Chem. 2019, 11, 1058–1066.
  • (33) Cammarata M; Levantino M; Schotte F; Anfinrud PA; Ewald F; Choi J; Cupane A; Wulff M; Ihee H Tracking the Structural Dynamics of Proteins in Solution Using Time-Resolved Wide-Angle X-Ray Scattering. Nat. Methods 2008, 5, 881–6.
  • (34) Chong S-H; Im H; Ham S Explicit Characterization of the Free Energy Landscape of pKID–KIX Coupled Folding and Binding. ACS Cent. Sci. 2019, 5, 1342–1351.
  • (35) Qiao Q; Bowman GR; Huang X Dynamics of an Intrinsically Disordered Protein Reveal Metastable Conformations That Potentially Seed Aggregation. J. Am. Chem. Soc. 2013, 135, 16092–16101.
  • (36) Banerjee P; Bagchi B Dynamical Control by Water at a Molecular Level in Protein Dimer Association and Dissociation. Proc. Natl. Acad. Sci. U. S. A. 2020, 117, 2302.
  • (37) Marinelli F; Pietrucci F; Laio A; Piana S A Kinetic Model of Trp-Cage Folding from Multiple Biased Molecular Dynamics Simulations. PLoS Comput. Biol. 2009, 5, e1000452.
  • (38) Snow CD; Zagrovic B; Pande VS The Trp Cage: Folding Kinetics and Unfolded State Topology Via Molecular Dynamics Simulations. J. Am. Chem. Soc. 2002, 124, 14548–14549.
  • (39) Cellmer T; Buscaglia M; Henry ER; Hofrichter J; Eaton WA Making Connections between Ultrafast Protein Folding Kinetics and Molecular Dynamics Simulations. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 6103.
  • (40) Chang N.-y.; Li Y-C; Jheng C-P; Kuo Y-T; Lee C-I Characterizing the Denatured State Ensemble of Ubiquitin under Native Conditions Using Replica Exchange Molecular Dynamics. RSC Adv. 2016, 6, 95584–95589.
  • (41) Henry L; Panman MR; Isaksson L; Claesson E; Kosheleva I; Henning R; Westenhoff S; Berntsson O Real-Time Tracking of Protein Unfolding with Time-Resolved X-Ray Solution Scattering. Struct. Dyn. 2020, 7, 054702.
  • (42) Berntsson O; et al. Time-Resolved X-Ray Solution Scattering Reveals the Structural Photoactivation of a Light-Oxygen-Voltage Photoreceptor. Structure 2017, 25, 933–938.e3.
  • (43) Kim JG; Kim TW; Kim J; Ihee H Protein Structural Dynamics Revealed by Time-Resolved X-Ray Solution Scattering. Acc. Chem. Res. 2015, 48, 2200–2208.
  • (44) Heyes DJ; Hardman SJO; Pedersen MN; Woodhouse J; De La Mora E; Wulff M; Weik M; Cammarata M; Scrutton NS; Schirò G Light-Induced Structural Changes in a Full-Length Cyanobacterial Phytochrome Probed by Time-Resolved X-Ray Scattering. Communications Biology 2019, 2, 1.
  • (45) Meuzelaar H; Panman MR; van Dijk CN; Woutersen S Folding of a Zinc-Finger ββα-Motif Investigated Using Two-Dimensional and Time-Resolved Vibrational Spectroscopy. J. Phys. Chem. B 2016, 120, 11151–11158.
  • (46) Meuzelaar H; Marino KA; Huerta-Viga A; Panman MR; Smeenk LEJ; Kettelarij AJ; van Maarseveen JH; Timmerman P; Bolhuis PG; Woutersen S Folding Dynamics of the Trp-Cage Miniprotein: Evidence for a Native-Like Intermediate from Combined Time-Resolved Vibrational Spectroscopy and Molecular Dynamics Simulations. J. Phys. Chem. B 2013, 117, 11490–11501.
  • (47) Snow CD; Qiu L; Du D; Gai F; Hagen SJ; Pande VS Trp Zipper Folding Kinetics by Molecular Dynamics and Temperature-Jump Spectroscopy. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 4077.
  • (48) Mohammed OF; Jas GS; Lin MM; Zewail AH Primary Peptide Folding Dynamics Observed with Ultrafast Temperature Jump. Angew. Chem., Int. Ed. 2009, 48, 5628–32.
  • (49) Meadows CW; Balakrishnan G; Kier BL; Spiro TG; Klinman JP Temperature-Jump Fluorescence Provides Evidence for Fully Reversible Microsecond Dynamics in a Thermophilic Alcohol Dehydrogenase. J. Am. Chem. Soc. 2015, 137, 10060–10063.
  • (50) Qiu L; Pabit SA; Roitberg AE; Hagen SJ Smaller and Faster: The 20-Residue Trp-Cage Protein Folds in 4 μs. J. Am. Chem. Soc. 2002, 124, 12952–12953.
  • (51) Rimmerman D; Leshchev D; Hsu DJ; Hong J; Abraham B; Henning R; Kosheleva I; Chen LX Revealing Fast Structural Dynamics in pH-Responsive Peptides with Time-Resolved X-Ray Scattering. J. Phys. Chem. B 2019, 123, 2016–2021.
  • (52) Rimmerman D; Leshchev D; Hsu DJ; Hong J; Abraham B; Henning R; Kosheleva I; Chen LX Probing Cytochrome C Folding Transitions Upon Phototriggered Environmental Perturbations Using Time-Resolved X-Ray Scattering. J. Phys. Chem. B 2018, 122, 5218–5224.
  • (53) Baiz CR; Lin Y-S; Peng CS; Beauchamp KA; Voelz VA; Pande VS; Tokmakoff A A Molecular Interpretation of 2D IR Protein Folding Experiments with Markov State Models. Biophys. J. 2014, 106, 1359–1370.
  • (54) Jones KC; Peng CS; Tokmakoff A Folding of a Heterogeneous β-Hairpin Peptide from Temperature-Jump 2D IR Spectroscopy. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 2828.
  • (55) Hsu DJ; Leshchev D; Kosheleva I; Kohlstedt KL; Chen LX Integrating Solvation Shell Structure in Experimentally Driven Molecular Dynamics Using X-Ray Solution Scattering Data. J. Chem. Phys. 2020, 152, 204115.
  • (56) Fitzpatrick AWP; et al. Atomic Structure and Hierarchical Assembly of a Cross-β Amyloid Fibril. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 5468–5473.
  • (57) Newby FN; De Simone A; Yagi-Utsumi M; Salvatella X; Dobson CM; Vendruscolo M Structure-Free Validation of Residual Dipolar Coupling and Paramagnetic Relaxation Enhancement Measurements of Disordered Proteins. Biochemistry 2015, 54, 6876–6886.
  • (58) Calvey GD; Katz AM; Pollack L Microfluidic Mixing Injector Holder Enables Routine Structural Enzymology Measurements with Mix-and-Inject Serial Crystallography Using X-Ray Free Electron Lasers. Anal. Chem. 2019, 91, 7139–7144.
  • (59) Stickrath AB; Mara MW; Lockard JV; Harpham MR; Huang J; Zhang X; Attenkofer K; Chen LX Detailed Transient Heme Structures of Mb-CO in Solution after CO Dissociation: An X-Ray Transient Absorption Spectroscopic Study. J. Phys. Chem. B 2013, 117, 4705–4712.
  • (60) Hsu DJ; Leshchev D; Rimmerman D; Hong J; Kelley MS; Kosheleva I; Zhang X; Chen LX X-Ray Snapshots Reveal Conformational Influence on Active Site Ligation During Metalloprotein Folding. Chemical Science 2019, 10, 9788–9800.
  • (61) Cho HS; Dashdorj N; Schotte F; Graber T; Henning R; Anfinrud P Protein Structural Dynamics in Solution Unveiled Via 100-Ps Time-Resolved X-Ray Scattering. Proc. Natl. Acad. Sci. U. S. A. 2010, 107, 7281.
  • (62) Shelby ML; et al. Interplays of Electron and Nuclear Motions Along CO Dissociation Trajectory in Myoglobin Revealed by Ultrafast X-Rays and Quantum Dynamics Calculations. Proc. Natl. Acad. Sci. U. S. A. 2021, 118, e2018966118.
  • (63) Kim TW; et al. Protein Folding from Heterogeneous Unfolded State Revealed by Time-Resolved X-Ray Solution Scattering. Proc. Natl. Acad. Sci. U. S. A. 2020, 117, 14996.
  • (64) Ashwood B; Lewis NHC; Sanstead PJ; Tokmakoff A Temperature-Jump 2D IR Spectroscopy with Intensity-Modulated CW Optical Heating. J. Phys. Chem. B 2020, 124, 8665–8677.
  • (65) Hermann MR; Hub JS SAXS-Restrained Ensemble Simulations of Intrinsically Disordered Proteins with Commitment to the Principle of Maximum Entropy. J. Chem. Theory Comput. 2019, 15, 5103–5115.
  • (66) Frank J Time-Resolved Cryo-Electron Microscopy: Recent Progress. J. Struct. Biol. 2017, 200, 303–306.
  • (67) Ota M; Koike R; Amemiya T; Tenno T; Romero PR; Hiroaki H; Dunker AK; Fukuchi S An Assignment of Intrinsically Disordered Regions of Proteins Based on NMR Structures. J. Struct. Biol. 2013, 181, 29–36.
  • (68) Goda N; Shimizu K; Kuwahara Y; Tenno T; Noguchi T; Ikegami T; Ota M; Hiroaki H A Method for Systematic Assessment of Intrinsically Disordered Protein Regions by NMR. Int. J. Mol. Sci. 2015, 16, 15743–15760.
  • (69) Brookes DH; Head-Gordon T Experimental Inferential Structure Determination of Ensembles for Intrinsically Disordered Proteins. J. Am. Chem. Soc. 2016, 138, 4530–4538.
  • (70) Donten ML; Hamm P pH-Jump Induced α-Helix Folding of Poly-L-Glutamic Acid. Chem. Phys. 2013, 422, 124–130.
  • (71) Crehuet R; Buigues PJ; Salvatella X; Lindorff-Larsen K Bayesian-Maximum-Entropy Reweighting of IDP Ensembles Based on NMR Chemical Shifts. Entropy 2019, 21, 898.
  • (72) Björling A; Niebling S; Marcellini M; van der Spoel D; Westenhoff S Deciphering Solution Scattering Data with Experimentally Guided Molecular Dynamics Simulations. J. Chem. Theory Comput. 2015, 11, 780–787.
  • (73) Sidky H; Chen W; Ferguson AL High-Resolution Markov State Models for the Dynamics of Trp-Cage Miniprotein Constructed over Slow Folding Modes Identified by State-Free Reversible VAMPnets. J. Phys. Chem. B 2019, 123, 7999–8009.
  • (74) Chiavazzo E; Covino R; Coifman RR; Gear CW; Georgiou AS; Hummer G; Kevrekidis IG Intrinsic Map Dynamics Exploration for Uncharted Effective Free-Energy Landscapes. Proc. Natl. Acad. Sci. U. S. A. 2017, 114, E5494–E5503.
  • (75).Chen W; Tan AR; Ferguson AL Collective Variable Discovery and Enhanced Sampling Using Autoencoders: Innovations in Network Architecture and Error Function Design. J. Chem. Phys 2018, 149, 072312. [DOI] [PubMed] [Google Scholar]
  • (76).Schwantes CR; Pande VS Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of Ntl9. J. Chem. Theory Comput 2013, 9, 2000–2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (77).Perez-Hernandez G; Paul F; Giorgino T; De Fabritiis G; Noe F Identification of Slow Molecular Order Parameters for Markov Model Construction. J. Chem. Phys 2013, 139, 015102. [DOI] [PubMed] [Google Scholar]
  • (78).Herrera-Nieto P; Pérez A; De Fabritiis G Characterization of Partially Ordered States in the Intrinsically Disordered N-Terminal Domain of P53 Using Millisecond Molecular Dynamics Simulations. Sci. Rep 2020, 10, 12402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (79).Husic BE; Pande VS Markov State Models: From an Art to a Science. J. Am. Chem. Soc 2018, 140, 2386–2396. [DOI] [PubMed] [Google Scholar]
  • (80).Pande VS; Beauchamp K; Bowman GR Everything You Wanted to Know About Markov State Models but Were Afraid to Ask. Methods 2010, 52, 99–105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (81).Perez A; MacCallum JL; Dill KA Accelerating Molecular Simulations of Proteins Using Bayesian Inference on Weak Information. Proc. Natl. Acad. Sci. U. S. A 2015, 112, 11846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (82).Potrzebowski W; Trewhella J; Andre I Bayesian Inference of Protein Conformational Ensembles from Limited Structural Data. PLoS Comput. Biol 2018, 14, e1006641. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (83).Tsuchiya Y; Tomii K Neural Networks for Protein Structure and Function Prediction and Dynamic Analysis. Biophys. Rev 2020, 12, 569–573. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (84).Laage D; Elsaesser T; Hynes JT Water Dynamics in the Hydration Shells of Biomolecules. Chem. Rev 2017, 117, 10694–10725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (85).Fleetwood O; Kasimova MA; Westerlund AM; Delemotte L Molecular Insights from Conformational Ensembles Via Machine Learning. Biophys. J 2020, 118, 765–780. [DOI] [PMC free article] [PubMed] [Google Scholar]
