Author manuscript; available in PMC: 2023 Feb 1.
Published in final edited form as: Curr Opin Struct Biol. 2021 Sep 27;72:95–102. doi: 10.1016/j.sbi.2021.08.009

Modeling biomolecular kinetics with large-scale simulation

Peter M Kasson 1,2,*
PMCID: PMC9476681  NIHMSID: NIHMS1833958  PMID: 34592698

Abstract

The molecular details of biomolecular kinetics present a challenging estimation problem because the identities of relevant intermediates as well as the rates of exchange between them must be determined. These can be derived from prior knowledge, but in recent years great advances have been made in the development and application of methods to systematically determine states and rates using biomolecular simulation. Doing this for biological systems of reasonable complexity requires substantial computational power, and contemporary methods leverage distributed-computing or leadership-class computing resources to accomplish this. The result has been substantial insight into pressing contemporary problems, including structural activation of pandemic viruses. Here we highlight recent developments in both methodology and exciting applications.

Keywords: ensemble molecular dynamics, biomolecular kinetics, Markov State Models, unbiased sampling


The kinetics of biomolecular systems are challenging yet richly informative targets for computational prediction. Biomolecular kinetics are hard and useful for the same reason: while equilibrium distributions depend on the relative free energies of each molecular population, kinetics depend on both the molecular pathways between these stable populations and the free energies of transition states. These details illuminate molecular mechanisms to a much greater extent but have more stringent requirements for accuracy and sampling of the molecular processes involved. Biomolecular kinetics can also be used to examine non-equilibrium systems that dominate many important biological processes.

Physics-based methods to estimate complex biomolecular kinetics run into a fundamental problem: the vibrational frequencies of interatomic bonds are fast, which requires a short timestep for discrete-time integrators, on the order of femtoseconds. However, many biomolecular processes of interest have timescales of milliseconds to seconds, and these processes must be sampled many times over to develop accurate estimates of pathways and rates. Two types of advanced sampling approaches have been developed to help overcome this challenge. Biasing-based approaches, often but not always knowledge-guided, can alter the simulation Hamiltonian to facilitate barrier crossing and ease sampling. Such approaches (reviewed in [1]) can be very powerful but can also fall afoul of the no-free-lunch principle: in many cases the knowledge injected into the system to facilitate sampling encodes a relatively stringent set of mechanistic assumptions. When these are correct or, better yet, based on experimental data [2–4], this is an excellent approach.

Here, we focus on unbiased approaches to estimating biomolecular kinetics using molecular dynamics simulations. “Unbiased” in this case specifies that the individual molecular trajectories evolve according to the native Hamiltonian for the system rather than a perturbed Hamiltonian. Many of the approaches for unbiased simulation originated from and align with the computational capabilities of distributed computing projects [5], but the underlying algorithms have generalized well to a wide range of computational platforms that provide a mix of breadth (number of trajectories) and depth (length of trajectories) [6,7]. We review principles, briefly survey new algorithmic developments, and then focus on insight gained for complex biomolecular systems before outlining challenges and future areas for development.

1. Why ensemble-based models?

In order to estimate biomolecular kinetics, we must sample each step in a molecular pathway several times to obtain a robust rate estimate. By the principle of ergodicity, we could do this with a single long trajectory that traverses the pathway many times, or we could do the same with many uncorrelated trajectories. The latter scenario forms the basis for ensemble models, since most compute platforms have better weak scaling than strong scaling (one could double the system “size”, or in this case run two systems, and get better performance than running the original system with twice as many compute elements). The challenge, of course, is obtaining uncorrelated trajectories. In a system with multiple metastable states and good separation of timescales, individual trajectories will decorrelate within a metastable state much faster than they transition between states. This property is not guaranteed, however. Analytic approaches such as Markov State Models, described below, attempt to optimize the metastability of each state to achieve such a separation of timescales.

In addition to parallelism, ensemble models can achieve speedup via advanced sampling techniques that do not bias individual trajectories. Instead, algorithms select conformations as the start point for unbiased trajectories such that the resulting dataset will yield a converged kinetic model more efficiently. These sampling algorithms can encode information about reaction coordinates (biased sampling) or remain agnostic of such data (unbiased sampling). As we discuss for several applications, biased sampling tends to yield more rapid convergence but places more stringent requirements on initial knowledge and is less robust to uncertainties about the system. Setting aside errors of the individual simulation trajectories for the moment, unbiased-sampling models have two classes of errors: classifying states incorrectly and sampling the wrong regions of conformation space. The first causes kinetic estimates to be too fast, while the second can err in either direction but primarily causes estimates to be too slow (Fig. 2). Too-slow estimates arise when a pathway from state a to state b is assumed to go through region c but in fact has higher flux (and faster kinetics) through region d. Fortunately, this error can be bounded by the quantity of unbiased simulation starting in state a: given n trajectories, if the pathway through region d is expected to be k-fold faster, then each trajectory would be ~k-fold more likely to go through d than c. So given a sufficient number of unbiased trajectories a->c, one can bound the likelihood of a region d that yields a k-fold faster rate. These error types also apply to biased sampling, where the aggregate unbiased sampling from a metastable basin is typically lower (due to the greater efficiency), so biased sampling is more prone to such errors.
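The bounding argument can be made concrete with a small sketch. It assumes, as a simplification not stated in the text, that exits from state a through c and d are competing first-order processes and that trajectories are independent, so a single exit goes through c with probability 1/(1+k). The function names are illustrative only.

```python
import math


def prob_all_via_c(n_traj: int, k_fold: float) -> float:
    """Probability that n independent exits from state a all pass through c
    when a competing route through d is k_fold faster (assumed model:
    competing first-order exits, so P(via c) = 1 / (1 + k_fold))."""
    if n_traj < 0 or k_fold <= 0:
        raise ValueError("need n_traj >= 0 and k_fold > 0")
    return (1.0 / (1.0 + k_fold)) ** n_traj


def trajectories_needed(k_fold: float, alpha: float = 0.05) -> int:
    """Smallest n for which observing only a->c exits would occur with
    probability at most alpha if a k_fold faster route through d existed."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.ceil(math.log(alpha) / -math.log1p(k_fold))


if __name__ == "__main__":
    for k in (1.0, 2.0, 10.0):
        n = trajectories_needed(k)
        print(f"k = {k:>4}: n = {n}, P(all via c) = {prob_all_via_c(n, k):.4f}")
```

Under these assumptions the bound tightens geometrically with n, which is why a modest number of unbiased trajectories from a basin can already rule out a much faster hidden pathway, while a pathway of comparable speed is far harder to exclude.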

Figure 2. Two types of errors in sampling and statistical model building.

Errors are illustrated using a mountain landscape. State classification errors can result in too-fast estimates of kinetics: if states a and c are misclassified together as â, then the rate â->b will be much faster than the a->b rate. Sampling incorrect paths can cause too-slow estimates of kinetics: if directed sampling is used to optimize the path a->c->b, local optimization may not explore a->d->b, which yields overall faster kinetics and would carry more flux from a->b. Rendering using Google Earth.

In recent years, these ensemble models have gone from highly specialized to mainstream, with multiple ensemble approaches finding application to many biomolecular systems. Tools to apply ensemble simulation and construct kinetic models are also maturing. We highlight some recent algorithmic developments and notable applications of ensemble simulation, with apologies to all the excellent work that we are not able to include.

2. Recent algorithmic developments

One of the great advances in ensemble simulation has been an increase in the use of algorithms to guide the selection of trajectories rather than only for post-hoc analysis. Such algorithms for ensemble simulation can be roughly categorized as 1) knowledge-guided, 2) based on path sampling or phase-space sampling, or 3) driven by a generative model, here defined as a probabilistic model for the underlying kinetic process where trajectories can be treated as samples from the model. The first has been very successful in selected applications [8,9] but tends not to be algorithmic, although there are certainly generalizable strategies that can be gleaned. We briefly describe algorithmic advances in the second and third categories.

Path sampling and phase-space sampling algorithms for ensemble simulation

Path-sampling and phase-space-sampling algorithms generally work on the principle of starting new simulations from under-sampled regions of conformation space. The primary difference is that path-sampling focuses this sampling on transition paths between identified initial and final states, whereas phase-space sampling relaxes this condition (Fig. 3). In the absence of a reasonable starting path or when substantially different new paths can be discovered, phase-space sampling is more robust. However, when the paths carrying the majority of the flux between initial and final states of interest can be well identified, path-sampling is more efficient. Many of the fundamental ideas behind this approach were worked out for transition-path sampling [10] and milestoning [11,12]; more recent developments have been based largely on the string [13,14] formalism and the swarms variant [15], which explicitly uses ensembles distributed along some progress path to generate trajectories. Notable recent developments in this area have relaxed the requirements regarding reaction coordinates. We highlight here the weighted-ensemble method [16–18], which involves resampling according to a progress coordinate, which can be multidimensional. Other notable advances include diffusion-map-directed molecular dynamics, which employs locally scaled diffusion maps to direct sampling and thus does not require predetermined progress coordinates [19].
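The resampling step of a weighted-ensemble scheme can be sketched as follows. This is a minimal illustration of the split/merge idea with a one-dimensional progress coordinate and a fixed target number of walkers per bin; the function names, the bin edges, and the choice to split the heaviest and merge the two lightest walkers are assumptions made for the example and do not reproduce any particular software package.

```python
import bisect
import random
from collections import defaultdict


def resample_bin(walkers, target, rng):
    """Split or merge (weight, state) walkers in one bin until the bin
    holds `target` walkers; the total weight is conserved. Weights are
    assumed to be positive."""
    walkers = list(walkers)
    if not walkers:
        return []
    while len(walkers) < target:
        # Split the heaviest walker into two copies with half the weight each.
        walkers.sort(key=lambda w: w[0])
        weight, state = walkers.pop()
        walkers += [(weight / 2.0, state), (weight / 2.0, state)]
    while len(walkers) > target:
        # Merge the two lightest walkers; the survivor is chosen with
        # probability proportional to weight and carries the summed weight.
        walkers.sort(key=lambda w: w[0])
        (w1, s1), (w2, s2) = walkers[0], walkers[1]
        total = w1 + w2
        keep = s1 if total == 0 or rng.random() < w1 / total else s2
        walkers = [(total, keep)] + walkers[2:]
    return walkers


def weighted_ensemble_step(walkers, progress, edges, target, rng):
    """Bin walkers by progress(state) using sorted bin `edges`, then
    resample every occupied bin to `target` walkers."""
    bins = defaultdict(list)
    for weight, state in walkers:
        bins[bisect.bisect_right(edges, progress(state))].append((weight, state))
    resampled = []
    for index in sorted(bins):
        resampled.extend(resample_bin(bins[index], target, rng))
    return resampled


if __name__ == "__main__":
    rng = random.Random(0)
    start = [(0.5, 0.1), (0.3, 0.2), (0.2, 0.9)]  # (weight, coordinate value)
    print(weighted_ensemble_step(start, lambda x: x, [0.5], 2, rng))
```

Because the dynamics between resampling steps are untouched and the weights always sum to one, the individual trajectories remain unbiased; only the allocation of computing effort along the progress coordinate changes.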

Figure 3. Path-sampling and phase-space-sampling methods.

Path-sampling (schematized in panel i) uses one or more transition paths between states a and b to launch new simulation trajectories and estimate intermediates and rates. In some approaches, the path can be discovered as sampling proceeds. Phase-space-sampling (panel ii) does not rely upon progress coordinates but instead only requires a metric of structural or kinetic similarity. It uses dissimilarity to previously sampled regions as a criterion for launching new simulation trajectories. If a path can be readily and accurately identified, path sampling is expected to be more efficient (subject to the potential errors discussed). Phase-space-sampling is often less efficient but more robust to challenges in progress coordinates or path identification.
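One simple way to realize the dissimilarity criterion of phase-space sampling is a greedy farthest-point selection, sketched below. This is an illustrative heuristic and not the method of any cited work; `pick_seeds`, the candidate pool, and the user-supplied `distance` function (standing in for a structural or kinetic similarity metric) are hypothetical.

```python
def pick_seeds(sampled, candidates, distance, n_seeds):
    """Greedily choose up to n_seeds candidates that are least similar to
    everything already sampled (and to seeds chosen earlier in the loop).

    sampled    -- conformations already covered by simulation (non-empty)
    candidates -- conformations eligible to start new trajectories
    distance   -- callable returning a dissimilarity between two conformations
    """
    if not sampled:
        raise ValueError("need at least one previously sampled conformation")
    reference = list(sampled)
    pool = list(candidates)
    chosen = []
    for _ in range(min(n_seeds, len(pool))):
        # Score each candidate by its distance to the nearest reference point
        # and take the candidate with the largest such distance.
        best = max(pool, key=lambda c: min(distance(c, r) for r in reference))
        chosen.append(best)
        reference.append(best)
        pool.remove(best)
    return chosen


if __name__ == "__main__":
    # Toy one-dimensional "conformations" with absolute difference as metric.
    seeds = pick_seeds([0.0, 1.0], [0.1, 0.5, 3.0, 2.9],
                       lambda a, b: abs(a - b), 2)
    print(seeds)
```

In practice the conformations would typically be cluster centers and the metric something like RMSD, and the quadratic cost of the nearest-reference search would need attention for large datasets.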

Ensemble simulation using generative models

Markov State Models [20–22] are perhaps the prototypical generative kinetic model for molecular dynamics simulations, as they encode the stochastic version of the systems of differential equations (Fig. 4) often used to represent chemical reactions schematically [23]. Using Markov State Models and specifically the model uncertainty as a means to direct sampling was proposed by Singhal and Pande [24]. This “adaptive sampling” approach was prospectively implemented by Pronk and colleagues [25] and by many others thereafter. Recent notable developments include improvements both in adaptive sampling algorithm methodology and the development of more mature software releases for Markov State Model construction and analysis [26–28]. From a software perspective, recent reviews have outlined the fundamental set of ensemble operations required to construct general adaptive ensemble algorithms [29].
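As a minimal sketch of how such a model is parameterized from an ensemble, the code below counts transitions between discrete states at a fixed lag time across many short trajectories and row-normalizes the counts. It assumes trajectories have already been assigned to integer state labels, uses a plain maximum-likelihood estimate without enforcing detailed balance, and picks the next state to sample with a simple count-based proxy rather than the uncertainty analysis of [24]. All function names are illustrative.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions at the given lag over all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i][j] += 1
    return counts


def transition_matrix(counts, prior=0.0):
    """Row-normalize counts into transition probabilities.
    `prior` is an optional pseudocount added to every element."""
    matrix = []
    for row in counts:
        total = sum(row) + prior * len(row)
        if total == 0:
            raise ValueError("state with no outgoing counts; add a prior")
        matrix.append([(c + prior) / total for c in row])
    return matrix


def least_sampled_state(counts):
    """Count-based proxy for adaptive sampling: return the state with the
    fewest observed outgoing transitions (first one on ties)."""
    totals = [sum(row) for row in counts]
    return min(range(len(totals)), key=totals.__getitem__)


if __name__ == "__main__":
    trajs = [[0, 0, 1, 1, 0], [1, 1, 2, 2, 1]]
    c = count_transitions(trajs, n_states=3)
    print(transition_matrix(c))
    print("start next round from state", least_sampled_state(c))
```

Because counts are pooled across trajectories, many short simulations contribute to the same matrix, which is what makes this estimator a natural fit for ensemble computing.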

Figure 4. Generative models.

A graph representation of a Markov State Model is schematized in panel (i) with the corresponding system of differential equations in panel (ii). Panel (iii) shows a set of trajectories represented as sequences of states; these could be either sampled from the Markov State Model or conversely used to parameterize a model.
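The correspondence between a graph of states and a system of differential equations can be written out explicitly. The following is the standard master-equation formulation, given here as a sketch; the symbols (state populations p_i, rate constants k_ij, rate matrix K, lag time τ, transition matrix T, eigenvalues λ_m, and relaxation timescales t_m) are notation introduced for this example and do not appear in the original figure.

```latex
\begin{align}
\frac{dp_i}{dt} &= \sum_{j \neq i} \left( k_{ji}\, p_j - k_{ij}\, p_i \right), \\
\frac{d\mathbf{p}}{dt} &= \mathbf{p}\,\mathbf{K}, \qquad
  K_{ij} = k_{ij} \ (i \neq j), \qquad K_{ii} = -\sum_{j \neq i} k_{ij}, \\
\mathbf{p}(t+\tau) &= \mathbf{p}(t)\,\mathbf{T}(\tau), \qquad
  \mathbf{T}(\tau) = e^{\mathbf{K}\tau}, \\
t_m &= -\frac{\tau}{\ln \lambda_m(\tau)} .
\end{align}
```

Here p is a row vector of state populations, T(τ) is the row-stochastic transition matrix whose element T_ij is the probability of being in state j a time τ after being in state i, and λ_m(τ) are the eigenvalues of T(τ) other than the stationary eigenvalue of 1.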

In recent years, other families of generative models have been developed that leverage algorithms and software from deep learning: particularly variational autoencoders (described in a classic preprint) and Boltzmann machines [30,31]. The corresponding kinetic generator models of VAMPnets [32] and Boltzmann Generators [33] represent promising new approaches that may permit more robust generation of kinetic models from molecular dynamics simulations.

The FAST (fluctuation amplification of specific traits) algorithm [34] provides a good example of a hybrid between goal-directed and unbiased sampling, as it selects new start conformations for sampling based on a weighted sum of conformational dissimilarity to highly-sampled states and progress along a goal-directed gradient. Optimization algorithms of this nature provide the potential to achieve the “best of both worlds” by attaining goal-directed speedup while increasing conformational diversity in sampling and, of course, maintaining unbiased individual trajectories that can be used to construct kinetic models. Clementi and colleagues have also performed a simulated comparison of different adaptive sampling approaches [35].
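The idea of ranking candidate states by a weighted sum of a goal-directed term and a novelty term can be sketched as below. This is a simplified illustration in the spirit of that description and should not be read as the published FAST reward function; the min-max scaling, the weight `alpha`, and the names `goal_scores` and `novelty_scores` are assumptions made for the example.

```python
def minmax(values):
    """Rescale values linearly to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_states(goal_scores, novelty_scores, alpha=1.0, n_pick=1):
    """Return indices of the n_pick states with the highest combined reward.

    goal_scores    -- larger means further along the goal-directed coordinate
    novelty_scores -- larger means less similar to heavily sampled states
    alpha          -- weight of the novelty term relative to the goal term
    """
    if len(goal_scores) != len(novelty_scores):
        raise ValueError("score lists must have the same length")
    directed = minmax(goal_scores)
    undirected = minmax(novelty_scores)
    reward = [d + alpha * u for d, u in zip(directed, undirected)]
    order = sorted(range(len(reward)), key=lambda i: reward[i], reverse=True)
    return order[:n_pick]


if __name__ == "__main__":
    goal = [0.0, 5.0, 10.0]    # e.g. a distance to be maximized
    novelty = [1.0, 0.0, 0.2]  # e.g. inverse of how often a state was visited
    for alpha in (0.0, 1.0, 5.0):
        print(alpha, rank_states(goal, novelty, alpha=alpha, n_pick=2))
```

Setting `alpha` to zero recovers purely goal-directed selection, while a large `alpha` approaches undirected, diversity-driven selection, which makes the exploration/exploitation trade-off explicit.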

3. Important recent applications

Protein-protein association

Protein-protein association has been a challenging problem for atomic-level characterization because it involves diffusional encounters in a variety of orientations, partial desolvation, sometimes conformational change by the binding partners, and then formation of a stable complex, all occurring in a reversible manner. In the past few years, remarkable progress has been made in capturing this process using a set of different ensemble algorithms. Plattner and co-workers used adaptive Markov State Model simulations as well as non-adaptive comparator ensembles to characterize the binding pathways and kinetics of the well-known barnase-barstar complex [36]. Validation was performed by comparing predicted and experimentally observed binding free energies and association rates for 35 previously measured single- and double-alanine mutants. Further fitting of a consensus kinetic model for these mutants to experimental data created an improved model to overcome a statistically undersampled transition.

Saglam and Chong also used barnase-barstar as the prototype system to demonstrate the application of weighted-ensemble methods to predict protein-protein association rates and pathways [37]. In this case, a two-dimensional reaction coordinate was chosen as the minimum distance between binding partners and an aligned RMSD of two key residues of barstar relative to barnase when compared to the bound-state crystal structure. The resulting weighted-ensemble simulations of binding were used to calculate kinetics, the fraction of encounters that resulted in productive binding, as well as binding pathways and the maximum-flux path. Interestingly, only one of the two anchor residues used in the reaction coordinate was identified as forming key interactions controlling the binding kinetics. This finding suggests that although the choice of reaction coordinate surely affects the results, the key interactions were not fully specified by the reaction coordinate, a good argument for robustness.

A third recent paper [38] combined long-timescale ensemble MD (without a sampling algorithm) and a binding-optimized Hamiltonian Monte Carlo tempering algorithm to examine the binding rates and pathways of five complexes, including barnase-barstar. These results were validated against ΔG of binding, association, and dissociation rates measured via experiment for wild-type barnase-barstar complexes. Analysis primarily focused on where in the binding process the transition state occurred and how this related to the desolvation process required for association.

SARS-CoV-2 spike conformational changes

The refocusing of global scientific effort onto SARS-CoV-2 biology during the COVID pandemic has provided multiple examples of how ensemble-based simulation can facilitate estimation of structural transitions and kinetics in extremely challenging systems. The SARS-CoV-2 spike protein undergoes a conformational transition upon receptor binding (Fig. 5) that involves large-scale movement of the receptor-binding domain (RBD) [39–41]. Such transitions have been extremely difficult to capture with atomic-scale models because of the large size of the proteins and the slow timescales involved. Ensemble-based simulation approaches have achieved notable successes in modeling SARS-CoV-2 spike opening, with multiple teams achieving impressive results via different approaches.

Figure 5. Conformational activation of SARS-CoV-2 spike protein.

Cryo-EM structures of SARS-CoV-2 spike trimers are shown with all three receptor-binding domains in the “down” conformation (green) and with one in the “up” conformation (blue) that corresponds to ACE2 receptor binding. Renderings are based on PDB structures 6VXX [40] and 6VSB [41].

Both studies highlighted here utilized a variant of a directed-sampling approach. One recent study by Zimmerman et al. [42] first used the FAST adaptive sampling approach to target conformations involving RBD opening by selecting a set of intramolecular distances as progress coordinates. After several rounds of adaptive sampling in this manner, the Folding@Home platform was used to launch a larger set of non-adaptive simulation trajectories. The aggregate dataset was then used to construct Markov State Models for subsequent analysis. Analytic work in this study focused on two major aspects: the relative opening of spike protein between SARS-CoV-2, SARS-1, and NL63 coronaviruses as compared to their previously measured affinities for human ACE2 and the exposure of epitopes that are hidden in the RBD-down state but become accessible for antibody or small-molecule binding either in the open state or some other intermediate. In particular, an “extremely open” intermediate was predicted in addition to the experimentally resolved “up” and “down” states of the RBD. This state had greater epitope accessibility and may also be of functional significance.

Another recent study by Sztain et al. used weighted-ensemble simulation where sampling was directed by a 2D progress coordinate of RBD-spike “core” distance and aligned RMSD of the RBD itself [43]. This work also recovered trajectories of RBD opening, but analysis focused on three different aspects from those examined by Zimmerman et al. First, a glycan was identified that played a key role in controlling RBD opening. This hypothesis was tested by mutating the glycosylated asparagine to alanine and showing reduced ACE2 binding by this spike mutant via biolayer interferometry. Second, the hydrogen-bond and salt-bridge contacts that formed and broke at different points along the opening trajectories were used to characterize structural changes associated with RBD opening. Third, the opening trajectory was compared to progress coordinates identified from manifold-embedding analysis of cryo-electron microscopy data, with relatively good agreement found.

Overall, using a knowledge-guided set of progress coordinates seems appropriate in the case of SARS-CoV-2 spike simulations because it leverages extensive structural data and understanding about what the start and end states of RBD opening might look like. Such selection of progress coordinates can of course bias the results if the progress coordinates do not capture important conformational changes, but when a biochemically relevant progress coordinate can be identified, this greatly increases the efficiency of sampling, as is the case here.

Entry mechanisms of other viruses

Subsequent to the conformational activation of viral glycoproteins, enveloped viruses enter cells via a process of membrane fusion. Viral membrane fusion is challenging to understand mechanistically because it is a non-equilibrium process with transient intermediate states, many of which are degenerate with respect to experimental observables: multiple states have the same experimental observable. The appropriate reaction progress coordinates are not well established, which complicates directed-sampling approaches. Nonetheless, membrane fusion has been the subject of ensemble simulation approaches since the early development of large-scale ensemble simulation [44]. Although directed-sampling ensemble methods have been used to good effect for other biological membrane fusion [45], most ensemble simulation of viral membrane fusion has been undirected due to the high-dimensional space of protein and lipid rearrangements [9,46]. One recent study has developed a new, hybrid approach for ensemble simulation of viral membrane fusion: time-sequential multiscaling [47]. In this approach, an ensemble of simulations was initiated from a start state at atomic resolution; once simulation members reached the next intermediate, in this case fusion stalk formation, they were a) converted to coarse-grained representation and b) duplicated, similar to a milestoning approach [12]. At the next major intermediate, in this case fusion pore formation, successful simulations were rewound to t = tpore - Δt for some small Δt, converted to atomic resolution, and duplicated. This overall approach was inspired by milestoning and avoids the bias introduced by a suboptimal choice of progress coordinates. 
The conversion between coarse-grained and atomistic representations was driven by prior knowledge regarding the timescales involved and also the ability of each representation to faithfully capture important details (particularly last-layer desolvation prior to fusion stalk formation and first-layer water in fusion pore formation). This approach was used to generate a more complete mechanistic model for influenza membrane fusion. Further automation and theoretical development for the ensemble algorithm is anticipated to prove similarly useful in other complex, multistate biological processes.

4. Outlook

In recent years, ensemble molecular dynamics simulation using unbiased trajectories has greatly matured. The algorithms have improved, they are now available in multiple open-source software implementations, and they have been used for great insight into previously intractable biomolecular processes. It is the hope of ensemble simulation developers that these methods, using biased or unbiased trajectories as appropriate, will become as ubiquitous as molecular dynamics simulation itself. If this becomes true, statistical model generation from molecular dynamics will become as much a part of structural biology research as molecular-dynamics-based visualization currently is.

Figure 1. Constructing kinetic models from ensemble simulation data.

Ensemble simulations (trajectories denoted as arrows in panel a) are computed from different starting points in conformation space. Based on ensemble sampling, kinetic models are constructed of metastable states (black, orange, green, blue outlines in panel b) and transition rates between them (colored arrows in panel b). Topographical rendering from USGS National Map.

Acknowledgements.

The author thanks C. Davis for critical reading of the manuscript. This work was supported by R01 GM115790 and a Wallenberg Academy Fellowship to P.M.K.

Footnotes

The author declares no conflict of interest.

References

1. Bonomi M, Heller GT, Camilloni C, Vendruscolo M: Principles of protein structural ensemble determination. Current opinion in structural biology 2017, 42:106–116.
2. Boomsma W, Ferkinghoff-Borg J, Lindorff-Larsen K: Combining experiments and simulations using the maximum entropy principle. PLoS Comput Biol 2014, 10:e1003406.
3. Larsen AH, Wang Y, Bottaro S, Grudinin S, Arleth L, Lindorff-Larsen K: Combining molecular dynamics simulations with small-angle X-ray and neutron scattering data to study multi-domain proteins in solution. PLoS computational biology 2020, 16:e1007870.
4. Hays JM, Cafiso DS, Kasson PM: Hybrid Refinement of Heterogeneous Conformational Ensembles Using Spectroscopic Data. J Phys Chem Lett 2019:3410–3414.
5. Snow CD, Nguyen H, Pande VS, Gruebele M: Absolute comparison of simulated and experimental protein-folding dynamics. Nature 2002, 420:102–106.
6. Kohlhoff KJ, Shukla D, Lawrenz M, Bowman GR, Konerding DE, Belov D, Altman RB, Pande VS: Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways. Nature Chemistry 2014, 6:15–21.
7. Balasubramanian V, Treikalis A, Weidner O, Jha S: Ensemble toolkit: Scalable and flexible execution of ensembles of tasks. In Parallel Processing (ICPP), 2016 45th International Conference on: IEEE: 2016:458–463.
8. Kasson P, Kelley N, Singhal N, Vrljic M, Brunger AT, Pande VS: Ensemble molecular dynamics yields sub-millisecond kinetics and intermediates of membrane fusion. Proc Natl Acad Sci U S A 2006, 103:11916–11921.
9. Kasson PM, Lindahl E, Pande VS: Atomic-resolution simulations predict a transition state for vesicle fusion defined by contact of a few lipid tails. PLoS Comput Biol 2010, 6:e1000829.
10. Dellago C, Bolhuis PG, Csajka FS, Chandler D: Transition path sampling and the calculation of rate constants. Journal of Chemical Physics 1998, 108:1964–1977.
11. Vanden-Eijnden E, Venturoli M, Ciccotti G, Elber R: On the assumptions underlying milestoning. J Chem Phys 2008, 129:174102.
12. Faradjian AK, Elber R: Computing time scales from reaction coordinates by milestoning. J Chem Phys 2004, 120:10880–10889.
13. Maragliano L, Fischer A, Vanden-Eijnden E: String method in collective variables: Minimum free energy paths and isocommittor surfaces. The Journal of chemical physics 2006, 125:024106.
14. Weinan E, Ren W, Vanden-Eijnden E: String method for the study of rare events. Physical Review B 2002, 66:052301.
15. Pan AC, Sezer D, Roux B: Finding Transition Pathways Using the String Method with Swarms of Trajectories. J. Phys. Chem. B 2008, 112:3432–3434.
16. Huber GA, Kim S: Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophysical Journal 1996, 70:97–110.
17. Zhang BW, Jasnow D, Zuckerman DM: The “weighted ensemble” path sampling method is statistically exact for a broad class of stochastic processes and binning procedures. The Journal of Chemical Physics 2010, 132:054107.
18. Zuckerman DM, Chong LT: Weighted Ensemble Simulation: Review of Methodology, Applications, and Software. Annual Review of Biophysics 2017, 46:43–57.
19. Zheng W, Rohrdanz MA, Clementi C: Rapid Exploration of Configuration Space with Diffusion-Map-Directed Molecular Dynamics. J. Phys. Chem. B 2013, 117:12769–12776.
20. Swope WC, Pitera JW, Suits F: Describing protein folding kinetics by molecular dynamics simulations. 1. Theory. Journal of Physical Chemistry B 2004, 108:6571–6581.
21. Singhal N, Snow CD, Pande VS: Using path sampling to build better Markovian state models: Predicting the folding rate and mechanism of a tryptophan zipper beta hairpin. Journal of Chemical Physics 2004, 121:415–425.
22. Noé F, Horenko I, Schutte C, Smith JC: Hierarchical analysis of conformational dynamics in biomolecules: transition networks of metastable states. J Chem Phys 2007, 126:155102.
23. Kurtz TG: The Relationship between Stochastic and Deterministic Models for Chemical Reactions. The Journal of Chemical Physics 1972, 57:2976–2978.
24. Singhal N, Pande VS: Error analysis and efficient sampling in Markovian state models for molecular dynamics. Journal of Chemical Physics 2005, 123:–.
25. Pronk S, Larsson P, Pouya I, Bowman GR, Haque IS, Beauchamp K, Hess B, Pande VS, Kasson PM, Lindahl E: Copernicus: A new paradigm for parallel adaptive molecular dynamics. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis 2011:60.
26. Scherer MK, Trendelkamp-Schroer B, Paul F, Pérez-Hernández G, Hoffmann M, Plattner N, Wehmeyer C, Prinz J-H, Noé F: PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. 2015, 11:5525–5542.
27. Harrigan MP, Sultan MM, Hernández CX, Husic BE, Eastman P, Schwantes CR, Beauchamp KA, McGibbon RT, Pande VS: MSMBuilder: statistical models for biomolecular dynamics. Biophysical journal 2017, 112:10–15.
28. Porter JR, Zimmerman MI, Bowman GR: Enspara: Modeling molecular ensembles with scalable data structures and parallel computing. The Journal of chemical physics 2019, 150:044108. * Together with PyEMMA and MSMBuilder, enspara provides tools for clustering and Markov State Model construction.
29. Kasson PM, Jha S: Adaptive ensemble simulations of biomolecules. Current opinion in structural biology 2018, 52:87–94.
30. Hinton GE: Training products of experts by minimizing contrastive divergence. Neural computation 2002, 14:1771–1800.
31. Kingma DP, Welling M: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 2013.
32. Mardt A, Pasquali L, Wu H, Noé F: VAMPnets for deep learning of molecular kinetics. Nat Commun 2018, 9:5.
33. Noé F, Olsson S, Köhler J, Wu H: Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning. Science 2019, 365. ** This paper develops new formulations for generative models leveraging deep learning.
34. Zimmerman MI, Bowman GR: FAST Conformational Searches by Balancing Exploration/Exploitation Trade-Offs. Journal of Chemical Theory and Computation 2015, 11:5747–5757.
35. Hruska E, Abella JR, Nüske F, Kavraki LE, Clementi C: Quantitative comparison of adaptive sampling methods for protein dynamics. The Journal of Chemical Physics 2018, 149:244119.
36. Plattner N, Doerr S, De Fabritiis G, Noé F: Complete protein–protein association kinetics in atomic detail revealed by molecular dynamics simulations and Markov modelling. Nature Chemistry 2017, 9:1005–1011.
37. Saglam AS, Chong LT: Protein–protein binding pathways and calculations of rate constants using fully-continuous, explicit-solvent simulations. Chemical Science 2019, 10:2360–2372.
38. Pan AC, Jacobson D, Yatsenko K, Sritharan D, Weinreich TM, Shaw DE: Atomic-level characterization of protein–protein association. Proceedings of the National Academy of Sciences 2019, 116:4244.
39. Lu M, Uchil PD, Li W, Zheng D, Terry DS, Gorman J, Shi W, Zhang B, Zhou T, Ding S, et al.: Real-time conformational dynamics of SARS-CoV-2 spikes on virus particles. Cell host & microbe 2020, 28:880–891.
40. Walls AC, Park Y-J, Tortorici MA, Wall A, McGuire AT, Veesler D: Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell 2020, 181:281–292.e286.
41. Wrapp D, Wang N, Corbett KS, Goldsmith JA, Hsieh CL, Abiona O, Graham BS, McLellan JS: Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 2020, 367:1260–1263.
42. Zimmerman MI, Porter JR, Ward MD, Singh S, Vithani N, Meller A, Mallimadugula UL, Kuhn CE, Borowsky JH, Wiewiora RP: SARS-CoV-2 simulations go exascale to predict dramatic spike opening and cryptic pockets across the proteome. Nature Chemistry 2021:1–9. ** This paper uses Markov State Models to analyze SARS-CoV-2 spike protein activation (as well as other viral protein targets).
43. Sztain T, Ahn S-H, Bogetti AT, Casalino L, Goldsmith JA, McCool RS, Kearns FL, McCammon JA, McLellan JS, Chong LT, et al.: A glycan gate controls opening of the SARS-CoV-2 spike protein. Nature Chemistry 2021:1–6. ** This paper uses weighted-ensemble methods to analyze SARS-CoV-2 spike protein activation.
44. Kasson PM, Kelley NW, Singhal N, Vrljic M, Brunger AT, Pande VS: Ensemble molecular dynamics yields sub-millisecond kinetics and intermediates of membrane fusion. Proc Natl Acad Sci U S A 2006, 103:11916–11921.
45. Smirnova YG, Risselada HJ, Muller M: Thermodynamically reversible paths of the first fusion intermediate reveal an important role for membrane anchors of fusion proteins. Proc Natl Acad Sci U S A 2019, 116:2571–2576.
46. Larsson P, Kasson PM: Lipid tail protrusion in simulations predicts fusogenic activity of influenza fusion peptide mutants and conformational models. PLoS Comput Biol 2013, 9:e1002950.
47. Pabis A, Rawle RJ, Kasson PM: Influenza hemagglutinin drives viral entry via two sequential intramembrane mechanisms. Proc Natl Acad Sci U S A 2020, 117:7200–7207. * This paper uses ensemble simulation and a time-sequential multiscaling approach to analyze influenza fusion mechanisms.
