Abstract
We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can therefore be readily used with multiple MD engines.
Main Text
Biophysical techniques probing the structural dynamics of biomolecules typically yield signals that arise from an ensemble of molecular conformations, and thus it is often not straightforward to interpret the experimental data unambiguously. For example, double electron-electron resonance (DEER) spectroscopy is increasingly used to measure distances between spin-labeled cysteine residues (1), and to assess conformational mechanisms in proteins. DEER spectra, however, actually translate into distance probability distributions, which are often multimodal and interdependent, and might reflect a variety of protein conformations and rotameric states of the labels.
Molecular dynamics (MD) simulations are arguably the best computational approach to address this problem. The concept is to employ an MD simulation to construct an ensemble of molecular configurations X that is consistent with the measured probability distribution of an observable ξ(X), while simultaneously representing the molecular system more realistically (solvent, temperature, etc.) than in standard structural-refinement methods. In practice, this approach entails a modification of the simulation energy function, U(X), so that the resulting probability distribution, ρ′(X), fulfills the experimental data with the minimum possible bias, in accordance with the so-called maximum-entropy principle (2,3). If the experimental data for observable ξ are binned into a histogram, a possible modification of U(X) is a linear perturbation, leading to (4)
$$\rho'(\mathbf{X}) \propto \exp\left\{-\beta\left[U(\mathbf{X}) + \sum_i \lambda_i\,\delta_i[\xi(\mathbf{X})]\right]\right\} \tag{1}$$
where the i index denotes each of the bins in the measured histogram of ξ, and δi[ξ(X)] = 1 if the value of ξ(X) is in bin i, while δi[ξ(X)] = 0 otherwise (and β = 1/kBT, where kB is the Boltzmann constant and T is the temperature). The λi parameters, which must be determined in each case, ensure that the time averages of δi[ξ(X)] are equal to the experimental probability values for each of the bins i. Practical applications of the maximum-entropy formulation in Eq. 1 have so far relied on computationally intensive approaches such as averaging over multiple system replicas simulated concurrently (4–7) or iterative optimization algorithms to determine the values of λi (2,3).
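As a minimal illustration of the linear perturbation in Eq. 1, the sketch below reweights an existing set of samples a posteriori rather than biasing a running simulation. It assumes the sign convention U(X) + Σi λi δi[ξ(X)], for which the choice λi = −kBT ln(pi,exp/pi,sim) reproduces the target bin probabilities exactly. The function name, the synthetic samples, the bin edges, and the target histogram are all hypothetical.

```python
import numpy as np

def maxent_bin_multipliers(xi_samples, bin_edges, p_exp, kT=1.0):
    """Return the bin index of each sample and one multiplier lambda_i per bin."""
    # Bin index of each sample, i.e., the bin i for which delta_i[xi] = 1.
    idx = np.digitize(xi_samples, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    # Bin probabilities observed in the (unbiased) sample set.
    p_sim = np.bincount(idx, minlength=n_bins) / len(xi_samples)
    # lambda_i is defined up to an additive constant; every bin must be populated.
    lam = -kT * np.log(p_exp / p_sim)
    return idx, lam

kT = 1.0
rng = np.random.default_rng(0)
xi = rng.normal(0.0, 1.0, size=50_000)            # stand-in for an unbiased trajectory
edges = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])
p_exp = np.array([0.1, 0.2, 0.3, 0.4])            # hypothetical target histogram

idx, lam = maxent_bin_multipliers(xi, edges, p_exp, kT)
weights = np.exp(-lam[idx] / kT)                  # exp(-beta * sum_i lambda_i delta_i)
weights /= weights.sum()
p_reweighted = np.bincount(idx, weights=weights, minlength=len(p_exp))
```

Within each bin all samples receive the same weight, so the relative probabilities of configurations that share a bin are left untouched, which is the sense in which the correction is minimal.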
Here, we present an alternative, single-replica approach inspired by the metadynamics method (8,9), which is also consistent with the maximum-entropy principle. We refer to this method as ensemble-biased metadynamics (EBMetaD). Let us define ρexp(ξ) as the target experimental probability distribution of observable ξ and F(ξ) as the free energy,
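$$F(\xi) = -\frac{1}{\beta}\ln\int d\mathbf{X}\,\delta[\xi - \xi(\mathbf{X})]\,\exp\{-\beta U(\mathbf{X})\} + C \tag{2}$$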
where C is a constant. In the limit of infinitesimally narrow bins, Eq. 1 becomes (Appendix S1 in the Supporting Material):
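$$\rho'(\mathbf{X}) \propto \exp\left\{-\beta\left[U(\mathbf{X}) - F(\xi(\mathbf{X})) - \frac{1}{\beta}\ln\rho_{\mathrm{exp}}(\xi(\mathbf{X}))\right]\right\} \tag{3}$$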
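The continuous limit can be motivated with a short argument, given here as a sketch and not as a reproduction of Appendix S1. It assumes that, for infinitesimally narrow bins, the sum over bins in Eq. 1 becomes a continuous function λ(ξ), a symbol introduced here for illustration only, and that F(ξ) is the free energy of the unbiased ensemble along ξ.

```latex
\begin{align*}
\rho'(\mathbf{X}) &\propto \exp\left\{-\beta\left[U(\mathbf{X}) + \lambda(\xi(\mathbf{X}))\right]\right\} \\
\rho'(\xi) &= \int d\mathbf{X}\,\delta[\xi - \xi(\mathbf{X})]\,\rho'(\mathbf{X})
            \propto \exp\left\{-\beta\left[F(\xi) + \lambda(\xi)\right]\right\} \\
\rho'(\xi) = \rho_{\mathrm{exp}}(\xi) \;&\Rightarrow\;
\lambda(\xi) = -F(\xi) - \frac{1}{\beta}\ln\rho_{\mathrm{exp}}(\xi) + \mathrm{const}
\end{align*}
```

Substituting this λ(ξ) back into the first line gives a distribution proportional to exp{−β[U(X) − F(ξ(X)) − (1/β) ln ρexp(ξ(X))]}, that is, the unbiased distribution multiplied by ρexp(ξ)/ρ(ξ), where ρ(ξ) ∝ exp{−βF(ξ)}.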
In EBMetaD, a biasing potential is added to the energy function so that the simulation samples ρ′(X) in Eq. 3. As in standard metadynamics, this biasing potential, denoted by V(ξ, t), is constructed throughout the simulation as a cumulative sum of Gaussians, added one at a time at a frequency of 1/τ, each centered on the value of ξ at that time. In EBMetaD, however, these Gaussian functions are weighted by the inverse of the target probability distribution, that is
$$V(\xi,t) = \sum_{t'=\tau,2\tau,\ldots}^{t} \frac{w}{\exp(S_\rho)\,\rho_{\mathrm{exp}}[\xi(\mathbf{X}_{t'})]}\,\exp\left\{-\frac{[\xi - \xi(\mathbf{X}_{t'})]^2}{2\sigma^2}\right\} \tag{4}$$
where Xt′ denotes the atomic coordinates at time t′ and σ is the Gaussian width, which sets the resolution of V(ξ, t) and of the sampled distribution of ξ. The quantity Sρ = −∫ dξ ρexp(ξ) ln ρexp(ξ) is the differential entropy of ρexp(ξ), i.e., exp(Sρ) is the effective volume in ξ spanned by ρexp(ξ), and serves as a normalization factor to ensure that the mean height of the Gaussians added in the range of ρexp(ξ) is equal to w. As in standard metadynamics, EBMetaD simulations remain close to equilibrium if w, σ, and τ in Eq. 4 are selected adequately (Appendix S2 in the Supporting Material), and a stationary condition is reached at a certain time te after which the biasing potential fluctuates around an average profile that converges asymptotically (10). Specifically, the change of the time-averaged biasing potential, V̄(ξ, t), from this point forward is (11,12)
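$$\frac{\partial \bar{V}(\xi,t)}{\partial t} \propto \frac{\exp\{-\beta[F(\xi) + \bar{V}(\xi,t)]\}}{\rho_{\mathrm{exp}}(\xi)} = C \tag{5}$$
where C is a constant. Provided that the region in which ρexp(ξ) ≠ 0 is energetically allowed by U(X), the implication of Eq. 5 is that the average biasing potential converges, up to an additive constant, to
$$\bar{V}(\xi) = -F(\xi) - \frac{1}{\beta}\ln\rho_{\mathrm{exp}}(\xi) \tag{6}$$
That is, when t > te, the EBMetaD simulation samples the space of ξ as in the target distribution ρexp(ξ).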
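The construction of the biasing potential described above can be sketched in a few lines. This is a minimal illustration and not the PLUMED implementation: it assumes that each Gaussian has height w / [exp(Sρ) ρexp(ξ)], that the bias is stored on a uniform grid, and that ρexp is tabulated on the same grid. The class name, the helper function, and the `rho_floor` regularization (which avoids a division by zero where ρexp vanishes) are hypothetical.

```python
import numpy as np

def differential_entropy(grid, rho_exp):
    """S_rho = -integral of rho_exp * ln(rho_exp), as a Riemann sum on a uniform grid."""
    dxi = grid[1] - grid[0]
    mask = rho_exp > 0                      # 0 * ln(0) is taken as 0
    return -np.sum(rho_exp[mask] * np.log(rho_exp[mask])) * dxi

class EBMetaDBias:
    """Grid-based accumulator for a one-dimensional EBMetaD-like bias (illustrative)."""

    def __init__(self, grid, rho_exp, w, sigma, rho_floor=1e-6):
        self.grid = grid
        self.rho_exp = rho_exp
        self.w = w                          # mean Gaussian height
        self.sigma = sigma                  # Gaussian width
        self.rho_floor = rho_floor          # hypothetical regularization
        self.volume = np.exp(differential_entropy(grid, rho_exp))
        self.V = np.zeros_like(grid)        # accumulated bias on the grid

    def height(self, xi_t):
        # Gaussian height scaled by the inverse target density at the current xi.
        rho_t = max(np.interp(xi_t, self.grid, self.rho_exp), self.rho_floor)
        return self.w / (self.volume * rho_t)

    def deposit(self, xi_t):
        # Add one Gaussian centered on the current value of xi.
        gauss = np.exp(-(self.grid - xi_t) ** 2 / (2.0 * self.sigma ** 2))
        self.V += self.height(xi_t) * gauss

# Example: for a uniform target on [0, 2], every Gaussian has a height close to w.
grid = np.linspace(0.0, 2.0, 201)
rho_exp = np.full_like(grid, 0.5)
bias = EBMetaDBias(grid, rho_exp, w=0.1, sigma=0.05)
bias.deposit(1.0)
```

For a uniform target the effective volume equals the width of the interval, so the scaling reduces to standard metadynamics with Gaussians of height w. For a non-uniform target, Gaussians deposited in low-probability regions are proportionally taller.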
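It is straightforward to generalize this approach to the case of multiple observables ξi and probability distributions thereof, ρexp(ξi), employing a multidimensional biasing potential analogous to that in Eq. 4:
$$V(\xi_1,\ldots,\xi_n,t) = \sum_{i=1}^{n}\,\sum_{t'=\tau,2\tau,\ldots}^{t} \frac{w}{\exp(S_{\rho_i})\,\rho_{\mathrm{exp}}[\xi_i(\mathbf{X}_{t'})]}\,\exp\left\{-\frac{[\xi_i - \xi_i(\mathbf{X}_{t'})]^2}{2\sigma_i^2}\right\} \tag{7}$$
where σi is the Gaussian width for observable ξi. Owing to the scaling factors exp(Sρi), several distributions can be simultaneously targeted even if they have very different effective volumes. The observables ξi, however, ought not to be functions of each other (2–4).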
To test the validity of the EBMetaD method, we first considered the two-dimensional model potential shown in Fig. 1 A; the corresponding one-dimensional probability distribution ρ(ξ), calculated analytically, is shown in Fig. 1 B (gray). We aim to sample instead a hypothetical experimental distribution ρexp(ξ), also shown in Fig. 1 B (black). We thus carry out an overdamped Langevin dynamics simulation on the potential, using EBMetaD to slowly construct the biasing potential V(ξ, t) defined in Eq. 4. As Fig. 1 B shows, the calculated histogram evolves gradually until it converges to the target probability distribution. Thereafter, the simulation reaches a stationary condition, and neither the sampled histogram nor the average bias potential changes significantly (Fig. 1 B, inset). To assess whether the ensemble sampled at convergence corresponds to that defined in Eq. 3, i.e., whether EBMetaD indeed fulfills the maximum-entropy principle, we directly compare the two-dimensional histogram calculated from the simulation with the distribution that follows from the modified two-dimensional potential, U(X) − F(ξ) − kBT ln ρexp(ξ), calculated analytically (kBT = 1). As Fig. 1 C shows, these distributions match perfectly; that is, the bias introduced so as to reproduce ρexp(ξ) does not alter the conditional distribution of configurations at any given ξ-value. An extension of this test in which two hypothetical distributions ρexp(ξ1) and ρexp(ξ2) are concurrently targeted further confirms that EBMetaD fulfills the maximum-entropy condition (see Appendix S3 and Fig. S2 in the Supporting Material).
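A test of this kind can be sketched with a one-dimensional toy model. The example below is not the two-dimensional model of Fig. 1: it assumes a hypothetical double-well potential U(x) = (x² − 1)², a hypothetical Gaussian target distribution centered on the barrier, Gaussian heights of w / [exp(Sρ) ρexp(ξ)], and arbitrary values for w, σ, the deposition stride, and a density floor `rho_floor` that limits the Gaussian height in the tails.

```python
import numpy as np

kT, dt, n_steps, stride = 1.0, 1e-3, 300_000, 50   # stride * dt plays the role of tau
w, sigma, rho_floor = 0.05, 0.1, 1e-2               # hypothetical parameters

# Target distribution rho_exp on a uniform grid (Gaussian of width 0.5 around x = 0).
grid = np.linspace(-2.5, 2.5, 501)
dxi = grid[1] - grid[0]
rho_exp = np.exp(-grid**2 / (2.0 * 0.5**2))
rho_exp /= rho_exp.sum() * dxi
volume = np.exp(-np.sum(rho_exp * np.log(rho_exp)) * dxi)   # exp(S_rho)

def dU(x):
    """Derivative of the double-well potential U(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x**2 - 1.0)

rng = np.random.default_rng(1)
noise = rng.standard_normal(n_steps) * np.sqrt(2.0 * kT * dt)  # unit mobility
V = np.zeros_like(grid)      # bias potential on the grid
dV = np.zeros_like(grid)     # its derivative
x = 1.0
traj = np.empty(n_steps)
for step in range(n_steps):
    if step % stride == 0:
        # Deposit one Gaussian, scaled by the inverse target density at x.
        rho_t = max(np.interp(x, grid, rho_exp), rho_floor)
        V += w / (volume * rho_t) * np.exp(-(grid - x)**2 / (2.0 * sigma**2))
        dV = np.gradient(V, dxi)
    # Overdamped Langevin step on U + V.
    x += (-dU(x) - np.interp(x, grid, dV)) * dt + noise[step]
    traj[step] = x

# Compare the histogram of the second half of the trajectory with the target
# and with the analytical unbiased distribution, using an L1 distance.
edges = np.linspace(-2.0, 2.0, 21)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]
hist, _ = np.histogram(traj[n_steps // 2:], bins=edges, density=True)
target = np.interp(centers, grid, rho_exp)
unbiased = np.exp(-(centers**2 - 1.0)**2 / kT)
unbiased /= unbiased.sum() * width
err_ebmetad = np.sum(np.abs(hist - target)) * width
err_unbiased = np.sum(np.abs(unbiased - target)) * width
```

Comparing `err_ebmetad` with `err_unbiased` gives a rough measure of how far the biased ensemble has moved from the unbiased distribution toward the target. The bias is stored on a grid for simplicity; summing the Gaussians analytically would avoid the interpolation at the cost of a slower force evaluation.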
To test EBMetaD in a realistic application, we next considered T4 lysozyme in explicit water (Fig. 2 A, and Weaver and Matthews (13)). Following Roux and Islam (4), three methanethiosulfonate spin-labels were attached at positions E62C, T109C, and A134C. Experimental distance distributions for each pair of nitroxide groups were obtained via electron spin resonance (ESR)/DEER spectroscopy; data were kindly provided by R. A. Stein and H. S. McHaourab (Vanderbilt University Medical Center, Nashville, TN).
A single trajectory of ∼200 ns was then calculated with EBMetaD, using a three-dimensional biasing potential identical to that defined in Eq. 7, i.e., the three experimental distributions are targeted concurrently. For comparison, an unbiased ∼270-ns trajectory was also calculated using standard MD. As shown in Fig. 2 B, the distance histograms derived from the unbiased trajectory fail to reproduce those obtained experimentally. By contrast, the histograms derived from the EBMetaD simulation converge to the ESR/DEER data within a few tens of nanoseconds, and preserve that agreement thereafter. To further assess the performance of the method, we compared the time-averaged biasing potential applied to each of the spin-spin distances in three fragments of the EBMetaD trajectory (excluding only the first 5 ns). As shown in Fig. 2 B (insets), the shape of the biasing potentials is largely constant in time, with fluctuations significantly larger than kBT only in the distal, low-probability regions, thus confirming that EBMetaD reaches an approximately stationary condition (Eqs. 5 and 6). Consistent with the maximum-entropy principle, the ensemble correction introduced by EBMetaD primarily entails a population shift in the rotameric states of the spin labels (Fig. S3 A), with no significant changes in the protein backbone (Fig. S3 B); the root-mean-square deviation of the Cα-trace, relative to the starting x-ray structure, is within 2 Å in both the unbiased and EBMetaD trajectories.
In summary, we have introduced an enhanced-sampling MD simulation method to generate molecular ensembles that reproduce probability distributions for one or more independent observables. This method, referred to as ensemble-biased metadynamics, adaptively provides an ensemble correction consistent with the maximum-entropy principle (2–6), without mean-field approximations (4), multiple simulation replicas (4,5,7), or the iterative optimization of Lagrangian parameters (2,3). Owing to the computational efficiency and practical simplicity of the method, we posit that EBMetaD can be extremely useful in a wide range of applications, such as structure refinement, mechanistic studies based on spectroscopic data, or purely computational simulation studies. EBMetaD is integrated within the PLUMED 1.3 plug-in (14), and can thus be readily used with multiple simulation engines.
Author Contributions
F.M. and J.D.F.G. designed research and wrote the article. F.M. performed research, contributed analytical tools, and analyzed the data.
Acknowledgments
This work was funded by the Division of Intramural Research of the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD.
Editor: Bert de Groot.
Contributor Information
Fabrizio Marinelli, Email: fabrizio.marinelli@nih.gov.
José D. Faraldo-Gómez, Email: jose.faraldo@nih.gov.
Supporting Material
References
- 1. Jeschke G. DEER distance measurements on proteins. Annu. Rev. Phys. Chem. 2012;63:419–446. doi: 10.1146/annurev-physchem-032511-143716.
- 2. Pitera J.W., Chodera J.D. On the use of experimental observations to bias simulated ensembles. J. Chem. Theory Comput. 2012;8:3445–3451. doi: 10.1021/ct300112v.
- 3. White A.D., Voth G.A. Efficient and minimal method to bias molecular simulations with experimental data. J. Chem. Theory Comput. 2014;10:3023–3030. doi: 10.1021/ct500320c.
- 4. Roux B., Islam S.M. Restrained-ensemble molecular dynamics simulations based on distance histograms from double electron-electron resonance spectroscopy. J. Phys. Chem. B. 2013;117:4733–4739. doi: 10.1021/jp3110369.
- 5. Cavalli A., Camilloni C., Vendruscolo M. Molecular dynamics simulations with replica-averaged structural restraints generate structural ensembles according to the maximum entropy principle. J. Chem. Phys. 2013;138:094112. doi: 10.1063/1.4793625.
- 6. Roux B., Weare J. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method. J. Chem. Phys. 2013;138:084107. doi: 10.1063/1.4792208.
- 7. Islam S.M., Stein R.A., Roux B. Structural refinement from restrained-ensemble simulations based on EPR/DEER data: application to T4 lysozyme. J. Phys. Chem. B. 2013;117:4740–4754. doi: 10.1021/jp311723a.
- 8. Laio A., Gervasio F.L. Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science. Rep. Prog. Phys. 2008;71:126601.
- 9. Laio A., Parrinello M. Escaping free-energy minima. Proc. Natl. Acad. Sci. USA. 2002;99:12562–12566. doi: 10.1073/pnas.202427399.
- 10. Bussi G., Laio A., Parrinello M. Equilibrium free energies from nonequilibrium metadynamics. Phys. Rev. Lett. 2006;96:090601. doi: 10.1103/PhysRevLett.96.090601.
- 11. Barducci A., Bussi G., Parrinello M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 2008;100:020603. doi: 10.1103/PhysRevLett.100.020603.
- 12. Dama J.F., Parrinello M., Voth G.A. Well-tempered metadynamics converges asymptotically. Phys. Rev. Lett. 2014;112:240602. doi: 10.1103/PhysRevLett.112.240602.
- 13. Weaver L.H., Matthews B.W. Structure of bacteriophage T4 lysozyme refined at 1.7 Å resolution. J. Mol. Biol. 1987;193:189–199. doi: 10.1016/0022-2836(87)90636-x.
- 14. Bonomi M., Branduardi D., Parrinello M. PLUMED: a portable plugin for free-energy calculations with molecular dynamics. Comput. Phys. Commun. 2009;180:1961–1972.