Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions

Fabrizio Marinelli; José D Faraldo-Gómez

Biophys J. 2015 Jun 16;108(12):2779–2782. doi: 10.1016/j.bpj.2015.05.024

PMCID: PMC4472218 PMID: 26083917

Abstract

We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can therefore be readily used with multiple MD engines.

Main Text

Biophysical techniques probing the structural dynamics of biomolecules typically yield signals that arise from an ensemble of molecular conformations, and thus it is often not straightforward to interpret the experimental data unambiguously. For example, double electron-electron resonance (DEER) spectroscopy is increasingly used to measure distances between spin-labeled cysteine residues (1), and to assess conformational mechanisms in proteins. DEER spectra, however, actually translate into distance probability distributions, which are often multimodal and interdependent, and might reflect a variety of protein conformations and rotameric states of the labels.

Molecular dynamics (MD) simulations are arguably the best computational approach to address this problem. The concept is to employ an MD simulation to construct an ensemble of molecular configurations X that is consistent with the measured probability distribution of an observable $ξ = ξ^{f} (X)$, while simultaneously representing the molecular system more realistically (solvent, temperature, etc.) than in standard structural-refinement methods. In practice, this approach entails a modification of the simulation energy function, $U (X)$, so that the resulting probability distribution, $ρ (X)$, fulfills the experimental data with the minimum possible bias, i.e., it satisfies the so-called maximum-entropy principle (2,3). If the experimental data for observable ξ are binned into a histogram, a possible modification of $U (X)$ is a linear perturbation, leading to (4)

$$ρ (X) = \frac{\exp \{ - β U (X) + \sum_{i} λ_{i} h_{i} [ξ^{f} (X)] \}}{\int d X^{'} \exp \{ - β U (X^{'}) + \sum_{i} λ_{i} h_{i} [ξ^{f} (X^{'})] \}}, \tag{1}$$

where the index i denotes each of the bins in the measured histogram of ξ, and $h_{i} [ξ^{f} (X)] = 1$ if the value of $ξ^{f} (X)$ is in bin i, while $h_{i} [ξ^{f} (X)] = 0$ otherwise (and β = 1/k_BT, where k_B is the Boltzmann constant and T is the temperature). The λ_i parameters, which must be determined in each case, ensure that the time averages of $h_{i} [ξ^{f} (X)]$ are equal to the experimental probability values for each of the bins i. Practical applications of the maximum-entropy formulation in Eq. 1 have so far relied on computationally intensive approaches such as averaging over multiple system replicas simulated concurrently (4–7) or iterative optimization algorithms to determine the values of λ_i (2,3).
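To make the role of the indicator functions concrete, the following minimal sketch (NumPy; the bin edges, the short series of ξ values, and the function name `indicator_matrix` are illustrative assumptions, not taken from the article) builds $h_{i} [ξ^{f} (X)]$ for a series of ξ values and evaluates their time averages. These averages are the quantities that the λ_i in Eq. 1 must bring into agreement with the experimental bin probabilities.

```python
import numpy as np

def indicator_matrix(xi_traj, edges):
    """Return h[t, i] = 1 if xi_traj[t] lies in bin i = [edges[i], edges[i+1]), else 0."""
    xi_col = np.asarray(xi_traj, dtype=float)[:, None]
    return ((xi_col >= edges[:-1]) & (xi_col < edges[1:])).astype(float)

# Hypothetical bin edges and a short, made-up series of xi values.
edges = np.linspace(0.0, 4.0, 5)            # four bins of unit width
xi_traj = np.array([0.5, 1.2, 1.7, 1.9, 3.1, 2.4, 1.1, 0.2])

h = indicator_matrix(xi_traj, edges)
time_avg = h.mean(axis=0)                   # time average of each h_i
print(time_avg)                             # [0.25  0.5   0.125 0.125]
```

Each row of `h` contains a single 1, so the time averages sum to 1 and coincide with a normalized histogram of ξ.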

Here, we present an alternative, single-replica approach inspired by the metadynamics method (8,9), which is also consistent with the maximum-entropy principle. We refer to this method as ensemble-biased metadynamics (EBMetaD). Let us define ρ_exp(ξ) as the target experimental probability distribution of observable ξ and F(ξ) as the free energy,

$$F (ξ) = - \frac{1}{β} \ln \left[ \int d X \exp \{ - β U (X) \} δ (ξ - ξ^{f} (X)) \right] + C, \tag{2}$$

where C is a constant. In the limit of infinitesimally narrow bins, Eq. 1 becomes (Appendix S1 in the Supporting Material):

$$ρ (X) = \frac{\exp \{ - β U (X) + \ln ρ_{\exp} [ξ^{f} (X)] + β F [ξ^{f} (X)] \}}{\int d X^{'} \exp \{ - β U (X^{'}) + \ln ρ_{\exp} [ξ^{f} (X^{'})] + β F [ξ^{f} (X^{'})] \}} . \tag{3}$$
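As a brief consistency check (a sketch only, not a substitute for the derivation in Appendix S1), one can verify that the marginal of Eq. 3 along ξ is the target distribution. Integrating the numerator of Eq. 3 over all configurations with $ξ^{f} (X) = ξ$ and using Eq. 2 to replace the constrained integral gives:

```latex
\begin{align}
\int dX \, \rho(X) \, \delta[\xi - \xi^{f}(X)]
  &\propto \rho_{\exp}(\xi) \, e^{\beta F(\xi)}
     \int dX \, e^{-\beta U(X)} \, \delta[\xi - \xi^{f}(X)] \\
  &= \rho_{\exp}(\xi) \, e^{\beta F(\xi)} \, e^{-\beta [F(\xi) - C]} \\
  &\propto \rho_{\exp}(\xi) .
\end{align}
```

The factor $e^{β C}$ does not depend on ξ and is absorbed by the normalization in the denominator of Eq. 3.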

In EBMetaD, a biasing potential is added to the energy function so that the simulation samples $ρ (X)$ in Eq. 3. As in standard metadynamics, this biasing potential, denoted by $V (ξ^{f} (X), t)$, is constructed throughout the simulation as a cumulative sum of Gaussians, added one at a time at a frequency of 1/τ, each centered on the value of ξ at that time. In EBMetaD, however, these Gaussian functions are weighted by the inverse of the target probability distribution, that is,

$$V (ξ, t) = \sum_{t^{'} = τ, 2 τ, \dots}^{t} \frac{w \exp \{ - {[ξ - ξ^{f} (X_{t^{'}})]}^{2} / 2 σ^{2} \}}{\exp \{ S_{ρ} \} \, ρ_{\exp} [ξ^{f} (X_{t^{'}})]}, \tag{4}$$

where X_t′ denotes the atomic coordinates at time t′ and σ is the Gaussian width, which sets the resolution of $F (ξ)$ and $ρ_{\exp} (ξ)$. The quantity $S_{ρ} = - \int d ξ ρ_{\exp} (ξ) \ln [ρ_{\exp} (ξ)]$ is the differential entropy of $ρ_{\exp} (ξ)$, i.e., $\exp \{ S_{ρ} \}$ is the effective volume in ξ spanned by $ρ_{\exp} (ξ)$, and serves as a normalization factor to ensure that the mean height of the Gaussians added in the range of $ρ_{\exp} (ξ)$ is equal to w. As in standard metadynamics, EBMetaD simulations remain close to equilibrium if w, σ, and τ in Eq. 4 are selected adequately (Appendix S2 in the Supporting Material), and a stationary condition is reached at a certain time t_e, after which the biasing potential fluctuates around an average profile that converges asymptotically (10). Specifically, the change of $V (ξ^{f} (X), t)$ from this point forward is (11,12)
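The normalization by $\exp \{ S_{ρ} \}$ can be checked numerically. The sketch below (NumPy; the bimodal target distribution, the grid, the value of w, and all function names are illustrative assumptions, not taken from the article) evaluates $S_{ρ}$ on a uniform grid and the Gaussian height $w / (\exp \{ S_{ρ} \} ρ_{\exp})$ of Eq. 4. Under one reading of "mean height", namely the $ρ_{\exp}$-weighted geometric mean, the result is exactly w, because the $ρ_{\exp}$-weighted average of $- \ln ρ_{\exp}$ is $S_{ρ}$ by definition.

```python
import numpy as np

def normalize(xi, rho):
    """Normalize rho on a uniform grid so that sum(rho) * dx = 1."""
    dx = xi[1] - xi[0]
    return rho / (rho.sum() * dx)

def differential_entropy(xi, rho):
    """S = -integral of rho ln rho on a uniform grid; points with rho = 0 contribute 0."""
    dx = xi[1] - xi[0]
    pos = rho > 0
    return -np.sum(rho[pos] * np.log(rho[pos])) * dx

def gaussian_height(w, rho_at_center, entropy):
    """Height of the Gaussian deposited at a point where the target density is rho_at_center (Eq. 4)."""
    return w / (np.exp(entropy) * rho_at_center)

# Hypothetical bimodal target distribution (not experimental data).
xi = np.linspace(0.0, 10.0, 2001)
dx = xi[1] - xi[0]
rho_exp = normalize(
    xi,
    np.exp(-0.5 * ((xi - 3.0) / 0.5) ** 2)
    + 0.5 * np.exp(-0.5 * ((xi - 6.5) / 1.0) ** 2),
)

w = 0.1                                     # nominal Gaussian height, in k_BT
S_rho = differential_entropy(xi, rho_exp)
heights = gaussian_height(w, rho_exp, S_rho)

# rho_exp-weighted geometric mean of the heights.
geo_mean = np.exp(np.sum(rho_exp * np.log(heights)) * dx)
print(S_rho, geo_mean)
```

Gaussians deposited where $ρ_{\exp}$ is small are tall, and those deposited near its maxima are short, which is how the bias discourages sampling of regions that the target distribution disfavors.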

$$\dot{V} (ξ, t > t_{e}) \approx \frac{C}{ρ_{\exp} (ξ)} \frac{\exp \{ - β [F (ξ) + V (ξ, t)] \}}{\int d ξ^{'} \exp \{ - β [F (ξ^{'}) + V (ξ^{'}, t)] \}} \approx C, \tag{5}$$

where C is a constant. Provided that the region in which $ρ_{\exp} (ξ) > 0$ is energetically allowed by $U (X)$ , the implication of Eq. 5 is that the average biasing potential converges to

$$\bar{V} (ξ, t > t_{e}) \approx - \frac{1}{β} \ln ρ_{\exp} (ξ) - F (ξ) . \tag{6}$$

That is, when t > t_e, the EBMetaD simulation samples the space of ξ as in the target distribution $ρ_{exp} (ξ)$ .
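The step from Eq. 5 to Eq. 6 can be spelled out as follows (a sketch; the symbol $Z(t)$ for the normalization integral is introduced here for convenience and does not appear in the article). If the rate of change of the bias is the same constant C for every ξ, the normalized distribution sampled under the bias must equal $ρ_{\exp} (ξ)$, and taking the logarithm yields the bias up to an offset that does not depend on ξ:

```latex
\begin{align}
\frac{e^{-\beta [F(\xi) + V(\xi, t)]}}{Z(t)} &\approx \rho_{\exp}(\xi),
  \qquad Z(t) = \int d\xi' \, e^{-\beta [F(\xi') + V(\xi', t)]}, \\
-\beta \, [F(\xi) + V(\xi, t)] &\approx \ln \rho_{\exp}(\xi) + \ln Z(t), \\
V(\xi, t) &\approx -\frac{1}{\beta} \ln \rho_{\exp}(\xi) - F(\xi) - \frac{1}{\beta} \ln Z(t).
\end{align}
```

The last term is independent of ξ and therefore exerts no force, so the average bias agrees with Eq. 6 up to an additive constant.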

It is straightforward to generalize this approach to the case of multiple observables ξ_i and probability distributions thereof, $ρ_{exp} [ξ_{i}^{f} (X)]$ , employing a multidimensional biasing potential analogous to that in Eq. 4:

$$V (ξ_{1}, ξ_{2}, \dots, t) = \sum_{t^{'} = τ, 2 τ, \dots}^{t} \sum_{i} \frac{w_{i} \exp \{ - {[ξ_{i} - ξ_{i}^{f} (X_{t^{'}})]}^{2} / 2 σ_{i}^{2} \}}{\exp \{ S_{ρ_{i}} \} \, ρ_{\exp} [ξ_{i}^{f} (X_{t^{'}})]} . \tag{7}$$

Owing to the scaling factors $S_{ρ_{i}}$ , several distributions can be simultaneously targeted even if they have very different effective volumes. The observables ξ_i, however, ought not be a function of each other (2–4).
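A direct evaluation of Eq. 7 can be sketched as follows (NumPy; the function name `ebmetad_bias`, the array layout, and all numerical values are illustrative assumptions, and this is not the PLUMED implementation). As Eq. 7 is written, the bias is a sum of one-dimensional Gaussians, one per observable and per deposition time, each scaled by its own entropy factor and by the target density at the deposition point.

```python
import numpy as np

def ebmetad_bias(xi, centers, rho_at_centers, w, sigma, entropy):
    """Evaluate Eq. 7 at the point xi.

    xi             : (n_obs,)        current values of the observables
    centers        : (n_dep, n_obs)  observable values at each deposition time
    rho_at_centers : (n_dep, n_obs)  target densities at those values
    w, sigma, entropy : (n_obs,)     per-observable height, width, and S_rho
    """
    xi = np.asarray(xi, dtype=float)
    centers = np.asarray(centers, dtype=float)
    rho_at_centers = np.asarray(rho_at_centers, dtype=float)
    w, sigma, entropy = (np.asarray(a, dtype=float) for a in (w, sigma, entropy))
    gauss = np.exp(-0.5 * ((xi - centers) / sigma) ** 2)      # (n_dep, n_obs)
    heights = w / (np.exp(entropy) * rho_at_centers)          # (n_dep, n_obs)
    return float(np.sum(heights * gauss))

# Two hypothetical observables and two deposition times (made-up numbers).
centers = np.array([[1.0, 2.0], [1.5, 2.5]])
rho_at_centers = np.array([[0.5, 0.25], [1.0, 0.5]])
w = np.array([0.1, 0.2])
sigma = np.array([0.1, 0.2])
entropy = np.array([0.0, np.log(2.0)])

v = ebmetad_bias([1.0, 2.0], centers, rho_at_centers, w, sigma, entropy)
print(v)
```

Because the terms for different observables simply add, the force on each ξ_i depends only on the Gaussians deposited along that observable.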

To test the validity of the EBMetaD method, we first considered the two-dimensional model potential $U (ξ, ξ^{'})$ shown in Fig. 1 A; the corresponding one-dimensional probability distribution $ρ_{o} (ξ)$, calculated analytically, is shown in Fig. 1 B (gray). We aim to sample instead a hypothetical experimental distribution $ρ_{\exp} (ξ)$, also shown in Fig. 1 B (black). We thus carry out an overdamped Langevin dynamics simulation on the $U (ξ, ξ^{'})$ potential, using EBMetaD to slowly construct the biasing potential $V (ξ, t)$ defined in Eq. 4. As Fig. 1 B shows, the calculated histogram $ρ_{sim} (ξ)$ evolves gradually until it converges to the target probability distribution. Thereafter, the simulation reaches a stationary condition, and neither $ρ_{sim} (ξ)$ nor the average bias potential changes significantly (Fig. 1 B, inset). To assess whether the ensemble sampled at convergence corresponds to that defined in Eq. 3, i.e., whether EBMetaD indeed fulfills the maximum-entropy principle, we directly compare the calculated simulation histogram $ρ_{sim} (ξ, ξ^{'})$ with the modified two-dimensional potential, $U (ξ, ξ^{'}) - \ln ρ_{\exp} (ξ) - F (ξ)$, calculated analytically (k_BT = 1). As Fig. 1 C shows, these distributions match perfectly; that is, the bias introduced so as to reproduce $ρ_{\exp} (ξ)$ does not alter $ρ (ξ^{'})$ for any ξ-value. An extension of this test in which two hypothetical distributions $ρ_{\exp} (ξ)$ and $ρ_{\exp}^{'} (ξ^{'})$ are concurrently targeted further confirms that EBMetaD fulfills the maximum-entropy condition (see Appendix S3 and Fig. S2 in the Supporting Material).

Figure 1. (A) Model two-dimensional potential used to test EBMetaD, via an overdamped Langevin dynamics simulation. (B) Histogram of ξ as a function of the number of simulation steps (*red lines*), compared with the probability distribution associated with the model potential (*gray*), and with the target distribution (*black*). (*Inset*) Average biasing potential, versus $- \ln ρ_{\exp} (ξ) - F (ξ)$ (Eq. 6), with t_e = 5 × 10⁵ steps. $F (ξ)$ was calculated analytically, as $F (ξ) = - \ln [\int d ξ^{'} \exp \{ - U (ξ, ξ^{'}) \}] + C$. (C) Histogram of ξ and ξ′ from EBMetaD (*red isolines*), overlaid on the ensemble-corrected potential calculated analytically (*black isolines*) (Eq. 3). Diffusion coefficients in ξ and ξ′ were set to 10, the integration time step was 10⁻⁵, and k_BT = 1. Gaussians of height 10⁻⁴ k_BT and width 0.1 were added every 10³ steps. Equivalent results were obtained for a wide range of alternative values (Fig. S1).
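A test of this kind can be sketched in a few lines. The example below is a simplified one-dimensional analog, not the two-dimensional model of Fig. 1: the double-well potential, the Gaussian target, every parameter value, the floor on $ρ_{\exp}$, and the function name `run_ebmetad_1d` are assumptions made for illustration. It integrates overdamped Langevin dynamics with k_BT = 1 and deposits Gaussians on a grid according to Eq. 4.

```python
import numpy as np

def run_ebmetad_1d(n_steps=100_000, dt=1e-3, diff=1.0, w=0.002, sigma=0.1,
                   stride=20, rho_floor=1e-3, seed=0):
    """Toy 1D EBMetaD run on U(x) = (x^2 - 1)^2 with k_BT = 1 (illustrative values)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-2.5, 2.5, 501)
    dx = grid[1] - grid[0]

    # Hypothetical target distribution: a single Gaussian centered at x = 0.5.
    rho_exp = np.exp(-0.5 * ((grid - 0.5) / 0.4) ** 2)
    rho_exp /= rho_exp.sum() * dx
    pos = rho_exp > 0
    entropy = -np.sum(rho_exp[pos] * np.log(rho_exp[pos])) * dx
    # Assumption: floor the density so that Gaussian heights stay bounded
    # in regions where the target is essentially zero.
    rho_safe = np.maximum(rho_exp, rho_floor * rho_exp.max())

    bias = np.zeros_like(grid)         # V(xi, t) on the grid
    bias_force = np.zeros_like(grid)   # -dV/dxi on the grid
    noise = rng.normal(size=n_steps) * np.sqrt(2.0 * diff * dt)
    traj = np.empty(n_steps)
    x = 0.5
    for step in range(n_steps):
        if step > 0 and step % stride == 0:
            # Eq. 4: Gaussian centered at the current x, height w / (exp(S) rho_exp(x)).
            height = w / (np.exp(entropy) * np.interp(x, grid, rho_safe))
            bias += height * np.exp(-0.5 * ((grid - x) / sigma) ** 2)
            bias_force = -np.gradient(bias, dx)
        # Force from the model potential plus the interpolated bias force.
        force = -4.0 * x * (x * x - 1.0) + np.interp(x, grid, bias_force)
        x = x + diff * force * dt + noise[step]     # beta = 1
        x = min(max(x, grid[0]), grid[-1])          # keep the walker on the grid
        traj[step] = x
    return grid, rho_exp, bias, traj

grid, rho_exp, bias, traj = run_ebmetad_1d()
# Histogram of the second half of the run, to be compared with rho_exp.
hist, edges = np.histogram(traj[len(traj) // 2:], bins=50,
                           range=(grid[0], grid[-1]), density=True)
```

If w, σ, and the deposition stride are chosen adequately, one would expect `hist` to approach `rho_exp` and `bias` to approach $- \ln ρ_{\exp} (ξ) - F (ξ)$ up to a constant, as in Eq. 6; the sketch does not verify this, and the parameters given here have not been tuned.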

To test EBMetaD in a realistic application, we next considered T4 lysozyme in explicit water (Fig. 2 A, and Weaver and Matthews (13)). Following Roux and Islam (4), three methanethiosulfonate spin-labels were attached at positions E62C, T109C, and A134C. Experimental distance distributions for each pair of nitroxide groups were obtained via electron spin resonance (ESR)/DEER spectroscopy; data were kindly provided by R. A. Stein and H. S. McHaourab (Vanderbilt University Medical Center, Nashville, TN).

Figure 2. (A) Spin-labeled T4 lysozyme simulated in explicit water (PDB:2LZM (13)). The protein is enclosed in a truncated-octahedron periodic box containing 11,895 TIP3P water molecules and 10 Cl⁻ counterions that neutralize the total charge of the system. The distances between the spin-label nitroxide groups measured by ESR/DEER are indicated (*solid arrows*). (B) Comparison of the experimental and calculated probability distributions for each of the spin-label pairs, from either unbiased MD simulations or EBMetaD; the latter are given for different simulation times. (*Insets*) EBMetaD biasing potential, averaged over the simulated trajectory (t_e = 5 ns). Error bars are standard errors over three simulation fragments (see Fig. S4 for further details).

A single trajectory of ∼200 ns was then calculated with EBMetaD, using a three-dimensional biasing potential of the form defined in Eq. 7, i.e., the three experimental distributions are targeted concurrently. For comparison, an unbiased ∼270-ns trajectory was also calculated using standard MD. As shown in Fig. 2 B, the distance histograms derived from the unbiased trajectory fail to reproduce those obtained experimentally. By contrast, the histograms derived from the EBMetaD simulation converge to the ESR/DEER data within a few tens of nanoseconds, and preserve that agreement thereafter. To further assess the performance of the method, we compared the time-averaged biasing potential applied to each of the spin-spin distances in three fragments of the EBMetaD trajectory (excluding only the first 5 ns). As shown in Fig. 2 B (insets), the shape of the biasing potentials is largely constant in time, with fluctuations significantly larger than k_BT only in the distal, low-probability regions, thus confirming that EBMetaD reaches an approximately stationary condition (Eqs. 5 and 6). Consistent with the maximum-entropy principle, the ensemble correction introduced by EBMetaD primarily entails a population shift in the rotameric states of the spin labels (Fig. S3 A), with no significant changes in the protein backbone (Fig. S3 B); the root-mean-square deviation of the Cα-trace, relative to the starting x-ray structure, is within 2 Å in both the unbiased and EBMetaD trajectories.

In summary, we have introduced an enhanced-sampling MD simulation method to generate molecular ensembles that reproduce probability distributions for one or more independent observables. This method, referred to as ensemble-biased metadynamics, adaptively provides an ensemble correction consistent with the maximum-entropy principle (2–6), without mean-field approximations (4), multiple simulation replicas (4,5,7), or the iterative optimization of Lagrangian parameters (2,3). Owing to the computational efficiency and practical simplicity of the method, we posit that EBMetaD can be extremely useful in a wide range of applications, such as structure refinement, mechanistic studies based on spectroscopic data, or purely computational simulation studies. EBMetaD is integrated within the PLUMED 1.3 plug-in (14), and can thus be readily used with multiple simulation engines.

Author Contributions

F.M. and J.D.F.G. designed research and wrote the article. F.M. performed research, contributed analytical tools, and analyzed the data.

Acknowledgments

This work was funded by the Division of Intramural Research of the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD.

Editor: Bert de Groot.

Contributor Information

Fabrizio Marinelli, Email: fabrizio.marinelli@nih.gov.

José D. Faraldo-Gómez, Email: jose.faraldo@nih.gov.

Supporting Material

Document S1. Three appendices, Supporting Materials and Methods, and four figures (mmc1.pdf, 8.9 MB)

Document S2. Article plus Supporting Material (mmc2.pdf, 8.8 MB)

References

1. Jeschke G. DEER distance measurements on proteins. Annu. Rev. Phys. Chem. 2012;63:419–446. doi: 10.1146/annurev-physchem-032511-143716.
2. Pitera J.W., Chodera J.D. On the use of experimental observations to bias simulated ensembles. J. Chem. Theory Comput. 2012;8:3445–3451. doi: 10.1021/ct300112v.
3. White A.D., Voth G.A. Efficient and minimal method to bias molecular simulations with experimental data. J. Chem. Theory Comput. 2014;10:3023–3030. doi: 10.1021/ct500320c.
4. Roux B., Islam S.M. Restrained-ensemble molecular dynamics simulations based on distance histograms from double electron-electron resonance spectroscopy. J. Phys. Chem. B. 2013;117:4733–4739. doi: 10.1021/jp3110369.
5. Cavalli A., Camilloni C., Vendruscolo M. Molecular dynamics simulations with replica-averaged structural restraints generate structural ensembles according to the maximum entropy principle. J. Chem. Phys. 2013;138:094112. doi: 10.1063/1.4793625.
6. Roux B., Weare J. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method. J. Chem. Phys. 2013;138:084107. doi: 10.1063/1.4792208.
7. Islam S.M., Stein R.A., Roux B. Structural refinement from restrained-ensemble simulations based on EPR/DEER data: application to T4 lysozyme. J. Phys. Chem. B. 2013;117:4740–4754. doi: 10.1021/jp311723a.
8. Laio A., Gervasio F.L. Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science. Rep. Prog. Phys. 2008;71:126601.
9. Laio A., Parrinello M. Escaping free-energy minima. Proc. Natl. Acad. Sci. USA. 2002;99:12562–12566. doi: 10.1073/pnas.202427399.
10. Bussi G., Laio A., Parrinello M. Equilibrium free energies from nonequilibrium metadynamics. Phys. Rev. Lett. 2006;96:090601. doi: 10.1103/PhysRevLett.96.090601.
11. Barducci A., Bussi G., Parrinello M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 2008;100:020603. doi: 10.1103/PhysRevLett.100.020603.
12. Dama J.F., Parrinello M., Voth G.A. Well-tempered metadynamics converges asymptotically. Phys. Rev. Lett. 2014;112:240602. doi: 10.1103/PhysRevLett.112.240602.
13. Weaver L.H., Matthews B.W. Structure of bacteriophage T4 lysozyme refined at 1.7 Å resolution. J. Mol. Biol. 1987;193:189–199. doi: 10.1016/0022-2836(87)90636-x.
14. Bonomi M., Branduardi D., Parrinello M. PLUMED: a portable plugin for free-energy calculations with molecular dynamics. Comput. Phys. Commun. 2009;180:1961–1972.
