Abstract
The protein–ligand binding affinity quantifies the binding strength between a protein and its ligand. Computer modeling and simulations can be used to estimate the binding affinity or binding free energy using data- or physics-driven methods or a combination thereof. Here we discuss a purely physics-based sampling approach based on biased molecular dynamics simulations. Our proposed method generalizes and simplifies previously suggested stratification strategies that use umbrella sampling or other enhanced sampling simulations with additional collective-variable-based restraints. The approach presented here uses a flexible scheme that can be easily tailored for any system of interest. We estimate the binding affinity of human fibroblast growth factor 1 to heparin hexasaccharide based on the available crystal structure of the complex as the initial model and four different variations of the proposed method to compare against the experimentally determined binding affinity obtained from isothermal titration calorimetry experiments.
Subject terms: Computational biophysics, Biophysical chemistry
This work provides a physics-based theoretical framework for accurate protein–ligand binding affinity estimation based on molecular dynamics simulations, enhanced sampling, non-parametric reweighting and the orientation quaternion formalism.
Main
Accurate quantification of absolute binding affinities remains a problem of major importance in computational biophysics1–4. In principle, accurate binding-free-energy calculations should be the cornerstone of any study investigating protein–ligand interactions. However, the high computational costs that typically accompany such calculations necessitate the improvement of the computational methods traditionally used to investigate complex biomolecular interactions3–5. Experimentally determined binding affinities are commonly used as benchmarks to judge the accuracy of various computational binding affinity estimation methods5. Several experimental techniques can be used to study protein–ligand binding equilibria5,6. For instance, isothermal titration calorimetry (ITC) can detect the interaction of binding partners based on changes in solution heat capacity and binding partner concentration6–8. Other methods such as fluorescence spectroscopy rely on changes in fluorescence intensity upon ligand binding6,9,10. Surface plasmon resonance can be used to calculate binding affinities based on changes in refractive index that occur when an immobilized binding partner interacts with a free binding partner6. Studies have found that experimental binding affinities can vary depending on the experimental method used5. Therefore, a thorough understanding of the experimental conditions used to generate reference data is essential when comparing computationally determined binding affinities with experimental values.
Several computational methods at varying levels of rigor and complexity have been used to determine binding affinities for biomolecular interactions3,11–18. Knowledge-based statistical potentials and force-field scoring potentials are typically used to rank docked protein–ligand or protein–protein complexes but can also be used for binding affinity prediction19–21. A major disadvantage of these methods is that they do not treat the entropic effects rigorously, which effectively decreases the accuracy of such binding affinity predictions5. This is also the case for methods such as molecular mechanics/Poisson–Boltzmann surface area (MM-PBSA) and molecular mechanics/generalized Born surface area (MM-GBSA), which combine sampling of conformations from explicit solvent molecular dynamics (MD) simulations with free-energy estimation based on implicit continuum solvent models22–24. Adequate sampling of protein and ligand conformational dynamics as well as ligand roto-translational movements with respect to the protein is essential for accurately quantifying the entropic reduction arising from the binding event24–26. MM-PBSA and MM-GBSA methods typically neglect such entropic contributions to the binding free energy or do not treat them rigorously23,24. Binding Free-Energy Estimator 2 (BFEE2) is a state-of-the-art protein–ligand binding affinity calculation software that addresses the substantial shift in configurational enthalpy and entropy that follows ligand–protein binding, which is hard to represent in brute-force simulations18,27. An energy–entropy approach, energy–entropy multiscale cell correlation, has been introduced to compute the free energy of binding and has been applied for binding-free-energy calculations, which take into consideration the entropy of all flexible degrees of freedom in the system in a consistent and generic way28.
One of the best-known binding-free-energy estimation methods is alchemical free-energy perturbation (FEP), where scaling of non-bonded interactions enables reversible decoupling of the ligand from its environment in the bound state as well as the unbound state29–32. Most entropic and enthalpic contributors to changes in binding affinity are typically considered during FEP simulations, thus avoiding the approximations used by methods such as MM-PBSA and MM-GBSA5,33. A disadvantage of FEP is that ligands tend to move away from the binding site during the decoupling process, which results in poorly defined target states of the FEP calculation being used as starting states for the re-coupling process34. Using receptor–ligand restraints to resolve this issue30,35–37 introduces some ambiguity to the way a standard state is defined, with a level of correlation between the size of the simulation cell and the standard state38. This can be corrected by the use of appropriate geometrical restraints39–41.
Unrestrained long-timescale MD simulations should theoretically allow for the investigation and quantification of protein–ligand or protein–protein binding events42,43. While microsecond-level MD simulations provide a more accurate description of protein conformational dynamics compared with shorter simulations44, efficient sampling of the conformational landscape remains a major issue and requires access to timescales beyond the capabilities of current MD simulations45,46. Several methods have been developed to tackle the sampling problem. Markov state models allow the sampling and characterization of native as well as alternative binding states57. Similarly, weighted ensemble simulations sample the conformational landscape along one or more discretized reaction coordinates based on the assignment of a statistical weight to each simulation47,48. More traditionally, umbrella sampling (US) along such reaction coordinates can be used to guide the binding or unbinding of a ligand, after which algorithms like the weighted histogram analysis method can be used to calculate a unidimensional potential of mean force (PMF) that quantifies ligand binding and unbinding along a reaction coordinate49,50. Better convergence of the calculated free-energy profiles can be achieved by the exchange of conformations between successive US windows as in the bias-exchange umbrella sampling (BEUS)51–53. Other methods based on similar principles include umbrella integration54, well-tempered metadynamics55, adaptive biasing force (ABF) simulations56 and variations of these techniques.
Incomplete sampling of important degrees of freedom, such as orientation of the ligand with respect to the protein, remains a major disadvantage of unidimensional PMF-based methods3,4. To resolve this problem, ref. 3 reported a method wherein explicitly defined geometrical restraints on the orientation and conformation of the binding partners are used to reduce the conformational entropy of the biomolecular system being studied3,4. This results in improved convergence of the PMF calculation3,4. The introduction of a restraining potential based on the root-mean-square deviation (RMSD) of the ligand relative to its average bound conformation reduces the flexibility of the ligand and the number of conformations that need to be sampled3,4. This method avoids the need to decouple the ligand from its surrounding environment as required by alchemical FEP3,4,29–32. Recent studies4,57 have described applications and extensions of the methodology proposed by ref. 3.
In this Article, we describe a purely physics-based enhanced sampling method based on biased MD simulations, which is similar in principle to the stratification strategy proposed by refs. 3,4. Although we use the US method as our enhanced sampling technique, the methodology is generalizable to other techniques as long as they can be combined with additional restraints. There are several important differences between our method and that of refs. 3,4. Our method: (1) provides a general scheme that can be easily adapted to any number of restraints; (2) uses a non-parametric reconstruction of the grid PMF, as defined below; and (3) uses the unidimensional orientation angle of the ligand with respect to the protein as a collective variable for restraining, as opposed to the use of three Euler angles. We note that the method of refs. 3,4 can in principle be generalized as well; however, the generalization is not as straightforward as it is in our proposed method, particularly in removing some of the restraints. In other words, while adding more restraints in the method of refs. 3,4 is somewhat similar to our approach, removing some of the restraints requires less trivial changes to the formalism, which makes it distinct from our method. We have used this methodology to calculate the binding affinity for the interaction of human fibroblast growth factor 1 (hFGF1) with heparin hexasaccharide, its glycosaminoglycan (GAG) binding partner. hFGF1 is an important signaling protein that is implicated in physiological processes such as cell proliferation and differentiation, neurogenesis, wound healing, tumor growth and angiogenesis58–62. GAGs are linear anionic polysaccharides that interact with positively charged regions of FGF binding partners to regulate their biological activity63. The hFGF1–heparin complex is the best-known and most broadly characterized protein–GAG complex64,65.
Heparin binding is thought to stabilize hFGF1 and impart protection against proteolytic degradation. In this study, we show that the absolute binding affinity for the hFGF1–heparin interaction calculated using our approach is in good agreement with binding affinity data from ITC experiments. Four alternative methods are used for estimating the absolute binding affinity within the formalism presented here to determine the workings of the methodology and the effect of the application of different (or no) restraints. We also compare our results with those obtained from FEP simulations and show that although performing longer FEP simulations could improve the accuracy of binding affinity estimates when compared with short FEP simulations, our approach is still more accurate than FEP when similar simulation times are used.
Results
Calculation of binding free energy using four different strategies
We have calculated the absolute binding free energy for the interaction of hFGF1 with heparin hexasaccharide using four variations of the stratification scheme described above, based on a combination of steered MD (SMD) and BEUS simulations. The details of the methodology are discussed in Methods. Four different methods are used with varying effectiveness in estimating the absolute binding free energy. These methods are: (1) the traditional distance-based BEUS simulations that do not employ any additional restraining; (2) distance-based BEUS simulations employing a restraint on the orientation of the ligand (Ω) defined based on the orientation quaternion formalism; (3) distance-based BEUS simulations employing a restraint on the RMSD of both ligand and protein (rL and rP); (4) distance-based BEUS simulations employing a restraint on the RMSD of both ligand and protein as well as the orientation of the ligand (Ω, rL and rP). In each case, appropriate correction terms are calculated as discussed in the ‘Theoretical foundation’ section and shown in Table 1.
Table 1.
Quantity | No restraints | Ω restraint | rL and rP restraint | Ω, rL and rP restraint |
---|---|---|---|---|
Grid PMF difference (kcal mol−1) | ΔG(xB) = −19.7 ± 1.1^a | ΔGΩ(xB) = −13.2 ± 0.3 | −17.7 ± 1.0 | −17.0 ± 0.5 |
Orientation correction (kcal mol−1) | NA | ΔUΩ(xB) = 4.4 ± 0.3 | NA | 4.6 ± 0.3 |
Ligand RMSD correction (kcal mol−1) | NA | NA | 0.6 ± 0.1 | 0.6 ± 0.1 |
Protein RMSD correction (kcal mol−1) | NA | NA | 0.3 ± 0.1 | 0.3 ± 0.1 |
ΔGV (kcal mol−1) | 3.7 ± 0.2 | 2.5 ± 0.2 | 2.3 ± 0.2 | 2.7 ± 0.2 |
ΔG° (kcal mol−1) | −16.0 ± 1.2 | −6.3 ± 0.5 | −14.5 ± 1.0 | −8.7 ± 0.7 |
Kd (μM)^b | O(10^−6) | 25 | O(10^−5) | 0.5 |
Kd range (μM)^c | 10^−7–10^−5 | 11–58 | 10^−6–10^−4 | 0.2–2.0 |
We are comparing the ΔG° of four different restraining methods (see ‘BEUS simulations’ in Methods for details).
^a All error estimates are based on 1 s.d.
^b Equilibrium dissociation constant (Kd) values are determined directly from mean absolute binding-free-energy (ΔG°) values using relation (2).
^c Kd range is determined from the lower and upper limits of ΔG° values (mean ± s.d.) using relation (2). The experimentally determined Kd and ΔG° were 1.68 ± 0.03 μM and −7.88 ± 0.01 kcal mol−1, respectively (see Fig. 4).
Ω, orientation angle of heparin with respect to the protein; rP, RMSD of the protein; rL, RMSD of heparin; ΔGV, contribution of the difference between the volume of the binding pocket and the bulk to the binding free energy. The grid PMF difference entries (ΔG(xB), ΔGΩ(xB) and their RMSD-restrained counterparts) are the PMF differences between the binding pocket center and the bulk associated with the respective restraints. NA, not applicable. The orientation, ligand RMSD and protein RMSD correction entries (ΔUΩ(xB) and its counterparts) are correction terms associated with the restraints (see ‘Theoretical foundation’ in Methods for details).
We denote the PMF of the ligand at a given position x (with respect to the center of the heparin binding pocket) as the grid PMF, as the PMF is estimated at different grid points in this approach (Fig. 1). The average grid PMF profiles along the ligand–protein distance for the four different methods used here (as shown in Fig. 1) confirm the differential behavior of these methods (see Supplementary Fig. 1 for a schematic representation of these simulations). Note that since x = 0 is the grid point associated with the lowest PMF by definition, the average PMF along |x| has its global minimum at |x| = 0. The most successful method is expected to be the one employing restraints on Ω, rL and rP (Table 1 and Fig. 1). The largest contributor to the free energy is the difference between the grid PMF associated with the heparin hexasaccharide at a grid point at the center of the binding pocket and at any grid point in the bulk, which is −17.0 ± 0.5 kcal mol−1 (Fig. 2a and Table 1).
The PMF calculations above are based on the BEUS simulations along the protein–ligand distance; however, the orientation and RMSD of the ligand and the RMSD of the protein are restrained to speed up convergence. To account for the orientation bias, a correction term needs to be applied, which is calculated from the PMF associated with the ligand orientation angle at the bulk and binding pocket (Fig. 2b). The orientation bias is estimated to be 4.6 ± 0.3 kcal mol−1 (Table 1). Similarly, correction terms are calculated based on the PMF of the ligand RMSD and that of the protein RMSD (Fig. 2c). These correction terms are estimated to be 0.6 ± 0.1 kcal mol−1 and 0.3 ± 0.1 kcal mol−1 for the ligand and protein, respectively (Table 1).
Finally, another term is needed to account for the difference in the volume accessible to the ligand in the binding pocket and in the bulk (volume contribution). Figure 3 shows that the binding pocket contribution (ΔGP) (or binding pocket volume (VP)) for the distance-based BEUS simulations with no restraint as determined from the 20-lowest free-energy grid points is almost equal to that obtained from all visited grid points inside or outside the binding pocket. For the distance-based BEUS simulations with Ω, rL and rP restraints, this term is estimated to be 2.7 ± 0.2 kcal mol−1 (Table 1), which results in an absolute binding free energy of −8.7 ± 0.7 kcal mol−1 (Table 1).
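As a worked check of how the terms combine, the short Python sketch below sums the entries of each column of Table 1 (grid PMF difference, applicable correction terms and ΔGV) and compares the sum with the tabulated ΔG°. The dictionary keys and the function name are labels chosen for this illustration only; the numbers are copied from Table 1, and the sums are expected to match the tabulated values only to within rounding of the individual entries.

```python
# Terms in kcal/mol, copied from Table 1; NA entries are simply omitted.
TABLE1 = {
    "no restraints": {"grid_pmf": -19.7, "dG_V": 3.7, "reported": -16.0},
    "omega": {"grid_pmf": -13.2, "orientation": 4.4, "dG_V": 2.5,
              "reported": -6.3},
    "rL_rP": {"grid_pmf": -17.7, "ligand_rmsd": 0.6, "protein_rmsd": 0.3,
              "dG_V": 2.3, "reported": -14.5},
    "omega_rL_rP": {"grid_pmf": -17.0, "orientation": 4.6, "ligand_rmsd": 0.6,
                    "protein_rmsd": 0.3, "dG_V": 2.7, "reported": -8.7},
}


def summed_binding_free_energy(terms):
    """Add every tabulated contribution except the reported total."""
    return sum(value for key, value in terms.items() if key != "reported")


for name, terms in TABLE1.items():
    total = summed_binding_free_energy(terms)
    print(f"{name:>14}: sum = {total:6.1f}  reported = {terms['reported']:6.1f}")
```

The same additive pattern applies when restraints are added or removed: each restraint contributes one correction term alongside the grid PMF difference and the volume term.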
On the basis of our error analysis, equilibrium dissociation constant (Kd) values calculated from the absolute binding free energy were found to be in the micromolar range with an average value of 0.5 μM (using the mean absolute binding free energy (ΔG°) estimate) and ranging from 0.2 μM to 2.0 μM (based on the lower and upper bounds of free energy estimates) (Table 1). These are in very good agreement with the Kd value obtained from ITC experiments. We performed the ITC experiments in triplicate, obtaining Kd values of 1.68 ± 0.03 μM (as shown in Fig. 4), 1.65 ± 0.07 μM and 1.69 ± 0.05 μM in three independent experiments. The binding free energy calculated from the experimental Kd (−7.87 kcal mol−1, −7.88 kcal mol−1 or −7.89 kcal mol−1, depending on the experiment) is also in good agreement with the computationally calculated binding free energy (Fig. 4 and Table 1).
Comparison between computationally and experimentally calculated binding free energy of heparin–hFGF1
The quantitative agreement between the computational and experimental binding affinity estimates is a great indicator of the accuracy of our absolute binding-free-energy calculation method. However, if proper restraining is not used as in the distance-based BEUS simulations with no restraints or only RMSD restraints, the binding affinity estimates would be off by several orders of magnitude. The simulations that restrain only the orientation of the ligand are interestingly quite successful as well, being off by only one order of magnitude in terms of binding affinity, which is generally considered a good estimate. This provides some evidence that the orientation of the ligand is perhaps the degree of freedom with the most substantial contribution to the absolute binding free energy besides the ligand–protein distance. While the average grid PMF profiles along the ligand–protein distance (as shown in Fig. 1) confirm that the four methods used here produce different results, it is important to note that the correction terms should ideally eliminate these differences. This is seen to some extent when comparing the two methods involving orientation restraints that happen to estimate binding affinities that are reasonably close (Table 1) to the experimentally determined value. Another source of error in our calculations could be in estimating the VP and eventually the contribution of the difference between the volume of the binding pocket and the bulk to the binding free energy (ΔGV). In doing so, we have made an assumption that the VP can be calculated from relation (31) approximating the grid PMF with that obtained from biased simulations. Comparing ΔGV values from Table 1 shows that different biases result in different approximating values ranging from 2.3 kcal mol−1 to 3.7 kcal mol−1. For more information on these results and the convergence of data, see Supplementary Table 1 and Figs. 2–5 ref. 27.
Examining how this approach compares with other prevalent binding-free-energy calculation methods
Recent computational studies have used the MM-GBSA method to calculate the binding free energy of the hFGF1–heparin interaction, with values ranging from −84.2 kcal mol−1 to −106.1 kcal mol−1 (ref. 66). These MM-GBSA values differ considerably from ours. MM-PBSA and MM-GBSA, which unlike our method are not all-atom simulation approaches, have drawbacks including the continuum solvent approximation. Choosing an appropriate intrinsic dielectric constant presents another challenge. It has long been known that the selection of the intrinsic dielectric constant has a significant impact on the computed electrostatic energy22–24. In contrast, these contributors to the binding affinity are typically taken into account during FEP simulations, thus obviating the need for the approximations used in MM-PBSA and MM-GBSA5,33. It is widely accepted that binding-free-energy estimates from MM-PBSA and MM-GBSA are less accurate than those from FEP, which is considered to be the gold standard for the calculation of absolute binding affinities67. We performed double annihilation FEP to calculate the absolute binding free energy of the hFGF1–heparin complex. The BFEE2 method was used to estimate binding affinities from FEP simulations with the consideration of several restraints to improve sampling within the framework of the method of refs. 3,4. The FEP simulations here were designed to have an aggregate simulation time comparable to that used in our BEUS simulations (~2.3 μs for FEP compared with ~1.1 μs for SMD + BEUS). An absolute binding free energy of 0.55 ± 30.25 kcal mol−1 was obtained from the FEP simulations (Fig. 5a and Supplementary Table 2). Unlike the absolute binding free energy estimated from the BEUS simulations (−8.7 ± 0.7 kcal mol−1) (Table 1), the estimates from the FEP simulations are not in good agreement with the binding free energy determined from ITC experiments (−7.88 ± 0.01 kcal mol−1) (Fig. 4).
More importantly, the FEP results carry a large uncertainty, which is due to the relatively large size of the ligand. To show the effectiveness of our approach, we also calculated the binding free energy of hFGF1 and heparin directly using the method of refs. 3,4 as implemented within the BFEE2 package (the geometrical route). To make a fair comparison, we ran 1.6 μs of aggregate simulation using ABF free-energy calculations. The geometrical route gives a binding free energy of −19.04 ± 2.95 kcal mol−1 for heparin and hFGF1 (Fig. 5b and Supplementary Table 3). In contrast to our technique, the BFEE2 geometrical route predicted a binding free energy more than twice the magnitude of the experimental value. For this system, the comparison indicates that our method is more effective than these well-established binding-free-energy calculation methods. To firmly establish the efficiency of our strategy, additional research with a larger set of systems will be required in the future. In particular, it is important to determine what parameters make our method more efficient than the BFEE2 geometrical protocol. For instance, it could be due to the use of the BEUS simulation scheme or to a more fundamental difference regarding the use of simpler restraints and analysis schemes.
Studies have shown that the binding affinity and free-energy results derived from computational methods can be compared with experimental binding affinities obtained from ITC experiments7,8. However, for a reliable computational free-energy estimate, employing purely physics-based free-energy calculation methods such as those employed here has proven to be difficult. Herein we showed that using a careful strategy that considers all relevant free-energy terms and ensures the use of powerful enhanced sampling techniques could result in good quantitative agreements between the computational and experimental binding affinity estimates. Our methodology could serve as a robust free-energy calculation method for determining the binding affinities of any protein and ligand of interest. However, the accuracy of the resulting binding free energies is still limited by the reliability of the force field parameters, which is at least equally as important as sampling for accurate physics-based binding affinity estimation.
Discussion
The formalism presented in this work has notable similarities to the method previously proposed by ref. 3, and later implemented4,57. However, there are major differences that make the current method more practical. The grid PMF and its various estimates provide a simple conceptual framework to understand how restraining can be accounted for with appropriate correction terms. The average grid PMF in terms of the ligand–protein distance provides an alternative to the PMF in terms of d as is often constructed. The non-parametric reweighting allows for calculating the grid PMF in terms of the distance from the center of the binding pocket, as defined in this work, eliminating the need for calculating the PMF in terms of the polar and azimuthal angles as in the method of refs. 3,4. Relation (30) is a general scheme that can be easily adapted to any number of restraints. For instance, one may or may not add the polar and azimuthal angles to the restraints using the trivial generalization of relation (30). The orientation angle of the ligand with respect to the protein as determined using the orientation quaternion formalism provides a simple way of determining the absolute binding free energy with a feasible computational cost. Among the four different sets of restraints, the two involving orientation restraints predict binding free energies similar to that determined experimentally. Again, if restraining the orientation angle does not allow for a rapid convergence, one can add more restraints including the tilt and/or spin restraints. While the traces of the method of refs. 3,4 are clear in our derivation of the binding free energy, there are also clear differences in the use of the concept of the grid PMF, which allows treating any restraints within the general formalism expressed in relation (30). More extensive work is needed to determine when restraints in addition to those used in this work are necessary.
The outcomes of a simulation are also significantly influenced by other factors, including the quality of the model and the precision of docking. The accuracy of the method may also be affected by how accurately the ligand force field is parameterized. The limitations often associated with this kind of approach relate to sampling, which may vary from system to system; for instance, larger ligands may demand more sampling than smaller ligands. The starting structure of the bound state is also crucial: the more accurate the bound-state model, the more accurate the binding affinities will be.
Methods
Theoretical foundation
Binding affinity is often quantified using the equilibrium dissociation constant (Kd), defined as:
$$K_\mathrm{d} = \frac{[\mathrm{P}][\mathrm{L}]}{[\mathrm{P{:}L}]} \tag{1}$$
where [P], [L] and [P:L] are the concentrations of protein, ligand and the protein–ligand complex, respectively. Computationally, the absolute binding free energy (ΔG°), which is the standard molar free energy of binding, is more convenient to calculate. The dissociation constant and the absolute binding free energy are related via
$$\Delta G^\circ = RT \ln\left(\frac{K_\mathrm{d}}{1\ \mathrm{M}}\right) \tag{2}$$
where R is the gas constant, T is the temperature and 1 M is 1 molar concentration. Various strategies have been used to estimate ΔG°, some of which were briefly discussed above. The methodology proposed here has a notable resemblance to the stratification strategy of refs. 3,4. However, the two methods have major differences as will be discussed later in this section.
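As an illustration of relation (2), the following minimal Python sketch converts between ΔG° and Kd, assuming the standard form ΔG° = RT ln(Kd / 1 M). The function names and the default temperature of 298.15 K are assumptions made for this example and are not taken from the simulations or experiments described here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_from_dg(dg_kcal, temperature=298.15):
    """Return Kd in molar units from dG0 in kcal/mol: Kd = (1 M) exp(dG0 / RT).

    The default temperature is an assumption made for this sketch.
    """
    return math.exp(dg_kcal / (R_KCAL * temperature))


def dg_from_kd(kd_molar, temperature=298.15):
    """Return dG0 in kcal/mol from Kd in molar units: dG0 = RT ln(Kd / 1 M)."""
    return R_KCAL * temperature * math.log(kd_molar)


if __name__ == "__main__":
    kd = kd_from_dg(-7.88)
    print(f"Kd = {kd * 1e6:.2f} uM")
    print(f"dG0 = {dg_from_kd(kd):.2f} kcal/mol")
```

Under these assumptions, a ΔG° of −7.88 kcal mol−1 corresponds to a Kd in the low-micromolar range, and a change of roughly 1.4 kcal mol−1 in ΔG° shifts Kd by about one order of magnitude.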
Absolute binding free energy or ΔG° is the free-energy change associated with moving the ligand from the bulk to the binding pocket (Supplementary Table 1). Within the formalism presented in this work, ΔG° is determined from the grid PMF G(x), where x is the position of the ligand mass center relative to the center of the binding pocket (Supplementary Table 1) and G(x) is the PMF associated with the ligand position x. In practice, we need to bin the three-dimensional space and define the PMF at every bin or grid point as:
$$G(x) = -RT \ln p(x) \tag{3}$$
where p(x) is the probability of finding the ligand at bin x.
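The binning step can be sketched in Python as follows, assuming the usual form G(x) = −RT ln p(x) for relation (3). The function name `grid_pmf`, the 1 Å default bin width, the value of RT and the optional `weights` argument (for example, weights produced by a reweighting step) are all assumptions made for this illustration rather than details of the analysis used here.

```python
import numpy as np

RT = 0.593  # kcal/mol; assumed value of RT near 298 K


def grid_pmf(positions, bin_width=1.0, weights=None, rt=RT):
    """Estimate a grid PMF, relative to its minimum, from ligand positions.

    positions : (N, 3) array of ligand mass-center positions (hypothetical input).
    weights   : optional per-sample weights, e.g. from a reweighting step.
    Returns (delta_g, edges); bins that were never visited hold +inf.
    """
    positions = np.asarray(positions, dtype=float)
    lo = np.floor(positions.min(axis=0) / bin_width) * bin_width
    span = positions.max(axis=0) - lo
    n_bins = np.maximum(1, np.ceil(span / bin_width).astype(int))
    edges = [l + bin_width * np.arange(n + 1) for l, n in zip(lo, n_bins)]

    hist, _ = np.histogramdd(positions, bins=edges, weights=weights)
    p = hist / hist.sum()                  # probability of each grid bin
    with np.errstate(divide="ignore"):
        g = -rt * np.log(p)                # G(x) = -RT ln p(x)
    return g - g.min(), edges              # shift so the lowest bin is zero
```

Shifting by the minimum means the lowest-PMF bin sits at zero, which mirrors the convention that x = 0 is the grid point with the lowest grid PMF.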
We define ΔG(x) = G(x) − G(0), where x = 0 (that is, the center of the binding pocket) is defined as the grid point associated with the lowest grid PMF. One can show:
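$$\Delta G^\circ = -RT \ln\left(\frac{\sum_{x \in \text{pocket}} e^{-\Delta G(x)/RT}}{\sum_{x \in \text{bulk}} e^{-\Delta G(x)/RT}}\right) \tag{4}$$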
in which the binding ‘pocket’ refers to all x ∈ V where the ligand is considered bound and ‘bulk’ refers to all x ∈ V where the ligand is not interacting with the protein. V here is a subset of space with a single protein in standard concentration (that is, 1 M). As ΔG(x) is the same everywhere in the bulk, we can simplify relation (4) as follows:
5 |
where VB is the bulk volume per protein associated with the standard concentration, xB is any grid point in the bulk and VP is the binding pocket volume defined as:
$$V_\mathrm{P} = \int_{\text{pocket}} e^{-\Delta G(x)/RT}\, dx \tag{6}$$
Defining ΔGV as the contribution of the difference between the volume of the binding pocket and the bulk to the binding free energy:
$$\Delta G_V = -RT \ln\left(\frac{V_\mathrm{P}}{V_\mathrm{B}}\right) \tag{7}$$
Combining equations (5) and (7), we have:
8 |
We can find the bulk volume (VB) associated with the standard concentration for a single protein approximately as:
$$V_\mathrm{B} = \frac{1}{N_\mathrm{A}}\ \mathrm{L} \approx 1{,}661\ \text{Å}^3 \tag{9}$$
where NA is Avogadro’s constant and L is the unit of volume (litres). We can now rewrite ΔGV as:
10 |
in which ΔGB is the bulk volume contribution and ΔGP is the binding pocket contribution:
11 |
Determining both ΔG(xB) and ΔGP requires finding the grid PMF ΔG(x). ΔG(xB) is the PMF difference between the binding pocket center and the bulk and ΔGP also requires an estimate for ΔG(x) within the binding pocket. We therefore do not need to find ΔG(x) for all x if we have a good estimate for ΔG(x) within the binding pocket and in the bulk. Ideally, ΔG(x) for these points can be determined by pulling the ligand out of the binding pocket towards the bulk and using an enhanced sampling technique such as US to sample the space of a collective variable such as d, that is, the distance between the mass centers of the ligand and protein. ΔG(x) can be estimated for all sampled grid points x using this distance-based US simulation. Note that the collective variable used for biasing would be d, while the collective variable used for the PMF calculations would be the three-dimensional position vector of the mass center of ligand with respect to protein’s binding pocket center. One may estimate the grid PMF from the distance-based US simulations using a non-parametric reweighting algorithm as discussed in this section. ΔG(x) can also be used to estimate ΔGP as defined in relation (11). There is often no need to strictly define the binding pocket as only low ΔG(x) values have non-negligible contribution to VP and thus even if we include all sampled grid points, only those close to the binding pocket center have non-negligible contributions.
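The volume term can be sketched numerically as below. This is a minimal Python illustration under two stated assumptions: that the binding pocket volume is the Boltzmann-weighted sum of grid-bin volumes, VP ≈ Σ exp(−ΔG(x)/RT) δV, and that ΔGV = −RT ln(VP/VB) with VB taken as 1 L divided by Avogadro's constant. The function names, the `n_lowest` option and the value of RT are hypothetical choices for this example.

```python
import numpy as np

AVOGADRO = 6.02214076e23
V_BULK_A3 = 1e27 / AVOGADRO   # 1 L = 1e27 cubic angstroms, so about 1660.5 A^3


def pocket_volume(delta_g, bin_volume, rt=0.593, n_lowest=None):
    """Boltzmann-weighted pocket volume from grid PMF values (assumed form).

    delta_g    : grid PMF values relative to the pocket center, in kcal/mol.
    bin_volume : volume of one grid bin, in cubic angstroms.
    n_lowest   : if given, keep only the n lowest-PMF grid points.
    """
    dg = np.sort(np.asarray(delta_g, dtype=float).ravel())
    dg = dg[np.isfinite(dg)]              # drop bins that were never visited
    if n_lowest is not None:
        dg = dg[:n_lowest]
    return bin_volume * np.exp(-dg / rt).sum()


def volume_term(v_pocket, v_bulk=V_BULK_A3, rt=0.593):
    """dG_V = -RT ln(V_P / V_B), the assumed form of the volume contribution."""
    return -rt * np.log(v_pocket / v_bulk)
```

Because the weights decay exponentially with ΔG(x), calling `pocket_volume` with a small `n_lowest` and with all sampled grid points should give nearly the same result whenever the low-PMF points dominate, which is the reason a strict definition of the pocket is often unnecessary.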
A practical issue with determining ΔG(xB) is the convergence. The key obstacles for the sampling that slow down the convergence are the orientation of the ligand, and the conformational changes of the ligand and protein. Using an approach similar in spirit to the previously proposed stratification strategy3,4,24, we can circumvent extensive sampling of these degrees of freedom. Let us first focus on the orientation of the ligand (Ω), defined using the orientation quaternion formalism. We can restrain Ω during the distance-based US simulations using a harmonic biasing potential (where k is the harmonic force constant) and later correct the free-energy difference based on the PMF associated with Ω, which is different in the bulk (F(xB, Ω)) and in the binding pocket (F(0, Ω)). More generally, for any grid point x, we may determine ΔG(x) based on the PMF associated with Ω at x (F(x, Ω)) and 0 (F(0, Ω)):
12 |
Note that F(x, Ω) is the PMF associated with x and Ω, defined such that:
13 |
where c is an arbitrary constant. We therefore have:
14 |
We now define GΩ(x) as the grid PMF of the restrained system (by Ω):
15 |
We also define UΩ(x) as the average biasing potential at grid point x:
16 |
Now we have from relations (14), (15) and (16):
17 |
where, the free energy of grid point x from the center 0 (ΔG(x)) is calculated based on its equivalent free energy (ΔGΩ(x)) in a system biased by a harmonic restraint on Ω and a correction term ΔUΩ(x). For x = xB:
18 |
To determine the above ensemble averages, we need to determine the PMF along Ω for the bound and unbound ligand and calculate the ensemble averages analytically using relation (16). ΔGΩ(xB) can be determined from PMF calculations, where the distance between the protein and ligand is varied and the orientation of the ligand is restrained (distance-based BEUS with restrained orientation). We note that:
19 |
where we assume that ΔUΩ(x) is negligible for x within the binding pocket, that is, for x close to 0.
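The structure of the orientation correction can be summarized compactly. The block below is one self-consistent way to write it, reconstructed from the definitions given in the text and playing the role of relations (12), (15), (16) and (17); it is a sketch, not a transcription of the numbered relations. The harmonic form \(\tfrac{1}{2}k\Omega^2\) of the restraint and the sign convention chosen for \(U_\Omega\) are assumptions, and \(\Delta X(\mathbf{x})\) denotes \(X(\mathbf{x}) - X(\mathbf{0})\).

```latex
\begin{align}
\Delta G(\mathbf{x}) &= -k_{\mathrm{B}}T \,
  \ln \frac{\int e^{-\beta F(\mathbf{x},\Omega)}\,\mathrm{d}\Omega}
           {\int e^{-\beta F(\mathbf{0},\Omega)}\,\mathrm{d}\Omega}, \\
G_\Omega(\mathbf{x}) &= -k_{\mathrm{B}}T \,
  \ln \int e^{-\beta\left[F(\mathbf{x},\Omega) + \frac{1}{2}k\Omega^2\right]}\,\mathrm{d}\Omega, \\
U_\Omega(\mathbf{x}) &= -k_{\mathrm{B}}T \,
  \ln \left\langle e^{-\beta \frac{1}{2}k\Omega^2} \right\rangle_{\mathbf{x}}
  = -k_{\mathrm{B}}T \,
  \ln \frac{\int e^{-\beta\left[F(\mathbf{x},\Omega) + \frac{1}{2}k\Omega^2\right]}\,\mathrm{d}\Omega}
           {\int e^{-\beta F(\mathbf{x},\Omega)}\,\mathrm{d}\Omega}, \\
\Delta G(\mathbf{x}) &= \Delta G_\Omega(\mathbf{x}) - \Delta U_\Omega(\mathbf{x}).
\end{align}
```

With these definitions, \(G_\Omega(\mathbf{x}) - U_\Omega(\mathbf{x}) = -k_{\mathrm{B}}T \ln \int e^{-\beta F(\mathbf{x},\Omega)}\,\mathrm{d}\Omega\), so the last line follows by subtracting the same expression evaluated at \(\mathbf{0}\). If the opposite sign convention is adopted for \(U_\Omega\), the correction term enters with a plus sign instead.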
In brief, if we choose to restrain the orientation, our absolute binding-free-energy estimate includes the following terms (using relations (8) and (17)):
20 |
F(xB, Ω) can be calculated numerically from the orientation-angle distribution p(Ω) of a free ligand, where p(Ω) is determined from the distribution of Euler angles (ϕ, θ and ψ, where 0 ≤ ϕ, ψ ≤ 2π and 0 ≤ θ ≤ π), given that:
21 |
UΩ(xB) can then be calculated using relation (16) with the numerically estimated F(xB, Ω) and the k value used in the simulations. The numerical estimate was obtained by discretizing each of the three Euler angles with a bin width of 1°, for a total of 360 × 360 × 180 bins, to estimate p(Ω) from p(ϕ, θ, ψ). F(0, Ω) can be determined approximately using orientation-based US simulations of the bound ligand and can then be used to estimate UΩ(0) using relation (16).
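To illustrate how a restraint term of this kind can be evaluated for a free ligand, the following minimal sketch assumes an ideally isotropic ligand, z-x-z (or z-y-z) Euler angles, and the convention that the term equals −kBT ln⟨exp(−½kΩ²/kBT)⟩. These are assumptions made for illustration: the function names are hypothetical, and the sketch uses the analytic isotropic density p(Ω) ∝ 1 − cos Ω on a one-dimensional grid rather than the 360 × 360 × 180 Euler-angle histogram described above. A Monte Carlo sample is used only to check the Euler-angle-to-Ω mapping.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant, kcal/(mol K)


def rotation_angle_deg(phi, theta, psi):
    """Net rotation angle (0-180 degrees) for z-x-z or z-y-z Euler angles in radians."""
    # Scalar part of the unit quaternion; q and -q describe the same rotation.
    q0 = np.cos(theta / 2.0) * np.cos((phi + psi) / 2.0)
    return np.degrees(2.0 * np.arccos(np.clip(np.abs(q0), 0.0, 1.0)))


def restraint_term(omega_deg, p_omega, k, temperature=300.0):
    """-kT ln <exp(-k Omega^2 / (2 kT))> over p(Omega) given on a uniform grid."""
    kt = KB * temperature
    boltz = np.exp(-0.5 * k * omega_deg**2 / kt)
    return -kt * np.log(np.sum(boltz * p_omega) / np.sum(p_omega))


# Monte Carlo check of the angle mapping for uniformly distributed orientations.
rng = np.random.default_rng(0)
n = 200_000
phi = rng.uniform(0.0, 2.0 * np.pi, n)
psi = rng.uniform(0.0, 2.0 * np.pi, n)
theta = np.arccos(rng.uniform(-1.0, 1.0, n))  # uniform in cos(theta)
omega_mc = rotation_angle_deg(phi, theta, psi)

# Analytic isotropic density (unnormalized) on a fine grid in degrees.
omega = np.linspace(0.0, 180.0, 18001)
p_iso = 1.0 - np.cos(np.radians(omega))

# k = 0.5 kcal/(mol degree^2), the orientation force constant quoted in the Methods.
u_free = restraint_term(omega, p_iso, k=0.5)
print(f"mean Omega (MC): {omega_mc.mean():.1f} deg; restraint term: {u_free:.2f} kcal/mol")
```

In a real application, p(Ω) would come from the simulated free-ligand trajectories rather than from the isotropic formula, and a fine grid near Ω = 0 matters because the Boltzmann factor of a stiff restraint is only a degree or so wide.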
The above strategy can be extended to other degrees of freedom whose unbiased sampling may hinder convergence. Most notably, the internal conformational changes of the ligand and of the protein may also play a crucial role in slowing down convergence. In the following, we show how one can restrain not only the orientation of the ligand but also the RMSD of the ligand (denoted here by r) in distance-based US simulations (along d) to speed up convergence. In this case, the grid PMF difference ΔG(x) is calculated based on ΔGΩ,r(x), the grid PMF of a system whose Ω and r are both restrained:
22 |
Using a similar strategy as in relation (14), we have:
23 |
which results in:
24 |
Here we have defined GΩ,r(x) as:
25 |
where k′ is the harmonic force constant of the biasing potential on r. We also define Ur(x) similarly to UΩ(x) in relation (16), except for using r instead of Ω. The analogous correction term for Ω is also defined similarly to UΩ(x), except for the additional restraint on r:
26 |
Finally, we have:
27 |
In brief, if we choose to restrain both the orientation and RMSD, our absolute binding-free-energy estimate includes the following terms:
28 |
Here we are using an approximation similar to that in relation (19):
29 |
Using relations (20) and (28), we can generalize the stratification strategy to include three restraints on arbitrary collective variables α, β and γ:
30 |
where:
31 |
Isothermal titration calorimetry of hFGF1 with heparin hexasaccharide
ITC data were obtained using MicroCal iTC 200 (Malvern) with MicroCal Origin software. The change in heat during the biomolecular interaction was measured by titrating heparin (loaded in the syringe) into the hFGF1 solution in the calorimetric cell. Both the protein and the heparin samples were prepared in a buffer containing 10 mM phosphate and 100 mM NaCl at pH 7.2 and were degassed before loading. The protein-to-heparin ratio was maintained at 1:10, with a protein concentration of 100 μM and a heparin concentration of 1 mM. A total of 30 injections were conducted at a constant temperature of 25 °C and a stirring speed of 300 rpm. A one-set-of-sites binding model was used for the ITC binding curve68. The standard binding free energy ΔG° was determined from the dissociation constant via relation (2) at T = 25 °C. The experiment was repeated three times with the same sample, and the results were very similar. The mean and standard deviation are reported for both Kd and ΔG°.
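The conversion from a dissociation constant to a standard binding free energy can be sketched as follows, assuming relation (2) has the standard form ΔG° = RT ln(Kd/C°) with C° = 1 M. The function name is hypothetical and the Kd values are placeholders chosen for illustration only; they are not the measured data.

```python
import math
import statistics

R_KCAL = 0.0019872041  # gas constant, kcal/(mol K)


def standard_binding_free_energy(kd_molar, temperature_k=298.15, c0_molar=1.0):
    """Standard binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * temperature_k * math.log(kd_molar / c0_molar)


# Placeholder Kd values (M) for three hypothetical repeats; NOT the measured data.
kd_repeats = [1.0e-6, 1.2e-6, 0.9e-6]
dg_repeats = [standard_binding_free_energy(kd) for kd in kd_repeats]

dg_mean = statistics.mean(dg_repeats)
dg_sd = statistics.stdev(dg_repeats)  # sample standard deviation over repeats
print(f"dG = {dg_mean:.2f} +/- {dg_sd:.2f} kcal/mol")
```

Because ΔG° depends on Kd only logarithmically, a relative spread in Kd between repeats translates into an absolute spread in ΔG° of roughly RT times that relative spread.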
All-atom MD simulations
For our bound state, we used the X-ray crystal structure of dimeric hFGF1 in complex with heparin hexasaccharide (PDB 2AXM; resolution, 3.0 Å)69, and for our apo state, we used the X-ray crystal structure of monomeric hFGF1 (PDB 1RG8; resolution, 1.1 Å)70. NAMD 2.13 (ref. 71) was used to run the MD simulations. We energy-minimized the system for 10,000 steps using the conjugate gradient algorithm. We next relaxed the systems for 1 ns using stepwise restrained MD simulations generated with CHARMM-GUI72. All production runs were performed in an NPT (constant N, number of atoms; P, pressure; T, temperature) ensemble after an initial NVT (constant N, number of atoms; V, volume; T, temperature) relaxation. Simulations were run at 300 K with a 2 fs time step using a Langevin integrator with a 0.5 ps−1 damping coefficient. Nosé–Hoover–Langevin pistons were used to maintain a pressure of 1 atm72. Long-range electrostatic interactions were computed using the particle mesh Ewald approach. The initial runs lasted 15 ns and were followed by a production run on the Anton 2 supercomputer (Pittsburgh Supercomputing Center) for 4.8 μs with a 2.5 fs time step.
MD simulations of free heparin hexasaccharide
Heparin hexasaccharide69 was simulated in a rectangular water box without the protein. The system was set up as described previously in the ‘All-atom MD simulations’ section. The final conformation after relaxation was then used as the starting conformation for 10 production runs for 40 ns each. The total simulation time was around 400 ns.
SMD simulations
The final conformations of the hFGF1–heparin73, apo hFGF1 (ref. 73) and free heparin hexasaccharide equilibrium simulations were used to generate starting conformations for the non-equilibrium pulling simulations. Four collective variables74 were used for the SMD simulations75: (1) distance between the heavy-atom center of mass of heparin and that of the protein (d); (2) the orientation angle of heparin with respect to the protein (Ω) defined using the orientation quaternion formalism; (3) RMSD of the protein (rP); (4) RMSD of heparin (rL). Six independent sets of simulations were performed. The distance-based SMD simulation was run for 9.5 ns, while the orientation-based SMD simulation was run for 8 ns. The distance-based SMD simulation was used to pull the heparin away from the protein by approximately 30 Å (10 Å → 40 Å) with a force constant of 100 kcal (mol Å2)−1. The orientation angle was also restrained in this simulation with a force constant of 0.5 kcal (mol degree2)−1 to stay close to its initial orientation in the bound state. The orientation-based SMD simulation was used to rotate the bound heparin locally with respect to the protein (0° → 73°) with a force constant of 100 kcal (mol degree2)−1. Four RMSD-based SMD simulations were run for 10 ns each using a force constant of 50 kcal (mol Å2)−1: (1) to change the RMSD of the bound protein (0.5 Å → 2 Å) (the RMSD of heparin was restrained in this simulation with a force constant of 1 kcal (mol Å2)−1); (2) to change the RMSD of the bound heparin (1.5 Å → 4 Å); (3) to change the RMSD of the unbound protein (0.8 Å → 3.2 Å); (4) to change the RMSD of the free heparin (1.5 Å → 5.5 Å).
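As a rough illustration of what a distance-based SMD protocol of this kind implies, the sketch below assumes that the restraint center moves linearly in time and that the bias has the form ½k(d − d0(t))². Both are assumptions, and the function names are hypothetical; the numbers are the distance range, duration and force constant quoted above.

```python
import numpy as np


def smd_center(t_ns, start, end, duration_ns):
    """Restraint center at time t, assuming a linear schedule from start to end."""
    frac = np.clip(np.asarray(t_ns, dtype=float) / duration_ns, 0.0, 1.0)
    return start + frac * (end - start)


def harmonic_bias(value, center, k):
    """Harmonic bias energy, assuming the 1/2 k (value - center)^2 convention."""
    return 0.5 * k * (value - center) ** 2


# Distance-based pulling: 10 A -> 40 A over 9.5 ns, k = 100 kcal/(mol A^2).
times = np.linspace(0.0, 9.5, 5)
centers = smd_center(times, start=10.0, end=40.0, duration_ns=9.5)
pull_speed = (40.0 - 10.0) / 9.5  # A per ns

# Energy penalty for lagging 1 A behind the moving center.
lag_penalty = harmonic_bias(11.0, 10.0, k=100.0)
print(centers, round(pull_speed, 2), lag_penalty)
```

With a stiff force constant such as this, the collective variable follows the moving center closely, which is what makes the SMD frames usable as starting points for subsequent window-based sampling.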
BEUS simulations
BEUS53,76,77, which is a variation of the US simulation method, was performed to estimate grid PMF (Supplementary Fig. 1). Four independent sets of distance (d)-based BEUS simulations were performed, with no restraints, restraint on orientation angle of heparin with respect to the protein (Ω), restraint on RMSD of the ligand (rL) and RMSD of the protein (rP), and restraints on Ω, rL and rP. Two sets of BEUS simulations were also performed using the Ω collective variable, one with and one without a restraint on rL and rP. In addition, two sets of BEUS simulations were performed using the rP collective variable (bound protein with restraint on rL; unbound protein) and two sets were performed using the rL collective variable (bound ligand; free ligand). Selected SMD conformations were assigned to individual BEUS windows with equal spacing in each one of these BEUS simulations. The distance-based BEUS simulations ran for 10 ns with 31 replicas/windows and the orientation-based simulations ran for 10 ns with 30 replicas/windows. The RMSD-based BEUS simulations ran for 10 ns with 12 replicas/windows. The force constant used for ligand–protein distance (d) in distance-based BEUS was 2 kcal (mol Å2)−1 while the orientation was restrained as in SMD simulations using a force constant of 0.5 kcal (mol degree2)−1. For orientation-based BEUS simulations, the force constant for the ligand orientation angle (as in SMD simulations) was set to 0.5 kcal (mol degree2)−1. The force constant used for rL and rP in all cases was 1 kcal (mol Å2)−1. See Supplementary Fig. 1 for a schematic representation of these simulations.
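The assignment of SMD conformations to equally spaced windows can be sketched as follows. This is a minimal illustration with hypothetical function names and a synthetic SMD trace; it assumes that the 31 distance windows span the same 10–40 Å range as the pulling simulation, which the text does not state explicitly.

```python
import numpy as np


def window_centers(lo, hi, n_windows):
    """Equally spaced window centers between lo and hi (inclusive)."""
    return np.linspace(lo, hi, n_windows)


def pick_frames(cv_trajectory, centers):
    """Index of the trajectory frame whose CV value is closest to each center."""
    cv = np.asarray(cv_trajectory, dtype=float)
    return np.abs(cv[None, :] - centers[:, None]).argmin(axis=1)


# Synthetic stand-in for the distance recorded along an SMD run (hypothetical data).
rng = np.random.default_rng(0)
smd_distance = np.linspace(10.0, 40.0, 951) + rng.normal(0.0, 0.2, 951)

centers = window_centers(10.0, 40.0, 31)      # assumed range; 1 A spacing
frames = pick_frames(smd_distance, centers)   # one starting frame per window
print(list(zip(centers[:3], frames[:3])))
```

Each selected frame would then serve as the initial conformation of the replica whose harmonic restraint is centered at the corresponding window.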
Free-energy calculations using non-parametric reweighting
Once the BEUS simulations described above were converged, a non-parametric reweighting method76,78, which is somewhat similar to the multi-state Bennett acceptance ratio method79, was used to construct the PMF. In this method76, each sampled configuration will be assigned a weight, which can be used to construct the PMF in terms of a desired collective variable. Suppose that a system is biased (for instance, within a BEUS scheme) using N different biasing potentials Ui(r), where i = 1, …, N, and r represents all atomic coordinates. Typically, Ui(r) is a harmonic potential defined in terms of a collective variable with varying centers for different i. Assuming an equal number of sampled configurations from each of the N generated trajectories, we can combine them in a single set of samples {rk} (irrespective of which bias was used to generate each sample rk) and determine the weight of each sample (wk) as:
32 |
where c is a normalization constant chosen such that the weights sum to one, and both {wk} and the perturbed free energies {Fi} are determined iteratively using the above equation and the following:
33 |
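A minimal sketch of such a self-consistent reweighting is given below. It assumes that the two relations take the form wk = c / Σi exp(−β[Ui(rk) − Fi]) and exp(−βFi) = Σk wk exp(−βUi(rk)), which is the equal-sample-count form consistent with the description above; the function names and the one-dimensional toy system are hypothetical and are not the authors' implementation.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True)), axis=axis)


def nonparametric_weights(bias, beta=1.0, n_iter=2000, tol=1e-8):
    """bias[i, k] = U_i(r_k) for bias i and pooled sample k.

    Returns normalized sample weights w and free energies F (with F[0] = 0).
    """
    bias = np.asarray(bias, dtype=float)
    f = np.zeros(bias.shape[0])
    for _ in range(n_iter):
        # w_k proportional to 1 / sum_i exp(-beta (U_i(r_k) - F_i))
        log_w = -_logsumexp(-beta * (bias - f[:, None]), axis=0)
        log_w -= _logsumexp(log_w, axis=0)  # enforce sum_k w_k = 1
        # exp(-beta F_i) = sum_k w_k exp(-beta U_i(r_k))
        f_new = -_logsumexp(log_w[None, :] - beta * bias, axis=1) / beta
        f_new -= f_new[0]
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return np.exp(log_w), f


# Toy system: unbiased potential x^2/2 (kT = 1) sampled under 13 harmonic biases.
rng = np.random.default_rng(1)
kappa, n_per_window = 4.0, 1000
centers = np.linspace(-3.0, 3.0, 13)
x = np.concatenate([
    rng.normal(kappa * c / (1.0 + kappa), (1.0 + kappa) ** -0.5, n_per_window)
    for c in centers
])
bias = 0.5 * kappa * (x[None, :] - centers[:, None]) ** 2

w, f = nonparametric_weights(bias)
x2_unbiased = np.sum(w * x**2)  # should approach <x^2> = 1 of the unbiased system
print(round(x2_unbiased, 3))
```

The log-sum-exp form avoids overflow when bias energies are large compared with kBT, which is typical for stiff umbrella restraints.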
Converged wk values can be used to construct any ensemble average, including any PMF (for example, G(ζ), the PMF of the atomic system in the collective variable space ζ), in terms of not only the collective variable used for biasing but also any other collective variable that is sufficiently sampled. One may use a weighted histogram method to construct the PMF as follows:
34 |
35 |
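A weighted-histogram PMF of this kind can be sketched as below, assuming the PMF is obtained as −kBT ln of the weighted, normalized histogram along ζ. The function name and the toy data (uniformly weighted samples from a standard normal distribution, for which the PMF is ζ²/2 in units of kBT) are illustrative assumptions.

```python
import numpy as np


def weighted_pmf(cv, weights, bins, kt=1.0):
    """PMF along a collective variable from per-sample weights.

    Returns bin centers and the PMF shifted so that its minimum is zero;
    empty bins are reported as +inf.
    """
    hist, edges = np.histogram(cv, bins=bins, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        pmf = -kt * np.log(prob)
    pmf -= pmf[np.isfinite(pmf)].min()
    return centers, pmf


# Toy data: unbiased samples (uniform weights) from a harmonic well, kT = 1.
rng = np.random.default_rng(2)
zeta = rng.normal(0.0, 1.0, 100_000)
weights = np.full(zeta.size, 1.0 / zeta.size)

centers, pmf = weighted_pmf(zeta, weights, bins=np.linspace(-3.0, 3.0, 25))
print(np.round(pmf[10:14], 2))
```

The same function applies to a collective variable that was not biased directly, provided the reweighted samples cover it adequately; for a three-dimensional grid PMF, `np.histogramdd` could be used in place of `np.histogram`.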
To estimate the uncertainty of any of the PMF calculations described above, one may use bootstrapping. Here, we have used a block Bayesian bootstrapping technique77, in which 100 alternative datasets are resampled from the existing dataset, and the same non-parametric reweighting algorithm and PMF calculation are repeated for each set to generate 100 alternative PMFs. The standard deviation of the PMF at any point along the reaction coordinate provides an estimate of the error.
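The idea behind a block Bayesian bootstrap can be sketched as follows. The sketch assumes that consecutive samples are grouped into equal-size blocks and that each block receives a Dirichlet-distributed weight; the exact scheme of the cited technique may differ. The function name is hypothetical, and a simple weighted mean stands in for the full reweighting and PMF calculation.

```python
import numpy as np


def block_bayesian_bootstrap(n_samples, n_blocks, n_boot=100, seed=0):
    """Yield per-sample weights; samples in the same block share one Dirichlet weight."""
    rng = np.random.default_rng(seed)
    block_id = np.arange(n_samples) * n_blocks // n_samples
    for _ in range(n_boot):
        block_weights = rng.dirichlet(np.ones(n_blocks))
        w = block_weights[block_id]
        yield w / w.sum()


# Toy data standing in for a time series of some observable.
rng = np.random.default_rng(3)
data = rng.normal(0.0, 1.0, 10_000)

# Recompute the statistic for each of the 100 resampled datasets.
estimates = np.array([
    np.sum(w * data)
    for w in block_bayesian_bootstrap(data.size, n_blocks=50, n_boot=100)
])
error = estimates.std(ddof=1)
print(f"mean = {data.mean():.3f}, bootstrap error = {error:.3f}")
```

Resampling whole blocks rather than individual frames is meant to preserve time correlations within a trajectory, so the block length should exceed the correlation time of the observable.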
Alchemical FEP simulations
We used the BFEE2 package (ref. 27) to estimate the absolute binding free energy in silico via either an alchemical or a geometrical route, each consisting of multiple subprocesses with geometric restraints. Alchemical FEP simulations were performed to calculate the absolute binding free energy for the interaction of hFGF1 with heparin hexasaccharide. We used a double annihilation protocol80, wherein the heparin hexasaccharide is annihilated in both the free and bound states. The final conformations of the hFGF1–heparin complex73 and free heparin hexasaccharide equilibrium simulations (discussed previously in the ‘All-atom MD simulations’ section) were used to generate starting conformations for the bound hFGF1–heparin and free heparin FEP simulations, respectively. For the alchemical route, four separate simulations were performed: (1) coupling the restraints of seven collective variables in the bound state; (2) decoupling the ligand alchemically in the bound state; (3) coupling the ligand alchemically in the unbound state; (4) decoupling the conformational restraints in the unbound state. The FEP simulations 1 and 3 were performed bidirectionally using 200 λ-windows (λ is the coupling parameter associated with the FEP, which varies between 0 and 1). Each λ-window included 0.5 ns of equilibration and 5.0 ns of averaging for both the unbound and bound states, for a total of 2.3 μs (Supplementary Table 3). The decoupling FEP simulations 2 and 4 were also performed bidirectionally, each for 51 ns. All FEP simulations were performed using the NAMD 2.13 (ref. 71) simulation package with the CHARMM36m all-atom additive force field, using the protocol discussed previously for the equilibrium simulations. BFEE2 (ref. 27) was used to generate the input files and to analyze the FEP simulations.
Binding-free-energy calculations using geometrical route
The extended ABF technique with an umbrella integration estimator was used to calculate the free-energy change along the coarse variables required to characterize reversible heparin–hFGF1 binding3,24,27. We used BFEE2 (ref. 27) to generate the input files for these simulations. In the geometrical route, these collective variables are subjected to restraints, and the reversible work required to impose each restraint is determined by a sequence of accurate PMF simulations. The collective variables used here are the backbone RMSDs of the two proteins with respect to the reference, native conformation, the three Euler angles (Θ, Φ and Ψ) that describe their relative orientation, and the polar (θ) and azimuthal (φ) angles that describe their relative position27,81. The geometrical path consists of a sequence of separate PMF computations performed sequentially with the gradual inclusion of restraints (RMSD, Θ, Φ, Ψ, θ and φ), as shown in Supplementary Table 2. The simulation for each geometric collective variable (RMSD, Θ, Φ, Ψ, θ, φ and r) was run with 10 replicas per restraint. Each replica was simulated for 20 ns for RMSD, Θ, Φ, Ψ, θ and φ and for 40 ns for r, for a total of 1.6 μs. Here r = (1/β) ln(S*I*C°), where β = (kBT)−1, kB is the Boltzmann constant, T is the temperature and C° denotes the standard concentration of 1 M; I* is the separation term and S* is the surface term, which indicates the fraction of a sphere of radius r*, centered at the binding site of the reference protein, that is accessible to its partner. The BFEE2 (ref. 27) GUI was used to analyze the final ABF simulation data.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
This research is supported by National Science Foundation grants CHE 1945465 and OAC 1940188 (to M.M.) and the Arkansas Biosciences Institute (to M.M.). This work is also supported by the Department of Energy grant DE-FG02-01ER15161 (to M.M. and T.K.S.K.), the National Institutes of Health grants R15GM139140 (to M.M.), R01CA172631 (to T.K.S.K.), P30GM103450 (to T.K.S.K.) and P20GM139768 (to T.K.S.K.), and the Arkansas Integrative Metabolic Research Center at the University of Arkansas (to T.K.S.K.). Anton 2 computer time was provided by the Pittsburgh Supercomputing Center (PSC) through Grant R01GM116961 from the National Institutes of Health. The Anton 2 machine at PSC was generously made available by D. E. Shaw Research. This research is also part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois. This work also used the Extreme Science and Engineering Discovery Environment (allocation MCB150129 to M.M.), which is supported by National Science Foundation grant number ACI-1548562. This research is also supported by the Arkansas High Performance Computing Center, which is funded through multiple National Science Foundation grants and the Arkansas Economic Development Commission.
Author contributions
M.M. designed the research. V.G.K. and A.P. performed the simulations and analyzed the simulation data. T.K.S.K. designed the experiments. S.A. performed the experiments and analyzed the experimental data. V.G.K., A.P., M.M., T.K.S.K. and S.A. wrote the manuscript.
Peer review
Peer review information
Nature Computational Science thanks Jeffry Setiadi, Richard Henchman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.
Data availability
Datasets related to this article are deposited in the Zenodo repository82. Source data for Figs. 1–5 are available with this paper. The Protein Data Bank (https://www.rcsb.org/) was used to obtain the crystal structures 2AXM (ref. 69) and 1RG8 (ref. 70).
Code availability
All scripts as well as the full source code for non-parametric reweighting can be obtained from Zenodo82.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s43588-022-00389-9.
References
- 1. Mobley DL, Gilson MK. Predicting binding free energies: frontiers and benchmarks. Annu. Rev. Biophys. 2017;46:531–558. doi: 10.1146/annurev-biophys-070816-033654.
- 2. Wan S, Bhati AP, Zasada SJ, Coveney PV. Rapid, accurate, precise and reproducible ligand–protein binding free energy prediction: binding free energy prediction. Interface Focus. 2020;10:20200007. doi: 10.1098/rsfs.2020.0007.
- 3. Woo HJ, Roux B. Calculation of absolute protein–ligand binding free energy from computer simulations. Proc. Natl Acad. Sci. USA. 2005;102:6825–6830. doi: 10.1073/pnas.0409005102.
- 4. Gumbart JC, Roux B, Chipot C. Standard binding free energies from computer simulations: what is the best strategy? J. Chem. Theory Comput. 2013;9:794–802. doi: 10.1021/ct3008099.
- 5. Siebenmorgen T, Zacharias M. Computational prediction of protein–protein binding affinities. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2020;10:e1448. doi: 10.1002/wcms.1448.
- 6. Du X, et al. Insights into protein–ligand interactions: mechanisms, models, and methods. Int. J. Mol. Sci. 2016;17:144. doi: 10.3390/ijms17020144.
- 7. Fenley AT, Henriksen NM, Muddana HS, Gilson MK. Bridging calorimetry and simulation through precise calculations of cucurbituril-guest binding enthalpies. J. Chem. Theory Comput. 2014;10:4069–4078. doi: 10.1021/ct5004109.
- 8. Talhout R, Villa A, Mark AE, Engberts JBFN. Understanding binding affinity: a combined isothermal titration calorimetry/molecular dynamics study of the binding of a series of hydrophobically modified benzamidinium chloride inhibitors to trypsin. J. Am. Chem. Soc. 2003;125:10570–10579. doi: 10.1021/ja034676g.
- 9. Weiss S. Measuring conformational dynamics of biomolecules by single molecule fluorescence spectroscopy. Nat. Struct. Biol. 2000;7:724–729. doi: 10.1038/78941.
- 10. Rossi AM, Taylor CW. Analysis of protein–ligand interactions by fluorescence polarization. Nat. Protoc. 2011;6:365–387. doi: 10.1038/nprot.2011.305.
- 11. Huang D, Caflisch A. Efficient evaluation of binding free energy using continuum electrostatics solvation. J. Med. Chem. 2004;47:5791–5797. doi: 10.1021/jm049726m.
- 12. Rodinger T, Howell PL, Pomès R. Absolute free energy calculations by thermodynamic integration in four spatial dimensions. J. Chem. Phys. 2005;123:034104. doi: 10.1063/1.1946750.
- 13. Ytreberg FM, Zuckerman DM. Simple estimation of absolute free energies for biomolecules. J. Chem. Phys. 2006;124:104105. doi: 10.1063/1.2174008.
- 14. Rodinger T, Howell PL, Pomès R. Calculation of absolute protein–ligand binding free energy using distributed replica sampling. J. Chem. Phys. 2008;129:155102. doi: 10.1063/1.2989800.
- 15. Doudou S, Burton NA, Henchman RH. Standard free energy of binding from a one-dimensional potential of mean force. J. Chem. Theory Comput. 2009;5:909–918. doi: 10.1021/ct8002354.
- 16. Jiang W, Roux B. Free energy perturbation Hamiltonian replica-exchange molecular dynamics (FEP/H-REMD) for absolute ligand binding free energy calculations. J. Chem. Theory Comput. 2010;6:2559–2565. doi: 10.1021/ct1001768.
- 17. General IJ, Dragomirova R, Meirovitch H. Absolute free energy of binding of avidin/biotin, revisited. J. Phys. Chem. B. 2012;116:6628–6636. doi: 10.1021/jp212276m.
- 18. Fu H, et al. Accurate determination of protein:ligand standard binding free energies from molecular dynamics simulations. Nat. Protoc. 2022;17:1114–1141. doi: 10.1038/s41596-021-00676-1.
- 19. Zhang C, Liu S, Zhu Q, Zhou Y. A knowledge-based energy function for protein–ligand, protein–protein, and protein–DNA complexes. J. Med. Chem. 2005;48:2325–2335. doi: 10.1021/jm049314d.
- 20. Chéron JB, Zacharias M, Antonczak S, Fiorucci S. Update of the ATTRACT force field for the prediction of protein–protein binding affinity. J. Comput. Chem. 2017;38:1887–1890. doi: 10.1002/jcc.24836.
- 21. Lensink MF, Wodak SJ. Docking and scoring protein interactions: CAPRI 2009. Proteins Struct. Funct. Bioinf. 2010;78:3073–3084. doi: 10.1002/prot.22818.
- 22. Srinivasan J, Cheatham TE, Cieplak P, Kollman PA, Case DA. Continuum solvent studies of the stability of DNA, RNA, and phosphoramidate-DNA helices. J. Am. Chem. Soc. 1998;120:9401–9409. doi: 10.1021/ja981844+.
- 23. Wang C, et al. Calculating protein–ligand binding affinities with MMPBSA: method and error analysis. J. Comput. Chem. 2016;37:2436–2446. doi: 10.1002/jcc.24467.
- 24. Fu H, et al. BFEE: a user-friendly graphical interface facilitating absolute binding free-energy calculations. J. Chem. Inf. Model. 2018;58:556–560. doi: 10.1021/acs.jcim.7b00695.
- 25. Chipot C. Frontiers in free-energy calculations of biological systems. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2014;4:71–89. doi: 10.1002/wcms.1157.
- 26. Chodera JD, Mobley DL. Entropy–enthalpy compensation: role and ramifications in biomolecular ligand recognition and design. Annu. Rev. Biophys. 2013;42:121–142. doi: 10.1146/annurev-biophys-083012-130318.
- 27. Fu H, Chen H, Cai W, Shao X, Chipot C. BFEE2: automated, streamlined, and accurate absolute binding free-energy calculations. J. Chem. Inf. Model. 2021;61:2116–2123. doi: 10.1021/acs.jcim.1c00269.
- 28. Ali HS, Chakravorty A, Kalayan J, de Visser SP, Henchman RH. Energy–entropy method using multiscale cell correlation to calculate binding free energies in the SAMPL8 host–guest challenge. J. Comput. Aided Mol. Des. 2021;35:911–921. doi: 10.1007/s10822-021-00406-5.
- 29. Kollman P. Free energy calculations: applications to chemical and biochemical phenomena. Chem. Rev. 1993;93:2395–2417. doi: 10.1021/cr00023a004.
- 30. Gilson MK, Given JA, Bush BL, McCammon JA. The statistical–thermodynamic basis for computation of binding affinities: a critical review. Biophys. J. 1997;72:1047–1069. doi: 10.1016/S0006-3495(97)78756-3.
- 31. Hermans J, Wang L. Inclusion of loss of translational and rotational freedom in theoretical estimates of free energies of binding. Application to a complex of benzene and mutant T4 lysozyme. J. Am. Chem. Soc. 1997;119:2707–2714. doi: 10.1021/ja963568+.
- 32. Tuckerman ME. Free Energy Calculations: Theory and Applications in Chemistry and Biology Springer Series in Chemical Physics, 86 Edited by Christophe Chipot (Université Henri Poincaré Vandoeuvre-lès-Nancy, France) and Andrew Pohorille (University of California, San Francisco, USA) J. Am. Chem. Soc. 2007;129:10963–10964. doi: 10.1021/ja076952n.
- 33. Fratev F, Sirimulla S. An improved free energy perturbation FEP+ sampling protocol for flexible ligand-binding domains. Sci. Rep. 2019;9:16829. doi: 10.1038/s41598-019-53133-1.
- 34. Jorgensen WL. Free-energy calculations: a breakthrough for modeling organic chemistry in solutions. Acc. Chem. Res. 1989;22:184–189. doi: 10.1021/ar00161a004.
- 35. Boresch S, Tettinger F, Leitgeb M, Karplus M. Absolute binding free energies: a quantitative approach for their calculation. J. Phys. Chem. B. 2003;107:9535–9551. doi: 10.1021/jp0217839.
- 36. Hermans J, Shankar S. The free energy of xenon binding to myoglobin from molecular dynamics simulation. Isr. J. Chem. 1986;27:225–227. doi: 10.1002/ijch.198600032.
- 37. Roux B, Nina M, Pomès R, Smith JC. Thermodynamic stability of water molecules in the bacteriorhodopsin proton channel: a molecular dynamics free energy perturbation study. Biophys. J. 1996;71:670–681. doi: 10.1016/S0006-3495(96)79267-6.
- 38. Fujitani H, et al. Direct calculation of the binding free energies of FKBP ligands. J. Chem. Phys. 2005;123:084108. doi: 10.1063/1.1999637.
- 39. Dixit SB, Chipot C. Can absolute free energies of association be estimated from molecular mechanical simulations? The biotin–streptavidin system revisited. J. Phys. Chem. A. 2001;105:9795–9799. doi: 10.1021/jp011878v.
- 40. Deng Y, Roux B. Calculation of standard binding free energies: aromatic molecules in the T4 lysozyme L99A mutant. J. Chem. Theory Comput. 2006;2:1255–1273. doi: 10.1021/ct060037v.
- 41. Deng Y, Roux B. Computations of standard binding free energies with molecular dynamics simulations. J. Phys. Chem. B. 2009;113:2234–2246. doi: 10.1021/jp807701h.
- 42. Karplus M, McCammon JA. Molecular dynamics simulations of biomolecules. Nat. Struct. Biol. 2002;9:646–652. doi: 10.1038/nsb0902-646.
- 43. Dror RO, Dirks RM, Grossman JP, Xu H, Shaw DE. Biomolecular simulation: a computational microscope for molecular biology. Annu. Rev. Biophys. 2012;41:429–452. doi: 10.1146/annurev-biophys-042910-155245.
- 44. Immadisetty K, Hettige J, Moradi M. What can and cannot be learned from molecular dynamics simulations of bacterial proton-coupled oligopeptide transporter GkPOT? J. Phys. Chem. B. 2017;121:3644–3656. doi: 10.1021/acs.jpcb.6b09733.
- 45. van Gunsteren WF, Mark AE. Validation of molecular dynamics simulation. J. Chem. Phys. 1998;108:6109–6116. doi: 10.1063/1.476021.
- 46. van Gunsteren WF, Dolenc J, Mark AE. Molecular simulation as an aid to experimentalists. Curr. Opin. Struct. Biol. 2008;18:149–153. doi: 10.1016/j.sbi.2007.12.007.
- 47. Zuckerman DM, Chong LT. Weighted ensemble simulation: review of methodology, applications, and software. Annu. Rev. Biophys. 2017;46:43–57. doi: 10.1146/annurev-biophys-070816-033834.
- 48. Zwier MC, et al. WESTPA: an interoperable, highly scalable software package for weighted ensemble simulation and analysis. J. Chem. Theory Comput. 2015;11:800–809. doi: 10.1021/ct5010615.
- 49. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 1992;13:1011–1021. doi: 10.1002/jcc.540130812.
- 50. Souaille M, Roux B. Extension to the weighted histogram analysis method: combining umbrella sampling with free energy calculations. Comput. Phys. Commun. 2001;135:40–57. doi: 10.1016/S0010-4655(00)00215-0.
- 51. Luitz M, Bomblies R, Ostermeir K, Zacharias M. Exploring biomolecular dynamics and interactions using advanced sampling methods. J. Phys. Condens. Matter. 2015;27:323101. doi: 10.1088/0953-8984/27/32/323101.
- 52. Kokubo H, Tanaka T, Okamoto Y. Ab Initio prediction of protein-ligand binding structures by replica-exchange umbrella sampling simulations. J. Comput. Chem. 2011;32:2810–2821. doi: 10.1002/jcc.21860.
- 53. Moradi M, Tajkhorshid E. Mechanistic picture for conformational transition of a membrane transporter at atomic resolution. Proc. Natl Acad. Sci. USA. 2013;110:18916–18921. doi: 10.1073/pnas.1313202110.
- 54. Kästner J, Thiel W. Bridging the gap between thermodynamic integration and umbrella sampling provides a novel analysis method: ‘umbrella integration’. J. Chem. Phys. 2005;123:144104. doi: 10.1063/1.2052648.
- 55. Barducci A, Bussi G, Parrinello M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 2008;100:020603. doi: 10.1103/PhysRevLett.100.020603.
- 56. Comer J, et al. The adaptive biasing force method: everything you always wanted to know but were afraid to ask. J. Phys. Chem. B. 2015;119:1129–1151. doi: 10.1021/jp506633n.
- 57. Gumbart JC, Roux B, Chipot C. Efficient determination of protein-protein standard binding free energies from first principles. J. Chem. Theory Comput. 2013;9:3789–3798. doi: 10.1021/ct400273t.
- 58. Eswarakumar VP, Lax I, Schlessinger J. Cellular signaling by fibroblast growth factor receptors. Cytokine Growth Factor Rev. 2005;16:139–149. doi: 10.1016/j.cytogfr.2005.01.001.
- 59. Beenken A, Mohammadi M. The FGF family: biology, pathophysiology and therapy. Nat. Rev. Drug Discov. 2009. doi: 10.1038/nrd2792.
- 60. Kuro-o M. Endocrine FGFs and Klothos: emerging concepts. Trends Endocrinol. Metab. 2008;19:239–245. doi: 10.1016/j.tem.2008.06.002.
- 61. Ornitz DM, et al. FGF binding and FGF receptor activation by synthetic heparan-derived di- and trisaccharides. Science. 1995;268:432–436. doi: 10.1126/science.7536345.
- 62. Culajay JF, Blaber SI, Khurana A, Blaber M. Thermodynamic characterization of mutants of human fibroblast growth factor 1 with an increased physiological half-life. Biochemistry. 2000;39:7153–7158. doi: 10.1021/bi9927742.
- 63. Babik S, Samsonov SA, Pisabarro MT. Computational drill down on FGF1–heparin interactions through methodological evaluation. Glycoconj. J. 2017;34:427–440. doi: 10.1007/s10719-016-9745-4.
- 64. Carter EP, Fearon AE, Grose RP. Careless talk costs lives: fibroblast growth factor receptor signalling and the consequences of pathway malfunction. Trends Cell Biol. 2015;25:221–233. doi: 10.1016/j.tcb.2014.11.003.
- 65. Goetz R, Mohammadi M. Exploring mechanisms of FGF signalling through the lens of structural biology. Nat. Rev. Mol. Cell Biol. 2013. doi: 10.1038/nrm3528.
- 66. Bojarski KK, Sieradzan AK, Samsonov SA. Molecular dynamics insights into protein-glycosaminoglycan systems from microsecond-scale simulations. Biopolymers. 2019;110:23252. doi: 10.1002/bip.23252.
- 67. Genheden S, Ryde U. The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities. Expert Opin. Drug Discov. 2015;10:449–461. doi: 10.1517/17460441.2015.1032936.
- 68. Le VH, Buscaglia R, Chaires JB, Lewis EA. Modeling complex equilibria in isothermal titration calorimetry experiments: thermodynamic parameters estimation for a three-binding-site model. Anal. Biochem. 2013;434:233–241. doi: 10.1016/j.ab.2012.11.030.
- 69. DiGabriele AD, et al. Structure of a heparin-linked biologically active dimer of fibroblast growth factor. Nature. 1998;393:812–817. doi: 10.1038/31741.
- 70. Bernett MJ, Somasundaram T, Blaber M. An atomic resolution structure for human fibroblast growth factor 1. Proteins Struct. Funct. Genet. 2004;57:626–634. doi: 10.1002/prot.20239.
- 71. Lee J, et al. CHARMM-GUI input generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM simulations using the CHARMM36 additive force field. J. Chem. Theory Comput. 2016;12:405–413. doi: 10.1021/acs.jctc.5b00935.
- 72. Govind Kumar V, Agrawal S, Kumar TKS, Moradi M. Mechanistic picture for monomeric human fibroblast growth factor 1 stabilization by heparin binding. J. Phys. Chem. B. 2021;125:12690–12697. doi: 10.1021/acs.jpcb.1c07772.
- 73. Fiorin G, Klein ML, Hénin J. Using collective variables to drive molecular dynamics simulations. Mol. Phys. 2013;111:3345–3362. doi: 10.1080/00268976.2013.813594.
- 74. Izrailev S, Stepaniants S, Balsera M, Oono Y, Schulten K. Molecular dynamics study of unbinding of the avidin–biotin complex. Biophys. J. 1997;72:1568–1581. doi: 10.1016/S0006-3495(97)78804-0.
- 75. Moradi M, Tajkhorshid E. Computational recipe for efficient description of large-scale conformational changes in biomolecular systems. J. Chem. Theory Comput. 2014;10:2866–2880. doi: 10.1021/ct5002285.
- 76. Moradi M, Enkavi G, Tajkhorshid E. Atomic-level characterization of transport cycle thermodynamics in the glycerol-3-phosphate:phosphate transporter. Nat. Commun. 2015;6:8393. doi: 10.1038/ncomms9393.
- 77.Bartels C. Analyzing biased Monte Carlo and molecular dynamics simulations. Chem. Phys. Lett. 2000;331:446–454. doi: 10.1016/S0009-2614(00)01215-X. [DOI] [Google Scholar]
- 78.Shirts MR, Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 2008;129:124105. doi: 10.1063/1.2978177. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Gilson MK, Given JA, Bush BL, McCammon JA. The statistical–thermodynamic basis for computation of binding affinities: a critical review. Biophys. J. 1997;72:1047–1069. doi: 10.1016/S0006-3495(97)78756-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Phillips JC, et al. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005;26:1781–1802. doi: 10.1002/jcc.20289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Coderc De Lacam EG, Blazhynska M, Chen H, Gumbart JC, Chipot C. When the dust has settled: calculation of binding affinities from first principles for SARS-CoV-2 variants with quantitative accuracy. J. Chem. Theory Comput. 2022;18:5890–5900. doi: 10.1021/acs.jctc.2c00604. [DOI] [PubMed] [Google Scholar]
- 82.Kumar, V. G., Polasa, A., Agrawal, S., Kumar, T. K. S. & Moradi, M. Binding affinity estimation from restrained umbrella sampling simulations. Zenodo10.5281/zenodo.7348705 (2022). [DOI] [PMC free article] [PubMed]
Data Availability Statement
Datasets related to this article are deposited in the Zenodo repository82. Source data for Figs. 1–5 are available with this paper. The crystal structures 2AXM (ref. 69) and 1RG8 (ref. 70) were obtained from the Protein Data Bank (https://www.rcsb.org/).
All scripts as well as the full source code for non-parametric reweighting can be obtained from Zenodo82.