Comparing Alchemical and Physical Pathway Methods for Computing the Absolute Binding Free Energy of Charged Ligands

Nanjie Deng; Di Cui; Bin W Zhang; Junchao Xia; Jeffrey Cruz; Ronald Levy

doi:10.1039/c8cp01524d

Author manuscript; available in PMC: 2019 Jun 27.

Published in final edited form as: Phys Chem Chem Phys. 2018 Jun 27;20(25):17081–17092. doi: 10.1039/c8cp01524d


PMCID: PMC6061996 NIHMSID: NIHMS975177 PMID: 29896599

Abstract

Accurately predicting absolute binding free energies of protein-ligand complexes is important as a fundamental problem in both computational biophysics and pharmaceutical discovery. Calculating binding free energies for charged ligands is generally considered to be challenging because of the strong electrostatic interactions between the ligand and its environment in the aqueous solution. In this work, we compare the performance of the potential of mean force (PMF) method and double decoupling method (DDM) for computing absolute binding free energies for charged ligands. We first clarify an unresolved issue concerning the use of the binding site volume explicitly to define the complexed state in DDM together with the use of harmonic restraints. We also provide an alternative derivation for the formula for absolute binding free energy using the PMF approach. We use these formulas to compute the binding free energy of charged ligands at an allosteric site of HIV-1 integrase, which has emerged in recent years as a promising target for developing antiviral therapy. As compared with the experimental results, the absolute binding free energies obtained by using the PMF approach show unsigned errors of 1.5–3.4 kcal/mol, which are somewhat better than the results from DDM (unsigned errors of 1.6–4.3 kcal/mol) using the same amount of CPU time. According to the DDM decomposition of the binding free energy, the ligand binding appears to be dominated by nonpolar interactions despite the presence of very large and favorable intermolecular ligand-receptor electrostatic interactions, which are almost completely cancelled out by the equally large free energy cost of desolvation of the charged moiety of the ligands in solution. We discuss the relative strengths of computing absolute binding free energies using the alchemical and physical pathway methods.

INTRODUCTION

The ability to accurately predict protein-ligand binding affinity is important for improving our understanding of molecular recognition and for rational drug design.^1,2 In recent years, significant progress in methodology, force fields and computer speed has improved the accuracy of relative binding free energy calculations using the free energy perturbation (FEP)^3–5 method; for ligands that are chemically not too dissimilar, an accuracy of ~1–2 kcal/mol in the relative binding free energy can be expected.⁶ However, for ligands that differ significantly in chemistry and/or bind with very different binding modes, absolute binding free energy methods may be a better approach for computing their binding affinities.

Currently, two methods are primarily used for computing absolute binding free energies in explicit solvent (but see also references that used other approaches^7–9): the alchemical double decoupling method (DDM)^10–12 and the potential of mean force (PMF) approach, which relies on physical pathways to move the ligand between bound and unbound states.^13–15 DDM has been applied to a number of protein-ligand complexes in which errors mostly fall in the range 0.5–5 kcal/mol, i.e. somewhat larger than those of FEP for the relative binding free energy. Compared with the increasing use of DDM in recent years, the PMF method has been employed less widely, with the reported deviations from experiment spanning a wide range of 0.2 kcal/mol to 10 kcal/mol.^13,14,16,17 However, the PMF approach may have an advantage over DDM for computing binding free energies involving charged ligands.¹³ Such ligands have very large, favorable solvation free energies, which need to be balanced by similarly large and favorable receptor-ligand electrostatic interactions for binding to occur. The accurate calculation of the two very large electrostatic decoupling terms can be challenging. In addition, because DDM requires alchemically decoupling the charged ligand from its environment, additional calculations may be necessary to correct the errors caused by the use of a finite-sized, periodic solvent box.^18,19 Such complications are expected to be less important in the PMF approach, where the ligand and receptor remain in the same solvent box throughout the pulling simulation. Moreover, PMF may be the only practical method for computing the absolute binding free energy of very large ligands such as those found in protein-peptide, protein-protein or protein-DNA complexes,^20–22 where the alchemical decoupling of either binding partner would be computationally prohibitive.

In this study, we compare the performance of PMF and DDM in computing the absolute binding free energy of charged ligands with a protein in order to understand the important factors that affect the accuracy of the computation in the two methods. The receptor target is an allosteric site in the catalytic core domain dimer interface of HIV-1 integrase, which has emerged in recent years as a promising drug target for developing antiviral therapy.^23–26 We also resolve an issue concerning the role of the binding site volume V_site in the DDM expression for the absolute binding free energy. The original landmark paper on DDM used a step indicator function to specify a binding site volume V_site for the definition of the complexed state.¹⁰ More recent DDM work discarded the step function and instead used strong harmonic restraints to confine the ligand movement to within a small volume in the binding pocket in order to accelerate sampling;^11,12 in this approach the effective binding site volume V_site is defined implicitly, which may cause a problem when treating weakly bound ligands. In addition, it is in principle necessary to explicitly include V_site in the absolute binding free energy expression in order to make a direct comparison with experimental binding affinity measurements, even though the numerical value of $Δ G_{bind}^{°}$ is insensitive to the precise definition of V_site for reasonably strong binders.^10,27 We also include electrostatic corrections for the effects of using a finite-size periodic solvent box on the DDM-calculated absolute binding free energies. Lastly, we provide an alternative derivation of the formula for computing the absolute binding free energy using PMF, and compare it with the previous formula.¹³

METHODS

A DDM expression that explicitly includes V_site in addition to harmonic restraint

In DDM, the absolute binding free energy $Δ G_{bind}^{°}$ is evaluated by inserting intermediate states into the statistical mechanical expression^10,11

Δ G_{bind}^{o} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N} Z_{0, N}}{Z_{R, N} Z_{L, N}}

(1)

Here $V = \frac{1}{C}$ is the volume of the simulation system. $V_{0} = \frac{1}{C^{o}} = 1660 Å^{3}$ is the inverse of the standard concentration of $C^{o} = 1 M$ solution. $Z_{X, N}$ is the configuration integral ( $Z = \int e^{- U (r) / k_{B} T} d r$ ) of a single solute X solvated in a box of volume V containing N waters. Thus, $Z_{R L, N}$ represents a system containing one receptor-ligand complex in the bound state solvated by N water molecules. $Z_{0, N}$ represents a system of pure solvent.
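The value of V₀ quoted above follows directly from unit conversion. The short sketch below reproduces it; the function name `standard_volume_A3` is an illustrative choice, not something defined in the paper.

```python
AVOGADRO = 6.02214076e23      # molecules per mole (exact SI value)
ANGSTROM3_PER_LITER = 1.0e27  # 1 L = 1e-3 m^3 = 1e27 Å^3


def standard_volume_A3(conc_molar=1.0):
    """Volume per molecule (Å^3) at the given molar concentration."""
    return ANGSTROM3_PER_LITER / (conc_molar * AVOGADRO)


v0 = standard_volume_A3(1.0)
print(f"V0 = {v0:.1f} Å^3")  # prints "V0 = 1660.5 Å^3"
```

The result rounds to the 1660 Å³ used in Eq. (1).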

To define the receptor-ligand complexed state, the original DDM paper¹⁰ uses a step indicator function I(ξ) acting on the ligand center ξ to specify whether a given configuration is bound or not, and this defines an effective binding site volume V_site ( $V_{site} = \int I (ξ) d ξ$, with $I (ξ) = 1$ when the ligand is inside the pocket and $I (ξ) = 0$ outside). V_site can be related to the binding signal in an experiment or to the onset of the plateau in the PMF. More recent DDM papers^11,12 did not explicitly include a geometric criterion to define the receptor-ligand complex; instead, harmonic restraints $U_{restr} (ξ)$ are applied in one or more steps to confine the ligand to within a small volume inside the binding site. This helps accelerate the sampling of configurations. The binding site volume is implicitly defined by the configurational space explored by the fully coupled ligand in the absence of the harmonic restraints. This implicit definition of the complex has since been widely used and works well for reasonably strong binders. However, the binding site volume explored by a bound ligand in unrestrained MD may not be equivalent to the V_site that is consistent with experiment. This could make a direct comparison between the computed and experimental $Δ G_{bind}^{°}$ less rigorous. In certain problems such an implicit definition of the bound state can be problematic: to compute the free energy of restraining the ligand-receptor distance using the harmonic restraint $U_{restr} (ξ)$, FEP is typically used, starting with the λ_restr = 0 simulation window in which the ligand-receptor distance is unrestrained. A weakly bound ligand, such as those encountered in fragment-based design, has a finite chance of escaping from the binding pocket during the unrestrained λ_restr = 0 window of the FEP.

To avoid such potential complications, which arise when the bound state is ill defined, and to show formally how V_site enters into the DDM expression, we explicitly define the complexed state using V_site by applying a flat-bottomed restraint to implement the step function I(ξ) from the beginning (including λ_restr = 0). Here we provide a modified expression for $Δ G_{bind}^{°}$ that takes into account this explicit definition of the complex. The configurational integral of the complexed state is

Z_{R L, N} = Z_{R L, N}^{V_{site}} = \int I (ξ) e^{- U (r_{R}, r_{L}, r_{w}, ξ) / k_{B} T} d r_{R} d r_{L} d r_{w} d ξ

(2)

where r_R, r_L, and r_w represent the internal coordinates of the receptor atoms, the internal coordinates of the ligand, and the solvent degrees of freedom, respectively, and ξ is the position of the ligand center relative to the receptor binding site. Accordingly, Eq. (1) can be evaluated by inserting intermediate states

Δ G_{bind}^{o} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N}^{V_{site}} Z_{0, N}}{Z_{R, N} Z_{L, N}} = - k_{B} T \ln (\frac{V}{V_{0}} \frac{Z_{R L, N}^{V_{site}}}{Z_{R \dots L, N}^{V_{site}}} \frac{Z_{R \dots L, N}^{V_{site}}}{Z_{R, N \dots L_{gas}}^{V_{site}}} \frac{Z_{R, N \dots L_{gas}}^{V_{site}}}{Z_{R, N + L_{gas}}^{V_{site}}} \frac{Z_{R, N + L_{gas}}^{V_{site}}}{Z_{R, N} Z_{L_{gas}}} \frac{Z_{L_{gas}} Z_{0, N}}{Z_{L, N}})

(3)

Here, the superscript V_site is used to emphasize that the ligand center is confined to within the binding site volume by the flat-bottomed potential which implements $I (ξ)$ . $Z_{R \dots L, N}^{V_{site}}$ is the configuration integral of a solvated ligand-receptor complex in the fully coupled state, in which the ligand is harmonically restrained by $U_{restr} (ξ)$ , in addition to the step function I(ξ). $Z_{R, N \dots L_{gas}}^{V_{site}}$ is the configuration integral of a bound complex in which the ligand is restrained by both I(ξ) and $U_{restr} (ξ)$ but its nonbonded interactions with its environment, i.e. receptor plus solvent, are turned off. $Z_{R, N + L_{gas}}^{V_{site}}$ represents a bound complex in which the ligand is not interacting with its environment but is confined to the receptor binding site V_site only by the step function I(ξ). $Z_{L_{gas}}$ describes an ideal-gas phase ligand and $Z_{0, N}$ represents N pure solvent molecules.

Thus, Eq. (3) allows $Δ G_{bind}^{°}$ to be computed as

Δ G_{bind}^{°} = - Δ G_{restr}^{bound} - Δ G_{decoupl}^{bound} + Δ G_{restr}^{gas} + Δ G_{decoupl}^{bulk}

(4)

Each term in Eq. (4) can be computed from simulation or analytically:

- Δ G_{restr}^{bound} = - k_{B} T \ln \frac{Z_{R L, N}^{V_{site}}}{Z_{R \dots L, N}^{V_{site}}}

(4.1)

- Δ G_{decoupl}^{bound} = - k_{B} T \ln \frac{Z_{R \dots L, N}^{V_{site}}}{Z_{R, N \dots L_{gas}}^{V_{site}}}

(4.2)

Δ G_{restr}^{gas} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R, N \dots L_{gas}}^{V_{site}}}{Z_{R, N + L_{gas}}^{V_{site}}} \frac{Z_{R, N + L_{gas}}^{V_{site}}}{Z_{R, N} Z_{L_{gas}}}

(4.3)

Δ G_{decoupl}^{bulk} = - k_{B} T \ln \frac{Z_{L_{gas}} Z_{0, N}}{Z_{L, N}}

(4.4)

In Eq. (4.1), $Δ G_{restr}^{bound}$ is the free energy of switching on the ligand-receptor harmonic distance restraint when the ligand is fully coupled to its environment (receptor and solvent). This term can be readily computed by using FEP (or thermodynamic integration, TI) in a series of λ_restr windows. Since the magnitude of this term is often small, it may be obtained from a single simulation of the bound complex (i.e. λ_restr=0) using the Zwanzig formula³, $Δ G_{restr}^{bound} = k_{B} T \ln {\langle e^{U_{restr} (ξ) / k_{B} T} \rangle}_{U_{restr}}$ . In Eq. (4.2), $Δ G_{decoupl}^{bound}$ is the free energy of turning off the nonbonded interactions between the ligand and its environment, with the ligand remaining in the binding site because of the ligand-receptor harmonic distance restraint; this term can be computed using FEP or TI by simulating the decoupling process. In principle, the calculations of both $- Δ G_{restr}^{bound}$ and $- Δ G_{decoupl}^{bound}$ are done in the presence of the flat-bottomed restraint I(ξ) on the ligand center ξ during the FEP or TI simulations. In Eq. (4.3), $Δ G_{restr}^{gas}$ is the free energy of switching on the ligand-receptor harmonic distance restraint $U_{restr} (ξ)$ and the flat-bottomed restraint I(ξ) that force the gas-phase ligand to remain inside the receptor binding site; this term is evaluated analytically as

Δ G_{restr}^{gas} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R, N \dots L_{gas}}^{V_{site}}}{Z_{R, N + L_{gas}}^{V_{site}}} \frac{Z_{R, N + L_{gas}}^{V_{site}}}{Z_{R, N} Z_{L_{gas}}} = - k_{B} T \ln \frac{V}{V_{0}} - k_{B} T \ln \frac{\int_{ξ \in V_{site}} e^{- U_{restr} (ξ) / k_{B} T} d ξ}{V_{site}} - k_{B} T \ln \frac{V_{site}}{V} = - k_{B} T \ln \frac{\int_{ξ \in V_{site}} e^{- U_{restr} (ξ) / k_{B} T} d ξ}{V_{0}}

(5)

since it is straightforward to show that $\frac{Z_{R, N \dots L_{gas}}^{V_{site}}}{Z_{R, N + L_{gas}}^{V_{site}}} = \frac{\int_{ξ \in V_{site}} e^{- U_{restr} (ξ) / k_{B} T} d ξ}{V_{site}}$ and $\frac{Z_{R, N + L_{gas}}^{V_{site}}}{Z_{R, N} Z_{L_{gas}}} = \frac{V_{site}}{V}$ .

Note that the system volume V drops out from the final expression of $Δ G_{bind}^{°}$ , as it should, since in general $Δ G_{bind}^{°}$ can only depend on the standard state concentration $C^{o} = 1 / V_{0}$ and V_site, in addition to the non-ideal terms calculated by Eq. (4.1), (4.2) and (4.4). It can be shown that when the force constant k_r in $U_{restr} (ξ)$ is sufficiently large, or when V_site is sufficiently large, i.e. $V_{site}^{1 / 3} > {(\frac{2 k_{B} T}{k_{r}})}^{1 / 2}$ , we can write

\int_{ξ \in V_{site}} e^{- U_{restr} (ξ) / k_{B} T} d ξ \approx 4 π \int_{0}^{\infty} Δ r^{2} e^{- \frac{1}{2} k_{r} Δ r^{2} / k_{B} T} d Δ r = {(\frac{2 π k_{B} T}{k_{r}})}^{\frac{3}{2}}

(6)

which can be considered the effective binding site volume explored by the harmonically restrained gas-phase ligand. Note that except for very weak binders, when V_site or k_r is properly chosen, $Δ G_{b i n d}^{°}$ becomes independent of the exact values for these parameters.^10,11,27
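As a numerical illustration of Eqs. (5) and (6), the sketch below compares the closed-form effective volume with a direct radial integration and converts it into the restraint free energy. The force constant (10 kcal/mol/Å²) and the temperature (300 K) are hypothetical values chosen for illustration only; they are not taken from this work, and the function names are likewise illustrative.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)
V0 = 1660.5        # standard-state volume in Å^3 (1 M), as in the text


def effective_volume(k_r, temperature=300.0):
    """Eq. (6): (2*pi*kB*T/k_r)^(3/2) in Å^3 for U_restr = 0.5*k_r*dr^2."""
    return (2.0 * math.pi * KB * temperature / k_r) ** 1.5


def radial_integral(k_r, temperature=300.0, r_max=10.0, n=20000):
    """Midpoint-rule value of 4*pi * int_0^r_max r^2 exp(-U_restr/kT) dr."""
    kt = KB * temperature
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += r * r * math.exp(-0.5 * k_r * r * r / kt)
    return 4.0 * math.pi * total * dr


def dg_restr_gas(k_r, temperature=300.0):
    """Eq. (5) with Eq. (6) substituted: -kB*T*ln(V_eff/V0), in kcal/mol."""
    kt = KB * temperature
    return -kt * math.log(effective_volume(k_r, temperature) / V0)


k_r = 10.0  # hypothetical force constant, kcal/(mol Å^2)
print(effective_volume(k_r), radial_integral(k_r), dg_restr_gas(k_r))
```

Because the effective volume is far smaller than V₀ for any reasonably stiff restraint, the resulting free energy is positive, and it grows logarithmically with k_r.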

Lastly, $- Δ G_{decoupl}^{bulk}$ in Eq. (4.4) is the free energy of inserting the gas-phase ligand into the solvent. This term is the ligand solvation free energy, which can be calculated using TI.
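For readers unfamiliar with TI, the estimate reduces to a one-dimensional quadrature of the ensemble-averaged derivative ⟨∂U/∂λ⟩ over the coupling parameter. The following is a minimal sketch using the trapezoidal rule; the λ grid and derivative values in the usage lines are synthetic placeholders, not data from this work, and the function name is an illustrative choice.

```python
def ti_free_energy(lambdas, mean_dudl):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas   : increasing coupling-parameter values, one per window
    mean_dudl : ensemble average of dU/dlambda in each window (kcal/mol)
    """
    if len(lambdas) != len(mean_dudl) or len(lambdas) < 2:
        raise ValueError("need matching lists with at least two windows")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * (mean_dudl[i] + mean_dudl[i + 1]) * width
    return total


# Synthetic example: a derivative that is linear in lambda.
lams = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl = [10.0 * lam for lam in lams]
print(ti_free_energy(lams, dudl))
```

The sign of the result depends on the direction of the λ schedule: if λ runs from the coupled to the decoupled state, the integral gives the decoupling free energy of Eq. (4.4), and its negative is the solvation free energy.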

For the ligands studied in this work, we did not apply the flat-bottom restraint I(ξ) in either the DDM or the PMF simulations, because all the ligands remain bound on the MD timescale of tens of nanoseconds during the $λ_{restr} = 0$ (coupled and unrestrained) simulation. Applying the flat-bottom restraint I(ξ) would therefore not change the terms computed in Eq. (4.1) and (4.2). The only effect of I(ξ) on the binding free energy is through $Δ G_{restr}^{gas}$ in Eq. (4.3), which is calculated analytically in Eq. (5) and Eq. (6). The magnitude of the change due to I(ξ) is negligibly small unless V_site is chosen to be unreasonably small. For example, for KF134, suppose the upper limit of the integral in Eq. (6), which corresponds to the boundary of V_site, is chosen to be just 2 Å beyond the minimum in the PMF curve (Fig. 2). With such a restrictive V_site, the corresponding change in the free energy due to the flat-bottom restraint I(ξ) is still only 0.06 kcal/mol.
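The size of such a truncation effect can be estimated in closed form if one assumes, for illustration, that V_site is a sphere of radius R centered on the harmonic restraint minimum. The integral in Eq. (6) cut off at R is then the full Gaussian volume multiplied by the fraction of a three-dimensional isotropic Gaussian lying inside R. The sketch below evaluates that fraction and the resulting free energy change; the spherical geometry, the force constant and the temperature are assumptions of this example, and it does not attempt to reproduce the 0.06 kcal/mol quoted above.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def fraction_inside(radius, k_r, temperature=300.0):
    """Fraction of exp(-0.5*k_r*r^2/kT) (3D) contained within |r| < radius."""
    sigma = math.sqrt(KB * temperature / k_r)  # per-axis width in Å
    x = radius / sigma
    return math.erf(x / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * x * math.exp(-0.5 * x * x)


def dg_truncation(radius, k_r, temperature=300.0):
    """Change in the gas-phase restraint term from cutting Eq. (6) at radius."""
    return -KB * temperature * math.log(fraction_inside(radius, k_r, temperature))


# Hypothetical restraint: k_r = 1 kcal/(mol Å^2), site boundary at 2 Å.
print(dg_truncation(2.0, 1.0))
```

The correction vanishes once R exceeds a few multiples of (k_B T/k_r)^{1/2}, which is the sense in which the result is insensitive to V_site.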

Figure 2. — The PMF computed from independent umbrella sampling simulations for the protein-ligand complexes containing KF134, KF116 and BI-224436.

An alternative derivation for the expression for the absolute binding free energy using the PMF approach

To derive an expression for $Δ G_{bind}^{°}$ in the PMF approach,¹³ we also start from Eq. (1).^10,11,28 The denominator of the logarithm on the right-hand side of Eq. (1) is the product of two configurational integrals, $Z_{R, N}$ and $Z_{L, N}$ , which represent a box V with receptor R and N solvent molecules, and a box V with ligand L and N solvent molecules, respectively. When the solvent box V is sufficiently large, we can transfer L from its own solvent box to the solvent box containing R without affecting the product of the two configurational integrals. After the transfer, R and L are in one solvent box V, and remain dissociated. The original solvent box V from which L is transferred now contains only N solvent molecules. This transformation is equivalent to rewriting the denominator in the logarithm of the right-hand side of Eq. (1) as

Δ G_{bind}^{°} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N} Z_{0, N}}{Z_{R, N} Z_{L, N}} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N} Z_{0, N}}{Z_{R + L, N} Z_{0, N}}

(7)

Here $Z_{R + L, N}$ represents the configuration integral of the box V now containing receptor R, the unbound L and N solvent molecules. Now the two $Z_{0, N}$ in the numerator and the denominator cancel and we obtain

Δ G_{bind}^{°} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N} Z_{0, N}}{Z_{R + L, N} Z_{0, N}} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N}}{Z_{R + L, N}}

(8)

To evaluate the right-hand side of Eq. (8) we insert intermediate states

Δ G_{bind}^{°} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N}}{Z_{R + L, N}} = - k_{B} T \ln \frac{V}{V_{0}} \frac{Z_{R L, N}}{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}} \frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}} \frac{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L, N}}

(9)

Here $Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}$ represents a complex RL in which ligand L’s external degrees of freedom, which are defined by the polar angles (θ, ϕ) and three Euler angles (Θ, Φ, Ψ), are restrained to their equilibrium values in the bound state by a set of harmonic restraints: $U_{θ} = \frac{1}{2} k_{θ} {(θ - θ_{0})}^{2}, U_{ϕ} = \frac{1}{2} k_{ϕ} {(ϕ - ϕ_{0})}^{2}, U_{Θ} = \frac{1}{2} k_{Θ} {(Θ - Θ_{0})}^{2}, U_{Φ} = \frac{1}{2} k_{Φ} {(Φ - Φ_{0})}^{2}$ and $U_{Ψ} = \frac{1}{2} k_{Ψ} {(Ψ - Ψ_{0})}^{2}$ . Fig. 1 gives the definition of the polar angles (θ, ϕ) and three Euler angles (Θ, Φ, Ψ) which specify the ligand orientation relative to the receptor. $Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}$ represents a system in which the ligand L is not only subject to the polar and orientational restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ , but is also subject to the harmonic restraint $U_{r^{*}} = \frac{1}{2} k_{r} {(r - r^{*})}^{2}$ , which forces the ligand atom A to be close to a bulk location r*.

Figure 1. — The coordinate frame in which the orientation and the position of the ligand relative to the receptor are defined. The ligand position is defined in the polar coordinates by the distance r_aA and two polar angles θ: b-a-A;ϕ: c-b-a-A. The ligand orientation is defined by the three Euler angles: Θ: a-A-B; Ф: a-A-B-C; Ψ: b-a-A-B.
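The six external coordinates defined in Fig. 1 are ordinary distances, bond angles and dihedral angles between three receptor anchor atoms (a, b, c) and three ligand anchor atoms (A, B, C). The sketch below computes them from Cartesian coordinates. The helper names and the sample coordinates are illustrative, and the dihedral sign follows the common atan2 convention, which may differ from the convention of a particular MD package.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def _norm(p):
    return math.sqrt(_dot(p, p))


def angle_deg(p0, p1, p2):
    """Angle p0-p1-p2 (vertex at p1) in degrees."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def ligand_pose(a, b, c, A, B, C):
    """Distance and five angles of Fig. 1 from the six anchor positions."""
    return {
        "r": _norm(_sub(A, a)),           # distance a-A
        "theta": angle_deg(b, a, A),      # polar angle b-a-A
        "phi": dihedral_deg(c, b, a, A),  # polar dihedral c-b-a-A
        "Theta": angle_deg(a, A, B),      # Euler angle a-A-B
        "Phi": dihedral_deg(a, A, B, C),  # Euler dihedral a-A-B-C
        "Psi": dihedral_deg(b, a, A, B),  # Euler dihedral b-a-A-B
    }


pose = ligand_pose((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 3), (0, 1, 3), (1, 1, 3))
print(pose)
```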

The different terms inside the logarithm of the right-hand side of Eq. (9) correspond to the following thermodynamic transformations:

The first term inside the logarithm in the right-hand side of Eq. (9) $\frac{Z_{R L, N}}{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}$ corresponds to the free energy of switching on the angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ on the ligand in the bound state, i.e.

\frac{Z_{R L, N}}{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}} = e^{Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound} / k_{B} T}

(10.1)

Here $Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound}$ is the free energy change of slowly switching on the polar angular and orientational restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ . It is computed in this work using 24 λ windows: 12 λ windows (λ = 0.0, 0.01, 0.025, 0.05, 0.075, 0.1, 0.2, 0.35, 0.5, 0.65, 0.8, 1.0) are used to switch on the polar angle restraints ( $U_{θ}, U_{ϕ}$ ), followed by another 12 λ windows that switch on the ligand orientational restraints $(U_{Θ}, U_{Φ}, U_{Ψ})$ . In this work the free energy change in Eq. (10.1) is computed by analyzing the data obtained from the λ-window simulations using the Bennett acceptance ratio (BAR) method²⁹ (FEP or TI can also be used). The definition of V_site is implicit in Eq. (10.1) in the same way as it is for the DDM calculation; namely, it is defined by the volume accessible to the ligand in the binding site of the receptor as the set of five angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ is turned on.
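The restraint free energy over a λ schedule such as the one above can be illustrated with the simplest estimator, forward exponential averaging (the Zwanzig formula) between consecutive windows. This is a swapped-in simplification: the work described here uses BAR, which combines forward and reverse energy differences and is generally more efficient. The sketch also assumes that the restraint energy is scaled linearly in λ, which is an assumption of this example rather than a statement about the simulations; the function names are illustrative.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def fep_restraint_free_energy(lambdas, u_restr_samples, temperature=300.0):
    """Sum of forward Zwanzig estimates between consecutive lambda windows.

    lambdas         : increasing lambda values, one per window
    u_restr_samples : u_restr_samples[i] holds the full-strength restraint
                      energy (kcal/mol) on frames sampled in window i;
                      assumes U(lambda) = U_0 + lambda * U_restr
    """
    kt = KB * temperature
    total = 0.0
    for i in range(len(lambdas) - 1):
        dlam = lambdas[i + 1] - lambdas[i]
        exponents = [-dlam * u / kt for u in u_restr_samples[i]]
        total += -kt * log_mean_exp(exponents)
    return total


# Lambda schedule quoted in the text; the energies are synthetic placeholders.
schedule = [0.0, 0.01, 0.025, 0.05, 0.075, 0.1, 0.2, 0.35, 0.5, 0.65, 0.8, 1.0]
fake_energies = [[2.0, 2.0, 2.0] for _ in schedule]
print(fep_restraint_free_energy(schedule, fake_energies))
```

With constant synthetic energies the estimate collapses to that constant, which makes a convenient sanity check; real trajectories would supply a distribution of restraint energies in each window.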

The second term inside the logarithm of the right-hand side of Eq. (9) is $\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}$ . In the system represented by the denominator $Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}$ , L is held at r* in the bulk solution by the distance restraint $U_{r^{*}} = \frac{1}{2} k_{r} {(r - r^{*})}^{2}$ . In addition, the orientation of L relative to R is held near its equilibrium value by the angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ . Using the 1D potential of mean force (PMF) w(r) along the axis r_aA, we can compute the ratio $\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}$ . Let w(r) be the 1D PMF as a function of $r = | r_{a A} |$ . The PMF function w(r), which encompasses both the complexed and the unbound state, can be computed using umbrella sampling simulations of the ligand in the presence of the angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ . The PMF w(r) is defined as

e^{- w (r) / k_{B} T} = \int d r_{R} d r_{L} d r_{W} δ (r_{a A} - r) e^{- H (r_{R}, r_{L}, r_{W}) / k_{B} T}

(10.2.1)

Here, the H(r_R,r_L,r_W) includes both the normal interactions involving the receptor, ligand and solvent molecules, and the angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ . Note that the numerator in $\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}$ is

Z_{R L (θ, ϕ, Θ, Φ, Ψ), N} = \int_{r_{a A} \in V_{site}} d r_{R} d r_{L} d r_{W} e^{- H (r_{R}, r_{L}, r_{W}) / k_{B} T}

(10.2.2)

which can be written as

Z_{R L (θ, ϕ, Θ, Φ, Ψ), N} = \int_{r_{a A} \in V_{site}} d r_{R} d r_{L} d r_{W} e^{- \frac{H (r_{R}, r_{L}, r_{W})}{k_{B} T}} = \int_{r \in V_{site}} d r \int d r_{R} d r_{L} d r_{W} δ (r_{a A} - r) e^{- \frac{H (r_{R}, r_{L}, r_{W})}{k_{B} T}}

(10.2.3)

Substituting Eq. (10.2.1) into Eq. (10.2.3) yields

Z_{R L (θ, ϕ, Θ, Φ, Ψ), N} = \int_{r \in V_{site}} d r e^{- w (r) / k_{B} T}

(10.2.4)

We now simplify the denominator of $\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}$ , which is

Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N} = \int_{}^{} d r_{R} d r_{L} d r_{W} e^{- H (r_{R}, r_{L}, r_{W}) / k_{B} T} e^{- U (r_{a A} - r^{*}) / k_{B} T}

(10.2.5)

For every $r_{a A} = r$ , the corresponding contribution to the integral in the right-hand side of Eq. (10.2.5) is $e^{- U (r - r^{*}) / k_{B} T} \int d r_{R} d r_{L} d r_{W} δ (r_{a A} - r) e^{- H (r_{R}, r_{L}, r_{W}) / k_{B} T}$ ; therefore, Eq. (10.2.5) can be written as

Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N} = \int d r_{R} d r_{L} d r_{W} e^{- \frac{H (r_{R}, r_{L}, r_{W})}{k_{B} T}} e^{- \frac{U (r_{a A} - r^{*})}{k_{B} T}} = \int d r e^{- \frac{U (r - r^{*})}{k_{B} T}} \int d r_{R} d r_{L} d r_{W} δ (r_{a A} - r) e^{- \frac{H (r_{R}, r_{L}, r_{W})}{k_{B} T}}

(10.2.6)

Substituting Eq. (10.2.1) into Eq. (10.2.6), we obtain

Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N} = \int_{}^{} d r e^{- \frac{U (r - r^{*})}{k_{B} T}} \int d r_{R} d r_{L} d r_{w} δ (r_{a A} - r) e^{- \frac{H (r_{R}, r_{L}, r_{w})}{k_{B} T}} = \int_{}^{} d r e^{- U (r - r^{*}) / k_{B} T} e^{- w (r) / k_{B} T}

(10.2.7)

Combining Eq. (10.2.4) and Eq. (10.2.7) yields

\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}} = \frac{\int_{bound} e^{- w (r) / k_{B} T} d r}{\int_{0}^{\infty} e^{- w (r) / k_{B} T} e^{- U_{r^{*}} / k_{B} T} d r} = \frac{\int_{r_{min}}^{r_{site}} e^{- w (r) / k_{B} T} d r}{\int_{0}^{\infty} e^{- w (r) / k_{B} T} e^{- \frac{1}{2} k_{r} {(r - r^{*})}^{2} / k_{B} T} d r}

(10.2a)

Here the line integral $\int_{bound} e^{- w (r) / k_{B} T} d r$ spans the range of r that corresponds to the bound state. Therefore, the upper limit of the integration equals r_site, where the ligand-receptor interaction becomes negligibly small (see Fig. 2, right panel). The lower limit of the integration corresponds to the smallest distance recorded in the PMF (Fig. 2), denoted as r_min.

To simplify the right-hand side of Eq. (10.2a), note that in the denominator, w(r) can be taken outside the integral $\int_{0}^{\infty} e^{- w (r) / k_{B} T} e^{- \frac{1}{2} k_{r} {(r - r^{*})}^{2} / k_{B} T} d r$ as a constant. This is because the integral contains the Gaussian function $e^{- \frac{1}{2} k_{r} {(r - r^{*})}^{2} / k_{B} T}$ centered around r*; when the force constant k_r is sufficiently large, the Gaussian function acts like a δ-function, i.e. it is sharply peaked within a small region around r* and vanishingly small everywhere else. Therefore, in this very small neighborhood of r*, w(r) is essentially constant with $w (r) \approx w (r^{*})$ , and can be taken outside the integral as a constant in Eq. (10.2a).
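The quality of this δ-function approximation can be checked numerically for a bulk-like PMF. The sketch below assumes w(r) = -k_B T ln r² (the pure Jacobian form expected in bulk under the polar restraints), so that e^{-w/k_B T} = r², and compares the full integral with e^{-w(r*)/k_B T} (2π k_B T/k_r)^{1/2}. The values of r*, k_r and the temperature are hypothetical.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def restrained_bulk_integral(r_star, k_r, temperature=300.0, n=20000):
    """Midpoint-rule value of int r^2 exp(-0.5*k_r*(r-r*)^2/kT) dr."""
    kt = KB * temperature
    sigma = math.sqrt(kt / k_r)
    lo = max(0.0, r_star - 10.0 * sigma)
    hi = r_star + 10.0 * sigma
    dr = (hi - lo) / n
    total = 0.0
    for i in range(n):
        r = lo + (i + 0.5) * dr
        total += r * r * math.exp(-0.5 * k_r * (r - r_star) ** 2 / kt)
    return total * dr


def delta_function_estimate(r_star, k_r, temperature=300.0):
    """exp(-w(r*)/kT) * sqrt(2*pi*kT/k_r) with exp(-w/kT) = r^2."""
    kt = KB * temperature
    return r_star ** 2 * math.sqrt(2.0 * math.pi * kt / k_r)


r_star, k_r = 20.0, 10.0  # hypothetical: Å and kcal/(mol Å^2)
full = restrained_bulk_integral(r_star, k_r)
approx = delta_function_estimate(r_star, k_r)
print((full - approx) / approx)  # relative error of the approximation
```

For this PMF the relative error is k_B T/(k_r r*²), so the approximation improves with a stiffer restraint and a more distant bulk location.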

Thus, Eq. (10.2a) becomes

\frac{Z_{R L (θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}} = \frac{\int_{bound} e^{- w (r) / k_{B} T} d r}{\int_{0}^{\infty} e^{- w (r) / k_{B} T} e^{- \frac{1}{2} k_{r} {(r - r^{*})}^{2} / k_{B} T} d r} = \frac{\int_{bound} e^{- w (r) / k_{B} T} d r}{e^{- w (r^{*}) / k_{B} T} \int_{- \infty}^{\infty} e^{- \frac{1}{2} k_{r} {(r - r^{*})}^{2} / k_{B} T} d (r - r^{*})} = \frac{\int_{bound} e^{- [w (r) - w (r^{*})] / k_{B} T} d r}{\int_{- \infty}^{\infty} e^{- \frac{1}{2} k_{r} {(Δ r)}^{2} / k_{B} T} d Δ r} = \frac{\int_{bound} e^{- [w (r) - w (r^{*})] / k_{B} T} d r}{{(\frac{2 π k_{B} T}{k_{r}})}^{\frac{1}{2}}}

(10.2)
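Given a PMF tabulated on a grid of distances, the ratio in Eq. (10.2) reduces to a one-dimensional quadrature over the bound region divided by the Gaussian width. The following sketch uses the trapezoidal rule; the function names, the grid and the flat test PMF are illustrative placeholders, and the caller is assumed to supply r_min, r_site and w(r*) as described in the text.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def bound_state_integral(r, w, w_star, r_min, r_site, temperature=300.0):
    """Trapezoidal value of int_bound exp(-[w(r) - w(r*)]/kT) dr, in Å.

    r, w   : PMF grid (Å) and values (kcal/mol), r increasing
    w_star : PMF value at the bulk location r*
    Only grid intervals lying entirely inside [r_min, r_site] are used.
    """
    kt = KB * temperature
    total = 0.0
    for i in range(len(r) - 1):
        if r[i] < r_min or r[i + 1] > r_site:
            continue
        f0 = math.exp(-(w[i] - w_star) / kt)
        f1 = math.exp(-(w[i + 1] - w_star) / kt)
        total += 0.5 * (f0 + f1) * (r[i + 1] - r[i])
    return total


def pmf_term(r, w, w_star, r_min, r_site, k_r, temperature=300.0):
    """-kT ln of the ratio in Eq. (10.2), in kcal/mol."""
    kt = KB * temperature
    width = math.sqrt(2.0 * math.pi * kt / k_r)
    integral = bound_state_integral(r, w, w_star, r_min, r_site, temperature)
    return -kt * math.log(integral / width)


# Placeholder PMF: flat profile on a 0.1 Å grid, bound region 2 to 5 Å.
grid = [i / 10 for i in range(101)]
flat = [0.0] * len(grid)
print(bound_state_integral(grid, flat, 0.0, 2.0, 5.0))
```

For a flat PMF the integral is simply the length of the bound interval, which gives a quick check that the grid handling is correct before real umbrella-sampling output is used.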

Finally, the last term inside the logarithm of the right hand side of Eq. (9), $\frac{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L, N}}$ , represents the free energy of removing all the distance and angular restraints $(U_{r^{*}}, U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ on the bulk ligand such that the ligand is allowed to explore the system volume V and rotate freely. When the force constants used in the harmonic restraints are sufficiently strong, which is the case in this work, the rigid-rotor approximation applies, allowing this term to be evaluated analytically as¹¹

\frac{Z_{R + L (r^{*}, θ, ϕ, Θ, Φ, Ψ), N}}{Z_{R + L, N}} \approx \frac{{r^{*}}^{2} \sin θ_{0} \sin Θ_{0} {(2 π k_{B} T)}^{3}}{8 π^{2} V {(k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}}

(10.3)

where θ₀ and $Θ_{0}$ are the equilibrium values of θ and Θ, respectively, and $k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ}$ in the denominator is the product of the force constants for the corresponding harmonic restraints on (r, θ, ϕ, Θ, Φ, Ψ).

Combining the above Eq. (10.1), (10.2), (10.3) and Eq. (9), we obtain

Δ G_{bind}^{°} = - Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound} - k_{B} T \ln \frac{\int_{bound} e^{- [w (r) - w (r^{*})] / k_{B} T} d r}{{(\frac{2 π k_{B} T}{k_{r}})}^{\frac{1}{2}}} - k_{B} T \ln \frac{C^{o} {r^{*}}^{2} \sin θ_{0} \sin Θ_{0} {(2 π k_{B} T)}^{3}}{8 π^{2} {(k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}} = - Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound} - k_{B} T \ln \frac{e^{w (r^{*}) / k_{B} T} \int_{bound} e^{- w (r) / k_{B} T} d r}{{(\frac{2 π k_{B} T}{k_{r}})}^{\frac{1}{2}}} - k_{B} T \ln \frac{C^{o} {r^{*}}^{2} \sin θ_{0} \sin Θ_{0} {(2 π k_{B} T)}^{3}}{8 π^{2} {(k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}} = - Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound} - w (r^{*}) - k_{B} T \ln \frac{\int_{bound} e^{- w (r) / k_{B} T} d r}{{(\frac{2 π k_{B} T}{k_{r}})}^{\frac{1}{2}}} + Δ G_{restr}^{bulk}

(11)

where $Δ G_{restr}^{bulk} = - k_{B} T \ln \frac{C^{o} {r^{*}}^{2} \sin θ_{0} \sin Θ_{0} {(2 π k_{B} T)}^{3}}{8 π^{2} {(k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}}$ (the quantity in the logarithm is dimensionless).

Note that although the right-hand side of Eq. (11) contains r*, the $Δ G_{bind}^{°}$ calculated using Eq. (11) is actually independent of the choice of the bulk location r*. To see this, we note that w(r) implicitly contains the Jacobian factor of the spherical coordinates, which contributes a term proportional to ln r², because the polar angle restraints ( $U_{θ}, U_{ϕ}$ ) force the ligand to move along the axis $r_{a A}$ inside a cone with a cross-sectional area $r^{2} \sin θ_{0} \langle Δ θ \rangle \langle Δ ϕ \rangle$ . Here $\langle Δ θ \rangle$ and $\langle Δ ϕ \rangle$ are the root-mean-square fluctuations in θ and ϕ, which are related to the angular force constants $k_{θ}, k_{ϕ}$ and the temperature by $\langle Δ θ \rangle^{2} = \frac{k_{B} T}{k_{θ}}$ and $\langle Δ ϕ \rangle^{2} = \frac{k_{B} T}{k_{ϕ}}$ , respectively. Thus, the 1D probability density P(r) is a product of the surface area $r^{2} \sin θ_{0} \langle Δ θ \rangle \langle Δ ϕ \rangle$ and a term $P^{'} (r)$ that depends only on intermolecular forces, i.e. $P (r) = r^{2} \sin θ_{0} \langle Δ θ \rangle \langle Δ ϕ \rangle P^{'} (r)$ , and $w (r) \propto - k_{B} T \ln [P (r)] = - k_{B} T \ln [r^{2} \sin θ_{0} \langle Δ θ \rangle \langle Δ ϕ \rangle P^{'} (r)]$ . In the bulk solution, $P^{'} (r)$ is isotropic and therefore a constant, so w(r) varies as ln r² in bulk solution. The ln r²-dependence in w(r*) and the $\ln {r^{*}}^{2}$ -dependence in $Δ G_{restr}^{bulk}$ therefore cancel each other, which makes the $Δ G_{bind}^{°}$ calculated using Eq. (11) independent of the choice of the bulk location r*.
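This cancellation can be verified numerically. The sketch below evaluates the analytic bulk restraint term and combines it with a model bulk PMF of the form w(r) = -k_B T ln r² + constant, which is the bulk behavior described above. All force constants, angles and the temperature are hypothetical values chosen for illustration (distances in Å, angles in radians, energies in kcal/mol), and the function names are not from the paper.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)
C0 = 1.0 / 1660.5  # standard concentration in Å^-3 (1 M)


def dg_restr_bulk(r_star, theta0, big_theta0, force_constants, temperature=300.0):
    """Analytic bulk restraint term defined after Eq. (11).

    force_constants : (k_r, k_theta, k_phi, k_Theta, k_Phi, k_Psi), with k_r
                      in kcal/(mol Å^2) and the rest in kcal/(mol rad^2)
    """
    kt = KB * temperature
    k_product = 1.0
    for k in force_constants:
        k_product *= k
    numerator = (C0 * r_star ** 2 * math.sin(theta0) * math.sin(big_theta0)
                 * (2.0 * math.pi * kt) ** 3)
    denominator = 8.0 * math.pi ** 2 * math.sqrt(k_product)
    return -kt * math.log(numerator / denominator)


def bulk_pmf(r, temperature=300.0, offset=0.0):
    """Model bulk PMF: Jacobian term -kT ln r^2 plus an arbitrary constant."""
    return -KB * temperature * math.log(r * r) + offset


def r_star_dependent_part(r_star, theta0, big_theta0, force_constants):
    """The two r*-dependent terms of Eq. (11): -w(r*) + dG_restr^bulk."""
    return -bulk_pmf(r_star) + dg_restr_bulk(r_star, theta0, big_theta0, force_constants)


ks = (10.0, 100.0, 100.0, 100.0, 100.0, 100.0)  # hypothetical force constants
for r_star in (20.0, 25.0, 30.0):
    print(r_star, r_star_dependent_part(r_star, 1.2, 1.5, ks))
```

The printed values agree for every r*, while `dg_restr_bulk` alone changes by -k_B T ln(r₂²/r₁²) between two bulk locations.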

The explicit geometrical definition of the complex can also be used in a similar way in the PMF calculation of $Δ G_{bind}^{°}$ . In this case, Eq. (11) does not need to be modified, since the information on the binding site volume is already used in choosing the upper limit of the PMF integral $\int_{bound} e^{- w (r) / k_{B} T} d r$ on the right-hand side of Eq. (11): see Eq. (10.2). The only change is that the flat-bottomed restraint on the ligand center should be present when running the FEP (or TI) simulations to compute $Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound}$ , the free energy of turning on the angular restraints $(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})$ on the ligand external degrees of freedom in the bound state: see Eq. (10.1).

Comparison of our PMF formula with that derived previously by Woo and Roux

Using a δ-function formulation, Woo and Roux have derived the expression for the absolute binding free energy $Δ G_{b i n d}^{°}$ as¹³

Δ G_{bind}^{°} = - k_{B} T \ln C^{o} S^{*} I^{*} + Δ G_{o}^{b u l k} - Δ G_{o}^{s i t e} - Δ G_{a}^{s i t e}

(12)

where

S^{*} = {(r^{*})}^{2} \int_{0}^{π} \sin θ d θ \int_{0}^{2 π} d ϕ e^{- u_{a} (θ, ϕ) / k_{B} T}

(12.1)

I^{*} = \int_{b o u n d}^{} d r e^{- [w (r) - w (r^{*})] / k_{B} T}

(12.2)

Δ G_{o}^{b u l k} = - k_{B} T \ln \frac{1}{8 π^{2}} \int_{0}^{π} d Θ \int_{- π}^{π} d Φ \int_{- π}^{π} d Ψ \sin Θ e^{- \frac{u_{o} (Θ, Φ, Ψ)}{k_{B} T}}

(12.3)

In Eq. (12), $Δ G_{a}^{site}$ and $Δ G_{o}^{site}$ are the free energies of applying the harmonic restraints on (θ, ϕ) and (Θ, Φ, Ψ) for the ligand in the binding site, respectively. The u_a(θ, ϕ) and u_o(Θ, Φ, Ψ) in the above equations are the harmonic restraints on (θ, ϕ) and (Θ, Φ, Ψ), respectively. Comparing Eq. (11) with Eq. (12), it can be shown that the two expressions yield equivalent results when all six harmonic restraint force constants are sufficiently large.

First, we recognize that the first term on the right hand side of Eq. (11), $- Δ G_{(U_{θ}, U_{ϕ}, U_{Θ}, U_{Φ}, U_{Ψ})}^{b o u n d}$ , is the same as the last two terms of the Eq. (12), $- Δ G_{o}^{s i t e} - Δ G_{a}^{s i t e}$ , therefore these terms cancel.

Next, we compare the remaining terms in the right-hand side of Eq. (11)

- k_{B} T \ln \frac{\int_{bound} e^{- [w(r) - w(r^{*})] / k_{B} T} d r}{{(\frac{2 π k_{B} T}{k_{r}})}^{\frac{1}{2}}} - k_{B} T \ln \frac{C^{o} {(r^{*})}^{2} \sin θ_{0} \sin Θ_{0} {(2 π k_{B} T)}^{3}}{8 π^{2} {(k_{r} k_{θ} k_{ϕ} k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}}

(13)

with the remaining terms in the right-hand side of Eq. (12)

- k_{B} T \ln [C^{o} {(r^{*})}^{2} \int_{0}^{π} \sin θ d θ \int_{0}^{2 π} d ϕ e^{- u_{a} / k_{B} T} \int_{b o u n d}^{} d r e^{- [w (r) - w (r^{*})] / k_{B} T}] - k_{B} T \ln \frac{1}{8 π^{2}} \int_{0}^{π} d Θ \int_{- π}^{π} d Φ \int_{- π}^{π} d Ψ \sin Θ e^{- \frac{u_{o} (Θ, Φ, Ψ)}{k_{B} T}}

(14)

Note that $u_{a}(θ, ϕ) = U_{θ} + U_{ϕ} = \frac{1}{2} k_{θ} {(θ - θ_{0})}^{2} + \frac{1}{2} k_{ϕ} {(ϕ - ϕ_{0})}^{2}$ and $u_{o}(Θ, Φ, Ψ) = U_{Θ} + U_{Φ} + U_{Ψ} = \frac{1}{2} k_{Θ} {(Θ - Θ_{0})}^{2} + \frac{1}{2} k_{Φ} {(Φ - Φ_{0})}^{2} + \frac{1}{2} k_{Ψ} {(Ψ - Ψ_{0})}^{2}$. When the six harmonic restraint force constants are sufficiently large, the integrals over the restrained angles in (14) reduce to Gaussian integrals of the form $\int d ξ \, e^{- k_{ξ} {(ξ - ξ_{0})}^{2} / 2 k_{B} T} = {(\frac{2 π k_{B} T}{k_{ξ}})}^{1 / 2}$, with the slowly varying factors sin θ and sin Θ evaluated at θ_0 and Θ_0. Therefore, the two terms in (14) can be written as

- k_{B} T \ln \frac{1}{8 π^{2}} \int_{0}^{π} d Θ \int_{- π}^{π} d Φ \int_{- π}^{π} d Ψ \sin Θ e^{- \frac{u_{o} (Θ, Φ, Ψ)}{k_{B} T}} \approx - k_{B} T \ln \frac{1}{8 π^{2}} \sin Θ_{0} \frac{{(2 π k_{B} T)}^{\frac{3}{2}}}{{(k_{Θ} k_{Φ} k_{Ψ})}^{\frac{1}{2}}}

(14.1)

and

- k_{B} T \ln \left[ C^{o} {(r^{*})}^{2} \int_{0}^{π} \sin θ d θ \int_{0}^{2 π} d ϕ e^{- u_{a} / k_{B} T} \int_{bound} d r e^{- [w(r) - w(r^{*})] / k_{B} T} \right] \approx - k_{B} T \ln \left[ C^{o} {(r^{*})}^{2} \sin θ_{0} \frac{2 π k_{B} T}{{(k_{θ} k_{ϕ})}^{\frac{1}{2}}} \int_{bound} d r e^{- [w(r) - w(r^{*})] / k_{B} T} \right]

(14.2)

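Comparing the right-hand sides of Eqs. (14.1) and (14.2) with Eq. (13), we can see that the sum of (14.1) and (14.2) is identical to (13): the factor ${(2 π k_{B} T / k_{r})}^{1/2}$ in the denominator of the first term of (13) cancels the k_r-dependence of the second term. Therefore, Eq. (11) is equivalent to Eq. (12) when the force constants are sufficiently large.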
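How large is "sufficiently large" can be checked numerically. The sketch below compares the polar-angle integral of Eq. (12.1) with its Gaussian approximation for a single restrained angle. The reference angle θ_0 = 1.0 rad and the temperature of 300 K are assumptions made for illustration, and the helper names are hypothetical; only the force constant of 1000 kJ mol⁻¹ rad⁻² corresponds to the value used in this work.

```python
import math

# k_B*T in kJ/mol; 300 K is an assumed simulation temperature.
KB_T = 0.0083144626 * 300.0


def integrate(f, a, b, n=20_000):
    """Composite trapezoidal rule for f on [a, b] using n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def polar_integral(k, theta0):
    """int_0^pi sin(theta) exp(-U/kT) dtheta with U = (k/2)(theta - theta0)^2."""
    def integrand(theta):
        return math.sin(theta) * math.exp(-0.5 * k * (theta - theta0) ** 2 / KB_T)
    return integrate(integrand, 0.0, math.pi)


def gaussian_approx(k, theta0):
    """Stiff-restraint limit: sin(theta0) * sqrt(2 pi kT / k)."""
    return math.sin(theta0) * math.sqrt(2.0 * math.pi * KB_T / k)


def relative_error(k, theta0):
    """Relative deviation of the Gaussian approximation from the numerical integral."""
    exact = polar_integral(k, theta0)
    return abs(gaussian_approx(k, theta0) - exact) / exact


if __name__ == "__main__":
    theta0 = 1.0  # hypothetical reference angle in radians
    for k in (10.0, 100.0, 1000.0):  # kJ mol^-1 rad^-2
        print(f"k = {k:7.1f}  relative error = {relative_error(k, theta0):.5f}")
```

Under these assumptions the approximation error falls well below one percent at 1000 kJ mol⁻¹ rad⁻² and grows as the restraint is softened, which is the regime in which Eqs. (11) and (12) would start to differ.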

PMF calculation setup

We compute the ligand 1D PMF w(r) using umbrella sampling simulations in explicit solvent,³⁰ in which a series of MD simulations is performed with a harmonic distance restraint between the receptor atom a and the ligand atom A: see Fig. 1. The biasing potential in the i-th simulation window is $U_{r, i} = \frac{1}{2} k_{r} {(r - r_{i}^{0})}^{2}$, where $r_{i}^{0}$ is the reference distance for the i-th sampling window. The full range of the distance space is covered using between 20 and 24 umbrella windows for the different complexes. A single force constant k_r = 1000 kJ mol⁻¹ nm⁻² is used for the distance restraint in all the sampling windows. The force constants used in the angular restraints are $k_{θ} = k_{ϕ} = k_{Θ} = k_{Φ} = k_{Ψ} = 1000 \ \mathrm{kJ \, mol^{-1} \, rad^{-2}}$. In each umbrella window, a 30-ns MD simulation is performed, starting from the last simulation snapshot of the previous window. The first 20 ns are treated as equilibration and the last 10 ns of sampling data are used for the calculation of the PMF. The biased probability distributions along the distance r accumulated in these sampling windows are unbiased and combined using the Weighted Histogram Analysis Method (WHAM) to yield the unbiased distribution and the potential of mean force w(r).³¹,³² The WHAM program implemented by Grossfield is used to calculate the PMF.³³ For each receptor-ligand complex, three independent sets of umbrella sampling simulations were performed, and the statistical uncertainties in the calculated PMF are estimated by the standard deviation of the results obtained from the three sets of umbrella sampling simulations. The total simulation time used to compute the PMF for a single receptor-ligand complex is ~2.1 μs.
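The numbers in this setup can be reproduced with a short calculation. The following sketch evaluates the harmonic bias, the thermal width it would impose on an otherwise free coordinate, and the total simulation time implied by the protocol. The 300 K temperature is an assumption, and the function names are hypothetical.

```python
import math

KB_T = 0.0083144626 * 300.0  # k_B*T in kJ/mol, assuming T = 300 K
K_R = 1000.0                 # distance force constant, kJ mol^-1 nm^-2


def bias_energy(r, r0, k=K_R):
    """Harmonic umbrella bias U = (1/2) k (r - r0)^2 in kJ/mol, with r in nm."""
    return 0.5 * k * (r - r0) ** 2


def thermal_width(k=K_R):
    """RMS fluctuation sqrt(kT/k), in nm, of a free coordinate under the bias."""
    return math.sqrt(KB_T / k)


def total_time_us(n_windows, ns_per_window=30.0, n_repeats=3):
    """Aggregate simulation time in microseconds for one receptor-ligand complex."""
    return n_windows * ns_per_window * n_repeats / 1000.0


if __name__ == "__main__":
    print(f"thermal width: {thermal_width():.3f} nm")
    print(f"bias 0.1 nm from the window centre: {bias_energy(1.1, 1.0):.2f} kJ/mol")
    for n in (20, 24):
        print(f"{n} windows: {total_time_us(n):.2f} us")
```

With 20 to 24 windows the aggregate time spans 1.8 to 2.16 μs, consistent with the quoted ~2.1 μs. The thermal width of about 0.05 nm gives a rough sense of how closely adjacent window centres must be spaced for their histograms to overlap.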

The paths along which ligands are pulled out in the PMF calculation are identified by trial and error. First, a ligand atom A and a protein atom a are chosen to define the axis r_aA; using the Discovery Studio Visualizer (Biovia Inc), the ligand is translated manually along this axis with its orientation relative to the receptor fixed. If the unbinding ligand collides with nearby protein atoms (a clash is shown as a red dashed line in the DS Visualizer), a different pair of atoms is tried. After a few tries, one or more low-energy paths may be identified; these paths are then used to run umbrella-sampling simulations to obtain the PMF. As an example, for BI-224436, we identified two low free energy paths (Fig. S1 and S2, Supporting Information), with the $Δ G_{bind}^{°}$ estimated using path 2 lower than that from path 1. The PMF results reported in Table 1 and Table 2 are for the lowest free energy paths.

Table 1.

Comparisons of absolute binding free energies from DDM and PMF with experiment. Unit: kcal/mol.^a

Ligand	PMF	DDM	Experiment^b
KF134	-7.0 ± 1.7	-9.8 ± 1.4	-5.5
KF116	-7.6 ± 1.4	-8.4 ± 1.4	-9.9
BI-224436	-13.7 ± 3.5	-12.5 ± 0.3	-10.3


^a There are two symmetry-related ALLINI binding sites in one HIV-1 integrase CCD dimer. All the calculated $Δ G_{bind}^{°}$ are for the intrinsic binding constant at a single site.

^b The experimental $Δ G_{bind}^{°}$ are obtained from surface plasmon resonance (SPR) measurements (private communication with the Kvaratskhelia lab).

Table 2.

Contributions to the absolute binding free energy $Δ G_{b i n d}^{°}$ calculated using the PMF approach (See Eq. (11) for the meaning of different terms).

Component	KF134	KF116	BI-224436
$- w (r_{1}^{*})$	−11.5 ± 1.5	−12.8 ± 1.1	−19.2 ± 3.5
$Δ G_{r e s t r}^{b u l k}$	8.3	8.0	9.7
$- k_{B} T \ln \frac{\int_{b o u n d}^{} e^{- w (r) / k_{B} T} d r}{{(2 π k_{B} T / k_{r})}^{\frac{1}{2}}}$	−0.2 ± 0.1	0.04 ± 0.04	0.05 ± 0.03
$- Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{b o u n d}$	−3.6 ± 0.8	−2.8 ± 0.5	−4.2 ± 0.15
$Δ G^{o} (b i n d)$	−7.0 ± 1.7	−7.6 ± 1.4	−13.7 ± 3.5


DDM calculation setup

A DDM calculation involves two legs of decoupling simulations in order to compute $Δ G_{decoupl}^{bound}$ and $- Δ G_{decoupl}^{bulk}$. In each leg of the decoupling in this work, the Coulomb intermolecular interaction is turned off first, and then the Lennard-Jones intermolecular interaction is switched off. Thus, the calculation provides estimates for the following free energy terms

\begin{array}{l} Δ G_{d e c o u p l}^{b o u n d} = Δ G_{d e c o u p l - C o u l o m b}^{b o u n d} + Δ G_{d e c o u p l - L J}^{b o u n d} \\ Δ G_{d e c o u p l}^{b u l k} = Δ G_{d e c o u p l - C o u l o m b}^{b u l k} + Δ G_{d e c o u p l - L J}^{b u l k} \end{array}

(15)

Combining Eq. (15) with the Eq. (4), the absolute binding free energy $Δ G_{b i n d}^{°}$ can be rearranged as

Δ G_{b i n d}^{°} = - Δ G_{r e s t r}^{b o u n d} - Δ G_{d e c o u p l - C o u l o m b}^{b o u n d} + Δ G_{d e c o u p l - C o u l o m b}^{b u l k} - Δ G_{d e c o u p l - L J}^{b o u n d} + Δ G_{d e c o u p l - L J}^{b u l k} + Δ G_{r e s t r}^{g a s} = - Δ G_{r e s t r}^{b o u n d} + Δ Δ G_{C o u l o m b} + Δ Δ G_{L J} + Δ G_{r e s t r}^{g a s}

(16)

Here $Δ Δ G_{Coulomb} = - Δ G_{decoupl-Coulomb}^{bound} + Δ G_{decoupl-Coulomb}^{bulk}$ and $Δ Δ G_{LJ} = - Δ G_{decoupl-LJ}^{bound} + Δ G_{decoupl-LJ}^{bulk}$, which can be interpreted as the contributions of the effective electrostatic interactions and the nonpolar interactions to the total binding free energy, respectively.³⁵,³⁹

Because the DDM simulations are performed in a finite-size, periodic solvent box, the calculated electrostatic decoupling free energies $Δ G_{decoupl-Coulomb}^{bound}$ and $Δ G_{decoupl-Coulomb}^{bulk}$ of charged ligands can be affected non-negligibly by lattice-sum artifacts.¹⁹,³⁷ Here we used the scheme developed by Rocklin and coworkers to compute the correction for these effects.¹⁹ The electrostatic finite-size corrections include the following contributions: periodicity-induced net-charge interactions, periodicity-induced net-charge undersolvation, discrete solvent effects, and residual integrated potential effects.¹⁹ The calculation of the residual integrated potential requires three Poisson-Boltzmann (PB) calculations of the electrostatic potentials generated by different combinations of receptor and ligand charge distributions. Here the PB calculations are performed using the APBS program.³⁸ With the inclusion of these corrections, the final expression for the $Δ G_{bind}^{°}$ from DDM is

Δ G_{bind}^{°} = - Δ G_{restr}^{bound} + Δ Δ G_{Coulomb} + Δ Δ G_{LJ} + Δ G_{elec\_corr}^{finite\_size} + Δ G_{restr}^{gas}

(17)

For the systems studied here, 11 λ values are used for the Coulomb decoupling: λ_Coulomb = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0; and 17 λ values are used for the Lennard-Jones decoupling: λ_LJ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.94, 0.985, 1.0. TI is used to compute the resulting decoupling free energies, i.e. $Δ G_{decoupl-Coulomb} = \int_{λ_{Coulomb} = 0}^{λ_{Coulomb} = 1} {\langle \frac{d U}{d λ_{Coulomb}} \rangle}_{λ_{Coulomb}} d λ_{Coulomb}$ and $Δ G_{decoupl-LJ} = \int_{λ_{LJ} = 0}^{λ_{LJ} = 1} {\langle \frac{d U}{d λ_{LJ}} \rangle}_{λ_{LJ}} d λ_{LJ}$. (FEP can also be used.) The soft-core potential implemented in GROMACS is used for the Lennard-Jones decoupling step in the DDM calculation to avoid the problem of end-point instability.³⁴ At each λ, a decoupling simulation is performed for 15 ns. The first 5 ns of the trajectory are treated as equilibration and the last 10 ns are used to compute $Δ G_{bind}^{°}$. For the $Δ G_{bind}^{°}$ calculation of each ligand, three independent DDM runs with different starting velocities were carried out. The statistical error bars are estimated from the standard deviations of the three independent sets of DDM simulations. A total of ≈ 2.5 μs of simulation data is collected to compute a single $Δ G_{bind}^{°}$ value.
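The TI integrals above are evaluated on a discrete, non-uniform λ grid. The sketch below shows one way to do this with the trapezoidal rule, together with the simulation-time budget implied by the protocol. The quadrature rule is an assumption, since the text does not state which one was used. The ⟨dU/dλ⟩ values come from a made-up linear function, not from simulation data, and all function names are hypothetical.

```python
# Lambda schedules taken from the text.
LAMBDA_COULOMB = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
LAMBDA_LJ = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7,
             0.75, 0.8, 0.85, 0.9, 0.94, 0.985, 1.0]


def ti_trapezoid(lambdas, dudl):
    """Integrate <dU/dlambda> over a possibly non-uniform lambda grid (trapezoidal rule)."""
    if len(lambdas) != len(dudl):
        raise ValueError("need one <dU/dlambda> value per lambda")
    return sum(
        0.5 * (dudl[i] + dudl[i + 1]) * (lambdas[i + 1] - lambdas[i])
        for i in range(len(lambdas) - 1)
    )


def synthetic_dudl(lam):
    """Made-up linear <dU/dlambda> profile (kcal/mol); its exact integral on [0, 1] is 35."""
    return 50.0 - 30.0 * lam


def total_time_ns(ns_per_lambda=15.0, n_legs=2, n_repeats=3):
    """Aggregate DDM time: every lambda window, both legs (bound and bulk), all repeats."""
    n_windows = len(LAMBDA_COULOMB) + len(LAMBDA_LJ)
    return n_windows * ns_per_lambda * n_legs * n_repeats


if __name__ == "__main__":
    dudl = [synthetic_dudl(lam) for lam in LAMBDA_LJ]
    print(f"TI estimate on the LJ grid: {ti_trapezoid(LAMBDA_LJ, dudl):.3f} kcal/mol")
    print(f"total simulation time: {total_time_ns() / 1000.0:.2f} us")
```

The budget of 28 windows × 15 ns × 2 legs × 3 repeats gives 2.52 μs, matching the ≈ 2.5 μs quoted above. The denser spacing of λ_LJ near 1 reflects where the Lennard-Jones integrand usually changes fastest, which is where a trapezoidal estimate is most sensitive to grid spacing.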

MD Simulation Setup

In this work, the MD binding free energy simulations using PMF and DDM were performed using GROMACS 4.6.4.⁴⁰ The AMBER parm99ILDN force field is used to model the HIV-1 integrase catalytic core dimer, and the Amber GAFF parameter set ⁴¹ and the AM1-BCC charge model ⁴² are used to describe the different ligands. TIP3P water ⁴³ boxes previously equilibrated at 300 K and 1 atm pressure were used to solvate the protein-ligand complex. The dimensions of the solvent box are chosen so that the distance from any solute atom to the nearest wall of the box is at least 10 Å. To maintain charge neutrality, 1 Cl⁻ and 1 Na⁺ are added to the solvent boxes containing the protein-ligand complex and the free ligand, respectively. The electrostatic interactions were computed using the particle-mesh Ewald (PME) method ⁴⁴ with a real-space cutoff of 10 Å and a grid spacing of 1.0 Å. MD simulations were performed in the NPT ensemble with a time step of 2 fs.

RESULTS

Comparison of PMF and DDM in calculating the $Δ G_{b i n d}^{°}$ for the charged ligands

Using the expressions for PMF and DDM derived here, we have computed the absolute binding free energies for three charged ligands that bind at the allosteric site in the dimer interface of the HIV-1 integrase catalytic core domain (CCD): see Table 1. Compared with the experimental results, the errors in the $Δ G_{bind}^{°}$ computed from PMF and DDM are in the range of 1.5–3.4 kcal/mol and 1.6–4.3 kcal/mol, respectively. While the two methods produced comparable results for the two stronger binders BI-224436 and KF116, the DDM method overestimated the binding affinity for the weak inhibitor KF134. Thus, only PMF correctly reproduced the rank order of the binding affinities of these charged ligands. However, as seen from Table 1, the error bars in both sets of results are substantial, especially in the PMF calculation for BI-224436. Although the PMF method provides a somewhat better estimate of the $Δ G_{bind}^{°}$ for these charged ligands in the same amount of computing time, the statistical uncertainty in the calculated $Δ G_{bind}^{°}$ makes it less straightforward to clearly favor one method over the other.
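The error ranges and the rank-order statement can be recomputed directly from Table 1. The sketch below does so with hypothetical helper names. It works from the rounded table entries, so a recomputed error can differ from the quoted ranges in the last digit.

```python
# Mean values from Table 1 (kcal/mol); statistical error bars are omitted here.
TABLE1 = {
    "KF134":     {"PMF": -7.0,  "DDM": -9.8,  "exp": -5.5},
    "KF116":     {"PMF": -7.6,  "DDM": -8.4,  "exp": -9.9},
    "BI-224436": {"PMF": -13.7, "DDM": -12.5, "exp": -10.3},
}


def unsigned_errors(method):
    """Absolute deviation from experiment for each ligand, rounded to 0.1 kcal/mol."""
    return {
        ligand: round(abs(values[method] - values["exp"]), 1)
        for ligand, values in TABLE1.items()
    }


def rank_order(column):
    """Ligands sorted from strongest binder (most negative free energy) to weakest."""
    return sorted(TABLE1, key=lambda ligand: TABLE1[ligand][column])


if __name__ == "__main__":
    for method in ("PMF", "DDM"):
        errors = unsigned_errors(method)
        matches = rank_order(method) == rank_order("exp")
        print(method, errors, "rank order matches experiment:", matches)
```

With these inputs the PMF ranking coincides with the experimental one, while DDM swaps KF134 and KF116 because it places KF134 at −9.8 kcal/mol.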

Overestimation in the PMF-calculated $Δ G_{b i n d}^{°}$ for BI-224436 is likely related to the bulky tricyclic group carried by the ligand

We now analyze the PMF-calculated absolute binding free energies. As shown in Table 1, while the errors in the calculated $Δ G_{bind}^{°}$ for KF134 and KF116 are approximately 2 kcal/mol or less, the PMF calculation overestimates the binding affinity of BI-224436 by 3.4 kcal/mol, i.e. the calculated $Δ G_{bind}^{°}$ is 3.4 kcal/mol more negative than the experimental value. To investigate the source of this overestimation, we examine the breakdown of the different contributions to the $Δ G_{bind}^{°}$ computed in the PMF method: see Table 2. The term $- w(r_{1}^{*})$ provides the dominant contribution to the absolute binding free energy. In fact, $- w(r_{1}^{*})$ alone correctly predicts the rank order of the binding affinities of these charged ligands, since the $Δ G_{restr}^{bulk}$ term (see Eq. (6)) is almost constant, and the differences in $- Δ G_{(θ, ϕ, Θ, Φ, Ψ)}^{bound}$, the free energy cost of turning on the ligand angular restraints, are smaller than the differences in $- w(r_{1}^{*})$.
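The breakdown in Table 2 can be checked by summing the four components for each ligand. This is a minimal sketch using only the mean values from the table; the dictionary keys are labels introduced here for illustration.

```python
# Components of the PMF binding free energy from Table 2 (kcal/mol, mean values only).
TABLE2 = {
    "KF134":     {"minus_w": -11.5, "restr_bulk": 8.3, "bound_integral": -0.2,
                  "minus_angular_bound": -3.6},
    "KF116":     {"minus_w": -12.8, "restr_bulk": 8.0, "bound_integral": 0.04,
                  "minus_angular_bound": -2.8},
    "BI-224436": {"minus_w": -19.2, "restr_bulk": 9.7, "bound_integral": 0.05,
                  "minus_angular_bound": -4.2},
}


def binding_free_energy(ligand):
    """Sum of the four Table 2 components for one ligand."""
    return sum(TABLE2[ligand].values())


def rank_by(component):
    """Ligands ordered from most to least negative value of a single component."""
    return sorted(TABLE2, key=lambda ligand: TABLE2[ligand][component])


if __name__ == "__main__":
    for ligand in TABLE2:
        print(f"{ligand}: {binding_free_energy(ligand):.2f} kcal/mol")
    print("order by -w(r*):", rank_by("minus_w"))
```

The sums reproduce the last row of Table 2 to within rounding. Ordering the ligands by the −w(r*) term alone gives BI-224436, KF116, KF134, which is the experimental order of binding strength.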

As seen from Table 2 and Fig. 2, the error bar in the reversible work value $- w(r_{1}^{*})$ for BI-224436 is 3.5 kcal/mol, which is much larger than the statistical errors in the other components. In contrast, the error bars in the PMF calculated for the other two ligands, KF134 and KF116, are smaller: see Fig. 2. The different behavior of the calculated PMF for the different ligands can be traced to the chemical groups carried by the ligands: Fig. 3 shows the three ligands in the binding cavity of integrase. The binding modes of the three ligands are almost identical. However, BI-224436 carries a bulky tricyclic arene which is buried in a sub-pocket in the binding site, while in both KF134 and KF116 a single benzene ring occupies the same sub-pocket. Because of its large size, the tricyclic group in BI-224436 may cause steric clashes with the neighboring protein residues when the ligand is pulled out of the binding cavity; this could raise the value of the PMF $w(r)$ and lead to the overestimation of the binding affinity.

Figure 3. — The three charged ligands in the binding site of the HIV-1 integrase CCD dimer interface. The functional groups circled by the red dashed line are buried in the protein binding cavity.

Fig. 2 also shows that, in general, the error bars in the PMF increase with r. This is likely related to the fact that the ligand moves inside a cone with a cross-sectional area proportional to r². Therefore, the accessible volume in the sampling of ligand external degrees of freedom grows with r². It is possible that using a Cartesian coordinate system such as the one described by Doudou et al.,⁴⁵ in which the accessible volume is constant with distance from the binding site, may help reduce the sampling problem at larger r.

Insights from the DDM calculation for charged ligands

Table 3 shows the different free energy terms in the DDM-calculated $Δ G_{bind}^{°}$ for the three charged ligands. It can be seen that the electrostatic free energy of decoupling the charged ligands from the protein binding pocket, $- Δ G_{decoupl-Coulomb}^{bound}$, is very large, which is due to both short- and long-range electrostatic effects and is consistent with the fact that the carboxylate moiety shared by the three charged ligands forms multiple correlated hydrogen bonds (Fig. 4) with Glu170/His171/Thr174 in the binding pocket and also with neighboring waters.⁴⁶ However, this large favorable electrostatic free energy is almost exactly cancelled out by the equally large electrostatic free energy of decoupling the charged ligand from the solvent, $Δ G_{decoupl-Coulomb}^{bulk}$. The latter is equally large because the same carboxylate group on the ligands also forms multiple hydrogen bonds with the waters in the bulk (data not shown). As a result, the net electrostatic contribution to binding, $Δ Δ G_{Coulomb}$, is very small, ranging from 0.9 to −1.5 kcal/mol. In fact, the ligand binding is dominated by nonpolar interactions, $Δ Δ G_{LJ}$, which range from −11 to −12 kcal/mol. Although the ligand binding is dominated by nonpolar interactions, the intermolecular electrostatic interactions described by $- Δ G_{decoupl-Coulomb}^{bound}$ are essential to compensate for the free energy cost $Δ G_{decoupl-Coulomb}^{bulk}$ of desolvating the charged ligands. Therefore, both electrostatic intermolecular interactions and nonpolar interactions are important to the binding of these charged ligands at the HIV-1 integrase CCD dimer interface.
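The near-cancellation of the electrostatic terms can be reproduced from the Table 3 entries by applying Eq. (16) together with the finite-size correction. In the sketch below, the first row of Table 3 is read as $- Δ G_{decoupl-Coulomb}^{bound}$ and the $Δ G_{PBC-Corr}$ row as the electrostatic finite-size correction. Both readings are assumptions based on Eqs. (16) and (17), and the dictionary keys are labels introduced here.

```python
# Mean values from Table 3 (kcal/mol). "minus_*_bound" entries already carry the minus
# sign of Eq. (16); the first table row is assumed to be the bound-state Coulomb term.
TABLE3 = {
    "KF134":     {"minus_coul_bound": -105.5, "minus_lj_bound": -0.95,
                  "coul_bulk": 105.96, "lj_bulk": -10.5,
                  "minus_restr_bound": -0.04, "restr_gas": 3.23, "finite_size": -2.0},
    "KF116":     {"minus_coul_bound": -179.2, "minus_lj_bound": 2.9,
                  "coul_bulk": 180.1, "lj_bulk": -13.4,
                  "minus_restr_bound": -0.1, "restr_gas": 3.23, "finite_size": -1.9},
    "BI-224436": {"minus_coul_bound": -92.3, "minus_lj_bound": -3.9,
                  "coul_bulk": 90.8, "lj_bulk": -8.4,
                  "minus_restr_bound": -0.03, "restr_gas": 3.23, "finite_size": -1.9},
}


def decompose(ligand):
    """Return (ddG_Coulomb, ddG_LJ, dG_bind) for one ligand."""
    t = TABLE3[ligand]
    ddg_coulomb = t["minus_coul_bound"] + t["coul_bulk"]
    ddg_lj = t["minus_lj_bound"] + t["lj_bulk"]
    dg_bind = (ddg_coulomb + ddg_lj + t["minus_restr_bound"]
               + t["restr_gas"] + t["finite_size"])
    return ddg_coulomb, ddg_lj, dg_bind


if __name__ == "__main__":
    for ligand in TABLE3:
        ddg_coulomb, ddg_lj, dg_bind = decompose(ligand)
        print(f"{ligand}: ddG_Coulomb = {ddg_coulomb:+.2f}, "
              f"ddG_LJ = {ddg_lj:+.2f}, dG_bind = {dg_bind:+.2f} kcal/mol")
```

Under this reading, each Coulomb leg is of order 100 kcal/mol or more, but their difference stays below 2 kcal/mol in magnitude for all three ligands. The Lennard-Jones difference of roughly −10.5 to −12.3 kcal/mol then accounts for most of the binding free energy.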

Table 3.

The binding free energy decomposition according to DDM. Unit: kcal/mol.

Component	KF134	KF116	BI-224436
$- Δ G_{decoupl-Coulomb}^{bound}$	−105.5 ± 0.9	−179.2 ± 0.7	−92.3 ± 0.2
$- Δ G_{decoupl-LJ}^{bound}$	−0.95 ± 0.8	2.9 ± 0.9	−3.9 ± 0.2
$Δ G_{decoupl-Coulomb}^{bulk}$	105.96 ± 0.06	180.1 ± 0.06	90.8 ± 0.01
$Δ G_{decoupl-LJ}^{bulk}$	−10.5 ± 0.08	−13.4 ± 0.3	−8.4 ± 0.2
$- Δ G_{restr}^{bound}$	−0.04 ± 0.005	−0.1 ± 0.02	−0.03 ± 0.002
$Δ G_{restr}^{gas}$	3.23	3.23	3.23
$Δ G_{PBC-Corr}$	−2.0	−1.9	−1.9
$Δ G_{bind}^{°}$	−9.8 ± 1.4	−8.4 ± 1.4	−12.5 ± 0.3


Figure 4. — Correlated H-bonds (green dashed lines) between the carboxylate moiety on the ligand and the protein binding pocket.

It should also be noted that, while the electrostatic decoupling free energies are of large magnitude (~−100 kcal/mol), the fluctuations in the $- Δ G_{d e c o u p l - C o u l o m b}^{b o u n d}$ are on the order of only ~1 kcal/mol. This is somewhat surprising, as one would expect the fluctuations in the large electrostatic decoupling free energies to also be large.

As seen from Table 3, the electrostatic correction for the finite-size periodic system, $Δ G_{elec\_corr}^{finite\_size}$, is on the order of −2 kcal/mol. Among the different contributions to $Δ G_{elec\_corr}^{finite\_size}$, the largest is the discrete solvent correction term $Δ Δ G_{DSC}$, while the correction terms for the periodicity-induced net-charge interactions and the periodicity-induced net-charge undersolvation nearly cancel each other.

DISCUSSION

In this study, we compare the performance of DDM and PMF in computing the absolute binding free energies of charged ligands using examples taken from drug discovery projects we are working on.²⁶ For the charged ligands studied here, the PMF and DDM methods yield comparable results, although the PMF results are in somewhat better agreement with experiment, including recapitulating the rank order of the experimental binding affinities. A similar comparison study has been reported by Gumbart et al., who calculated the absolute binding free energy for an uncharged peptide using PMF and DDM¹⁶ and found that the $Δ G_{bind}^{°}$ computed from the two methods differ by just 0.2 kcal/mol. In our study, we focus on the binding of charged small molecules. For the three charged ligands studied here, the difference in the $Δ G_{bind}^{°}$ calculated from PMF and DDM is considerably larger, at 0.8–2.8 kcal/mol (Table 1). In addition, we decompose the $Δ G_{bind}^{°}$ from DDM into electrostatic and Lennard-Jones contributions as well as electrostatic finite-size corrections, to help dissect the driving force of binding and to study the effect of the different terms on the accuracy of the DDM calculation. This free energy decomposition was not reported in the paper by Gumbart et al.¹⁶

We find that the main disadvantage of DDM is that the net binding free energy is obtained as the difference between two very large electrostatic decoupling free energies, which can be ≥ 100 kcal/mol. In contrast, all the terms in the PMF method have smaller magnitudes (i.e. ≤ 20 kcal/mol), since the ligand never leaves the solvent phase in the PMF simulations. Furthermore, in treating the charged ligands, complex procedures need to be applied with DDM to account for the systematic electrostatic errors due to use of finite-size periodic system. Such procedures are not needed for the PMF approach. In addition, the PMF method may have an advantage in treating large flexible ligands: since the DDM method involves a vacuum state, the rugged energy landscape for those large and flexible ligands in the vacuum state can lead to very slow convergence in conformational sampling.

The main advantage of DDM is that it allows more automatic setup, since the calculation does not rely on the geometrical definition of a physical pathway to pull the ligand out of the binding pocket and can therefore be readily applied to a diverse set of ligands in virtual screening. By comparison, the PMF method requires carefully identifying a pulling pathway to avoid potential atomic collisions with the nearby receptor atoms. This makes it difficult to apply automatically to a diverse set of ligands. For some ligand/binding pocket combinations, such collision-free pathways can be difficult to identify or may not even exist. In such cases, the PMF computed by pulling will lead to overestimated binding affinities. This is because, although the binding free energy is a state function and in principle depends only on the end states and not on the pathway, in practice the receptor atoms involved in the atomic clashes do not have enough time to relax, leading to errors in the estimate of the free energy difference.

Another advantage of DDM is that, in addition to providing absolute binding free energy estimates, the method also provides a thermodynamic decomposition of the total binding free energy into physically meaningful contributions such as electrostatic contributions versus nonpolar ones, which can provide insights about the nature of the interactions between the ligand and the binding pockets. However, since such decomposition schemes are pathway dependent, caution should be applied in their physical interpretation.

Lastly, we explain the rationale for explicitly including V_site in the DDM expression in this work. V_site, the effective volume of the binding site, is a fundamental quantity in measuring binding affinity either experimentally or computationally. This is because, although binding affinity experiments do not explicitly use restraints, they implicitly contain the information of V_site, since what is measured in experiment involves a binding signal which changes when a complex is formed; the onset of the detection of the binding signal provides an experimental definition of V_site.¹⁰,⁴⁷ In order to make a direct comparison between the calculated $Δ G_{bind}^{°}$ and that determined from experiment, the V_site used in the calculation needs to be consistent with what is detected as a bound state in the experiment, and this depends on the nature of the experimental probe being employed to detect binding. In practice, such a rigorous direct comparison has seldom been done, partly because, except for weak binders, the calculated $Δ G_{bind}^{°}$ is insensitive to the exact choice of V_site, as long as the major configurations contributing to the binding affinity have been included in the calculation.¹⁰

The present work makes the connection for the first time between the fundamental equations of absolute binding free energy (Eq. 1–3), which use the step function I(ξ) to define V_site,¹⁰ and the standard DDM implementation,¹¹,¹² which uses harmonic restraints to accelerate sampling. Our modification of explicitly using V_site to define the complexed state, in addition to using harmonic restraints to accelerate sampling, is therefore of conceptual importance, because without explicitly including V_site it is, strictly speaking, less rigorous to directly compare the calculated absolute binding free energy with experiment. In some problems, such as weak or non-specific binding, our modification can be of practical importance, since in such cases the unrestrained ligand can dissociate from the receptor even in the fully coupled state. Overall, this modification makes DDM broad enough to cover all cases of small molecule binding.


ACKNOWLEDGEMENTS

This research was supported by grants from the National Institutes of Health (NIH U54 401356 and GM30580 (RL)). We thank Dr. Emilio Gallicchio for helpful discussions.

Footnotes

SUPPORTING INFORMATION AVAILABLE

Contains two figures on the paths used in the PMF calculation of BI-224436, and a table containing the meaning of the symbols used in the equations in this work.

REFERENCES

(1).Mobley DL; Gilson MK Predicting Binding Free Energies: Frontiers and Benchmarks. Annual Review of Biophysics 2017, 46 (1), 531–558. [DOI] [PMC free article] [PubMed] [Google Scholar]
(2).Chodera JD; Mobley DL; Shirts MR; Dixon RW; Branson K; Pande VS Alchemical Free Energy Methods for Drug Discovery: Progress and Challenges. Current Opinion in Structural Biology 2011, 21 (2), 150–160. [DOI] [PMC free article] [PubMed] [Google Scholar]
(3).Zwanzig RW High‐Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. The Journal of Chemical Physics 1954, 22 (8), 1420–1426. [Google Scholar]
(4).Lybrand TP; McCammon JA; Wipff G Theoretical Calculation of Relative Binding Affinity in Host-Guest Systems. Proc Natl Acad Sci U S A . 1986, 83 (4), 833–835. [DOI] [PMC free article] [PubMed] [Google Scholar]
(5).Jorgensen WL; Ravimohan C Monte Carlo Simulation of Differences in Free Energies of Hydration. The Journal of Chemical Physics 1985, 83 (6), 3050. [DOI] [PMC free article] [PubMed] [Google Scholar]
(6).Wang L; Wu Y; Deng Y; Kim B; Pierce L; Krilov G; Lupyan D; Robinson S; Dahlgren MK; Greenwood J; et al. Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc . 2015, 137 (7), 2695–2703. [DOI] [PubMed] [Google Scholar]
(7).Limongelli V; Bonomi M; Parrinello M Funnel Metadynamics as Accurate Binding Free-Energy Method. Proceedings of the National Academy of Sciences 2013, 110 (16), 6358–6363. [DOI] [PMC free article] [PubMed] [Google Scholar]
(8).Moraca F; Amato J; Ortuso F; Artese A; Pagano B; Novellino E; Alcaro S; Parrinello M; Limongelli V Ligand Binding to Telomeric G-Quadruplex DNA Investigated by Funnel-Metadynamics Simulations. Proceedings of the National Academy of Sciences 2017, 114 (11), E2136–E2145. [DOI] [PMC free article] [PubMed] [Google Scholar]
(9).Xie B; Nguyen TH; Minh DDL Absolute Binding Free Energies between T4 Lysozyme and 141 Small Molecules: Calculations Based on Multiple Rigid Receptor Configurations. Journal of Chemical Theory and Computation 2017, 13 (6), 2930–2944. [DOI] [PMC free article] [PubMed] [Google Scholar]
(10).Gilson M; Given J; Bush B; Mccammon J The Statistical-Thermodynamic Basis for Computation of Binding Affinities: A Critical Review. Biophysical Journal 1997, 72 (3), 1047–1069. [DOI] [PMC free article] [PubMed] [Google Scholar]
(11).Boresch S; Tettinger F; Leitgeb M; Karplus M Absolute Binding Free Energies: A Quantitative Approach for Their Calculation. Journal of Physical Chemistry B 2003, 107 (35), 9535–9551. [Google Scholar]
(12).Deng Y; Roux B Calculation of Standard Binding Free Energies: Aromatic Molecules in the T4 Lysozyme L99A Mutant. Journal of Chemical Theory and Computation 2006, 2 (5), 1255–1273. [DOI] [PubMed] [Google Scholar]
(13).Woo H-J; Roux B Calculation of Absolute Protein-Ligand Binding Free Energy from Computer Simulations. Proceedings of the National Academy of Sciences 2005, 102 (19), 6825–6830. [DOI] [PMC free article] [PubMed] [Google Scholar]
(14).Velez-Vega C; Gilson MK Overcoming Dissipation in the Calculation of Standard Binding Free Energies by Ligand Extraction. Journal of Computational Chemistry 2013, n/a-n/a. [DOI] [PMC free article] [PubMed] [Google Scholar]
(15) Heinzelmann G; Henriksen NM; Gilson MK. Attach-Pull-Release Calculations of Ligand Binding and Conformational Changes on the First BRD4 Bromodomain. Journal of Chemical Theory and Computation 2017, 13 (7), 3260–3275.
(16) Gumbart JC; Roux B; Chipot C. Standard Binding Free Energies from Computer Simulations: What Is the Best Strategy? Journal of Chemical Theory and Computation 2013, 9 (1), 794–802.
(17) Lee MS; Olson MA. Calculation of Absolute Protein-Ligand Binding Affinity Using Path and Endpoint Approaches. Biophysical Journal 2006, 90 (3), 864–877.
(18) Figueirido F; Del Buono GS; Levy RM. On Finite‐size Effects in Computer Simulations Using the Ewald Potential. The Journal of Chemical Physics 1995, 103 (14), 6133–6142.
(19) Rocklin GJ; Mobley DL; Dill KA; Hünenberger PH. Calculating the Binding Free Energies of Charged Species Based on Explicit-Solvent Simulations Employing Lattice-Sum Methods: An Accurate Correction Scheme for Electrostatic Finite-Size Effects. The Journal of Chemical Physics 2013, 139 (18), 184103.
(20) Dadarlat VM; Skeel RD. Dual Role of Protein Phosphorylation in DNA Activator/Coactivator Binding. Biophysical Journal 2011, 100 (2), 469–477.
(21) Gumbart JC; Roux B; Chipot C. Efficient Determination of Protein–Protein Standard Binding Free Energies from First Principles. Journal of Chemical Theory and Computation 2013, 9 (8), 3789–3798.
(22) Cui D; Ou S; Patel S. Free Energetics of Rigid Body Association of Ubiquitin Binding Domains: A Biochemical Model for Binding Mediated by Hydrophobic Interaction: Ubiquitin: Model Hydrophobic Interactions. Proteins: Structure, Function, and Bioinformatics 2014, 82 (7), 1453–1468.
(23) Engelman A; Kessl JJ; Kvaratskhelia M. Allosteric Inhibition of HIV-1 Integrase Activity. Current Opinion in Chemical Biology 2013, 17 (3), 339–345.
(24) Gallicchio E; Deng N; He P; Wickstrom L; Perryman AL; Santiago DN; Forli S; Olson AJ; Levy RM. Virtual Screening of Integrase Inhibitors by Large Scale Binding Free Energy Calculations: The SAMPL4 Challenge. 2014, 28, 475–490.
(25) Slaughter A; Jurado KA; Deng N; Feng L; Kessl JJ; Shkriabai N; Larue RC; Fadel HJ; Patel PA; Jena N; et al. The Mechanism of H171T Resistance Reveals the Importance of Ndelta-Protonated His171 for the Binding of Allosteric Inhibitor BI-D to HIV-1 Integrase. Retrovirology 2014, 11 (1), 100.
(26) Deng N; Hoyte A; Mansour YE; Mohamed MS; Fuchs JR; Engelman AN; Kvaratskhelia M; Levy R. Allosteric HIV-1 Integrase Inhibitors Promote Aberrant Protein Multimerization by Directly Mediating Inter-Subunit Interactions: Structural and Thermodynamic Modeling Studies: Mechanism of Allosteric HIV-1 Integrase Inhibitors. Protein Science 2016, 25 (11), 1911–1917.
(27) Gallicchio E; Lapelosa M; Levy RM. Binding Energy Distribution Analysis Method (BEDAM) for Estimation of Protein−Ligand Binding Affinities. Journal of Chemical Theory and Computation 2010, 6 (9), 2961–2977.
(28) Gallicchio E; Levy RM. Recent Theoretical and Computational Advances for Modeling Protein–ligand Binding Affinities. In Advances in Protein Chemistry and Structural Biology; Elsevier, 2011; Vol. 85, pp 27–80.
(29) Bennett CH. Efficient Estimation of Free Energy Differences from Monte Carlo Data. Journal of Computational Physics 1976, 22 (2), 245–268.
(30) Deng N-J; Cieplak P. Free Energy Profile of RNA Hairpins: A Molecular Dynamics Simulation Study. Biophysical Journal 2010, 98 (4), 627–636.
(31) Kumar S; Rosenberg JM; Bouzida D; Swendsen RH; Kollman PA. The Weighted Histogram Analysis Method for Free-Energy Calculations on Biomolecules. I. The Method. Journal of Computational Chemistry 1992, 13 (8), 1011–1021.
(32) Gallicchio E; Andrec M; Felts AK; Levy RM. Temperature Weighted Histogram Analysis Method, Replica Exchange, and Transition Paths. Journal of Physical Chemistry B 2005, 109 (14), 6722–6731.
(33) Grossfield A. WHAM: The Weighted Histogram Analysis Method; http://membrane.urmc.rochester.edu/content/wham.
(34) Beutler TC; Mark AE; van Schaik RC; Gerber PR; van Gunsteren WF. Avoiding Singularities and Numerical Instabilities in Free Energy Calculations Based on Molecular Simulations. Chemical Physics Letters 1994, 222 (6), 529–539.
(35) Deng N; Zhang P; Cieplak P; Lai L. Elucidating the Energetics of Entropically Driven Protein–Ligand Association: Calculations of Absolute Binding Free Energy and Entropy. The Journal of Physical Chemistry B 2011, 115 (41), 11902–11910.
(36) Deng Y; Roux B. Computations of Standard Binding Free Energies with Molecular Dynamics Simulations. Journal of Physical Chemistry B 2009, 113 (8), 2234–2246.
(37) Reif MM; Oostenbrink C. Net Charge Changes in the Calculation of Relative Ligand-Binding Free Energies via Classical Atomistic Molecular Dynamics Simulation. Journal of Computational Chemistry 2014, 35 (3), 227–243.
(38) Baker NA; Sept D; Joseph S; Holst MJ; McCammon JA. Electrostatics of Nanosystems: Application to Microtubules and the Ribosome. Proceedings of the National Academy of Sciences 2001, 98 (18), 10037–10041.
(39) Lin Y-L; Meng Y; Huang L; Roux B. Computational Study of Gleevec and G6G Reveals Molecular Determinants of Kinase Inhibitor Selectivity. Journal of the American Chemical Society 2014, 136 (42), 14753–14762.
(40) Pronk S; Pall S; Schulz R; Larsson P; Bjelkmar P; Apostolov R; Shirts MR; Smith JC; Kasson PM; van der Spoel D; et al. GROMACS 4.5: A High-Throughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29 (7), 845–854.
(41) Wang J; Wolf RM; Caldwell JW; Kollman PA; Case DA. Development and Testing of a General Amber Force Field. Journal of Computational Chemistry 2004, 25 (9), 1157–1174.
(42) Jakalian A; Bush BL; Jack DB; Bayly CI. Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: I. Method. Journal of Computational Chemistry 2000, 21 (2), 132–146.
(43) Jorgensen WL; Chandrasekhar J; Madura JD; Impey RW; Klein ML. Comparison of Simple Potential Functions for Simulating Liquid Water. The Journal of Chemical Physics 1983, 79 (2), 926.
(44) Essmann U; Perera L; Berkowitz ML; Darden T; Lee H; Pedersen LG. A Smooth Particle Mesh Ewald Method. The Journal of Chemical Physics 1995, 103 (19), 8577.
(45) Doudou S; Burton NA; Henchman RH. Standard Free Energy of Binding from a One-Dimensional Potential of Mean Force. Journal of Chemical Theory and Computation 2009, 5 (4), 909–918.
(46) Cui D; Zhang BW; Matubayasi N; Levy RM. The Role of Interfacial Water in Protein–Ligand Binding: Insights from the Indirect Solvent Mediated Potential of Mean Force. Journal of Chemical Theory and Computation 2018, 14 (2), 512–526.
(47) Hill TL. Theory of Protein Solutions. II. The Journal of Chemical Physics 1955, 23 (12), 2270–2274.

Supplementary Materials

NIHMS975177-supplement-esi.pdf (314 KB, PDF)
