On the Use of Interaction Entropy and Related Methods to Estimate Binding Entropies

Vilhelm Ekberg; Ulf Ryde

doi:10.1021/acs.jctc.1c00374

. 2021 Jul 13;17(8):5379–5391. doi: 10.1021/acs.jctc.1c00374

On the Use of Interaction Entropy and Related Methods to Estimate Binding Entropies

Vilhelm Ekberg ¹, Ulf Ryde ^1,^*

PMCID: PMC8389774 PMID: 34254810

Abstract

Molecular mechanics combined with Poisson–Boltzmann or generalized Born and solvent-accessible area solvation energies (MM/PBSA and MM/GBSA) are popular methods to estimate the free energy for the binding of small molecules to biomacromolecules. However, the estimation of the entropy has been problematic and time-consuming. Traditionally, normal-mode analysis has been used to estimate the entropy, but more recently, alternative approaches have been suggested. In particular, it has been suggested that exponential averaging of the electrostatic and Lennard–Jones interaction energies may provide much faster and more accurate entropies, the interaction entropy (IE) approach. In this study, we show that this exponential averaging is extremely poorly conditioned. Using stochastic simulations, assuming that the interaction energies follow a Gaussian distribution, we show that if the standard deviation of the interaction energies (σ_IE) is larger than 15 kJ/mol, it becomes practically impossible to converge the interaction entropies (more than 10 million energies are needed, and the number increases exponentially). A cumulant approximation to the second order of the exponential average shows a better convergence, but for σ_IE > 25 kJ/mol, it gives entropies that are unrealistically large. Moreover, in practical applications, both methods show a steady increase in the entropy with the number of energies considered.

Introduction

Estimating accurate free energies for the binding of small molecules to biomacromolecules is one of the most important goals of computational chemists because it would have a strong impact on drug development.¹⁻³ Therefore, a large number of methods have been developed with this aim. Some methods involve docking using simplified scoring functions, which give fast, but not so accurate results.⁴⁻⁶ More accurate results are obtained with free-energy perturbation methods, which employ molecular dynamics (MD) simulations with standard force fields for the free and bound ligands, as well as for a number of alchemical intermediate states.⁷⁻⁹ Consequently, they are computationally very demanding, but they can give an accuracy of ∼4 kJ/mol for well-behaving cases.¹⁰⁻¹²

Intermediate between these two levels of theory, there are some methods that are also based on MD simulations but only of the physical end-states (the complex and possibly also the free protein and ligand).¹³⁻¹⁵ In particular, the molecular mechanics combined with Poisson–Boltzmann or generalized Born and solvent-accessible surface area (MM/PBSA and MM/GBSA) solvation energies have been much used.¹⁶⁻¹⁸ In these, the complex of the macromolecule and the ligand is simulated by MD simulations, and a number of snapshots are collected. For each of these, the water molecules are stripped off, and the binding free energy is approximated by

where E_el is the electrostatic energy and E_vdW is the van der Waals energy, calculated with a standard MM force field, whereas G_sol is the solvation free energy calculated either by solving the Poisson–Boltzmann equation or by the Generalized Born approach, G_SASA is the non-polar solvation free energy, estimated from the solvent-accessible surface area (SASA), T is the absolute temperature, and S_NM is the translational, rotational, and vibrational entropy, estimated from a normal-mode (NM) analysis of vibrational frequencies calculated at the MM level of theory. Each energy term is estimated from the difference between the complex (RL), the free receptor (R), and the ligand (L)

normally obtained by simply stripping off the receptor or the ligand from snapshots taken from the MD simulations of the complex (otherwise the precision will be worse and there will be an additional energy term from the bond, angle, and dihedral interactions).¹³⁻¹⁵ Moreover, each energy term is an average over all snapshots from the MD simulation, indicated by the angular brackets in eq 1.

The time consumption of these approaches is often dominated by the ΔS_NM term (i.e., the frequency calculation), and ΔS_NM is therefore often calculated for only a fraction of the snapshots.^{13−15,19,20} However, then this term will limit the precision of the final results. As this term often does not improve the accuracy of the method (at least not the relative energies), it is often omitted. Several alternative approaches to estimate the binding entropy have been suggested.¹⁹⁻²²

In 2016, Zhang and co-workers suggested a new method to estimate the binding entropy, called the interaction entropy (IE) approach.²³ It estimates the entropy from

where R is the gas constant and ΔE_IE = ΔE_el + ΔE_vdW. Consequently, the entropies can be calculated directly from energies already available from the normal MM/GBSA (for simplicity, we will in the following say only MM/GBSA even when everything applies equally well for MM/PBSA) calculations and therefore does not add any extra computational cost. The IE method has been used in several later studies, also for protein–protein binding and alanine screening.²⁴⁻²⁷

In 2018, Minh and co-workers put the IE approach into a more general theoretical framework and tested a number of cheap methods to calculate the entropy.²² In particular, they expressed the binding free energy as an exponential average of ΔE_IE and approximated it by a cumulant expansion

where σ_IE is the standard deviation of E_IE over all snapshots. Consequently, the binding entropy can be approximated with the second-order cumulant approximation term (C2)

In a recent study, we tried to use these approaches to estimate MM/GBSA binding free energies for the binding of three similar ligands to galectin-3²⁸ but obtained poor and confusing results. Here, we explain those results and compare entropies obtained with the IE and C2 methods. In particular, we address the important question: how many snapshots are needed to obtain a converged estimate of the entropy with the IE and C2 methods? We answer the question by performing stochastic simulations, assuming that E_IE follows a Gaussian distribution, as has been done before for related questions.^29,30

Methods

MD Simulations

We have studied the binding of five different ligands to three proteins: galectin-3 with three ligands, differing only in the position of a single fluorine group (o-, m-, and p-fluoro-phenyltriazolyl-galactosylthioglucoside, called O, M, and P in the following), ferritin with phenol, and the T4 lysozyme Leu99Ala mutant with benzene. All three systems have been studied before by us, and we used the same setup as in our previous studies (which therefore shows some slight variations).^28,31,32 The simulations were based on the crystal structures of the complexes: 6RZF, 6RZG, 6RZH,²⁸3F39,³³ and 181L.³⁴ All crystal-water molecules were kept in the simulations. Each complex was solvated in an octahedral (galectin-3 and lysozyme) or rectangular (ferritin) box of water molecules extending at least 10 Å from the protein using the tleap module. The protonation state of all residues is specified in our previous studies.^28,31,32 No counter ions were used in the simulations.³⁵

The MD simulations were run with the Amber software suite.³⁶ The protein was described by the Amber ff14SB force field,³⁷ water molecules with the TIP3P (ferritin and lysozyme) or TIP4P-Ewald model (galectin-3),³⁸ whereas the ligands were treated with the general Amber force field (GAFF).³⁹ Charges for the ligands were obtained with the restrained electrostatic potential method,⁴⁰ and they were specified before.^28,31,32

For each complex, 1000 steps of minimization were used, followed by 20 ps constant-volume equilibration and 20 ps constant-pressure equilibration, all performed with heavy non-water atoms restrained toward the starting structure with a force constant of 4184 kJ/mol/Å². The system was then equilibrated freely for 1 ns. Two sets of production simulations were then performed for each of the five complexes with constant pressure and without any restraints: In the first, we run 100 ns simulation and sampled snapshots every 10 ps. In the second, we run 10 ns simulation and sampled snapshots every 10 fs. The former is similar to what we normally do in ligand-binding studies,^28,31,32,41 whereas the latter is more similar to what was done in the original IE publication.²³ In both cases, we run 10 independent simulations for each system, using different starting velocities and water solvation boxes.⁴² Consequently, the total simulation time for each complex was 1 μs and 100 ns, respectively, and we collected N = 100,000 and 10,000,000 snapshots from each simulation.

All bonds involving hydrogen atoms were constrained to the equilibrium value using the SHAKE algorithm,⁴³ allowing for a time step of 2 fs. The temperature was kept constant at 301 K (galectin-3 and lysozyme) or 298 K (ferritin) using Langevin dynamics,⁴⁴ with a collision frequency of 2 ps^–1. The pressure was kept constant at 1 atm using a weak-coupling isotropic algorithm⁴⁵ with a relaxation time of 1 ps. Long-range electrostatics were handled by particle-mesh Ewald summation⁴⁶ with a fourth-order B spline interpolation and a tolerance of 10^–5. The cutoff radius for Lennard-Jones interactions was 8 Å (galectin-3 and lysozyme) or 10 Å (ferritin).

MM/GBSA Calculations

MM/GBSA calculations¹⁶⁻¹⁸ were performed using mmpbsa.py utility of AMBER.⁴⁷ The calculations employed the latest generalized Born method GB-Neck2 (igb = 8) with modified Bondi radii (mbondi3)⁴⁸ and a dielectric constant of 80 outside the solute and 1 inside the solute. The non-polar solvation free energy was calculated from the solvent accessible surface, using ΔG_SASA = α SASA + b, with α = 0.0227 kJ/mol/Å² and b = 3.85 kJ/mol.⁴⁹

Entropies were calculated by the IE²³ and C2 approaches²² using the simulated data and a local script. We also divided the complete data (all snapshots) into smaller batches, allowing for estimates of the precision of the estimated entropies (as the standard deviation over the calculated entropy for each batch divided by the square root of the number of batches of equal size).

In addition, we also discuss results of previous MM/GBSA calculations on avidin with seven biotin-like ligands,⁴¹ blood clotting factor Xa with nine inhibitors,⁵⁰ galectin-3 with two additional ligands,⁵¹ and ferritin with eight additional small ligands.³¹ Using the old data, we have calculated IE and C2 entropies with the same script. Most of the old studies reported also entropies estimated with the NM approach.

All entropies in this article are discussed in energy terms, that is, as −TΔS in kJ/mol at 300 K.

Gaussian Simulations

The stochastic simulations used the same approach as in our previous study of the convergence of exponential averaging to estimate reaction free energies in combined quantum and molecular mechanical calculations.²⁹ In fact, the same small simulation program could be used, besides that entropies were considered, instead of free energies (so that average of ΔE_IE was not included). The program generates a certain number of Gaussian-distributed energies (by the Box–Muller transform⁵²) and calculates the exponential average in eq 3. This is repeated 1000 times, and it is noted how many times this average is within a certain limit (we used 4 kJ/mol for all calculations in this study) from the analytical results (eq 5; which is the analytical result for a Gaussian distribution). The program automatically finds the minimum number of snapshots (N, estimated within 0.1%) needed to fulfil these criteria. The Fortran code can be obtained from the authors upon request.

Results and Discussion

Galectin-3

In a recent investigation of the binding thermodynamics of three ligands to galectin-3,²⁸ we were suggested by a reviewer to estimate the energy and entropy of binding using the MM/GBSA and IE approaches. This was an attractive suggestion because we already had 10 × 100 ns simulations of the protein–ligand complexes with structures sampled every 10 ps, so it was only a matter of postprocessing of these snapshots. Unfortunately, the results were not especially encouraging and were therefore presented only in the Supporting Information.²⁸

In particular, whereas the MM/GBSA energies were reasonably well-converged, the interaction entropies showed a very alarming trend depending on how the entropies were calculated: If all data were used for exponential average in eq 3 (N = 100,000, the number of snapshots and individual ΔE_IE energy estimates), we obtained very large entropies for all three ligands, 162–208 kJ/mol, giving positive binding free energies of 33–56 kJ/mol (the experimental estimates are −30 to −33 kJ/mol²⁸). The results look reasonably converged with a variation of 2–5 kJ/mol for the running average over the last 10% of the snapshots, as can be seen in Figure 1.

Running average of IE (left) and C2 (right) entropies for the binding of three ligands (O, M, and P) to galectin-3 as a function of the number of snapshots from the 10 × 100 ns simulations with a sampling frequency of 10 ps.

These entropies do not have any estimates of the uncertainty, and it is more natural to calculate interaction entropies for each of the 10 independent simulations separately (N = 10,000 in each simulation), presenting the average entropy and the standard error over the 10 sets of simulations. This gave somewhat lower estimates, 160 ± 9, 161 ± 9, and 153 ± 10 kJ/mol, for O, M, and P, respectively, but still positive binding free energies (9–25 kJ/mol).

However, we have previously found that more stable entropies are obtained (by dihedral histogramming) if they are obtained from simulations of 5 ns (N = 500; the entropies are averaged over 200 batches) because it reduces the dependence on rare events.⁵³ The 5 ns time window was selected to be similar to the rotational correlation time of the protein (∼7 ns).²⁸ Quite surprisingly, this gave much smaller entropies, 84–89 kJ/mol, with a standard error of 2 kJ/mol. We therefore repeated the calculations with N varying from 100 to 100,000. From Figure 2a, it can be seen that the estimated IE entropy increases steadily with N. At the same time, the estimated standard error increases from 0.5 to 0.6 kJ/mol at N = 100 to 9–10 kJ/mol at N = 10,000, reflecting that the entropies are averaged over fewer independent simulations (from 1000 to 10; the range of the estimated entropies is actually larger at N = 100 than at N = 10,000, 140–210 kJ/mol, compared to 83–97 kJ/mol).

Entropies estimated by the IE (left) and C2 (right) methods by block averaging for the binding of three ligands to galectin-3 as a function of N from the 10 × 100 ns simulations. Note the logarithmic scale on the x-axis.

Naturally, these results are very alarming, showing that we can essentially get any estimate of the entropy between 60 and 200 kJ/mol. To yield some further understanding, we calculated entropies also with the second-order cumulant approximation, C2, as suggested by Minh and co-workers.²² The results are shown in Figures 1b and 2b, and it can be seen that it gives much higher entropies, 175–990 kJ/mol, but with the same increasing trend with respect to N.

It could be hoped that relative entropies are more stable. Therefore, we show in Figure 3 the relative entropy of the three ligands as a function of N. It can be seen that the C2 entropies show a consistent trend (P < O < M) for all values of N, although the range (the difference between the largest and the smallest value) increases from 36 to 280 kJ/mol. However, IE entropies do not give any consistent results, although P has the smallest entropy for N ≤ 5000. The range of the entropies increases from 2 to 45 kJ/mol.

Relative entropies (defined as the entropy of each ligand minus the average of the three ligands) estimated by the IE (left) and C2 (right) methods for the binding of three ligands to galectin-3 from the 10 × 100 ns simulations. For IE, points with N = 100,000 (−19, −7 and 26 kJ/mol for O, M, and P, respectively) were omitted to emphasize the variation for the other, more precise estimates.

We also repeated the entropy calculations from 10 × 10 ns simulations with a sampling frequency of 10 fs (i.e., more similar to what was used in the original IE study²³). However, they gave similar results, as can be seen in Figures S1 and S2. Moreover, we tried to remove various amounts of the initial part of the simulation as equilibration (in Figure 2, the equilibration time was 1 ns). However, this did not lead to any qualitative change in the results (Figure S3 shows the results when 51 ns of equilibration was used).

Next, we studied the root-mean-squared deviation (rmsd) of the ligand from the starting (crystallographic) conformation. It can be seen from Figure S4 that it in general fluctuates around 1.2 Å. However, occasionally, it increases to higher values (up to 5.5 Å). For a few simulations, it also stabilizes around 3 Å. These fluctuations do not represent full unbinding of the ligand but a change in the conformation of parts of the ligand. Still, it is likely that these conformational changes may affect the estimated entropies. Minh and co-workers suggested that only conformations with a low rmsd should be used for the entropy calculations,²² and we therefore tested to exclude all snapshots with a ligand rmsd > 2.2 Å. The results in Figure S5 show that it did not improve the convergence of the entropies.

Finally, we have also tried to extrapolate the block-averaged IE entropies with a power series in 1/N, following the suggestion by Zuckerman and co-workers.^54,55 However, as can be seen in Table S1, the extrapolations give a large uncertainty and a strong dependence on the exponent of the power series.

Curse of Exponential Averaging

At first, we assumed that we made some error in the calculations, owing to the large discrepancy between the IE and C2 results, but after some further considerations, we convinced ourselves that the calculations are correct and the large discrepancy actually explains the problem. In fact, we had already given the explanation before but in another context.³⁰ Several other groups have also pointed out the poor convergence of exponential averaging.⁵⁴⁻⁵⁸

If ΔE_IE follows a Gaussian distribution, the last sum in eq 4 truncates at the second term, and the IE and C2 estimates should coincide. Owing to the central limit theorem, it is reasonable to assume that most ΔE_IE data should be approximately Gaussian (ref (22) shows distributions for 87 protein–ligand complexes, supporting this suggestion). Therefore, a powerful technique to judge the performance, convergence, accuracy, and precision of methods like IE is to assume that ΔE_IE follows a Gaussian distribution and perform numerical simulations with random Gaussian-distributed data.²⁹ With such a simulation, it is simple to show that as long as the standard deviation of the ΔE_IE energies, σ_IE, is small, ΔS_IE and ΔS_C2 indeed coincide.

Moreover, we can answer the question: how many independent ΔE_IE energies (N) are needed to obtain a reliable estimate of the entropy, which for a Gaussian distribution is the C2 estimate. We only need to specify what we mean by “reliable.” Following our previous study,³⁰ we define reliable as giving an entropy within 4 kJ/mol of the analytical results with 95% confidence (but other values could easily be used). 4 kJ/mol corresponds to a factor of 5 in the binding constant, which seems to be a proper limit for a reliable estimate.

The results of the simulations are shown in Figure 4. It can be seen that for σ_IE < 10 kJ/mol, the IE entropy converges smoothly and only a rather small number of snapshots is needed (e.g., N = 1100 for σ_IE = 9 kJ/mol). However, when σ_IE > 10 kJ/mol, the convergence rapidly deteriorates and N increases exponentially. In practical applications, the upper limit is around N = 10,000,000, at which point the size of the coordinate files becomes several TB, even after stripping of the water molecules. This limit is reached around σ_IE = 15 kJ/mol. If σ_IE is larger than that, it will be practically impossible to converge the IE entropy, and the IE estimates will gravely underestimate the true entropy and will increase as the sampling is increased, as observed in Figure 2 (for which σ_IE = 60–70 kJ/mol).

Number of snapshots (N) needed to converge the IE and C2 entropies within 4 kJ/mol of the analytical result with 95% confidence, assuming that the ΔE_IE energies follow a Gaussian distribution.

The reason for this is that the exponential average is extremely badly conditioned.³⁰ The exponential average depends critically on energies that give the largest value of the exponential in eq 3. If the standard deviation of the distribution, σ, is small, these values are rather likely and are therefore frequently found in a simulation. However, as σ increases, the most important values become extremely rare and may actually never be observed in a MD simulation of a normal length. Consequently, the exponential average will typically underestimate the true result (obtained by an infinite number of snapshots, or by assuming that the distribution is indeed Gaussian, in which case the exponential average should converge to the second-order cumulant result). This is illustrated in Figure 5.

Problem of converging the exponential average. Assuming that ΔE follows a Gaussian distribution, the exponential average can be rewritten as an integral over the product of two terms , the Gaussian distribution and the Boltzmann factor .³⁰ These two terms are shown in the figure for the example of σ_IE = 15 kJ/mol (note the logarithmic scale). In this case, the maximum for the product is attained at ΔE = 90 kJ/mol (blue vertical line). At this value, G = 4 × 10^–10 (but B = 5 × 10¹⁵), so around 10¹⁰ snapshots are needed before this value is observed. In fact, ΔG and therefore ΔS (which is ΔG minus the average of ΔE) are still not fully converged (ΔG in the figure; right axis; it should be read from the right to the left, i.e., showing ΔG when all values larger than ΔE are included), differing by 1.4 kJ/mol from the analytic result; convergence to within 0.1 kJ/mol is obtained at ΔE = −115 kJ/mol, when G = 5 × 10^–15 (violet vertical line). If all ΔE < 3σ_IE (= −45 kJ/mol) are ignored, ΔG and TΔS will be wrong by over 15 kJ/mol (orange vertical line) because the most important parts of P are excluded.

Inline graphic — Problem of converging the exponential average. Assuming that ΔE follows a Gaussian distribution, the exponential average can be rewritten as an integral over the product of two terms , the Gaussian distribution and the Boltzmann factor .³⁰ These two terms are shown in the figure for the example of σ_IE = 15 kJ/mol (note the logarithmic scale). In this case, the maximum for the product is attained at ΔE = 90 kJ/mol (blue vertical line). At this value, G = 4 × 10^–10 (but B = 5 × 10¹⁵), so around 10¹⁰ snapshots are needed before this value is observed. In fact, ΔG and therefore ΔS (which is ΔG minus the average of ΔE) are still not fully converged (ΔG in the figure; right axis; it should be read from the right to the left, i.e., showing ΔG when all values larger than ΔE are included), differing by 1.4 kJ/mol from the analytic result; convergence to within 0.1 kJ/mol is obtained at ΔE = −115 kJ/mol, when G = 5 × 10^–15 (violet vertical line). If all ΔE < 3σ_IE (= −45 kJ/mol) are ignored, ΔG and TΔS will be wrong by over 15 kJ/mol (orange vertical line) because the most important parts of P are excluded.

Fortunately, the C2 estimate has a better convergence. For example, at σ_IE = 15 kJ/mol, less than 1000 energies are needed. At σ_IE = 25 kJ/mol, still only 7200 snapshots are needed, but on the other hand, the estimated entropy starts to be very large, −TΔS_C2 = 125 kJ/mol, indicating that other approximations start to break down. The results in Table 1 (to be discussed below) show that NM entropies typically are 52–126 kJ/mol, and Duan et al. report NM entropies of 0–122 kJ/mol.²³

Table 1. Values of σ_IE, TΔS_IE, TΔS_IE, TΔS_IE, and IE from Our Previous MM/GBSA Investigations^{28,31,41,42,50,51} and in the Present Study^a.

protein	ligand	length	f	σ_IE	–TΔS_IE	–TΔS_C2	–TΔS_NM
lysozyme⁴²	Bz	20 × 0.2	5	6	6	7	52
		40 × 0.2		6	18	6	53
^b		10 × 10	0.01	6	15	6
^b		10 × 100	10	6	14	7
ferritin³¹	L1	40 × 1.2	10	9	15	18	67
	L2			11	25	26	73
	L3			10	18	20	70
	L4			10	19	20	68
	L5			12	21	28	63
	L6			10	16	20	60
	L7			10	14	19	59
	L8			10	13	19	56
	L9			13	19	34	82
^b	L1	10 × 7.5	0.01	13	23	33
^b		10 × 100	10	13	24	34
galectin-3²⁸^,^b	O	10 × 10	0.01	54	185	583
	M			46	199	428
	P			40	206	328
	O	10 × 100	10	70	162	980
	M			70	174	990
	P			60	208	714
galectin-3⁵¹	Lac	10 × 200	100	37	161	274	126
	L02			30	152	177	122
	Lac	40 × 0.2	5	45	190	398	99
	L02			49	185	476	100
fXa⁵⁰	CBB	40 × 0.2	5	66	154	880	114
	C9			61	159	737	116
	C39			40	119	321	116
	C47			66	167	872	118
	C49			69	208	949	119
	C50			71	254	1009	119
	C53			66	218	876	119
	C57			41	113	333	115
	C63			41	125	330	113
avidin⁴¹	Btn1	4 × 25 × 0.2	5	48	166	469	102
	Btn2	4 × 30 × 0.2		47	162	452	103
	Btn3	4 × 20 × 0.2		50	213	495	100
	Btn4	4 × 50 × 0.2		25	82	126	99
	Btn5	4 × 40 × 0.2		25	48	123	78
	Btn6	4 × 20 × 0.2		20	43	80	75
	Btn7	4 × 20 × 0.2		21	53	91	66
	Btn1	25 × 0.2		13	30	35	102
				13	38	36	101
				14	46	40	101
				13	42	32	102
	Btn2	30 × 0.2		14	40	41	103
				14	32	40	104
				15	40	44	102
				14	42	41	103
	Btn3	20 × 0.2		14	38	38	98
				14	38	40	100
				15	39	43	101
				15	56	48	101
	Btn4	50 × 0.2		13	29	33	100
				13	33	33	97
				13	38	36	99
				12	30	31	98
	Btn5	40 × 0.2		10	32	21	77
				13	33	33	78
				13	38	36	78
				12	30	31	79
	Btn6	20 × 0.2		10	18	19	74
				9	23	16	75
				10	18	18	75
				10	18	18	75
	Btn7	20 × 0.2		8	19	14	70
				8	15	14	67
				8	17	13	64
				9	22	16	68

Open in a new tab

The table shows the protein, the ligand (named after the original publications), the length of the simulation (number of independent simulations times the length of the production simulation in ns; for avidin, an initial “4×” signifies that the four binding sites in the homotetrameric protein are considered at the same time) and the sampling frequency (f in fs).

Present investigation.

It might seem strange to simulate the C2 entropy with a Gaussian distribution, for which the C2 estimate is exact. However, since the C2 entropy is estimated from the square of the standard deviation from the finite sample of ΔE_IE energies (eq 5), the random variation of the sampled values become quite large when σ_IE is large and therefore many snapshots are needed before the precision is good enough to reproduce the correct value of the C2 entropy within 4 kJ/mol with 95% confidence.

The conclusion from this exercise is that if σ_IE < 10 kJ/mol, both IE and C2 can be used to estimate the entropy, they should give identical results (within 4 kJ/mol) and ∼1000 snapshots can be used, as is typical for MM/GBSA. If 15 ≤ σ_IE < 25 kJ/mol, IE becomes impossible to converge; IE and C2 entropies will start to diverge and C2 is strongly to be preferred. For C2, it is still enough with a few thousands of snapshots. Finally, if σ_IE > 25, C2 can still be converged (up to σ_IE ≈ 150 kJ/mol), but the estimated entropy is most likely grossly overestimated.

Comparison with Previous Results

To evaluate whether the results of our Gaussian simulations are representative, we examine the results of some of our previous MM/GBSA studies. The data are presented in Table 1 and are typically based on N = 1600–5000 snapshots. It can be seen that the magnitude of σ_IE depends mainly on the protein. Proteins with a buried binding site give a low σ_IE, for example, lysozyme (6 kJ/mol), ferritin (9–13 kJ/mol), and avidin (8–15 kJ/mol), whereas proteins for which the ligand binds on the surface give larger σ_IE, for example, factor Xa (40–71 kJ/mol) and galectin-3 (30–70 kJ/mol). However, there are also clear trends among the ligands. For example, for avidin, small and neutral ligands, like Btn6 and Btn7, give lower values of σ_IE (8–10 k/mol) than the larger and negatively charged ligands Btn1–Btn3 (13–15 kJ/mol). Moreover, the results also depend on the details of the simulation. For example, if all four binding sites of the tetrameric avidin protein are considered at the same time, σ_IE becomes much larger than if the sites in each subunit are considered individually.

It is hard to decide the convergence of the calculated entropies without doing additional simulations. However, a first indication can be obtained by comparing the IE and C2 entropies. The difference between these two estimates is shown as a function of σ_IE in Figure 6. It can be seen that for σ_IE < 16 kJ/mol, the IE and C2 entropies agree reasonably (within 15 kJ/mol) and with no consistent trend (inset in Figure 6). However, for larger σ_IE values, the difference between the two methods increases (essentially linearly for σ_IE > 35 kJ/mol), and the IE entropy is always smaller than the C2 entropy. This confirms our results in the previous section that it is completely impossible to obtain any reliable estimate of the IE entropy if σ_IE > 15 kJ/mol and that the C2 entropies are more accurate (but too large in magnitude).

Difference between the C2 and IE entropies as a function of σ_IE for our previous MM/GBSA studies (raw data in Table 1). The inset shows the results with σ_IE < 16 kJ/mol.

Our previous MM/GBSA studies involved also estimation of the entropies with the NM method. There is a fair correlation between entropies calculated between the three methods: R = 0.62 and 0.72 between the NM and C2 or IE entropies, respectively. In fact, the correlation is very good (R = 0.86–0.89 for all three combinations) for lysozyme, ferritin, and avidin, and the NM entropies are always larger in magnitude by 35–72 kJ/mol (54 kJ/mol on average for both IE and C2). On the other hand, there is no correlation between NM and the other two methods (R < 0.12) for galectin-3 and factor Xa (but the IE and C2 entropies have a fair correlation of R = 0.68). In these cases, C2 entropies are the largest, IE entropies are intermediate, and NM entropies are the smallest (54 and 470 kJ/mol smaller than IE and C2 on average), but the variation is large.

Next, we consider the results in the study by Menzer et al., which implicitly reports σ_IE in their Figure S2.²² Among their 87 examined protein–ligand complexes, σ_IE varies between 6 and 42 kJ/mol (with an average of 17 kJ/mol). As in our study, the lowest values are found for lysozyme (6–8 kJ/mol). This shows that our estimates of σ_IE (6–71 kJ/mol) are comparable with σ_IE estimates in other studies and are not unusually large.

Likewise, we have examined the results in the original IE article.²³ They examined 15 protein–ligand complexes with two simulation protocols. They do not report σ_IE, but we can get an approximation of it from eq 5, but using S_IE instead of S_C2 (it will underestimate σ_IE when it is larger ∼15 kJ/mol). This gives values for σ_IE of 8–24 kJ/mol, 43% of which are larger than 15 kJ/mol. In the first simulation protocol, the protein structure was restrained by a force constant of 42 kJ/mol/mol/Å², and the simulation was run for 2 ns. In the second protocol, no restraints were used, and the simulation length was 6 ns. In both cases, ΔE_IE energies were sampled every 10 fs from the last 1 ns of the simulation (i.e., N = 100,000). The restrained simulations always gave a lower entropy by 1–11 kJ/mol, showing that the restraints are not innocent. On the other hand, it reduced the number of simulations with σ_IE > 15 kJ/mol from 67 to 20%. Minh and co-workers performed a systematic study of the effect of restraints on the simulations for 54 protein–ligand complexes for five proteins.⁵⁹ They showed that the calculated entropies vary by up to 140 kJ/mol when the restraint weight varied. The entropies are 26 and 38 kJ/mol on average for IE and C2, with a difference of up to 119 kJ/mol (the average and maximum values of σ_IE are 14 and 34 kJ/mol). Naturally, TΔS_IE is even larger for protein–protein interactions, for example, 171–287 kJ/mol, for 13 complexes studied by Zhang and co-workers.²⁴

Simulations of Lysozyme and Ferritin

To get some additional perspective of the performance of the IE and C2 methods, we run new simulations for two protein–ligand systems, benzene bound to the Leu99Ala T4 lysozyme mutant and phenol bound to a ferritin dimer. The systems were selected to illustrate cases where the two entropy methods are expected to work well (lysozyme with σ_IE = 6 kJ/mol) and where the performance could start to be problematic (ferritin with σ_IE ≈ 13 kJ/mol). In both cases, we run two sets of simulations (the same as for galectin-3), 10 × 10 ns simulations with a sampling frequency of 10 fs (N = 10,000,000) and 10 × 100 ns with a sampling frequency of 10 ps (N = 100,000). The pooled simulations were then divided into blocks of decreasing N, as for galectin-3.

The results for lysozyme are shown in Figure 7. For both sets of simulations, it can be seen that the C2 entropies are stable, with a variation of only 0.3–0.4 kJ/mol for the different batch sizes (reflecting a variation of the estimated σ_IE of 0.1–0.2 kJ/mol). On the other hand, the IE entropy increases by 5–6 kJ/mol as N is increased. Moreover, the estimated entropy is 0.7–0.9 kJ/mol lower when estimated from the simulations with the high sampling frequency (comparing results obtained with the same N). This probably reflects that the 10 fs sampling is too dense: the correlation of the ΔE_IE energies between two snapshots 10 fs apart is 0.74, whereas it is 0.04–0.06 between the snapshots 10 ps apart (0.31 for 0.1 ps and 0.07 for 1 ps). The difference goes down to 0.3 kJ/mol for a sampling frequency of 0.1 ps and to essentially zero for a sampling frequency of 1 ps. The estimated statistical efficiency is ∼3. In a previous study, we estimated the correlation time between different MM/GBSA energies to 1–10 ps.⁴¹ Sampling correlated energies will not affect the estimated entropies, only slow the convergence (in terms of N but not in terms of the total simulation time). For example, if we use each energy 10 times in the Gaussian simulation, 10 times more energies are needed to reach the same level of convergence for each value of σ_IE, both for IE and C2 entropies.

Entropies estimated by the IE and C2 methods for the binding of benzene to T4 lysozyme, estimated from 10 × 10 ns simulations with a sampling frequency of 10 fs (left) or 10 × 100 ns simulations with a sampling of 10 ps (right). In both cases, the simulations are then divided into batches of different sizes. Note the logarithmic scale on the x-axis.

It should be noted that this rather strong dependence of the IE entropy on N is unexpected. If we instead base the calculations on Gaussian distributed random energies with the same mean and σ_IE as the simulated data, the calculated IE entropies vary by only 0.1 and 0.6 kJ/mol for the short and long simulations, respectively (and the C2 entropies by less than 0.01 kJ/mol). Thus, the variation seems to come from the fact that the ΔE_IE energies do not exactly follow a Gaussian distribution. This is confirmed by the distributions, as shown in Figure 8. It would then be tempting to prefer the IE results, but the steady increase in the IE entropy with N, without any sign of convergence, makes it problematic to use in practice.

Distribution of the ΔE_IE energies for the binding of benzene to T4 lysozyme, estimated from 10 × 10 ns simulations with a sampling frequency of 10 fs (left) 10 × 100 ns simulations with a sampling of 10 ps (right). The green curve shows the ideal Gaussian distribution with the same average and standard deviation.

We have examined the movement of the benzene ligand in the MD simulations. Figure S6 shows that the ligand rmsd fluctuates between 0.3 and 4.2 Å, reflecting that the symmetric ligand rotates freely in the binding site (in two of the 100 ns simulations, the ligand rmsd occasionally stabilizes around 3 Å). However, it never leaves the binding site. Therefore, it seems meaningless to restrict the averaging to certain structures; excluding some initial parts of the simulations as equilibration also has a minimal effect on the calculated entropies (Figure S7). Extrapolating the IE entropy⁵⁴ according to

gives reasonably consistent results for c = 0.2–0.3, –TΔS = 15–16 kJ/mol (Table S2).

The corresponding results for ferritin are shown in Figure 9. It can be seen that the IE entropy shows the same increasing trend with N as for lysozyme, but the variation is larger in absolute terms, 11–13 kJ/mol. The increase is essentially linear on the logarithmic scale. This variation is slightly larger than expected from Gaussian distributed data, 9–10 kJ/mol.

Entropies estimated by the IE and C2 methods for the binding of phenol to ferritin, estimated from 10 × 10 ns simulations with a sampling frequency of 10 fs (left) and 10 × 100 ns simulations with a sampling of 10 ps (right). In both cases, the simulations are then divided into batches of different sizes. Note the logarithmic scale on the x-axis.

However, for ferritin, the C2 entropy also shows an increasing trend with N. In fact, the variation is larger than that for the IE entropy, 18–25 kJ/mol and with a more irregular trend. This reflects that σ_IE shows a similar increasing trend, 9–13 kJ/mol for the long simulation and 6–13 kJ/mol for the shorter simulation. This indicates that the data are distinctly non-Gaussian, as also seen in Figure 10 (with Gaussian data, there should be essentially no variation of the C2 entropy, <0.03 kJ/mol).

Distribution of the ΔE_IE energies for the binding of phenol to ferritin, estimated from 10 × 10 ns simulations with a sampling frequency of 10 fs (left) and 10 × 100 ns simulations with a sampling frequency of 10 ps (right). The green curve shows the ideal Gaussian distribution with the same average and standard deviation.

As for ferritin, the IE entropies estimated from the short simulations are lower than those from the long simulations, obtained from the same N (by 5–7 kJ/mol). Increasing the sampling frequency to 0.1 and 1 ps decreases the difference to 3 and 0.4 kJ/mol, respectively. This time, the statistical inefficiency is ∼40. Again, this indicates that a sampling frequency of 10 fs is too dense.

The rmsd of the ligand compared to the starting crystal structure is quite large and showing irregular fluctuations between 0.5 and 5 Å. Again, this reflects rotations of the small ligand in the binding site and no unbinding of the ligand. Such rotations are supported by the crystal structure, which shows two conformations of the ligand.³⁴ Again, this makes it meaningless to restrict the averaging to certain structures. Removing the initial part of the simulation has a somewhat larger effect on ferritin than that on lysozyme (Figure S9), up to 5 and 7 kJ/mol for IE and C2 entropies, respectively, but decreasing with N. Extrapolation of the entropies with eq 6 also works worse than that for lysozyme, giving 26–29 kJ/mol for IE and 39–43±4–7 kJ/mol for the C2 entropies, based on the 100 ns simulations and c = 0.2–0.3 (Table S3).

Conclusions

In this study, we have made a critical evaluation of the interaction-entropy method²³ and the related approach involving a cumulant expansion, truncated at the second order,²² as cheap estimates of the entropies for MM/GBSA calculations. By employing simulations with Gaussian-distributed random numbers, we illustrate the extremely poorly conditioning of the exponential average, which is involved in the IE method. If the standard deviation of the ΔE_IE energies, σ_IE, is larger than 15 kJ/mol, it becomes practically impossible to obtain converged results. Even worse, it is hard to recognize the problem because the extreme energies that determine the true value of the exponential average become very unlikely (cf. Figure 5). However, a good indication of the poor convergence (besides the large value of σ_IE) is the steadily increasing value of the estimated entropy as the number of energies included in the exponential average increases, as shown in Figure 2. Several previous studies have pointed out the poor convergence of exponential averages for free-energy estimates and suggested various methods to decide whether the estimate is accurate,^54−58,60 Theoretically, the C2 method shows much better convergence, up to σ_IE = 150 kJ/mol. However, for σ_IE > 25 kJ/mol, C2 entropies are too large to be realistic for a binding ligand.

Clearly, this is a practical problem because approximately half of the protein–ligand systems studied in the original IE and C2 publications^22,23 give σ_IE > 15 kJ/mol. Moreover, 13% of the systems studied by Minh and co-workers and two of our studied systems give σ_IE > 25 kJ/mol. This was also the case for 13 protein–protein interactions.²⁴ In that case, it was also observed that C2 entropies were much larger IE entropies. However, the authors suggested that “the Gaussian distribution obtained from relatively short MD runs...does not accurately represent the true energy distribution.” Therefore, they preferred the IE entropies, in contrast to our Gaussian simulations, which show that if IE and C2 entropies differ for large σ_IE, the C2 results should be trusted more than those from IE. On the other hand, for alanine scanning calculations, the effect of the entropy seems to be relatively small.²⁵⁻²⁷ In one study, it was shown that predictions by IE and C2 typically agreed with a mean absolute deviation of 1 kJ/mol and maximum deviations of 9 kJ/mol.²⁵

Zhang and co-workers suggested that the problem can be solved by using an extremely frequent sampling (every 10 fs). Our results indicate that this is too dense, giving strongly correlated energies. This does not affect the calculated entropies, but it is inefficient; for lysozyme and ferritin, the statistical inefficiency is 3–40, indicating that a sampling frequency of 0.1 ps would be more appropriate.

Moreover, both Zhang and Minh and co-workers suggested that the protein should be kept restrained during the simulations.^23,59 This reduces σ_IE by 1–16 kJ/mol (7 kJ/mol on average) but of course also reduces the calculated entropy, which is not necessarily innocent, especially as the reduction varies between different proteins (so that the relative entropy also changes). The effect of the restraints on the dynamics and the other energy terms in eq 1 are also significant.⁵⁹

Moreover, Zhang and co-workers suggested that a cutoff should be employed for the IE energies, so that those >3σ_IE are ignored in the average of eq 3.²⁴ From a statistical point of view, this is very questionable. As shown in Figure 5, when σ_IE is large, the correct ΔG or TΔS is completely dominated by the lowest values of ΔE_IE. If these are ignored or truncated, the calculated entropy will simply be incorrect (although the convergence to the wrong value will improve). Again, this can be illustrated by a Gaussian simulation, as shown in Figure 11. When σ_IE < 6 kJ/mol, ΔE_IE values outside 3σ_IE have little influence on the calculated entropy. However, for larger values, the truncated IE estimate rapidly diverges from the analytic results, giving a linear, rather than exponential increase with σ_IE. The deviation is 5, 17, 38, and 69 kJ/mol at σ_IE = 10, 15, 20, and 25 kJ/mol, respectively.

Effect of ignoring ΔE_IE values outside 3σ_IE when calculating TΔS_IE, according to Gaussian simulations. “True” is the analytical result, whereas “Truncated” is the results obtained from IE, using eq 3 and ignoring ΔE_IE values that deviate by more than 3σ_IE from the average. “Difference” is the difference between the two curves, that is, the error caused by the cutoff.

Unfortunately, our applications on lysozyme and ferritin show that the convergence is worse than expected from the Gaussian model. For lysozyme (with σ_IE = 6 kJ/mol), C2 gives converged energies that do not depend on the sampling frequency. However, the IE energies are larger and increase steadily with N, although the variation is rather small (up to 6 kJ/mol). For ferritin (with σ_IE = 13 kJ/mol), both IE and C2 give entropies that increase steadily with N and with variations of up to 13 and 25 kJ/mol for N between 100 and 10,000,000. This may be caused by the fact that the ΔE_IE energies do not follow a Gaussian distribution. However, it can also reflect that as the simulations are elongated, more and more conformational states become available (higher activation barriers can be passed), as has been discussed before.⁵¹

There is a clear relation between σ_IE and the properties of the protein–ligand complexes: σ_IE is lower in systems where the ligand binds in a buried binding site, than when it binds on the surface. This natural and intuitive: in a solvent-exposed binding site, the ligand most likely retains much of its flexibility and group-rotation degrees of freedom, whereas inside the protein, they may be strongly restricted. However, σ_IE also seems to increase for charged ligands, simply because the interaction energies increase in magnitude, which is less obvious.

In conclusion, this study gives a rather pessimistic view of the applicability of IE or C2 entropies for MM/GBSA, except for alanine screening. Our results show that σ_IE should always be reported when using these methods. Moreover, it is advisable to calculate both IE and C2 entropies for the available data and to study how they depend on N by block averaging (which also provide an estimate of the precision of the calculated entropies). In addition, a sampling frequency of 10 fs seems to be 3–40 times too dense. Clearly, the IE method should be avoided if σ_IE > 15 kJ/mol because it is impossible to converge the exponential average. Moreover, C2 seems to give unrealistically large entropies when σ_IE > 25 kJ/mol. However, in practice, even when σ_IE < 15 kJ/mol, both methods often seem to have problems to give converged results. Still, it seems that relative entropies between similar ligands binding to the same protein are more stable, as shown in Figure 3, although the results are far from quantitative. Thus, estimating entropies from MD simulations remains a challenging task.

Acknowledgments

This investigation has been supported by grants from the Swedish research council (project 2018-05003). The computations were performed on computer resources provided by the Swedish National Infrastructure for Computing (SNIC) at Lunarc at Lund University and HPC2N at Umeå University, partially funded by the Swedish Research Council (grant 2018-05973).

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.1c00374.

Absolute and relative entropies for the binding of three ligands to galectin-3 from 10 × 10 ns simulations with a sampling frequency of 10 fs; entropies estimated for the same system after 51 ns of equilibration were removed or after exclusion of snapshots with ligand RMSD > 2.2 Å, as well as ligand rmsd during the simulations and extrapolated values of the IE entropy; ligand rmsd, extrapolated IE entropies, as well as IE and C2 entropies estimated after 6 or 51 ns of equilibration were removed for the binding of benzene to lysozyme; ligand rmsd, extrapolated IE and C2 entropies, as well as IE and C2 entropies estimated after 4.25 or 51 ns of equilibration were removed for the binding of benzene to lysozyme. (PDF)

The authors declare no competing financial interest.

Supplementary Material

ct1c00374_si_001.pdf^{(840.2KB, pdf)}

References

Gohlke H.; Klebe G. Approaches to the Description and Prediction of the Binding Affinity of Small-Molecule Ligands to Macromolecular Receptors. Angew. Chem., Int. Ed. 2002, 41, 2644–2676. . [DOI] [PubMed] [Google Scholar]
Gilson M. K.; Zhou H.-X. Calculation of Protein-Ligand Binding Affinities. Annu. Rev. Biophys. Biomol. Struct. 2007, 36, 21–42. 10.1146/annurev.biophys.36.040306.132550. [DOI] [PubMed] [Google Scholar]
Wang X.; Song K.; Li L.; Chen L. Structure-Based Drug Design Strategies and Challenges. Curr. Top. Med. Chem. 2018, 18, 998–1006. 10.2174/1568026618666180813152921. [DOI] [PubMed] [Google Scholar]
Leach A. R.; Shoichet B. K.; Peishoff C. E. Prediction of Protein–Ligand Interactions. Docking and Scoring: Successes and Gaps. J. Med. Chem. 2006, 49, 5851–5855. 10.1021/jm060999m. [DOI] [PubMed] [Google Scholar]
Kontoyianni M.; Madhav P.; Suchanek E.; Seibel W. Theoretical and Practical Considerations in Virtual Screening: A Beaten Field?. Curr. Med. Chem. 2008, 15, 107–116. 10.2174/092986708783330566. [DOI] [PubMed] [Google Scholar]
Pinzi L.; Rastelli G. Molecular Docking: Shifting Paradigms in Drug Discovery. Int. J. Mol. Sci. 2019, 20, 433. 10.3390/ijms20184331. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hansen N.; Van Gunsteren W. F. Practical Aspects of Free-Energy Calculations: A Review. J. Chem. Theory Comput. 2014, 10, 2632–2647. 10.1021/ct500161f. [DOI] [PubMed] [Google Scholar]
Wang L.; Chambers J.; Abel R.. Protein-Ligand Binding Free Energy Calculations with FEP+. In Biomolecular Simulations: Methods and Protocols; Bonomi M., Camilloni C., Eds.; Springer New York: New York, NY, 2019; pp 201–232. [DOI] [PubMed] [Google Scholar]
Mey A. S. J. S.; Allen B. K.; Bruce Macdonald H. E.; Chodera J. D.; Hahn D. F.; Kuhn M.; Michel J.; Mobley D. L.; Naden L. N.; Prasad S.; et al. Best Practices for Alchemical Free Energy Calculations. Living J. Comput. Mol. Sci. 2020, 2, 18378. 10.33011/livecoms.2.1.18378. [DOI] [PMC free article] [PubMed] [Google Scholar]
Christ C. D.; Fox T. Accuracy Assessment and Automation of Free Energy Calculations for Drug Design. J. Chem. Inf. Model. 2014, 54, 108–120. 10.1021/ci4004199. [DOI] [PubMed] [Google Scholar]
Mikulskis P.; Genheden S.; Ryde U. A Large-Scale Test of Free-Energy Simulation Estimates of Protein-Ligand Binding Affinities. J. Chem. Inf. Model. 2014, 54, 2794–2806. 10.1021/ci5004027. [DOI] [PubMed] [Google Scholar]
Harder E.; Damm W.; Maple J.; Wu C.; Reboul M.; Xiang J. Y.; Wang L.; Lupyan D.; Dahlgren M. K.; Knight J. L.; et al. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12, 281–296. 10.1021/acs.jctc.5b00864. [DOI] [PubMed] [Google Scholar]
Åqvist J.; Luzhkov V. B.; Brandsdal B. O. Ligand Binding Affinities from MD Simulations. Acc. Chem. Res. 2002, 35, 358–365. 10.1021/ar010014p. [DOI] [PubMed] [Google Scholar]
Michel J. Current and Emerging Opportunities for Molecular Simulations in Structure-Based Drug Design. Phys. Chem. Chem. Phys. 2014, 16, 4465–4477. 10.1039/c3cp54164a. [DOI] [PMC free article] [PubMed] [Google Scholar]
Salo-Ahen O. M. H.; Alanko I.; Bhadane R.; Bonvin A. M. J. J.; Honorato R. V.; Hossain S.; Juffer A. H.; Kabedev A.; Lahtela-Kakkonen M.; Larsen A. S.; et al. Molecular Dynamics Simulations in Drug Discovery and Pharmaceutical Development. Processes 2021, 9, 71. [Google Scholar]
Kollman P. A.; Massova I.; Reyes C.; Kuhn B.; Huo S.; Chong L.; Lee M.; Lee T.; Duan Y.; Wang W.; et al. Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models. Acc. Chem. Res. 2000, 33, 889–897. 10.1021/ar000033j. [DOI] [PubMed] [Google Scholar]
Genheden S.; Ryde U. The MM/PBSA and MM/GBSA Methods to Estimate Ligand-Binding Affinities. Expert Opin. Drug Discov. 2015, 10, 449–461. 10.1517/17460441.2015.1032936. [DOI] [PMC free article] [PubMed] [Google Scholar]
Wang E.; Sun H.; Wang J.; Wang Z.; Liu H.; Zhang J. Z. H.; Hou T. End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design. Chem. Rev. 2019, 119, 9478–9508. 10.1021/acs.chemrev.9b00055. [DOI] [PubMed] [Google Scholar]
Kongsted J.; Ryde U. An Improved Method to Predict the Entropy Term with the MM/PBSA Approach. J. Comput. Aided Mol. Des. 2008, 23, 63–71. 10.1007/s10822-008-9238-z. [DOI] [PubMed] [Google Scholar]
Genheden S.; Kuhn O.; Mikulskis P.; Hoffmann D.; Ryde U. The Normal-Mode Entropy in the MM/GBSA Method: Effect of System Truncation, Buffer Region, and Dielectric Constant. J. Chem. Inf. Model. 2012, 52, 2079–2088. 10.1021/ci3001919. [DOI] [PubMed] [Google Scholar]
Gohlke H.; Case D. A. Converging free energy estimates: MM-PB(GB)SA studies on the protein-protein complex Ras-Raf. J. Comput. Chem. 2004, 25, 238–250. 10.1002/jcc.10379. [DOI] [PubMed] [Google Scholar]
Menzer W. M.; Li C.; Sun W.; Xie B.; Minh D. D. L. Simple Entropy Terms for End-Point Binding Free Energy Calculations. J. Chem. Theory Comput. 2018, 14, 6035–6049. 10.1021/acs.jctc.8b00418. [DOI] [PMC free article] [PubMed] [Google Scholar]
Duan L.; Liu X.; Zhang J. Z. H. Interaction Entropy: A New Paradigm for Highly Efficient and Reliable Computation of Protein-Ligand Binding Free Energy. J. Am. Chem. Soc. 2016, 138, 5722–5728. 10.1021/jacs.6b02682. [DOI] [PubMed] [Google Scholar]
Sun Z.; Yan Y. N.; Yang M.; Zhang J. Z. H. Interaction Entropy for Protein-Protein Binding. J. Chem. Phys. 2017, 146, 124124. 10.1063/1.4978893. [DOI] [PubMed] [Google Scholar]
Yan Y.; Yang M.; Ji C. G.; Zhang J. Z. H. Interaction Entropy for Computational Alanine Scanning. J. Chem. Inf. Model. 2017, 57, 1112–1122. 10.1021/acs.jcim.6b00734. [DOI] [PubMed] [Google Scholar]
He L.; Bao J.; Yang Y.; Dong S.; Zhang L.; Qi Y.; Zhang J. Z. H. Study of SHMT2 Inhibitors and Their Binding Mechanism by Computational Alanine Scanning. J. Chem. Inf. Model. 2019, 59, 3871–3878. 10.1021/acs.jcim.9b00370. [DOI] [PubMed] [Google Scholar]
Liu X.; Peng L.; Zhou Y.; Zhang Y.; Zhang J. Z. H. Computational Alanine Scanning with Interaction Entropy for Protein-Ligand Binding Free Energies. J. Chem. Theory Comput. 2018, 14, 1772–1780. 10.1021/acs.jctc.7b01295. [DOI] [PubMed] [Google Scholar]
Wallerstein J.; Ekberg V.; Misini Ignjatović M.; Kumar R.; Caldararu O.; Peterson K.; Leffler H.; Logan D. T.; Nilsson U. J.; Ryde U.; et al. Entropy–Entropy Compensation Between the Conformational and Solvent Degrees of Freedom Fine-Tunes Affinity in Ligand Binding to Galectin-3C. JACS Au 2021, 1, 484–500. 10.1021/jacsau.0c00094. [DOI] [PMC free article] [PubMed] [Google Scholar]
Boresch S.; Woodcock H. L. Convergence of Single-Step Free Energy Perturbation. Mol. Phys. 2017, 115, 1200–1213. 10.1080/00268976.2016.1269960. [DOI] [Google Scholar]
Ryde U. How Many Conformations Need To Be Sampled To Obtain Converged QM/MM Energies? The Curse of Exponential Averaging. J. Chem. Theory Comput. 2017, 13, 5745–5752. 10.1021/acs.jctc.7b00826. [DOI] [PubMed] [Google Scholar]
Mikulskis P.; Genheden S.; Wichmann K.; Ryde U. A Semiempirical Approach to Ligand-Binding Affinities: Dependence on the Hamiltonian and Corrections. J. Comput. Chem. 2012, 33, 1179–1189. 10.1002/jcc.22949. [DOI] [PubMed] [Google Scholar]
Genheden S.; Kongsted J.; Söderhjelm P.; Ryde U. Nonpolar Solvation Free Energies of Protein–Ligand Complexes. J. Chem. Theory Comput. 2010, 6, 3558–3568. 10.1021/ct100272s. [DOI] [PubMed] [Google Scholar]
Vedula L. S.; Brannigan G.; Economou N. J.; Xi J.; Hall M. A.; Liu R.; Rossi M. J.; Dailey W. P.; Grasty K. C.; Klein M. L.; et al. A Unitary Anesthetic Binding Site at High Resolution. J. Biol. Chem. 2009, 284, 24176. 10.1074/jbc.m109.017814. [DOI] [PMC free article] [PubMed] [Google Scholar]
Morton A.; Matthews B. W. Specificity of Ligand Binding in a Buried Nonpolar Cavity of T4 Lysozyme: Linkage of Dynamics and Structural Plasticity. Biochemistry 1995, 34, 8576–8588. 10.1021/bi00027a007. [DOI] [PubMed] [Google Scholar]
Mikulskis P.; Cioloboc D.; Andrejić M.; Khare S.; Brorsson J.; Genheden S.; Mata R. A.; Söderhjelm P.; Ryde U. Free-Energy Perturbation and Quantum Mechanical Study of SAMPL4 Octa-Acid Host-Guest Binding Energies. J. Comput. Aided Mol. Des. 2014, 28, 375–400. 10.1007/s10822-014-9739-x. [DOI] [PubMed] [Google Scholar]
Case D. A.; Ben-Shalom I. Y.; Brozell S. R.; Cerutti D. S.; Cheatham T. E.; Cruzeiro V. W. D.; Darden T. A.; Duke R. E.; Ghoreishi D.; et al. Amber 18; University of California: San Francisco, 2018.
Maier J. A.; Martinez C.; Kasavajhala K.; Wickstrom L.; Hauser K. E.; Simmerling C. Ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from Ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–3713. 10.1021/acs.jctc.5b00255. [DOI] [PMC free article] [PubMed] [Google Scholar]
Horn H. W.; Swope W. C.; Pitera J. W.; Madura J. D.; Dick T. J.; Hura G. L.; Head-Gordon T. Development of an Improved Four-Site Water Model for Biomolecular Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120, 9665–9678. 10.1063/1.1683075. [DOI] [PubMed] [Google Scholar]
Wang J.; Wolf R. M.; Caldwell J. W.; Kollman P. A.; Case D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174. 10.1002/jcc.20035. [DOI] [PubMed] [Google Scholar]
Bayly C. I.; Cieplak P.; Cornell W.; Kollman P. A. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The RESP Model. J. Phys. Chem. 1993, 97, 10269–10280. 10.1021/j100142a004. [DOI] [Google Scholar]
Genheden S.; Ryde U. How to Obtain Statistically Converged MM/GBSA Results. J. Comput. Chem. 2010, 31, 837–846. 10.1002/jcc.21366. [DOI] [PubMed] [Google Scholar]
Genheden S.; Ryde U. A Comparison of Different Initialization Protocols to Obtain Statistically Independent Molecular Dynamics Simulations. J. Comput. Chem. 2011, 32, 187–195. 10.1002/jcc.21546. [DOI] [PubMed] [Google Scholar]
Ryckaert J.-P.; Ciccotti G.; Berendsen H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327–341. 10.1016/0021-9991(77)90098-5. [DOI] [Google Scholar]
Wu X.; Brooks B. R. Self-Guided Langevin Dynamics Simulation Method. Chem. Phys. Lett. 2003, 381, 512–518. 10.1016/j.cplett.2003.10.013. [DOI] [Google Scholar]
Berendsen H. J. C.; Postma J. P. M.; Van Gunsteren W. F.; DiNola A.; Haak J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684–3690. 10.1063/1.448118. [DOI] [Google Scholar]
Darden T.; York D.; Pedersen L. Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089. 10.1063/1.464397. [DOI] [Google Scholar]
Miller B. R.; McGee T. D.; Swails J. M.; Homeyer N.; Gohlke H.; Roitberg A. E. MMPBSA.py: An Efficient Program for End-State Free Energy Calculations. J. Chem. Theory Comput. 2012, 8, 3314–3321. 10.1021/ct300418h. [DOI] [PubMed] [Google Scholar]
Nguyen H.; Roe D. R.; Simmerling C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J. Chem. Theory Comput. 2013, 9, 2020–2034. 10.1021/ct3010485. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kuhn B.; Kollman P. A. Binding of a Diverse Set of Ligands to Avidin and Streptavidin: An Accurate Quantitative Prediction of Their Relative Affinities by a Combination of Molecular Mechanics and Continuum Solvent Models. J. Med. Chem. 2000, 43, 3786–3791. 10.1021/jm000241h. [DOI] [PubMed] [Google Scholar]
Genheden S.; Söderhjelm P.; Ryde U. Transferability of Conformational Dependent Charges from Protein Simulations. Int. J. Quantum Chem. 2012, 112, 1768–1785. 10.1002/qua.22967. [DOI] [Google Scholar]
Genheden S.; Ryde U. Will Molecular Dynamics Simulations of Proteins Ever Reach Equilibrium?. Phys. Chem. Chem. Phys. 2012, 14, 8662–8677. 10.1039/c2cp23961b. [DOI] [PubMed] [Google Scholar]
Box G. E. P.; Muller M. E. A Note on the Generation of Random Normal Deviates. Ann. Math. Stat. 1968, 29, 610–611. 10.1214/aoms/1177706645. [DOI] [Google Scholar]
Genheden S.; Akke M.; Ryde U. Conformational Entropies and Order Parameters: Convergence, Reproducibility, and Transferability. J. Chem. Theory Comput. 2014, 10, 432–438. 10.1021/ct400747s. [DOI] [PubMed] [Google Scholar]
Zuckerman D. M.; Woolf T. B. Overcoming Finite-Sampling Errors in Fast-Switching Free-Energy Estimates: Extrapolative Analysis of a Molecular System. Chem. Phys. Lett. 2002, 351, 445–453. 10.1016/s0009-2614(01)01397-5. [DOI] [Google Scholar]
Zuckerman D. M.; Woolf T. B. Theory of a Systematic Computational Error in Free Energy Differences. Phys. Rev. Lett. 2002, 89, 180602. 10.1103/physrevlett.89.180602. [DOI] [PubMed] [Google Scholar]
Lu N.; Kofke D. A. Accuracy of Free-Energy Perturbation Calculations in Molecular Simulation. I. Modeling. J. Chem. Phys. 2001, 114, 7303–7311. 10.1063/1.1359181. [DOI] [Google Scholar]
Gore J.; Ritort F.; Bustamante C. Bias and Error in Estimates of Equilibrium Free-Energy Differences from Nonequilibrium Measurements. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 12564–12569. 10.1073/pnas.1635159100. [DOI] [PMC free article] [PubMed] [Google Scholar]
Wu D.; Kofke D. A. Model for Small-Sample Bias of Free-Energy Calculations Applied to Gaussian-Distributed Nonequilibrium Work Measurements. J. Chem. Phys. 2004, 121, 8742–8747. 10.1063/1.1806413. [DOI] [PubMed] [Google Scholar]
Menzer W. M.; Xie B.; Minh D. D. L. On Restraints in End-Point Protein-Ligand Binding Free Energy Calculations. J. Comput. Chem. 2020, 41, 573–586. 10.1002/jcc.26119. [DOI] [PMC free article] [PubMed] [Google Scholar]
Wu D.; Kofke D. A. Phase-Space Overlap Measures. I. Fail-Safe Bias Detection in Free Energies Calculated by Molecular Simulation. J. Chem. Phys. 2005, 123, 054103. 10.1063/1.1992483. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ct1c00374_si_001.pdf^{(840.2KB, pdf)}

[ref1] Gohlke H.; Klebe G. Approaches to the Description and Prediction of the Binding Affinity of Small-Molecule Ligands to Macromolecular Receptors. Angew. Chem., Int. Ed. 2002, 41, 2644–2676. . [DOI] [PubMed] [Google Scholar]

[ref2] Gilson M. K.; Zhou H.-X. Calculation of Protein-Ligand Binding Affinities. Annu. Rev. Biophys. Biomol. Struct. 2007, 36, 21–42. 10.1146/annurev.biophys.36.040306.132550. [DOI] [PubMed] [Google Scholar]

[ref3] Wang X.; Song K.; Li L.; Chen L. Structure-Based Drug Design Strategies and Challenges. Curr. Top. Med. Chem. 2018, 18, 998–1006. 10.2174/1568026618666180813152921. [DOI] [PubMed] [Google Scholar]

[ref4] Leach A. R.; Shoichet B. K.; Peishoff C. E. Prediction of Protein–Ligand Interactions. Docking and Scoring: Successes and Gaps. J. Med. Chem. 2006, 49, 5851–5855. 10.1021/jm060999m. [DOI] [PubMed] [Google Scholar]

[ref5] Kontoyianni M.; Madhav P.; Suchanek E.; Seibel W. Theoretical and Practical Considerations in Virtual Screening: A Beaten Field?. Curr. Med. Chem. 2008, 15, 107–116. 10.2174/092986708783330566. [DOI] [PubMed] [Google Scholar]

[ref6] Pinzi L.; Rastelli G. Molecular Docking: Shifting Paradigms in Drug Discovery. Int. J. Mol. Sci. 2019, 20, 433. 10.3390/ijms20184331. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref7] Hansen N.; Van Gunsteren W. F. Practical Aspects of Free-Energy Calculations: A Review. J. Chem. Theory Comput. 2014, 10, 2632–2647. 10.1021/ct500161f. [DOI] [PubMed] [Google Scholar]

[ref8] Wang L.; Chambers J.; Abel R.. Protein-Ligand Binding Free Energy Calculations with FEP+. In Biomolecular Simulations: Methods and Protocols; Bonomi M., Camilloni C., Eds.; Springer New York: New York, NY, 2019; pp 201–232. [DOI] [PubMed] [Google Scholar]

[ref9] Mey A. S. J. S.; Allen B. K.; Bruce Macdonald H. E.; Chodera J. D.; Hahn D. F.; Kuhn M.; Michel J.; Mobley D. L.; Naden L. N.; Prasad S.; et al. Best Practices for Alchemical Free Energy Calculations. Living J. Comput. Mol. Sci. 2020, 2, 18378. 10.33011/livecoms.2.1.18378. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref10] Christ C. D.; Fox T. Accuracy Assessment and Automation of Free Energy Calculations for Drug Design. J. Chem. Inf. Model. 2014, 54, 108–120. 10.1021/ci4004199. [DOI] [PubMed] [Google Scholar]

[ref11] Mikulskis P.; Genheden S.; Ryde U. A Large-Scale Test of Free-Energy Simulation Estimates of Protein-Ligand Binding Affinities. J. Chem. Inf. Model. 2014, 54, 2794–2806. 10.1021/ci5004027. [DOI] [PubMed] [Google Scholar]

[ref12] Harder E.; Damm W.; Maple J.; Wu C.; Reboul M.; Xiang J. Y.; Wang L.; Lupyan D.; Dahlgren M. K.; Knight J. L.; et al. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12, 281–296. 10.1021/acs.jctc.5b00864. [DOI] [PubMed] [Google Scholar]

[ref13] Åqvist J.; Luzhkov V. B.; Brandsdal B. O. Ligand Binding Affinities from MD Simulations. Acc. Chem. Res. 2002, 35, 358–365. 10.1021/ar010014p. [DOI] [PubMed] [Google Scholar]

[ref14] Michel J. Current and Emerging Opportunities for Molecular Simulations in Structure-Based Drug Design. Phys. Chem. Chem. Phys. 2014, 16, 4465–4477. 10.1039/c3cp54164a. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref15] Salo-Ahen O. M. H.; Alanko I.; Bhadane R.; Bonvin A. M. J. J.; Honorato R. V.; Hossain S.; Juffer A. H.; Kabedev A.; Lahtela-Kakkonen M.; Larsen A. S.; et al. Molecular Dynamics Simulations in Drug Discovery and Pharmaceutical Development. Processes 2021, 9, 71. [Google Scholar]

[ref16] Kollman P. A.; Massova I.; Reyes C.; Kuhn B.; Huo S.; Chong L.; Lee M.; Lee T.; Duan Y.; Wang W.; et al. Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models. Acc. Chem. Res. 2000, 33, 889–897. 10.1021/ar000033j. [DOI] [PubMed] [Google Scholar]

[ref17] Genheden S.; Ryde U. The MM/PBSA and MM/GBSA Methods to Estimate Ligand-Binding Affinities. Expert Opin. Drug Discov. 2015, 10, 449–461. 10.1517/17460441.2015.1032936. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref18] Wang E.; Sun H.; Wang J.; Wang Z.; Liu H.; Zhang J. Z. H.; Hou T. End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design. Chem. Rev. 2019, 119, 9478–9508. 10.1021/acs.chemrev.9b00055. [DOI] [PubMed] [Google Scholar]

[ref19] Kongsted J.; Ryde U. An Improved Method to Predict the Entropy Term with the MM/PBSA Approach. J. Comput. Aided Mol. Des. 2008, 23, 63–71. 10.1007/s10822-008-9238-z. [DOI] [PubMed] [Google Scholar]

[ref20] Genheden S.; Kuhn O.; Mikulskis P.; Hoffmann D.; Ryde U. The Normal-Mode Entropy in the MM/GBSA Method: Effect of System Truncation, Buffer Region, and Dielectric Constant. J. Chem. Inf. Model. 2012, 52, 2079–2088. 10.1021/ci3001919. [DOI] [PubMed] [Google Scholar]

[ref21] Gohlke H.; Case D. A. Converging free energy estimates: MM-PB(GB)SA studies on the protein-protein complex Ras-Raf. J. Comput. Chem. 2004, 25, 238–250. 10.1002/jcc.10379. [DOI] [PubMed] [Google Scholar]

[ref22] Menzer W. M.; Li C.; Sun W.; Xie B.; Minh D. D. L. Simple Entropy Terms for End-Point Binding Free Energy Calculations. J. Chem. Theory Comput. 2018, 14, 6035–6049. 10.1021/acs.jctc.8b00418. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref23] Duan L.; Liu X.; Zhang J. Z. H. Interaction Entropy: A New Paradigm for Highly Efficient and Reliable Computation of Protein-Ligand Binding Free Energy. J. Am. Chem. Soc. 2016, 138, 5722–5728. 10.1021/jacs.6b02682. [DOI] [PubMed] [Google Scholar]

[ref24] Sun Z.; Yan Y. N.; Yang M.; Zhang J. Z. H. Interaction Entropy for Protein-Protein Binding. J. Chem. Phys. 2017, 146, 124124. 10.1063/1.4978893. [DOI] [PubMed] [Google Scholar]

[ref25] Yan Y.; Yang M.; Ji C. G.; Zhang J. Z. H. Interaction Entropy for Computational Alanine Scanning. J. Chem. Inf. Model. 2017, 57, 1112–1122. 10.1021/acs.jcim.6b00734. [DOI] [PubMed] [Google Scholar]

[ref26] He L.; Bao J.; Yang Y.; Dong S.; Zhang L.; Qi Y.; Zhang J. Z. H. Study of SHMT2 Inhibitors and Their Binding Mechanism by Computational Alanine Scanning. J. Chem. Inf. Model. 2019, 59, 3871–3878. 10.1021/acs.jcim.9b00370. [DOI] [PubMed] [Google Scholar]

[ref27] Liu X.; Peng L.; Zhou Y.; Zhang Y.; Zhang J. Z. H. Computational Alanine Scanning with Interaction Entropy for Protein-Ligand Binding Free Energies. J. Chem. Theory Comput. 2018, 14, 1772–1780. 10.1021/acs.jctc.7b01295. [DOI] [PubMed] [Google Scholar]

[ref28] Wallerstein J.; Ekberg V.; Misini Ignjatović M.; Kumar R.; Caldararu O.; Peterson K.; Leffler H.; Logan D. T.; Nilsson U. J.; Ryde U.; et al. Entropy–Entropy Compensation Between the Conformational and Solvent Degrees of Freedom Fine-Tunes Affinity in Ligand Binding to Galectin-3C. JACS Au 2021, 1, 484–500. 10.1021/jacsau.0c00094. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref29] Boresch S.; Woodcock H. L. Convergence of Single-Step Free Energy Perturbation. Mol. Phys. 2017, 115, 1200–1213. 10.1080/00268976.2016.1269960. [DOI] [Google Scholar]

[ref30] Ryde U. How Many Conformations Need To Be Sampled To Obtain Converged QM/MM Energies? The Curse of Exponential Averaging. J. Chem. Theory Comput. 2017, 13, 5745–5752. 10.1021/acs.jctc.7b00826. [DOI] [PubMed] [Google Scholar]

[ref31] Mikulskis P.; Genheden S.; Wichmann K.; Ryde U. A Semiempirical Approach to Ligand-Binding Affinities: Dependence on the Hamiltonian and Corrections. J. Comput. Chem. 2012, 33, 1179–1189. 10.1002/jcc.22949. [DOI] [PubMed] [Google Scholar]

[ref32] Genheden S.; Kongsted J.; Söderhjelm P.; Ryde U. Nonpolar Solvation Free Energies of Protein–Ligand Complexes. J. Chem. Theory Comput. 2010, 6, 3558–3568. 10.1021/ct100272s. [DOI] [PubMed] [Google Scholar]

[ref33] Vedula L. S.; Brannigan G.; Economou N. J.; Xi J.; Hall M. A.; Liu R.; Rossi M. J.; Dailey W. P.; Grasty K. C.; Klein M. L.; et al. A Unitary Anesthetic Binding Site at High Resolution. J. Biol. Chem. 2009, 284, 24176. 10.1074/jbc.m109.017814. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref34] Morton A.; Matthews B. W. Specificity of Ligand Binding in a Buried Nonpolar Cavity of T4 Lysozyme: Linkage of Dynamics and Structural Plasticity. Biochemistry 1995, 34, 8576–8588. 10.1021/bi00027a007. [DOI] [PubMed] [Google Scholar]

[ref35] Mikulskis P.; Cioloboc D.; Andrejić M.; Khare S.; Brorsson J.; Genheden S.; Mata R. A.; Söderhjelm P.; Ryde U. Free-Energy Perturbation and Quantum Mechanical Study of SAMPL4 Octa-Acid Host-Guest Binding Energies. J. Comput. Aided Mol. Des. 2014, 28, 375–400. 10.1007/s10822-014-9739-x. [DOI] [PubMed] [Google Scholar]

[ref36] Case D. A.; Ben-Shalom I. Y.; Brozell S. R.; Cerutti D. S.; Cheatham T. E.; Cruzeiro V. W. D.; Darden T. A.; Duke R. E.; Ghoreishi D.; et al. Amber 18; University of California: San Francisco, 2018.

[ref37] Maier J. A.; Martinez C.; Kasavajhala K.; Wickstrom L.; Hauser K. E.; Simmerling C. Ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from Ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–3713. 10.1021/acs.jctc.5b00255. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref38] Horn H. W.; Swope W. C.; Pitera J. W.; Madura J. D.; Dick T. J.; Hura G. L.; Head-Gordon T. Development of an Improved Four-Site Water Model for Biomolecular Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120, 9665–9678. 10.1063/1.1683075. [DOI] [PubMed] [Google Scholar]

[ref39] Wang J.; Wolf R. M.; Caldwell J. W.; Kollman P. A.; Case D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174. 10.1002/jcc.20035. [DOI] [PubMed] [Google Scholar]

[ref40] Bayly C. I.; Cieplak P.; Cornell W.; Kollman P. A. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The RESP Model. J. Phys. Chem. 1993, 97, 10269–10280. 10.1021/j100142a004. [DOI] [Google Scholar]

[ref41] Genheden S.; Ryde U. How to Obtain Statistically Converged MM/GBSA Results. J. Comput. Chem. 2010, 31, 837–846. 10.1002/jcc.21366. [DOI] [PubMed] [Google Scholar]

[ref42] Genheden S.; Ryde U. A Comparison of Different Initialization Protocols to Obtain Statistically Independent Molecular Dynamics Simulations. J. Comput. Chem. 2011, 32, 187–195. 10.1002/jcc.21546. [DOI] [PubMed] [Google Scholar]

[ref43] Ryckaert J.-P.; Ciccotti G.; Berendsen H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327–341. 10.1016/0021-9991(77)90098-5. [DOI] [Google Scholar]

[ref44] Wu X.; Brooks B. R. Self-Guided Langevin Dynamics Simulation Method. Chem. Phys. Lett. 2003, 381, 512–518. 10.1016/j.cplett.2003.10.013. [DOI] [Google Scholar]

[ref45] Berendsen H. J. C.; Postma J. P. M.; Van Gunsteren W. F.; DiNola A.; Haak J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684–3690. 10.1063/1.448118. [DOI] [Google Scholar]

[ref46] Darden T.; York D.; Pedersen L. Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089. 10.1063/1.464397. [DOI] [Google Scholar]

[ref47] Miller B. R.; McGee T. D.; Swails J. M.; Homeyer N.; Gohlke H.; Roitberg A. E. MMPBSA.py: An Efficient Program for End-State Free Energy Calculations. J. Chem. Theory Comput. 2012, 8, 3314–3321. 10.1021/ct300418h. [DOI] [PubMed] [Google Scholar]

[ref48] Nguyen H.; Roe D. R.; Simmerling C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J. Chem. Theory Comput. 2013, 9, 2020–2034. 10.1021/ct3010485. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref49] Kuhn B.; Kollman P. A. Binding of a Diverse Set of Ligands to Avidin and Streptavidin: An Accurate Quantitative Prediction of Their Relative Affinities by a Combination of Molecular Mechanics and Continuum Solvent Models. J. Med. Chem. 2000, 43, 3786–3791. 10.1021/jm000241h. [DOI] [PubMed] [Google Scholar]

[ref50] Genheden S.; Söderhjelm P.; Ryde U. Transferability of Conformational Dependent Charges from Protein Simulations. Int. J. Quantum Chem. 2012, 112, 1768–1785. 10.1002/qua.22967. [DOI] [Google Scholar]

[ref51] Genheden S.; Ryde U. Will Molecular Dynamics Simulations of Proteins Ever Reach Equilibrium?. Phys. Chem. Chem. Phys. 2012, 14, 8662–8677. 10.1039/c2cp23961b. [DOI] [PubMed] [Google Scholar]

[ref52] Box G. E. P.; Muller M. E. A Note on the Generation of Random Normal Deviates. Ann. Math. Stat. 1968, 29, 610–611. 10.1214/aoms/1177706645. [DOI] [Google Scholar]

[ref53] Genheden S.; Akke M.; Ryde U. Conformational Entropies and Order Parameters: Convergence, Reproducibility, and Transferability. J. Chem. Theory Comput. 2014, 10, 432–438. 10.1021/ct400747s. [DOI] [PubMed] [Google Scholar]

[ref54] Zuckerman D. M.; Woolf T. B. Overcoming Finite-Sampling Errors in Fast-Switching Free-Energy Estimates: Extrapolative Analysis of a Molecular System. Chem. Phys. Lett. 2002, 351, 445–453. 10.1016/s0009-2614(01)01397-5. [DOI] [Google Scholar]

[ref55] Zuckerman D. M.; Woolf T. B. Theory of a Systematic Computational Error in Free Energy Differences. Phys. Rev. Lett. 2002, 89, 180602. 10.1103/physrevlett.89.180602. [DOI] [PubMed] [Google Scholar]

[ref56] Lu N.; Kofke D. A. Accuracy of Free-Energy Perturbation Calculations in Molecular Simulation. I. Modeling. J. Chem. Phys. 2001, 114, 7303–7311. 10.1063/1.1359181. [DOI] [Google Scholar]

[ref57] Gore J.; Ritort F.; Bustamante C. Bias and Error in Estimates of Equilibrium Free-Energy Differences from Nonequilibrium Measurements. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 12564–12569. 10.1073/pnas.1635159100. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref58] Wu D.; Kofke D. A. Model for Small-Sample Bias of Free-Energy Calculations Applied to Gaussian-Distributed Nonequilibrium Work Measurements. J. Chem. Phys. 2004, 121, 8742–8747. 10.1063/1.1806413. [DOI] [PubMed] [Google Scholar]

[ref59] Menzer W. M.; Xie B.; Minh D. D. L. On Restraints in End-Point Protein-Ligand Binding Free Energy Calculations. J. Comput. Chem. 2020, 41, 573–586. 10.1002/jcc.26119. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref60] Wu D.; Kofke D. A. Phase-Space Overlap Measures. I. Fail-Safe Bias Detection in Free Energies Calculated by Molecular Simulation. J. Chem. Phys. 2005, 123, 054103. 10.1063/1.1992483. [DOI] [PubMed] [Google Scholar]

PERMALINK

On the Use of Interaction Entropy and Related Methods to Estimate Binding Entropies

Vilhelm Ekberg

Ulf Ryde

Abstract

Introduction