Quantifying Protein–Protein Interactions in Molecular Simulations

Alfredo Jost Lopez; Patrick K Quoika; Max Linke; Gerhard Hummer; Jürgen Köfinger

2020 May 7;124(23):4673–4685. doi: 10.1021/acs.jpcb.9b11802

PMCID: PMC7294537 PMID: 32379446

Abstract

Interactions among proteins, nucleic acids, and other macromolecules are essential for their biological functions and shape the physicochemical properties of the crowded environments inside living cells. Binding interactions are commonly quantified by dissociation constants K_d, and both binding and nonbinding interactions are quantified by second osmotic virial coefficients B₂. As a measure of nonspecific binding and stickiness, B₂ is receiving renewed attention in the context of so-called liquid–liquid phase separation in protein and nucleic acid solutions. We show that K_d is fully determined by B₂ and the fraction of the dimer observed in molecular simulations of two proteins in a box. We derive two methods to calculate B₂. From molecular dynamics or Monte Carlo simulations using implicit solvents, we can determine B₂ from insertion and removal energies by applying Bennett’s acceptance ratio (BAR) method or the (binless) weighted histogram analysis method (WHAM). From simulations using implicit or explicit solvents, one can estimate B₂ from the probability that the two molecules are within a volume large enough to cover their range of interactions. We validate these methods for coarse-grained Monte Carlo simulations of three weakly binding proteins. Our estimates for K_d and B₂ allow us to separate out the contributions of nonbinding interactions to B₂. Comparison of calculated and measured values of K_d and B₂ can be used to (re-)parameterize and improve molecular force fields by calibrating specific affinities, overall stickiness, and nonbinding interactions. The accuracy and efficiency of K_d and B₂ calculations make them well suited for high-throughput studies of large interactomes.

1. Introduction

In biological cells, most protein, DNA, and RNA molecules have to bind to specific binding partners to perform their biological functions. These specific interactions compete with nonspecific interactions, and cells have evolved various mechanisms to minimize wasteful nonspecific binding.^1,2 However, nonspecific interactions shape the physicochemical properties of the crowded environments inside cells.³ The quantification of binding affinities and interaction strengths of biological macromolecules is thus crucial for the understanding and modeling of cellular processes. In the following, we focus on protein–protein interactions, but all of our results are generally applicable to other specific and nonspecific binding interactions.

Experimentally, protein interactions are quantified by the dissociation constants K_d and the second osmotic virial coefficient B_ij of protein species i and j. We follow the common convention and use B₂₂ for self-interactions and B₂₃ for cross-interactions. The dissociation constant K_d quantifies the amount of bound proteins and can be measured in isothermal titration calorimetry, surface plasmon resonance, or analytical ultracentrifugation experiments, for example.⁴ The interaction strength of pairs of proteins in binding and nonbinding configurations can be quantified by measuring the second osmotic virial coefficient B_ij, which relates the microscopic protein interactions to the macroscopic osmotic pressure.⁵⁻⁷ Moreover, the second osmotic virial coefficient is related to solubility and used as a predictor for protein crystallization conditions.^8,9 In experiments, B_ij is measured by sedimentation¹⁰⁻¹² and size-exclusion chromatography.⁸ Scattering experiments, such as static light scattering (SLS) and small-angle X-ray scattering (SAXS) experiments, can provide approximate estimates for B_ij.^13,14

K_d and B_ij are crucial quantities to relate molecular simulations of interacting proteins to the experiment. Such comparisons become increasingly important as molecular simulations of crowded cell-like environments have become computationally feasible, even in full atomic detail.^15,16 In simulations of strong binders, K_d is usually determined by calculating the binding free energy to specific binding interfaces.¹⁷ If binding interfaces are unknown, K_d values are often calculated from the ratio of bound and unbound populations,¹⁸ as recently applied to RNA–RNA binding.¹⁹ As we will discuss here, this approximation is accurate only for special cases. B_ij can be estimated by integration over the configuration space,²⁰⁻²² by Mayer sampling,^23,24 from molecular simulations using radial distribution functions or potentials of mean force,²⁵⁻²⁸ and by simply counting all configurations in which proteins do not interact.^29,30

Here, we show that K_d is fully determined by B_ij and the fraction p_b(V) of bound proteins estimated from molecular simulations of two proteins in a box with volume V, i.e.

$$K_d = \frac{1}{N_A\, p_b(V)\,\left[V - 2B_{ij}\right]} \tag{1}$$

Here N_A is Avogadro’s constant. In the derivation of this equation, we do not make any assumptions about the interaction strength or about the degrees of freedom of proteins or the solvent. Thus, it is generally applicable and valid not only for coarse-grained simulations using implicit solvents but also for fully atomistic molecular dynamics simulations using explicit solvents. We present two different routes to calculate B_ij and thus K_d.
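As a minimal numerical sketch of how this relation might be applied, the function below evaluates K_d = 1/(N_A p_b(V) [V − 2B_ij]), which is the form in which eq 1 is read here. The function name, the use of nm³ for volumes, and the conversion to mol/L are illustrative assumptions and not part of the original work.

```python
N_A = 6.02214076e23   # Avogadro's constant in 1/mol
NM3_TO_L = 1.0e-24    # 1 nm^3 expressed in liters


def kd_finite_size_corrected(p_b, box_volume_nm3, b_ij_nm3):
    """Return K_d in mol/L from K_d = 1 / (N_A * p_b * (V - 2*B_ij)).

    p_b            : fraction of bound configurations in the box (0 < p_b <= 1)
    box_volume_nm3 : simulation box volume V in nm^3
    b_ij_nm3       : second osmotic virial coefficient B_ij in nm^3
    """
    if not 0.0 < p_b <= 1.0:
        raise ValueError("p_b must lie in (0, 1]")
    effective_volume_l = (box_volume_nm3 - 2.0 * b_ij_nm3) * NM3_TO_L
    if effective_volume_l <= 0.0:
        raise ValueError("V - 2*B_ij must be positive")
    return 1.0 / (N_A * p_b * effective_volume_l)


# Hypothetical input values, chosen only to illustrate the call.
kd_example = kd_finite_size_corrected(p_b=0.5, box_volume_nm3=1000.0, b_ij_nm3=-100.0)
```

A negative B_ij (net attraction) enlarges the effective volume V − 2B_ij and therefore lowers K_d relative to the B_ij = 0 case for the same observed p_b(V).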

For simulations using implicit solvents, we can apply protein insertion and removal moves to estimate the free energy that corresponds to the two-particle partition function determining B_ij. The insertion ensemble can be generated with any Monte Carlo or molecular dynamics code to sample from the canonical ensemble without modification. We estimate the partition function by combining the insertion and removal ensemble using either Bennett’s acceptance ratio (BAR) method³¹ or the binless weighted histogram analysis method (WHAM).³²⁻³⁴ In contrast to the Mayer sampling method,^23,24 which uses molecular Monte Carlo integration to calculate virial coefficients even of higher orders, here, we use exactly the same simulation system for the calculation of B_ij as we use to sample from the canonical ensemble.

For simulations using either implicit or explicit solvents, B_ij can be calculated accurately by estimating the probability that the two proteins are outside of their interaction range.²⁹ We present mathematically simple expressions for B_ij and K_d in terms of this probability, which provide insights into their physical interpretations complementary to more common formulations based on radial distribution functions or potentials of mean force.

We quantify the interactions of the two proteins when they are not bound using K_d and B_ij. Previously, theoretical models for excluded volumes have been used to extract nonbinding interactions from experimentally measured B_ij values.³⁵ Here, we use the fact that the contributions of bound configurations to B_ij are completely determined by K_d and show that the remaining contributions have a simple and clear interpretation. Moreover, we propose that these contributions of nonbinding interactions can be estimated in experiments.

The article is organized as follows. In Section 2, we derive expressions to calculate the dissociation constant and the second osmotic virial coefficient from simulations. We present the details of our methods in Section 3 and a validation of our methods and results for three weakly binding proteins using coarse-grained simulations in Section 4. We end with conclusions in Section 5.

2. Theory

For simulations of two proteins in a box, we show that the dissociation constant K_d is determined by the binding probability and the second osmotic virial coefficient B_ij of protein species i and j. The latter is determined by the two-particle partition function, which in general can be estimated from the fraction of states where proteins are outside of their interaction range²⁹ or, for implicit solvents, by performing a free energy calculation using insertion and removal moves.

2.1. Preliminaries

McMillan and Mayer⁵ have shown how we can apply results of statistical mechanics to describe osmotic properties of solutions. Integrating out solvent degrees of freedom, only solute degrees of freedom remain and solutes interact with each other via effective potentials. For such a system with m solute species, the virial equation of state^36,37 becomes the osmotic virial equation of state, i.e.

where Π is the osmotic pressure, V_m is the molar volume, R is the gas constant, T is the temperature, x_i is the mole fraction of species i, and B_ij is the osmotic second virial coefficient of proteins of species i and j.

We can express the second virial coefficients B_ij of an arbitrarily shaped particle of species i and an arbitrarily shaped particle of species j, via one- and two-particle configurational partition functions. To do so, we extend the derivation by McQuarrie to nonspherical particles⁷ and start from the grand canonical partition function

where V is the volume, N_i is the number of molecules of species i containing n_i atoms each, and z_i = exp(βμ_i) is the fugacity determined by the chemical potential μ_i of species i and the inverse temperature β = 1/(k_bT). k_b is Boltzmann’s constant. The osmotic pressure is a function of the fugacities and given by βΠV = ln Ξ(T,V, z₁,...,z_m).^38,39 Here, we write the canonical partition function Q(N₁,...,N_m) of m species of arbitrarily shaped particles as

where $\mathcal{Z}(N_1,\ldots,N_m)$ is the corresponding configurational partition function

$$\mathcal{Z}(N_1,\ldots,N_m) = \int \mathrm{d}X\, e^{-\beta U(X)} \tag{5}$$

where the potential energy U(X) depends on the set X of all |X| = Σ_i N_i n_i atom positions. In eq 4, we introduced $\mathcal{Z}_i$ for the single-particle canonical partition function, e.g., $\mathcal{Z}_1$ for a single particle of species 1. For spherically symmetric particles, $Z_i = V$ and we recover McQuarrie’s expression⁷ for Q(N₁, ..., N_m). For rigid cylindrically symmetric and asymmetric particles, $Z_i = 4\pi V$ and $Z_i = 8\pi^2 V$, respectively. Note that in the following, we use “Z” instead of “$\mathcal{Z}$” for these expressions for rigid molecules to distinguish them from the full configurational partition function of flexible molecules written as calligraphic “$\mathcal{Z}$”. We obtain for the second osmotic virial coefficients

$$B_{ij} = -\frac{V}{2}\left(\frac{\mathcal{Z}_{ij}}{\mathcal{Z}_i\,\mathcal{Z}_j} - 1\right) \tag{6}$$

where we introduced $\mathcal{Z}_{ij}$ for the two-particle partition function, e.g., $\mathcal{Z}_{12}$ for a pair of particles of species 1 and 2 or $\mathcal{Z}_{11}$ for a pair of particles of species 1.

2.2. Estimating the Dissociation Constant

We show how to obtain a box-size-independent estimate of the dissociation constant K_d from simulations of two proteins in a box. K_d is related to the Gibbs binding free energy ΔG via

where c₀ = 1M is the standard concentration and Δp is the pressure difference between bound and unbound states.⁴⁰ The last term is usually small and can be neglected.

For large enough box volumes V, one would be tempted to estimate the dissociation constant of two proteins A and B directly from the binding probability p_b(V). For a discussion of suitable definitions of bound states, see Section 2.7. Using the concentrations of free proteins [A] = [B] = (1 – p_b(V))/(N_AV) and the concentration of bound protein [AB] = p_b(V)/(N_AV), a first rough estimate of the dissociation constant is given by

$$K_d \approx \frac{[A][B]}{[AB]} = \frac{\left[1 - p_b(V)\right]^2}{N_A\, V\, p_b(V)} \tag{8}$$

For small box sizes typically used in simulations, this estimate suffers from finite-size effects. Accurate estimates using eq 8 would require unusually large boxes, as we show in Section 4, which makes sampling highly inefficient.

To overcome this finite-size effect, we effectively extend the box volume analytically and calculate K_d in the limit of infinite volume (Figure 1). We emphasize here that in the following derivation, we consider fully flexible proteins without any restrictions on their internal degrees of freedom. We remove the translational and rotational degrees of freedom of the protein of species i, which correspond to a factor Z_i(V) = 8π²V in the partition function for asymmetric proteins. That is, we fix the position and orientation of the protein of species i, which leaves the internal degrees of freedom due to the flexibility of the protein unchanged. The corresponding partition function of j in the presence of i, with i fixed in position and orientation but internally flexible, is given by

$$\mathcal{Z}_{j|i}(V) = \frac{\mathcal{Z}_{ij}(V)}{Z_i(V)} \tag{9}$$

We extend this system with a fixed position and orientation of the flexible protein i by an additional volume ΔV accessible to the second protein. The contribution to the partition function of a protein of species j being in this additional volume ΔV is given by

$$\mathcal{Z}_{j|i}(\Delta V) = Z_j(\Delta V)\,\mathcal{Z}_i^{\mathrm{int}}\,\mathcal{Z}_j^{\mathrm{int}} \tag{10}$$

where Z_j(ΔV) = 8π²ΔV gives the contribution due to the translational and rotational degrees of freedom of an asymmetric protein to the partition function. $\mathcal{Z}_i^{\mathrm{int}}$ and $\mathcal{Z}_j^{\mathrm{int}}$ are the partition functions of individual proteins i and j, whose positions and orientations are fixed in space. That is, $\mathcal{Z}_i^{\mathrm{int}}$ and $\mathcal{Z}_j^{\mathrm{int}}$ contain only contributions due to the respective internal degrees of freedom of free proteins and due to the degrees of freedom of solvent molecules in the vicinity of the proteins, which differ from the bulk due to the presence of the protein. For rigid protein models in implicit solvents, $\mathcal{Z}_i^{\mathrm{int}} = \mathcal{Z}_j^{\mathrm{int}} = 1$.

Figure 1. Calculating the dissociation constant K_d and the second osmotic virial coefficient B_ij from simulations of two proteins in a box of volume V. The red protein has a single specific wedge-shaped binding site for the triangular blue protein. The light-blue protein configurations illustrate different interaction modes of the two proteins considered in the derivation of eqs 1 and 28. To obtain a K_d estimate independent of box size, we analytically extend the two-particle partition function for the simulation box by the contributions of an extension volume ΔV (gray shaded area) and perform the limit ΔV → ∞. We calculate B_ij from the probability p_v(V) that the two proteins are within a subvolume v (green), which is at least large enough to cover all protein–protein interactions (yellow shaded area).

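The probability p_b(V_ex) that the two proteins are bound in the extended volume V_ex = V + ΔV is now given by the ratio of the partition function $\mathcal{Z}_{j|i}^{(b)}$ of the bound proteins to the partition function $\mathcal{Z}_{j|i}(V_{\mathrm{ex}})$ of the extended system. With the position of protein i fixed, $\mathcal{Z}_{j|i}^{(b)}$ is independent of the size of the volume V and thus the same for the simulation box and for the extended system, i.e., $\mathcal{Z}_{j|i}^{(b)}(V_{\mathrm{ex}}) = \mathcal{Z}_{j|i}^{(b)}(V)$. Consequently, using eq 10 for the contribution of the extension volume

$$p_b(V_{\mathrm{ex}}) = \frac{\mathcal{Z}_{j|i}^{(b)}}{\mathcal{Z}_{j|i}(V) + Z_j(\Delta V)\,\mathcal{Z}_i^{\mathrm{int}}\,\mathcal{Z}_j^{\mathrm{int}}} \tag{11}$$

To calculate a K_d value unaffected by the finite size of the simulation box, we now substitute eq 11 into eq 8. We then take the limit ΔV → ∞ and use that Z_j(V)/V = 8π² to obtain

$$K_d = \frac{8\pi^2\,\mathcal{Z}_i^{\mathrm{int}}\,\mathcal{Z}_j^{\mathrm{int}}}{N_A\,\mathcal{Z}_{j|i}^{(b)}} \tag{12}$$

We can rewrite this equation realizing that the partition function of all bound states of the system, where also protein i can move and rotate, is given by $\mathcal{Z}_{ij}^{(b)}(V) = Z_i(V)\,\mathcal{Z}_{j|i}^{(b)}$. Note that $\mathcal{Z}_{ij}^{(b)}(V)$ is proportional to V. With the single-protein partition functions $\mathcal{Z}_i(V) = Z_i(V)\,\mathcal{Z}_i^{\mathrm{int}}$, equation 12 becomes

$$K_d = \frac{\mathcal{Z}_i(V)\,\mathcal{Z}_j(V)}{N_A\, V\, \mathcal{Z}_{ij}^{(b)}(V)} \tag{13}$$

Expressing $\mathcal{Z}_{ij}$ by the second osmotic virial coefficient defined in eq 6

$$\mathcal{Z}_{ij}(V) = \mathcal{Z}_i(V)\,\mathcal{Z}_j(V)\left(1 - \frac{2B_{ij}}{V}\right) \tag{14}$$

and inserting the resulting expression in eq 12, with $\mathcal{Z}_{ij}^{(b)}(V) = p_b(V)\,\mathcal{Z}_{ij}(V)$, we obtain the relationship between K_d and B_ij given in eq 1

$$K_d = \frac{1}{N_A\, p_b(V)\,\left[V - 2B_{ij}\right]} \tag{1}$$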

As a corollary, the volume dependence of the fraction of bound proteins

$$p_b(V) = \frac{1}{N_A\, K_d\,\left(V - 2B_{ij}\right)} \tag{15}$$

is parameterized by K_d and B_ij.

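As we derive in the following, K_d and B_ij fulfill the approximate relation

$$B_{ij} \approx -\frac{1}{2N_A K_d} \tag{16}$$

This approximate relationship becomes an exact relationship if we define all interacting states as bound states^41,42 or for proteins that do not interact when they are not bound (see Section 2.4). We write $\mathcal{Z}_{ij}$ as a sum of the partition functions for the bound and for the unbound states, i.e., $\mathcal{Z}_{ij} = \mathcal{Z}_{ij}^{(b)} + \mathcal{Z}_{ij}^{(u)}$, and insert this expression in eq 14. We then obtain

$$B_{ij} = -\frac{V}{2}\left[\frac{\mathcal{Z}_{ij}^{(b)}}{\mathcal{Z}_i\,\mathcal{Z}_j} + \frac{\mathcal{Z}_{ij}^{(u)}}{\mathcal{Z}_i\,\mathcal{Z}_j} - 1\right] \tag{17}$$

If unbound interactions are weak, then

$$\mathcal{Z}_{ij}^{(u)} \approx \mathcal{Z}_i\,\mathcal{Z}_j \tag{18}$$

such that

$$B_{ij} \approx -\frac{V}{2}\,\frac{\mathcal{Z}_{ij}^{(b)}}{\mathcal{Z}_i\,\mathcal{Z}_j} = -\frac{1}{2N_A K_d} \tag{19}$$

where we used eq 13. Rearranging this equation, we arrive at eq 16. We can now insert this expression into eq 15 and obtain

$$p_b(V) \approx \frac{1}{N_A\, K_d\, V + 1} \tag{20}$$

from which we can express K_d to obtain an approximate estimate for K_d, which we call K_d^′, i.e.

$$K_d^{\prime} = \frac{p_u(V)}{N_A\, V\, p_b(V)} \tag{21}$$

Here, we introduced the fraction of unbound protein configurations as p_u(V) = 1 – p_b(V). Note that eq 21 corresponds to eq 13 of de Jong et al.¹⁸ For the exact relationship between K_d and K_d^′, see Section 2.4.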

2.3. Estimating the Second Osmotic Virial Coefficient

As we have shown above, we have to estimate B_ij to accurately estimate K_d. To do so, we apply the same concepts as we have used for the calculation of K_d. We first remove contributions to the partition function due to the translational and rotational freedom of the whole system by keeping the position and the orientation of the otherwise flexible protein i fixed (eq 9). Around this protein, we define a subvolume v < V, which has to be big enough such that it captures all protein–protein interactions (Figure 1). Outside this subvolume, protein–protein interactions can be neglected. That is, the flexible protein j moves freely when it is in volume δv = V – v.

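The probability p_v(V) that protein j is in subvolume v is given by

$$p_v(V) = \frac{\mathcal{Z}_{j|i}(v)}{\mathcal{Z}_{j|i}(v) + Z_j(\delta v)\,\mathcal{Z}_i^{\mathrm{int}}\,\mathcal{Z}_j^{\mathrm{int}}} \tag{22}$$

where $\mathcal{Z}_{j|i}(v)$ is given analogous to eq 9, the second term in the denominator is the contribution of the freely moving protein j in δv analogous to eq 10, and Z_j(δv) = 8π²δv for asymmetric proteins. We can express $\mathcal{Z}_{j|i}(v)$ from eq 22 as

$$\mathcal{Z}_{j|i}(v) = \frac{p_v(V)}{1 - p_v(V)}\,Z_j(\delta v)\,\mathcal{Z}_i^{\mathrm{int}}\,\mathcal{Z}_j^{\mathrm{int}} \tag{23}$$

Using eq 9 for $\mathcal{Z}_{j|i}(v)$, it follows that

$$\frac{\mathcal{Z}_{ij}(v)}{\mathcal{Z}_i(v)\,\mathcal{Z}_j(v)} = \frac{p_v(V)}{1 - p_v(V)}\,\frac{Z_j(\delta v)}{Z_j(v)} \tag{24}$$

1 – p_v(V) is the probability that protein j is in volume δv. Consequently

$$B_{ij} = -\frac{v}{2}\left[\frac{p_v(V)}{1 - p_v(V)}\,\frac{Z_j(\delta v)}{Z_j(v)} - 1\right] \tag{25}$$

Using that Z_j(v) and Z_j(δv) are proportional to their arguments with the same prefactor (see Section 2.1) and that δv = V – v, where V is the box volume, we obtain

$$B_{ij} = -\frac{v}{2}\left[\frac{p_v(V)}{1 - p_v(V)}\,\frac{V - v}{v} - 1\right] \tag{26}$$

Solving for p_v(V), we obtain

$$p_v(V) = \frac{v - 2B_{ij}}{V - 2B_{ij}} \tag{27}$$

which describes the dependence of p_v(V) on the box volume V and the subvolume v.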

We emphasize that eq 26 is generally valid for arbitrary binding partners, without making any assumptions about symmetry or the number of internal degrees of freedom of the binding partners or of the solvent. The only condition is that interactions between binding partners are negligible outside of the volume v. We can introduce correction terms based on an effective pairwise potential acting between the binding partners if this condition is not fulfilled (see Section 2.7).

To motivate the interpretation of eq 26, we rewrite it as

$$B_{ij} = -\frac{V}{2}\left[\frac{1 - v/V}{1 - p_v(V)} - 1\right] \tag{28}$$

Note that the prefactor in eq 28 contains the box volume V, whereas the prefactor in eq 26 contains the subvolume v. The first term in the brackets, determining the two-particle partition function, is the ratio of the probability of finding one protein outside of the subvolume v for the ideal system, 1 – v/V, to the corresponding probability for the interacting proteins, 1 – p_v(V). This ratio, which is the inverse of the quantity f₂(V) of Ashton and Wilding,²⁹ is independent of the subvolume v, chosen to be just large enough to cover the interaction range. Consequently, the first term in the brackets in eq 28 can be written as 1/exp[−βF_o^(ex)(V)], where we introduced the excess free energy of finding the two proteins outside of their interaction range in the box of volume V as

$$\beta F_o^{(\mathrm{ex})}(V) = -\ln\frac{1 - p_v(V)}{1 - v/V} \tag{29}$$

We express K_d as a function of p_v(V) by inserting eq 28 into eq 1 and obtain

$$K_d = \frac{1 - p_v(V)}{N_A\, p_b(V)\,(V - v)} \tag{30}$$
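The subvolume route can be sketched in a few lines. The code below assumes the readings B_ij = −(V/2)[(1 − v/V)/(1 − p_v(V)) − 1] for eq 28 and K_d = (1 − p_v(V))/(N_A p_b(V)(V − v)) for eq 30; function names, nm³ units, and the example numbers are hypothetical choices made for illustration only.

```python
N_A = 6.02214076e23   # Avogadro's constant in 1/mol
NM3_TO_L = 1.0e-24    # 1 nm^3 expressed in liters


def b_ij_subvolume(p_v, v_nm3, box_volume_nm3):
    """B_ij in nm^3 from the probability p_v of finding protein j inside subvolume v."""
    if not 0.0 < v_nm3 < box_volume_nm3:
        raise ValueError("subvolume must satisfy 0 < v < V")
    if not 0.0 <= p_v < 1.0:
        raise ValueError("p_v must lie in [0, 1); p_v = 1 makes the estimate diverge")
    ideal_outside = 1.0 - v_nm3 / box_volume_nm3   # probability outside v without interactions
    actual_outside = 1.0 - p_v                     # probability outside v with interactions
    return -0.5 * box_volume_nm3 * (ideal_outside / actual_outside - 1.0)


def kd_subvolume(p_b, p_v, v_nm3, box_volume_nm3):
    """K_d in mol/L from the bound fraction p_b and the subvolume probability p_v."""
    return (1.0 - p_v) / (N_A * p_b * (box_volume_nm3 - v_nm3) * NM3_TO_L)


# Hypothetical numbers: v = 100 nm^3 inside a V = 1000 nm^3 box.
b_example = b_ij_subvolume(p_v=0.4, v_nm3=100.0, box_volume_nm3=1000.0)
kd_example = kd_subvolume(p_b=0.3, p_v=0.4, v_nm3=100.0, box_volume_nm3=1000.0)
```

For noninteracting particles p_v(V) = v/V and the sketch returns B_ij = 0, while any excess population inside v drives B_ij negative.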

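We next establish the commonly used relationship of B_ij to the partial radial distribution function g(r).⁷ The ratio of p_v(V)/(1 – p_v(V)) can be estimated from the probability density of center-of-mass distances p(r) of two proteins in a box, for instance, which is itself related to the radial distribution function g(r). To do so, we define a spherical volume v = 4πR³/3 and a spherical shell around this sphere with volume δv = 4π[(δR + R)³ – R³]/3. The ratio is then given by

$$\frac{p_v(V)}{1 - p_v(V)} = \frac{\int_0^{R} p(r)\,\mathrm{d}r}{\int_R^{R+\delta R} p(r)\,\mathrm{d}r} \tag{31}$$

We define a radial distribution function g(r) through

$$p(r) \propto 4\pi r^2\, g(r) \tag{32}$$

We can choose the proportionality constant such that g(r) = 1 for r > R, where p(r) ∝ r². Then, $4\pi\int_R^{R+\delta R} g(r)\,r^2\,\mathrm{d}r = \delta v$ and we may write

$$\frac{p_v(V)}{1 - p_v(V)} = \frac{4\pi}{\delta v}\int_0^{R} g(r)\,r^2\,\mathrm{d}r \tag{33}$$

Inserting this expression in eq 26 and using that ∫₀^R 4πr²dr = v, we obtain

$$B_{ij} = -2\pi\int_0^{R}\left[g(r) - 1\right] r^2\,\mathrm{d}r \tag{34}$$

By introducing an effective interaction potential βw(r) = −ln g(r), we can write eq 34 as it is commonly presented

$$B_{ij} = -2\pi\int_0^{R}\left[e^{-\beta w(r)} - 1\right] r^2\,\mathrm{d}r \tag{35}$$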

Using eq 28 instead of eq 34 or 35, we can avoid the computation of distance distribution functions and potentials of mean force, respectively, and the subsequent integration. Importantly, we also do not have to estimate the plateau value of g(r), which in simulations is different from one and which depends on system size and the thermodynamic ensemble.^29,43 Although these differences might be viewed only as a minor simplification, eq 28 emphasizes that B_ij is independent of the detailed shapes of g(r) and w(r) and determined by the excess free energy F_o^(ex)(V) of finding the two proteins outside of their interaction range. Note that our results also apply to the infinite dilution limits of the Kirkwood–Buff integrals $G_{ij} = 4\pi\int_0^{\infty}\left[g(r) - 1\right] r^2\,\mathrm{d}r = -2B_{ij}$.^13,44,45
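For comparison, the integral route can be sketched numerically. The code assumes the common form B_ij = −2π ∫₀^R [g(r) − 1] r² dr for eq 34 and a tabulated g(r) that is already normalized to a plateau of one; the trapezoidal rule, the function name, and the hard-sphere test case are illustrative assumptions.

```python
import math


def b2_from_rdf(r, g):
    """Trapezoidal estimate of -2*pi * integral of [g(r) - 1] r^2 dr on a tabulated grid."""
    if len(r) != len(g) or len(r) < 2:
        raise ValueError("r and g must have the same length (at least 2 points)")
    integral = 0.0
    for k in range(len(r) - 1):
        f_left = (g[k] - 1.0) * r[k] ** 2
        f_right = (g[k + 1] - 1.0) * r[k + 1] ** 2
        integral += 0.5 * (f_left + f_right) * (r[k + 1] - r[k])
    return -2.0 * math.pi * integral


# Hard-sphere check: g(r) = 0 inside the contact distance sigma and 1 outside,
# for which the analytic result is 2*pi*sigma^3/3 (half the excluded volume).
sigma, r_max, n_points = 1.0, 2.0, 20000
r_grid = [r_max * k / n_points for k in range(n_points + 1)]
g_grid = [0.0 if x < sigma else 1.0 for x in r_grid]
b2_hard_sphere = b2_from_rdf(r_grid, g_grid)
b2_analytic = 2.0 * math.pi * sigma ** 3 / 3.0
```

In practice the plateau normalization of g(r) is the delicate step, which is the motivation given above for preferring the subvolume expression.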

2.4. Contribution of Nonbinding Interactions to B_ij

We can use K_d and B_ij to quantify the nonbinding interactions of two proteins. Let us first consider two nonbinding proteins for which $\mathcal{Z}_{ij}^{(b)}(V) = 0$. Consequently, eq 17 becomes

$$B_{ij}^{(u)} = -\frac{V}{2}\left[\frac{\mathcal{Z}_{ij}^{(u)}}{\mathcal{Z}_i\,\mathcal{Z}_j} - 1\right] \tag{36}$$

where we use the superscript “(u)” to indicate contributions of the unbound states. For binding proteins, B_ij^(u) is given by the difference between B_ij = B_ij^(u) + B_ij^(b) and the contributions to B_ij due to binding

$$B_{ij}^{(b)} = -\frac{V}{2}\,\frac{\mathcal{Z}_{ij}^{(b)}}{\mathcal{Z}_i\,\mathcal{Z}_j} = -\frac{1}{2N_A K_d} \tag{37}$$

i.e., we can quantify the nonbinding interactions for two binding proteins via

$$B_{ij}^{(u)} = B_{ij} + \frac{1}{2N_A K_d} \tag{38}$$

which becomes

$$B_{ij}^{(u)} = \frac{V}{2}\left[1 - p_u(V)\,\frac{1 - v/V}{1 - p_v(V)}\right] \tag{39}$$

For hard spheres, p_u(V) = 1 and p_v(V) = (v – v_exc)/(V – v_exc), where v_exc is the excluded volume, such that B_ij^(u) = v_exc/2. For attractive nonbinding interactions B_ij^(u) < v_exc/2, and for repulsive nonbinding interactions B_ij^(u) > v_exc/2. Note that for asymmetric proteins, v_exc corresponds to an excluded region in the configuration space, which, for instance, is spanned by Cartesian coordinates and Euler angles in the case of rigid proteins. Thus, in general, v_exc should be viewed as an effective volume corresponding to a thermodynamic free energy.
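The hard-sphere limit offers a quick consistency check. The sketch below assumes that the nonbinding contribution can be written as B_ij^(u) = (V/2)[1 − p_u(V)(1 − v/V)/(1 − p_v(V))], which is the form obtained here by combining B_ij^(u) = B_ij + 1/(2N_A K_d) with the subvolume expression for B_ij; the function names and the numbers are hypothetical.

```python
def b_total(p_v, v, box_volume):
    """Total B_ij from the subvolume probability (same volume units as the inputs)."""
    return -0.5 * box_volume * ((1.0 - v / box_volume) / (1.0 - p_v) - 1.0)


def b_unbound(p_u, p_v, v, box_volume):
    """Nonbinding contribution B_ij^(u) from the unbound fraction p_u and p_v."""
    return 0.5 * box_volume * (1.0 - p_u * (1.0 - v / box_volume) / (1.0 - p_v))


# Hard spheres: no bound states (p_u = 1) and p_v = (v - v_exc) / (V - v_exc).
v_exc, v_sub, v_box = 30.0, 200.0, 1000.0
p_v_hard = (v_sub - v_exc) / (v_box - v_exc)
b_u_hard = b_unbound(1.0, p_v_hard, v_sub, v_box)   # expected: v_exc / 2

# Binding pair with hypothetical p_b = 0.2 and p_v = 0.5: the two routes agree.
p_b, p_v = 0.2, 0.5
b_all = b_total(p_v, v_sub, v_box)
b_u_from_sum = b_all + 0.5 * p_b * (v_box - 2.0 * b_all)   # uses 1/(N_A K_d) = p_b (V - 2 B_ij)
b_u_direct = b_unbound(1.0 - p_b, p_v, v_sub, v_box)
```

With p_u = 1 the expression reduces to the total B_ij, so nonbinding systems need no separate treatment.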

We now show that B_ij^(u) quantifies the difference between the approximate expression for K_d in eq 21 and the box-size-independent expression for K_d in eq 1. Inserting eq 11 into eq 21, we obtain

$$K_d^{\prime} = K_d\left(1 - \frac{2B_{ij}^{(u)}}{V}\right) \tag{40}$$

such that the relative difference is given by

$$\frac{K_d^{\prime} - K_d}{K_d} = -\frac{2B_{ij}^{(u)}}{V} \tag{41}$$

Consequently, the approximate estimate K_d^′ deviates systematically from the true value K_d, with deviations proportional to B_ij^(u), but converges to the true value with increasing box volume as 1/V.
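The 1/V convergence can be illustrated with a small self-consistent model. The sketch assumes p_b(V) = 1/(N_A K_d (V − 2B_ij)), K_d′ = p_u(V)/(N_A V p_b(V)), and B_ij = B_ij^(u) − 1/(2N_A K_d); the input values for K_d and B_ij^(u) are hypothetical and only the two box volumes are taken from the range quoted in Section 3.

```python
N_A = 6.02214076e23   # Avogadro's constant in 1/mol
NM3_TO_L = 1.0e-24    # 1 nm^3 expressed in liters


def fraction_bound(kd_molar, b_ij_nm3, box_volume_nm3):
    """Bound fraction p_b(V) implied by K_d and B_ij for a two-protein box."""
    number_density = kd_molar * N_A * NM3_TO_L      # K_d as particles per nm^3
    return 1.0 / (number_density * (box_volume_nm3 - 2.0 * b_ij_nm3))


def kd_prime(p_b, box_volume_nm3):
    """Approximate estimate K_d' = p_u / (N_A V p_b) in mol/L."""
    return (1.0 - p_b) / (N_A * box_volume_nm3 * NM3_TO_L * p_b)


kd_true = 1.0e-3        # hypothetical K_d of 1 mM
b_unbound_nm3 = 50.0    # hypothetical nonbinding contribution in nm^3
b_ij_nm3 = b_unbound_nm3 - 1.0 / (2.0 * kd_true * N_A * NM3_TO_L)

relative_deviation = {}
for volume in (3375.0, 1.0e6):
    p_b = fraction_bound(kd_true, b_ij_nm3, volume)
    relative_deviation[volume] = (kd_prime(p_b, volume) - kd_true) / kd_true
```

For these inputs the deviation is about −3% in the smallest box and about −0.01% in the largest one, consistent with −2B_ij^(u)/V.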

2.5. Indistinguishable Binding Partners (Homodimers)

So far, we have assumed that the proteins are distinguishable, i.e., that they form heterodimers, but all expressions derived here are also valid for indistinguishable binding partners forming homodimers. To consider the case of two identical binding partners, we rewrite eq 13 as

$$K_d = \frac{\mathcal{Z}_{ij}^{(f)}(V)}{N_A\, V\, \mathcal{Z}_{ij}^{(b)}(V)} \tag{42}$$

where we introduced $\mathcal{Z}_{ij}^{(f)}(V) = \mathcal{Z}_i(V)\,\mathcal{Z}_j(V)$ for the partition function of two free proteins, which is determined by the product of two single-protein partition functions. For indistinguishable binding partners forming homodimers, both $\mathcal{Z}_{ij}^{(f)}$ and $\mathcal{Z}_{ij}^{(b)}$ would have to be multiplied by a factor 1/2 to account for the indistinguishability of the proteins. However, these factors then cancel in the ratio in eq 42.

2.6. K_d and B_ij from a Single Simulation

We can estimate K_d and B_ij from the fraction of bound protein p_b(V) and the probability p_v(V) of one protein being located in a subvolume v around the other. The latter determines B_ij according to eq 28, which we then insert into eq 1 to obtain the finite-size corrected estimate of K_d. We call this method the subvolume method. To calculate B_ij, we can also estimate the two-particle partition function Z_ij(V), now for simplicity but without loss of generality only considering rigid molecules, using free energy methods.⁴⁶ For implicit solvents, we can use insertion and removal moves of the proteins to efficiently estimate Z_ij(V), as explained in the following. We call this method the insertion/removal method.

2.6.1. Estimating Two-Particle Configurational Partition Functions for Implicit Solvents

A simulation of a pair of proteins in a box of volume V at reciprocal temperature β gives us immediately the particle-removal energy distribution as the normalized distribution of potential energies. We define x_i = (r_i,Ω_i), where r_i are the Cartesian coordinates of the geometric center of protein i and Ω_i are its Euler angles defining its orientation. We denote the configuration space as W = V × Ω to simplify the notation. The particle-removal energy distribution is then given by

$$p_{\mathrm{rem}}(E) = \frac{1}{Z_{23}(\beta)}\int_{W^2}\mathrm{d}x_2\,\mathrm{d}x_3\,\delta\left[E - U(x_2,x_3)\right] e^{-\beta U(x_2,x_3)} \tag{43}$$

where Z₂₃(β) = ∫_W² dx₂dx₃e^{–βU(x₂,x₃)} and δ[·] is Dirac’s delta function.

The particle-insertion energy distribution p_ins(E) is formally given by

$$p_{\mathrm{ins}}(E) = \frac{1}{Z_2(0)\,Z_3(0)}\int_{W^2}\mathrm{d}x_2\,\mathrm{d}x_3\,\delta\left[E - U(x_2,x_3)\right] \tag{44}$$

where Z_i(β = 0) = ∫_W  dx_i. Sampling the particle-insertion energy distribution p_ins(E) for a given box size is straightforward. All one needs is a replica with reciprocal temperature β = 0 exactly. All moves will then be accepted, and the energies saved are those of random insertions. Alternatively, one could make trial moves of the two proteins with Monte Carlo move widths ±L/2, where L is the box length, and orientation changes about random axes by ±π, and write out the absolute trial (!) energies (not the energy differences or the accepted energies). With such a move protocol, it would not matter if one or both particles were moved and if moves are accepted or not. It also does not matter what the “acceptance rate” is (i.e., it can be zero!). What is important, though, is that the box volumes in insertion and removal runs are the same.

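The normalized removal and insertion energy distributions are related to each other by

$$p_{\mathrm{rem}}(E) = \frac{Z_2(0)\,Z_3(0)}{Z_{23}(\beta)}\, e^{-\beta E}\, p_{\mathrm{ins}}(E) \tag{45}$$

which follows from

$$\int_{W^2}\mathrm{d}x_2\,\mathrm{d}x_3\,\delta\left[E - U(x_2,x_3)\right] e^{-\beta U(x_2,x_3)} = e^{-\beta E}\int_{W^2}\mathrm{d}x_2\,\mathrm{d}x_3\,\delta\left[E - U(x_2,x_3)\right] \tag{46}$$

The ratio of partition functions defines the free energy of going from a system of two noninteracting particles to a system in which they interact

$$e^{-\beta F} = \frac{Z_{23}(\beta)}{Z_2(0)\,Z_3(0)} \tag{47}$$

Note that F = −F_o^(ex)(V) (see eq 29). An efficient way of determining this free energy is to use the Bennett acceptance ratio (BAR) estimator³¹

$$\sum_{i=1}^{N_{\mathrm{ins}}}\frac{1}{1 + \frac{N_{\mathrm{ins}}}{N_{\mathrm{rem}}}\,e^{\beta\left(E_i^{\mathrm{ins}} - F\right)}} = \sum_{j=1}^{N_{\mathrm{rem}}}\frac{1}{1 + \frac{N_{\mathrm{rem}}}{N_{\mathrm{ins}}}\,e^{-\beta\left(E_j^{\mathrm{rem}} - F\right)}} \tag{48}$$

where $E_i^{\mathrm{ins}}$ are the $N_{\mathrm{ins}}$ uncorrelated (by construction) insertion energies and $E_j^{\mathrm{rem}}$ are the $N_{\mathrm{rem}}$ uncorrelated removal energies. However, it is clear that this is problematic in cases where the proteins are strongly bound (forming a dimer!) because then one would have very little information about higher energies.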
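A minimal sketch of a BAR solve is given below. It uses the standard self-consistent Bennett equation with the insertion energies as forward work values and the negative removal energies as reverse work values, solved by bisection. The one-dimensional toy system (a coordinate x uniform on [0, 1] with U = a·x), the function names, and all parameter values are assumptions made only so that the result can be compared with an analytic free energy; this is not the authors' implementation.

```python
import math
import random


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(e_ins, e_rem, beta, tol=1e-8):
    """Solve the Bennett acceptance ratio equation for F by bisection."""
    m = math.log(len(e_ins) / len(e_rem))

    def imbalance(f):
        # Monotonically increasing in f; its root is the BAR estimate.
        forward = sum(fermi(m + beta * (e - f)) for e in e_ins)
        reverse = sum(fermi(-m - beta * (e - f)) for e in e_rem)
        return forward - reverse

    lo = min(min(e_ins), min(e_rem)) - 50.0 / beta
    hi = max(max(e_ins), max(e_rem)) + 50.0 / beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy model (hypothetical): x uniform on [0, 1], U(x) = a * x, beta = 1.
rng = random.Random(0)
beta, a, n_samples = 1.0, 3.0, 5000
# Insertion ensemble: uniform x, i.e., sampling at beta = 0.
e_ins = [a * rng.random() for _ in range(n_samples)]
# Removal ensemble: Boltzmann-distributed x via inverse-CDF sampling.
norm = 1.0 - math.exp(-beta * a)
e_rem = [-math.log(1.0 - rng.random() * norm) / beta for _ in range(n_samples)]

f_bar = bar_free_energy(e_ins, e_rem, beta)
f_exact = -math.log(norm / (beta * a)) / beta
```

The toy energies overlap well, which is the favorable case; for strongly bound pairs the removal energies would sit far below the insertion energies and the estimate would degrade, as noted above.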

This problem can be remedied using all of the data in a temperature replica exchange simulation. In effect, the high-temperature runs allow us to estimate an accurate density of states up to fairly high energies. The particle-insertion energies complement this density of states on the high-energy side. All of the runs at different temperatures can be combined with the list of insertion energies using binless WHAM. As a reference, we take the temperature of interest (β = β₁ without loss of generality). The bias energies at replicas with reciprocal temperature β_i are then ΔU = (β_i/β – 1)U. This formula also works for the insertion energies coming from a run with β_i = 0. The insertion energies can be thought of as coming from a run with the bias potential ΔU = −U, i.e., at zero potential. A binless-WHAM analysis using these bias energies as input will produce the required free energy F as the difference between the reference state and the insertion run.

2.7. Practical Considerations

In the derivation of K_d and B_ij, we have assumed that the volume is large enough such that interactions between the protein with a fixed position and orientation and the protein in the extended volume can be neglected. If this condition is not fulfilled, then we can correct for residual interaction energies using a simple distance-dependent interaction potential ϕ(r) in the calculation of $Z_j(\Delta V) = 8\pi^2\int_{\Delta\mathcal{V}}\mathrm{d}\mathbf{r}\, e^{-\beta\phi(r)}$, where $\Delta\mathcal{V}$ denotes the Cartesian space defining ΔV. For example, at large distances, the interaction of charged proteins can be approximated by (screened) Coulomb interactions of the total charges located at the centers of charge. In such a case, we would include for the calculation of the fraction bound only configurations of the simulation where the two proteins are separated less than a cutoff distance, usually given by half the shortest box length. Such a system corresponds to a spherical volume with one protein at its center and the other one moving unrestrained. Doing so, we assume that the residual interaction modeled as a simple pair-potential has a negligible effect on the internal degrees of freedom and the degrees of freedom of the surrounding solvent, i.e., the partition functions of the individual proteins at fixed position and orientation, $\mathcal{Z}_i^{\mathrm{int}}$ and $\mathcal{Z}_j^{\mathrm{int}}$, are unchanged.

Suitable definitions of bound states will depend on the molecular model we use for simulations. In our simulations of rigid proteins in implicit solvents, we consider a state as bound if the interaction energy of the two proteins is smaller than −2k_bT. Additionally demanding that two proteins have to have a minimum C_α distance smaller than 0.8 nm to be counted as bound does not have a noticeable effect on the binding probability. For molecular dynamics simulations using explicit solvents, a combination of distance- and energy-based criteria and using transition-based-assignment of states⁴⁷ might be necessary to reliably distinguish bound states from spurious contacts.

In simulations of two proteins in a box, we can estimate p_v(V) using a distance-based criterion as has been introduced by Ashton and Wilding.²⁹ We define a distance between the two proteins, e.g., the center-of-mass distance r. We introduce a distance R such that interactions between proteins are negligible for distances r > R. For an ensemble of N structures, we count the number of structures N_v for which r ≤ R. In these structures, the center-of-mass of protein 2 lies within a spherical volume v = 4πR³/3 centered at the center-of-mass of protein 1. We then estimate p_v(V) = N_v/N.
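The counting step above can be sketched as follows. The minimum-image distance for a cubic periodic box and the function names are assumptions added for illustration; with this convention the sphere of radius R should fit inside the box (R ≤ L/2).

```python
import math


def min_image_distance(r1, r2, box_length):
    """Center-of-mass distance under the minimum-image convention (cubic box)."""
    squared = 0.0
    for a, b in zip(r1, r2):
        d = a - b
        d -= box_length * round(d / box_length)   # wrap into [-L/2, L/2]
        squared += d * d
    return math.sqrt(squared)


def estimate_p_v(distances, radius):
    """Return (p_v, v): fraction of frames with r <= R and the sphere volume v."""
    if not distances:
        raise ValueError("need at least one configuration")
    n_inside = sum(1 for r in distances if r <= radius)
    return n_inside / len(distances), 4.0 * math.pi * radius ** 3 / 3.0


# Hypothetical center-of-mass distances (nm) from four frames.
p_v_example, v_example = estimate_p_v([1.0, 2.0, 3.0, 4.0], radius=2.5)
wrapped = min_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), box_length=10.0)
```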

For strong binders and in boxes of typical size, p_v(V) is close to one. For p_v(V) = 1, (1 – v/V)/(1 – p_v(V)) in eq 28 diverges. Consequently, p_v(V) has to be determined with sufficient numerical precision to obtain accurate estimates. For example, if we sample 10 000 configurations, then the numerical precision of p_v(V) is limited to 1/10 000. The precision can be increased by sampling more configurations or, in the case of replica simulation, by including additional replicas using WHAM when calculating p_v(V). For weak binders with K_d ≳ 100 μM, 10 000 configurations are sufficient to estimate K_d and B_ij even without applying WHAM.

3. Methods

We chose three weakly binding protein pairs with experimental K_d values covering 3 orders of magnitude from ∼μM to ∼mM. The lysozyme homodimer has an experimental K_d ≈ 2710 ± 240 μM⁴⁸ (PDB 6LYZ⁴⁹), the ubiquitin/CUE dimer (PDB 1OTR⁵⁰) has a K_d ≈ 155 ± 9 μM,⁵⁰ and the dimer of the uracil-DNA glycosylase (UDG) and its uracil-DNA glycosylase inhibitor protein (Ugi) has a K_d ≈ 1.3 ± 0.3 μM⁵¹ (PDB 1UUG⁵²).

To simulate these protein pairs, we use the amino-acid-level coarse-grained model developed by Kim and Hummer for weakly binding proteins⁵³ implemented in the Complexes++ software (https://www.github.com/bio-phys/complexespp). We treat all proteins as rigid bodies. In contrast to the original model, which is called the KH-model, we shift the original Miyazawa and Jernigan parameters^54,55 by e₀ = −1.875 k_bT, where T = 300 K, to account for the solvation energy and we scale the resulting parameters by λ = 0.1243 to balance them with the electrostatic interactions. In the original model, e₀ = −2.27 k_bT and λ = 0.159. The new values have been chosen to better reproduce the B₂₂ value of lysozyme and the K_d value of the ubiquitin/UIM1 complex. We chose residue charges of −1.0e for Asp and Glu, +1.0e for Arg and Lys, and +0.5e for His because its isoelectric point is at pH 7. e is the elementary charge. Consequently, the total charges of the proteins are +8.5e for lysozyme, +0.5e for ubiquitin, −4.5e for CUE, +7.5e for UDG, and −11.5e for Ugi. We set the dielectric constant to 80 and the Debye length to 1 nm, corresponding to the conditions in an aqueous solution of 100 mM NaCl.
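The quoted electrostatic settings can be checked with a short calculation. The sketch assumes a 1:1 electrolyte and the textbook Debye–Hückel expressions; the screened-Coulomb function is a generic form and is not taken from the Complexes++ code, and all function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
K_B = 1.380649e-23           # Boltzmann constant in J/K
E_CHARGE = 1.602176634e-19   # elementary charge in C
N_A = 6.02214076e23          # Avogadro's constant in 1/mol


def debye_length_nm(ionic_strength_molar, temperature=300.0, eps_r=80.0):
    """Debye length in nm for a given ionic strength in mol/L."""
    ionic_si = ionic_strength_molar * 1000.0   # mol/L -> mol/m^3
    kappa_sq = 2.0 * N_A * E_CHARGE ** 2 * ionic_si / (EPS0 * eps_r * K_B * temperature)
    return 1.0e9 / math.sqrt(kappa_sq)


def screened_coulomb_kT(q_i, q_j, r_nm, debye_nm=1.0, temperature=300.0, eps_r=80.0):
    """Debye-Hueckel pair energy in units of k_B*T for charges given in units of e."""
    bjerrum_nm = 1.0e9 * E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temperature)
    return q_i * q_j * bjerrum_nm * math.exp(-r_nm / debye_nm) / r_nm


lambda_d = debye_length_nm(0.1)            # 100 mM monovalent salt
u_pair = screened_coulomb_kT(1.0, 1.0, 1.0)
```

With these constants the Debye length for 100 mM salt at 300 K comes out slightly below 1 nm, in line with the rounded value used in the text.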

To generate Boltzmann ensembles of configurations, which also provide the removal energy distributions defined in eq 43, we perform temperature replica exchange Monte Carlo (REMC) simulations using 24 replicas. Temperatures were equally spaced between 300 and 530 K. In a Monte Carlo sweep, each protein performs one trial move on average, which can be translation or rotation. Replica exchanges are attempted every 10 sweeps. For the rotation move, a rotation axis is randomly generated by drawing a point from a sphere. Then, we rotate around this axis by an angle, which we draw from a box distribution with a width given by twice the maximum angle. This maximum angle is set to 0.1 rad for the coolest replica and to 1.25 rad for the hottest replica and spaced equidistantly in between. Similarly, we set the maximum displacement for the translation move to 0.2 nm in the coolest replica and to 1.35 nm in the hottest replica, with equal spacing in between. In our simulations, we use a cutoff radius of 3 nm to truncate our interaction potentials.
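The replica ladder and the rotation move described above can be sketched as follows. Equal spacing is taken from the text; drawing the axis from a normalized Gaussian triple is one common way to sample a uniformly distributed direction and is an assumption here, as are the function names.

```python
import math
import random


def linear_ladder(lowest, highest, n_replicas):
    """Equally spaced values from the coolest to the hottest replica."""
    step = (highest - lowest) / (n_replicas - 1)
    return [lowest + step * k for k in range(n_replicas)]


N_REPLICAS = 24
temperatures = linear_ladder(300.0, 530.0, N_REPLICAS)        # K
max_angles = linear_ladder(0.1, 1.25, N_REPLICAS)             # rad
max_displacements = linear_ladder(0.2, 1.35, N_REPLICAS)      # nm


def random_rotation(max_angle, rng):
    """Draw a random unit axis and an angle uniform in [-max_angle, max_angle]."""
    while True:
        x, y, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:   # reject the (practically impossible) zero vector
            break
    angle = rng.uniform(-max_angle, max_angle)
    return (x / norm, y / norm, z / norm), angle


rng = random.Random(1)
axis, angle = random_rotation(max_angles[0], rng)
```

With 24 replicas the stated end points correspond to steps of 10 K, 0.05 rad, and 0.05 nm between neighboring replicas.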

To sample the insertion energy distribution defined in eq 44 in simulations, we switch off all interactions by setting all interaction parameters and residue charges to zero. We use a maximum displacement of half the box length and a maximum rotation angle of π. We accept and sample all configurations to generate the insertion ensembles, for which we then recalculate all energies for switched-on potentials.

To estimate the two-particle partition function, we combine results from REMC simulations (removal ensemble) and the energies calculated for the ensemble of noninteracting proteins (insertion ensemble) using binless WHAM.³²⁻³⁴ To avoid numerical problems, we clip interaction energies at 100 k_bT. We define two proteins as being bound if their total interaction energy is below −2k_bT.

For equilibration, we performed 10⁶ Monte Carlo sweeps in each replica. For production, we performed 10⁷ sweeps and we sampled every 100th sweep, yielding 10⁵ structures for each protein pair per replica. We also performed 10⁶ insertion moves for each pair, which by design creates uncorrelated configurations.

To study the box volume dependence of the fraction bound p_b(V), we calculated for the coolest replica p_b = N_{E≤−2k_bT}/N, where N_{E≤−2k_bT} is the number of structures with energies E ≤ −2k_bT and N = 10⁵ is the total number of structures. To study the box volume dependence of the subvolume probability p_v(V), we calculated for the coolest replica p_v = N_v/N, where N_v is the number of structures within the subvolume v. We defined this subvolume as a sphere with a radius given by the sum of (D_i + D_j)/2, where D_i and D_j are the largest diameters of proteins of species i and j, respectively, and our cutoff radius of 3 nm. The resulting radii are between ∼6.7 and ∼7.4 nm for the three protein pairs. For each protein pair, we performed simulations for 17 box sizes with volumes ranging from 3375 to 10⁶ nm³. We calculated the standard errors of the mean by block averaging.^56,57
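
The counting estimates and their block-averaged errors can be sketched in a few lines of Python. The data below are synthetic stand-ins, not simulation output. We assume that a structure is "within the subvolume" if a pair distance (here a hypothetical center-to-center distance) is at most the subvolume radius, and the number of blocks is an arbitrary choice.

```python
import math
import random

BOUND_THRESHOLD = -2.0  # interaction energy threshold in units of k_bT


def fraction(flags):
    """Fraction of nonzero entries, e.g. p_b = N_{E <= -2 k_bT} / N."""
    return sum(flags) / len(flags)


def block_standard_error(values, n_blocks=10):
    """Standard error of the mean estimated from the scatter of block means."""
    block_len = len(values) // n_blocks
    means = [sum(values[i * block_len:(i + 1) * block_len]) / block_len
             for i in range(n_blocks)]
    grand = sum(means) / n_blocks
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)


def subvolume_radius(d_i, d_j, cutoff=3.0):
    """Radius (nm) of the spherical subvolume: (D_i + D_j)/2 plus the cutoff radius."""
    return 0.5 * (d_i + d_j) + cutoff


# Synthetic stand-in data: energies in k_bT, pair distances in nm.
rng = random.Random(1)
energies = [rng.gauss(-1.0, 2.0) for _ in range(10_000)]
distances = [rng.uniform(0.0, 13.0) for _ in range(10_000)]

bound = [1.0 if e <= BOUND_THRESHOLD else 0.0 for e in energies]
radius = subvolume_radius(4.0, 4.0)  # hypothetical largest diameters of 4 nm each
inside = [1.0 if r <= radius else 0.0 for r in distances]

p_b, p_b_err = fraction(bound), block_standard_error(bound)
p_v, p_v_err = fraction(inside), block_standard_error(inside)
```

Block averaging is only meaningful if the blocks are longer than the correlation time of the time series, so in practice the block length would be increased until the error estimate reaches a plateau.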

We validate the insertion/removal method and the subvolume method for the smallest boxes used here, with a volume of 15³ nm³ = 3375 nm³. With uniform probability, we selected at random 10 000 of the N = 10⁵ samples and chose for each replica the configurations corresponding to the same 10 000 indices. We also drew 10 000 of the 10⁶ configurations in the insertion ensemble with uniform probability. In the insertion/removal method, we then applied WHAM to these 250 000 configurations in total (24 replicas plus the insertion ensemble) to calculate the fraction bound p_b and the two-particle partition function Z_ij at this box volume, from which we then estimated K_d and B_ij. We repeated this procedure 1000 times and calculated the averages of K_d and B_ij and their covariance matrices. We confirmed visually that the estimates of K_d and B_ij are distributed according to two-dimensional Gaussians with the estimated covariance matrices. We use the same protocol to obtain estimates and uncertainties from resampling for the subvolume method, in which we do not use the insertion ensemble.
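
The resampling protocol can be illustrated with a toy example in Python. The estimator below is a trivial stand-in for the WHAM-based calculation of (K_d, B_ij), the data are synthetic, and we assume that indices are drawn without replacement. All names are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def resample(estimator, replicas, n_draw, n_repeats, rng):
    """Repeatedly draw n_draw sample indices (shared by all replicas) and re-estimate.

    estimator maps a list of per-replica subsamples to a pair of estimates.
    """
    n_samples = len(replicas[0])
    estimates = []
    for _ in range(n_repeats):
        idx = rng.sample(range(n_samples), n_draw)  # uniform, without replacement
        estimates.append(estimator([[rep[i] for i in idx] for rep in replicas]))
    return estimates


# Toy stand-in: two "replicas" of synthetic numbers and a trivial estimator.
rng = random.Random(2)
replicas = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]


def toy_estimator(subsamples):
    m = mean(subsamples[0])
    return m, 2.0 * m + mean(subsamples[1])  # second estimate correlated with the first


pairs = resample(toy_estimator, replicas, n_draw=200, n_repeats=100, rng=rng)
first = [p[0] for p in pairs]
second = [p[1] for p in pairs]
cov_matrix = [[covariance(first, first), covariance(first, second)],
              [covariance(second, first), covariance(second, second)]]
```

In the protocol described above, the corresponding numbers would be 10 000 drawn indices out of 10⁵ samples, 24 replicas, and 1000 repetitions.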

4. Results

We calculated K_d and B_ij using the insertion/removal method and the subvolume method for three protein pairs, i.e., the lysozyme homodimer and the heterodimers ubiquitin/CUE and UDG/Ugi. As we will show, these estimates allow us to quantify the contributions due to binding and nonbinding interactions to B_ij.

In the insertion/removal method, we determine K_d and B_ij from replica exchange simulations at the box volume of 3375 nm³ and from insertion ensembles. We first estimated the fraction bound p_b and the two-particle partition function Z_ij at this volume by combining the insertion ensemble and the replicas of our temperature REMC simulations using WHAM. We then evaluated eq 14 to obtain B_ij and used this value together with our estimate for p_b in eq 1 to obtain K_d. By resampling, we estimated the covariance matrix.

In the subvolume method, we first estimated p_v and p_b at the box volume of 3375 nm³ from all replicas using WHAM. We used eq 26 to calculate B_ij from p_v and used this estimate together with p_b to estimate K_d using eq 1.
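
The two steps of the subvolume method can be sketched in Python. Equations 1 and 26 are not reproduced in this section, so the forms used below are assumptions: for two particles in a periodic box whose interaction range lies within the subvolume v, we take p_v = (v − 2B_ij)/(V − 2B_ij) and K_d = 1/[N_A p_b (V − 2B_ij)]. The latter is consistent with B_ij^(b) = −1/(2N_AK_d). The round trip below is a self-consistency check with lysozyme-like inputs (B_ij^(b) = −160 nm³ and B_ij^(u) = 83 nm³ from Table 1, and a hypothetical subvolume radius of 7 nm), not a reanalysis of simulation data.

```python
import math

N_A_PER_NM3_PER_MOLAR = 0.602214076  # molecules per nm^3 at 1 mol/L


def b_from_subvolume(p_v, sub_volume, box_volume):
    """B_ij (nm^3), assuming p_v = (v - 2 B_ij) / (V - 2 B_ij) and solving for B_ij."""
    return (sub_volume - p_v * box_volume) / (2.0 * (1.0 - p_v))


def kd_from_fraction_bound(p_b, b_ij, box_volume):
    """K_d (mol/L), assuming K_d = 1 / [N_A p_b (V - 2 B_ij)]."""
    return 1.0 / (N_A_PER_NM3_PER_MOLAR * p_b * (box_volume - 2.0 * b_ij))


# Forward model with assumed inputs (volumes in nm^3).
box_volume = 15.0 ** 3
sub_volume = 4.0 / 3.0 * math.pi * 7.0 ** 3   # hypothetical radius of 7 nm
b_bound, b_unbound = -160.0, 83.0
z_pair = box_volume - 2.0 * (b_bound + b_unbound)
p_b = -2.0 * b_bound / z_pair
p_v = (sub_volume - 2.0 * (b_bound + b_unbound)) / z_pair

# Inversion: recover B_ij and K_d from the two probabilities.
b_est = b_from_subvolume(p_v, sub_volume, box_volume)
kd_est = kd_from_fraction_bound(p_b, b_est, box_volume)
print(f"B_ij = {b_est:.1f} nm^3, K_d = {kd_est * 1e6:.0f} uM")
```

Under these assumptions, the inversion recovers B_ij = −77 nm³ and a K_d of about 5.2 mM, matching the lysozyme entries of Table 1 to within rounding.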

We find that the estimates for K_d and B_ij from the insertion/removal method and the subvolume method agree excellently with each other (Figure 2 and Table 1). Moreover, the estimates have similar uncertainties. K_d values and B_ij values calculated by resampling are correlated for both methods (Figure 2). A smaller value of B_ij, i.e., a more negative value, leads to a smaller value of K_d according to eq 1.

Figure 2. Comparison of the accuracy of the insertion/removal method (ins/rem, black, solid lines) and the subvolume method (subvol, red, dashed lines) to estimate K_d and B_ij for three different protein pairs (top to bottom). The most likely estimates are indicated by horizontal and vertical dashed lines. The contour lines indicate the limits of the 25, 50, 75, and 95% confidence regions. The insertion/removal method (black; eqs 14 and 1 with the two-particle partition function from WHAM, Section 2.6.1) and the subvolume method (red; eqs 26 and 1) agree excellently with each other, and they have similar uncertainties. For UDG/Ugi, contour lines collapse onto a single line due to the strong correlation between the estimates for K_d and B_ij.

Table 1. K_d, B_ij, and the Contributions of Binding Interactions, B_ij^(b), and Nonbinding Interactions, B_ij^(u), to B_ij for Three Protein Complexes (PDB codes 6LYZ, 1OTR, 1UUG) for the Insertion/Removal Method (“ins/rem”) and the Subvolume Method (“subvol”)^a.

lysozyme	method	K_d [μM]	B₂₂ [nm³]	B₂₂^(b) [nm³]	B₂₂^(u) [nm³]
	ins/rem	5191 ± 63	–77 ± 4	–160 ± 2	83 ± 4
	subvol	5188 ± 68	–78 ± 4	–160 ± 2	82 ± 3

Ubi/CUE	method	K_d [μM]	B₂₃ [nm³]	B₂₃^(b) [nm³]	B₂₃^(u) [nm³]
	ins/rem	153 ± 1	–5444 ± 37	–5435 ± 37	–9 ± 3
	subvol	153 ± 1	–5455 ± 39	–5444 ± 38	–11 ± 3

UDG/Ugi	method	K_d [μM]	B₂₃ [nm³]	B₂₃^(b) [nm³]	B₂₃^(u) [nm³]
	ins/rem	0.25 ± 0.002	–3 332 000 ± 27 000	–3 332 000 ± 27 000	–94 ± 5
	subvol	0.25 ± 0.002	–3 308 000 ± 27 000	–3 308 000 ± 27 000	–81 ± 7


^a Errors are standard errors of the mean.

For additional validation, we use the results for K_d and B_ij obtained at the box volume of 3375 nm³ to predict the box-size dependence of the fraction bound p_b(V) and the subvolume probability p_v(V). We use eq 15 and our estimates for K_d and B_ij to calculate p_b(V) (Figure 3). We use eq 27 and our estimates for B_ij to calculate p_v(V) (Figure 4). The resulting curves reproduce the box volume dependencies of p_b(V) and p_v(V) observed over the entire range of simulations, covering nearly 3 orders of magnitude in volume.

Figure 3. The box-size dependence of the binding probability p_b(V) is determined by B_ij and K_d via eq 15. We show simulation results (blue) for three protein pairs (top to bottom). Error bars indicate the blocked standard errors of the mean. The lines are predictions using eq 15 and estimates for K_d and B_ij obtained at a box volume of 3375 nm³ (magenta vertical line) using the insertion/removal method (black, solid lines) and the subvolume method (red, dashed lines).

Figure 4. The box-size dependence of the subvolume probability p_v(V) is determined by B_ij via eq 27. We show simulation results (blue) for three protein pairs (top to bottom). Error bars indicate the blocked standard errors of the mean. The lines are predictions using eq 27 and estimates of B_ij obtained at a box volume of 3375 nm³ (magenta vertical line) using the insertion/removal method (black, solid lines) and the subvolume method (red, dashed lines).

For strong binders, the fraction bound p_b(V) and the subvolume probability p_v(V) take on similar values (compare Figures 3 and 4). In these cases, p_v(V) is dominated by binding. For small boxes, p_b(V) is close to one and consequently so is p_v(V). For box sizes large enough such that p_b(V) is significantly below one, the contribution of the size of the subvolume v to p_v(V) is small. For UDG/Ugi, the strongest binding complex considered here, the fraction bound dominates p_v(V) such that the p_v(V) curve in Figure 4 looks nearly identical to the corresponding p_b(V) curve in Figure 3. However, the differences in these curves are significant as they are not only determined by the size of the subvolume v but also by the nonbinding interactions.

We can extract the contributions B_ij^(u), eq 38, of nonbinding interactions to B_ij. We can do so even in the case of strong binders, for which the value of B_ij is close to B_ij^(b) = −1/(2N_AK_d) according to eq 16 (Figure 5, top). With the estimates provided by either the insertion/removal method or the subvolume method, we can resolve the small difference B_ij^(u) = B_ij − B_ij^(b) (Figure 5, center). Focusing on the results from the insertion/removal method, we find that for lysozyme B_ij^(u) ≈ 83 ± 4 nm³ > 0. This value is close to what one would expect for hard spheres of equal volume, i.e., B_ij = v_exc/2 ≈ 70 nm³. For ubiquitin/CUE, the interactions are clearly attractive, but B_ij^(u) ≈ −9 ± 3 nm³ nearly vanishes. For UDG/Ugi, B_ij^(u) ≈ −94 ± 5 nm³ indicates attractive interactions (Figure 5 and Table 1).
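
The decomposition can be checked numerically against Table 1. The Python sketch below evaluates B_ij^(b) = −1/(2N_AK_d) and B_ij^(u) = B_ij − B_ij^(b) for the lysozyme entry of the insertion/removal method. The function names are illustrative. For the two stronger binders, the rounded K_d values printed in Table 1 are not precise enough to reproduce B_ij^(u) this way, because the difference is much smaller than B_ij itself.

```python
N_A_PER_NM3_PER_MICROMOLAR = 0.602214076e-6  # molecules per nm^3 at 1 umol/L


def b_bound(kd_micromolar):
    """Binding contribution B_ij^(b) = -1 / (2 N_A K_d) in nm^3."""
    return -1.0 / (2.0 * N_A_PER_NM3_PER_MICROMOLAR * kd_micromolar)


def b_unbound(b_ij, kd_micromolar):
    """Nonbinding contribution B_ij^(u) = B_ij - B_ij^(b) in nm^3."""
    return b_ij - b_bound(kd_micromolar)


# Lysozyme, insertion/removal method (Table 1): K_d = 5191 uM, B_22 = -77 nm^3.
kd_lysozyme, b22_lysozyme = 5191.0, -77.0
print(f"B_22^(b) = {b_bound(kd_lysozyme):.0f} nm^3")
print(f"B_22^(u) = {b_unbound(b22_lysozyme, kd_lysozyme):.0f} nm^3")
```

The two printed values round to −160 and 83 nm³, in agreement with the tabulated B₂₂^(b) and B₂₂^(u).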

Figure 5. Contributions of binding and nonbinding interactions to B_ij = B_ij^(b) + B_ij^(u) for three protein pairs. We show estimates from the insertion/removal method in color and estimates from the subvolume method using larger symbols in gray. B_ij of the strongest binders is dominated by the contributions of binding, B_ij^(b) = −1/(2N_AK_d), such that the ratio |B_ij^(b)/B_ij| is close to one (top). In these cases, nonbinding contributions to B_ij are relatively small, i.e., |B_ij^(u)/B_ij| ≪ 1 (center).

Note that for Ubi/CUE and UDG/Ugi, the estimates for B₂₃^(u) = B₂₃ − B₂₃^(b) are much smaller than the individual errors of B₂₃ and B₂₃^(b) (∼27 000 nm³ for UDG/Ugi and ∼40 nm³ for Ubi/CUE; Table 1). Naively, one would think that these large uncertainties preclude reliable estimates for the comparatively small difference B₂₃^(u). However, the estimates for B₂₃ and B₂₃^(b) from resampling are highly correlated because of the strong correlation of B₂₃ and K_d (Figure 2). That is, the individual errors of B₂₃ and B₂₃^(b) alone do not determine the error of their difference.
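
The role of the correlation follows from the variance of a difference, var(X − Y) = var(X) + var(Y) − 2 cov(X, Y). The Python sketch below illustrates this with synthetic numbers that only mimic the magnitudes for Ubi/CUE. They are not the resampling estimates of this work, and the noise model (a large shared fluctuation plus a small independent one) is an assumption.

```python
import math
import random


def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Synthetic, strongly correlated "estimates" of B and B^(b) in nm^3.
rng = random.Random(3)
shared = [rng.gauss(0.0, 40.0) for _ in range(1000)]
b_total = [-5444.0 + s + rng.gauss(0.0, 3.0) for s in shared]
b_bound = [-5435.0 + s for s in shared]

diff = [t - b for t, b in zip(b_total, b_bound)]
err_diff = math.sqrt(variance(diff))
err_propagated = math.sqrt(variance(b_total) + variance(b_bound)
                           - 2.0 * covariance(b_total, b_bound))
err_if_uncorrelated = math.sqrt(variance(b_total) + variance(b_bound))
print(f"error of difference: {err_diff:.1f} nm^3 "
      f"(ignoring correlation: {err_if_uncorrelated:.1f} nm^3)")
```

With the covariance term included, the propagated error equals the error computed directly from the differences and is far smaller than the value obtained by treating the two estimates as independent.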

Next, we show that the naive estimate of K_d from concentrations using eq 8 actually suffers from a finite-size effect and that it converges to the estimates obtained with the insertion/removal and subvolume methods for large system sizes (Figure 6). For comparison only, we evaluate eq 8 for our predictions of p_b(V) obtained at the box volume of 3375 nm³ (Figure 3) and extrapolate the naive estimates for K_d until convergence is reached. For typical box sizes used in simulations, K_d is underestimated by about 10% for the lysozyme homodimer, the weakest binder considered here, and by 3 orders of magnitude for UDG/Ugi, the strongest binder considered here. To reach convergence when using eq 8, the box volumes have to be increased by a factor ∼100 for the weakest binder and by a factor ∼100 000 for the strongest binder compared to typical box sizes.

Figure 6. Finite-size correction gives box-size-independent dissociation constants K_d (eq 1, red symbols). The naive estimate of K_d (eq 8, blue symbols) suffers from finite-size effects and converges to the true value (gray horizontal line) for increasing box size. We illustrate this convergence by evaluating eq 8 for the predictions for p_b(V) from the insertion/removal method (black solid line) and the subvolume method (red dashed line). Approximately corrected estimates (eq 21, eq 13 of de Jong et al.,¹⁸ green symbols) suffer from small finite-size effects and also converge to the true value for increasing box volume. Error bars have been obtained by resampling.

Using eq 1, we obtain finite-size effect-free estimates for K_d at all box volumes (see Figure 6). In contrast, the estimates obtained using the approximate relation given by eq 21 (eq 13 of de Jong et al.¹⁸) show small but systematic deviations determined by B_ij^(u) (eq 41; see Figure 6). These systematic deviations decrease with increasing box volume as 1/V (Figure 7). For the three dimers considered here, these differences are in the range of ±5% for the smallest box sizes used here.
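
The size of these finite-size effects can be illustrated with a short Python sketch. Equations 1, 8, 15, and 21 are not reproduced in this section, so the forms below are assumptions: K_d = 1/[N_A p_b (V − 2B_ij)] for the corrected estimate, (1 − p_b)²/(N_A V p_b) for the naive ratio of concentrations, (1 − p_b)/(N_A V p_b) for the approximate estimate, and p_b(V) = −2B_ij^(b)/(V − 2B_ij). Under these assumptions, the relative deviation of the approximate estimate equals −2B_ij^(u)/V. The inputs are the lysozyme values of Table 1.

```python
N_A = 0.602214076  # molecules per nm^3 at 1 mol/L


def fraction_bound(box_volume, b_bound, b_unbound):
    """Assumed p_b(V) = -2 B^(b) / (V - 2 B), with B = B^(b) + B^(u)."""
    return -2.0 * b_bound / (box_volume - 2.0 * (b_bound + b_unbound))


def kd_corrected(p_b, box_volume, b_ij):
    """Assumed form of eq 1: K_d = 1 / [N_A p_b (V - 2 B_ij)]."""
    return 1.0 / (N_A * p_b * (box_volume - 2.0 * b_ij))


def kd_naive(p_b, box_volume):
    """Assumed form of eq 8: (1 - p_b)^2 / (N_A V p_b)."""
    return (1.0 - p_b) ** 2 / (N_A * box_volume * p_b)


def kd_approximate(p_b, box_volume):
    """Assumed form of eq 21: (1 - p_b) / (N_A V p_b)."""
    return (1.0 - p_b) / (N_A * box_volume * p_b)


# Lysozyme (Table 1, insertion/removal): B^(b) = -160 nm^3, B^(u) = 83 nm^3.
b_b, b_u = -160.0, 83.0
results = {}
for volume in (3375.0, 1.0e6):
    p_b = fraction_bound(volume, b_b, b_u)
    exact = kd_corrected(p_b, volume, b_b + b_u)
    results[volume] = (kd_naive(p_b, volume) / exact,
                       kd_approximate(p_b, volume) / exact - 1.0)
    print(f"V = {volume:.0f} nm^3: naive/corrected = {results[volume][0]:.3f}, "
          f"approximate relative deviation = {results[volume][1]:+.4f}")
```

With these inputs, the naive estimate is roughly 13% too low and the approximate estimate about 5% too low at 3375 nm³, while both deviations become negligible at 10⁶ nm³. These magnitudes are in line with those reported above.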

Figure 7. Relative difference between the approximate estimates K_d^′ (eq 21, eq 13 of de Jong et al.¹⁸) and the box-size-independent estimates K_d (eq 1) for the dissociation constant as shown in Figure 6 (discs) as functions of the inverse box volume 1/V. This difference is proportional to 1/V and to the contribution of unbound states to the second osmotic virial coefficient, B_ij^(u) (eq 41, lines).

5. Conclusions

We have shown how to calculate the dissociation constant K_d of two proteins in a box from the fraction of protein dimers and the second osmotic virial coefficient B_ij. We derived and validated two methods to calculate B_ij: For implicit solvents, we can use standard Monte Carlo or molecular dynamics simulations of two proteins in a box and determine insertion and removal energy distribution functions. From the latter, we determine the two-particle partition function and thus B_ij using BAR/WHAM. For implicit and explicit solvents, we can calculate the probability that the two proteins are within a volume at least covering the interaction range of the two proteins.²⁹ Calculating B_ij from the radial distribution function or equally the potential of mean force via an integral is equivalent to this method. For the coarse-grained simulations performed here, both methods provide accurate results with comparable uncertainties.

The relationship between K_d and B_ij given by eq 1 is also well suited for the quantification of protein interactions in molecular dynamics simulations using explicit solvents. Fully atomistic simulations of concentrated protein solutions in explicit solvents have become computationally feasible on the microsecond scale.^15,16 These studies have been facilitated by recent improvements in molecular force fields, which correct, among other things, for an increased stickiness of protein surfaces.⁵⁸⁻⁶¹ These parameterization efforts can benefit from comparisons of K_d and B_ij to the experiment.

Fully atomistic simulations are within reach for the protein pairs considered here. The box volume of 3375 nm³ used here corresponds to about 300 000 particles in fully atomistic simulations using explicit solvents. The binding and unbinding of weakly binding proteins like lysozyme can be simulated atomistically without bias.¹⁵ For more strongly binding proteins, enhanced sampling techniques have to be applied.⁶² Binding and unbinding events of proteins and other molecules can also be simulated efficiently without bias in explicit-solvent molecular dynamics simulations using, for example, the MARTINI model.⁶³⁻⁶⁵

The sampling strategy used here for weak binders is different from the sampling strategy commonly used for strong binders. Strong binders usually have specific interfaces, and the dissociation constant is determined by the binding free energy to these specific interfaces. If these interfaces are known, then we only have to calculate the binding free energy for these specific binding poses dominating K_d.¹⁷ For weak binders, nonspecific binding can also contribute significantly to K_d and thus has to be sampled.

B_ij also plays an important role in understanding phase separation by which liquid droplets are formed within cells.⁶⁶ Specifically, the Flory–Huggins solution theory is used to model liquid–liquid phase separations.^67,68 In this framework, the Flory interaction parameter χ is determined by K_d and B_ij.⁶⁹ B_ij also determines the “effective solvation volume” up to a proportionality constant, a quantity commonly used in polymer science.⁷⁰

The interactions of proteins in nonbinding configurations can be quantified by B_ij^(u), which is fully determined by K_d and B_ij and which is thus a well-defined thermodynamic quantity. These interactions shape the physicochemical properties of the crowded environments inside cells. For example, nonbinding interactions can lead to demixing and therefore to colocalization of binding partners. This colocalization effectively increases the binding probability.

In principle, the contributions B_ij^(u) of nonbinding interactions to B_ij can be determined experimentally. SAXS experiments provide information about B_ij in the forward scattering intensities as well as information about dimerization, and thus K_d, encoded in the radius of gyration. Varying protein concentrations in equilibrium sedimentation experiments can provide estimates for K_d and B_ij.¹⁰ The latter is used to correct for the nonideality of the protein solution. Equation 1 can be viewed as such a correction for nonideality. Especially for weak binders, we expect that K_d and B_ij can be estimated accurately enough such that the contributions B_ij^(u) of nonbinding conformations to B_ij can be determined. Similar to the calculations performed here, we expect that in sedimentation experiments, the uncertainties in the estimates for B_ij^(u) will be much smaller than the individual uncertainties in the estimates for K_d and B_ij.

Complexes++ simulation software and the binless-WHAM code can be downloaded free of charge at https://www.github.com/bio-phys/complexespp and at https://github.com/bio-phys/binless-wham, respectively.

Acknowledgments

We thank Drs. Mateusz Sikora, Jakob T. Bullerjahn, Roberto Covino, and Attila Szabo for insightful discussions. We acknowledge the financial support by the Max Planck Society.

Author Present Address

^§ Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80-82, A-6020 Innsbruck, Austria.

The authors declare no competing financial interest.

References

Johnson M. E.; Hummer G. Nonspecific binding limits the number of proteins in a cell and shapes their interaction networks. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 603–608. 10.1073/pnas.1010954108.
Johnson M. E.; Hummer G. Evolutionary Pressure on the Topology of Protein Interface Interaction Networks. J. Phys. Chem. B 2013, 117, 13098–13106. 10.1021/jp402944e.
Qin S.; Zhou H.-X. Protein folding, binding, and droplet formation in cell-like conditions. Curr. Opin. Struct. Biol. 2017, 43, 28–37. 10.1016/j.sbi.2016.10.006.
Kastritis P. L.; Bonvin A. M. J. J. On the binding affinity of macromolecular interactions: daring to ask why proteins interact. J. R. Soc., Interface 2013, 10, 20120835. 10.1098/rsif.2012.0835.
McMillan W. G.; Mayer J. E. The Statistical Thermodynamics of Multicomponent Systems. J. Chem. Phys. 1945, 13, 276–305. 10.1063/1.1724036.
Hill T. L. Theory of Protein Solutions. I. J. Chem. Phys. 1955, 23, 623–636. 10.1063/1.1742068.
McQuarrie D. A. Statistical Mechanics; Harper & Row: New York, 1976.
Tessier P. M.; Vandrey S. D.; Berger B. W.; Pazhianur R.; Sandler S. I.; Lenhoff A. M. Self-interaction chromatography: a novel screening method for rational protein crystallization. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 1531–1535. 10.1107/S0907444902012775.
George A.; Chiang Y.; Guo B.; Arabshahi A.; Cai Z.; Wilson W. Macromolecular Crystallography Part A. Methods in Enzymology; Academic Press: 1997; Vol. 276, pp 100–110.
Harding S. E.; Rowe A. J. Insight into protein–protein interactions from analytical ultracentrifugation. Biochem. Soc. Trans. 2010, 38, 901–907. 10.1042/BST0380901.
Deszczynski M.; Harding S. E.; Winzor D. J. Negative second virial coefficients as predictors of protein crystal growth: Evidence from sedimentation equilibrium studies that refutes the designation of those light scattering parameters as osmotic virial coefficients. Biophys. Chem. 2006, 120, 106–113. 10.1016/j.bpc.2005.10.003.
Winzor D. J.; Deszczynski M.; Harding S. E.; Wills P. R. Nonequivalence of second virial coefficients from sedimentation equilibrium and static light scattering studies of protein solutions. Biophys. Chem. 2007, 128, 46–55. 10.1016/j.bpc.2007.03.001.
Blanco M. A.; Sahin E.; Li Y.; Roberts C. J. Reexamining protein-protein and protein-solvent interactions from Kirkwood-Buff analysis of light scattering in multi-component solutions. J. Chem. Phys. 2011, 134, 225103. 10.1063/1.3596726.
Wills P. R.; Winzor D. J. Rigorous analysis of static light scattering measurements on buffered protein solutions. Biophys. Chem. 2017, 228, 108–113. 10.1016/j.bpc.2017.07.007.
von Bülow S.; Siggel M.; Linke M.; Hummer G. Dynamic cluster formation determines viscosity and diffusion in dense protein solutions. Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 9843–9852. 10.1073/pnas.1817564116.
Nawrocki G.; Karaboga A.; Sugita Y.; Feig M. Effect of protein-protein interactions and solvent viscosity on the rotational diffusion of proteins in crowded environments. Phys. Chem. Chem. Phys. 2019, 21, 876–883. 10.1039/C8CP06142D.
Woo H.-J.; Roux B. Calculation of absolute protein–ligand binding free energy from computer simulations. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6825–6830. 10.1073/pnas.0409005102.
De Jong D. H.; Schäfer L. V.; De Vries A. H.; Marrink S. J.; Berendsen H. J. C.; Grubmüller H. Determining equilibrium constants for dimerization reactions from molecular dynamics simulations. J. Comput. Chem. 2011, 32, 1919–1928. 10.1002/jcc.21776.
Yesselman J. D.; Denny S. K.; Bisaria N.; Herschlag D.; Greenleaf W. J.; Das R. Sequence-dependent RNA helix conformational preferences predictably impact tertiary structure formation. Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 16847–16855. 10.1073/pnas.1901530116.
Zimm B. H. Application of the Methods of Molecular Distribution to Solutions of Large Molecules. J. Chem. Phys. 1946, 14, 164–179. 10.1063/1.1724116.
Neal B.; Asthagiri D.; Lenhoff A. Molecular Origins of Osmotic Second Virial Coefficients of Proteins. Biophys. J. 1998, 75, 2469–2477. 10.1016/S0006-3495(98)77691-X.
Kim B.; Song X. Calculations of the second virial coefficients of protein solutions with an extended fast multipole method. Phys. Rev. E 2011, 83, 011915. 10.1103/PhysRevE.83.011915.
Singh J. K.; Kofke D. A. Mayer Sampling: Calculation of Cluster Integrals using Free-Energy Perturbation Methods. Phys. Rev. Lett. 2004, 92, 220601. 10.1103/PhysRevLett.92.220601.
Benjamin K. M.; Singh J. K.; Schultz A. J.; Kofke D. A. Higher-Order Virial Coefficients of Water Models. J. Phys. Chem. B 2007, 111, 11463–11473. 10.1021/jp0710685.
Grünberger A.; Lai P.-K.; Blanco M. A.; Roberts C. J. Coarse-Grained Modeling of Protein Second Osmotic Virial Coefficients: Sterics and Short-Ranged Attractions. J. Phys. Chem. B 2013, 117, 763–770. 10.1021/jp308234j.
Qin S.; Zhou H.-X. Calculation of Second Virial Coefficients of Atomistic Proteins Using Fast Fourier Transform. J. Phys. Chem. B 2019, 123, 8203–8215. 10.1021/acs.jpcb.9b06808.
Mereghetti P.; Gabdoulline R. R.; Wade R. C. Brownian Dynamics Simulation of Protein Solutions: Structural and Dynamical Properties. Biophys. J. 2010, 99, 3782–3791. 10.1016/j.bpj.2010.10.035.
Mereghetti P.; Martinez M.; Wade R. C. Long range Debye-Hückel correction for computation of grid-based electrostatic forces between biomacromolecules. BMC Biophys. 2014, 7, 4. 10.1186/2046-1682-7-4.
Ashton D. J.; Wilding N. B. Three-body interactions in complex fluids: Virial coefficients from simulation finite-size effects. J. Chem. Phys. 2014, 140, 244118. 10.1063/1.4883718.
Ashton D. J.; Wilding N. B. Quantifying the effects of neglecting many-body interactions in coarse-grained models of complex fluids. Phys. Rev. E 2014, 89, 031301. 10.1103/PhysRevE.89.031301.
Bennett C. H. Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys. 1976, 22, 245–268. 10.1016/0021-9991(76)90078-4.
Souaille M.; Roux B. Extension to the Weighted Histogram Analysis Method Combining Umbrella Sampling With Free Energy Calculations. Comput. Phys. Commun. 2001, 135, 40–57. 10.1016/S0010-4655(00)00215-0.
Shirts M. R.; Chodera J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129, 124105. 10.1063/1.2978177.
Rosta E.; Nowotny M.; Yang W.; Hummer G. Catalytic Mechanism of RNA Backbone Cleavage by Ribonuclease H from Quantum Mechanics/molecular Mechanics Simulations. J. Am. Chem. Soc. 2011, 133, 8934–8941. 10.1021/ja200173a.
Harding S. E.; Horton J. C.; Jones S.; Thornton J. M.; Winzor D. J. COVOL: An Interactive Program for Evaluating Second Virial Coefficients from the Triaxial Shape or Dimensions of Rigid Macromolecules. Biophys. J. 1999, 76, 2432–2438. 10.1016/S0006-3495(99)77398-4.
Onnes H. K. Through Measurement to Knowledge: The Selected Papers of Heike Kamerlingh Onnes 1853-1926; Gavroglu K.; Goudaroulis Y., Eds.; Springer Netherlands: Dordrecht, 1991; pp 146–163.
Kamerlingh Onnes H. In Expression of the Equation of State of Gases and Liquids by Means of Series. KNAW Proceedings, Amsterdam, 1902; pp 125–147.
Hill T. L. Theory of Solutions. II. Osmotic Pressure Virial Expansion and Light Scattering in Two Component Solutions. J. Chem. Phys. 1959, 30, 93–97. 10.1063/1.1729949.
Widom B.; Underwood R. C. Second Osmotic Virial Coefficient from the Two-Component van der Waals Equation of State. J. Phys. Chem. B 2012, 116, 9492–9499. 10.1021/jp3051802.
Gilson M.; Given J.; Bush B.; McCammon J. The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys. J. 1997, 72, 1047–1069. 10.1016/S0006-3495(97)78756-3.
Woolley H. W. The Representation of Gas Properties in Terms of Molecular Clusters. J. Chem. Phys. 1953, 21, 236–241. 10.1063/1.1698866.
Schröer W.; Weiss V. C. Molecular association in statistical thermodynamics. J. Mol. Liq. 2015, 205, 22–30. 10.1016/j.molliq.2014.08.013. Global Perspectives on the Structure and Dynamics of Liquids and Mixtures: Experiment and Simulation 9–13 September, 2013.
Lebowitz J. L.; Percus J. K.; Verlet L. Ensemble Dependence of Fluctuations with Application to Machine Computations. Phys. Rev. 1967, 153, 250–254. 10.1103/PhysRev.153.250.
Kirkwood J. G.; Buff F. P. The Statistical Mechanical Theory of Solutions. I. J. Chem. Phys. 1951, 19, 774–777. 10.1063/1.1748352.
Ben-Naim A.; Navarro A. M.; Leal J. M. A Kirkwood-Buff analysis of local properties of solutions. Phys. Chem. Chem. Phys. 2008, 10, 2451–2460. 10.1039/b716116f.
Singh J. K.; Kofke D. A. Mayer Sampling: Calculation of Cluster Integrals using Free-Energy Perturbation Methods. Phys. Rev. Lett. 2004, 92, 220601. 10.1103/PhysRevLett.92.220601.
Buchete N.-V.; Hummer G. Coarse Master Equations for Peptide Folding Dynamics. J. Phys. Chem. B 2008, 112, 6057–6069. 10.1021/jp0761665.
Sophianopoulos A. J. Association Sites of Lysozyme in Solution: I. THE ACTIVE SITE. J. Biol. Chem. 1969, 244, 3188–3193.
Diamond R. Real-space refinement of the structure of hen egg-white lysozyme. J. Mol. Biol. 1974, 82, 371–391. 10.1016/0022-2836(74)90598-1.
Kang R. S.; Daniels C. M.; Francis S. A.; Shih S. C.; Salerno W. J.; Hicke L.; Radhakrishnan I. Solution Structure of a CUE-Ubiquitin Complex Reveals a Conserved Mode of Ubiquitin Binding. Cell 2003, 113, 621–630. 10.1016/S0092-8674(03)00362-3.
Bennett S. E.; Schimerlik M. I.; Mosbaugh D. W. Kinetics of the uracil-DNA glycosylase/inhibitor protein association. Ung interaction with Ugi, nucleic acids, and uracil compounds. J. Biol. Chem. 1993, 268, 26879–26885.
Putnam C. D.; Shroyer M. J. N.; Lundquist A. J.; Mol C. D.; Arvai A. S.; Mosbaugh D. W.; Tainer J. A. Protein mimicry of DNA from crystal structures of the uracil-DNA glycosylase inhibitor protein and its complex with Escherichia coli uracil-DNA glycosylase. J. Mol. Biol. 1999, 287, 331–346. 10.1006/jmbi.1999.2605.
Kim Y. C.; Hummer G. Coarse-grained Models for Simulations of Multiprotein Complexes: Application to Ubiquitin Binding. J. Mol. Biol. 2008, 375, 1416–1433. 10.1016/j.jmb.2007.11.063.
Miyazawa S.; Jernigan R. L. Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 1985, 18, 534–552. 10.1021/ma00145a039.
Miyazawa S.; Jernigan R. L. Residue - Residue Potentials with a Favorable Contact Pair Term and an Unfavorable High Packing Density Term, for Simulation and Threading. J. Mol. Biol. 1996, 256, 623–644. 10.1006/jmbi.1996.0114.
Efron B. Bootstrap Methods: Another Look at the Jackknife. Ann. Stat. 1979, 7, 1–26. 10.1214/aos/1176344552.
Straatsma T.; Berendsen H.; Stam A. Estimation of statistical errors in molecular simulation calculations. Mol. Phys. 1986, 57, 89–95. 10.1080/00268978600100071.
Best R. B.; Zheng W.; Mittal J. Balanced Protein-Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 2014, 10, 5113–5124. 10.1021/ct500569b.
Piana S.; Donchev A. G.; Robustelli P.; Shaw D. E. Water Dispersion Interactions Strongly Influence Simulated Structural Properties of Disordered Protein States. J. Phys. Chem. B 2015, 119, 5113–5123. 10.1021/jp508971m.
Robustelli P.; Piana S.; Shaw D. E. Developing a molecular dynamics force field for both folded and disordered protein states. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, E4758–E4766. 10.1073/pnas.1800690115.
Piana S.; Robustelli P.; Tan D.; Chen S.; Shaw D. E. Development of a Force Field for the Simulation of Single-Chain Proteins and Protein-Protein Complexes. J. Chem. Theory Comput. 2020, 16, 2494–2507. 10.1021/acs.jctc.9b00251.
Siebenmorgen T.; Engelhard M.; Zacharias M. Prediction of protein-protein complexes using replica exchange with repulsive scaling. J. Comput. Chem. 2020, 41, 1436–1447. 10.1002/jcc.26187.
Marrink S. J.; Risselada H. J.; Yefimov S.; Tieleman D. P.; de Vries A. H. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. J. Phys. Chem. B 2007, 111, 7812–7824. 10.1021/jp071097f.
Stark A. C.; Andrews C. T.; Elcock A. H. Toward Optimized Potential Functions for Protein-Protein Interactions in Aqueous Solutions: Osmotic Second Virial Coefficient Calculations Using the MARTINI Coarse-Grained Force Field. J. Chem. Theory Comput. 2013, 9, 4176–4185. 10.1021/ct400008p.
Schmalhorst P. S.; Deluweit F.; Scherrers R.; Heisenberg C.-P.; Sikora M. Overcoming the Limitations of the MARTINI Force Field in Simulations of Polysaccharides. J. Chem. Theory Comput. 2017, 13, 5039–5053. 10.1021/acs.jctc.7b00374.
Brangwynne C.; Tompa P.; Pappu R. Polymer physics of intracellular phase transitions. Nat. Phys. 2015, 11, 899–904. 10.1038/nphys3532.
Huggins M. L. Solutions of Long Chain Compounds. J. Chem. Phys. 1941, 9, 440. 10.1063/1.1750930.
Flory P. J. Thermodynamics of High Polymer Solutions. J. Chem. Phys. 1941, 9, 660. 10.1063/1.1750971.
Wei M.-T.; Elbaum-Garfinkle S.; Holehouse A. S.; Chen C. C.-H.; Feric M.; Arnold C. B.; Priestley R. D.; Pappu R. V.; Brangwynne C. P. Phase behaviour of disordered proteins underlying low density and high permeability of liquid organelles. Nat. Chem. 2017, 9, 1118–1125. 10.1038/nchem.2803.
Harmon T. S.; Holehouse A. S.; Rosen M. K.; Pappu R. V. Intrinsically disordered linkers determine the interplay between phase separation and gelation in multivalent proteins. eLife 2017, 6, e30294. 10.7554/eLife.30294.

