Author manuscript; available in PMC: 2020 Oct 3.
Published in final edited form as: J Phys Chem B. 2019 Sep 19;123(39):8203–8215. doi: 10.1021/acs.jpcb.9b06808

Calculation of Second Virial Coefficients of Atomistic Proteins Using Fast Fourier Transform

Sanbo Qin †,, Huan-Xiang Zhou †,*
PMCID: PMC7032052  NIHMSID: NIHMS1558946  PMID: 31490691

Abstract

The second virial coefficient, B2, measures a protein solution’s deviation from ideal behavior. It is widely used to predict or explain solubility, crystallization condition, aggregation propensity, and critical temperature for liquid-liquid phase separation. B2 is determined by the interaction energy between two protein molecules, specifically, by the integration of the Mayer f-function in the relative configurational space (translation and rotation) of the two molecules. Simple theoretical models, such as one attributed to Derjaguin, Landau, Verwey, and Overbeek (DLVO), can fit the dependence of B2 on salt concentrations. However, model parameters derived often are physically unrealistic and hardly transferable from protein to protein. Previous B2 calculations incorporating atomistic details were done with limited sampling in the configurational space, due to enormous computational cost. Our FMAP method, based on fast Fourier transform, can considerably accelerate such calculations, and here we adapt it to calculate B2 values for proteins represented at the atomic level in implicit solvent. After tuning of a single parameter in the energy function, FMAPB2 predicts well the B2 values for lysozyme and other proteins over wide ranges of solvent conditions (salt concentration, pH, and temperature). The method is available as a web server at http://pipe.rcc.fsu.edu/fmapb2.


INTRODUCTION

The second virial coefficient, B2, measures a protein solution’s deviation from ideal behavior. It is widely used to predict or explain various thermodynamic properties, including protein solubility,1 crystallization condition,2 aggregation propensity,3 and critical temperature for liquid-liquid phase separation.4 While its value is obtained through bulk experiments, microscopically, according to statistical thermodynamics,5 B2 is determined by the interaction energy between two protein molecules. Over the years, considerable efforts have been expended to obtain B2 values experimentally and by theoretical modeling or computation, yet significant barriers remain on both fronts. Conceptually, there is much similarity between B2 as a measure for weak, nonspecific binding and binding constant as the measure for strong, specific binding. Given the many nonspecific interactions that proteins inescapably encounter in cellular environments, there is renewed interest in better characterizing nonspecific interactions.6 This study aimed to develop a computational method for B2 that achieves both accuracy in terms of comparison against cross-validated experimental data and physical insight at the atomic level.

A variety of experimental techniques have been developed to obtain B2 values, each with its own limitations. The most often used technique nowadays is static light scattering (SLS), where one plots the reciprocal of the Rayleigh scattering intensity at a 90° angle against protein concentration, with the slope yielding B2. This technique consumes a great amount of time and protein samples, and is prone to artifacts including protein aggregation and other impurities. These limitations motivated continued improvements, including flow SLS, which measures the scattering intensity and protein concentration simultaneously in flow-mode using size exclusion chromatography,7 and concentration gradient SLS,8 which achieves similar goals using a programmable dual-syringe infusion pump for dispensing solutions with varying concentrations. Another technique is small angle X-ray or neutron scattering (SAXS/SANS).9–10 The scattering intensity can be decomposed to obtain the structure factor, which contains information about the weak interactions between solute molecules, over the entire range of the magnitude, q, of the scattering vector. This technique requires specialized resources and again is time-consuming. In addition, extracting B2 from SAXS/SANS data is an elaborate process that can generate considerable uncertainty, usually involving extrapolation to q = 0 of the structure factors at several protein concentrations and a subsequent linear fit of the concentration dependence. Aside from the inherent limitations of the experimental techniques, an even more serious problem is that the same technique when used by different researchers and the same researchers using different techniques can produce large discrepancies in B2 values.

Theoretically, B2 can be calculated by integrating the Mayer f-function in the 6-dimensional relative configurational space (translation and rotation) of the two molecules. When the interaction energy is modeled as spherically symmetric, dependence on 5 of the 6 degrees of freedom disappears and B2 becomes a 1-dimensional integral, over the interprotein distance. Such spherical models, in particular one attributed to Derjaguin, Landau, Verwey, and Overbeek (DLVO),11–12 have long been used to help interpret experimental data, e.g., for fitting salt dependence of B2. However, the parameters derived from such fits often lack physical significance13 and are hardly transferable from one solvent condition to another or from one protein to another.

Various energy models with higher levels of realism have been introduced for B2 calculations, necessitating sampling in the 6-dimensional space.14–23 The representation of the protein ranged from sphere plus embedded charges16 to coarse-grained17, 20–23 to all atom.14–15, 18–19 While most used implicit solvent, one used explicit solvent22 and another used explicit ions;17 one other study considered limited protein flexibility.23 B2 was calculated in one of two ways. In the first, the configurational space was sampled to implement the integration of the Mayer f-function.14–16, 20–21 The number of configurations ranged from as few as six20 to thousands14 to 10⁶ to 10⁹,15–16, 21 generated by uniform15 or Monte Carlo14, 16, 21 sampling. In the second way, a dilute protein solution was simulated by Brownian dynamics,18–19 molecular dynamics,22 or Monte Carlo17, 23 to obtain the radial distribution function, which, after subtracting 1, equals the Mayer f-function. Several findings from these studies are worth noting. The pioneering study of Neal et al.,14 though with too few configurations to have convergent B2, generated insightful observations, such as the outsized contributions of a small number of highly favorable configurations. Elcock and McCammon15 commented that the integration of the Mayer f-function “might be dramatically accelerated using Fast Fourier Transform (FFT) methods.” McGuffee and Elcock18 presented success in modeling the effects of single mutations on B2. Stark et al.22 ran molecular dynamics simulations in the popular Martini coarse-grained force field24 to calculate B2, and found that 70% weakening of the strength of van der Waals interactions between proteins was necessary to reach agreement with experiment. The more realistic and/or more rigorous of these studies are so computationally demanding as to prohibit their applications on a large scale.

Here we present a method that breaks this last barrier, based on using FFT. In previous studies, we have already introduced a method called FMAP, or FFT-based modeling of atomistic protein-protein interactions, and demonstrated its superb speed in calculating the interaction energy of a probe molecule when placed at uniformly distributed positions inside a box of target molecules, enabling fast determination of the chemical potential.25–28 Here we adapt FMAP to calculate B2 for proteins represented at the atomic level in implicit solvent, and test FMAPB2 on five proteins for which extensive experimental data are available: lysozyme,1, 7, 9–10, 23, 29–30 chymotrypsinogen A,7, 9, 31 bovine pancreatic trypsin inhibitor (BPTI),32 γD crystallin,33 and bovine serum albumin (BSA).34 After limited tuning of a single parameter in the interaction energy function, the calculated B2 values, each involving more than 10¹¹ configurations, agree well with experimental data over wide ranges of solvent conditions (salt concentration, pH, and temperature). We make FMAPB2 available as a web server at http://pipe.rcc.fsu.edu/fmapb2.

THEORETICAL BACKGROUND

Virial Coefficients.

The equation of state of any gas approaches the ideal gas law

βP=ρ [1]

at very low number densities (i.e., ρ → 0). Here P denotes pressure, and β = 1/(kBT) with kB and T denoting the Boltzmann constant and absolute temperature, respectively. A similar relation is valid for the osmotic pressure, Π, of a dilute solution,

βΠ=ρ [2]

where ρ now represents the number density of the solute. Indeed, according to the McMillan-Mayer solution theory,35 a solution is exactly like a gas, if the intermolecular interaction energy is replaced by a potential of mean force (PMF) for a pair of solute molecules, with the solvent degrees of freedom averaged out. This correspondence between gas and solution is the basis for implicit modeling of the solvent. Below we describe how deviations from ideal behaviors are related to intermolecular interactions. The presentation is in the language for gases, but it is also applicable to solutions according to the just noted correspondence.

At higher densities, the pressure can be expanded in a series,5

$$\beta P = \rho + \sum_{n \ge 2} B_n \rho^n$$ [3]

where B2, B3, … are known as the second, third, etc. virial coefficients. For a monatomic gas, B2 is determined by

$$B_2 = -\frac{1}{2}\int d\mathbf{R}\,\left[e^{-\beta U(\mathbf{R})} - 1\right]$$ [4]

where R is the relative position vector between two gas molecules, and U(R) is the intermolecular interaction energy. Higher order virial coefficients are given by interaction energies among three, four, etc. gas molecules. The integrand, exp[−βU(R)] − 1, in eq [4] is known as a Mayer f-function. If U(R) is centrosymmetric, then the integration over the direction of R, as specified by polar angle θ and azimuthal angle ϕ, can be trivially done,

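$$B_2 = -\frac{1}{2}\int dR\,d\theta\,d\phi\,R^2\sin\theta\,\left[e^{-\beta U(R)} - 1\right] = -\frac{1}{2}\int dR\,4\pi R^2\left[e^{-\beta U(R)} - 1\right]$$ [5]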

Even when U(R) is not centrosymmetric, we can express B2 in the form of eq [5], by first integrating over θ and ϕ,

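$$e^{-\beta W(R)} \equiv \frac{1}{4\pi}\int d\theta\,d\phi\,\sin\theta\,e^{-\beta U(\mathbf{R})} \equiv \left\langle e^{-\beta U(\mathbf{R})}\right\rangle_{\theta,\phi}$$ [6]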

leading to

$$B_2 = -\frac{1}{2}\int dR\,4\pi R^2\left[e^{-\beta W(R)} - 1\right]$$ [7]

Eq [6] serves to define a PMF W(R) along the distance R between two gas molecules.

For polyatomic gas molecules that are rigid, the interaction energy depends on both the relative position and the relative orientation. We denote the latter by Ω, which can be specified by 3 rotation angles such as the Euler angles. Then the second virial coefficient is given by

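$$B_2 = -\frac{1}{2}\,\frac{1}{8\pi^2}\int d\mathbf{R}\,d\Omega\,\left[e^{-\beta U(\mathbf{R},\Omega)} - 1\right]$$ [8]

where an additional prefactor has been added to ensure proper reduction to eq [4] when the dependence of U(R, Ω) on Ω disappears. It is assumed that any Jacobian associated with the integration over Ω is absorbed into dΩ and that ∫ dΩ = 8π². We again can cast eq [8] into the form of eq [7], by integrating over all the degrees of freedom except R. The PMF W(R) is now given by

$$e^{-\beta W(R)} \equiv \frac{1}{4\pi}\,\frac{1}{8\pi^2}\int d\theta\,d\phi\,d\Omega\,\sin\theta\,e^{-\beta U(\mathbf{R},\Omega)} \equiv \left\langle e^{-\beta U(\mathbf{R},\Omega)}\right\rangle_{\theta,\phi,\Omega}$$ [9]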

When the polyatomic molecules are flexible, i.e., they possess internal degrees of freedom, collectively denoted by X, we can further expand the definition of the PMF to

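$$e^{-\beta W(R)} \equiv \frac{1}{4\pi}\,\frac{1}{8\pi^2}\,\frac{1}{V_X}\int d\theta\,d\phi\,d\Omega\,dX\,\sin\theta\,e^{-\beta U(\mathbf{R},\Omega,X)} \equiv \left\langle e^{-\beta U(\mathbf{R},\Omega,X)}\right\rangle_{\theta,\phi,\Omega,X}$$ [10]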

where VX = ∫ dX.

As can be easily seen from eq [8], repulsive interactions [i.e., U(R, Ω) > 0] make positive contributions to B2, whereas attractive interactions [i.e., U(R, Ω) < 0] make negative contributions.

Simple Models of Intermolecular Interaction Energy.

Proteins have long been modeled as spherical. Often the protein spheres are assumed to be impenetrable, meaning that the interaction energy is

$$W_{\mathrm{st}}(R) = \infty \quad \text{if } R < d$$ [11]

where the subscript “st” indicates steric repulsion and d is the diameter of the protein spheres. If there are no other interactions, then according to eq [5] the second virial coefficient for proteins with only steric repulsion is

$$B_2^0 = \frac{1}{2}\int_0^d dR\,4\pi R^2 \equiv \frac{1}{2}V_{\mathrm{co}}$$ [12]

Eq [12] defines the co-volume Vco, which is the volume inaccessible to the center of a second protein molecule by the presence of a first protein molecule, and is 8 times the volume of a single protein sphere.
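As a quick numerical check of eq [12], the sketch below integrates eq [7] with a midpoint rule for the hard-sphere PMF of eq [11] and compares the result with the closed form (2/3)πd³. The function names, the integration grid, and the choice d = 36.8 Å (the equivalent-sphere diameter quoted for lysozyme later in this paper) are illustrative assumptions and are not part of FMAPB2.

```python
import numpy as np

def b2_from_pmf(pmf, r_max, beta, n=200_000):
    """Eq [7]: B2 = -1/2 * integral of 4*pi*R^2 * (exp(-beta*W(R)) - 1) dR."""
    dr = r_max / n
    r = (np.arange(n) + 0.5) * dr            # midpoints of the integration cells
    mayer_f = np.exp(-beta * pmf(r)) - 1.0   # Mayer f-function along R
    return -0.5 * np.sum(4.0 * np.pi * r**2 * mayer_f) * dr

def hard_sphere_pmf(r, d=36.8):
    # Eq [11]: infinite inside the contact distance d, zero outside
    return np.where(r < d, np.inf, 0.0)

d = 36.8                                      # Angstrom (illustrative choice)
numeric = b2_from_pmf(hard_sphere_pmf, r_max=2.0 * d, beta=1.0)
analytic = (2.0 / 3.0) * np.pi * d**3         # half the co-volume, eq [12]
print(numeric, analytic)                      # both in cubic Angstrom
```

Because the Mayer f-function of a hard sphere is −1 inside d and 0 outside, the result does not depend on β; any softer PMF can be passed to the same integrator.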

The DLVO theory11–12 includes additional contributions to W(R) from van der Waals attraction and electrostatic interactions, respectively given by

$$W_{\mathrm{vdW}}(R) = -\frac{A_{\mathrm{H}}}{12}\left(\frac{d^2}{R^2} + \frac{d^2}{R^2 - d^2} + 2\ln\frac{R^2 - d^2}{R^2}\right)$$ [13]

and

$$W_{\mathrm{elec}}(R) = 332\left(\frac{Q}{1 + \kappa d/2}\right)^2 \frac{e^{-\kappa(R - d)}}{\varepsilon R}$$ [14]

In the above equations, AH is the Hamaker constant representing the strength of van der Waals attraction, κ = 2.24(βI/ε)^(1/2) is the inverse of the Debye screening length, ε is the dielectric constant of water, Q is the net charge of the protein, and I is the ionic strength of the solution. Throughout this paper, lengths will be in units of Å, charges in units of electronic charge, ionic strengths in units of M, and energies in units of kcal/mol. The three terms, given by eqs [11], [13], and [14] are assumed to be additive, leading to

W(R)=Wst(R)+WvdW(R)+Welec(R) [15]

The steric and electrostatic terms make B2 positive whereas the van der Waals term makes B2 negative. Note that eq [13] becomes singular at R = d. The usual fix is to increase the range of steric repulsion from d to d + δ, while maintaining the value of d in eqs [13] and [14].
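The DLVO recipe can be sketched in a few lines: the hard core out to d + δ contributes (2/3)π(d + δ)³, and the remainder of eq [7] is integrated numerically with the PMF of eqs [13] and [14]. All parameter values below (Hamaker constant, δ, net charge, dielectric constant, integration range) are hypothetical placeholders chosen only to exercise the formulas; they are not fitted values from this work.

```python
import numpy as np

KB = 0.0019872  # Boltzmann constant in kcal/(mol K)

def dlvo_b2(d, delta, a_h, q, ionic, diel, temp=298.15, r_span=400.0, n=200_000):
    """B2 in cubic Angstrom from eq [7] with the DLVO PMF of eqs [11], [13]-[15]."""
    beta = 1.0 / (KB * temp)
    kappa = 2.24 * np.sqrt(beta * ionic / diel)   # inverse Debye length, 1/Angstrom
    contact = d + delta                            # widened steric range
    steric = (2.0 / 3.0) * np.pi * contact**3      # hard-core part of the integral
    dr = r_span / n
    r = contact + (np.arange(n) + 0.5) * dr        # midpoints beyond the hard core
    w_vdw = -(a_h / 12.0) * (d**2 / r**2 + d**2 / (r**2 - d**2)
                             + 2.0 * np.log((r**2 - d**2) / r**2))
    w_elec = 332.0 * (q / (1.0 + kappa * d / 2.0))**2 * np.exp(-kappa * (r - d)) / (diel * r)
    mayer_f = np.exp(-beta * (w_vdw + w_elec)) - 1.0
    return steric - 0.5 * np.sum(4.0 * np.pi * r**2 * mayer_f) * dr

# Hypothetical inputs: d = 36.8 A, delta = 1 A, I = 0.1 M, dielectric constant 78.3
steric_only = dlvo_b2(36.8, 1.0, a_h=0.0, q=0.0, ionic=0.1, diel=78.3)
with_attraction = dlvo_b2(36.8, 1.0, a_h=3.0, q=0.0, ionic=0.1, diel=78.3)
with_repulsion = dlvo_b2(36.8, 1.0, a_h=0.0, q=10.0, ionic=0.1, diel=78.3)
print(steric_only, with_attraction, with_repulsion)
```

Switching the two non-steric terms on one at a time reproduces the sign rule stated above: the van der Waals term lowers B2 below the steric value and the screened Coulomb term raises it.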

Atomistic Model of Protein-Protein Interaction Energy.

As in our previous work,25–26 we treat protein molecules as rigid and the solvent implicitly. Importantly, we represent the protein molecules at the atomic level, and calculate the interaction energy U(R, Ω) by accumulating contributions from individual pairs of atoms between two protein molecules, each depending on the interatomic distance rij. Again, there are three types of energy terms:

U(R,Ω)=Ust(R,Ω)+Una(R,Ω)+Uelec(R,Ω) [16]

The steric term is

$$U_{\mathrm{st}}(\mathbf{R},\Omega) = \infty \quad \text{if any } r_{ij} < (\sigma_{ii} + \sigma_{jj})/2$$ [17]

where σii/2 denotes the hard-core radius of atom i. The nonpolar attraction term, including van der Waals and hydrophobic contributions, has the form of a Lennard-Jones potential,

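$$U_{\mathrm{n\text{-}a}}(\mathbf{R},\Omega) = v_s\sum_{ij} 4\epsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right]$$ [18]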

where ϵij denotes the magnitude of the attraction of the ij atom pair, σij is the distance at which the nonpolar interaction energy is zero, and vs is a scaling constant, which is the only parameter that we tune here (see Results and Discussion for details). The electrostatic term has the form of a Debye-Hückel potential,

$$U_{\mathrm{elec}}(\mathbf{R},\Omega) = 332\sum_{ij}\frac{q_i q_j}{\varepsilon r_{ij}}\,e^{-\kappa r_{ij}}$$ [19]

where qi is the partial charge on atom i.

To calculate the second virial coefficient, one has to average the Mayer f-function over R and Ω. If the evaluation of U(R, Ω) for a pair of atomistic proteins at a single relative configuration is already expensive, the averaging in the 6-dimensional configurational space (see eq [8]) can be computationally prohibitive. Below we explain how FMAP25–26 makes this calculation fast.
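To make the cost concrete, a direct evaluation of eqs [16]–[19] for one relative configuration can be sketched as a sum over all interprotein atom pairs. The function name and argument layout are hypothetical, and the geometric-mean combination rule is the one this paper adopts for FMAP; this is an illustration of the energy function, not the FMAPB2 code.

```python
import numpy as np

def pair_energy(xyz1, xyz2, sig1, sig2, eps1, eps2, q1, q2, v_s, kappa, diel):
    """U(R, Omega) in kcal/mol for one configuration; arrays are per-atom values."""
    # r_ij between every atom i of molecule 1 and every atom j of molecule 2
    r = np.linalg.norm(xyz1[:, None, :] - xyz2[None, :, :], axis=-1)
    # Eq [17]: any pair closer than the sum of hard-core radii is a steric clash
    if np.any(r < 0.5 * (sig1[:, None] + sig2[None, :])):
        return np.inf
    # Geometric-mean combination rule for sigma_ij and epsilon_ij
    sig_ij = np.sqrt(sig1[:, None] * sig2[None, :])
    eps_ij = np.sqrt(eps1[:, None] * eps2[None, :])
    sr6 = (sig_ij / r) ** 6
    u_na = v_s * np.sum(4.0 * eps_ij * (sr6**2 - sr6))                      # eq [18]
    u_elec = 332.0 * np.sum(q1[:, None] * q2[None, :] * np.exp(-kappa * r) / (diel * r))  # eq [19]
    return u_na + u_elec

# Toy usage with one atom per "molecule" (all values hypothetical)
a = np.array([[0.0, 0.0, 0.0]])
b = np.array([[10.0, 0.0, 0.0]])
sig, eps, q = np.array([4.0]), np.array([0.15]), np.array([1.0])
u = pair_energy(a, b, sig, sig, eps, eps, q, q, v_s=0.16, kappa=0.1, diel=78.3)
print(u)
```

The work per configuration scales with the product of the two atom counts, which is why repeating this over the full translational and rotational space quickly becomes impractical without a grid-based shortcut.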

Basic Idea Behind FMAP.

The speedup by FMAP arises from expressing each unique term of the interaction energy as a correlation function in R space. Let us fix the first protein molecule at the origin, and position the second molecule at R. For now we consider a single relative orientation between the molecules and drop the explicit reference to Ω for notational simplicity. To illustrate how we can turn an interaction term into a correlation function, we note that the electrostatic interaction energy can be expressed as the integral of the product between the electrostatic potential of molecule 1 and the charge density of molecule 2:

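$$U_{\mathrm{elec}}(\mathbf{R}) = \int d\mathbf{s}\,\psi_1(\mathbf{s})\,\rho_2(\mathbf{s}\,|\,\mathbf{R})$$ [20]

where

$$\psi_1(\mathbf{s}) = \sum_i \frac{q_i}{\varepsilon|\mathbf{s} - \mathbf{r}_i|}\,e^{-\kappa|\mathbf{s} - \mathbf{r}_i|}$$ [21]
$$\rho_2(\mathbf{s}\,|\,\mathbf{R}) = \sum_j q_j\,\delta(\mathbf{s} - \mathbf{r}_j)$$ [22]

with δ(s) denoting a delta function. We can recognize the integral in eq [20] as a correlation function in R space, once we allow the position R of molecule 2 to move and realize that the charge density ρ2(s|R) only depends on the position vector s − R relative to molecule 2.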

The same reasoning applies to exp[−βUst(R)] and to each of the two terms of Un−a(R) in eq [18].25–26 In the latter case, we must use the geometric mean, αij = (αiiαjj)^(1/2), as the combination rule for both ϵij and σij. The fact that (σii + σjj)/2 ≥ (σiiσjj)^(1/2) ensures that, for any relative configuration between two protein molecules that is free of interatomic steric clashes, the nonpolar contribution from any atom pair cannot be positive.

The calculation of Uelec(R) by FFT involves the following steps. (1) Introduce a cubic box, centered on either protein molecule; discretize the box into a grid, and evaluate the values of the electrostatic potential ψ1(s) and the charge density ρ2(s|R) on all the grid points. (2) Perform FFT on both of these quantities, and multiply their Fourier transforms. (3) An inverse FFT of the product yields Uelec on all the grid points at once. By a similar procedure, Ust and the two terms of Un−a can be obtained, yielding U(R, Ω) over the entire R space and allowing for the averaging over R for a given Ω. Lastly the averaging over Ω is done by repeating the above process at many choices of Ω.
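The three FFT steps can be illustrated on a toy grid with NumPy. In this sketch `psi` is a random stand-in for the gridded potential of molecule 1 and `rho` holds two made-up point charges for molecule 2; the grid size, the charges, and the function name are assumptions for illustration only. The FFT result at one grid shift is compared with the direct sum it replaces.

```python
import numpy as np

def correlate_fft(psi, rho):
    """U[R] = sum_s psi[s] * rho[s - R] for every periodic grid shift R."""
    product = np.fft.fftn(psi) * np.conj(np.fft.fftn(rho))   # step 2: multiply transforms
    return np.fft.ifftn(product).real                         # step 3: inverse FFT

rng = np.random.default_rng(0)
n = 16
psi = rng.normal(size=(n, n, n))   # stand-in for psi_1(s) on the grid (step 1)
rho = np.zeros((n, n, n))          # charge density of molecule 2 at its reference position
rho[1, 2, 3] = 0.5
rho[2, 2, 1] = -0.3

u_all = correlate_fft(psi, rho)    # energies for all n**3 translations at once

# Direct evaluation for a single translation, for comparison
shift = (3, 5, 7)
u_direct = np.sum(psi * np.roll(rho, shift, axis=(0, 1, 2)))
print(u_all[shift], u_direct)
```

The correlation is periodic: a shift that pushes charges past one face of the box wraps them around to the opposite face, which is the reason the box must be large enough to keep periodic images beyond the interaction cutoff.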

COMPUTATIONAL DETAILS

Preparation of Protein Structures and Charges.

The second virial coefficients of five proteins at different solvent conditions were calculated (Fig. 1). Atomic coordinates of the proteins were obtained from crystal structures in the Protein Data Bank (PDB), including entries 1BPI for BPTI, 1AKI for hen egg white lysozyme, 1HK0 for human γD crystallin, 2CGA for bovine chymotrypsinogen A, and 3V03 for BSA. To investigate possible effects of protein structures on B2 values, we also did calculations for lysozyme using six other entries: 1LKS, 1V7T, 2D4I, 2D4J, 2FBB, and 2Z18.

Figure 1.


Structures of the five proteins studied. Molecular weight and net charge at the indicated pH are shown.

The PDB files were converted to PQR files, which contain coordinates of atoms along with their charges and radii, by the PDB2PQR program (version 2.1.1).36 We chose charges in the PARSE set,37 and, by default, used pKas predicted by PROPKA (version 3.0)38 to set discrete protonation states (either protonated or unprotonated) of ionizable side chains and main-chain termini. For lysozyme, we instead used experimental pKas determined at 25 °C and 0.1 M ionic strength.39 The net charge of lysozyme as a function of pH, determined based on discrete protonation states, compares favorably with that based on fractional protonation according to the Henderson–Hasselbalch equation (Fig. S1).

FMAPB2 Implementation.

FMAP was originally developed to calculate free energies for transferring proteins from dilute solutions to crowded solutions (containing concentrated macromolecules).25–26 Subsequently it was adapted to calculate chemical potentials of proteins for determining liquid-liquid phase equilibria.27 Here we tailor the method once again to calculate second virial coefficients. The following aspects are adjusted:

  1. Number of protein molecules in the solution box. Previously FMAP dealt with the interaction of a protein molecule with a cubic box of other protein molecules representing the solution. Here in FMAPB2, the interest is the interaction between a probe protein molecule and a partner protein molecule. The solution box therefore contains only the latter single protein molecule. This difference is trivial, and the only impact is a slight reduction in computational cost when evaluating the electrostatic potential ψ1(s) (see eq [21]) and its counterparts for the other terms of U(R, Ω) at all the grid points before FFT.

  2. Interatomic cutoff distance and size of cubic box. Previously in calculating chemical potentials by FMAP we chose an interatomic cutoff distance of 12 Å when evaluating ψ1(s) (and counterparts) at the grid points. For B2, which is an integral over the interprotein distance (see eq [7]), we increased the cutoff distance to 36 Å (to be denoted by rcut) to minimize the effect of the cutoff on W(R) and hence on B2. Also, the use of FFT transforms the solution box into an infinite periodic system, which is actually appropriate when the purpose is to calculate chemical potentials for macroscopic (hence effectively infinite) solutions. However, B2 involves the interaction between only two molecules, and therefore we must ensure that the probe molecule interacts with a single partner molecule in the solution box; i.e., all periodic images must be beyond the interatomic cutoff distance. To ensure satisfaction of the latter condition, we set the side length of the cubic box at 2(dmax + σmax + rcut), where dmax/2 is the largest distance of any atom from the center of the protein and σmax/2 is the largest atomic radius. The side lengths for the five proteins were 163.2, 186.0, 195.6, 190.8, and 274.8 Å.

  3. Number of rotations of the probe molecule. The average over the orientation of the probe molecule needs to be done by repeating FMAP calculations with the probe molecule in different orientations. Previously in calculating chemical potentials, a given probe molecule was placed in different regions of the solution box; the need for orientational averaging was modest since different regions of the solution box effectively sense different relative orientations of the same probe molecule. So at most a few hundred orientations of the probe molecule, randomly generated, were used.28 Here the solution box contains a single partner molecule; in each FMAPB2 calculation it senses only one relative orientation of the probe molecule, and so the need for orientational averaging is greater. To meet this need, we uniformly sample Ω by “successive orthogonal images” (rotations.mitchell-lab.org),40 which are rotation matrices separated by a fixed angle α. The number of rotations required to cover the entire Ω space is approximately 8π²/α³ (with α in radians), i.e., the volume 8π² of the Ω space divided by the volume α³ covered by each rotation. For example, this number is 4392 for α = π/12 (or 15°). To assess the number of rotations necessary to provide sufficient coverage of the Ω space as to produce a converged B2 value, we compare B2 values calculated using increasingly larger subsets of randomly selected rotations from the 4392. As illustrated in Fig. S2 for lysozyme at various ionic strengths, even with one quarter of the 4392 rotations, B2 values are within 99% of those calculated using all the 4392 rotations. Errors, estimated using a Python code (https://github.com/manoharan-lab/flyvbjerg-std-err/) that implements Flyvbjerg and Petersen’s block decorrelation technique,41 progressively decrease with increasing number of rotations, down to less than 2% of B2 values at 4392 rotations. All these findings indicate that 4392 rotations are more than sufficient for orientational averaging and we used this many rotations for all subsequent B2 calculations.
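The size of the rotation set can be sanity-checked with simple arithmetic. The sketch below assumes that the count is the volume of rotation space, ∫ dΩ = 8π² as stated with eq [8], divided by a cell of volume α³ per rotation; the function name is hypothetical and the estimate is only approximate for a real rotation set.

```python
import math

def rotations_needed(alpha):
    """Rough count: volume of rotation space (8*pi^2) over a cell of volume alpha^3."""
    return 8.0 * math.pi**2 / alpha**3

estimate = rotations_needed(math.pi / 12.0)   # alpha = 15 degrees, in radians
print(round(estimate))                         # roughly 4400
```

The estimate lands within a fraction of a percent of the 4392 rotations in the set used here, and it shows how steeply the cost grows as α shrinks: halving α multiplies the number of rotations by eight.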

For determining B2 by integrating the Mayer f-function of interaction energy U(R, Ω) (eq [8]), we sum over all the grid points in the cubic box and over all the probe molecule orientations. We also obtain the PMF W(R), by collecting Boltzmann factors of U(R, Ω), from all the probe molecule orientations, at the grid points within 0.6-Å bins in R (eq [9]), for R up to half of the box side length. Other details, including the choice of the grid spacing (0.6 Å) for discretizing the cubic box and the attendant modifications of atomic radii and charges to cancel any numerical errors from space discretization, can be found in the original presentation of FMAP.26
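The grid sum and the radial binning described above can be sketched as follows. The array layout (one energy grid per orientation, with the partner molecule at grid index 0 and periodic wrap-around), the function name, and the hard-sphere toy input are assumptions made for illustration; the actual FMAPB2 bookkeeping may differ.

```python
import numpy as np

def b2_and_pmf(u_grids, spacing, beta, bin_width):
    """u_grids has shape (n_rot, n, n, n): U(R, Omega) on a periodic cubic grid."""
    n = u_grids.shape[1]
    boltz = np.exp(-beta * u_grids).mean(axis=0)       # average over orientations
    b2 = -0.5 * np.sum(boltz - 1.0) * spacing**3       # eq [8] as a sum over grid points
    # Minimum-image distance of every grid point from the partner at the origin
    idx = np.fft.fftfreq(n, d=1.0 / n)                 # offsets 0, 1, ..., -2, -1
    x, y, z = np.meshgrid(idx, idx, idx, indexing="ij")
    r = spacing * np.sqrt(x**2 + y**2 + z**2)
    keep = r < 0.5 * n * spacing                       # R up to half the box side
    bins = (r[keep] / bin_width).astype(int)
    sums = np.bincount(bins, weights=boltz[keep])
    counts = np.bincount(bins)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmf = -np.log(sums / counts) / beta            # eq [9]; inf where all points clash
    return b2, pmf

# Toy check: a single "orientation" of a hard sphere of diameter 8 grid units
n, h, d = 32, 1.0, 8.0
idx = np.fft.fftfreq(n, d=1.0 / n)
x, y, z = np.meshgrid(idx, idx, idx, indexing="ij")
u_toy = np.where(h * np.sqrt(x**2 + y**2 + z**2) < d, np.inf, 0.0)[None]
b2, pmf = b2_and_pmf(u_toy, spacing=h, beta=1.0, bin_width=h)
print(b2, (2.0 / 3.0) * np.pi * d**3)
```

On this coarse toy grid the summed B2 falls a few percent short of (2/3)πd³ purely because of space discretization, the kind of numerical error that the adjustments of atomic radii and charges mentioned above are meant to cancel.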

The expressions for B2 in the preceding section have units of volume. Below, we use volume (Å³, to be precise) as the unit only for reporting the steric component B20. In all other cases, we use 10⁻⁴ mol · ml · g⁻² as the unit, for easy comparison with experimental measurements of B2. The conversion factor from the former to the latter unit is 10⁻²⁶NA/M², where NA is Avogadro’s number and M is protein molecular weight in kDa. In some literature a new symbol, A2, is used after the unit conversion, but here we stick with B2.
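The unit conversion is a one-line function. As a worked example, the sketch below converts the steric B2 of an equivalent sphere with the 36.8 Å diameter quoted for lysozyme later in the paper; the molecular weight of 14.3 kDa is an assumed round value, and the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol

def b2_to_experimental_units(b2_cubic_angstrom, mass_kda):
    """Convert B2 from cubic Angstrom to units of 1e-4 mol ml g^-2."""
    return b2_cubic_angstrom * 1e-26 * AVOGADRO / mass_kda**2

d = 36.8                                    # equivalent-sphere diameter, Angstrom
b20_volume = (2.0 / 3.0) * math.pi * d**3   # steric B2 of the equivalent sphere, eq [12]
b20_exp = b2_to_experimental_units(b20_volume, mass_kda=14.3)  # assumed molecular weight
print(b20_volume, b20_exp)
```

With these inputs the converted value comes out close to the steric component of about 3.08 × 10⁻⁴ mol · ml · g⁻² reported for lysozyme in the salt-dependence results.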

RESULTS AND DISCUSSION

Below we present FMAPB2 results for five proteins: BPTI, lysozyme, γD crystallin, chymotrypsinogen A, and BSA (Fig. 1). These proteins span small, medium, and large sizes. Experimental B2 data are available for them over wide ranges of solvent conditions (salt concentration, pH, and temperature), with lysozyme richest in such data.1, 7, 9–10, 23, 29–30 Our calculations, using an atomistic interaction energy function, on these wide spans of proteins and solvent conditions allow an unprecedented in-depth probe into the physical determinants of the second virial coefficient. The experimental data also afforded an opportunity to parameterize the interaction energy function.

Parameterization of the Interaction Energy Function.

The atomistic interaction energy U(R, Ω) is given by eqs [16]–[19]. For the steric and nonpolar attraction terms, we used parameters for the van der Waals term in Autodock4.42 These were modified from the Amber force field,43 with a reduced set of parameters, based on element types instead of bonding types (e.g., the same σii and ϵii parameters for carbon regardless of its bonding types). In addition, we applied a scaling constant, vs, to convert an energy term (eq [18]) originally meant for interatomic van der Waals interactions in gas phase into one modeling nonpolar attraction in solution. Similar scaling was done in previous work.14, 42

We tuned vs for FMAPB2. In short, the steric, nonpolar (with vs set to 1), and electrostatic terms of U(R, Ω) were calculated via FFT as described in the preceding section. The nonpolar term was then multiplied by various scaling values before adding to the other two terms, and the Mayer f-function of the resulting U(R, Ω) was finally summed over the grid points and the probe molecule rotations (eq [8]) to yield B2 values. Including the latter vs-tuning step incurred minimal computational cost. Such results for lysozyme at 25 °C, pH 4.5, and ionic strengths from 0.05 to 0.9 M, along with the experimental data,1, 7, 9–10, 29 are shown in Fig. S3. With vs = 0.16, excellent agreement with experiment is achieved. Note that a decrease in vs from 0.16 leads to a gradual increase in B2 whereas a change in vs in the opposite direction leads to a rapid decrease in B2, and the effects on B2 are amplified at higher ionic strengths. For BSA, best agreement with experimental data is obtained at vs = 0.12; the results calculated with vs = 0.16 are much too negative. Inspections of these and corresponding results for the other three proteins suggested that the optimal vs depends on protein size. We found that the following dependence on protein molecular weight (M, in kDa)

$$v_s = \frac{0.18\,(0.27M + 80)}{M + 80}$$ [23]

captures well the optimal vs for the five proteins.
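A small sketch of eq [23], read as the ratio 0.18(0.27M + 80)/(M + 80), which is the reading that reproduces the optimal values of 0.16 for lysozyme and 0.12 for BSA quoted above. The molecular weights used below (14.3 and 66.4 kDa) are assumed round values and the function name is hypothetical.

```python
def nonpolar_scale(mass_kda):
    """Scaling constant v_s of eq [23] as a function of molecular weight in kDa."""
    return 0.18 * (0.27 * mass_kda + 80.0) / (mass_kda + 80.0)

vs_lysozyme = nonpolar_scale(14.3)   # assumed molecular weight of lysozyme
vs_bsa = nonpolar_scale(66.4)        # assumed molecular weight of BSA
print(round(vs_lysozyme, 3), round(vs_bsa, 3))
```

The expression decreases monotonically from 0.18 for a vanishingly small protein toward 0.18 × 0.27 ≈ 0.049 for a very large one, so the nonpolar term is tempered progressively as protein size grows.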

We can rationalize the decrease in vs with increasing protein size as follows. The nonpolar attraction is empirically known to be largely determined by protein surface area, as opposed to protein volume. Specifically, the hydrophobic contribution to the interaction energy between two protein molecules is often modeled as proportional to the buried surface area when the molecules come into contact.44 However, the nonpolar component of B2, calculated with the electrostatic term in our energy function turned off, scales with a power of protein volume (as represented by B20) greater than unity (Fig. S4, blue line), and therefore tends to grow too rapidly with protein size. Eq [23], by tempering vs for larger proteins, rectifies this bias. The resulting nonpolar component now scales approximately with the two thirds power of protein volume, i.e., linearly with protein surface area (Fig. S4, orange line).

We now turn to the electrostatic term. For the dielectric constant, we used that of water without further modification. The dielectric constant of water depends on temperature, and the dependence can be approximated by the following empirical relation45

$$\varepsilon = 87.74 - 0.40008\,t + 9.398\times10^{-4}\,t^2 - 1.41\times10^{-6}\,t^3$$ [24]

where t is temperature in °C. In previous FMAP calculations25–28 we used Amber partial charges, but because Amber did not have neutral main-chain termini (i.e., deprotonated N-terminus and protonated C-terminus) in its library, here we opted for PARSE charges37 to enable FMAPB2 to work for arbitrary pH values. In the pH range where the main-chain termini of a protein are charged, the calculated B2 values according to PARSE and Amber charges are very similar (Fig. S5A, for lysozyme at pH 4.5 and 25 °C).
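For orientation, eq [24] and the Debye screening parameter κ = 2.24(βI/ε)^(1/2) defined earlier can be evaluated together. The sketch below is only a numerical illustration; the function names and the value used for the Boltzmann constant in kcal/(mol K) are assumptions.

```python
import math

KB = 0.0019872  # Boltzmann constant in kcal/(mol K)

def water_dielectric(t_celsius):
    """Empirical dielectric constant of water, eq [24]."""
    t = t_celsius
    return 87.74 - 0.40008 * t + 9.398e-4 * t**2 - 1.41e-6 * t**3

def debye_kappa(ionic_molar, t_celsius):
    """Inverse Debye screening length in 1/Angstrom."""
    beta = 1.0 / (KB * (t_celsius + 273.15))
    return 2.24 * math.sqrt(beta * ionic_molar / water_dielectric(t_celsius))

eps_25 = water_dielectric(25.0)
debye_length = 1.0 / debye_kappa(0.1, 25.0)   # screening length at I = 0.1 M, 25 C
print(round(eps_25, 2), round(debye_length, 2))
```

At 25 °C this gives a dielectric constant near 78.3 and a screening length near 9.6 Å at 0.1 M ionic strength; raising the ionic strength shortens the screening length as 1/√I.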

Steric Component of B2.

When the net effect of Un−a and Uelec is absent (e.g., due to the exact cancelation of the two terms or by artificially turning them off), we are left with B20, the steric component of B2. This situation is effectively reached in the high-temperature limit. In FMAPB2, B20 is calculated by counting the grid points at which the two protein molecules have steric clash. As expected, B20 increases with increasing molecular weight (Fig. 2, x axis). The value of B20 is half the covolume of the pair of protein molecules (eq [12]). The corresponding diameters of the equivalent spheres are 29.3, 36.8, 41.9, 45.5, and 69.4 Å, respectively, for the five proteins studied here.

Figure 2.


The steric component B20 is accurately predicted by the covolume from GFMT.

We have developed a generalized fundamental measure theory (GFMT),46 which calculates the volume (Vp), surface area (Sp), and linear size (Lp) of an atomistic protein when probed by a spherical particle, and uses these geometric properties to predict the steric component of the excess chemical potential of the protein inside a concentrated solution of the spherical particle. This type of theory47 also predicts the covolume between the protein and the spherical probe as

Vco=2(Vp+LpSp) [25]

Vco predicted by eq [25] depends on the probe radius (rp), and has a minimum for each protein. The minima come about because Vp and Sp change in opposite directions as rp increases, and occur at rp values that drift upward, from 3 Å for BPTI to 7 Å for BSA, as the protein size increases. The GFMT minimal Vco yields B20 that is in excellent agreement with the value calculated by FMAPB2 (Fig. 2). This quantitative account of B20 by GFMT Vco underscores the nonspherical shape of the proteins.

High Anisotropy of U(R, Ω).

In contrast to spherical models like DLVO, the interaction energy U(R, Ω) calculated by FMAPB2 for our atomistic model is highly anisotropic (Fig. 3). Figure 3A displays 300 poses (gray spheres) with the lowest interaction energies for a lysozyme pair. These poses fall into at least five clusters (with representatives shown as large gray spheres). The energy map on a slice through space (Fig. 3A inset) shows a highly localized minimum (indicated by a green circle, corresponding to one of the clusters) next to the region of steric clash (gray area). Notably, most of the crystal contacts in four different space groups (colored spheres) are close to the lowest-energy poses and fall into the clusters of the poses, as observed in previous studies.14, 18 To present a global view of the highly anisotropic interaction energy, we collected 15524 lowest-energy poses from the 4392 rotations of the probe molecule and calculated an energy map over different directions of the position vector R. The poses were divided into 30 × 30 bins in longitude and latitude, and in each bin the energies of the poses were Boltzmann-averaged (effectively yielding a PMF in ϕ and θ). The results are displayed as a Hammer projection in Fig. 3B, with the aforementioned five clusters showing as low-energy regions. The structures of the five representative poses show that they differ not only in the direction by which the probe molecule comes into contact with the partner molecule but also in the orientation of the probe molecule.

Figure 3.

Anisotropy of the interaction energy U(R, Ω), calculated for lysozyme at pH 4.5, I = 0.2 M, and 25 °C. (A) 300 lowest-energy poses, collected from the 4392 rotations of the probe molecule and shown as small gray spheres at the centers of the probe molecule, around a target molecule (cartoon representation). Five larger gray spheres show cluster representatives. Colored spheres show crystal contacts in four space groups: red, P4₃2₁2; green, P2₁2₁2₁; yellow, P1; and magenta, P6₁22. Inset: energy map on an xy plane, calculated for a single rotation of the probe molecule. The spectrum of energies from negative to zero is spanned by colors from dark red to yellow to white. (B) Energy map on a Hammer projection. Locations of cluster representatives are indicated by numbers denoting their rankings in a collection of 15524 lowest-energy poses. Structures of the representative poses are shown with the target molecule as surface and probe molecule as trace.

Salt Dependence.

Adding salts is a simple way to perturb B2, by screening electrostatic interactions between two protein molecules. At pH values away from the isoelectric point, the net charge on a protein can be high. The resulting electrostatic repulsion is significant at low salt but becomes muted at high salt. This qualitative behavior is well illustrated by the extensive data1, 7, 9–10, 29 for the salt dependence of lysozyme B2 at pH 4.5 (Fig. 4A, symbols), showing a substantial decrease in B2 with increasing ionic strength, from 3.92 × 10⁻⁴ mol · ml · g⁻² at I = 0.078 M to approximately −5.0 × 10⁻⁴ mol · ml · g⁻² at I = 0.85 M.

Figure 4.

Salt dependence of lysozyme B2 at pH 4.5 and 25 °C. (A) Comparison of experimental and FMAPB2 results. Here as well as in subsequent figures, a horizontal dashed line indicates B20. (B) PMF over R. Results are shown for the total PMF and the nonpolar and electrostatic PMFs at I = 0.1 and 0.5 M.

As already noted, B2 values calculated for lysozyme at pH 4.5 are in excellent agreement with the experimental data over the entire ionic strength range (Fig. 4A, solid curve). At this pH, lysozyme has a net charge of +10e (Fig. 1). At I = 0.05 M, the electrostatic repulsion has a dominant role, and B2 reaches (4.868 ± 0.004) × 10⁻⁴ mol · ml · g⁻², which is nearly 60% larger than the steric component, (3.076 ± 0.002) × 10⁻⁴ mol · ml · g⁻². At approximately I = 0.08 M, the electrostatic repulsion has weakened to such an extent that its effect cancels that of the nonpolar attraction, making B2 equal to B20. At higher ionic strengths, the electrostatic repulsion is further weakened and the nonpolar attraction becomes dominant, leading to a very negative B2, of (−4.07 ± 0.02) × 10⁻⁴ mol · ml · g⁻² at I = 0.9 M.
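For orientation, the steric component quoted above can be restated as the diameter of a hard sphere with the same B2. The sketch below assumes a lysozyme molar mass of about 14.3 kg/mol (a value not stated in this section) and uses the hard-sphere result B2 = (2π/3)d³. The function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def equivalent_sphere_diameter(b2_steric, molar_mass):
    """Diameter (in Å) of the hard sphere whose B2 equals the steric component.

    b2_steric:  steric B2 in mol·ml/g² (the units used in the text).
    molar_mass: protein molar mass in g/mol (assumed input).
    """
    b2_cm3 = b2_steric * molar_mass**2 / AVOGADRO  # volume units, cm³
    b2_A3 = b2_cm3 * 1e24                          # 1 cm³ = 1e24 Å³
    # Hard spheres of diameter d have B2 = (2π/3) d³.
    return (3.0 * b2_A3 / (2.0 * math.pi)) ** (1.0 / 3.0)

# Steric component of lysozyme from the text; molar mass is an assumption.
d = equivalent_sphere_diameter(3.076e-4, 14300.0)
print(f"equivalent-sphere diameter: {d:.1f} Å")
```

With these inputs the result is close to 36.8 Å, the equivalent-sphere diameter quoted in the discussion of the PMF.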

To further dissect the electrostatic and nonpolar contributions of lysozyme, we compare the PMF over R with those calculated with the electrostatic and nonpolar terms in the energy function turned off one at a time (Fig. 4B). For convenience, we refer to the latter as the nonpolar and electrostatic PMFs. The steric core extends to approximately R = 25 Å. Starting at 28.5 Å and all the way to 50 Å, there is significant nonpolar attraction, as indicated by negative values of the nonpolar PMF (blue curve on either the left or right half). The fact that the nonpolar PMF comes into play at a much shorter interprotein distance of 28.5 Å than the 36.8-Å diameter of the equivalent sphere based on B20 reflects the nonspherical shape of lysozyme. The nonpolar attraction is countered by electrostatic repulsion. At I = 0.1 M, the latter (red curve in the left half) is strong enough such that, when both of these interaction terms are present, their effects in the R range of 30 to 50 Å nearly cancel, with |W| ≤ 0.5kBT (black curve in the left half). However, note that the contributions of the nonpolar attraction and electrostatic repulsion actually are not additive. In particular, at R = 33 Å, W has a minimum value of −0.3kBT, even though the nonpolar and electrostatic PMFs are −1.4kBT and 2.5kBT, respectively, and would thus predict W = 1.1kBT according to additivity. Such nonadditivity has been commented on previously,28 and is due to correlation between nonpolar and electrostatic interactions (among different configurations at a fixed R in the present case). Note also that both the nonpolar and electrostatic PMFs almost die out at R = 60 Å, where many interprotein atom pairs are still within the cutoff distance of 36 Å, indicating that this choice of cutoff distance is appropriate.
At I = 0.5 M, heightened screening lowers the electrostatic PMF, by amounts ranging from 1.3kBT to 0.1kBT over the R range of 30 to 50 Å (red solid curve compared to red dashed curve in the right half). Although that still leaves the electrostatic PMF significant (e.g., with value 3.9kBT at R = 30 Å), the total PMF is now very close to the nonpolar PMF (black and blue curves in the right half), once again exposing nonadditivity.
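The nonadditivity noted above follows directly from how a PMF is defined: it is the negative logarithm of an average Boltzmann factor, and the logarithm of an average does not split into a sum when the two energy terms are correlated across configurations. The toy example below uses made-up energies (in units of kBT) for four configurations at one fixed R. It is an illustration of the principle only, not data from FMAPB2.

```python
import numpy as np

def pmf(energies):
    """W = -ln <exp(-U)> over configurations at fixed R; energies in kBT."""
    return -np.log(np.mean(np.exp(-np.asarray(energies, dtype=float))))

# Hypothetical per-configuration energies (kBT). The most attractive
# nonpolar configuration also has the weakest electrostatic repulsion,
# i.e. the two terms are correlated.
u_nonpolar = np.array([-4.0, -1.0, 0.0, 0.0])
u_electro = np.array([0.5, 3.0, 2.0, 4.0])

w_nonpolar = pmf(u_nonpolar)
w_electro = pmf(u_electro)
w_total = pmf(u_nonpolar + u_electro)

# If the two terms were statistically independent (every nonpolar value
# paired with every electrostatic value), additivity would hold exactly.
w_independent = pmf((u_nonpolar[:, None] + u_electro[None, :]).ravel())

print(f"W_nonpolar + W_electro = {w_nonpolar + w_electro:.2f} kBT")
print(f"W_total (correlated)   = {w_total:.2f} kBT")
print(f"W_total (independent)  = {w_independent:.2f} kBT")
```

In this toy case the total PMF is more negative than the additive estimate, the same direction as the lysozyme result at R = 33 Å.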

We also studied the salt dependences of B2 for the other four proteins. As shown in Fig. 5, our calculated B2 values show reasonable agreement with the experimental data overall.7, 9, 32–34 BPTI at pH 4.9 (net charge +6e), γD crystallin at pH 5.5 (net charge +5e), and chymotrypsinogen A at pH 3 (net charge +18e) all show the expected decrease in B2 with increasing ionic strength. FMAPB2 underestimated this dependence for BPTI, with a noticeable underestimation of B2 at low ionic strength. Among these three proteins, γD crystallin has the weakest salt dependence, which can be mainly attributed to its smallest net charge (and the larger size relative to BPTI). BSA shows relatively weak salt dependence over the entire pH range from 3.5 to 8. At the ends of this pH range, BSA carries significant net charges (+95e and −19e, respectively), and B2 exhibits the familiar decrease with increasing ionic strength. However, near the isoelectric point (pI = 5.25), the salt dependence reverses the trend. Here the electrostatic repulsion due to the net charge disappears, and attraction between electric dipoles comes into play.9, 31 As salt screens this electrostatic attraction, B2 increases with increasing ionic strength. This “reverse” salt dependence cannot be predicted by a spherical model like DLVO, although FMAPB2 also overestimated the magnitude of this reversal. Note that the minimum B2 of BSA occurs at pH 4.7, not precisely at the pI, further emphasizing the importance of detailed charge distributions in protein-protein electrostatic interactions.

Figure 5.

Salt dependences of B2 for four proteins. (A) BPTI at pH 4.9 and 20 °C. (B) γD crystallin at pH 5.5 and 27 °C. (C) Chymotrypsinogen A at pH 3 and 25 °C. (D) BSA at 25 °C.

Taking advantage of the many crystal structures of lysozyme in the PDB, we tested how sensitive the calculated B2 is to variation in protein structure. In Fig. S5B, we compare the B2 values calculated on PDB entry 1AKI with those calculated on six other PDB entries, at pH 4.5 and 25 °C and in the ionic strength range of 0.05 to 0.9 M. The spread in B2 among the seven structures increases modestly with increasing ionic strength, suggesting a higher sensitivity of nonpolar attraction than electrostatic repulsion to structural variation. Still, even at I = 0.9 M, the standard deviation of B2 among the seven structures is only 7.7% of the mean value. This deviation is much less than discrepancies in B2 values obtained using different techniques (e.g., SLS vs. SANS) or by different researchers. We thus conclude that crystal structures of globular proteins can be used for B2 calculations. Note that, for proteins with flexible regions, we can account for the flexibility by representing the flexible regions by an ensemble of structures and taking an ensemble average of calculated B2 values.

The success of FMAPB2 noted above also highlights the limitation of simple theoretical models such as DLVO. In Fig. S6A, we compare the DLVO predictions for lysozyme B2 at pH 4.5 and 25 °C with experimental data over a range of ionic strengths. Although the chosen parameters (R = 36 Å; δ = 1.8 Å; AH = 5.01 kcal/mol; and Q = +10e)48 allow for good matching with experimental data at high ionic strength (I > 0.2 M), DLVO significantly overestimates B2 at low ionic strength. This overestimation arises from modeling the atomic charges as a uniform distribution of the net charge on the protein surface, which exaggerates the electrostatic repulsion,30 as illustrated by a comparison between our electrostatic PMF and the DLVO counterpart (Fig. S6B, solid and dashed red curves). Our nonpolar PMF becomes discernible at R = 28.5 Å and peaks at 32.1 Å, but the DLVO counterpart has a much narrower range (starting and peaking at R = 37.8 Å, close to the diameter of the equivalent sphere based on our B20) and, to compensate, a much higher peak (solid and dashed blue curves). A final difference is that, whereas the van der Waals and electrostatic terms of the PMF over R are additive in DLVO (eq [15]), their counterparts in FMAPB2 are not.

pH Dependence.

The salt effects at different pH values on the B2 of BSA (Fig. 5D) have already revealed that pH, by modifying the charges on a protein, can significantly perturb B2. In Fig. 6 we display the pH dependences of B2 values calculated by FMAPB2 for lysozyme and chymotrypsinogen A. Overall the FMAPB2 results again show reasonable agreement with experimental data,7, 9, 23, 30–31 although for lysozyme the agreement is perhaps somewhat worse than that shown in Fig. 4A for salt dependence at pH 4.5. Our agreement with the data of Velev et al.9 displayed in Fig. 6 is better than that by Elcock and McCammon,15 which we attribute in part to the extensive sampling afforded by FMAPB2. Here again different experimental techniques and different researchers reported widely varying data for nominally the same solvent conditions. In particular, for lysozyme at I = 0.1 M, using SLS, Velev et al.9 found a sharp decline in B2 around pH 7, but Liu et al.’s data30 at pH 7 and Prytkova’s data23 at pH 6.9 suggest a less sharp decline (Fig. 6A). Our results at I = 0.1 M fall in the middle of these three sets of experimental data. In explaining their discrepancy from experimental data on pH dependence at I = 0.1 M, Elcock and McCammon15 suggested a change in protonation states upon weak protein binding as a possible reason.

Figure 6.

pH dependences of B2 for two proteins. (A) Lysozyme at 25 °C. (B) Chymotrypsinogen A at 25 °C.

For both proteins, the calculated B2 decreases with increasing pH, due to decreasing net charge and hence reduced electrostatic repulsion. The pH dependences become weaker at increasing ionic strength, due to salt screening of electrostatic repulsion. For chymotrypsinogen A, as pH approaches the pI (10.15), we again see the start of a reversal of salt dependence observed above on BSA, i.e., B2 near pH 7 becomes higher at higher ionic strength (Fig. 6B), in agreement with the experimental data.9, 31

Temperature Dependence.

The temperature dependence of B2 was measured by Bonneté et al.10 for lysozyme at pH 4.5. This effect is reproduced quite well by FMAPB2 calculations (Fig. 7). To understand how temperature affects B2, we first note that it is an integral of the Mayer f-function, and the latter contains kBT as part of the Boltzmann factor of the interaction energy. For the moment, let us assume that the interaction energy itself does not depend on temperature. At infinite temperature, the Boltzmann factor goes to 1 and the f-function goes to 0, except when a steric term is present (see eq [8]). Therefore B2 approaches the steric component B20 at high temperature. In contrast, at low temperatures, small regions in the configurational space where the interaction energy is especially negative (see Fig. 3), corresponding to a large positive f-function, make disproportionate contributions to B2, pushing it toward negative values (reminiscent of observations by Neal et al.14). Note that the decrease in B2 with decreasing temperature becomes steeper at higher ionic strength, because then the interaction energy and hence B2 at low temperatures are more negative.
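The limiting behavior described above can be checked on the simplest possible model. The sketch below is not the FMAPB2 energy function: it uses an isotropic square-well potential with a temperature-independent well depth, and every parameter (core diameter, well width, well depth) is a hypothetical value chosen only for illustration. B2 is obtained by numerically integrating the Mayer f-function, B2 = −2π ∫ f(r) r² dr.

```python
import numpy as np

def mayer_f_square_well(r, sigma, lam, eps_over_kT):
    """Mayer f-function for a hard core (r < sigma) plus an attractive
    well of depth eps for sigma <= r < lam * sigma."""
    f = np.zeros_like(r)
    f[r < sigma] = -1.0                        # steric term: exp(-inf) - 1
    well = (r >= sigma) & (r < lam * sigma)
    f[well] = np.exp(eps_over_kT) - 1.0        # attraction: exp(+eps/kT) - 1
    return f

def b2_square_well(sigma, lam, eps_over_kT, n=200_000):
    """B2 = -2*pi * integral of f(r) r^2 dr, by the midpoint rule."""
    r_max = lam * sigma                        # f vanishes beyond the well
    dr = r_max / n
    r = (np.arange(n) + 0.5) * dr
    f = mayer_f_square_well(r, sigma, lam, eps_over_kT)
    return -2.0 * np.pi * np.sum(f * r**2) * dr

SIGMA = 36.8          # Å, hypothetical core diameter
LAMBDA = 1.2          # hypothetical well range, in units of sigma
EPS_OVER_KB = 600.0   # K, hypothetical well depth divided by kB

b2_steric = b2_square_well(SIGMA, LAMBDA, 0.0)
for temperature in (280.0, 300.0, 350.0, 1.0e6):
    b2 = b2_square_well(SIGMA, LAMBDA, EPS_OVER_KB / temperature)
    print(f"T = {temperature:9.0f} K   B2 / B2_steric = {b2 / b2_steric:7.3f}")
```

The ratio tends to 1 as the temperature grows and turns negative at low temperature, mirroring the argument in the text. In an atomistic model the strongly attractive regions are small and anisotropic, so their disproportionate weight at low temperature is expected to be more pronounced than in this spherical sketch.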

Figure 7.

Temperature dependence of lysozyme B2 at pH 4.5. The intersection of the horizontal solid line and a B2 curve predicts the critical temperature for liquid-liquid phase separation (vertical arrow). The numbers in the legend indicate ionic strengths (in M).

The temperature dependence of our B2 can be elaborated from a different angle. The foregoing reasoning also leads to the conclusion that our PMF over R is temperature-dependent, i.e., more negative at low temperatures than at high temperatures (see Fig. S7). This is in contrast to the PMFs of spherical models like DLVO, which typically are assumed to be (essentially) temperature-independent. Although even a temperature-independent W(R) yields a modest temperature dependence for B2, our temperature-dependent W(R) amplifies the temperature dependence of B2. While the temperature dependence of our B2 is in line with experimental data, that predicted by DLVO is too weak.

The weak temperature dependence of B2 predicted by another spherical model, with a square-well interaction energy, was noted by Platten et al.49 To compensate, they made the well depth temperature-dependent (stronger attraction at low temperatures). This modification in turn was borrowed from Lomakin et al.,50 who introduced the temperature-dependent well depth in order to broaden the binodal of liquid-liquid phase separation, which otherwise is too narrow compared to experimental observations. Our atomistic interaction energy function produces a temperature-dependent PMF over R, which enables the correct prediction of the slope of the increase in B2 with respect to T (and possibly the broadness of binodals).

Now the electrostatic term in our interaction energy function has direct temperature dependence, through T and the temperature-dependent ε (eq [24]) that enter κ (inverse of screening length) and through ε that scales the magnitude of the electrostatic term (eq [19]). However, the temperature dependence of the electrostatic term contributes little to the temperature dependence of our B2 shown in Fig. 7, because it is largely determined by negative values of the interaction energy, which occur in regions where the nonpolar attraction is strong while the electrostatic interactions are relatively weak.

There is one outlier to an otherwise excellent agreement between the FMAPB2 results and experimental data in Fig. 7: at 10 °C and I = 0.52 M, FMAPB2 predicts B2 = (−5.05 ± 0.03) × 10⁻⁴ mol · ml · g⁻², but the experimental value is much more negative, at −7.06 × 10⁻⁴ mol · ml · g⁻².10 The consistent performance of FMAPB2 and instances of inconsistencies among experimental measurements embolden us to suggest that perhaps this last experimental value is an underestimate. As support, the critical temperature (Tc) for lysozyme liquid-liquid phase separation was determined under similar solvent conditions to be approximately 9 °C,51 close to the 10 °C temperature of interest here. It has been proposed4, 52 that B2(Tc) ≈ −6.6B20/4. The latter relation predicts a B2 of −5.1 × 10⁻⁴ mol · ml · g⁻² at 9 °C. This estimate agrees much better with the FMAPB2 result than with the one reported by Bonneté et al. The foregoing comparison suggests that, by calculating the temperature dependence of B2 and using the B2(Tc) ≈ −6.6B20/4 relation, FMAPB2 can predict Tc (intersection of blue curve with solid horizontal line, Fig. 7).
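The arithmetic behind the B2(Tc) estimate, and the way a critical temperature could be read off a calculated B2(T) curve, can be sketched as follows. The steric value is the lysozyme B20 reported earlier in this section. The function names are hypothetical, and the crossing is located by plain linear interpolation between tabulated points, which is an assumption about how one might do it rather than a description of the authors' procedure.

```python
def b2_at_critical_temperature(b2_steric):
    """Proposed relation from the text: B2(Tc) ≈ -6.6 * B2_steric / 4."""
    return -6.6 * b2_steric / 4.0

def crossing_temperature(temps, b2_values, target):
    """Temperature at which a tabulated B2(T) curve crosses `target`,
    by linear interpolation; returns None if there is no crossing."""
    points = list(zip(temps, b2_values))
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if b0 != b1 and (b0 - target) * (b1 - target) <= 0.0:
            return t0 + (target - b0) * (t1 - t0) / (b1 - b0)
    return None

# Steric component of lysozyme B2 in mol·ml/g² (from the text).
b2_tc = b2_at_critical_temperature(3.076e-4)
print(f"B2(Tc) ≈ {b2_tc:.2e} mol·ml/g²")
```

Feeding `crossing_temperature` a list of temperatures, the corresponding calculated B2 values at one ionic strength, and `b2_tc` as the target would give the Tc estimate that Fig. 7 shows graphically.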

FMAPB2 Web Server.

To make FMAPB2 widely accessible, we have implemented it as a web server, available at http://pipe.rcc.fsu.edu/fmapb2. The server outputs B2 and its steric component B20 (Fig. 8A). In addition, their convergence at increasing numbers of rotations used for averaging over Ω (Fig. 8B), the PMF over R, and the diameter of the equivalent sphere based on B20 are reported (Fig. 8C).

Figure 8.

Output of the FMAPB2 web server, for lysozyme at pH 4.5. (A) Summary of predicted results. (B) Convergence of B2 at increasing number of probe molecule rotations. (C) PMF over R. A vertical dashed line indicates the diameter of the equivalent sphere based on B20.

The input to the server is a structure file in PQR format, which contains not only atomic coordinates but also atomic charges and radii. The atomic radii are reassigned according to the van der Waals parameters in AutoDock4.42 The user also enters the ionic strength and temperature for the B2 calculation. For convenience a portal is also provided for the user to start from a PDB file, which can be directly uploaded or, by entering a PDB entry name, retrieved from the Protein Data Bank (http://www.rcsb.org/pdb/). In this case the user also enters the desired pH value. The server then assigns atomic charges and radii using the PROPKA38 and PDB2PQR36 programs.
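Before submitting a PQR file, it can be useful to confirm that its net charge matches what is expected at the chosen pH. The sketch below is a minimal reader, assuming the common whitespace-delimited PQR layout in which the last five fields of each ATOM or HETATM record are x, y, z, charge, and radius. The names `Atom`, `parse_pqr`, and `net_charge` are hypothetical, and the two sample records carry made-up values rather than data from a real structure. Files whose numeric columns run together without spaces would need a fixed-column parser instead.

```python
from typing import List, NamedTuple

class Atom(NamedTuple):
    name: str
    x: float
    y: float
    z: float
    charge: float   # in units of e
    radius: float   # in Å

def parse_pqr(text: str) -> List[Atom]:
    """Read ATOM/HETATM records from whitespace-delimited PQR text."""
    atoms = []
    for line in text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        fields = line.split()
        # Last five fields: x, y, z, charge, radius (assumed layout).
        x, y, z, charge, radius = map(float, fields[-5:])
        atoms.append(Atom(fields[2], x, y, z, charge, radius))
    return atoms

def net_charge(atoms: List[Atom]) -> float:
    """Sum of atomic partial charges."""
    return sum(atom.charge for atom in atoms)

# Illustrative records with made-up values (not from a real structure).
SAMPLE_PQR = """\
ATOM      1  N   LYS     1       3.287  10.092  10.329  0.0966 2.0000
ATOM      2  CA  LYS     1       2.445  10.457   9.182 -0.0015 2.0000
"""

atoms = parse_pqr(SAMPLE_PQR)
print(len(atoms), "atoms, net charge", round(net_charge(atoms), 4))
```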

The B2 calculation is offloaded to compute nodes maintained by the Research Computing Center at Florida State University. On a single node with 16 Intel Xeon E5-2650 cores at 2.6 GHz, the calculation takes approximately 4 hours for lysozyme and 17 hours for BSA, involving 310³ and 458³ grid points, respectively, for each of the 4392 rotations, or a total of 1.31 × 10¹¹ and 4.22 × 10¹¹ configurations uniformly distributed in the 6-dimensional space.
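The configuration counts quoted above can be reproduced by reading the grid sizes as 310 and 458 points along each of three translational axes, an interpretation that matches the stated totals. The helper name below is hypothetical.

```python
def n_configurations(points_per_axis: int, n_rotations: int) -> int:
    """Translational grid points (cubic grid) times probe rotations."""
    return points_per_axis**3 * n_rotations

for protein, points in (("lysozyme", 310), ("BSA", 458)):
    total = n_configurations(points, 4392)
    print(f"{protein}: {total:.2e} configurations")
```

Each rotation costs one set of FFT-based correlations over the whole translational grid, which is what makes sampling on the order of 10¹¹ configurations feasible on a single node.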

CONCLUSION AND OUTLOOK

We have developed the FMAPB2 method for calculating the second virial coefficients of protein solutions, with the interprotein interaction energy modeled at the all-atom level in implicit solvent. The method has been tested on lysozyme and other proteins ranging in molecular weight from 6.5 to 65 kD, over wide ranges of solvent conditions (salt concentration, pH, and temperature). Our calculations show that the value of B2 is determined by the steric covolume and by the balance between nonpolar attraction and electrostatic interactions; this balance can be shifted by salt, pH, and temperature. However, the effects of nonpolar attraction and electrostatic interactions are not simply additive. FMAPB2 will hopefully replace DLVO for future interpretation of experimental data on B2 and join experimental techniques for cross validation, given that the latter can produce large discrepancies. It can be accessed at http://pipe.rcc.fsu.edu/fmapb2 by simply uploading a PDB file or entering a PDB entry name.

Further developments can be pursued in many directions. For salt effects, we have only accounted for screening of electrostatic interactions. The B2 data that we gathered from the literature all used NaCl as the salt, for which electrostatic screening is the dominant effect, up to approximately 0.5 M concentration. Salts can have two other effects.53–54 First, at high concentrations, they can change the surface tension of water and thereby modulate the strength of hydrophobic interactions. Kosmotropic ions such as SO₄²⁻ strengthen hydrophobic interactions whereas chaotropic ions such as NO₃⁻ and SCN⁻ weaken them, lowering and raising, respectively, the value of B2. An overestimation of the lysozyme B2 by FMAPB2 at ionic strength close to 1 M (Fig. 4A) can be attributed to this effect, given that Cl⁻ is a mild kosmotrope. Second, starting at low concentrations, chaotropic anions can bind to the surface of a cationic protein like lysozyme, thereby neutralizing the protein and reducing electrostatic repulsion. This effect would lower B2. Indeed, B2 values obtained using NaNO₃ and NaSCN are lower than those using NaCl.10 We can build these two types of salt effects into our interaction energy function. Similarly, cosolvents such as alcohols can change B2;30, 32, 55 their effects can also be built into our energy function.

The treatment of atomistic details in FMAPB2 not only has yielded physical insight into B2 but also allows us to begin to investigate how amino-acid sequence variations7 or even single mutations18, 29 affect B2. For such investigations to be meaningful, further refinement of our interaction energy function may be needed and the corresponding experimental data may also need to be cross-validated. Another straightforward application of FMAPB2 is to the second virial cross coefficient, which depends on the interaction between two different proteins and arises in a virial expansion of the osmotic pressure in a binary solution.56–58 The only modification needed for FMAPB2 is to use two different proteins, one for the probe molecule and one for the target molecule; the computational cost remains the same. Also worth exploring is whether FMAPB2 can be used for calculating the binding constants for protein-protein specific binding (some effort on protein-ligand binding constant has already been made59), although it can be anticipated that the development of an implicit-solvent interaction energy function appropriate for specific binding and the neglect of induced fit in our approach can pose challenges.

In addition to B2 values, FMAPB2 can be used to calculate other experimental observables. In particular, the Boltzmann factor, exp[−βW(R)], of the PMF over R is the radial distribution function in the dilute limit, which can be used to calculate the structure factor and hence the scattering intensity over the entire q range, enabling direct comparison with raw SAXS/SANS data. For this reason, our web server returns the PMF over R (Fig. 8C). The difficulties in precise measurements of B2 prompted Platten et al.49 to propose an indirect method for obtaining B2, through determining the binodal of liquid-liquid phase separation. The idea behind it is the postulate that all molecular systems with only short-range attraction can be mapped to the square-well model,60 for which one knows the relation between the binodal and B2. We hope that FMAPB2, with further improvements, will become so reliable as to alleviate the need for going to such great lengths to obtain B2. Moreover, FMAPB2 may even be extended to predict the binodal. We have already illustrated that, based on the relation B2(Tc) ≈ −6.6B20/4, FMAPB2 can be used to determine the critical temperature for liquid-liquid phase separation (Fig. 7). The entire binodal can be determined from B2 and a few higher order virial coefficients, based on a truncated virial expansion of the excess chemical potential27–28, 50

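$$\beta\mu_{\mathrm{ex}} \approx \beta\mu_{\mathrm{ex}}^{0} + \sum_{l=2}^{4} \frac{l}{l-1}\left(B_l - B_l^{0}\right)\rho^{\,l-1}$$ [26]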

where the superscript “0” denotes the steric component. B3 and B4 are given by “cluster integrals” in the relative configurational space (translation and rotation) of three and four protein molecules; the integrands are the products of pairwise Mayer f-functions. Following the use of Fourier transforms in B3 and B4 calculations for spherical models of molecular fluids,61–63 it is feasible to extend FMAPB2 to these calculations for atomistic proteins in implicit solvent. One can then use eq [26] to predict not just the binodal but many other thermodynamic properties of a protein solution.
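As a sketch of how eq [26] would be evaluated once the virial coefficients are available, the function below sums the correction terms for l = 2 to 4. It assumes the form βμex ≈ βμex⁰ + Σ l/(l−1) (Bl − Bl⁰) ρ^(l−1), with ρ the number density and each Bl in the matching volume units. The function name and the dictionary-based interface are hypothetical.

```python
from typing import Mapping

def excess_chemical_potential(
    rho: float,
    beta_mu_ex_steric: float,
    virials: Mapping[int, float],
    steric_virials: Mapping[int, float],
) -> float:
    """Truncated virial expansion of the excess chemical potential (in kBT).

    rho:               protein number density.
    beta_mu_ex_steric: steric reference value, beta * mu_ex^0, at this rho.
    virials:           {l: B_l} for l = 2, 3, 4 (full interaction).
    steric_virials:    {l: B_l^0} for the same l (steric component only).
    """
    total = beta_mu_ex_steric
    for l in sorted(virials):
        total += l / (l - 1) * (virials[l] - steric_virials[l]) * rho ** (l - 1)
    return total
```

When the full and steric coefficients coincide, the function returns the steric reference unchanged; with only B2 differing from B20 by ΔB2, the correction reduces to 2ΔB2ρ, the leading term of the expansion.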

ACKNOWLEDGMENT

This work was supported by National Institutes of Health Grant GM118091.

ABBREVIATIONS

BPTI

bovine pancreas trypsin inhibitor

BSA

bovine serum albumin

DLVO

Derjaguin, Landau, Verwey, and Overbeek

FFT

fast Fourier transform

FMAP

FFT-based modeling of atomistic protein-protein interactions

GFMT

generalized fundamental measure theory

PDB

Protein Data Bank

PMF

potential of mean force

SANS

small angle neutron scattering

SAXS

small angle X-ray scattering

SLS

static light scattering

Footnotes

Supporting Information

The following Supporting Information is available free of charge: seven additional figures (Figures S1 to S7).

The authors declare no competing financial interest.

REFERENCES

  • 1.Rosenbaum DF; Zukoski CF, Protein interactions and crystallization. J Cryst Growth 1996, 169, 752–758. [Google Scholar]
  • 2.George A; Wilson WW, Predicting protein crystallization from a dilute solution property. Acta Crystallogr D Biol Crystallogr 1994, 50, 361–365. [DOI] [PubMed] [Google Scholar]
  • 3.Quigley A; Williams DR, The second virial coefficient as a predictor of protein aggregation propensity: a self-interaction chromatography study. Eur J Pharm Biopharm 2015, 96, 282–290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Vliegenthart GA; Lekkerkerker HNW, Predicting the gas-liquid critical point from the second virial coefficient. J Chem Phys 2000, 112, 5364–5369. [Google Scholar]
  • 5.Hill TL, An Introduction to Statistical Thermodynamics. Dover Publications, Inc.: New York, 1986. [Google Scholar]
  • 6.Qin S; Zhou HX, Protein folding, binding, and droplet formation in cell-like conditions. Curr Opin Struct Biol 2017, 43, 28–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Bajaj H; Sharma VK; Kalonia DS, Determination of second virial coefficient of proteins using a dual-detector cell for simultaneous measurement of scattered light intensity and concentration in SEC-HPLC. Biophys J 2004, 87, 4048–4055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Attri AK; Minton AP, New methods for measuring macromolecular interactions in solution via static light scattering: basic methodology and application to nonassociating and self-associating proteins. Anal Biochem 2005, 337, 103–110. [DOI] [PubMed] [Google Scholar]
  • 9.Velev OD; Kaler EW; Lenhoff AM, Protein interactions in solution characterized by light and neutron scattering: comparison of lysozyme and chymotrypsinogen. Biophys J 1998, 75, 2682–2697. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Bonneté F; Finet S; Tardieu A, Second virial coefficient: variations with lysozyme crystallization conditions. J Cryst Growth 1999, 196, 403–414. [Google Scholar]
  • 11.Derjaguin B; Landau LD, Theory of the stability of strongly charged lyophobic sols and of the adhesion of strongly charged particles in solutions of electrolytes. Acta Physicochim URSS 1941, 14, 633–662. [Google Scholar]
  • 12.Verwey EJW; Overbeek JTG, Theory of the Stability of Lyophobic Colloids. Elsevier Publishing Company, Inc.: New York, 1948. [Google Scholar]
  • 13.Neal BL; Asthagiri D; Velev OD; Lenhoff AM; Kaler EW, Why is the osmotic second virial coefficient related to protein crystallization? J Cryst Growth 1999, 196, 377–387. [Google Scholar]
  • 14.Neal BL; Asthagiri D; Lenhoff AM, Molecular origins of osmotic second virial coefficients of proteins. Biophys J 1998, 75, 2469–2477. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Elcock AH; McCammon JA, Calculation of weak protein-protein interactions: the pH dependence of the second virial coefficient. Biophys J 2001, 80, 613–625. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Carlsson F; Malmsten M; Linse P, Monte Carlo simulations of lysozyme self-association in aqueous solution. J Phys Chem B 2001, 105, 12189–12195. [Google Scholar]
  • 17.Lund M; Jonsson B, A mesoscopic model for protein-protein interactions in solution. Biophys J 2003, 85, 2940–2947. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.McGuffee SR; Elcock AH, Atomically detailed simulations of concentrated protein solutions: the effects of salt, pH, point mutations, and protein concentration in simulations of 1000-molecule systems. J Am Chem Soc 2006, 128, 12098–12110. [DOI] [PubMed] [Google Scholar]
  • 19.Mereghetti P; Gabdoulline RR; Wade RC, Brownian dynamics simulation of protein solutions: structural and dynamical properties. Biophys J 2010, 99, 3782–3791. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Kim B; Song X, Calculations of the second virial coefficients of protein solutions with an extended fast multipole method. Phys Rev E 2011, 83, 011915. [DOI] [PubMed] [Google Scholar]
  • 21.Grünberger A; Lai P-K; Blanco MA; Roberts CJ, Coarse-grained modeling of protein second osmotic virial coefficients: sterics and short-ranged attractions. J Phys Chem B 2013, 117, 763–770. [DOI] [PubMed] [Google Scholar]
  • 22.Stark AC; Andrews CT; Elcock AH, Toward optimized potential functions for protein-protein interactions in aqueous solutions: osmotic second virial coefficient calculations using the MARTINI coarse-grained force field. J Chem Theory Comput 2013, 9, 4176–4185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Prytkova V; Heyden M; Khago D; Freites JA; Butts CT; Martin RW; Tobias DJ, Multi-conformation Monte Carlo: a method for introducing flexibility in efficient simulations of many-protein systems. J Phys Chem B 2016, 120, 8115–8126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Marrink SJ; Risselada HJ; Yefimov S; Tieleman DP; de Vries AH, The MARTINI force field: coarse grained model for biomolecular simulations. J Phys Chem B 2007, 111, 7812–7824. [DOI] [PubMed] [Google Scholar]
  • 25.Qin S; Zhou HX, FFT-based method for modeling protein folding and binding under crowding: benchmarking on ellipsoidal and all-atom crowders. J Chem Theory Comput 2013, 9, 4633–4643. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Qin S; Zhou HX, Further development of the FFT-based method for atomistic modeling of protein folding and binding under crowding: optimization of accuracy and speed. J Chem Theory Comput 2014, 10, 2824–2835. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Qin SB; Zhou HX, Fast method for computing chemical potentials and liquid-liquid phase equilibria of macromolecular solutions. J Phys Chem B 2016, 120, 8164–8174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Nguemaha V; Qin S; Zhou HX, Transfer free energies of test proteins Into crowded protein solutions have simple dependence on crowder concentration. Front Mol Biosci 2019, 6, 39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Curtis RA; Steinbrecher C; Heinemann M; Blanch HW; Prausnitz JM, Hydrophobic forces between protein molecules in aqueous solutions of concentrated electrolyte. Biophys Chem 2002, 98, 249–265. [DOI] [PubMed] [Google Scholar]
  • 30.Liu W; Bratko D; Prausnitz JM; Blanch HW, Effect of alcohols on aqueous lysozyme-lysozyme interactions from static light-scattering measurements. Biophys Chem 2004, 107, 289–298. [DOI] [PubMed] [Google Scholar]
  • 31.Woldeyes MA; Calero-Rubio C; Furst EM; Roberts CJ, Predicting protein interactions of concentrated globular protein solutions using colloidal models. J Phys Chem B 2017, 121, 4756–4767. [DOI] [PubMed] [Google Scholar]
  • 32.Farnum M; Zukoski C, Effect of glycerol on the interactions and solubility of bovine pancreatic trypsin inhibitor. Biophys J 1999, 76, 2716–2726. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Blanco MA; Sahin E; Robinson AS; Roberts CJ, Coarse-grained model for colloidal protein interactions, B(22), and protein cluster formation. J Phys Chem B 2013, 117, 16013–16028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Ma Y; Acosta DM; Whitney JR; Podgornik R; Steinmetz NF; French RH; Parsegian VA, Determination of the second virial coefficient of bovine serum albumin under varying pH and ionic strength by composition-gradient multi-angle static light scattering. J Biol Phys 2015, 41, 85–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.McMillan WG Jr.; Mayer JE, The statistical thermodynamics of multicomponent systems. J Chem Phys 1945, 13, 276–305. [Google Scholar]
  • 36.Dolinsky TJ; Czodrowski P; Li H; Nielsen JE; Jensen JH; Klebe G; Baker NA, PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res 2007, 35, W522–W525. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Sitkoff D; Sharp KA; Honig B, Accurate calculation of hydration free energies using macroscopic solvent models. J Phys Chem B 1994, 98, 1978–1988. [Google Scholar]
  • 38.Bas DC; Rogers DM; Jensen JH, Very fast prediction and rationalization of pKa values for protein-ligand complexes. Proteins 2008, 73, 765–783.
  • 39.Kuramitsu S; Hamaguchi K, Analysis of the acid-base titration curve of hen lysozyme. J Biochem 1980, 87, 1215–1219.
  • 40.Mitchell JC, Sampling rotation groups by successive orthogonal images. SIAM J Sci Comput 2008, 30, 525–547.
  • 41.Flyvbjerg H; Petersen HG, Error estimates on averages of correlated data. J Chem Phys 1989, 91, 461–466.
  • 42.Huey R; Morris GM; Olson AJ; Goodsell DS, A semiempirical free energy force field with charge-based desolvation. J Comput Chem 2007, 28, 1145–1152.
  • 43.Cornell WD; Cieplak P; Bayly CI; Gould IR; Merz KM; Ferguson DM; Spellmeyer DC; Fox T; Caldwell JW; Kollman PA, A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J Am Chem Soc 1995, 117, 5179–5197.
  • 44.Chandler D, Interfaces and the driving force of hydrophobic assembly. Nature 2005, 437, 640–647.
  • 45.Malmberg CG; Maryott AA, Dielectric constant of water from 0 to 100 °C. J Res Nat Bur Stand 1956, 56, 1–8.
  • 46.Qin S; Zhou HX, Generalized fundamental measure theory for atomistic modeling of macromolecular crowding. Phys Rev E 2010, 81, 031919.
  • 47.Oversteegen SM; Roth R, General methods for free-volume theory. J Chem Phys 2005, 122, 214502.
  • 48.Pellicane G; Costa D; Caccamo C, Microscopic determination of the phase diagrams of lysozyme and γ-crystallin solutions. J Phys Chem B 2004, 108, 7538–7541.
  • 49.Platten F; Valadez-Perez NE; Castaneda-Priego R; Egelhaaf SU, Extended law of corresponding states for protein solutions. J Chem Phys 2015, 142, 174905.
  • 50.Lomakin A; Asherie N; Benedek GB, Monte Carlo study of phase separation in aqueous protein solutions. J Chem Phys 1996, 104, 1646–1656.
  • 51.Muschol M; Rosenberger F, Liquid–liquid phase separation in supersaturated lysozyme solutions and associated precipitate formation/crystallization. J Chem Phys 1997, 107, 1953–1962.
  • 52.Zhou HX; Nguemaha V; Mazarakos K; Qin S, Why do disordered and structured proteins behave differently in phase separation? Trends Biochem Sci 2018, 43, 499–516.
  • 53.Baldwin RL, How Hofmeister ion interactions affect protein stability. Biophys J 1996, 71, 2056–2063.
  • 54.Zhang Y; Cremer PS, The inverse and direct Hofmeister series for lysozyme. Proc Natl Acad Sci U S A 2009, 106, 15249–15253.
  • 55.Gogelein C; Wagner D; Cardinaux F; Nagele G; Egelhaaf SU, Effect of glycerol and dimethyl sulfoxide on the phase behavior of lysozyme: theory and experiments. J Chem Phys 2012, 136, 015102.
  • 56.Tessier PM; Sandler SI; Lenhoff AM, Direct measurement of protein osmotic second virial cross coefficients by cross-interaction chromatography. Protein Sci 2004, 13, 1379–1390.
  • 57.Wu D; Minton AP, Quantitative characterization of nonspecific self- and hetero-interactions of proteins in nonideal solutions via static light scattering. J Phys Chem B 2015, 119, 1891–1898.
  • 58.Quigley A; Williams DR, Similar interaction chromatography of proteins: a cross interaction chromatographic approach to estimate the osmotic second virial coefficient. J Chromatogr A 2016, 1459, 47–56.
  • 59.Nguyen TH; Zhou HX; Minh DDL, Using the fast Fourier transform in binding free energy calculations. J Comput Chem 2018, 39, 621–636.
  • 60.Noro MG; Frenkel D, Extended corresponding-states behavior for particles with variable range attractions. J Chem Phys 2000, 113, 2941–2944.
  • 61.Montroll EW; Mayer JE, Statistical mechanics of imperfect gases. J Chem Phys 1941, 9, 626–637.
  • 62.Dyer KM; Perkyns JS; Pettitt BM, A reexamination of virial coefficients of the Lennard-Jones fluid. Theor Chem Acc 2001, 105, 244–251.
  • 63.Shaul KRS; Schultz AJ; Perera A; Kofke DA, Integral-equation theories and Mayer-sampling Monte Carlo: a tandem approach for computing virial coefficients of simple fluids. Mol Phys 2011, 109, 2395–2406.
