Author manuscript; available in PMC: 2018 Feb 28.
Published in final edited form as: Phys Rev E. 2017 Sep 25;96(3-1):032415. doi: 10.1103/PhysRevE.96.032415

Model for screened, charge-regulated electrostatics of an eye lens protein: Bovine gammaB-crystallin

Christopher W Wahle 1, K Michael Martini 2,3, Dawn M Hollenbeck 2, Andreas Langner 4,*, David S Ross 1, John F Hamilton 1, George M Thurston 2,
PMCID: PMC5830141  NIHMSID: NIHMS931472  PMID: 29346981

Abstract

We model screened, site-specific charge regulation of the eye lens protein bovine gammaB-crystallin (γ B) and study the probability distributions of its proton occupancy patterns. Using a simplified dielectric model, we solve the linearized Poisson-Boltzmann equation to calculate a 54 × 54 work-of-charging matrix, each entry being the modeled voltage at a given titratable site, due to an elementary charge at another site. The matrix quantifies interactions within patches of sites, including γB charge pairs. We model intrinsic pK values that would occur hypothetically in the absence of other charges, with use of experimental data on the dependence of pK values on aqueous solution conditions, the dielectric model, and literature values. We use Monte Carlo simulations to calculate a model grand-canonical partition function that incorporates both the work-of-charging and the intrinsic pK values for isolated γB molecules and we calculate the probabilities of leading proton occupancy configurations, for 4 < pH < 8 and Debye screening lengths from 6 to 20 Å. We select the interior dielectric value to model γB titration data. At pH 7.1 and Debye length 6.0 Å, on a given γB molecule the predicted top occupancy pattern is present nearly 20% of the time, and 90% of the time one or another of the first 100 patterns will be present. Many of these occupancy patterns differ in net charge sign as well as in surface voltage profile. We illustrate how charge pattern probabilities deviate from the multinomial distribution that would result from use of effective pK values alone and estimate the extents to which γB charge pattern distributions broaden at lower pH and narrow as ionic strength is lowered. These results suggest that for accurate modeling of orientation-dependent γB-γB interactions, consideration of numerous pairs of proton occupancy patterns will be needed.

I. INTRODUCTION

In considering interactions between proteins in solution, one interesting feature is that many different protonation patterns of the titratable, possibly charged amino acid residues coexist in equilibrium [1–3]. Because its acidic and basic residues continually exchange protons with the surrounding solution, an individual protein molecule presents many different spatial patterns of positive and negative charges to its neighbors, and the corresponding voltage patterns around each molecule keep changing. Each possible pair of such charging patterns can in principle give rise to a distinct spatial and orientational dependence of the screened electrostatic interaction between two nearby protein molecules [4–6], and the basins of attraction and repulsive parts of the corresponding potential energy landscape may change in depth or height, angular and spatial extent, and number.

The probabilities of individual charging patterns on molecules that are close enough, approximately within one or two Debye electrostatic screening lengths, also change in response to the altered voltages at neighboring sites on the two surfaces [2,7–11]. Such proximity can already occur more than 20% of the time even at protein volume fractions near 1% [12] and is of critical importance at the high macromolecular volume fractions in living cells, which have been estimated to range from 0.07 to 0.40 [13].

Phase transitions are ubiquitous in the normal and pathological physiology of living cells and tissues [14–19]. Many of these transitions involve multiple chemical equilibria, such as the protonation equilibria studied here. Such protonation and other ligand-binding features, in solutions of proteins and other macromolecules that undergo phase transitions, are analogous to the simultaneous multiple chemical equilibria and phase transitions occurring in micellar solutions, microemulsions, and other self-associating systems [20–24]. How do the relevant chemical equilibria and kinetics affect phase transitions in macromolecular solutions?

For a protein with 20 residues that may change charge at a particular pH, by accepting or donating protons from its surroundings, there are 2²⁰ or about 10⁶ such coexisting protonation patterns. Even if most of these patterns are highly unlikely, it still may be necessary to consider the interactions of many different pairs of charging patterns in order to build quantitative models of their consequences for protein-protein interactions. For example, within a given pair of γB molecules at pH 7.1, the present model predicts that on each molecule, one or more of the most frequent 100 charging patterns will be present about 90% of the time. Thus, in order to account for 80% of the possible pair interaction potentials between these molecules, in principle one then needs to consider the approximately 5050 distinct pairings that can occur between members of these top 100 charging patterns.
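The counting in this paragraph can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the pair coverage assumes that the patterns on the two molecules are statistically independent, which is reasonable only for isolated molecules.

```python
from math import comb

n_sites = 20
n_patterns = 2 ** n_sites             # each site is either occupied or empty

top = 100                             # leading patterns retained per molecule
coverage_single = 0.90                # fraction of time one of them is present
coverage_pair = coverage_single ** 2  # both molecules in the top set (independence assumed)

# Unordered pairings of top patterns, counting a pattern paired with itself
n_pairings = comb(top, 2) + top

print(n_patterns, round(coverage_pair, 2), n_pairings)
```

The pair coverage of roughly 0.8 and the 5050 pairings are the figures quoted in the text.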

Therefore, one important element for understanding protein-protein interactions is to know how often each charge pattern occurs in the isolated molecules, which is the focus of the present work. While the probabilities of each of these charge patterns will change with increasing protein concentration, their probability distributions for the isolated protein molecules nevertheless form part of the groundwork for characterizing pairwise and higher-order orientation-dependent interactions between proteins.

In order to evaluate how often a given pattern occurs, it is important to account for the fact that substantial electrostatic coupling can occur between charge patterns on a single protein, as well as between neighboring proteins, in the phenomenon known as charge regulation. Due to these electrostatic couplings, the probability of a given pattern is not, in principle, given by the product of the probabilities for each titratable residue to be occupied with a proton. Indeed, on a lattice such couplings can give rise to a charge-patterning phase transition [25].

That is, knowledge of the individual pH values at which each titratable residue is occupied with a proton half the time on average, called the pK1/2 values, in combination with the Henderson-Hasselbalch dependence [26] of occupancy on pH − pK1/2, is in principle not sufficient to evaluate the pattern probabilities. A better description is that effective pK values for given groups change in response to neighboring charges [27–30]. However, because neighboring titratable site occupancies can be substantially altered [31–34] from the Henderson-Hasselbalch form, a more comprehensive description can be given by a grand-canonical distribution model or equivalent consideration [2,7,10,11,25,29,32,33,35,36] that incorporates screened electrostatic couplings, as we pursue here for γB-crystallin (Protein Data Bank ID 1AMM).
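For reference, the uncoupled baseline that the grand-canonical model departs from can be written down directly. The sketch below assumes independent sites, each following the Henderson-Hasselbalch form, so that a pattern probability is a simple product. The function names and the three pK1/2 values are hypothetical and are not taken from the paper.

```python
def protonated_fraction(pH: float, pK_half: float) -> float:
    """Henderson-Hasselbalch probability that a site holds a proton."""
    return 1.0 / (1.0 + 10.0 ** (pH - pK_half))


def independent_pattern_probability(occupancy, pK_half_values, pH):
    """Product-form probability of an occupancy pattern for uncoupled sites."""
    prob = 1.0
    for occupied, pK_half in zip(occupancy, pK_half_values):
        f = protonated_fraction(pH, pK_half)
        prob *= f if occupied else (1.0 - f)
    return prob


# Hypothetical three-site example (values chosen for illustration only)
pK_values = [4.0, 6.5, 10.5]
p = independent_pattern_probability([0, 1, 1], pK_values, pH=6.5)
print(p)
```

Any electrostatic coupling between sites makes the true pattern probabilities deviate from this product form.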

In the present model we calculate a work-of-charging matrix that models screened electrostatic links between titratable sites on the protein, which are assumed to be fixed in position relative to the protein. In so doing it is important to recognize that other factors can contribute that we do not incorporate, including changes in conformation that are important in allosteric effects and in calculations that use more microscopic representations of dielectric properties [31,37–42], hydration and the hydrophobic effect [43], hydrogen-bonding [44,45], static dipole potentials [45–47], and ion binding [48], each of which can also be expected to produce changes in local charge patterns. We note that γB-crystallin is believed to have a fairly robust internal structure; for example, circular dichroism measurements [49] showed no significant spectroscopic changes between −20 °C and 60 °C, though this does not rule out the possible role of conformational flexibility in affecting the present model. In the larger context of protein-protein interactions, we note that the work-of-charging matrix also involves sites on neighboring proteins and itself depends on the relative positions and orientations of the protein neighbors [10].

We focus the present model on studying the probability distributions of the protonation patterns of an eye lens protein, bovine γB-crystallin (γB). In aqueous solution, the eye lens γ-crystallins show liquid-liquid phase separation with an upper consolute temperature [12,50–53], a phenomenon that can compromise transparency of the eye lens and has been linked to cataract disease [54]. The human counterpart of bovine γB-crystallin, human γD-crystallin (HGD), exhibits many single amino acid mutations that lead to congenital cataracts. The effects of a number of these mutations on the phase diagrams of HGD or HGD–α-crystallin mixtures are consistent with cataractogenesis [55–60]. These findings motivate the present work, as an aid to building models of how particular amino acid changes affect protein interactions, the resulting phase diagram, and ultimately lens transparency and the cataract.

We build our model for the probability distributions using the notation of a previous paper [10], though it is important to note that study of protein charge regulation has a very long history [1,2,7,27,29,32,33,35,36,61–63] (for a recent review see Ref. [3]) and there are other equivalent sets of notation. Briefly, we use a linearized Poisson-Boltzmann equation to compute a work-of-charging matrix. This matrix enables modeling of the work required to assemble a given pattern of charges on the protein. The linearized Poisson-Boltzmann equation is a useful starting point for this purpose because its linearity allows for the use of superposition in considering the effects of many charges and the work of charging is a symmetric quadratic form in the vectors of site charges [10]. We model the pK values of the titratable residues, with a simplified consideration of their dielectric environments. The combined work-of-charging matrix and pK values enter into a grand-canonical distribution that models the relative probabilities of occupancy patterns. We fix an assumed constant interior dielectric value through comparison with existing experimental charge vs pH data for γB. We then use Monte Carlo simulations and direct calculations to study the resulting probability distributions of protonation patterns. We note that while numerical Poisson-Boltzmann solvers are available that provide calculations of the screened electrostatic environment around proteins and other biological macromolecules (see, e.g., Refs. [64–67]) and corresponding acid-base titration characteristics [68], we developed a program with a view towards flexibility in analyzing model systems [10], including the protonation pattern probability distributions studied here, and for ongoing work on protein-protein interactions.

Figure 1 sets the stage for this work, by depicting the screened electrostatic potential that corresponds to the top proton occupancy pattern near neutral pH, modeled to occur about 17% of the time. Interestingly, in view of the attractive interactions that lead to liquid-liquid phase separation of this protein [50], the contours of zero voltage extend fairly far from the protein, in comparison with the Debye length, here 6 Å. For a 1:1 electrolyte in water at 298 K, a Debye length of 6 Å corresponds to an ionic strength of 257 mM, close to that at which the γB phase diagram has been studied [12,50,52,69–72]. One might expect that negative and positive patches on neighboring molecules, separated by one or more Debye lengths, are quite capable of creating attractions by facing one another at relatively specific orientations.
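The correspondence between a 6 Å Debye length and an ionic strength of 257 mM follows from the standard Debye-Hückel expression. The sketch below assumes the relative dielectric coefficient of water quoted later in the text (78.5) and CODATA constants; the function names are illustrative.

```python
from math import sqrt

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
KB = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
N_A = 6.02214076e23         # Avogadro constant, 1/mol


def debye_length_angstrom(ionic_strength_molar, eps_r=78.5, temperature=298.0):
    """Debye length in angstroms for a 1:1 electrolyte (ionic strength in mol/L)."""
    conc = ionic_strength_molar * 1000.0  # convert mol/L to mol/m^3
    length_m = sqrt(EPS0 * eps_r * KB * temperature
                    / (2.0 * N_A * E_CHARGE ** 2 * conc))
    return length_m * 1e10


def ionic_strength_molar(debye_angstrom, eps_r=78.5, temperature=298.0):
    """Inverse relation: ionic strength (mol/L) that gives the requested Debye length."""
    length_m = debye_angstrom * 1e-10
    conc = EPS0 * eps_r * KB * temperature / (2.0 * N_A * E_CHARGE ** 2 * length_m ** 2)
    return conc / 1000.0


print(debye_length_angstrom(0.257))  # close to 6 angstroms
print(ionic_strength_molar(20.0))    # ionic strength at the longest Debye length studied
```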

FIG. 1.

(a) Screened potential contours produced by the charges of γ B-crystallin, for the most common protonation pattern occurring at pH = 7.1 and Debye length 6.0 Å, corresponding to an ionic strength of 257 mM for a 1:1 electrolyte in water at 298 K: +kBT/e V (blue with horizontal curves), 0 V (light), and −kBT/e V (red with vertical curves). Black spheres, gray octahedra, and white spheres show positive, neutral, and negative sites, respectively. Curves on the 0 V contour are spaced by two Debye lengths from the center of the protein. The netted surface is the low-dielectric boundary and the plain (light blue) surface just outside it is the electrolyte boundary (see Fig. 2). (b) To aid in visualization, auxiliary spheres of radius 18.5 Å were placed over top and bottom parts of the molecule, and the potential and charges are shown in blue (+) (dark gray and nearly black, respectively) or red (−) (potential light and charges dark gray). Neutral charges are lightest in (b). (c) A simultaneous view of voltages around the entire protein surface can be given with the use of two Lambert azimuthal equal-area projections, one for each of the top and bottom spheres; projected locations of amino-acid residues of possibly charged sites are indicated. The grayscale description of (c) is like that in (b). The dashed rectangle in (c) shows the portion that is visible in (b). The top and bottom perimeter circles in (c) are both images of the crease between the auxiliary spheres in (b). Darker curves in (b) and (c) show +kBT/e and −kBT/e V contours.

To help study the resulting voltage variation, Fig. 1(b) shows the sign of the potential on auxiliary spheres that were placed over the top and bottom portions of the molecule for this purpose, about a one-half Debye length from the surface of the protein, with projected positions of possibly charged residues also indicated. Figure 1(c) shows conjoined Lambert azimuthal equal-area projections of the top and bottom auxiliary spheres, which provide a single view of the voltages around the entire protein surface.
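A Lambert azimuthal equal-area projection maps a point at polar angle θ from the projection pole to a planar radius 2R sin(θ/2), which preserves areas on the sphere. The sketch below projects about the +z pole of a sphere centered at the origin. How the authors oriented the two auxiliary spheres and joined the two projections is not specified here, so the function is only a generic illustration with a hypothetical name.

```python
import math


def lambert_azimuthal_equal_area(x, y, z):
    """Project a point on an origin-centered sphere onto a plane, about the +z pole."""
    radius = math.sqrt(x * x + y * y + z * z)
    cos_theta = max(-1.0, min(1.0, z / radius))  # clamp against rounding error
    theta = math.acos(cos_theta)                 # polar angle from the +z pole
    phi = math.atan2(y, x)                       # azimuth is preserved
    rho = 2.0 * radius * math.sin(theta / 2.0)   # equal-area radial coordinate
    return rho * math.cos(phi), rho * math.sin(phi)


# A point on the equator of an 18.5 angstrom auxiliary sphere
print(lambert_azimuthal_equal_area(18.5, 0.0, 0.0))
```

With this convention a hemisphere of radius R maps onto a disk of radius √2 R, whose area 2πR² equals that of the hemisphere.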

Thus, the electrostatic interactions between γB molecules can contribute to the short-range, orientation-dependent interactions long known to be important for understanding the broad widths of γ-crystallin liquid-liquid coexistence curves and the position of the crystal solubility boundary, or liquidus [52]. However, because the voltage patterns depend on which γB residues are protonated, different protonation patterns may significantly affect the relative orientations that lead to attraction and repulsion, much like the problems that can occur in attempting to fit jigsaw pieces together. Our purpose here is to build groundwork for studying the distribution of orientation-dependent interactions that result from probable protonation patterns.

The paper is organized as follows. We briefly recap the relevant theory as it is presented in [10,25]. We then describe the construction of our simplified dielectric model for γ B, and model pK values that would occur in the hypothetical absence of electrostatic interactions between sites, termed the intrinsic, or pKint, values, as defined by Tanford and Kirkwood [2]. The pKint values are functions of the geometry of the interior dielectric environment, because of its effect on the energy stored in the electrostatic field. We then calculate the work-of-charging matrix as a function of the electrostatic screening length in the solvent and the internal dielectric environment, as input to the model grand-canonical partition function. The resulting function gives predictions for the charge vs pH, or titration, curve of the protein, which we compare with existing titration [73] and isoelectric point [74,75] data. Because the predictions depend on the assumed internal dielectric coefficient, we make use of the data for tuning this coefficient. We then quantify and study the resulting probability distributions of protonation patterns. Although very few protonation patterns occur compared to the possible ones, we find that their probability distributions are nevertheless broad. We study the extent to which charge pattern probabilities deviate from the multinomial distribution that would result from use of variously defined effective pK values, called pKeff,α* below, and study how the distributions broaden at lower pH and narrow at lower ionic strength. It turned out, somewhat to our surprise, that seemingly subtle changes in the work-of-charging matrix, for example, ignoring entries smaller than 0.2kBT, can still produce changes in the modeled rank order of protonation patterns and we analyze why this is so. We briefly discuss possible implications for protein interactions and refinements before concluding.

II. MODEL

A. Screened electrostatic model

As in previous work [10,25], we model the response of the electrostatic potential ϕ(r) to a specified distribution of fixed charge per unit volume ρ(r) through use of the linearized Poisson-Boltzmann equation [76], written here for a medium with spatially varying relative dielectric coefficient εr(r) and Debye screening parameter κ(r):

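$$\nabla \cdot \left[\varepsilon_0 \varepsilon_r(\mathbf{r}) \nabla \phi(\mathbf{r})\right] - \kappa^2(\mathbf{r})\,\phi(\mathbf{r}) = -\rho(\mathbf{r}). \tag{1}$$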

In Eq. (1), ε0 is the vacuum permittivity, εr(r) is the local, relative static dielectric coefficient, ϕ(r) is the local electrostatic potential, ρ(r) is the local free charge per unit volume, and κ is related to the standard Debye screening length in water 1/κ̃ by κ = κ̃/εw, where εw is the static dielectric coefficient of liquid water. More sophisticated models of electrolyte solutions are needed in order to accurately model ionic solutions that are not dilute or contain divalent ions and explicit solvent [76–80], to incorporate important physical effects such as finite ion size and ion-specific interactions, including ion absorption [81–83], to include dipolar and polarizability-related interactions [84,85], and to take account of nonlinear dielectric response [86,87]. In the case of ion absorption, for example, one could construct expanded grand-canonical distribution models that would incorporate equilibria with ions other than protons or with polar ligands (see, e.g., [31,88]). Also, the application of Eq. (1) to molecular length scales, on which the protein and the solvent are heterogeneous, involves inherent problems that call for the use of more microscopic, quantum-mechanical approaches, as have been studied for many years (see, e.g., [37,40,89–93] and references therein). Nevertheless, at low ionic strengths and surface charge densities [94], Eq. (1) is a useful starting point for investigating patterned, charge regulation-mediated electrostatic interactions, because its linearity allows the use of superposition in considering the effects of many charges. Also, the work of charging a given configuration of titratable sites may be expressed as a symmetric quadratic form in the vectors of site charges [10].

B. Grand-canonical partition function

In the present model [10], the grand-canonical partition function Q can be written formally as a sum over the occupancy patterns, indexed by α, of protons on the protein

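$$Q = \sum_{\alpha} e^{-\Delta G_{\alpha}/k_BT} = \sum_{\alpha} \zeta^{k_{\alpha}}\, e^{-(\Delta\boldsymbol{\mu}^0 \cdot \mathbf{O}_{\alpha})/k_BT}\, e^{-W_{el,\alpha}/k_BT} \tag{2}$$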

in which ΔGα is the free energy of formation of pattern α, $\zeta = 10^{-\mathrm{pH}}$, kα is the total number of protons bound to the protein in configuration α, and $\Delta\boldsymbol{\mu}^0 = (\Delta\mu_1^0, \Delta\mu_2^0, \ldots, \Delta\mu_N^0)$ is a vector of standard chemical potential differences for the occupancy of each site. Each $\Delta\mu_i^0$ is related to the corresponding intrinsic pKint,i value of a titratable site by

$$\exp\left(-\frac{\Delta\mu_i^0}{k_BT}\right) = 10^{pK_{int,i}}. \tag{3}$$

By the pKint,i value we mean the value of the pK that site i would have hypothetically in the absence of electrostatic interactions with charges on other sites and in the absence of the electrolytes in the solvent, as in Ref. [2]. That is, it is not the pK1/2 that would be measured as the value of the pH at which that amino acid residue is, on average, 50% occupied, for example, with use of appropriate nuclear magnetic resonance (NMR) experiments. Instead, pK1/2 values emerge as a consequence of models of the present type [2,10,29,32–34]. In Sec. II D below we describe the model we used for estimating the pKint,i values.

The vector Oα in Eq. (2) is the occupancy pattern in configuration α, for example, {1,0,0,1,1,0,0,…}. The quantity Wel,α in Eq. (2) denotes the work of charging contribution to the free energy when the protein assumes occupancy pattern α. The Wel,α is a quadratic form constructed from the work-of-charging matrix W, which in this formulation is dimensionless. Each entry Wij in W is the screened electrostatic potential produced at site i by a unit charge at site j, multiplied by the electronic charge e, and divided by kBT. That Wij = Wji can be shown with use of Eq. (1) [10]. In this notation, Wel,α is given by

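$$\frac{W_{el,\alpha}}{k_BT} = \frac{1}{2}\,(q_1, q_2, \ldots, q_n)_{\alpha} \cdot W \cdot (q_1, q_2, \ldots, q_n)_{\alpha} = \frac{1}{2}\,(\mathbf{Q}_b + \mathbf{O}_{\alpha}) \cdot W \cdot (\mathbf{Q}_b + \mathbf{O}_{\alpha}). \tag{4}$$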

Here the vector (q1, q2, …, qn)α denotes the actual signed charge numbers on the protein for a specific pattern α and Qb denotes the vector of signed, bare charge numbers of the titratable groups, for example, −1 or 0. The bare charge numbers are 0 for arginine, histidine, and lysine residues, as well as the terminal amino group, and −1 for aspartate, glutamate, cysteine, and the terminal carboxylate. The probability of occupancy pattern α, Pα(x), is given by

$$P_{\alpha} = \frac{e^{-\Delta G_{\alpha}/k_BT}}{Q}. \tag{5}$$

We note that the grand-canonical partition function Q is also called the binding polynomial [31], because it can be written as a polynomial in powers of the proton activity ζ, as in Eq. (2).
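For a handful of sites, Eqs. (2)–(5) can be evaluated by direct enumeration. The sketch below is a toy illustration rather than the authors' program: the function name, the two-site pK values, and the matrix entries are hypothetical. With 54 sites there are 2⁵⁴ (about 1.8 × 10¹⁶) patterns, so exhaustive enumeration is not feasible, which is why the paper turns to Monte Carlo sampling.

```python
import itertools
import math


def pattern_probabilities(pK_int, bare_charges, W, pH):
    """Return {occupancy tuple: probability} from Eqs. (2)-(5) by enumeration."""
    n = len(pK_int)
    weights = {}
    for occ in itertools.product((0, 1), repeat=n):
        # Signed charge numbers for this pattern: q = Q_b + O
        q = [b + o for b, o in zip(bare_charges, occ)]
        # Dimensionless work of charging, Eq. (4): (1/2) q . W q
        w_el = 0.5 * sum(q[i] * W[i][j] * q[j] for i in range(n) for j in range(n))
        # zeta^k * exp(-dmu0 . O / kBT) = 10 ** sum_i (pK_int_i - pH) * O_i
        log10_chem = sum((pk - pH) * o for pk, o in zip(pK_int, occ))
        weights[occ] = 10.0 ** log10_chem * math.exp(-w_el)
    Q = sum(weights.values())  # grand-canonical partition function, Eq. (2)
    return {occ: w / Q for occ, w in weights.items()}


# Hypothetical two-site example: one acidic site (bare charge -1), one basic site (bare charge 0)
W_toy = [[1.0, 0.5],
         [0.5, 1.0]]
probs = pattern_probabilities([4.0, 6.5], [-1, 0], W_toy, pH=7.1)
for occ, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(occ, p)
```

Setting every entry of W to zero recovers independent Henderson-Hasselbalch sites, which gives a quick consistency check.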

C. Interior dielectric model and salt exclusion zone

We now describe our model for the quantities εr(r) and κ²(r) that appear in Eq. (1). We use a simplified model in which εr(r) is assumed to be a scalar that takes a low and constant value inside the protein and a high value outside. After constructing the grand-canonical distribution, we adjusted the interior dielectric coefficient so as to best match the available experimental protein net charge vs pH titration data [73,75], as described below in Sec. II F. The value that gave the best match to these data was εr,in = 3.0. Outside the protein, we take a value experimentally determined for water at 25 °C, εr,out = 78.5 [95]. We modeled the boundary of the low dielectric to be a surface that is 1.4 Å outside the Protein Data Bank (PDB) coordinates of the appropriate atoms (PDB entry 1AMM, from Ref. [96]) and a salt-exclusion zone to extend 3.3 Å beyond this boundary, in approximate accord with hydrated radii of monovalent ions in aqueous solvent [97]. The resulting surfaces are illustrated in Fig. 2.
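The three regions described above (low-dielectric interior, salt-free water shell, and bulk electrolyte) can be summarized as a piecewise assignment. The sketch below is a crude illustration only: it assumes the low-dielectric boundary is the union of 1.4 Å spheres around atom centers, which may differ from how the authors constructed their surface, and it returns the ordinary inverse Debye length rather than the κ of Eq. (1). All names are hypothetical.

```python
import math

EPS_IN = 3.0             # interior relative dielectric coefficient (from the text)
EPS_OUT = 78.5           # water at 25 C (from the text)
DIELECTRIC_OFFSET = 1.4  # angstroms outside the atom coordinates
SALT_EXCLUSION = 3.3     # angstroms beyond the low-dielectric boundary


def region_parameters(point, atoms, debye_length):
    """Return (eps_r, inverse Debye length in 1/angstrom) at a point.

    Assumption: distance to the nearest atom center decides the region.
    """
    d = min(math.dist(point, atom) for atom in atoms)
    if d <= DIELECTRIC_OFFSET:
        return EPS_IN, 0.0    # protein interior: low dielectric, no mobile ions
    if d <= DIELECTRIC_OFFSET + SALT_EXCLUSION:
        return EPS_OUT, 0.0   # salt-exclusion zone: water without screening
    return EPS_OUT, 1.0 / debye_length  # bulk electrolyte


atoms = [(0.0, 0.0, 0.0)]  # single hypothetical atom at the origin
for x in (1.0, 3.0, 10.0):
    print(x, region_parameters((x, 0.0, 0.0), atoms, debye_length=6.0))
```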

FIG. 2.

Illustration of the dielectric and salt-exclusion zone model for bovine γ B-crystallin, based on PDB entry 1AMM [96], rotated and translated to the coordinate system used for numerical solution of Eq. (1). The dark gray netted surface is the boundary of the low-dielectric region and the light plain surface is the boundary of the salt-exclusion zone, described in the text. Larger black and white spheres and gray octahedra show titratable sites that are positive, negative, and neutral, respectively, for the most probable configuration at pH 7.0, modeled to occur about 20% of the time [see Fig. 13(a)]. Smaller dark gray spheres are locations of nonhydrogen atoms in the 1AMM structure and smaller lighter spheres are positions of heteroatoms.

D. Model for pKint values

In view of our primary present purpose of studying the nature of the probability distributions of the protonation patterns on γB-crystallin, we adopted a simple classical approach to modeling the pKint values. We start from tabulated pK values in water for relevant charged groups [98–100] and then calculate the change in the integral that, in a linear dielectric, gives the free energy stored in the electrostatic field per unit volume [101,102], (1/2)D(r) · E(r). The integral is taken over the volume outside spheres of radii r0 surrounding the group in question, when water is replaced by a heterogeneous dielectric environment like that near the surface of the protein. This approach omits a number of factors that also affect pK values, many of which call for molecular mechanics and/or quantum mechanical treatment [37,39,93,103–107]. These include hydrogen bonding [44,45,108], bound ions, and nonlinear dielectric effects [86,87] that require a different integration of E(r) · δD(r) than that which yields (1/2)D(r) · E(r) [86,101,102]. Hydrogen bonds can stabilize charged carboxylates [44], among other effects, and the presence of metal or other ions, often bound between titratable residues [109], would call for a grand-canonical formulation that involves more exchangeable components [88]. These phenomena are not modeled here. In addition, there are problems involved in characterizing dielectric response at molecular length scales [37,40,89–93], as mentioned above. A related factor not modeled here is the local electrostatic potential from strong static dipoles such as backbone and side-chain amide groups [45–47,110]. There are also solvent effects that can be studied with liquid-state theory approaches [111–113]. However, the present approach is useful as a first approximation; its value may be illustrated, for example, by its remarkable ability to help understand the dependence of salt solubilities on a solvent’s static dielectric coefficient [97].
The resulting modeled contribution to the change in pK, ΔpK, can be written as

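$$\Delta pK \,\ln(10) = \pm\left[\frac{1}{2k_BT}\int_{r>r_0} \mathbf{D}(\mathbf{r}) \cdot \mathbf{E}(\mathbf{r})\, dV - \frac{q^2}{8\pi\varepsilon_0\varepsilon_w r_0 k_BT}\right]$$
$$\text{(if in uniform solvent)} = \pm\frac{q^2}{8\pi\varepsilon_0 r_0 k_BT}\left(\frac{1}{\varepsilon} - \frac{1}{\varepsilon_w}\right). \tag{6}$$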

In Eq. (6), the + sign is appropriate for groups that become charged when they are not occupied by a proton, that is, for glutamate, aspartate, cysteine, and the terminal carboxylate, while the − sign is appropriate for groups that become charged when they are occupied by a proton, that is, for lysine, arginine, histidine, and the terminal amino group. To evaluate the needed integral in Eq. (6), as described in detail in the Appendix, we took advantage of the fact that a rough approximation to the shape of our more complicated dielectric model can be constructed by conjoining two spheres, each of radius 15.5 Å. Then we used Kirkwood’s analytical solution for the potential due to a charge placed in a low-dielectric sphere, near its surface [61]. We placed the charges a depth of 1.4 Å inside this sphere. We used Gauss’s theorem to convert the volume integral of (1/2)D · E in Eq. (6) to a surface integral over the small sphere of radius r0; symmetry then permits the needed integral to be converted to a one-dimensional integral, also given in the Appendix, that we evaluated numerically.
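The uniform-solvent limit of Eq. (6) is a Born-type estimate and is easy to evaluate. The sketch below treats the group as a point charge e inside a sphere of radius r0 moved from water into a uniform medium of relative dielectric coefficient `eps_medium`. The function name and arguments are illustrative, and this limit is not the Kirkwood two-sphere calculation the authors actually used.

```python
from math import pi, log

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
KB = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C


def born_delta_pK(r0_angstrom, eps_medium, eps_water=78.5, temperature=298.0, sign=+1):
    """Uniform-medium limit of Eq. (6); sign=+1 for acidic groups, -1 for basic groups."""
    r0 = r0_angstrom * 1e-10  # angstroms to meters
    prefactor = E_CHARGE ** 2 / (8.0 * pi * EPS0 * r0 * KB * temperature)
    return sign * prefactor * (1.0 / eps_medium - 1.0 / eps_water) / log(10.0)


# Carboxylate radius from Table I, fully immersed in a medium with dielectric coefficient 3
print(born_delta_pK(3.31, 3.0))
```

For r0 = 3.31 Å and a uniform medium with dielectric coefficient 3, this limit gives a shift of roughly 12 pK units, far larger than the +1.55 listed for aspartate in Table I. That difference is plausible because the model places each charge only 1.4 Å below the surface of the low-dielectric region, so much of the field still passes through water.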

To estimate appropriate values of an effective r0 for use in Eq. (6), we used a facility within the quantum-chemistry package GAUSSIAN09 that provides for estimating recommended radii for self-consistent reaction field calculations [114]. For each titratable side-chain group and for the terminal amino and carboxyl groups, we constructed the test molecules listed in Table I, which included the side-chain titratable group in its charged form, and calculated the r0 values in Table I from repeated runs, for which the test ions were surrounded by a medium having the static dielectric coefficient of water. We first used Hartree-Fock calculations with the 6-31G(d,p) basis set to optimize the test molecules in the presence of implicit solvent. Vibration frequency analyses were performed on the optimized structures to determine whether they represented true minima. No structure exhibited imaginary frequencies. We did not perform a conformational analysis of the test molecules or optimize them in their protein environment, for simplicity and consistent with the fact that the present model does not incorporate cross-talk between conformational changes and charge regulation, as noted in the Introduction. We performed r0 calculations at least 10 times for each of the test molecules, which yielded standard deviations (Table I) that ranged from 0.1 to 0.2 Å. For modeling the solvent, we used the GAUSSIAN09 program’s default implementation of the integral equation formulation of a polarizable continuum model. The needed pKH2O values were estimated with use of the tables given in the work of Dawson et al. [98], Ellenbogen [99], and Serjeant and Dempsey [100]. Because this work is spurred by our interest in building models for γB-γB interactions in the range 4 < pH < 8, we did not include the titration of tyrosine side chains, which typically occurs in the range 10 < pH < 10.3 [110].

TABLE I.

Estimated pKint values used (see the text). Corresponding model pKeff,α* values for individual residues are given in [122] for α* equal to the top pattern at pH 7.1.

Residue     Abbreviation  r0 ± s.d.^a (Å)  pKH2O    ΔpK    pKint,ε=3      No.
arginine    Arg(R)        3.68 ± 0.2       12.48    −1.18  11.30 ± 0.2    20
aspartate   Asp(D)        3.31 ± 0.15      3.86     +1.55  5.41 ∓ 0.2     13
cysteine    Cys(C)        3.35 ± 0.2       10.50^b  +1.50  12.00 ∓ 0.2    3
glutamate   Glu(E)        3.31 ± 0.15      4.25     +1.55  5.80 ∓ 0.2     9
N-glycine   Gly(G)        2.65 ± 0.2       7.60     −2.72  4.88 ± 0.5     1
histidine   His(H)        3.65 ± 0.2       6.00     −1.20  4.80 ± 0.2^c   5
histidine   H14                                            7.05^d         1
histidine   H53                                            6.04^d         1
histidine   H84                                            6.91^d         1
histidine   H117                                           6.37^d         1
histidine   H122                                           6.32^d         1
lysine      Lys(K)        2.65 ± 0.2       10.70    −2.72  7.98 ± 0.5     2
C-tyrosine  Tyr(Y)        3.31 ± 0.15      3.40     +1.55  4.95 ∓ 0.2     1

^a Ions used in GAUSSIAN09 calculations with H2O solvent were methylguanidinium (Arg); acetate [Asp, Glu, Y174 (carboxyl)]; ammonium [Lys, G1 (amino)]; mean of imidazolium, 4-methyl imidazolium (His); and methanethiolate (Cys).
^b See the text.
^c Value not used; see the text.
^d From PROPKA 3.1 [115–118]. Note that these values are not intended as pKint values; see Sec. II F for discussion.

The resulting pKint values and uncertainties are listed in Table I. The values calculated for the charged sites just inside the low-dielectric sphere are designated as pKint,ε =3. Here we are anticipating the fact that, as explained below, the grand-canonical distribution model was used to predict titration curves as functions of ε, which were then compared with experiment to settle on an assumed, continuum model internal static dielectric coefficient value ε = 3.

However, when carrying out this process, we found that there was a discrepancy between the modeled titration curve and the data, displayed in Fig. 4(a) below, which suggested that our modeled values for the histidine pK were lower than would be compatible with the titration curve [73] and the measured isoelectric point of bovine γB-crystallin, pH = 7.8 [74,75]. Therefore, as input to the grand-canonical simulations we instead tried using the PROPKA (version 3.1) web server estimates [115–118] to replace the initially estimated histidine pKint values for γB, again with use of the PDB entry 1AMM, while leaving all the other pKint values as estimated with the D · E integral method described above. The resulting PROPKA histidine pK estimates are listed in Table I. The comparison of the modeled titration curve with the data, for various assumptions about the inner dielectric coefficient and the histidine pK values, is described and shown in Sec. II F below, in connection with Fig. 4. As the authors emphasize [115–118], PROPKA uses a phenomenological approach to achieve speed and scope in estimating pK values for a large variety of proteins. It incorporates factors that the present approach does not, including hydrogen bonding and varying degrees of penetration of residues into the interior. For histidines PROPKA starts from a higher model pK of 6.5 than the 6.0 initially used here and it assigns smaller pK reductions to H14, H84, H117, and H122 than the Table I ΔpK of −1.20, due to their varying degrees of penetration. PROPKA also predicts that hydrogen bonds raise the pK values of H14, H53, and H84 by between 0.6 and almost 0.9 pK units.

FIG. 4.

Selection of interior dielectric coefficient and pK values through comparison of modeled titration curves with experiment. The experimental data [73] are shown by the labeled curve. The work-of-charging matrix W was calculated as described in the text, for different choices of εin. (a) Calculated titration curves when all pKint,εin values were calculated according to the D(r) · E(r) integral method described in Sec. II D, for the same εin values used for solution of Eq. (1) to yield the matrix W. (b) Calculated titration curves when all but the histidine pKint,εin values were estimated with the integral method, as functions of εin, while PROPKA 3.1 values were used for histidine (Table I) pKint values (see the text). In (b) the red (bold) εin = 3 curve is that of the model adopted for further study of the probability distributions.

The Table I cysteine pKH2O, 10.5, chosen to be near the pK values for methanethiol (10.33) and ethanethiol (10.5 and 10.61) [100], leads to pKint values of 12, well above the 9–9.5 typical of protein cysteines [110]; indeed, many proteins exhibit cysteine pK1/2 values much less than 9, due primarily to hydrogen bonding [119] and somewhat to nearby amide dipole potentials [47,119]. We did not alter the present approach given our focus on 4 < pH < 8, although better modeling of cysteine pK values is of interest, given the importance of cysteine oxidation for gamma crystallins and other lens proteins [54–56,120,121]. In particular, the present model does not attempt to model hydrogen bonding of C18 with both C78 and S20, predicted by PROPKA to lower the C18 pK to 6.88. We included only C15, C18, and C22 in our model, which appear less buried than C32, C41, C78, and C109. However, due to the high model cysteine pKint, C15, C18, and C22 were charged only rarely in simulations, at higher pH values, as tabulated in the Supplemental Material [122].

We will find it instructive to study the ability of “effective” pK values, pKeff,α*, defined below, to model the probability distributions of the protonation patterns. These effective pK values are closely related to those used in Refs. [27–29], among others; briefly, the difference is that here we study the pKeff,α* with respect to particular choices α* of on-or-off charge patterns $Q_b + O_{\alpha^*}$, as contrasted with patterns of average residue charge values at a given pH, $Q_b + \langle O_\alpha \rangle$ in the present notation, as analyzed, for example, in Ref. [29].

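In the present notation, the pKeff,α* are expressed as follows [27]. For any particular protonation pattern α on the protein, the numerator of Eq. (5) can be written

$$Q P_\alpha \equiv Q_\alpha = 10^{(\mathrm{pK}-\mathrm{pH})\cdot O_\alpha}\, e^{-q_\alpha \cdot W q_\alpha/2}, \qquad q_\alpha = Q_b + O_\alpha, \tag{7}$$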

in which $Q_b$ is the vector of bare charges of the titratable residues; pK, pH, and $q_\alpha$ also denote vectors; and the symmetric work-of-charging matrix W is defined above Eq. (4). The idea is now to use a chosen configuration $O_{\alpha^*}$, which could be, for example, the most probable configuration at a certain pH, as a reference configuration; the algebraic development given here also applies for the reference configuration choice $\langle O_\alpha \rangle$, as used in Refs. [27–29]. The probabilities of other configurations can now be expressed in terms of how much their occupancy vectors differ from that of the reference configuration. For configuration α, the occupancy vector is $O_\alpha = O_{\alpha^*} + (O_\alpha - O_{\alpha^*}) = O_{\alpha^*} + \delta O_\alpha$, where $\delta O_\alpha = O_\alpha - O_{\alpha^*}$. Letting $q_{\alpha^*} = Q_b + O_{\alpha^*}$ and $q_\alpha = q_{\alpha^*} + \delta O_\alpha$, one finds

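$$Q_\alpha = 10^{(\mathrm{pK}-\mathrm{pH})\cdot(O_{\alpha^*}+\delta O_\alpha)}\, e^{-(q_{\alpha^*}+\delta O_\alpha)\cdot W (q_{\alpha^*}+\delta O_\alpha)/2} = 10^{(\mathrm{pK}-\mathrm{pH})\cdot O_{\alpha^*}}\, e^{-q_{\alpha^*}\cdot W q_{\alpha^*}/2}\; 10^{(\mathrm{pK}-\mathrm{pH})\cdot \delta O_\alpha} \times e^{-(2\,\delta O_\alpha \cdot W q_{\alpha^*} + \delta O_\alpha \cdot W \delta O_\alpha)/2}.$$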

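The first two multiplicative factors in the above expression are common to all $Q_\alpha$. In the expression for the probabilities $P_\alpha$, these factors cancel with the same common factors in the denominator, Q. Therefore, we have $P_\alpha = Q_\alpha^{(\alpha^*)}/Q_{\alpha^*}$, in which we define $Q_\alpha^{(\alpha^*)}$ and $Q_{\alpha^*}$ via

$$Q_\alpha \equiv 10^{(\mathrm{pK}-\mathrm{pH})\cdot O_{\alpha^*}}\, e^{-q_{\alpha^*}\cdot W q_{\alpha^*}/2}\; Q_\alpha^{(\alpha^*)},$$
$$Q \equiv 10^{(\mathrm{pK}-\mathrm{pH})\cdot O_{\alpha^*}}\, e^{-q_{\alpha^*}\cdot W q_{\alpha^*}/2}\; Q_{\alpha^*}.$$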

With these definitions,

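$$Q_\alpha^{(\alpha^*)} = 10^{(\mathrm{pK}-\mathrm{pH})\cdot \delta O_\alpha}\, e^{-(2\,\delta O_\alpha \cdot W q_{\alpha^*} + \delta O_\alpha \cdot W \delta O_\alpha)/2} = 10^{(\mathrm{pK}-\mathrm{pH})\cdot \delta O_\alpha}\, e^{-(W q_{\alpha^*})\cdot \delta O_\alpha}\, e^{-\delta O_\alpha \cdot W \delta O_\alpha/2} = 10^{(\mathrm{pK}-\mathrm{pH})\cdot \delta O_\alpha}\, 10^{-(W q_{\alpha^*})\cdot \delta O_\alpha/\ln 10}\, e^{-\delta O_\alpha \cdot W \delta O_\alpha/2} = 10^{[(\mathrm{pK} - W q_{\alpha^*}/\ln 10) - \mathrm{pH}]\cdot \delta O_\alpha}\, e^{-\delta O_\alpha \cdot W \delta O_\alpha/2}. \tag{8}$$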

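With $\delta q_\alpha = q_\alpha - q_{\alpha^*} = \delta O_\alpha$, Eq. (8) can be written as

$$Q_\alpha^{(\alpha^*)} = 10^{[(\mathrm{pK} - W q_{\alpha^*}/\ln 10) - \mathrm{pH}]\cdot \delta q_\alpha}\, e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}. \tag{9}$$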

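From a comparison of Eqs. (7) and (9), it is natural to define a vector $\mathrm{pK}_{\mathrm{eff},\alpha^*}$ of pKeff,α* values by

$$\mathrm{pK}_{\mathrm{eff},\alpha^*} = \mathrm{pK} - W q_{\alpha^*}/\ln 10, \tag{10}$$

so that

$$Q_\alpha^{(\alpha^*)} = 10^{(\mathrm{pK}_{\mathrm{eff},\alpha^*} - \mathrm{pH})\cdot \delta q_\alpha}\, e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}, \tag{11}$$
$$P_\alpha = \frac{Q_\alpha^{(\alpha^*)}}{Q_{\alpha^*}} \quad \text{with} \quad Q_{\alpha^*} = \sum_\alpha Q_\alpha^{(\alpha^*)}. \tag{12}$$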

Equations (10)–(12) correspond to the combination of Eqs. (2)–(4) of Ref. [27], as expressed there in terms of a particular constellation of charges (here symbolized by the vector $q_{\alpha^*}$). As indicated above, except for its use of a particular on-off pattern α*, Eq. (10) is also related to Eqs. (1a), (1b), and (16) in Ref. [29], where the average pH-dependent approach introduced in Ref. [27] is expressed and its mean-field-approximation nature is elucidated. In this connection we note that the use of pKeff,α* below, to study its capability to approximate probability distributions of protonation patterns, has a different focus than the study of the reduced-site approximation also introduced in Ref. [29]; that approximation becomes better as the criteria to regard less-labile sites as fixed become progressively more strict. Reference [29] demonstrates that the reduced-site approximation is more effective than the mean-field approach for representing the average occupancy states of particular sites, while typically being more efficient computationally than using the exact expressions.

Equation (10) expresses the fact that for configurations that are similar to the chosen configuration α*, the effective pK values are typically biased by charges $q_{\alpha^*}$ on neighboring sites. These charges, in turn, produce voltages that bias the occupancy of a given site. Thus, at a given pH, one expects that the site occupancies can be fairly well described by pKeff,α* values for a well-chosen α*, say, the most common configuration. The extent to which this is not the case is clearly a function of the quantities $e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}$, according to Eqs. (11) and (12). We will find below that a given set of pKeff,α* values accurately represents only part of the probability distribution of the protonation patterns at a given pH, precisely because of this latter factor.
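The algebra of Eqs. (7)–(12) can be checked numerically on a small system. The following is a minimal sketch, not the code used in this work: the three-site pK values, bare charges, and matrix W are hypothetical illustration values, and the function names are introduced here only for the example. It enumerates all on-off patterns, takes the most probable one as α*, and compares the probabilities from Eq. (7) with those from Eqs. (10)–(12), and with the probabilities that would result if the factor involving δq·Wδq were dropped.

```python
import itertools
import numpy as np

LN10 = np.log(10.0)

def full_weights(pK, W, Qb, pH):
    """Unnormalized weights Q_alpha of Eq. (7) for every on-off pattern."""
    patterns = np.array(list(itertools.product([0.0, 1.0], repeat=len(pK))))
    q = Qb + patterns                                   # q_alpha = Q_b + O_alpha
    energy = 0.5 * np.einsum("ai,ij,aj->a", q, W, q)    # q.Wq/2, dimensionless
    return patterns, 10.0 ** (patterns @ (pK - pH)) * np.exp(-energy)

def reference_weights(pK, W, Qb, pH, O_ref, patterns):
    """Effective pK values of Eq. (10) and reduced weights of Eq. (11)."""
    pK_eff = pK - W @ (Qb + O_ref) / LN10               # Eq. (10)
    dq = patterns - O_ref                               # delta q = delta O
    coupling = 0.5 * np.einsum("ai,ij,aj->a", dq, W, dq)
    return pK_eff, 10.0 ** (dq @ (pK_eff - pH)) * np.exp(-coupling)

# Hypothetical three-site example (illustrative values only).
pK = np.array([6.5, 4.0, 4.4])
Qb = np.array([0.0, -1.0, -1.0])        # charges of the deprotonated sites
W = np.array([[0.0, 1.2, 0.1],
              [1.2, 0.0, 0.3],
              [0.1, 0.3, 0.0]])         # symmetric, zero diagonal
pH = 7.1

patterns, Q_alpha = full_weights(pK, W, Qb, pH)
P_full = Q_alpha / Q_alpha.sum()
O_ref = patterns[np.argmax(P_full)]     # most probable pattern used as alpha*
pK_eff, Q_red = reference_weights(pK, W, Qb, pH, O_ref, patterns)
P_ref = Q_red / Q_red.sum()             # Eq. (12)

# Dropping the coupling factor leaves independent-site probabilities.
hh = 10.0 ** ((patterns - O_ref) @ (pK_eff - pH))
P_hh = hh / hh.sum()
```

In this sketch `P_ref` reproduces `P_full` for any choice of reference pattern, whereas `P_hh` agrees only for patterns whose δq·Wδq vanishes.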

E. Calculation of the potential and work-of-charging matrix

The numerical methods we used to calculate the potential are those described previously [10]. We used grid sizes from 0.3 to 0.6 Å, the domain was 100 × 100 × 120 Å³, and the protein was placed at its center. At the domain boundary we used the Neumann condition that the normal component of the field is zero. While we expect that a more accurate boundary condition would be that a linear combination of the normal field component and the potential would be zero, for the Debye lengths investigated here, the zero-field condition suffices. To calculate the ith row of the work-of-charging matrix W, a charge is placed at site i; the potential at site j then gives the entry Wij. Each such work-of-charging matrix was symmetric, providing an important check on the calculation.
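The row-by-row construction and the symmetry check can be illustrated with a deliberately simplified stand-in for the potential. The sketch below does not solve Eq. (1): it assumes a uniform dielectric and uses the screened Coulomb (Debye-Hückel) kernel, with a Bjerrum length of 7.1 Å for water near room temperature taken as an assumption. The site coordinates and the function name are hypothetical.

```python
import numpy as np

def screened_coulomb_W(sites, debye_length, bjerrum_length=7.1):
    """Toy work-of-charging matrix in units of kT per squared elementary charge.

    Assumes a uniform dielectric; lengths are in angstroms.
    """
    n = len(sites)
    W = np.zeros((n, n))
    for i in range(n):                 # "place" a unit charge at site i
        for j in range(n):             # read off the potential at site j
            if i == j:
                continue               # diagonal entries left at zero
            r = np.linalg.norm(sites[i] - sites[j])
            W[i, j] = bjerrum_length * np.exp(-r / debye_length) / r
    return W

# Hypothetical site positions in angstroms.
sites = np.array([[0.0, 0.0, 0.0],
                  [4.0, 0.0, 0.0],
                  [0.0, 9.0, 0.0]])
W = screened_coulomb_W(sites, debye_length=6.0)
is_symmetric = np.allclose(W, W.T)     # the check described in the text
```

For this kernel the symmetry is automatic; for a numerical solution on a grid with a dielectric boundary, as used in this work, the same comparison of W with its transpose is a nontrivial test.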

We note that there are also self-energies associated with the interaction of each charge with its counterion cloud. In principle, this factor also changes the effective pK of a site, above and beyond the fact that the site is near a dielectric boundary. We calculated the magnitudes of these effects from our numerical solutions of Eq. (1), by evaluating the potential at a given charged site produced by the nearby net charge within its surrounding, screening ionic atmosphere. The magnitudes we calculated for this effect were very uniform and would produce changes in the given pK on the order of only ±0.1 pK units, which we regard as insignificant compared with the uncertainties in the modeled pKint values themselves. Accordingly, we simply set the diagonal entries of the work-of-charging matrices to 0 for further calculations.

Figure 3 illustrates a work-of-charging matrix calculated in this fashion. To find a permutation of the residue order that would yield the approximately block-diagonal forms shown in Fig. 3, we used simulated annealing, with an objective function that was linearly proportional to the distance of (the symmetric) work-of-charging entries from either the diagonal or the upper right or lower left corners. On repeated runs, this yielded a robust grouping of sites. While we grouped the sites in this fashion in order to identify patches of residues predicted by the model to be more highly correlated, we left all entries intact for computing the partition function. That is, this grouping does not represent a block-diagonal approximation method, an avenue that has been pursued by a number of investigators (see Ref. [123] and references therein).
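A reordering of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the objective is one possible reading of the description above (each entry magnitude weighted by its cyclic index distance from the diagonal, so that entries near the upper right and lower left corners are also cheap), and the swap move, the geometric cooling schedule, and all parameter values are hypothetical choices.

```python
import math
import random
import numpy as np

def ordering_cost(W, perm):
    """Sum of |W_ij| weighted by cyclic distance of (i, j) from the diagonal."""
    n = len(perm)
    idx = np.arange(n)
    d = np.abs(idx[:, None] - idx[None, :])
    d = np.minimum(d, n - d)                   # corners count as near
    Wp = np.abs(W[np.ix_(perm, perm)])         # matrix in the permuted order
    return float((Wp * d).sum())

def anneal_ordering(W, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over site orderings using pairwise swaps."""
    rng = random.Random(seed)
    n = W.shape[0]
    perm = list(range(n))
    cost = ordering_cost(W, perm)
    best_perm, best_cost = perm[:], cost
    for step in range(steps):
        T = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]    # propose a swap
        new_cost = ordering_cost(W, perm)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / T):
            cost = new_cost                    # accept
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
    return best_perm, best_cost

# Hypothetical test matrix: two interleaved groups of strongly coupled sites.
n = 6
W_toy = np.full((n, n), 0.01)
for a in range(n):
    for b in range(n):
        if a != b and a % 2 == b % 2:
            W_toy[a, b] = 1.0
np.fill_diagonal(W_toy, 0.0)

initial_cost = ordering_cost(W_toy, list(range(n)))
best_perm, best_cost = anneal_ordering(W_toy)
```

Only the order of the sites changes; the entries themselves are untouched, consistent with the statement that all entries were retained for computing the partition function.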

FIG. 3.


(a) Approximate block-diagonal form of the dimensionless work-of-charging matrix W for interior dielectric coefficient 3 and Debye length 6.0 Å. The Wij magnitude categories are as follows: white < 0.05 ≤ dark purple quarter circles < 0.1 ≤ purple half circles < 0.2 ≤ blue 3/4 circles < 0.4 ≤ green triangles < 0.8 ≤ yellow squares < 1.6 ≤ orange pentagons < 3.2 ≤ red circles. The matrix includes all 54 titratable residues used in the present model, which as noted in the text omits the tyrosine residues and four of the cysteine residues; the entire protein contains 174 residues. Designations for the 54 residues considered alternate between left and right (and top and bottom) margins. (b) Cylinders with radii proportional to Wij mapped onto the PDB 1AMM structure of γ B-crystallin. The Wij magnitude categories are as follows: 0.4 ≤ green cylinders with three gaps < 0.8 ≤ yellow cylinders with two gaps < 1.6 ≤ orange cylinders with one gap < 3.2 ≤ red cylinders. (c) Lambert projections with potential and charges indicated as in Fig. 1(c). Groups of titratable sites participating in approximate blocks of W are circled in black and numbered in (c) and indicated by black squares in (a). Group numbering corresponds to the order of sites in W in Fig. 3(a), from top to bottom, and group numbers are those to which Figs. 7 and 8 refer. In (b) and (c) the protonation configuration is that modeled to be the most common one at pH 7.1.

Figure 3(a) displays an approximate block-diagonal form of the work-of-charging matrix W, for the adopted inner dielectric value εin = 3.0. The symbol code for Wij magnitude categories is given in the caption, ranging from white for entries less than 0.05kBT/e to red circles for entries greater than or equal to 3.2kBT/e. The prominent entries adjacent to the main diagonal show a high degree of charge pairing, long noted to occur for γ-crystallins [124]. The 0.05 lower cutoff is close to the value below which we observed very little change in the order of probabilities of the protonation patterns, if smaller entries were ignored [see Fig. 10(a)]. Residue identities are indicated on the borders of Fig. 3(a). A perspective view of the work-of-charging matrix of Fig. 3(a) is given in Fig. 2 of the Supplemental Material [122]. Figure 3(b) displays the work-of-charging entries in the form of line segments that link the titratable groups on the protein, using the same symbol code as in Fig. 3(a). Figure 3(c) shows labeled sets of titratable sites, circled in black, that participate in approximate blocks of W, with use of the same projection as in Fig. 1(c). The corresponding blocks are outlined by the thick black squares in Fig. 3(a). In addition, prominent charge pairs are circled in purple (lighter) in Fig. 3(c). Tables of the work-of-charging matrices we calculated for Debye lengths 6, 12, and 20 Å are given in Figs. 6–11 of the Supplemental Material [122].

FIG. 10.


(a) Dependence on dimensionless work-of-charging cutoff level of the six most common proton configurations at pH 7.1 and Debye length 6.0 Å. At each cutoff level on the horizontal axis, all of the entries in W below the given level were set to zero and the probabilities of the configurations were recalculated using Eq. (5). At this pH, the order of the configurations is stable up to a cutoff level of 0.07, which is shown by the vertical dashed line. (b) Cumulative distribution function of the work-of-charging matrix entries (see the text). The left vertical line in (b) corresponds to the cutoff level of 0.07 indicated in (a). The right vertical line is at 1. (c) Changes in pKeff,α values of the indicated histidine residues, whose protonation switches produce the top-ranked configurations shown in (a). Above a cutoff of 0.07, the H84 pKeff,α value changes are the primary reasons for the configuration probability changes in (a) (see the text).

F. Calculation of the grand-canonical distribution function and the protonation pattern probabilities

We performed Metropolis Monte Carlo simulations that included all 54 sites of the present model to determine the grand-canonical partition function (GCPF) and the associated statistics of the distributions of protons on the protein. Protonation pattern statistics were studied using Monte Carlo runs of 10⁸ iterations. We determined GCPF vs pH in 0.1 pH increments by finding top protonation configuration probabilities in 10⁶ iteration runs and using Eq. (5) with that configuration’s ΔGα. While the results given here were calculated from the simulations, it is convenient to note that in the Monte Carlo simulations, many of the residues, primarily the arginines with the highest pKeff,α* values, never changed their occupation states at some of the pH values in the range of primary interest here, 4–8, or did so very few times, even in 10⁸ iterations. A table of the number of times each residue switched protonation state, as a function of pH, and a table that includes individual pKeff,α* values appear as Figs. 4 and 5, respectively, in the Supplemental Material [122]. Therefore, to speed calculations, it can be convenient to omit such residues from calculation of the partition function, as was done in the reduced sites approximation of Ref. [29], and with fewer titratable sites, about 25 or 30, the model partition function Q can be evaluated exactly. By either method, once Q is known, the probability of protonation pattern α is then given by Eq. (5). Likewise, the average number of protons ⟨n⟩ on a protein can be found from

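$$\langle n \rangle = \frac{\zeta}{Q}\frac{\partial Q}{\partial \zeta} = \frac{\partial \ln Q}{\partial \ln \zeta}, \tag{13}$$

in which $\zeta = 10^{-\mathrm{pH}}$.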
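For a small number of sites, Eq. (13) can be verified against a direct average over patterns. The sketch below is illustrative only: it uses exact enumeration rather than Monte Carlo sampling, it takes the pattern weights in the form of Eq. (7), and the three-site parameter values and function name are hypothetical. Because ln ζ = −pH ln 10, the derivative in Eq. (13) is evaluated here as a central finite difference in pH.

```python
import itertools
import numpy as np

LN10 = np.log(10.0)

def ln_Q_and_mean_n(pK, W, Qb, pH):
    """Exact ln Q and <n> by enumeration, with weights as in Eq. (7)."""
    O = np.array(list(itertools.product([0.0, 1.0], repeat=len(pK))))
    q = Qb + O
    ln_w = LN10 * (O @ (pK - pH)) - 0.5 * np.einsum("ai,ij,aj->a", q, W, q)
    shift = ln_w.max()                       # guard against overflow
    w = np.exp(ln_w - shift)
    ln_Q = shift + np.log(w.sum())
    mean_n = float((w / w.sum()) @ O.sum(axis=1))
    return ln_Q, mean_n

# Hypothetical three-site example (illustrative values only).
pK = np.array([6.5, 4.0, 4.4])
Qb = np.array([0.0, -1.0, -1.0])
W = np.array([[0.0, 1.2, 0.1],
              [1.2, 0.0, 0.3],
              [0.1, 0.3, 0.0]])
pH, h = 5.0, 1e-4

_, n_direct = ln_Q_and_mean_n(pK, W, Qb, pH)
ln_Q_plus, _ = ln_Q_and_mean_n(pK, W, Qb, pH + h)
ln_Q_minus, _ = ln_Q_and_mean_n(pK, W, Qb, pH - h)

# d ln Q / d ln zeta = -(1 / ln 10) d ln Q / d pH, since ln zeta = -pH ln 10.
n_from_eq13 = -(ln_Q_plus - ln_Q_minus) / (2.0 * h) / LN10
```

Repeating the calculation over a grid of pH values would give a model titration curve for the toy system.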

FIG. 5.


Protonation patterns that have opposite net protein charge readily occur at pH 7.1. In both panels, log10 P is plotted vertically for the most prominent pH 7.1 configurations that together account for over 97% of the configuration probability. (a) The net protein charge of each configuration is plotted horizontally. Line segments join configurations that can be transformed into one another with a single-residue protonation switch. (b) (i) The horizontal coordinate of the yellow-striped–clear boundary is the sum of the probabilities of that configuration and more common ones, that is, their cumulative probability. (ii) The horizontal coordinate of the blue–yellow-striped boundary is the square of the same cumulative probability. The blue–yellow-striped boundary estimates the fraction of pairs of neighboring proteins, both molecules of which have one of the configurations down to a given log10 P level; this estimate neglects biasing of pattern probabilities due to protein proximity.

Titration curves calculated in this manner are shown in Fig. 4. The experimental data [73] are shown by the black curve in each panel. These data were obtained with use of an aqueous 100 mM potassium chloride solvent, corresponding to a Debye length of 9.6 Å, the value we therefore used in the 54 solutions of Eq. (1) for each choice of εin, to generate the matrices W needed for the comparisons shown in Fig. 4. Figure 4(a) shows the calculated titration curves when the pKint,εin values were calculated according to the D·E integral method described above. Note that in this case both the work-of-charging matrix W resulting from application of Eq. (1) and the pKint,εin values resulting from the first two lines of Eq. (6) are functions of the interior dielectric coefficient value εin and were calculated as input to the GCPF simulations for the values εin = 2, 4, 8, and 12. Therefore, the calculated pKint,εin values relevant to the curves in Fig. 4(a) are not those listed in Table I for εin = 3. In Fig. 4(a) each test model titration curve predicts a lower isoelectric point (pI) than that observed experimentally for γ B-crystallin, pI = 7.8 for the native protein [74,75], though in Ref. [74] a minor component was also observed at a lower pI of 7.3, a component that was sensitive to the presence of reducing agents [74]. Also, note that bovine γ B-crystallin was termed γ-II at the time of publication of Refs. [74,75].
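The quoted Debye length can be reproduced from the standard expression for a 1:1 electrolyte, λ = [ε0 εr kB T / (2 NA e² I)]^(1/2). The following sketch assumes a temperature of 298.15 K and a relative permittivity of 78.4 for water; neither value is stated in the text, and the function name is introduced here for illustration.

```python
import math

def debye_length_angstrom(ionic_strength_molar, temperature_k=298.15, eps_r=78.4):
    """Debye screening length in angstroms (SI constants, 2019 values)."""
    e = 1.602176634e-19          # elementary charge, C
    k_B = 1.380649e-23           # Boltzmann constant, J/K
    N_A = 6.02214076e23          # Avogadro constant, 1/mol
    eps0 = 8.8541878128e-12      # vacuum permittivity, F/m
    ionic_strength = ionic_strength_molar * 1000.0   # mol/L -> mol/m^3
    lam = math.sqrt(eps0 * eps_r * k_B * temperature_k
                    / (2.0 * N_A * e**2 * ionic_strength))
    return lam * 1e10            # m -> angstrom

lambda_100mM = debye_length_angstrom(0.100)   # 100 mM KCl, a 1:1 salt
```

Under these assumptions the result is close to 9.6 Å, and the inverse square-root dependence on ionic strength shows how the 6 to 20 Å range studied here maps onto salt concentration.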

In addition to the purpose of studying the probability distributions of the protonation patterns, we have a goal of modeling small-angle neutron scattering data from γ B-crystallin solutions in the pH range between 4.5 and 7.1, and as a preliminary step want to create a charge-regulation model in a pH range that spans these values and reproduces the observed isoelectric point. Therefore, as described above, we used the PROPKA estimates for the needed histidine pKint values, while continuing to use the D · E integral procedure for the other residues. Figure 4(b) shows the resulting model titration curves. Although there is clearly a range of εin values that could be used and there is room for improvement, the highlighted red curve, with W and nonhistidine pK values generated using εin = 3.0, provides a relatively good match to the experimental titration curve in the range 4 < pH < 8 of particular interest and we took it to be sufficient for studying the general nature of the protonation pattern probability distributions. We did so despite the fact that the PROPKA estimates include a model of charge-charge interactions [117] and therefore are not intended to be intrinsic pKint values as they are used here.

We anticipate that as NMR assignment and titration data become available for γ B-crystallin, it will become possible to test and refine the present model in much more detail. Accordingly, we postponed detailed study of using different histidine pKH2O values, which are expected to depend on their tautomeric states [125], in addition to the possible hydrogen bonding, dipolar potential, and other effects mentioned above.

The value εin = 3.0 of the model we use here is compatible with calculations of continuum-model static dielectric coefficients of 2–4 for interior regions of many proteins and with measurements of dry protein powders [90,126–129]. The quoted range is approximate, depending on the protein and the method of calculation, and represents an ongoing area of investigation, as noted above [92]. Recent analyses of NMR chemical shifts within proteins, in particular their dependence on modeled local electric fields, found that values of εin near 3 gave the best matches to data [130,131].

III. PROBABILITY DISTRIBUTIONS OF PROTONATION PATTERNS

A. Features of the distributions at constant pH

In this section we address the following questions. How broad are the distributions at a given pH? How different are these distributions from the multinomial distribution that would occur if the off-diagonal work-of-charging entries were all zero? What simple approximations provide good quantitative agreement with the exact model probabilities? How different are the patterns of surface voltage that correspond to probable protonation patterns? In order to study these questions, in Figs. 5, 7, and 8 we plot the base-10 logarithm of the modeled probability of each protonation pattern vertically vs its net charge. In each figure, the line segments join two configurations that differ by a single switch in proton occupancy. We call such configurations adjacent. In addition to giving a visual picture of the protonation pattern probabilities and the possible single-step transitions between them, these and related diagrams can help to study how pattern probabilities are distributed with respect to factors that can affect protein-protein interactions, here net charge.

FIG. 7.


(a) Because H117 is only weakly linked to other residues [see Fig. 3(c), group 11], when H117 switches charge (red short-dashed lines), the eight-vertex polygon representing the possible switches of H122 (purple solid lines), H84 (blue dash-dotted lines), and H14 (green long-dashed lines) undergoes translation with very little distortion (see the text). (b) Because E120 and H122 interact strongly, when E120 changes from charge −1 to 0 the H122 switching segments markedly change slope, distorting the same polygon. (c) Further analysis of the changes in (b), by comparing the full model probabilities with those of a Henderson-Hasselbalch approach. Agreement would correspond to all points being on the solid diagonal line. The comparison illustrates that the E120 switch markedly alters some probabilities from Henderson-Hasselbalch values (see the text).

FIG. 8.


(a) Most common 32-vertex polygon representing possible switches of all five histidine residues at pH 7.1; H53 switches (orange dotted lines) are shown as well as the H122 (purple solid lines), H84 (blue dash-dotted lines), and H14 (green long-dashed lines) depicted in Fig. 7. (b) The 32-vertex polygon of histidine switches that occurs under the condition that E7, a neighbor of H14, has gained a proton to become neutral. The positively charged state of H14, which had been stabilized by a neighboring negative charge, is now less probable than its neutral state and the green long-dashed segments have negative slopes, while the others retain their slopes. Note the change in the vertical scale. (c) Comparison of probabilities from the full model and a Henderson-Hasselbalch approach for the topmost 32-vertex polygon in (a) (black squares), using α* = 1, and the choices α* = 1 (blue closed circles), α* = 4 (red open circles), and α* = 48 (purple open squares) for the polygon in (b). The diagonal solid line is that of agreement between the two methods.

Figure 5 shows that γ B protonation patterns that have both positive and negative net protein charge readily occur at pH 7.1. Figure 5(a) shows the most prominent pH 7.1 configurations that together account for 97% of the configuration probability.

In Fig. 5(b) the cumulative probability down to a given level is plotted horizontally as the curve on the far right, the boundary of the striped yellow region. This curve, taken together with Fig. 5(a), shows that each of the topmost 70% of the configurations has a non-negative net charge. However, below that level quite a few pH 7.1 configurations have net negative charge. Because oppositely charged proteins are more likely to exhibit attractive interactions, Fig. 5 gives rise to the interesting possibility that at high concentrations, where proteins have many near neighbors, the probability distributions of net charge may even become bimodal. In this work we do not analyze the biasing of the distributions because of protein proximity.

The square of the cumulative probability is filled in blue (dark) in Fig. 5(b). It provides an estimate of the fraction of pairs of neighboring proteins, both molecules of which have a configuration with a probability above a given level. This estimate again neglects biasing of probabilities due to protein proximity. The blue-yellow boundary suggests that close pairs of proteins, both of which have net non-negative charge, will account for only the top half of neighboring protein pairs. Also, to account for about 80% of the configuration pair types, configurations that range down to those that occur only one one-thousandth of the time must be included. Thus the blue-yellow boundary gives a rough guide to how many configurations to include in a model of electrostatic interactions for this protein.
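The two boundaries in Fig. 5(b) follow from a short calculation on the ranked pattern probabilities. The sketch below is illustrative: the function names and the three-entry probability list are hypothetical, and, as in the text, the squared cumulative probability neglects any biasing of pattern probabilities by protein proximity.

```python
import numpy as np

def cumulative_and_pair_fraction(probabilities):
    """Rank patterns by probability; return cumulative and squared cumulative."""
    p = np.sort(np.asarray(probabilities, dtype=float))[::-1]
    cumulative = np.cumsum(p)
    return p, cumulative, cumulative ** 2

def patterns_needed(pair_fraction, target):
    """Smallest number of top-ranked patterns whose pairs reach `target`."""
    return int(np.searchsorted(pair_fraction, target) + 1)

# Hypothetical pattern probabilities (not taken from Table II).
p, cumulative, pair_fraction = cumulative_and_pair_fraction([0.3, 0.5, 0.2])
count = patterns_needed(pair_fraction, 0.6)
```

Because the pair fraction is the square of the cumulative probability, covering 80% of pairs requires patterns whose cumulative probability reaches about 0.89, which is why far less common configurations must be included.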

Figure 6 compares the voltage patterns around the 12 most probable proton configurations at pH = 7.1 and Debye length 6 Å. Residues that have gained or lost protons, with respect to the next more common configuration, are shown by blue (darker) and red (lighter) arrows, respectively. At this pH, histidine protonation switches are modeled to account for the first 20 patterns. It is very interesting that as a consequence of the majority of these switches, the connectivity of the positive [blue (darker)] and negative [red (lighter)] potential regions on the projection spheres also changes, much like straits and isthmuses in continental drift. Thus one might expect that in the presence of neighboring proteins that also have charged patches, the ease of reorientation of each protein could depend on voltage channels that open and close, as each of their protonation configurations changes. The similarity of many of the voltage patterns that result from different protonation patterns, illustrated in Fig. 6, suggests that larger classes of such pairs may be sufficient for creating accurate models of the relevant pair potentials. Thus a very interesting question is how best to construct a good coarse-grained level of detail in the protonation pattern distributions in order to model protein interactions accurately. In the present work we do not focus on the protein interaction consequences of the patterns shown in Fig. 6.

FIG. 6.


Lambert projections, with potentials and charges indicated as in Fig. 1(c), for the 12 most probable configurations at pH = 7.1 and Debye length 6 Å, in order of probability (see Table II). Residues that have gained or lost a proton, with respect to the more common configuration that is adjacent in order, are shown by blue (darker) and red (lighter) arrows, respectively. For many protonation switches, positive [blue (darker)] and negative [red (lighter)] voltage regions change connectivity.

We now study the origins of the switching pattern shown in Fig. 5(a) in more detail, in a residue-by-residue manner. Each line segment in Fig. 5(a) can be identified with the particular residue that gained or lost a proton. The probabilities of protonation patterns reflect both the affinity of each residue for protons and the correlations between sites that are strongly affected by their mutual electrostatic interaction.

The quantitative consequences are illustrated in Fig. 7. If two residues are uncorrelated, as are H122 and H84, the change in the pattern probability when one of them switches protonation state will not depend on the state of the other. Because their occupation probabilities are essentially independent, when H122 changes its charge, the logarithm of the pattern probability will change by a given amount that does not depend on whether H84 is protonated. Thus, the slope of the line segment that links H122-adjacent patterns (purple solid line) will not depend on the state of H84 and vice versa. In contrast, if two sites are strongly correlated, their protonation probabilities are no longer independent and the corresponding slopes that link adjacent configurations will depend on the protonation of the second residue.

Consider Figs. 7(a) and 7(b). In Fig. 7(a), because residue H117 is uncorrelated with residues H122, H84, and H14, protonating H117 simply translates (red short-dashed lines) the line segments for switches of the three other residues. In Fig. 7(b), because E120 is in residue group 12, as is H122 [see the lower right corner of Fig. 3(c)], H122-adjacent pattern probabilities change in different ways that depend on the state of E120.

If the work-of-charging matrix were diagonal, the distribution of protonation patterns would be multinomial and a translation-without-distortion property would hold exactly for all line segments in a diagram such as the ones in Figs. 5 and 7 and, below, in Fig. 8. Thus the deviations from congruence of residue-switch polygons in the coordinates (net protein charge, log10 P) display the degree to which parts of the protonation pattern distribution differ from multinomial. We note that in the present work the choice of net charge on the horizontal axis underlies this polygon translation property, simply because the net charge is here assumed to be solely due to protonation switches. Clearly, if ion absorption played a significant role or if other coordinates were used in place of or in addition to net charge, such as the percentage of the surface that has a positive voltage, a more complex picture would result.
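The translation property can be seen in a two-site toy calculation. The sketch below is illustrative only, with hypothetical pK values, bare charges, and coupling; it uses pattern weights of the form of Eq. (7) and computes the change in log10 P when one site gains a proton, for each state of the other site. Since the normalization Q is common to all patterns, differences in log10 P equal differences in the log10 weights.

```python
import numpy as np

LN10 = np.log(10.0)

def log10_weight(O, pK, W, Qb, pH):
    """log10 of the Eq. (7) weight for occupancy vector O."""
    q = Qb + O
    return float(O @ (pK - pH) - 0.5 * (q @ W @ q) / LN10)

def switch_step(site, other_state, pK, W, Qb, pH):
    """Change in log10 P when `site` gains a proton in a two-site model."""
    other = 1 - site
    lo = np.zeros(2)
    lo[other] = other_state
    hi = lo.copy()
    hi[site] = 1.0
    return log10_weight(hi, pK, W, Qb, pH) - log10_weight(lo, pK, W, Qb, pH)

# Hypothetical pair: a histidine-like site and a carboxylate-like site.
pK = np.array([6.5, 4.0])
Qb = np.array([0.0, -1.0])
pH = 7.1
W12 = 1.2
W_coupled = np.array([[0.0, W12], [W12, 0.0]])
W_uncoupled = np.zeros((2, 2))

# Step for site 0, with site 1 deprotonated (0) or protonated (1).
free = [switch_step(0, s, pK, W_uncoupled, Qb, pH) for s in (0.0, 1.0)]
coupled = [switch_step(0, s, pK, W_coupled, Qb, pH) for s in (0.0, 1.0)]
slope_shift = coupled[1] - coupled[0]
```

With no coupling the two steps are identical, so the segments translate without distortion; with coupling they differ by −W12/ln 10, which is the kind of slope change seen for H122 when E120 switches.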

Figure 7(c) focuses on the configurations in the lower, eight-vertex polygon in Fig. 7(b). It illustrates the deviation of the probabilities calculated from the full model from those of the Henderson-Hasselbalch approximation, in a log-log plot. The blue closed circles compare these two probabilities using the topmost configuration as the reference (α* = 1) for calculating pKeff,α* values in the Henderson-Hasselbalch approximation. In this case the Henderson-Hasselbalch probability of configuration 72, which is found by a direct single-proton switch from configuration 1, agrees with the probability predicted by the full model, together with the configurations in its attached, translated blue-green polygon, namely, 117, 154, and 242 [see Fig. 7(b)], while the other four configurations (27, 38, 47, and 78) have probabilities that are 10 times those predicted by the Henderson-Hasselbalch model. In contrast, if α* = 2, the open red circles show that the two methods of estimating probability agree for configurations 27, a direct switch from 2, together with 38, 47, and 78, while the other four no longer agree. Because E120 is in the same group as H122, there is no reference state for which all of the pattern probabilities can be computed with use of the Henderson-Hasselbalch approach. Indeed, the factors $e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}$ in Eqs. (10)–(12), in which the vectors $\delta q_\alpha$ depend on both α and α*, together with the existence of nonzero off-diagonal elements of W, imply that, in general, some full GCPF pattern probabilities will differ from Henderson-Hasselbalch ones, regardless of the choice of α*. Some individual residue protonation probabilities must then also differ from Henderson-Hasselbalch values. This can occur whether or not the residue is charged as it is in α*; this can be shown by expressing individual residue protonation probabilities as sums of the $P_\alpha$ of Eq. (12) over the appropriate patterns α, the key point being that each summand can carry a different factor of $e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}$.
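The role of this factor can be made explicit in a short worked step. This is a sketch under the assumption that the Henderson-Hasselbalch probabilities are the independent-site probabilities built from the effective pK values of Eq. (10) alone; the symbols $P_\alpha^{\mathrm{HH}}$ and $Q^{\mathrm{HH}}$ are introduced here for illustration and do not appear elsewhere in the text.

$$P_\alpha^{\mathrm{HH}} = \frac{10^{(\mathrm{pK}_{\mathrm{eff},\alpha^*} - \mathrm{pH})\cdot \delta q_\alpha}}{Q^{\mathrm{HH}}}, \qquad Q^{\mathrm{HH}} = \sum_\alpha 10^{(\mathrm{pK}_{\mathrm{eff},\alpha^*} - \mathrm{pH})\cdot \delta q_\alpha},$$

$$\frac{P_\alpha}{P_\alpha^{\mathrm{HH}}} = \frac{Q^{\mathrm{HH}}}{Q_{\alpha^*}}\, e^{-\delta q_\alpha \cdot W \delta q_\alpha/2}, \qquad \log_{10} P_\alpha = \log_{10} P_\alpha^{\mathrm{HH}} + \log_{10}\frac{Q^{\mathrm{HH}}}{Q_{\alpha^*}} - \frac{\delta q_\alpha \cdot W \delta q_\alpha}{2 \ln 10}.$$

Here $Q_{\alpha^*}$ is the normalization of Eq. (12). The prefactor is the same for every pattern, so under this assumption all patterns that share a value of $\delta q_\alpha \cdot W \delta q_\alpha$ would fall on a common line of unit slope in a log-log comparison, displaced from the line of agreement by an amount set by that value.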

Figure 8 is a larger-scope version of the translating polygons picture. In the present γB-crystallin model, at pH 7.1 the switching of protonation states of the five histidines accounts for a large fraction of the topmost protonation configurations of the entire protein. Thus for this pH it is interesting to construct sets of 32-vertex polygons, in which the vertices represent all of the 2⁵ = 32 histidine protonation patterns that occur for a given configuration of all the other residues. Figure 8(a) shows the topmost such polygon.
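The vertex and segment counts of such a polygon can be enumerated directly. The sketch below is illustrative; it lists the on-off patterns of the five histidines named in Figs. 7 and 8 and the pairs of patterns that differ by a single protonation switch, which are the adjacent configurations joined by line segments.

```python
import itertools

HISTIDINES = ("H14", "H53", "H84", "H117", "H122")

# Every on-off pattern of the five histidines: 2**5 vertices.
patterns = list(itertools.product((0, 1), repeat=len(HISTIDINES)))

# Adjacent patterns differ in the protonation state of exactly one residue.
edges = [
    (a, b)
    for a, b in itertools.combinations(patterns, 2)
    if sum(x != y for x, y in zip(a, b)) == 1
]

def switched_residue(a, b):
    """Name of the histidine whose state differs between adjacent patterns."""
    (index,) = [k for k, (x, y) in enumerate(zip(a, b)) if x != y]
    return HISTIDINES[index]

n_vertices, n_edges = len(patterns), len(edges)
```

Each of the five residues accounts for 16 of the segments, one for every state of the other four histidines.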

The entire probability distribution of protonation patterns can be represented as the family of all such 32-vertex polygons; each possible pattern belongs to just one such polygon. Figure 8(b) illustrates the distortion of the topmost polygon that results when residue E7, strongly coupled to H14, switches protonation. It is now harder for H14 to become protonated, which is reflected in the fact that the slopes of the green long-dashed segments that represent the H14 protonation switches become smaller; in this case they go from positive to negative. As in Fig. 7, Fig. 8(c) shows that when E7 switches, the Henderson-Hasselbalch approach does not work well for the resulting polygon, even though it does work well for the topmost polygon. The choice α* = 48, suggested by the fact that it is the topmost configuration in Fig. 8(b), produces a linear arrangement that is parallel to but displaced from the line of agreement.

Figure 9 compares full model configuration probabilities with those calculated using pKeff,α values, at pH 4.5 and 7.1. Figures 9(a), 9(b), 9(d), and 9(e) show that a large number of the configurations have quite different probabilities from those calculated using pKeff,α values alone, due to linkage between groups of titratable residues. Because the protein becomes more highly charged as pH is decreased (see Fig. 4), it is natural to expect that the broader distribution of probabilities relative to the Henderson-Hasselbalch approximation may be associated with this increased net charge. Also, there might be more charged residues at the lower pH, which might bias the probabilities from Henderson-Hasselbalch values.

FIG. 9.


(a) Percentage deviation of the Henderson-Hasselbalch probabilities of the top-ranked 200 configurations from those of the full model at pH 7.1. Panels (a)–(c) all use α* = 1 for pH 7.1 to determine pKeff,α values for use in calculating Henderson-Hasselbalch probabilities. (b) Histogram of the same deviations, for the top-ranked 1000 configurations at pH 7.1. (c) A log-log comparison of the top-ranked 1000 configuration probabilities from the full model with the Henderson-Hasselbalch probabilities. Note the clustering along diagonal lines. (d) Similar percentage deviations of the top-ranked 200 configurations at pH 4.5. Panels (d)–(f) all use α* = 1 for pH 4.5 to determine pKeff,α values for use in calculating Henderson-Hasselbalch probabilities. (e) Histogram of the deviations at pH 4.5. (f) A log-log comparison of the top-ranked 1000 configuration probabilities; note the changed scales.

However, the Henderson-Hasselbalch probabilities in Fig. 9 already incorporate the influence of the most common charge patterns because they use the top configuration α* appropriate for each pH to construct the needed pKeff,α values, via Eq. (10). Thus, the existence of residues that are charged differently at the two pH values is not, by itself, sufficient to account for the broader distribution in Fig. 9(d), as compared with that in Fig. 9(b).

Also, for the very top configurations at each pH, fewer residues, 43, are modeled as charged at pH 4.5 (28 positive, 15 negative, 11 neutral, net charge +13) than at pH 7.1, where 48 are charged (25 positive, 23 negative, 6 neutral, net charge +2). Thus, positively and negatively charged residues, taken together, make for a larger total number of charges at pH 7.1, despite the fact that the net charge is lower at pH 7.1. This situation is physically reasonable because it corresponds mainly to the fact that at pH 4.5, eight glutamate and aspartate residues that carried negative charges at pH 7.1 are neutral, which can readily occur because the pH is much closer to their pKeff,α values; H53, H117, G1, and Y174 also change charge. Thus the smaller number of titratable groups that carry charge (positive or negative) at pH 4.5, as compared with pH 7.1, depends on the set of pKeff,α values [see Fig. 15(a) herein and Fig. 5 in the Supplemental Material [122]], combined with the bare charge numbers. This is not directly connected with the fact that the deviation from the Henderson-Hasselbalch distribution is greater at pH 4.5.

FIG. 15.

(a) The pK1/2 values from simulations of the full model with interactions, on the horizontal axis, are close to the pKeff,α values. Red dots result from taking α* to be the most prominent protonation configuration within 6.6 < pH < 7.3; pK1/2 values farther from 6.6 < pK1/2 < 7.3 differ from those pKeff,α, as expected. Blue open circles result from taking α* to be the most prominent configuration within 4.4 < pH < 4.6; again pK1/2 values farther from that of α* differ more from those of pKeff,α. (b) Except for H53 and H122 below their respective pK1/2, histidine titration curves from the full model (solid line) also agree well with Henderson-Hasselbalch curves (dashed line) as parametrized by the pKeff,α values of the α* used in (a).

Rather, Eqs. (11) and (12) indicate that the broader width of the distribution of protonation pattern probabilities must arise from the switches δqα of charge patterns from that of α* that contribute significantly to the factors $e^{-\delta q_\alpha W \delta q_\alpha/2}$. More specifically, the broader width of the protonation pattern probabilities relative to the Henderson-Hasselbalch approximation at pH 4.5, as compared with that at pH 7.1, is due to Glu and/or Asp residue pairs in the same work-of-charging group [see Fig. 3(c)]. Frequent charge switches of these residues at pH 4.5 produce the most probable protonation patterns, while at the same time their work-of-charging linkages bias pattern probabilities away from Henderson-Hasselbalch ones. At pH 7.1, histidine residue switches produce the most probable patterns, but because each histidine is in a different work-of-charging group, pattern probabilities more closely track the Henderson-Hasselbalch approximation. Figures 9(c) and 9(f) show log-log plots similar to those in Figs. 7(c) and 8(c) for the top-ranked 1000 configurations. At pH 7.1, the deviations cluster along lines parallel to the diagonal line of agreement. At pH 4.5 this clustering feature is less clear; the scale was expanded to make it apparent.

The polygons linking protonation patterns shown in Figs. 7 and 8 suggest that the use of effective pK values holds both value and danger. If the effective pK values were to be considered as fixed, they would not account for the lack of independence of the protonation pattern probabilities that is represented graphically by the distortion of the polygons shown. Nevertheless, as suggested by Figs. 9(c) and 9(f), one might accurately model the probability distributions with use of judicious choices of a changing set of base configurations α* for calculating effective pK values according to Eq. (10).

How large do off-diagonal parts of the work-of-charging matrix need to be before they significantly affect the probability distribution of protonation configurations, at a given pH? Figure 10 examines the sensitivity of the probabilities of the topmost few configurations to the omission of elements of the work-of-charging matrix that are smaller than chosen cutoff levels. Figure 10(a) shows that at pH 7.1, the order of the top-ranked six configurations is stable up to a cutoff level of only 0.07, a level that is shown by the vertical dashed line. Such a dimensionless work-of-charging level corresponds to an electrostatic potential φ, produced at one member of a pair of titratable groups by the other, charged member, of 0.07kBT/e, or approximately 2 mV. This rather small value to which the ranking of protonation patterns is sensitive occurs for two principal reasons, which are illustrated in Figs. 10(b) and 10(c), respectively.

First, while many of the entries in W are quite small, there are many such entries. To quantify this, Fig. 10(b) shows the cumulative distribution function of the work-of-charging matrix entries. While for 0.07 < Wij < 1, individual titratable site pairs have relatively little effect on one another, a large number of such pairs occurs; 1133 entries are less than 0.07, 262 entries are between 0.07 and 1, and 36 entries are more than 1.

Second, and more specifically, the cutoff values depend on how close the pH is to one or more pKeff,α values. At the pH illustrated, the titration of histidine residues is modeled to account for the relative prominence of the top-ranked configurations, as discussed above and shown by the polygons in Figs. 7 and 8. Further, the agreement between the Henderson-Hasselbalch probabilities and those of the full model for the topmost 32-vertex polygon, shown in Fig. 8(c), suggests that the changes shown in Fig. 10(a) should correspond to changing pKeff,α values.

This is borne out by Fig. 10(c), which shows how the pKeff,α values of the five histidines change as the work-of-charging cutoff value is increased from 0.01 to 10, all at a pH of 7. The 0.07 level is again shown by the vertical dashed line. At cutoffs lower than 0.07, the pKeff,α values show small fluctuations much like those of a random walk, a feature that corresponds to the large number of small work-of-charging entries below this level, shown in Fig. 10(b). The ranking of configuration probabilities [shown in Fig. 10(a)] consequently remains stable until the net result of these fluctuations overcomes the difference between two neighboring pKeff,α values. This occurs just beyond the 0.07 cutoff level, when the H84 pKeff,α crosses below that of H122. As a result, the H84 pKeff,α is now closer to the ambient pH 7.1 and its deprotonation would now be modeled as more probable than that of H122. In terms of the configuration probability polygon in Fig. 7(a), the H84 segments will now be less positively sloped than those of H122. Such a change corresponds precisely to the fact that the configurations initially ranked 2 and 3 switch their order in Fig. 10(a) just above cutoff level 0.07. It is also consistent with the fact that configurations 5 and 6 also switch their rankings at a very similar cutoff level. Further comparison shows that the prominent migration of the H84 pKeff,α value with increasing cutoff level, shown in Fig. 10(c), is largely responsible for the further configuration ranking changes shown in Fig. 10(a). Finally, Fig. 10(c) shows that at higher cutoff levels, many additional switches occur, until the cutoff level is so high that it is larger than any off-diagonal values.

In summary of the implications of Fig. 10, off-diagonal elements of the work-of-charging matrix that are quite small in the dimensionless units eϕ/kBT can nevertheless change the ranking of protonation configuration probabilities. Also, as larger and larger off-diagonal elements are set to zero, a random-walk-like migration of pKeff,α values provides an approximate accounting for the ranking changes of the top configurations, whose probabilities are well represented by the pKeff,α values at this pH.
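As a rough numerical check of the figures quoted in this discussion, the following sketch converts a dimensionless work-of-charging level to millivolts, tallies the off-diagonal site pairs for 54 titratable sites, and shows one way a cutoff could be applied. It assumes T = 298 K and assumes that diagonal entries are always retained; the function names and the 3 × 3 matrix are hypothetical illustrations, not the authors' code or data.

```python
K_B = 1.380649e-23          # Boltzmann constant, J/K (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge, C (exact SI value)


def level_to_millivolts(level, temperature_k=298.0):
    """Convert a dimensionless level e*phi/(kB*T) to a potential in mV."""
    return level * K_B * temperature_k / E_CHARGE * 1e3


def count_site_pairs(n_sites):
    """Number of distinct off-diagonal pairs in a symmetric n x n matrix."""
    return n_sites * (n_sites - 1) // 2


def apply_cutoff(w, cutoff):
    """Copy of W with off-diagonal entries of magnitude below cutoff set to 0.

    Keeping the diagonal untouched is an assumption made for this sketch.
    """
    n = len(w)
    return [[w[i][j] if i == j or abs(w[i][j]) >= cutoff else 0.0
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # 0.07 kB*T/e expressed in millivolts (the text quotes roughly 2 mV)
    print(round(level_to_millivolts(0.07), 2))
    # 54 sites give 54*53/2 pairs; compare with the three quoted bins
    print(count_site_pairs(54), 1133 + 262 + 36)
    # Hypothetical symmetric matrix, for illustration only
    toy_w = [[2.0, 0.05, 0.3], [0.05, 1.5, 1.2], [0.3, 1.2, 1.8]]
    print(apply_cutoff(toy_w, 0.07))
```

The three bins quoted for Fig. 10(b) add up to the total number of distinct site pairs, which is a useful consistency check on any reimplementation of the cutoff scan.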

B. The pH dependence of protonation pattern distributions

The modeled probability distributions of protonation configurations show a marked dependence on pH, which we now study. By way of introduction, Fig. 11 shows how the screened potential contours change with pH for the most common protonation patterns, those occurring at pH = 7.1 [Fig. 11(a), as in Fig. 1(a)], pH = 6.5 [Fig. 11(b)], pH = 5.0 [Fig. 11(c)], and pH = 4.5 [Fig. 11(d)]. The contour values displayed are for +kBT/e V (blue with horizontal curves), 0 V [gray with curves as in Fig. 1(a)], and −kBT/e V (red with vertical curves). In each case the Debye length is 6 Å. Prior experimental results, to be analyzed and reported with the help of the model being developed here, led to the choice of pH values for Fig. 11. Specifically, at pH 7.1, 6.5, and 5.5, at a Debye length of 6 Å, we observe reversible liquid-liquid phase separation in concentrated γB-crystallin solutions, strongly suggesting attractive net protein-protein interactions. However, we see no phase separation at pH 4.5, and at this pH small-angle neutron scattering indicates repulsive interactions.

FIG. 11.

Screened potential contours of γB-crystallin, for the most common protonation patterns at (a) pH = 7.1 (as in Fig. 1), (b) pH = 6.5, (c) pH = 5.0, and (d) pH = 4.5. The Debye length is 6.0 Å; contour values are +kBT/e V (blue with horizontal curves), 0 V [gray with curves as in Fig. 1(a)], and −kBT/e V (red with vertical curves); dielectric and electrolyte boundaries are designated as in Fig. 1(a). Note the changing balance between positive and negative voltage regions with pH and an accompanying shrinkage of zero-voltage contours, most of which, at pH 4.5, extend to less than a Debye length from the protein.

It is interesting that in this context the balance between the positive and negative voltage regions is fairly even at pH 7.1 and pH 6.5, while in contrast the positive regions progressively dominate at pH 5.0 and pH 4.5. Further, the zero potential contours extend far from the protein at the upper three pH values shown, but collapse to inside or near the protein at pH 4.5. In combination with the findings mentioned above, Fig. 11 suggests that with the more even balance of positive and negative surface regions modeled at the higher pH values, neighboring proteins may readily bias their orientations so that oppositely charged surface patches can face one another and interact so as to produce net attractive forces. However, if the balance between positive and negative surface regions becomes skewed beyond that corresponding to Fig. 11(c), net repulsive forces can result. Figure 11(d) illustrates that at pH 4.5 the majority of the protein surface is positive. At this pH, the angular-averaged interprotein interactions may be relatively insensitive to changes in the particular configuration of protons. It is important to note that a quantitative model will also need to include dispersion forces and hard-core interactions, at least.

Figure 12 shows log10 P vs net protein charge for configurations in the modeled distributions at pH 5.5 and pH 4.5, together with their single-protonation switch line segments, accompanied by the pH 7.1 distribution shown in Fig. 5. As pH decreases within this range, there is a substantial spread of net charge and the topmost configuration becomes considerably reduced in probability, reaching below 1 part in a thousand at pH 4.5.

FIG. 12.

Plot of log10 P vs net charge, with line segments indicating single-residue protonation switches, for protonation patterns that occur at pH 7.1 (black solid line, the same as in Fig. 5), pH 5.5 (purple dash-dotted line), and pH 4.5 (blue dotted line) (see the text).

Figure 13 illustrates summary statistics of the configuration probability distributions vs pH. Figure 13(a) shows that near neutral pH the distributions are relatively narrow for this protein; for example, one of the top 100 patterns is expected to occur about 90% of the time. Because $2^7 = 128$, this corresponds to on the order of seven sites switching their protonation status. As discussed in connection with Fig. 5, even though the distribution is relatively narrow near pH 7.0, Fig. 13(a) implies that a large number of pairs of patterns may be needed in order to model electrostatically mediated interactions between the proteins. The needed number of pairs can be estimated from the figure. For example, assuming for the purpose of illustration that neighboring patterns do not bias each other’s probabilities, it would mean that considering (100 × 101)/2 distinct pairs of patterns would enable one to model a fraction 0.9 × 0.9 of the pairs that contribute to the effective interaction strength. The distributions are much broader at lower pH values; the pH 4.5 curve in Fig. 13(a) shows that at that pH, one of the first 1000 patterns will be present only 20% of the time. Figure 13(b) shows the contours of the cumulative probabilities of the top sets of patterns at each pH, in the [pH, log10(number of configurations)] plane.
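The pair-counting estimate in this paragraph can be sketched as follows, under the same illustrative assumption that the patterns on two neighboring proteins are statistically independent. The helper names are hypothetical.

```python
def unordered_pairs(n_patterns):
    """Distinct unordered pairs of patterns, counting a pattern paired with itself."""
    return n_patterns * (n_patterns + 1) // 2


def pair_coverage(cumulative_probability):
    """Fraction of pair weight captured when each of two independent proteins
    shows one of the retained patterns with the given cumulative probability."""
    return cumulative_probability ** 2


if __name__ == "__main__":
    # Top 100 patterns, present about 90% of the time near neutral pH
    print(unordered_pairs(100), pair_coverage(0.9))
    # Seven independent two-state sites already give more than 100 patterns
    print(2 ** 7)
```

Retaining 100 patterns per protein therefore requires on the order of five thousand pair calculations per relative orientation while still leaving about a fifth of the pair weight unaccounted for.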

FIG. 13.

(a) Cumulative probabilities of the most probable protonation patterns at a Debye length of 6.0 Å. For example, the dots on the pH 7.0 curve show that under these conditions, the top γB occupancy pattern occurs nearly 20% of the time, one of the first 10 patterns will be present 60% of the time, and one of the top 100 patterns occurs 90% of the time. The pH 4.5 curve shows that one of the first 1000 patterns will be present 20% of the time. (b) Contours of the cumulative probabilities displayed in the [pH,log10(number of configurations)] plane. For example, the 0.99 contour indicates that at pH ≈ 6.8, 1000 configurations account for 99% of the probability.

Figure 14(a) shows the pH dependence of the probabilities of patterns that are each the top pattern within some interval of pH. To understand these probabilities more thoroughly, consider any pattern α that has a specified number kα = n of protons bound. Such a pattern has the probability

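$$
\begin{aligned}
P_n &= \frac{\zeta^{n}\, e^{-(\Delta\mu^{0}\cdot O_\alpha)/k_BT}\, e^{-W_{\mathrm{el},\alpha}/k_BT}}{Q} = \frac{\zeta^{n} B(O_\alpha)}{Q},\\
\log_{10} P_n &= \log_{10} B(O_\alpha) - n\,\mathrm{pH} - \log_{10} Q,
\end{aligned}
\tag{14}
$$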

in which $\zeta = 10^{-\mathrm{pH}}$ and B(Oα) denotes a Boltzmann factor for occupancy vector Oα; B(Oα) includes the intrinsic pK values as well as the work-of-charging contribution. Note that all of the pH dependence in the last line of Eq. (14) occurs in the last two terms; the partition function Q in the final term depends on pH through ζ. Therefore, for a given value of n, all of the curves of log10 Pn vs pH are simply vertically displaced with respect to one another, because they differ only due to the quantities log10 B(Oα). This feature is illustrated in Fig. 3(a) in the Supplemental Material [122]. The nearly parabolic shapes in the coordinates (pH, log10 Pn) correspond to nearly Gaussian shapes when Pn is plotted vs pH, as shown in Fig. 14(b) for the top-ranked 12 configurations at pH 7.1; these are the configurations illustrated in Fig. 6. In Fig. 14(c) we show the partition function in the form log10 Q in the range 4 < pH < 8. In this pH range, log10 Q can be represented well by cubic or quartic polynomials, specifically $\log_{10} Q = 474.27 - 86.407\,\mathrm{pH} + 7.414\,\mathrm{pH}^2 - 0.30492\,\mathrm{pH}^3$ ($1 - \text{adjusted } R^2 = 6 \times 10^{-6}$) and $\log_{10} Q = 545.64 - 136.56\,\mathrm{pH} + 20.392\,\mathrm{pH}^2 - 1.7713\,\mathrm{pH}^3 + 0.061101\,\mathrm{pH}^4$ ($1 - \text{adjusted } R^2 = 5 \times 10^{-7}$), respectively. Such fits can be convenient for estimating Q in Eq. (5) or (14) for protonation pattern probabilities. The fit residuals in Fig. 14(c) illustrate the degree of error to be expected in using such a fit for Q and show their polynomial appearance; such correlation of residuals can be detected using, for example, the Durbin-Watson statistic. Physically, such an appearance is to be expected given that Q is a polynomial in ζ of essentially higher order than 4, because more than four prominent overall protonation numbers occur in 4 < pH < 8 [see Eq. (2) and Fig. 14(a)]. A perspective view of the joint dependence of pattern probabilities on net charge and pH is given in Fig. 3(b) in the Supplemental Material [122].
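A minimal sketch of how such fits might be used follows. The coefficients are the quoted cubic and quartic fit coefficients, read here with alternating signs; that reading is an assumption, supported by the fact that the two polynomials then agree closely across 4 < pH < 8. The sketch also uses the identity ⟨n⟩ = −d(log10 Q)/d(pH), which holds if Q is the normalizing sum of the weights ζ^n B(Oα) with ζ = 10^(−pH). Function names are hypothetical.

```python
# Coefficients in ascending powers of pH (signs assumed to alternate)
CUBIC = (474.27, -86.407, 7.414, -0.30492)
QUARTIC = (545.64, -136.56, 20.392, -1.7713, 0.061101)


def polyval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def polyder(coeffs):
    """Coefficients of the derivative, in ascending powers."""
    return tuple(k * c for k, c in enumerate(coeffs))[1:]


def mean_bound_protons(coeffs, ph):
    """Estimate <n> = -d(log10 Q)/d(pH) from a polynomial fit of log10 Q."""
    return -polyval(polyder(coeffs), ph)


def log10_pattern_probability(log10_b, n_protons, ph, coeffs=QUARTIC):
    """Last line of Eq. (14), with log10 Q taken from the fit."""
    return log10_b - n_protons * ph - polyval(coeffs, ph)


if __name__ == "__main__":
    for ph in (4.5, 6.0, 7.1):
        print(ph, polyval(CUBIC, ph), polyval(QUARTIC, ph),
              mean_bound_protons(QUARTIC, ph))
```

Because log10 Pn depends on the difference of two large numbers, small errors in the fitted log10 Q translate directly into multiplicative errors in Pn, which is why the size of the fit residuals matters.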

FIG. 14.

(a) The pH dependence of the probabilities of patterns that are each the top pattern in some interval of pH. (b) The Pα vs pH for the top 12 configurations at pH 7.1 have nearly Gaussian distributions with respect to pH. (c) The log10 Q from the present model, determined using Monte Carlo simulations ($10 \times 10^{6}$ samples per pH, in 0.1 pH steps). The inset shows the residuals to two of the fits, which for the quartic fit in this pH interval have a range of about 1 part in 4000 of log10 Q.

The finding that the pKeff,α values are useful for predicting the ranking of configurations suggests that it is interesting to compare them with the pK1/2 values calculated using the full model. Figure 15 makes such a comparison, with use of pKeff,α values that take α* to be the top-ranked configuration for 6.6 < pH < 7.3 (red closed circles) and to be the top-ranked configuration for 4.4 < pH < 4.6 (blue open circles). Figure 15(a) shows that for the histidines that are modeled to titrate near neutral pH, the pKeff,α values are indeed almost exactly equal to the corresponding pK1/2 values. This is to be expected from the agreement shown by the black squares in Fig. 8(c). It is instructive to compare the order in which histidine residues first switch to the difference between pH 7.1 and their respective pKeff,α values. From Fig. 6 or from Table I, the order of switching is H122, H84, H14, H117, and H53, consistent with the corresponding |pKeff,α − pH| values of 0.19, 0.29, 0.44, 0.57, and 0.80.
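The link between these |pKeff,α − pH| values and the switching order can be illustrated with a short sketch. For an independent site, the ratio of protonated to deprotonated probability is 10^(pK − pH), so flipping a site out of its more probable state multiplies a configuration probability by 10^(−|pK − pH|). The sketch applies that factor to the top-ranked probability 0.165 listed in Table II and compares the result with the tabulated probabilities of the single-switch configurations. It is an independent-site (Henderson-Hasselbalch) illustration only, not the full model, and the names are hypothetical.

```python
TOP_PROBABILITY = 0.165  # rank-1 configuration at pH 7.1 (Table II)

# |pKeff - pH| values quoted in the text
ABS_PK_MINUS_PH = {"H122": 0.19, "H84": 0.29, "H14": 0.44,
                   "H117": 0.57, "H53": 0.80}

# Table II probabilities of configurations that differ from rank 1
# by a single histidine switch (ranks 2, 3, 4, 6, and 10)
TABLE_SINGLE_SWITCH = {"H122": 0.106, "H84": 0.084, "H14": 0.060,
                       "H117": 0.045, "H53": 0.026}


def switch_factor(abs_pk_minus_ph):
    """Probability ratio for flipping one independent site out of its likelier state."""
    return 10.0 ** (-abs_pk_minus_ph)


def predicted_probability(switched_residues):
    """Independent-site estimate for a configuration reached from rank 1."""
    p = TOP_PROBABILITY
    for residue in switched_residues:
        p *= switch_factor(ABS_PK_MINUS_PH[residue])
    return p


if __name__ == "__main__":
    order = sorted(ABS_PK_MINUS_PH, key=ABS_PK_MINUS_PH.get)
    print(order)
    for residue in order:
        print(residue, predicted_probability([residue]),
              TABLE_SINGLE_SWITCH[residue])
    # Rank 5 in Table II switches both H84 and H122 (tabulated 0.054)
    print(predicted_probability(["H84", "H122"]))
```

The close agreement for these histidine switches is what one would expect when the switching residues sit in different work-of-charging groups; it should not be assumed for coupled residues.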

Figure 15(a) also shows that for residues whose pKeff,α values are further from the range for which the chosen α* configuration is appropriate, the pK1/2 and pKeff,α differ more strongly. Thus, when using pKeff,α as a tool for estimating experimental pK1/2 values, it is important to choose α* configurations that are prominent, and representative, in a pH range that ideally includes the pK1/2 in question. We note that because 1 < pH < 12 in the simulations used to create Fig. 15(a), residues with model pK1/2 values outside this range are not shown; in addition to 12 of the arginines and the three cysteines, these included D72 and D107.

Figure 15(b) shows that, except for H53 and H122 below their respective pK1/2 values, the histidine titration curves from the full model agree well with Henderson-Hasselbalch curves, as expected because they are in different work-of-charging groups [Fig. 3(c)]. Thus, although many protonation configuration probabilities are not well predicted by a Henderson-Hasselbalch approach, this may not show up prominently in the titration curves of selected residues.

C. Dependence of the distributions on ionic strength

The possible effects of ionic strength on solutions of γB-crystallin and other proteins are very interesting, in that one expects that they will depend on both the balance and shapes of the negative and positive voltage surface regions. On one hand, if attractive interactions are in part created by protein orientations that put negative and positive surface regions of neighboring proteins face-to-face to some degree, lowering ionic strength would be expected to increase attractions, because then the negative and positive regions would affect one another over a larger range of protein separations. On the other hand, if the net electrostatic portion of the interaction is repulsive, lowering ionic strength would be expected to increase the repulsion. In this context it is interesting to see how large the effects of ionic strength are on the distribution of protonation patterns, which could also play some role in mediating such effects.

Figures 16(a) and 16(b) show that the off-diagonal work-of-charging entries, as expected, can increase substantially as ionic strength is lowered. Figure 16(c) shows the corresponding changes in the configuration distribution at pH 7.1. In these coordinates the changes appear modest for the case illustrated. However, the effect is nevertheless evident; it is to make the most prominent configurations slightly more probable, at the expense of some of the less probable configurations. Figure 16(d) shows this in summary fashion. Note the crossover between the changes shown by the top-ranked configurations, whose probabilities generally increase (blue curve above black curve), at the expense of lower-ranked configurations, whose probabilities generally decrease, though not without exception.

FIG. 16.

Lowering ionic strength, corresponding to increasing the Debye length λD from (a) 6 Å to (b) 20 Å, increases off-diagonal work-of-charging entries and makes the top configurations more prominent while suppressing others, as shown in (c) and (d). In (a) and (b) Wij magnitude codes are as in Fig. 3(a). A 1:1 electrolyte in water at 298 K corresponds to ionic strengths of (a) 257 mM and (b) 23.1 mM. In going from (a) to (b), in all categories but the top (red circles), entries above 0.05kBT increase in number with increased λD. (c) Plot of log10 P vs net charge, as in Fig. 5. Black dash-dotted lines show λD = 6 Å and blue solid lines λD = 20 Å. (d) Plot of log10 P for the top 20 configurations; note the changed vertical scale. The black dashed line shows λD = 6 Å and the blue solid line λD = 20 Å. The contrast between changes in top- and lower-ranked probabilities is shown by the crossing of the dashed and solid curves.
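The correspondence between Debye length and ionic strength quoted in this caption can be reproduced with the standard Debye-Hückel expression for a 1:1 electrolyte, λD² = ε0 εr kB T / (2 NA e² I). The sketch below assumes a relative permittivity of 78.4 for water at 298 K; that value and the function name are assumptions, not taken from the text.

```python
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
N_A = 6.02214076e23         # Avogadro constant, 1/mol


def ionic_strength_mM(debye_length_angstrom, temperature_k=298.0, eps_r=78.4):
    """Ionic strength (mM) of a 1:1 electrolyte that gives the requested Debye length.

    eps_r = 78.4 is an assumed relative permittivity of water near 298 K.
    """
    lam = debye_length_angstrom * 1e-10  # convert to meters
    # lambda_D**2 = eps0*eps_r*kB*T / (2*N_A*e**2*I), with I in mol/m^3
    i_mol_per_m3 = (EPS0 * eps_r * K_B * temperature_k
                    / (2.0 * N_A * E_CHARGE ** 2 * lam ** 2))
    return i_mol_per_m3  # mol/m^3 is numerically equal to mM


if __name__ == "__main__":
    for lam in (6.0, 20.0):
        print(lam, round(ionic_strength_mM(lam), 1))
```

Because ionic strength scales as 1/λD², the roughly threefold increase in Debye length from 6 Å to 20 Å corresponds to about an elevenfold reduction in ionic strength.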

D. Possible implications for protein-protein interactions

As discussed in the Introduction, our primary purpose here is to provide part of a basis for further investigation of the molecular properties that determine the magnitude of interactions between γB- and related γ-crystallin and other eye lens crystallin proteins, investigation that we hope can eventually achieve sufficient detail to provide for quantitative, predictive modeling of the origin of the cataractogenic effects of single-residue mutations. Because many known cataractogenic mutations of γ-crystallins involve changes of residue charge, it is natural to study the protonation configuration probability distributions in detail. While many models of orientation-dependent protein-protein interactions have been developed at various levels of coarse graining [3–6,11,132–134], some of which incorporate charge regulation, including models for lysozyme interactions [46,132,135,136] and for gamma crystallin interactions [52,88,137], achieving the degree of fine graining for the more predictive modeling needed in many contexts remains an outstanding challenge [19].

Although a quantitative investigation of the consequences of the present model for how site-specific chemical changes influence interactions is not the focus of the present work, we nevertheless comment here on three features that illustrate the scope of the problem. These include (i) the small fraction of the pairs of configurations accounted for by each choice of individual protonation configurations in neighboring proteins, even if those choices are each the top-ranked choice, (ii) the six-dimensional space of the relative positions of two neighboring proteins, and (iii) the biasing of protonation configuration probabilities because of protein proximity [10], a biasing that is itself a function in that same six-dimensional space. Of course, additional relative position and orientation dimensions are needed if the concentration is high enough so that clusters of more than two neighboring proteins are needed to represent the situation adequately. We now briefly discuss each of these features.

First, Fig. 17 shows the calculated voltage contours around two neighboring γB-crystallin molecules, at pH 7.1 and Debye length 6 Å. Each of these molecules has been given the most common protonation configuration, that illustrated in Fig. 1. Note that the zero voltage contours appear dramatically altered from those surrounding the isolated protein in Fig. 1(a). Yet the corresponding pair of proton occupancy patterns accounts for only about 0.165 × 0.165 = 0.0272 of the contributions to the protein-protein interactions. An illustration of interaction contributions by common pairs of patterns is given in Fig. 1 of the Supplemental Material [122].

FIG. 17.

Screened voltage contours around neighboring γB-crystallin molecules, at pH 7.1 and Debye length 6 Å. The voltage contour surface designations are as in Fig. 1(a), except that the curves on the 0 V contour surface are spaced by 12 Å from the calculation box center. Each molecule has the top-ranked protonation configuration.

Now, for each chosen pair of protonation configurations, the space of relative orientations of the two proteins has five dimensions, two for each protein to choose the surface points that are in closest proximity and one more for the relative twist about the line joining their centers. Radial separation gives a sixth dimension. Among the many choices of how to visualize the space of possible relative orientations, one shown in Fig. 18 is to make a projection so as to be able to plot the voltages around both protein surfaces above and below two planes and to represent the space of possible proximities by the collection of all line segments or arrows that join pairs of points, one from each plane. Twist can then be added as a position along each line segment, if desired. In Fig. 18, a few such line segments are drawn that indicate connections that could correspond to strong electrostatic attractions between neighboring γ-crystallins, for the most common pair of protonation configurations, shown at left. While the few connections shown in Fig. 18 simply join positive to negative peaks, nonpeak locations can also show electrostatic attractions, depending on the twist angle. With use of calculations that consider the possible relative orientations, one can find prominent basins of attraction and saddle points in the five- or six-dimensional space and illustrate these by points on the appropriate connection lines.
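One possible way to parametrize and sample this six-dimensional space is sketched below: two angles select the facing surface point on each protein, one angle gives the twist about the line of centers, and one length gives the center-to-center separation. The class and function names, the separation bounds, and the choice of uniform sampling in separation are all hypothetical simplifications for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class RelativePose:
    theta1: float      # polar angle of the facing point on protein 1 (rad)
    phi1: float        # azimuthal angle of the facing point on protein 1 (rad)
    theta2: float      # polar angle of the facing point on protein 2 (rad)
    phi2: float        # azimuthal angle of the facing point on protein 2 (rad)
    twist: float       # rotation about the line joining the centers (rad)
    separation: float  # center-to-center distance (angstroms)


def random_surface_angles(rng):
    """Angles of a point drawn uniformly on the unit sphere."""
    theta = math.acos(rng.uniform(-1.0, 1.0))  # cos(theta) uniform in [-1, 1]
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return theta, phi


def random_pose(rng, r_min, r_max):
    """Draw one relative pose; separation is sampled uniformly for simplicity."""
    theta1, phi1 = random_surface_angles(rng)
    theta2, phi2 = random_surface_angles(rng)
    twist = rng.uniform(0.0, 2.0 * math.pi)
    separation = rng.uniform(r_min, r_max)
    return RelativePose(theta1, phi1, theta2, phi2, twist, separation)


if __name__ == "__main__":
    rng = random.Random(0)
    # Separation bounds here are placeholders, not values from the model
    print(random_pose(rng, 30.0, 60.0))
```

Fixing the separation recovers the five-dimensional orientation space described in the text; an interaction energy evaluated on such samples could then be searched for basins of attraction.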

FIG. 18.

Visualization of the sets of relative orientations that could give electrostatic attractions between neighboring γ-crystallins. Lambert projections, as in Fig. 3(c) and described there, of the electrostatic potential at about one-half Debye length from two copies of the most common pH 7.1 protonation configuration are shown on the left. Voltages on the same surfaces are plotted vertically on the right, above and below the Lambert projections, with positive up. While the arrows here simply join positive to negative peaks, nonpeak locations can also attract, depending on twist angle.

Returning now to Fig. 17, protein proximity dramatically alters the surrounding voltage zero contours, as mentioned above. As a consequence the protonation pattern probability distribution will now reflect between-protein, off-diagonal elements of an enlarged work-of-charging matrix. The existence of these elements means that the joint configuration probabilities can only be approximately represented by the products of probabilities of the individual configurations of hypothetical isolated proteins. As a consequence, the study of the probability distributions of protonation patterns becomes much more intricate for close protein neighbors. Figure 19 gives an example in which changing the relative orientations of two neighboring proteins alters the expanded, two-protein work-of-charging matrix. This example illustrates that the expanded matrices are now functions of the six-dimensional space of the relative positions of the two proteins. Figure 19 also shows that only a small portion of the protein-protein blocks in the expanded matrices differ substantially from zero, which suggests that a perturbation approach might accurately represent the resulting joint probability distributions. To create Fig. 19, we streamlined the needed calculation by using a simpler dielectric boundary than that used above, which consisted of two conjoined, interpenetrating low-dielectric spheres, and by omitting the three cysteine residues that are incorporated in the 54 titratable residues considered above.
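The statement that joint configuration probabilities are only approximately products of isolated-protein probabilities can be illustrated with a deliberately minimal toy model: two proteins, each with a single site that is either uncharged (q = 0) or charged (q = 1), and one between-protein coupling w in units of kBT. The weights, the sign convention (w > 0 penalizes both sites being charged), and the names are assumptions for this sketch and are unrelated to the 54-site model.

```python
import math
from itertools import product


def joint_probabilities(a1, a2, w):
    """Joint probabilities of (q1, q2) with weights a1**q1 * a2**q2 * exp(-w*q1*q2)."""
    weights = {(q1, q2): (a1 ** q1) * (a2 ** q2) * math.exp(-w * q1 * q2)
               for q1, q2 in product((0, 1), repeat=2)}
    z = sum(weights.values())
    return {state: weight / z for state, weight in weights.items()}


def isolated_probability(a, q):
    """Probability of charge state q for an isolated single-site protein."""
    return (a ** q) / (1.0 + a)


if __name__ == "__main__":
    a1, a2 = 1.0, 1.0  # hypothetical statistical weights of the charged state
    for w in (0.0, 0.5):
        joint = joint_probabilities(a1, a2, w)
        independent = isolated_probability(a1, 1) * isolated_probability(a2, 1)
        print(w, joint[(1, 1)], independent)
```

With w = 0 the joint probability factorizes exactly; any nonzero between-protein coupling breaks the factorization, and a weak coupling changes it only slightly, which is the regime in which a perturbation approach would be expected to work.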

FIG. 19.

Changing the relative orientations of two neighboring proteins alters the expanded, two-protein work-of-charging matrix. The voltage contour surface designations in (a) and (c) are the same as in Fig. 1(a), except that the curves on the 0 V contour surface are spaced by 12 Å from the calculation box center. The dielectric surfaces are outlined by white longitude and latitude curves. (b) and (d) Same color code as in Fig. 3. (b) and (d) Upper left and lower right squares show the within-protein recalculated work-of-charging matrix entries, while the off-diagonal squares show the between-protein entries. (a) and (c) Different sets of residues are adjacent to one another, as indicated, producing differences in the corresponding work-of-charging matrices shown in (b) and (d), respectively. The order of residues is the same for each protein in (b) and (d), but differs from that in Fig. 3.

Note that close protein proximity can be quite common even at rather low concentrations compared with those that occur in the living eye lens, which can range into the hundreds of milligrams per milliliter. For example, in a square-well model of the phase behavior of γB-crystallin [12], Monte Carlo simulations using parameters that gave the closest fit to the observed critical temperature and concentration (a square-well width over diameter of 0.25 and square-well depth of 1.267kBT) indicate that even at a concentration of 0.5 mM protein, the mean-field estimate of the average number of contacts per particle [Eq. (30) in [12]] is 0.2; that is, a given protein will already have an essentially close neighbor about 20% of the time. This concentration, which for γB-crystallin is close to 10.5 mg/ml, corresponds to a volume fraction of 0.0074, a small fraction of its critical volume fraction of 0.18–0.20, and small compared with estimates of the macromolecular volume fraction in living cells [13], which range from 0.07 to 0.40. Thus one expects altered protonation probability distributions, due to molecular proximity, to contribute substantially to the thermodynamics of protein and other macromolecular solutions within living cells.
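The concentration figures in this paragraph are mutually consistent, as the short check below suggests. It only back-calculates the molar mass and specific volume implied by the quoted numbers; neither derived value is stated in the text, and the function names are hypothetical.

```python
def implied_molar_mass(mass_conc_mg_per_ml, molar_conc_mM):
    """Molar mass (g/mol) implied by a mass concentration and a molar concentration."""
    # mg/ml equals g/L and mM equals mmol/L, so the ratio is g/mmol
    return mass_conc_mg_per_ml / molar_conc_mM * 1000.0


def implied_specific_volume(volume_fraction, mass_conc_mg_per_ml):
    """Specific volume (ml/g) implied by a volume fraction and a mass concentration."""
    return volume_fraction / (mass_conc_mg_per_ml / 1000.0)


if __name__ == "__main__":
    molar_mass = implied_molar_mass(10.5, 0.5)
    specific_volume = implied_specific_volume(0.0074, 10.5)
    print(molar_mass, specific_volume)
    # Mass concentrations implied for the quoted critical volume fractions
    for phi_c in (0.18, 0.20):
        print(phi_c, phi_c / specific_volume * 1000.0)
```

The implied values, about 21 kg/mol and about 0.70 ml/g, are in the range expected for a small globular protein, and the critical volume fractions map onto concentrations of a few hundred milligrams per milliliter.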

IV. CONCLUSION

We have used the linearized Debye-Hückel approximation to model the probability distributions of protonation patterns on bovine γB-crystallin as functions of pH. The breadth of the probability distributions indicates that a very large number of pairs of such patterns will be needed in order to account for how the distribution of protonation patterns affects γB-γB interactions. The key to such an analysis will be to understand not simply the distribution of protonation patterns of a single protein, but rather the probability distribution of protonation patterns, spatial variations of electrostatic potential, and consequent electrostatic interaction free energies present on pairs and larger tuples of neighboring proteins, as functions of their relative positions and orientations.

Accurate, angle-dependent potential of mean force models are needed to provide a sound molecular basis for understanding the statistical thermodynamics and the liquid structure of protein solutions [111–113] and the corresponding dramatic effects of mutations, post-translational modifications, and solution environment on protein phase separation and aggregation in solution. The present model is a step towards building an accurate angle-dependent model of electrostatic contributions to the potential of mean force for γ-crystallin interactions. Clearly, such a model will also need to encompass other aspects of protein-protein interactions not considered here, including dispersion interactions, the hydrophobic effect, and hydration forces.

TABLE II.

Occupancies of the residues that switch protonation states in the top ten configurations at pH 7.1 (see also Fig. 6). Residues switch in the order H122, H84, H14, H117, and H53, consistent with increasing $|\mathrm{p}K_{\mathrm{eff},\alpha} - 7.1|$ (see the text).

                 Configuration rank
Residue        1      2      3      4      5      6      7      8      9      10
H14            1      1      1      0      1      1      0      1      0      1
H53            0      0      0      0      0      0      0      0      0      1
H84            1      1      0      1      0      1      1      1      0      1
H117           0      0      0      0      0      1      0      1      0      0
H122           1      0      1      1      0      1      0      0      1      1
Probability    0.165  0.106  0.084  0.060  0.054  0.045  0.039  0.031  0.030  0.026
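The switching order stated in the caption of Table II can be read off the table mechanically. The following sketch transcribes the table and orders the residues by the first rank at which each occupancy differs from its value in the top configuration; that reading of "switching order" is an assumption made here for illustration, and the names are hypothetical. It also sums the listed probabilities to show how much of the distribution the top ten configurations cover.

```python
# Occupancies (1 = protonated, 0 = deprotonated) transcribed from Table II.
TABLE_II = {
    "H14":  [1, 1, 1, 0, 1, 1, 0, 1, 0, 1],
    "H53":  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "H84":  [1, 1, 0, 1, 0, 1, 1, 1, 0, 1],
    "H117": [0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
    "H122": [1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
}
PROBABILITY = [0.165, 0.106, 0.084, 0.060, 0.054,
               0.045, 0.039, 0.031, 0.030, 0.026]


def first_switch_order(table):
    """Order residues by the first rank at which they differ from rank 1."""
    first_rank = {}
    for residue, occupancy in table.items():
        for rank, value in enumerate(occupancy, start=1):
            if value != occupancy[0]:
                first_rank[residue] = rank
                break
    return sorted(first_rank, key=first_rank.get)


switch_order = first_switch_order(TABLE_II)
top_ten_coverage = sum(PROBABILITY)

print("switching order:", ", ".join(switch_order))
print(f"probability covered by the top ten configurations: {top_ten_coverage:.3f}")
```

With the table as transcribed, the order comes out as H122, H84, H14, H117, H53, in agreement with the caption, and the ten listed configurations together account for somewhat less than two thirds of the probability.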

Acknowledgments

We thank Matthew Lynn for assistance with the use of GAUSSIAN09 and Hassler Thurston for advice about the simulated annealing used to permute the work-of-charging matrix. We thank anonymous referees for their careful readings of the paper, questions, and suggestions that led to clarification. Research reported in this publication was supported by the National Eye Institute of the National Institutes of Health (USA) under Award No. R15EY018249. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the National Institutes of Health (USA).

APPENDIX: EVALUATION OF ELECTROSTATIC ENERGY INTEGRALS AND CORRESPONDING pK SHIFTS

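Consider a single charge on the z axis located at (0,0,z0). We wish to compute $\frac{1}{2}\int \mathbf{D}\cdot\mathbf{E}\,dV$ over the unbounded volume V exterior to a small sphere (neighborhood) of radius R, centered at the charge. To do so, we make use of the linearized Poisson-Boltzmann equation

$$\nabla\cdot[\varepsilon(\mathbf{x})\nabla u(\mathbf{x})]=\varepsilon_{\mathrm{out}}\,\kappa^{2}(\mathbf{x})\,u(\mathbf{x})-\rho(\mathbf{x}).$$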

Within V, ρ(x) = 0, and in the absence of ionic screening (κ = 0), the equation reduces to

$$\nabla\cdot[\varepsilon(\mathbf{x})\nabla u(\mathbf{x})]=0. \tag{A1}$$

To compute the volume integral

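$$\frac{1}{2}\int_{V}\mathbf{D}\cdot\mathbf{E}\,dV=\frac{1}{2}\int_{V}\varepsilon(\mathbf{x})\,\nabla u(\mathbf{x})\cdot\nabla u(\mathbf{x})\,dV,$$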

we use the relation

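$$\nabla\cdot[\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})]=\varepsilon\,\nabla u\cdot\nabla u+u\,\nabla\cdot[\varepsilon\nabla u].$$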

The second term on the right-hand side of the relation is equal to zero by Eq. (A1), in which case we have

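$$\varepsilon(\mathbf{x})\,\nabla u(\mathbf{x})\cdot\nabla u(\mathbf{x})=\nabla\cdot[\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})].$$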

Then the integral can be written as

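$$\frac{1}{2}\int_{V}\mathbf{D}\cdot\mathbf{E}\,dV=\frac{1}{2}\int_{V}\nabla\cdot[\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})]\,dV.$$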

The integral can be evaluated using Gauss’s divergence theorem by closing the volume with a second concentric sphere of radius R′ > R and letting R′ approach infinity. Let V′ be the volume bounded by the two spheres and let ∂R and ∂R′ denote the spherical boundary surfaces of radius R and R′, respectively. Then

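$$\frac{1}{2}\int_{V'}\nabla\cdot[\varepsilon\,u\,\nabla u]\,dV=-\frac{1}{2}\oint_{\partial R}\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})\cdot\mathbf{n}\,dS+\frac{1}{2}\oint_{\partial R'}\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})\cdot\mathbf{n}'\,dS, \tag{A2}$$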

where the unit vectors n and n′ are normal to the respective boundary surfaces. We choose the unit normal vectors to be directed outward relative to the spheres, rather than to the volume itself. The second integral on the right-hand side of Eq. (A2) approaches zero as R′ approaches infinity. In this limit, we are left with

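$$\frac{1}{2}\int_{V}\mathbf{D}\cdot\mathbf{E}\,dV=-\frac{1}{2}\oint_{\partial R}\varepsilon(\mathbf{x})\,u(\mathbf{x})\,\nabla u(\mathbf{x})\cdot\mathbf{n}\,dS.$$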
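One way to see why the outer-surface term in Eq. (A2) drops out is the following estimate. It is a sketch under two assumptions that are introduced here for illustration: far from the charge the dielectric coefficient tends to a constant value, written $\varepsilon_{\infty}$, and the potential takes its leading Coulomb form in the distance r from the charge.

$$u\simeq\frac{q}{4\pi\varepsilon_{\infty}r},\qquad \nabla u\cdot\mathbf{n}'\simeq-\frac{q}{4\pi\varepsilon_{\infty}r^{2}},$$

$$\left|\frac{1}{2}\oint_{\partial R'}\varepsilon\,u\,\nabla u\cdot\mathbf{n}'\,dS\right|\simeq\frac{1}{2}\,\varepsilon_{\infty}\,\frac{q}{4\pi\varepsilon_{\infty}R'}\,\frac{q}{4\pi\varepsilon_{\infty}R'^{2}}\,4\pi R'^{2}=\frac{q^{2}}{8\pi\varepsilon_{\infty}R'}\longrightarrow 0\quad\text{as } R'\to\infty.$$

The integrand decays as $1/R'^{3}$ while the surface area grows only as $R'^{2}$, so the contribution falls off as $1/R'$.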

It is convenient to evaluate the integral in spherical coordinates with the origin translated to the charge location at (0,0,z0). Since the point charge is located on the z axis and ε is assumed to be symmetric about the z axis, the integrand is independent of the azimuthal angle θ. Therefore, the surface integral reduces to a single-variable integral given by

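$$\frac{1}{2}\int_{V}\mathbf{D}\cdot\mathbf{E}\,dV=-\pi R^{2}\int_{0}^{\pi}\varepsilon(\phi)\,u(\phi)\,u_{r}(\phi)\,\sin\phi\,d\phi.$$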
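As a sanity check on this single-variable form, it can be evaluated numerically in the simplest case, a point charge in a uniform dielectric, where u is the Coulomb potential and the result should reduce to the closed form $q^{2}/(8\pi\varepsilon_{0}\varepsilon_{r}R)$ quoted in Eq. (A3). The sketch below assumes SI units, a constant $\varepsilon(\phi)=\varepsilon_{0}\varepsilon_{r}$, and the overall minus sign that follows from taking the normal outward relative to the sphere; the function names and the values $\varepsilon_{r}=80$ and R = 2 Å are illustrative choices, not parameters of the model.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
Q_E = 1.602176634e-19     # elementary charge, C


def surface_energy_integral(eps_r: float, radius_m: float,
                            n_steps: int = 2000) -> float:
    """Midpoint-rule evaluation of
    -pi R^2 * integral_0^pi eps(phi) u(phi) u_r(phi) sin(phi) dphi
    for a point charge in a uniform dielectric (illustrative special case)."""
    eps = EPS0 * eps_r
    u = Q_E / (4.0 * math.pi * eps * radius_m)         # Coulomb potential at r = R
    u_r = -Q_E / (4.0 * math.pi * eps * radius_m**2)   # radial derivative at r = R
    dphi = math.pi / n_steps
    total = 0.0
    for k in range(n_steps):
        phi = (k + 0.5) * dphi
        total += eps * u * u_r * math.sin(phi) * dphi
    return -math.pi * radius_m**2 * total


def closed_form_energy(eps_r: float, radius_m: float) -> float:
    """Closed form q^2 / (8 pi eps0 eps_r R) for the same special case."""
    return Q_E**2 / (8.0 * math.pi * EPS0 * eps_r * radius_m)


numeric = surface_energy_integral(80.0, 2.0e-10)
closed = closed_form_energy(80.0, 2.0e-10)
print(f"numeric / closed form = {numeric / closed:.8f}")
```

In the nonuniform case treated in the paper, ε(φ), u(φ), and u_r(φ) would instead come from the numerical solution of the linearized Poisson-Boltzmann equation on the sphere of radius R; the quadrature step itself would be unchanged.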

Outside a sphere of radius r0, surrounding an isolated charge of magnitude q, placed in a dielectric having relative dielectric coefficient εr,

$$\frac{1}{2}\int_{r>r_{0}}\mathbf{D}\cdot\mathbf{E}\,dV=\frac{q^{2}}{8\pi\varepsilon_{0}\varepsilon_{r}r_{0}}. \tag{A3}$$

Therefore, if we transfer a charged group surrounded by water, which has dielectric coefficient εw, into a medium with coefficient εr, the work required is

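$$\frac{q^{2}}{8\pi\varepsilon_{0}\varepsilon_{r}r_{0}}-\frac{q^{2}}{8\pi\varepsilon_{0}\varepsilon_{w}r_{0}}=\frac{q^{2}}{8\pi\varepsilon_{0}r_{0}}\left(\frac{1}{\varepsilon_{r}}-\frac{1}{\varepsilon_{w}}\right). \tag{A4}$$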

The corresponding change in the pK of such a group, ΔpK, is therefore given by

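$$\Delta\mathrm{p}K\,\ln(10)=\pm\frac{q^{2}}{8\pi\varepsilon_{0}r_{0}k_{B}T}\left(\frac{1}{\varepsilon_{r}}-\frac{1}{\varepsilon_{w}}\right)=\pm\left[\frac{1}{2k_{B}T}\int_{r>r_{0}}\mathbf{D}\cdot\mathbf{E}\,dV-\frac{q^{2}}{8\pi\varepsilon_{0}\varepsilon_{w}r_{0}k_{B}T}\right], \tag{A5}$$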

which corresponds to Eq. (6) in the text. Whereas the last substitution may seem superfluous, in view of Eq. (A3), it is the last equality that enables calculation of how pK values can be expected to change in a nonuniform (scalar) dielectric environment, such as the one being used for the present model. This is how we have proceeded (except for the histidines) to model γB-crystallin’s pK values.

To understand which sign is to be used in Eq. (A5), it is valuable to recognize that for a lower dielectric than water, that is, εr < εw, the right-hand side of Eq. (A4) is positive, corresponding to the fact that one must do work to bury a charge, of either sign, in a low-dielectric environment. Consider first an acidic residue such as glutamic or aspartic acid. In that case, partially surrounding the charge site with a low-dielectric environment will favor the protonated state, which is uncharged, and therefore a lower concentration of protons (higher pH) will suffice for protonation. Thus, the pK will shift upward and the + sign should be used in Eq. (A5). The opposite is true for basic residues such as lysine, arginine, and histidine, for which a higher concentration of protons will be needed in order to protonate the site.
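The sign rule described above, combined with the uniform-medium form of Eq. (A5), can be illustrated numerically. This is a minimal sketch and not the procedure used for the model, which relies on the integral form of Eq. (A5) in a nonuniform dielectric. The function name and the sample values (εr = 40, εw = 78.5, r0 = 2 Å, T = 298.15 K) are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
Q_E = 1.602176634e-19     # elementary charge, C
K_B = 1.380649e-23        # Boltzmann constant, J/K


def delta_pk(eps_r: float, eps_w: float, r0_m: float,
             temperature_k: float, acidic: bool) -> float:
    """pK shift from the first equality of Eq. (A5), uniform medium only.

    The + sign is used for acidic groups (charged when deprotonated) and
    the - sign for basic groups (charged when protonated)."""
    work_over_kt = (Q_E**2 / (8.0 * math.pi * EPS0 * r0_m * K_B * temperature_k)
                    * (1.0 / eps_r - 1.0 / eps_w))
    sign = 1.0 if acidic else -1.0
    return sign * work_over_kt / math.log(10.0)


# Illustrative values only; they are not the fitted parameters of the model.
shift_acid = delta_pk(40.0, 78.5, 2.0e-10, 298.15, acidic=True)
shift_base = delta_pk(40.0, 78.5, 2.0e-10, 298.15, acidic=False)
shift_none = delta_pk(78.5, 78.5, 2.0e-10, 298.15, acidic=True)

print(f"acidic group: delta pK = {shift_acid:+.2f}")
print(f"basic group:  delta pK = {shift_base:+.2f}")
print(f"no change in dielectric: delta pK = {shift_none:+.2f}")
```

For these sample values the shift is a fraction of a pK unit, upward for the acidic group and downward by the same amount for the basic group, and it vanishes when the surrounding medium has the dielectric coefficient of water.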

References

  • 1. Kirkwood JG, Shumaker JB. Forces between protein molecules in solution arising from fluctuations in proton charge and configuration. Proc Natl Acad Sci USA. 1952;38:863. doi: 10.1073/pnas.38.10.863.
  • 2. Tanford C, Kirkwood JG. Theory of protein titration curves. I. General equations for impenetrable spheres. J Am Chem Soc. 1957;79:5333.
  • 3. Lund M, Jönsson B. Charge regulation in biomolecular solution. Q Rev Biophys. 2013;46:265. doi: 10.1017/S003358351300005X.
  • 4. Kim B, Song X. Calculations of the second virial coefficients of protein solutions with an extended fast multipole method. Phys Rev E. 2011;83:011915. doi: 10.1103/PhysRevE.83.011915.
  • 5. Chan HY, Lankevich V, Vekilov PG, Lubchenko V. Anisotropy of the Coulomb interaction between folded proteins: Consequences for mesoscopic aggregation of lysozyme. Biophys J. 2012;102:1934. doi: 10.1016/j.bpj.2012.03.025.
  • 6. Quang LJ, Sandler SI, Lenhoff AM. Anisotropic contributions to protein-protein interactions. J Chem Theory Comput. 2014;10:835. doi: 10.1021/ct4006695.
  • 7. Lund M, Jönsson B. On the charge regulation of proteins. Biochemistry. 2005;44:5722. doi: 10.1021/bi047630o.
  • 8. Mason AC, Jensen JH. Protein-protein binding is often associated with changes in protonation state. Proteins: Struct Funct Bioinf. 2008;71:81. doi: 10.1002/prot.21657.
  • 9. Aguilar B, Anandakrishnan R, Ruscio JZ, Onufriev AV. Statistics and physical origins of pK and ionization state changes upon protein-ligand binding. Biophys J. 2010;98:872. doi: 10.1016/j.bpj.2009.11.016.
  • 10. Hollenbeck D, Martini KM, Langner A, Harkin A, Ross DS, Thurston GM. Model for evaluating patterned charge-regulation contributions to electrostatic interactions between low-dielectric spheres. Phys Rev E. 2010;82:031402. doi: 10.1103/PhysRevE.82.031402.
  • 11. Lund M. Electrostatic chameleons in biological systems. J Am Chem Soc. 2010;132:17337. doi: 10.1021/ja106480a.
  • 12. Lomakin A, Asherie N, Benedek GB. Monte Carlo study of phase separation in aqueous protein solutions. J Chem Phys. 1996;104:1646.
  • 13. Hall D, Minton AP. Macromolecular crowding: qualitative and semiquantitative successes, quantitative challenges. Biochim Biophys Acta. 2003;1649:127. doi: 10.1016/s1570-9639(03)00167-5.
  • 14. Benedek GB. Cataract as a protein condensation disease: The Proctor Lecture. Invest Ophthalmol Vis Sci. 1997;38:1911.
  • 15. Clark JI, Clark JM. Lens cytoplasmic phase separation. Int Rev Cytol. 1999;192:171. doi: 10.1016/s0074-7696(08)60526-4.
  • 16. Pollack GH. Cells, Gels and the Engines of Life. Ebner; Seattle: 2001.
  • 17. Gunton JD, Shiryayev A, Pagan DL. Protein Condensation: Kinetic Pathways to Crystallization and Disease. Cambridge University Press; Cambridge: 2007.
  • 18. Keating CD. Aqueous phase separation as a possible route to compartmentalization of biological molecules. Acc Chem Res. 2012;45:2114. doi: 10.1021/ar200294y.
  • 19. McManus JJ, Charbonneau P, Zaccarelli E, Asherie N. The physics of protein self-assembly. Curr Opin Colloid Interface Sci. 2016;22:73.
  • 20. Herzfeld J, Briehl RW. Phase behavior of reversibly polymerizing systems with narrow length distributions. Macromolecules. 1981;14:397.
  • 21. Blankschtein D, Thurston GM, Benedek GB. Phenomenological theory of equilibrium thermodynamic properties and phase-separation of micellar solutions. J Chem Phys. 1986;85:7268.
  • 22. Gompper G, Schick M. Lattice model of microemulsions. Phys Rev B. 1990;41:9148. doi: 10.1103/physrevb.41.9148.
  • 23. Kahlweit M, Strey R, Busse G. Microemulsions: A qualitative thermodynamic approach. J Phys Chem. 1990;94:3881.
  • 24. van der Schoot P, Cates ME. Growth, static light scattering, and spontaneous ordering of rodlike micelles. Langmuir. 1994;10:670.
  • 25. Shore JD, Thurston GM. Charge-regulation phase transition on surface lattices of titratable sites adjacent to electrolyte solutions: An analog of the Ising antiferromagnet in a magnetic field. Phys Rev E. 2015;92:062123. doi: 10.1103/PhysRevE.92.062123.
  • 26. Van Holde KE, Johnson WC, Ho PS. Principles of Physical Biochemistry. Pearson/Prentice Hall; Upper Saddle River: 2005.
  • 27. Tanford C, Roxby R. Interpretation of protein titration curves. Application to lysozyme. Biochemistry. 1972;11:2192. doi: 10.1021/bi00761a029.
  • 28. Shire SJ, Hanania GIH, Gurd FRN. Electrostatic effects in myoglobin. Hydrogen ion equilibria in sperm whale ferrimyoglobin. Biochemistry. 1974;13:2967. doi: 10.1021/bi00711a028.
  • 29. Bashford D, Karplus M. Multiple-site titration curves of proteins: An analysis of exact and approximate methods for their calculation. J Phys Chem. 1991;95:9556.
  • 30. Feig M, Onufriev A, Lee MS, Im W, Case DA, Brooks CL. Performance comparison of generalized Born and Poisson methods in the calculation of electrostatic solvation energies for protein structures. J Comput Chem. 2004;25:265. doi: 10.1002/jcc.10378.
  • 31. Wyman J, Gill SJ. Binding and Linkage: Functional Chemistry of Biological Macromolecules. University Science Books; Mill Valley: 1990.
  • 32. Onufriev A, Case DA, Ullmann GM. A novel view of pH titration in biomolecules. Biochemistry. 2001;40:3413. doi: 10.1021/bi002740q.
  • 33. Lindman S, Linse S, Mulder FAA, Andre I. Electrostatic contributions to residue-specific protonation equilibria and proton binding capacitance for a small protein. Biochemistry. 2006;45:13993. doi: 10.1021/bi061555v.
  • 34. Hass MAS, Mulder FAA. Contemporary NMR studies of protein electrostatics. Annu Rev Biophys. 2015;44:53. doi: 10.1146/annurev-biophys-083012-130351.
  • 35. Sharma U, Negin RS, Carbeck JD. Effects of cooperativity in proton binding on the net charge of proteins in charge ladders. J Phys Chem B. 2003;107:4653.
  • 36. Biesheuvel PM, Lindhoud S, Cohen Stuart MA, de Vries R. Phase behavior of mixtures of oppositely charged protein nanoparticles at asymmetric charge ratios. Phys Rev E. 2006;73:041408. doi: 10.1103/PhysRevE.73.041408.
  • 37. Warshel A, Russell ST, Churg AK. Macroscopic models for studies of electrostatic interactions in proteins: Limitations and applicability. Proc Natl Acad Sci USA. 1984;81:4785. doi: 10.1073/pnas.81.15.4785.
  • 38. You TJ, Bashford D. Conformation and hydrogen ion titration of proteins: A continuum electrostatic model with conformational flexibility. Biophys J. 1995;69:1721. doi: 10.1016/S0006-3495(95)80042-1.
  • 39. Archontis G, Simonson T. Proton binding to proteins: A free-energy component analysis using a dielectric continuum model. Biophys J. 2005;88:3888. doi: 10.1529/biophysj.104.055996.
  • 40. Warshel A, Sharma PK, Kato M, Parson WW. Modeling electrostatic effects in proteins. Biochim Biophys Acta. 2006;1764:1647. doi: 10.1016/j.bbapap.2006.08.007.
  • 41. Gunner MR, Zhu X, Klein MC. MCCE analysis of the pKas of introduced buried acids and bases in staphylococcal nuclease. Proteins: Struct Funct Bioinf. 2011;79:3306. doi: 10.1002/prot.23124.
  • 42. Polydorides S, Simonson T. Monte Carlo simulations of proteins at constant pH with generalized Born solvent, flexible sidechains, and an effective dielectric boundary. J Comput Chem. 2013;34:2742. doi: 10.1002/jcc.23450.
  • 43. Mehler EL, Fuxreiter M, Simon I, Garcia-Moreno E. The role of hydrophobic microenvironments in modulating pKa shifts in proteins. Proteins: Struct Funct Bioinf. 2002;48:283. doi: 10.1002/prot.10153.
  • 44. Porter MA, Hall JR, Locke JC, Jensen JH, Molina PA. Hydrogen bonding is the prime determinant of carboxyl pKa values at the N-termini of α-helices. Proteins: Struct Funct Bioinf. 2006;63:621. doi: 10.1002/prot.20879.
  • 45. Forsyth WR, Antosiewicz JM, Robertson AD. Empirical relationships between protein structure and carboxyl pKa values in proteins. Proteins: Struct Funct Bioinf. 2002;48:388. doi: 10.1002/prot.10174.
  • 46. Gunner MR, Saleh MA, Cross E, ud-Doula A, Wise M. Backbone dipoles generate positive potentials in all proteins: origins and implications of the effect. Biophys J. 2000;78:1126. doi: 10.1016/S0006-3495(00)76671-9.
  • 47. Miranda JJ. Position-dependent interactions between cysteine residues and the helix dipole. Protein Sci. 2003;12:73. doi: 10.1110/ps.0224203.
  • 48. Lund M, Vácha R, Jungwirth P. Specific ion binding to macromolecules: Effects of hydrophobicity and ion pairing. Langmuir. 2008;24:3387. doi: 10.1021/la7034104.
  • 49. Horwitz J, Kabasawa I, Kinoshita JH. Conformation of gamma-crystallins of the calf lens: Effects of temperature and denaturing agents. Exp Eye Res. 1977;25:199. doi: 10.1016/0014-4835(77)90132-4.
  • 50. Thomson JA, Schurtenberger P, Thurston GM, Benedek GB. Binary liquid phase separation and critical phenomena in a protein/water solution. Proc Natl Acad Sci USA. 1987;84:7079. doi: 10.1073/pnas.84.20.7079.
  • 51. Broide ML, Berland CR, Pande J, Ogun OO, Benedek GB. Binary-liquid phase separation of lens protein solutions. Proc Natl Acad Sci USA. 1991;88:5660. doi: 10.1073/pnas.88.13.5660.
  • 52. Lomakin A, Asherie N, Benedek GB. Aeolotopic interactions of globular proteins. Proc Natl Acad Sci USA. 1999;96:9465. doi: 10.1073/pnas.96.17.9465.
  • 53. Asherie N. Protein crystallization and phase diagrams. Methods. 2004;34:266. doi: 10.1016/j.ymeth.2004.03.028.
  • 54. Benedek GB, Pande J, Thurston GM, Clark JI. Theoretical and experimental basis for the inhibition of cataract. Prog Retinal Eye Res. 1999;18:391. doi: 10.1016/s1350-9462(98)00023-8.
  • 55. Pande A, Pande J, Asherie N, Lomakin A, Ogun O, King JA, Lubsen NH, Walton D, Benedek GB. Molecular basis of a progressive juvenile-onset hereditary cataract. Proc Natl Acad Sci USA. 2000;97:1993. doi: 10.1073/pnas.040554397.
  • 56. Asherie N, Pande J, Pande A, Zarutskie JA, Lomakin J, Lomakin A, Ogun O, Stern LJ, King J, Benedek GB. Enhanced crystallization of the Cys18 to Ser mutant of bovine gammaB crystallin. J Mol Biol. 2001;314:663. doi: 10.1006/jmbi.2001.5155.
  • 57. Pande A, Annunziata O, Asherie N, Ogun O, Benedek GB, Pande J. Decrease in protein solubility and cataract formation caused by the Pro23 to Thr mutation in human gammaD-crystallin. Biochemistry. 2005;44:2491. doi: 10.1021/bi0479611.
  • 58. McManus JJ, Lomakin A, Ogun O, Pande A, Basan M, Pande J, Benedek GB. Altered phase diagram due to a single point mutation in human gammaD-crystallin. Proc Natl Acad Sci USA. 2007;104:16856. doi: 10.1073/pnas.0707412104.
  • 59. Pande A, Zhang J, Banerjee PR, Puttamadappa SS, Shekhtman A, Pande J. NMR study of the cataract-linked P23T mutant of human gammaD-crystallin shows minor changes in hydrophobic patches that reflect its retrograde solubility. Biochem Biophys Res Commun. 2009;382:196. doi: 10.1016/j.bbrc.2009.03.007.
  • 60. Banerjee PR, Pande A, Patrosz J, Thurston GM, Pande J. Cataract-associated mutant E107A of human gammaD-crystallin shows increased attraction to alpha-crystallin and enhanced light scattering. Proc Natl Acad Sci USA. 2011;108:574. doi: 10.1073/pnas.1014653107.
  • 61. Kirkwood JG. Theory of solutions of molecules containing widely separated charges with special application to zwitterions. J Chem Phys. 1934;2:351.
  • 62. Shire SJ, Hanania GIH, Gurd FRN. Electrostatic effects in myoglobin. Application of the modified Tanford-Kirkwood theory to myoglobins from horse, California grey whale, harbor seal, and California sea lion. Biochemistry. 1975;14:1352. doi: 10.1021/bi00678a002.
  • 63. Sundd M, Iverson N, Ibarra-Molero B, Sanchez-Ruiz JM, Robertson AD. Electrostatic interactions in ubiquitin: Stabilization of carboxylates by lysine amino groups. Biochemistry. 2002;41:7586. doi: 10.1021/bi025571d.
  • 64. Bashford D, Gerwert K. Electrostatic calculations of the pKa values of ionizable groups in bacteriorhodopsin. J Mol Biol. 1992;224:473. doi: 10.1016/0022-2836(92)91009-e.
  • 65. Bashford D. An object-oriented programming suite for electrostatic effects in biological molecules: An experience report on the MEAD project. In: International Conference on Computing in Object-Oriented Parallel Environments. Vol. 1343. Springer; Berlin: 1997. pp. 233–240.
  • 66. Li L, Li C, Sarkar S, Zhang J, Witham S, Zhang Z, Wang L, Smith N, Petukh M, Alexov E. DelPhi: A comprehensive suite for DelPhi software and associated resources. BMC Biophys. 2012;5:9. doi: 10.1186/2046-1682-5-9.
  • 67. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc Natl Acad Sci USA. 2001;98:10037. doi: 10.1073/pnas.181342398.
  • 68. Gordon JC, Myers JB, Folta T, Shoja V, Heath LS, Onufriev A. H++: A server for estimating pKas and adding missing hydrogens to macromolecules. Nucleic Acids Res. 2005;33:W368. doi: 10.1093/nar/gki464.
  • 69. Schurtenberger P, Chamberlin RA, Thurston GM, Thomson JA, Benedek GB. Observation of Critical Phenomena in a Protein-Water Solution. Phys Rev Lett. 1989;63:2064. doi: 10.1103/PhysRevLett.63.2064.
  • 70. Schurtenberger P, Chamberlin RA, Thurston GM, Thomson JA. Observation of Critical Phenomena in a Protein-Water Solution. Phys Rev Lett. 1993;71:3395. doi: 10.1103/PhysRevLett.63.2064.
  • 71. Fine BM, Pande J, Lomakin A, Ogun OO, Benedek GB. Dynamic Critical Phenomena in Aqueous Protein Solutions. Phys Rev Lett. 1995;74:198. doi: 10.1103/PhysRevLett.74.198.
  • 72. Fine BM, Lomakin A, Ogun OO, Benedek GB. Static structure factor and collective diffusion of globular proteins in concentrated aqueous solution. J Chem Phys. 1996;104:326.
  • 73. Shand-Kovach I. Electrostatic properties of phase-separating bovine lens proteins. Ph.D. thesis, MIT; 1992.
  • 74. Slingsby C, Miller L. The reaction of glutathione with the eye-lens protein γ-crystallin. Biochem J. 1985;230:143. doi: 10.1042/bj2300143. Note that bovine γB-crystallin was termed γII-crystallin at that time.
  • 75. McDermott MJ, Gawinowicz-Kolks MA, Chiesa R, Spector A. The disulfide content of calf γ-crystallin. Arch Biochem Biophys. 1988;262:609. doi: 10.1016/0003-9861(88)90413-4. Note that bovine γB-crystallin was termed γII-crystallin at that time.
  • 76. McQuarrie DA. Statistical Mechanics. University Science Books; Mill Valley: 2000.
  • 77. Kjellander R, Akesson T, Jönsson B, Marčelja S. Double layer interactions in mono- and divalent electrolytes: A comparison of the anisotropic HNC theory and Monte Carlo simulations. J Chem Phys. 1992;97:1424.
  • 78. Kjellander R, Greberg H. Mechanisms behind concentration profiles illustrated by charge and concentration distributions around ions in double layers. J Electroanal Chem. 1998;450:233.
  • 79. Burak Y, Andelman D. Hydration interactions: Aqueous solvent effects in electric double layers. Phys Rev E. 2000;62:5296. doi: 10.1103/physreve.62.5296.
  • 80. Swanson JMJ, Wagoner JA, Baker NA, McCammon JA. Optimizing the Poisson dielectric boundary with explicit solvent forces and energies: Lessons learned with atom-centered dielectric functions. J Chem Theory Comput. 2007;3:170. doi: 10.1021/ct600216k.
  • 81. Borukhov I, Andelman D, Orland H. Steric Effects in Electrolytes: A Modified Poisson-Boltzmann Equation. Phys Rev Lett. 1997;79:435.
  • 82. Lue L, Zoeller N, Blankschtein D. Incorporation of non-electrostatic interactions in the Poisson-Boltzmann equation. Langmuir. 1999;15:3726.
  • 83. Ben-Yaakov D, Andelman D, Harries D, Podgornik R. Beyond standard Poisson-Boltzmann theory: Ion-specific interactions in aqueous solutions. J Phys: Condens Matter. 2009;21:424106. doi: 10.1088/0953-8984/21/42/424106.
  • 84. Boström M, Williams DRM, Ninham BW. The influence of ionic dispersion potentials on counterion condensation on polyelectrolytes. J Phys Chem B. 2002;106:7908.
  • 85. Boström M, Tavares FW, Ninham BW, Prausnitz JM. Effect of salt identity on the phase diagram for a globular protein in aqueous electrolyte solution. J Phys Chem B. 2006;110:24757. doi: 10.1021/jp061191g.
  • 86. Sandberg L, Edholm O. Nonlinear response effects in continuum models of the hydration of ions. J Chem Phys. 2002;116:2936.
  • 87. Gong H, Freed KF. Langevin-Debye Model for Nonlinear Electrostatic Screening of Solvated Ions. Phys Rev Lett. 2009;102:057603. doi: 10.1103/PhysRevLett.102.057603.
  • 88. Kurut A, Lund M. Solution electrostatics beyond pH: A coarse grained approach to ion specific interactions between macromolecules. Faraday Discuss. 2013;160:271. doi: 10.1039/c2fd20073b.
  • 89. Sham YY, Chu ZT, Warshel A. Consistent calculations of pKa’s of ionizable residues in proteins: Semi-microscopic and microscopic approaches. J Phys Chem B. 1997;101:4458.
  • 90. Simonson T. Dielectric relaxation in proteins: microscopic and macroscopic models. Int J Quantum Chem. 1999;73:45.
  • 91. Schutz CN, Warshel A. What are the dielectric constants of proteins and how to validate electrostatic models? Proteins: Struct Funct Genet. 2001;44:400. doi: 10.1002/prot.1106.
  • 92. Simonson T. Dielectric relaxation in proteins: the computational perspective. Photosynth Res. 2008;97:21. doi: 10.1007/s11120-008-9293-2.
  • 93. Kamerlin SCL, Haranczyk M, Warshel A. Progress in ab initio QM/MM free-energy simulations of electrostatic energies in proteins: Accelerated QM/MM studies of pKa, redox reactions and solvation free energies. J Phys Chem B. 2008;113:1253. doi: 10.1021/jp8071712.
  • 94. Pericet-Camara R, Papastavrou G, Behrens SH, Borkovec M. Interaction between charged surfaces on the Poisson-Boltzmann level: The constant regulation approximation. J Phys Chem B. 2004;108:19467.
  • 95. Uematsu M, Franck EU. Static dielectric constant of water and steam. J Phys Chem Ref Data. 1980;9:1291.
  • 96. Kumaraswamy VS, Lindley PF, Slingsby C, Glover ID. An eye lens protein-water structure: 1.2 Angstrom resolution structure of γB-crystallin at 150 K. Acta Crystallogr D. 1996;52:611. doi: 10.1107/S0907444995014302.
  • 97. Israelachvili JN. Intermolecular and Surface Forces. 3rd. Academic; Waltham: 2011.
  • 98. Dawson RMC, Elliott DC, Elliott WH, Jones KM. Data for Biochemical Research. 3rd. Oxford University Press; Oxford: 1986.
  • 99. Ellenbogen E. Dissociation constants of peptides. I. A survey of the effect of optical configuration. J Am Chem Soc. 1952;74:5198.
  • 100. Serjeant EP, Dempsey B. Ionisation Constants of Organic Acids in Aqueous Solution. Pergamon; New York: 1979. (IUPAC Chemical Data Series No. 23).
  • 101. Panofsky WKH, Phillips M. Classical Electricity and Magnetism. 2nd. Dover; Mineola: 2005. Chap. 6.
  • 102. Jackson JD. Classical Electrodynamics. 3rd. Wiley; Danvers: 1998.
  • 103. Simonson T, Carlsson J, Case DA. Proton binding in proteins: pKa calculations with explicit and implicit solvent models. J Am Chem Soc. 2004;126:4167. doi: 10.1021/ja039788m.
  • 104. Jensen JH, Li H, Robertson AD, Molina PA. Prediction and rationalization of protein pKa values using QM and QM/MM methods. J Phys Chem A. 2005;109:6634. doi: 10.1021/jp051922x.
  • 105. Alexov E, Mehler EL, Baker N, Baptista AM, Huang Y, Milletti F, Nielsen JE, Farrell D, Carstensen T, Olsson MHM, Shen JK, Warwicker J, Williams S, Word JM. Progress in the prediction of pKa values in proteins. Proteins: Struct Funct Bioinf. 2011;79:3260. doi: 10.1002/prot.23189.
  • 106. Matsui T, Baba T, Kamiya K, Shigeta Y. An accurate density functional theory based estimation of pKa values of polar residues combined with experimental data: From amino acids to minimal proteins. Phys Chem Chem Phys. 2012;14:4181. doi: 10.1039/c2cp23069k.
  • 107. Burger SK, Schofield J, Ayers PW. Quantum mechanics/molecular mechanics restrained electrostatic potential fitting. J Phys Chem B. 2013;117:14960. doi: 10.1021/jp409568h.
  • 108. Grimsley GR, Scholtz JM, Pace CN. A summary of the measured pK values of the ionizable groups in folded proteins. Protein Sci. 2009;18:247. doi: 10.1002/pro.19.
  • 109. Creighton TE. The Biophysical Chemistry of Nucleic Acids & Proteins. Helvetian; York: 2010.
  • 110. Creighton TE. Proteins: Structures and Molecular Properties. Macmillan; New York: 1993.
  • 111. Hansen JP, McDonald I. Theory of Simple Liquids. 3rd. Academic; New York: 2006.
  • 112. Gray CG, Gubbins KE. Theory of Molecular Fluids, Volume 1: Fundamentals. Oxford University Press; Oxford: 1984. (International Series of Monographs on Chemistry No. 9).
  • 113. Gray CG, Gubbins KE, Joslin CG. Theory of Molecular Fluids, Volume 2: Applications. Oxford University Press; Oxford: 2011. (International Series of Monographs on Chemistry No. 10).
  • 114. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Mennucci B, Petersson GA, Nakatsuji H, Caricato M, Li X, Hratchian HP, Izmaylov AF, Bloino J, Zheng G, Sonnenberg JL, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Montgomery JA Jr, Peralta JE, Ogliaro F, Bearpark M, Heyd JJ, Brothers E, Kudin KN, Staroverov VN, Kobayashi R, Normand J, Raghavachari K, Rendell A, Burant JC, Iyengar SS, Tomasi J, Cossi M, Rega N, Millam JM, Klene M, Knox JE, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Martin RL, Morokuma K, Zakrzewski VG, Voth GA, Salvador P, Dannenberg JJ, Dapprich S, Daniels AD, Farkas O, Foresman JB, Ortiz JV, Cioslowski J, Fox DJ, et al. GAUSSIAN09, revision E.01. Gaussian Inc.; Wallingford, CT: 2009.
  • 115. Li H, Robertson AD, Jensen JH. Very fast empirical prediction and rationalization of protein pKa values. Proteins: Struct Funct Bioinf. 2005;61:704. doi: 10.1002/prot.20660.
  • 116. Bas DC, Rogers DM, Jensen JH. Very fast prediction and rationalization of pKa values for protein-ligand complexes. Proteins: Struct Funct Bioinf. 2008;73:765. doi: 10.1002/prot.22102.
  • 117. Olsson MHM, Søndergaard CR, Rostkowski M, Jensen JH. PROPKA3: Consistent treatment of internal and surface residues in empirical pKa predictions. J Chem Theory Comput. 2011;7:525. doi: 10.1021/ct100578z.
  • 118. Søndergaard CR, Olsson MHM, Rostkowski M, Jensen JH. Improved treatment of ligands and coupling effects in empirical calculation and rationalization of pKa values. J Chem Theory Comput. 2011;7:2284. doi: 10.1021/ct200133y.
  • 119. Roos G, Foloppe N, Messens J. Understanding the pKa of redox cysteines: The key role of hydrogen bonding. Antioxid Redox Signaling. 2013;18:94. doi: 10.1089/ars.2012.4521.
  • 120. Bloemendal H, de Jong W, Jaenicke R, Lubsen NH, Slingsby C, Tardieu A. Ageing and vision: Structure, stability and function of lens crystallins. Prog Biophys Mol Biol. 2004;86:407. doi: 10.1016/j.pbiomolbio.2003.11.012.
  • 121.Pande A, Gillot D, Pande J. The cataract-associated R14C mutant of human γ D-crystallin shows a variety of intermolecular disulfide cross-links: A Raman spectroscopic study. Biochemistry. 2009;48:4937. doi: 10.1021/bi9004182. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.See Supplemental Material at http://link.aps.org/supplemental/10.1103/PhysRevE.96.032415 for additional figures and tables.
  • 123.Myers J, Grothaus G, Narayanan S, Onufriev A. A simple clustering algorithm can be accurate enough for use in calculations of pKs in macromolecules. Proteins: Struct Funct Bioinf. 2006;63:928. doi: 10.1002/prot.20922. [DOI] [PubMed] [Google Scholar]
  • 124.Sergeev YV, Chirgadze YN, Mylvaganam SE, Driessen H, Slingsby C, Blundell TL. Surface interactions of γ-crystallins in the crystal medium in relation to their association in the eye lens. Proteins: Struct Funct Bioinf. 1988;4:137. doi: 10.1002/prot.340040207. [DOI] [PubMed] [Google Scholar]
  • 125.Tanokura M. 1 H-NMR study on the tautomerism of the imidazole ring of histidine residues: I. Microscopic pK values and molar ratios of tautomers in histidine-containing peptides. Biochim Biophys Acta. 1983;742:576. doi: 10.1016/0167-4838(83)90276-5. [DOI] [PubMed] [Google Scholar]
  • 126.Simonson T, Brooks CL. Charge screening and the dielectric constant of proteins: Insights from molecular dynamics. J Am Chem Soc. 1996;118:8452.
  • 127.Bashford D. Macroscopic electrostatic models for protonation states in proteins. Front Biosci. 2004;9:1082. doi: 10.2741/1187.
  • 128.Leontyev IV, Stuchebrukhov AA. Dielectric relaxation of cytochrome c oxidase: Comparison of the microscopic and continuum models. J Chem Phys. 2009;130:085103. doi: 10.1063/1.3060196.
  • 129.Patargias GN, Harris SA, Harding JH. A demonstration of the inhomogeneity of the local dielectric response of proteins by molecular dynamics simulations. J Chem Phys. 2010;132:235103. doi: 10.1063/1.3430628.
  • 130.Hass MAS, Jensen M Ringkjøbing, Led JJ. Probing electric fields in proteins in solution by NMR spectroscopy. Proteins: Struct Funct Bioinf. 2008;72:333. doi: 10.1002/prot.21929.
  • 131.Kukic P, Farrell D, McIntosh LP, García-Moreno BE, Jensen KS, Toleikis Z, Teilum K, Nielsen JE. Protein dielectric constants determined from NMR chemical shift perturbations. J Am Chem Soc. 2013;135:16968. doi: 10.1021/ja406995j.
  • 132.Grant ML. Nonuniform charge effects in protein-protein interactions. J Phys Chem B. 2001;105:2858.
  • 133.Li W, Persson BA, Morin M, Behrens MA, Lund M, Zackrisson Oskolkova M. Charge-induced patchy attractions between proteins. J Phys Chem B. 2015;119:503. doi: 10.1021/jp512027j.
  • 134.Adžić N, Podgornik R. Charge regulation in ionic solutions: Thermal fluctuations and Kirkwood-Schumaker interactions. Phys Rev E. 2015;91:022715. doi: 10.1103/PhysRevE.91.022715.
  • 135.Gögelein C, Nägele G, Tuinier R, Gibaud T, Stradner A, Schurtenberger P. A simple patchy colloid model for the phase behavior of lysozyme dispersions. J Chem Phys. 2008;129:085102. doi: 10.1063/1.2951987.
  • 136.Kurut A, Persson BA, Åkesson T, Forsman J, Lund M. Anisotropic interactions in protein mixtures: Self assembly and phase behavior in aqueous solution. J Phys Chem Lett. 2012;3:731. doi: 10.1021/jz201680m.
  • 137.Quinn MK, Gnan N, James S, Ninarello A, Sciortino F, Zaccarelli E, McManus JJ. How fluorescent labeling alters the solution behavior of proteins. Phys Chem Chem Phys. 2015;17:31177. doi: 10.1039/c5cp04463d.

