Abstract
A quantitative description of membrane protein stability is crucial for understanding many biological processes. However, progress in this direction has remained a major challenge for both experimental studies and molecular modeling. One possible direction is the use of coarse-grained models, but such models must be carefully calibrated and validated. Here we use recent progress in benchmark studies on the energetics of amino acid residue and peptide membrane insertion and on membrane protein stability to refine our previously developed coarse-grained model (Vicatos et al Proteins 2014; 82: 1168). The refined model parameters were fitted and/or tested to reproduce the water/membrane partitioning energetics of amino acid side chains and a couple of model peptides. The new model provides reasonable agreement with experiment for the absolute folding free energies of several β-barrel membrane proteins, as well as for the effects of point mutations on the relative stability of one of those proteins, OmpLA. Considering and ranking different rotameric states of a mutated residue was found to be essential for achieving satisfactory agreement with the reference data.
Keywords: molecular modeling, folding energy, lipid membrane, membrane electrostatics, ion-induced defect, arginine, rotamer, mutation, partitioning free energy, OmpLA
I. Introduction
Membrane proteins play a major role in life processes1,2 and thus there is great interest in realistic modeling of the functions of such systems. In some cases the functional properties can be simulated without taking the membrane into account, but in other cases it is crucial to consider the effect of the membrane in assessing the energetics of the action of membrane proteins (e.g. activation of ion channels 3-9, the action of the translocon10-13, F0 ATPase 14,15). Modeling the effect of the membrane can range from including a grid of polarizable dipoles 16,17, to explicit all-atom simulations.18-21 However, moving to an explicit all-atom membrane model is not always the most effective approach, in particular when one is interested in the landscape of processes that occur on long time scales and in large complex systems.
One of the most promising options is to use multiscale modeling and, in particular, coarse-grained (CG) models. CG models for proteins were introduced long ago 22 and have been refined significantly in recent years.12,13,23-30 Furthermore, CG models for membranes have been introduced and used extensively (e.g. MARTINI 31-34), and there has also been some progress in CG models for protein insertion into membranes.35-38 However, the focus on CG models for the energetics of membrane proteins is more recent (e.g. 6,7,9,11-13,15,39,40). Progress toward realistic CG models for membrane proteins has been slow, in part because of the difficulty of obtaining unique points of calibration. That is, while in the case of CG models for the energetics of proteins we have a relatively extensive benchmark of folding experiments 25,41, the situation with regard to the membrane part is more complex.42 Here there are very few results that can be used with certainty in calibrating the energetics of peptide insertion (20-mer poly-Ala helix,43,44 12-mer poly-Leu helix,45 and the 23-amino acid M2δ segment of the nicotinic acetylcholine receptor 35,46; see Table S1 from ref. 13 for more details), and even the energy of charge and peptide insertion has been controversial.3,10,12,13,47-55 Of course, one can benefit from the advances in microscopic simulations of the penetration of charges as well as polar and nonpolar groups into membranes (e.g. 48,56-61), but experimental verification is also essential.
To address the above issues, we chose an incremental approach of gradually refining our CG model while waiting for progress in benchmark studies on the energetics of amino acid residue and peptide membrane insertion and on membrane protein stability. In doing so we note the recent gradual progress in the experimental determination of membrane protein folding thermodynamics 62, including a systematic investigation of amino acid substitution effects, which allowed a new hydrophobicity scale to be established.63 In particular, Fleming and co-workers studied the reversible folding of β-barrel integral membrane proteins such as outer membrane phospholipase A1 (OmpLA) 63, outer membrane protein W (OmpW) and phospholipid:lipid A palmitoyltransferase PagP 64 into dilauroyl-phosphatidylcholine (DLPC) lipid membranes. Those measurements were performed on both wild-type proteins and a number of OmpLA mutants, e.g., A210X, where Ala-210, located near the membrane center, was mutated to another amino acid residue X.63 Several other positions across the membrane were probed for Arg and Leu mutations.63 The results of these studies were used as one of the primary sources for the validation of our CG model refinements.
We also noted other experimental studies on membrane protein folding thermodynamics including ones on bacteriorhodopsin (bR),65-67 diacylglycerol kinase (DGK),68 outer membrane protein A (OmpA)69-71 as well as some other systems (reviewed in 62). However, their applicability to our CG model validation is limited by ambiguities regarding the nature of the unfolded protein state, availability of a high-resolution structure for the whole protein or lack of folding reversibility.62
II. The CG model and the current modifications
II.1 The total CG free energy and its components
Our CG model has been described in great detail elsewhere (e.g. 25), and here we will only outline the main terms and then focus on the new modifications.
Briefly, the total CG free energy is given by
| (1) |
where the total CG folding free energy is taken relative to the free energy of the unfolded (uf) system in water at zero applied potential. The first two terms represent the main chain and side chain contributions, while the third term takes into account total protein and side chain flexibility in estimating the overall conformational entropy.
The main chain energy is given by the contributions of the backbone solvation and the hydrogen bonds
| (2) |
where c2 and c3 are scaling coefficients (0.25 and 0.15, respectively), while the side chain contribution is decomposed into four terms:
| (3) |
where are the electrostatic, polar and hydrophobic components, respectively.
is the van der Waals (vdW) component for side chain interactions, and c1 is a scaling coefficient (0.10 in our current implementation).
Finally, in the case of the presence of electrodes and electrolytes, Eq. 1 is expressed as,
| (4) |
where the is the CG representation of the effect of the external potential.6
The overall model has been calibrated on the absolute folding energy of water soluble proteins, and in this work where we try to improve the treatment of membrane proteins we must make sure not to destroy the agreement for soluble proteins. Thus we will refine the terms that involve the interaction with the membrane.
II.2 Side chain electrostatic contribution
Our focus is placed on the electrostatic term , which is computed as a sum of change in free energy associated with charge-charge interactions between ionizable side chains, , and change in solvation free energy of those residues in their specific environment (self-energy), , inside protein and in water. That is, we write:
| (5) |
where the first term reflects the change in charge – charge interactions, the second term is the change in self-energy of the ionizable groups.
The change in charge–charge interaction free energy is computed as:
| (6) |
and are the charge-charge interaction free energies in a folded and unfolded protein, respectively. The terms in Eq. 6 are given (in kcal/mol) by:
| (7) |
| (8) |
where the distances, rij, and charges, Qi, are expressed in Å and electronic charge units, respectively. rij is the distance between the indicated ionizable side chains in a folded protein, whereas is the corresponding distance between those residues in an unfolded protein, assuming a linear protein chain with a distance of 6 Å between neighboring residues.6 is the charge of the ith residue in the given ionization state obtained using Metropolis Monte Carlo (MC) approach as described below, whereas is the charge of the ith residue for the given pH in an unfolded protein. In the expressions above it is assumed that the protein charges are fully solvated by water (ε≈80) in the unfolded state.72 is an empirical scaling factor and has been taken to be 0.2.
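The unfolded-state reference described above can be illustrated with a minimal Python sketch. It assumes a plain pairwise Coulomb sum with the standard conversion factor of about 332 kcal·Å/(mol·e²); the function names, the example charges and the omission of the empirical scaling factor are illustrative choices, not the authors' implementation.

```python
COULOMB_KCAL = 332.06  # kcal*Å/(mol*e^2); standard conversion factor (assumed here)


def unfolded_distance(i, j, spacing=6.0):
    """Distance (Å) between residues i and j in a linear chain
    with 6 Å between neighboring residues, as stated in the text."""
    return spacing * abs(i - j)


def coulomb_sum(charges, distance, eps):
    """Pairwise Coulomb energy in kcal/mol.

    charges  : dict mapping residue index -> charge (electronic units)
    distance : callable (i, j) -> distance in Å
    eps      : callable r -> dielectric constant used for that pair
    """
    indices = sorted(charges)
    total = 0.0
    for a, i in enumerate(indices):
        for j in indices[a + 1:]:
            r = distance(i, j)
            total += COULOMB_KCAL * charges[i] * charges[j] / (eps(r) * r)
    return total


# Hypothetical example: a Lys at position 10 and an Asp at position 12,
# fully solvated by water (eps = 80) in the unfolded state.
example_charges = {10: 1.0, 12: -1.0}
e_unfolded = coulomb_sum(example_charges, unfolded_distance, lambda r: 80.0)
```

The same `coulomb_sum` helper could be reused for the folded state by passing the structure-derived distances and an effective dielectric function instead of the constant water value.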
εeff is the effective dielectric constant for charge-charge interaction, which reflects the idea established in many of our earlier works (e.g. 72,73) that the optimal value is large even in protein interiors (namely εeff>20). This type of dielectric constant has been found to provide very powerful insight in recent studies of protein stability (see 41,73).
In this study, following our previous work 25 we used a distance dependent dielectric constant 72:
| (9) |
where fε is a dielectric factor that modifies the dielectric response for some electrostatic interactions: fε = 0.9 for interactions between Lys and Asp/Glu side chains, fε = 0.6 for interactions between Lys side chains located within 9 Å of each other, and fε = 1.0 in all other cases.
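The selection rule for the dielectric factor can be written as a short lookup. The sketch below is illustrative only: the function name, the three-letter residue codes and the inclusive treatment of the 9 Å limit are assumptions, and how fε enters Eq. 9 is not shown.

```python
ACIDIC = {"ASP", "GLU"}


def dielectric_factor(res_i, res_j, r_ij):
    """Return f_eps for a pair of ionizable side chains.

    res_i, res_j : three-letter residue names (upper case)
    r_ij         : distance between the simplified side chains in Å
    """
    pair = {res_i, res_j}
    # Lys interacting with Asp or Glu
    if "LYS" in pair and pair & ACIDIC:
        return 0.9
    # Two Lys side chains within 9 Å (inclusive limit is an assumption)
    if res_i == "LYS" and res_j == "LYS" and r_ij <= 9.0:
        return 0.6
    return 1.0
```

For instance, `dielectric_factor("LYS", "GLU", 5.0)` selects the Lys/acid value, while an Arg/Glu pair falls through to the default.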
The change in the solvation free energies (self-energies) of the ionizable protein residues between the bulk water and the folded protein, , is given by 72,73:
| (10) |
where i runs over the protein's ionizable residues, and are the intrinsic pKa of the ith ionizable residue in the protein and in water, respectively, when all other residues are neutral.73 Here it is assumed that intrinsic pKa in the unfolded protein is approximately equal to .73 R is the gas constant and T is the absolute temperature. is the charge of the ith residue in the given ionization state, obtained using Metropolis Monte Carlo (MC) approach as described below.
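As a rough numerical illustration of how intrinsic pKa shifts translate into a self-energy, the sketch below assumes the standard thermodynamic relation ΔG = −2.3RT Σ Qi (pKa,protein − pKa,water). This functional form, the function name and the example pKa values are assumptions for illustration and may differ in detail from Eq. 10.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def self_energy_from_pka(charges, pka_protein, pka_water, temperature=298.15):
    """Self-energy change (kcal/mol) from intrinsic pKa shifts.

    charges     : dict residue -> charge in the given ionization state
    pka_protein : dict residue -> intrinsic pKa in the protein
    pka_water   : dict residue -> intrinsic pKa in water
    Assumed form: -ln(10) * R * T * sum_i Q_i * (pKa_protein_i - pKa_water_i)
    """
    rt_ln10 = math.log(10.0) * R_KCAL * temperature
    return -rt_ln10 * sum(
        q * (pka_protein[res] - pka_water[res]) for res, q in charges.items()
    )


# Hypothetical example: an ionized Asp whose intrinsic pKa rises from 4 to 6
# on burial, i.e. its charged form is destabilized.
dg_self = self_energy_from_pka({"ASP": -1.0}, {"ASP": 6.0}, {"ASP": 4.0})
```

With this sign convention an acid whose pKa increases, or a base whose pKa decreases, on moving into the protein contributes a positive (destabilizing) self-energy.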
If we assume that the same protein groups are ionized in the unfolded and folded protein states we can write for the ionizable groups:72
| (11) |
However, this is generally not the case and we need to account for a possible change of a side chain ionization state upon unfolding. This is done by adding the correction term, , which reflects the scaled down effect of the change in an ionizable residue protonation state upon unfolding (i.e. penalty for finding that residue at a given pH in a charge state other than its native state in water determined by its )
| (12) |
where μ (0.2 in the present implementation, as was determined in our previous study25 to provide the best quantitative agreement for absolute folding free energies of water-soluble proteins) is an empirically determined exponential factor and the other terms are as described above. The last term in Eq. 12 is a new term that represents the previously neglected solvation free energy change of the uncharged form of an ionizable residue i, ΔΔGisolv,ucg, and is currently used only for the His residue since its neutral form is more stable at pH=7. Similar terms can be used for other ionizable residues. Moreover, the exponential term in Eq. 12, introduced in our previous work to provide accurate folding energies of water-soluble proteins,25 can also be augmented by a membrane-dependent contribution to achieve better agreement with microscopic simulations and other reference data on the energetics of both charged and neutral forms of ionizable residues.
By combining Eqs. 11 and 12 we can write:
| (13) |
Inserting the expression for into Eq. 5 gives
| (14) |
In our CG model implementation the term, given by Eq. 8, is scaled down by an empirically determined factor of 0.2 similarly to using factor μ for a correction term to provide a good agreement between experimental and calculated protein folding free energies.25 Moreover, using a scaled down effect of the change in the protonation state of an ionizable residue controlled by a value of μ we are able to provide different self-energies for its charged and neutral states in protein and/or membrane environment in agreement with microscopic calculation results as discussed below. If value of μ→∞ then Eqs. 10 and 13 provide the same value of and maximum contribution from a change in the protonation states between water and protein and/or membrane interior, whereas if μ=0 then and is determined by Eq. 11 without taking into account the changes in the protonation state (not considering a possible non-zero value of ΔΔGisolv,ucg in Eq. 12 above in both cases, which is necessary to account for a penalty to move an uncharged form of an ionizable residue into a membrane interior when its and are both equal to 0 such as for His at pH=7 since its as was discussed above).
The ionization states of the protein residues are determined by the Metropolis Monte Carlo (MC) approach of ref 24 for the given pH and temperature T, where a proton transfer (PT) between a randomly chosen pair of ionizable residues or an ionizable residue and the solvent is attempted at each MC step.74 This procedure in each MC move evaluates the electrostatic free energy of the folded protein, ΔGelec, for the mth charge configuration of the ionizable protein residues75
| (15) |
The charge configuration is accepted if the electrostatic free energy achieves a lower value or satisfies the Metropolis criteria. The resulting charge configuration for a minimized ΔGelec is used for subsequent calculation of electrostatic contribution to folding free energy using Eq. 14. Alternatively, MC averaged charges, 〈Qi〉, can be used for calculation of folding free energy as was done throughout this work (i.e. .
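The Metropolis sampling of ionization states can be sketched generically as follows. This is a toy illustration under stated assumptions: the energy function, the site names and the move set (toggling one residue between its neutral and ionized form, i.e. proton exchange with the solvent only) are hypothetical, and residue-to-residue proton transfer is not included.

```python
import math
import random

RT_KCAL = 0.593  # RT in kcal/mol near 298 K


def metropolis_accept(delta_g, rt, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if delta_g <= 0.0:
        return True
    return rng.random() < math.exp(-delta_g / rt)


def average_charges(energy, initial, ionized, n_steps, rt=RT_KCAL, seed=0):
    """Return MC-averaged charges <Q_i> over n_steps Metropolis moves.

    energy  : callable(state) -> electrostatic free energy (kcal/mol)
    initial : dict site -> starting charge
    ionized : dict site -> charge of the ionized form of that site
    """
    rng = random.Random(seed)
    state = dict(initial)
    g = energy(state)
    sums = dict.fromkeys(state, 0.0)
    sites = sorted(state)
    for _ in range(n_steps):
        site = rng.choice(sites)
        trial = dict(state)
        # Toggle the chosen site between neutral and ionized
        trial[site] = 0.0 if state[site] != 0.0 else ionized[site]
        g_trial = energy(trial)
        if metropolis_accept(g_trial - g, rt, rng):
            state, g = trial, g_trial
        for key, q in state.items():
            sums[key] += q
    return {key: total / n_steps for key, total in sums.items()}


def toy_energy(state):
    """Hypothetical energy: ionizing the single site costs 50 kcal/mol."""
    return 50.0 * abs(state["ASP1"])


avg = average_charges(toy_energy, {"ASP1": 0.0}, {"ASP1": -1.0}, n_steps=1000)
```

In this toy case the ionization penalty is so large compared with RT that the site stays neutral throughout the run, so its averaged charge remains zero.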
An important component of our model is the calculation of the intrinsic pKa values of the ionizable residues, which are computed using 75,76:
| (16) |
Here is a sign function of the charge of the ith residue in its ionized form (which is +1 for Arg, Lys, His and −1 for Asp and Glu). ΔGself,i is the change in self-energy of an ionizable residue upon moving it from water to the protein; this term implicitly includes the solvation of the un-ionized form.75 The sum of the ΔGself,i for all the ionizable residues in the protein gives us the total ΔGself, which along with the correction term gives us the discussed above. Thus a key element of our approach is the treatment of the self-energy, ΔGself, associated with charging each ionizable group (residues Asp, Glu, Lys, Arg and His) in its specific environment. This term is given by:
| (17) |
where U designates effective potential, i runs over all ionizable residues, , and are the contributions to the self-energy from non-polar (np) residues, polar (p) residues and membrane (mem) atoms (more precisely, membrane grid points as clarified below), respectively. Here , and are, respectively, the number of non-polar residues, polar residues and membrane atoms in the neighborhood of the ith residue. Note that the non-polar contribution for the membrane is taken into account separately in the hydrophobic term (described below).
The empirical functions and are given by:
| (18) |
and
| (19) |
The number of non-polar residues neighboring the ith ionizable residue is determined by the analytical function:
| (20) |
where rij is the distance between the simplified side chains of ionizable residue (i) and non-polar residue (j), rnp and αnp are the parameter radius and factor, respectively, that determine the effect of the non-polar residues. Similar equations were used for the number of polar residues neighboring the ith ionizable residue, , with parameters rp and αp, and for number of membrane grid points neighboring the ith ionized residue, , with parameters rmem and αmem. The relevant parameters are given in Table S1.
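To illustrate what an analytical neighbor count with a parameter radius and a steepness factor can look like, the sketch below uses one plausible smooth form: each neighbor contributes 1 inside the radius and a Gaussian tail beyond it. This functional form and the numerical values are assumptions for illustration; the actual expression is Eq. 20 and the actual parameters are those of Table S1.

```python
import math


def neighbor_count(distances, r_cut, alpha):
    """Smooth count of neighbors around a residue.

    distances : iterable of distances r_ij (Å) to candidate neighbors
    r_cut     : parameter radius (Å); neighbors inside count fully
    alpha     : factor (Å^-2) controlling how fast the count decays outside
    """
    total = 0.0
    for r in distances:
        if r <= r_cut:
            total += 1.0
        else:
            total += math.exp(-alpha * (r - r_cut) ** 2)
    return total


# Hypothetical parameter values, chosen only to exercise the function
n_example = neighbor_count([3.0, 6.9, 8.0, 15.0], r_cut=7.0, alpha=0.1)
```

A smooth count of this kind changes continuously as side chains move, which is convenient when the same quantity is evaluated for many structures or rotamers.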
The values of and have been estimated, by observing the values of neighbors in a set of diverse proteins.25,41 For specific values of rp and rnp given in Table S1 and used extensively in our previous work,6,7,12,23-25 we have observed that less than 5% of ionizable residues have more than . The same feature occurs for the non-polar neighbors: Less than 5% of the ionizable residues have more than , and those which do, are deeply buried inside the interior part of the contained protein.25 Parameters , , and used in our model for all ionizable residues are provided in Table S2, whereas residue-dependent parameters and are provided in Table S3.
In the case of membrane proteins we represent the membrane by a grid of unified atoms, as we have done in our previous studies (e.g. see refs. 6-9,12,13,25). The membrane grid has a regular spacing between the membrane particles, Dspacing. Moreover, the width of such a CG membrane grid, Wmem, is equivalent to the hydrophobic thickness of a lipid bilayer or of the membrane protein under investigation, as described in Section III below. Membrane grid particles near the protein atoms and inside the protein internal cavities (e.g. central cavities in ion channels) are not built. The CG membrane grid points are not modified during CG energy calculations (although the points that are too close to the protein gradually “disappear” through Eq. 28 below). Thus the membrane grid is primarily used to modulate membrane protein energetics rather than to model membrane thickness fluctuations and/or phase behavior as done in other CG models. In our model a CG membrane grid is used to calculate in a similar way to that used in Eq. 20. The resulting self-energy term, which also reflects the boundaries between the protein and the membrane, is given by:
| (21) |
where the term is given by:
| (22) |
The parameter Rsolvent in Eq. 21 is the distance to the closest solvent molecule, which is determined by building a water grid around the system and taking the distance to the closest water grid point.12 The grid (also used for Langevin dipole generation in our previous studies77) was built up to 40 Å from the system center, with points within 10 Å of system surface atoms placed with a spacing of 3 Å (inner grid) and those outside with a spacing of 8 Å (outer grid). Grid points within 4.9 Å of system atoms were excluded. The parameters Lw and Ls determine the effect of the burial of residue (i), and their suggested values in our previously published studies (see refs. 6,7,25) were one half and one quarter of Wmem, respectively. For a membrane grid spacing Dspacing = 2 Å and width Wmem = 36 Å, the values of Lw and Ls were taken as 18 Å and 9 Å (see refs. 6,7,25 for more details). An Ls value of 12 Å was also used for this membrane in our earlier study.12 However, we found that the values of Lw and Ls in Table S4 for membranes with different thickness Wmem and 2 Å grid spacing provide better agreement with microscopic simulation results (see Section III.1 below). The Wmem values are multiples of 4 Å because of the 2 Å membrane grid resolution (Dspacing). The lipids with hydrophobic thickness closest to each Wmem are therefore indicated in the 1st column of Table S4 along with their tail length and unsaturation. Eq. 22 parameters and used in our model for all ionizable residues are provided in Table S2, whereas residue-specific parameters are provided in Table S3 for both the previously published25 (model P14) and our refined model (model 0).
The term ΔΔGisolv,ucg in Eq. 12 was computed using Eq. 22 with for His (in lieu of in this equation, with the other parameters being the same). This term was not used for other ionizable residues in our current model implementation.
The effect of zwitterionic membrane head groups is simulated by placing positive and negative charges on the outer and the subsequent layer of the membrane grid, respectively. The corresponding electrostatic interactions with the protein charges were treated with Eq. 7 and εeff = 20, which has been justified by earlier studies of the electric field from membrane head groups (see, e.g., ref. 78).
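Two of the quantities described in this paragraph are simple enough to sketch directly: the distance from a residue to the closest water grid point, and the earlier rule of thumb for the burial lengths as fractions of the membrane width. The function names and example coordinates are hypothetical, and the refined Lw and Ls values of Table S4 are not reproduced here.

```python
import math


def nearest_solvent_distance(point, water_grid):
    """Distance (Å) from a residue position to the closest water grid point."""
    return min(math.dist(point, grid_point) for grid_point in water_grid)


def earlier_burial_lengths(w_mem):
    """Lw and Ls as one half and one quarter of the membrane width,
    i.e. the values suggested in the previously published studies."""
    return w_mem / 2.0, w_mem / 4.0


# Hypothetical example: a residue at the origin and two water grid points
r_solvent = nearest_solvent_distance((0.0, 0.0, 0.0),
                                     [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)])
l_w, l_s = earlier_burial_lengths(36.0)
```

For the 36 Å membrane this rule gives the 18 Å and 9 Å values quoted in the text; the refined model replaces them with the tabulated values.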
II.3 Side chain polar and hydrophobic contributions
The second term in Eq. 3, , is treated with equations identical to the ones used to calculate the self-energies of the ionizable residues and is given by:
| (23) |
where i runs over all polar residues (Ser, Thr, Tyr, Cys, Asn, Gln), , and are the number of non-polar residues, polar residues, and membrane atoms in the neighborhood of the ith residue. The N terms in Eq. 23 are calculated by using Eq. 20 with exactly the same parameters given in Table S1. The functions , and are given by the same expression as in equations 18-19 with Nmax and αU values provided in Table S2, and residue-dependent Bpolar values are given in Table S3.
The third term in Eq. 3, , is treated with a model similar to that used in the self-energy and polar free energy calculations, as follows:
| (24) |
where i runs over all non-polar residues (Ala, Leu, Ile, Val, Pro, Met, Phe, Trp), and are the number of polar residues and membrane atoms in the neighborhood of the ith non-polar (hydrophobic) residue. They are calculated by using Eq. 20, with exactly the same parameters given in Table S1. The functions and are given by the same expression as in equations 18 and 19 with Nmax and αU values provided in Table S2 and residue-dependent Bhyd values given in Table S3. Please note that value of in Table S2 is 14 whereas corresponding values for and are 28, the same as in our previous studies.25 The substantial reduction in maximum number of membrane particles around a non-polar residue was necessary to provide a good agreement with experimental relative folding free energies for hydrophobic mutations in a membrane protein as discussed in Section III.3.
The term , however, is being treated in a different way, compared to its counterparts. That is, is given by:
| (25) |
where is a constant, similar in nature with the constants described in equations 18, 19 and 22 and with values provided in Table S3. is the number of implicit water grid points within a certain radius from the side chain center. is the total number of implicit water grid points that this specific residue is surrounded with, when it is by itself in a water environment.
To calculate for each non-polar residue (i), we create an implicit water grid around that residue and eliminate the grid points that collide with protein main chain atoms. Next we retain the grid points that lie within the volume between the spheres of radii rhydro (i) and rhydro (i) + 4 Å centered on the side chain atom of the ith residue. The rest of the grid points are eliminated. The number of retained grid points is taken as the value of (see ref. 25 for more details). See Table S5 for residue-specific values of and rhydro (i) used in our current model for all hydrophobic residues.
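The grid-point counting procedure just described can be sketched as a simple filter. The clash cutoff used to decide when a grid point collides with a main chain atom is not given in the text, so the `clash_cutoff` parameter and its default value are assumptions, as are the function name and the example coordinates.

```python
import math


def count_shell_points(center, water_grid, r_hydro, main_chain,
                       clash_cutoff=2.0, shell_width=4.0):
    """Count implicit water grid points in the shell around a side chain.

    center       : (x, y, z) of the simplified side chain atom
    water_grid   : iterable of (x, y, z) grid points
    r_hydro      : inner shell radius for this residue type (Å)
    main_chain   : iterable of (x, y, z) main chain atom positions
    clash_cutoff : assumed collision distance to main chain atoms (Å)
    shell_width  : shell thickness, 4 Å in the text
    """
    count = 0
    for point in water_grid:
        # Step 1: drop points that collide with main chain atoms
        if any(math.dist(point, atom) < clash_cutoff for atom in main_chain):
            continue
        # Step 2: keep points between r_hydro and r_hydro + shell_width
        d = math.dist(point, center)
        if r_hydro <= d <= r_hydro + shell_width:
            count += 1
    return count


# Hypothetical example with four grid points and one main chain atom
n_water = count_shell_points(
    center=(0.0, 0.0, 0.0),
    water_grid=[(4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 0.0, 0.0)],
    r_hydro=3.0,
    main_chain=[(0.0, 5.5, 0.0)],
    clash_cutoff=1.0,
)
```

In the example only the first point survives: the second lies outside the shell, the third clashes with the main chain atom, and the fourth lies inside the inner sphere.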
To achieve better agreement with microscopic simulations for side chain translocation energetics across lipid membranes we also tried to augment by an additional term, , which describes residue binding to water/membrane interface. For a lipid membrane oriented along z axis it is given by an expression:
| (26) |
where determines depth of the interfacial binding minima (similar to , which determines the magnitude of the central barrier), αint determines width of the binding trough, zmem is a position of the membrane center, zi is a side chain position, Wmem is a membrane thickness. For all polar residues αint =0.04 was used whereas values varied as shown in Table S3 since they were optimized to reproduce interfacial binding minima from microscopic simulations.60,61
The same formalism was used to augment by an additional term, using the same αint and values from Table S3. If there is no interfacial binding for that residue. Similarly, can be augmented by an interfacial binding term. However, this was not done in the current implementation of our CG model for the reasons described in Section III.1.
In one of the variants of the refined CG model described in Section III below (called model 2) we were forced to eliminate and terms by assigning and to 0 for all amino acid residues. In another variant (model 1) in line with one of our previous studies13 we scaled down by a factor of 3.57 hydrophobic contributions and in Eq. 24 with term still present and did not consider polar contribution given by Eq. 23 (as well as interfacial term ). Such modifications were done to be able to reproduce experimental absolute folding free energies of membrane proteins as will be discussed in Section III.4 below.
II.4 Side chain van der Waals contribution
The last term of Eq. 3, , describes the effective van der Waals interactions between simplified side chains. It consists of two components: (a) the interactions between the simplified side chains of protein residues, and (b) the interactions between side chains and membrane grid atoms. The first component is described by an “8-6” potential of the form:
| (27) |
where and . The parameters and define, respectively, the well depth and equilibrium distance. These parameters were refined by minimizing the root-mean-square deviations between the calculated and observed values of both the atomic positions and the protein size (i.e., the radii of gyration) for a series of proteins.25 The corresponding refined parameters are given in Table S6.
The van der Waals interactions of membrane grid atoms are treated in a different way to allow for efficient modeling of the membrane effect. That is, the membrane grid is treated with continuous derivatives in order to reduce the need for generating a new grid when the protein is displaced or changes its structure. This was done by building a continuous membrane (instead of deleting membrane points that appear in direct contact with the protein). Accounting for the fact that the membrane grid should be deleted upon contact with the simplified side chain protein atoms, we replaced the standard van der Waals interaction between the protein and the membrane by
| (28) |
where Aij and Bij are parameters for interacting ith side chain and jth membrane grid atom, rij is the distance between the two atoms, and α is a vdW cutoff parameter.
| (29) |
where and are, respectively, the well depth and equilibrium distance for the pair of atoms i and j. Note the different way of calculating , compared to the one used for . Parameter α is equal to 7452.75 Å6.
II.5 Main chain contribution
For folding free energy calculations the main chain contribution is given by Eq. 2 where and are main chain solvation and hydrogen bonding contributions, respectively.
The main chain solvation term is the difference between main chain solvation free energies of the folded and unfolded protein and is given by
| (30) |
| (31) |
where Bsolv = −2 and i runs over all protein residues (Nres) in the sequence. Parameter αθ determines steepness of Uα,i function and is set equal to 10, whereas θmax determines cutoff for the fraction of polar residues and is set to be 0.8.
The function θ, which reflects the fraction of polar residues around the Cα atom of a given residue i, is given by
| (32) |
where is the maximum number of polar residues around a Cα atom (taken as 27 based on the total number of neighbors around a residue buried inside SecY translocon that was used as a test system25); is the maximum number for membrane atoms around a Cα atom (taken as 33 based on using Ala in a membrane with membrane spacing of 4 Å as described in ref. 25). Nnp,i and Nmem,i are the numbers of nonpolar and membrane residues around residue i, which are calculated by the same approach used in the self-energy calculations. The only difference is that we count the residues around the Cα and not the Cβ atom, as done for the calculation of the self-energy contributions. So if a residue is in bulk water and/or surrounded mostly by polar moieties, |θi| is close to 1, Uα,i = 1 and its contribution to term is around −2 kcal/mol, which is effectively cancelled by the unfolded protein contribution, , whereas for a residue in the middle of membrane and/or surrounded mostly by nonpolar residues |θi| ≈ 0, Uα,i is small and its contribution is negligible resulting in a positive main chain solvation term due to an unfolded protein contribution.
The hydrogen bond function is given by
| (33) |
where and are determined by Eq. 31, and we have
| (34) |
and where
| (35) |
where we use μHB = 22.2 Å−2 and rHB = 2.9 Å, with Awater = 0.044 and Amem = 0.22 (see explanation below). Here rij is the distance between the H and O atoms forming an N–H…O=C hydrogen bond between protein backbone atoms. We counted hydrogen bonds satisfying the following geometric criteria: r(H…O) ≤ 3.5 Å and angle N–H…O > 150°.
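The geometric criteria for counting backbone hydrogen bonds can be checked with a few lines of vector arithmetic. The sketch below is illustrative: the function names and the example coordinates are hypothetical, and only the two stated cutoffs (3.5 Å and 150°) come from the text.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [a[k] - b[k] for k in range(3)]
    v2 = [c[k] - b[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos_t = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_t))


def is_backbone_hbond(n_pos, h_pos, o_pos, r_max=3.5, angle_min=150.0):
    """True if r(H...O) <= 3.5 Å and the N-H...O angle exceeds 150 degrees."""
    r_ho = math.dist(h_pos, o_pos)
    return r_ho <= r_max and angle_deg(n_pos, h_pos, o_pos) > angle_min


# Hypothetical collinear geometry: N at the origin, H 1 Å away, O 2 Å from H
linear_case = is_backbone_hbond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0))
```

Only pairs passing this filter would then be passed on to the distance-dependent hydrogen bond functions.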
is the regular HB function used in the standard MOLARIS force field.
| (36) |
while is given by:
| (37) |
with μ = 15 Å−2, r0 = 2 Å, and rij is the same as in Eq. 34.
The scaling factors Awater and Amem are evaluated by the function
| (38) |
where in water Uα is equal to 1 for all residues (see Eqs. 31-32 and discussion above), therefore from Eq. 38 we have
| (39) |
On the other hand, in membrane Uα is set to 0 for all residues (see Eqs. 31-32 and discussion above), and from Eq. 38 we have
| (40) |
Function for HB in water, , determined by Eq. 34 has a shallow minimum and a Gaussian centered at 2.9 Å (controlled by rHB). This reflects the fact that we have to spend some energy to break the HB, but once it is broken, the HB with water molecules can be formed. Function for HB in membrane, , determined by Eq. 35 has a deeper minimum, since it is more difficult to break the HB in a hydrophobic environment.
II.6 Scaled size contribution
Now we can turn to the so-called scaled size term, ΔGsc.size of Eq. 1, which takes into account the total protein size in terms of the number of amino acid residues, Nres, as well as the flexibility of amino acid side chains manifested in the number of single bonds (excluding ones terminated by hydrogen), Nsbond.
| (41) |
Here sres and ssbond are empirical scaling factors and are set to be 0.042 and 0.044 in our current CG implementation to achieve good quantitative agreement with experimental folding free energies (Vicatos and Warshel, personal communication) for a number of proteins described in our previous study.25 Nsbond values for different residue types are provided in Table S7.
In one of the variants of our refined CG model discussed below (called model 2) in the presence of membrane, the scaled size term is multiplied for each residue contribution by (1+smem,resFmem,i) and for single bond contributions by (1+smem,sbondFmem,i) resulting in the following overall expression:
| (42) |
where the empirical scaling coefficients are smem,res = 11.18 and smem,sbond = 13.91, and Fmem,i determines whether a residue is located in the membrane environment and is given by:
| (43) |
where empirical factors , and Nmem,i is the number of membrane grid particles around protein residue i, which is calculated as described for Eq. 32 above.
II.7 CG energy for protein folding and MD relaxation
Thus protein folding free energy can be presented as:
| (44) |
It should be noted that the CG treatment of folding energy uses the simplified main chain treatment of Eq. 2. On the other hand the expression used for ΔGmain in MD relaxation simulations is (see ref. 25):
| (45) |
where ΔGbond, ΔGangle, ΔGtor, and ΔGitor are bond, angle, torsion and improper torsion contributions from the regular ENZYMIX force field.77 is a torsional correction potential, which is used to modify the gas-phase potential ΔGtor, since the protein secondary structure strongly depends on main chain solvation (see ref. 25 for more details). Also, the last term in Eq. 45, , is the charge-charge interaction free energy between the main chain atoms, which is calculated by Eq. 7 with a dielectric constant εeff =10 and using partial atomic charges of protein main chain atoms q instead of QMC (see ref. 25 for more details). The fact that we use different treatments for the main chain energy in relaxation runs and in evaluating the CG energetics reflects the awareness that the microscopic main chain energy (typically defined by similar terms to those included in Eq. 45) fluctuates enormously in relaxation runs and obtaining the corresponding converging microscopic average is exactly what we try to avoid in the CG model. Obtaining the CG contribution of ΔGmain, which has well converged statistical weights of different main chain conformations encountered during protein dynamics, remains a challenging problem. In the present study we only looked at protein crystal structures and considered only side chain conformations for different mutants (as will be described in Section III.3) but did not study main chain dynamics explicitly. We are looking at several options for obtaining more consistent treatment including using virtual bonds between the Cα atoms22 and torsional normal mode analysis in that representation but at present we view the use of Eq. 2 as a powerful compromise as it is not extremely sensitive to the details of the main chain structure.
III. Results and Discussion
III.1 Inserting amino acid side chains into membranes
In our gradual approach to improving the CG model we started by exploring the membrane-protein electrostatic term, refining the energetics of ion insertion into a membrane. This was done by considering the energetics of inserting an ionized Arg side chain and trying to reproduce the corresponding microscopic results for membranes of different thicknesses (ref. 79). In doing so we consider first the results expected from moving to an ideal undeformed membrane and then the results from the actual water penetration and charged group relaxation as revealed by microscopic simulations.79 The refinement effort started by exploring the membrane electrostatic energy for an isolated Arg residue moving across an ideal lipid membrane represented by a hydrophobic slab with ε=2 and different thicknesses. First, we made theoretical estimates of such a barrier using our previously developed model based on image electrostatics (see Eq. 25 from ref. 80). We used a Born radius a=2.75 Å for the Arg side chain analog, methyl guanidinium (MGuanH+), calculated from the theoretical estimate of its hydration free energy (−61 kcal/mol, ref. 81), and obtained a barrier height of ∼26 kcal/mol regardless of membrane thickness (dotted lines in Fig. 1A). A similar estimate was provided by the PDLD/S model77 (dashed lines in Fig. 1A). Those calculations were done for a Na+ ion and the corresponding free energy profiles were scaled down by the ratio of the MGuanH+ and Na+ hydration free energies (−61 and −105 kcal/mol,81,82 respectively). Our CG model can also provide similar Arg self-energies, shown as solid lines in Fig. 1A, if we increase the value of the membrane self-energy parameter for Arg from 10 to 26 kcal/mol (see Table S3). These results are also similar to previous continuum membrane estimates for MGuanH+ across non-deformable lipid membranes of different thicknesses79 (cf. gray dotted and solid lines in Fig. S2B). 
There is an expected increase in the width of the barrier with the increase of membrane thickness but the barrier height remains nearly the same and is determined by the cost of the ion dehydration upon moving to an ideal undeformed membrane.
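The order of magnitude of this dehydration cost can be checked with a simple Born estimate. The sketch below is not the image-electrostatics model of ref. 80 or the PDLD/S calculation: it assumes a point charge in a sphere of radius a transferred between two infinite uniform dielectrics, takes ε=80 for water as an assumed value, and uses hypothetical function names. It also illustrates the rescaling of a reference Na+ profile by the ratio of hydration free energies described above.

```python
# Minimal Born-model sketch (assumptions: spherical ion, infinite uniform media,
# eps = 80 for water; function names are illustrative, not from MOLARIS).

COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def born_energy(radius, eps, charge=1.0):
    """Born charging free energy (kcal/mol) in a dielectric, relative to vacuum."""
    return -COULOMB_KCAL * charge ** 2 / (2.0 * radius) * (1.0 - 1.0 / eps)


def born_transfer(radius, eps_from, eps_to, charge=1.0):
    """Free energy (kcal/mol) of moving the ion from eps_from to eps_to."""
    return born_energy(radius, eps_to, charge) - born_energy(radius, eps_from, charge)


def scale_profile(profile, dg_hyd_target, dg_hyd_reference):
    """Rescale a reference-ion profile by the ratio of hydration free energies."""
    factor = dg_hyd_target / dg_hyd_reference
    return [factor * value for value in profile]


if __name__ == "__main__":
    # MGuanH+ with a = 2.75 Angstrom, water (eps = 80, assumed) -> slab (eps = 2)
    print(round(born_transfer(2.75, 80.0, 2.0), 1))
    # Placeholder Na+ profile (not data from the paper), scaled by -61 / -105
    print(scale_profile([0.0, 20.0, 40.0], -61.0, -105.0))
```

With a = 2.75 Å this infinite-medium estimate comes out at about 29 kcal/mol, a few kcal/mol above the ∼26 kcal/mol slab value quoted above, which is consistent with the sketch neglecting the stabilizing effect of water on both sides of a finite slab.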
Figure 1.
The energetics of a charged Arg side chain translocation through a non-deformable (panel A) or a deformable (panel B) membrane of different thicknesses (Wmem). All free energy estimates are shown relative to Arg side chain in bulk water. CG results using our refined model (model 0) are shown as solid lines. Only electrostatic term is shown in panel A whereas total CG energy (which also includes a small vdW contribution) is shown in panel B for a proper comparison with reference values. In panel A scaled PDLD/S electrostatic estimates are shown as dashed lines whereas theoretical estimates from continuum electrostatics (ref. 80) are shown as dotted lines. In panel B Arg side chain translocation free energies from microscopic all-atom MD simulations (ref. 79) are shown as dotted lines.
The above self-energies were calculated using Eq. 22 without taking into account membrane deformability and water penetration inside membrane modeled using parameters of Eq. 21. However, it is well known from microscopic MD simulations and from logical considerations that lipid membranes deform substantially upon burial of a charged amino acid residue or other ionic species.48,56,60,83,84 Water molecules and/or lipid head groups move deep inside lipid membrane core to solvate such buried charged species, which allows for a substantial decrease in the energetic cost of ion translocation.48,81 Thus the free energy barrier for ion translocation by this molecular mechanism (called “ion induced defect” mechanism) no longer directly depends on the energetic cost of ion dehydration82 but substantially varies with the lipid membrane thickness, e.g. decreasing from 26 to 6 kcal/mol when the hydrophobic membrane thickness decreases from ∼30 Å for 18-carbon long lipid tails to a ∼15 Å for 10-carbon long tails (ref. 79, see dotted curves in Figs. 1B and S2A). The above considerations mean that the Arg CG self-energies calculated using Eq. 22 alone (solid curves in Fig. 1A or Fig. S1A) do not reproduce microscopic results as expected (cf. e.g. solid gray and dotted black lines in Fig. S2B), as they do not reflect the microscopic relaxation that follows the Arg burial. However, using Eq. 21 (that was introduced 12,25 to consider this effect) and varying the Ls and Lw parameters we can account for this effect and reproduce microscopic simulation results as shown in Figs. 1B and S2. Interestingly, the suggested Ls and Lw values from our previous study25, Lw=Wmem/2 and Ls=Wmem/4 do not provide good agreement with microscopic results predicting no changes (Fig. S1B) in the Arg translocation barrier as a function of membrane thickness. 
However, using Ls=7 Å with slight variation of Lw= 22 − 24 Å (see Table S4) we could reproduce a dramatic change in the Arg translocation free energy barriers consistent with microscopic results (Figs. 1B and S2). Especially good quantitative agreement has been achieved when we also increased by 2 Å (from 2.9 to 4.9 Å) the minimum distance from the protein or membrane atoms to the water grid points generated to calculate Rsolvent in Eq. 21. Such a change was also shown to be necessary to predict energetics of Arg in membrane proteins as will be discussed below.
Similarly, we increased the corresponding membrane self-energy parameter for the other ionizable side chains (Lys, His, Asp, Glu) from 10 to 26 kcal/mol as well (see Table S3). The reason for using the same value for all ionizable residues is that microscopic simulations and electrophysiology experiments predict similar barriers for different ionic species regardless of their aqueous hydration energetics or charge sign (+1 or −1) (see e.g. ref. 82). Thus the Arg and Lys side chains were also predicted to have similar barriers for membrane translocation based on microscopic simulations.59 Other microscopic simulations show some differences, e.g. a ∼12 kcal/mol barrier for Lys+, ∼14 kcal/mol for Arg+, ∼19 kcal/mol for Asp− and ∼20 kcal/mol for Glu−.61 This could, however, be partially related to the use of full side chain analogs in that study (e.g. butylammonium for Lys+ and propylguanidinium for Arg+), which allows for some snorkeling towards the interface and a lowering of the free energy barriers,48,58 whereas shorter analogs (methylammonium for Lys+ and MGuanH+ for Arg+) were used in ref. 59. The latter model is more compatible with the ionizable side chain definition in our CG model, where an effective atom X representing an Arg side chain is placed at its charge center.25
Next we extended our tests to all amino acid residues. Our goal was to check performance of our CG model against experimental data if available or microscopic simulations for amino acid side chain translocation across a lipid membrane. As a reference we used the recently published microscopic simulations of Tieleman and co-workers,60,61 who studied the translocation of side chain analogs across a DOPC (1,2-dioleoyl-sn-phosphatidylcholine) membrane. These workers achieved a very close correspondence between free energies at the membrane center and experimental water/cyclohexane partitioning free energies.85
The model system used in attempting to reproduce the above results (and in the Arg CG tests discussed above) has been an isolated side chain moving across a 28 Å thick CG lipid membrane with a 2 Å resolution to mimic a DOPC membrane used in the microscopic simulations.60,61 The CG simulation system consisted of an isolated amino acid CG residue and a CG membrane oriented along z axis with horizontal dimensions 56×56 Å2, where the CG side chain effective atom X was moved along the z axis in 1 Å increment across the CG membrane, which was regenerated for each position. The main chain amino acid residue atoms were placed well outside the membrane with bonds between X and D (dummy atom placed at the position of a side chain Cβ atom and used for an atomistic side chain regeneration25) atoms with the main chain atoms removed. Results are presented in Figures S3-S4 and Table S8 for our previously published25 (model P14) as well as refined (model 0) CG models.
The Bmem values for polar and non-polar residues were initially set to be the same as Bnp,25 which provided sufficient accuracy for our previous CG studies of membrane protein systems (e.g. ref. 12). However, the present refinement work, which included a comparison of the CG model results with microscopic simulations, revealed several discrepancies. The free energy minima at the membrane center were underestimated for several non-polar residues (Val, Leu, Ile), whereas the barriers at the membrane center were either non-existent or strongly underestimated for some polar residues (Thr, Ser, Gln, Asn). For ionizable residues (Arg, Lys, Glu, Asp) the central barriers were also underestimated, while for Trp our previous CG model predicted too deep a minimum in the middle of the membrane, inconsistent with microscopic simulations and experimental water/cyclohexane partitioning (Table S8). Modifying the Bmem values in the new model (Table S3) allowed us to overcome these discrepancies and provide better agreement between CG and reference values for most amino acid side chains (see Table S8). The Bmem values were modified so that the CG results matched the microscopic results60,61 with some exceptions. Although there are no microscopic results for Pro, we followed the trend between the old and new parameter sets for other non-polar residues and doubled the value of Bmem. For Tyr there is a large uncertainty (±1 kcal/mol) in the microscopic free energy estimate, comparable to the magnitude of the barrier,61 and thus a value closely matching the experimental water/cyclohexane partitioning free energy85 (which is within the range of the microscopic free energy uncertainty) was chosen as a target value.
Our previous CG model could not reproduce the existence of interfacial minima for polar and some non-polar residues (Tyr, Trp, Met, Cys, Thr, Ser, Asn, Gln). Such minima are crucial for the interfacial localization of some residues, e.g. Trp.86 The introduction of two additional interfacial terms (see Eq. 26 above), each controlled by a residue-dependent parameter, helped to resolve this issue (see Table S8 and Fig. S3, panels B-D). A similar interfacial term was not used for self-energies, since microscopic MD results show large variations in the interfacial binding of ionizable residues depending on the side chain analog and the membrane model used.56-59,61
Figs. S5 and 2 show the correlations between the CG free energies of the translocation of amino acid side chains through a 28 Å thick membrane and the corresponding microscopic counterparts from ref. 61 or the water-cyclohexane partitioning free energies from ref. 85. Our refined model (model 0) provides much better correlation with microscopic MD results compared to a previously published one (model P14, ref. 25) as expected since Bmem values for most non-polar and polar residues were adjusted to reproduce microscopic MD results (see Fig. S5). A linear regression coefficient for the new model (1.17) is > 1 since it overestimates microscopic free energy barriers for most ionizable residues as model 0 parameters were set to reproduce values for another set of microscopic calculations using shorter side chain analogs59 as was discussed above. More importantly, there is also a good correlation between model 0 membrane partitioning free energies and experimental water/cyclohexane partitioning free energies (see Fig. 2B) with a linear regression coefficient very close to 1 (0.96), whereas model P14 (with a linear regression coefficient 0.78) shows substantial underestimation of experimental partitioning free energies (Fig. 2A).
Figure 2.
The correlation between amino acid side chain experimental water/cyclohexane partitioning free energies, ΔGexp(wat→cHex),85 and their CG model relative free energies in the middle of a 28 Å thick membrane with 2 Å resolution, ΔGCG(wat→mem), using (A) previously published25 (model P14) and (B) our refined (model 0) CG model results. Linear regressions are shown as solid black lines with equations (y=ax+b) and Pearson's correlation coefficients (rxy) provided as well. Dotted black lines correspond to equation y=x. Standard one-letter amino acid names are shown. For ionizable residues (D, E, K, R and H) ΔGCG(wat→mem) for a neutral form with addition of an energetic cost of a residue discharging in bulk water was used to provide a direct comparison between CG and published experimental results.
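The regression statistics quoted for Fig. 2 (slope a and intercept b of y=ax+b, and Pearson's rxy) can be computed as in the minimal sketch below. It uses ordinary least squares in plain Python; the function name and the input values are placeholders for illustration, not the data of Table S8.

```python
from math import sqrt


def linear_fit(x, y):
    """Return slope a, intercept b (y = a*x + b) and Pearson's r for paired data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                 # least-squares slope
    b = mean_y - a * mean_x       # intercept
    r = sxy / sqrt(sxx * syy)     # Pearson correlation coefficient
    return a, b, r


if __name__ == "__main__":
    # Placeholder values standing in for experimental (x) and CG (y) free energies
    dg_exp = [-4.0, -2.0, 0.0, 3.0, 6.0]
    dg_cg = [-3.5, -2.2, 0.4, 2.6, 5.9]
    a, b, r = linear_fit(dg_exp, dg_cg)
    print(f"y = {a:.2f}x + {b:.2f}, r = {r:.2f}")
```

A slope a below 1 indicates that the CG values systematically underestimate the magnitude of the reference values, which is how the coefficients 0.78 and 0.96 are read in the text.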
III.2 Inserting amino acids on a long helix into membranes
The second system considered in our refinement study involved an amino acid residue in the center of a long transmembrane α-helix sliding up and down across a lipid membrane. Such models were used in recent microscopic MD simulations to determine the energetic cost of translocating an ionizable residue across lipid membranes (see e.g. refs. 48,58,59). In those studies an 81-residue poly-Leu helix with an Arg or Lys residue in the middle, i.e. L40RL40 or L40KL40, was used. We used the same system with Arg in the middle, i.e. L40RL40. The reason for using such a long helix in the current and previous studies is that the central Arg residue can slide across the membrane while both ends of the helix remain in water, which gives the Arg contribution without the need to account for partial desolvation or membrane embedding of the Leu residues at the ends of the helix.
To account for a possible Arg side chain snorkeling to lower its self-energy in the middle of membrane, as was predicted by microscopic simulations,48 it is important to consider different Arg side chain rotamers. Thus we used a protein side chain rotamer library from ref. 87 implemented in MOLARIS although we explored other available choices, for instance, rotamer library in Pymol mutagenesis toolkit from ref. 88 with very similar results (data not shown). For the 5 torsional angles that govern the conformations of Arg (χ1 to χ5: χ1=N-Cα-Cβ-Cγ, χ2= Cα-Cβ-Cγ-Cδ, χ3= Cβ-Cγ-Cδ-Nε, χ4= Cγ-Cδ-Nε-Cz, χ5= Cδ-Nε-Cz-Nh1) we identified 34 rotamers from the analysis of the PDB structures,87 32 of which were stable for the L40RL40 system (they do not have substantial steric clashes with other peptide atoms) and will be considered further. We then used MOLARIS to generate those rotamers in the explicit all-atom representations of our helix; converted each configuration to a CG model; added a CG membrane grid with a thickness of 28 Å, representative of a DPPC (1,2-dipalmitoyl-sn-phosphatidylcholine) membrane used in microscopic MD simulations,48 and with a 2 Å spacing; and computed energies of those systems for different helix z positions represented by a position of a central Arg Cα, z(Arg Cα), across the membrane. All the generated rotamers for the Arg position in the middle of the membrane i.e. for z(Arg Cα)=0 are shown in Table S9. It was found that a variation of χ1 to χ4 values (χ5 controls the flipping of the guanidinium group around Nε-Cz axis and does not affect Arg side chain position and thus was not varied) results in different z positions of the CG effective Arg side chain atom X, ranging from -5.2 to 2.9 Å, whereas the Cα-X bond length varied from 5.1 to 7.0 Å, and the corresponding N-Cα-X angle has values ranging from ∼78° to ∼170° (see Table S9). 
This variation had a dramatic influence on the Arg self-energies obtained with the refined CG model described in this work (model 0), which range from 8.0 to 21.7 kcal/mol, with much less variation (from 10.2 to 10.5 kcal/mol for stable rotamers) for the previously published model (model P14) for the reasons explained above. The rotamer with the lowest CG energy was chosen as a representative for each z, e.g. rotamer 11 for z(Arg Cα)=0 with a self-energy of 8.0 kcal/mol. For this rotamer z(Arg X)=−5.2 Å, i.e. the Arg side chain snorkels downwards, resulting in a substantial decrease of the Arg self-energy (see Fig. 3). Rotamers with side chains oriented nearly parallel to the membrane plane (see Table S9 and Fig. 3 for rotamer 20) and those with an upward oriented Arg side chain (i.e. z(Arg X) > z(Arg Cα), see e.g. rotamer 25 in Fig. 3) were found to have higher self-energies than those with the downward orientation. This finding is in agreement with previous microscopic studies of L40RL40 membrane translocation.48 For other z(Arg Cα) values different Arg rotamers give the lowest self-energy values (see Fig. 3 and Table S12). Rotamer 20, with a mostly parallel side chain orientation (with respect to the membrane plane), is the most stable when Arg is in bulk water; the downward oriented rotamers 10 and 11 are the most stable when Arg is in the lower membrane core and near the membrane center; whereas the upward oriented rotamers 25 and 24 provide the lowest self-energies in the upper membrane core (Fig. 3 and Table S12). One potential issue with choosing rotamers solely on the basis of their CG energies is that we ignore possible steric clashes between the Arg side chain and the rest of the helix. For the L40RL40 system we found that rotamers 30 and 34 have substantial unfavorable Lennard-Jones (LJ) interactions, on the order of several thousand kcal/mol, based on atomistic calculations (see Table S9, where those rotamers are highlighted in red). 
Therefore we did not consider those rotamers in choosing ones with lowest CG energies. The situation becomes even more complex for membrane proteins and we need to choose objective criteria taking into account atomistic energetics as will be discussed in the next section.
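The selection procedure described above (for each helix position z, take the rotamer with the lowest CG energy after discarding sterically clashing rotamers) can be sketched as follows. This is only an illustration of the bookkeeping: the function name and data layout are hypothetical, and the energies are placeholders rather than values from Tables S9 or S12.

```python
def lowest_energy_rotamers(energies, excluded=frozenset()):
    """Pick the lowest-energy allowed rotamer for every z position.

    energies: {z: {rotamer_id: cg_energy_kcal_per_mol}}
    excluded: rotamer ids discarded beforehand (e.g. because of steric clashes)
    returns:  {z: (rotamer_id, cg_energy)}
    """
    best = {}
    for z, by_rotamer in energies.items():
        allowed = {rot: e for rot, e in by_rotamer.items() if rot not in excluded}
        if not allowed:
            raise ValueError(f"no allowed rotamers left at z = {z}")
        rot = min(allowed, key=allowed.get)
        best[z] = (rot, allowed[rot])
    return best


if __name__ == "__main__":
    # Placeholder energies (kcal/mol) for three rotamers at two z positions
    cg_energies = {
        0.0: {11: 8.0, 20: 21.5, 30: 5.0},
        20.0: {11: 1.5, 20: 0.5, 30: 0.2},
    }
    # Rotamers 30 and 34 are excluded, mirroring the clash filter in the text
    print(lowest_energy_rotamers(cg_energies, excluded={30, 34}))
```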
Figure 3.

A few representative Arg rotamers providing lowest L40RL40 CG energies using our refined CG model (model 0) for different z positions across a 28 Å thick membrane. Rotamer (rot.) numbers, corresponding to those in Table S9 are shown. A poly-Leu α-helix is shown as a green ribbon, a central Arg residue is shown in an all-atom stick representation (C is dark-gray, O – red, N – blue, H – white). Cα and CG X atoms of that residue are shown as pink and cyan balls, respectively. A CG membrane is shown as a grid composed of small gray dots. All molecular structures in this and other figures were drawn using VMD (Visual Molecular Dynamics) program.100
Our study found that the self-energy profile for L40RL40 across a 28 Å thick membrane is very sensitive to the model used and to our choice of rotamers (see Fig. S7A). For example, the previously published CG model25 (model P14) produced a flat barrier of the same magnitude (∼10 kcal/mol) but a different width near the membrane center for an isolated Arg side chain and for L40RL40 with or without choosing lowest-energy rotamers. However, the picture is very different with the current CG model modification (model 0). When the lowest self-energy rotamer is chosen for each z, as described above, the central barrier is also about 10 kcal/mol (Fig. S7A), at the top of a Λ-shaped profile characteristic of the results of microscopic simulations.58 Interestingly, if we were to ignore the relative energies of the Arg side chain rotameric states across the membrane and use a rotamer with a parallel side chain orientation (e.g. rotamer 20, see Fig. 3 and Table S9) throughout, the central barrier would be comparable to that for an isolated Arg side chain (∼21.5 kcal/mol, see Fig. S7A).
We should also consider total CG folding free energy as described by Eq. 44 above (see Fig. 4A). Using the lowest CG energy rotamer for each z, the central L40RL40 translocation barrier would be around 14.0 kcal/mol, using model 0 described in this work, and ∼12.6 kcal/mol for the previously published model P14 (cf. red and blue curves in Fig. 4A). For both models the total ΔG barrier is dominated by a self-energy term followed by a hydrophobic contribution due to a displacement of a Leu residue from membrane to water as Arg moves to the membrane core (see Fig. S8 and Tables S11 and S12). The barrier height and a shape of the profile are in a reasonable agreement with those from a microscopic MD simulation for the same system (black curve in Fig. 4A). Such an agreement is improved even further upon using a modification of model 0 with a scaled up size term from Eq. 42, which we call model 2 (pink curve in Fig. 4A) and which will be discussed in more detail below. In that model the scaled size term becomes as important as the hydrophobic contribution (see Fig. S8 and Table S14). On the contrary, model 1, model 0 modification with a reduced hydrophobic term (also to be discussed below), somewhat reduces quantitative agreement with microscopic MD for the barrier height (green curve in Fig. 4A) for obvious reasons (see Fig. S8 and Table S13).
Figure 4.

A comparison of relative CG free energies for a long hydrophobic transmembrane (TM) α-helix with charged Arg in the middle sliding across a membrane. (A) poly-Leu/DPPC and (B) poly-Ala/DLPC systems. Insets show CG simulation systems with Arg Cα at the membrane center (z=0). A TM helix is green, Arg Cα and CG X atoms are pink and cyan balls. A CG membrane grid is shown by gray dots. An atomistic ΔG profile for Arg/poly-Leu helix translocation across a DPPC membrane from ref. 48 is shown by a solid black curve in panel A. Experimental relative folding free energies for OmpLA A→R mutants from ref. 63 are shown by a dotted black curve in panel B. CG results using a previously published (model P14, ref. 25) and refined models (models 0, 1 and 2) are shown by blue, red, green and pink curves. For each z an Arg rotamer with the lowest CG energy was chosen.
Since the Arg residue may be deprotonated (and thus neutralized) in order to overcome the large self-energy of burying it in the middle of the membrane, we considered this possibility for the L40RL40 membrane translocation. The energetics of such deprotonation is governed by the self-energy term given by Eq. 12, which for a deprotonated Arg residue (QMC = 0) is evaluated according to Eq. 13. Based on Eqs. 12 and 13, this term will be smaller for a neutralized Arg residue (Arg0) than for a charged one (Arg+) in the membrane relative to the same species in bulk water, which is confirmed by our L40RL40 calculations (compare dashed and solid lines in Fig. S7A). The difference between the CG energies of the charged and neutral Arg forms depends greatly on the rotamer because of an exponential term in the self-energy (cf. last two columns in Table S9). For instance, this difference is only ∼2.8 kcal/mol for the downward oriented rotamer 11 but increases to ∼14.3 kcal/mol for rotamer 20 with a parallel side chain orientation. There is reasonably good agreement between our CG results and the microscopic MD estimates of ref. 58 for the membrane translocation energetics of the L40RL40 peptide with a neutral Arg in the middle (cf. dashed black and colored curves in Fig. S6A). We should also note that the fact that the self-energy is always smaller for a neutral Arg does not mean that Arg will always be deprotonated inside the membrane. The protonation states of ionizable residues are determined by minimizing the electrostatic free energy term given by Eq. 15. In the absence of charge-charge interactions, as in the case of the L40RL40 system, the protonation state is determined by the Arg intrinsic pKa given by Eq. 16, which in turn is a function of its self-energy (ΔGself) and its pKa in bulk water. It should be noted that the electrostatic free energy given by Eq. 15 is determined for protein residues in their specific environment, i.e. protein and/or membrane, whereas the electrostatic component of the folding free energy provided by Eq. 14 is calculated for those residues in that environment relative to their contributions in an unfolded protein in bulk water. It is possible to provide a direct and accurate correlation between those quantities for simple systems (e.g. by using modifications of a term in Section II.2 above), but this is beyond the scope of the present study.
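To illustrate how a self-energy penalty translates into an intrinsic pKa, the sketch below uses the standard thermodynamic relation pKa(int) = pKa(water) − q·ΔΔGself/(2.303RT). This form is an assumption made here for illustration: Eq. 16 itself is not reproduced in this section and may differ in detail. The function name, the temperature and the bulk-water Arg pKa of 12.5 (a commonly quoted textbook value) are likewise illustrative choices.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def intrinsic_pka(pka_water, ddg_self, charge, temperature=298.15):
    """Assumed relation: pKa_int = pKa_w - q * ddG_self / (2.303 * R * T).

    pka_water: pKa of the group in bulk water
    ddg_self:  extra cost (kcal/mol) of the ionized form, relative to the
               neutral form, on moving from water to the site of interest
    charge:    charge of the ionized form (+1 for Arg/Lys/His, -1 for Asp/Glu)
    """
    return pka_water - charge * ddg_self / (math.log(10) * R_KCAL * temperature)


if __name__ == "__main__":
    # Charged/neutral differences quoted in the text for rotamers 11 and 20
    for rotamer, ddg in ((11, 2.8), (20, 14.3)):
        print(rotamer, round(intrinsic_pka(12.5, ddg, +1), 1))
```

Under these assumptions the ∼2.8 kcal/mol difference for rotamer 11 lowers the Arg pKa by about 2 units, whereas the ∼14.3 kcal/mol difference for rotamer 20 lowers it by about 10 units, which shows why the predicted protonation state depends so strongly on the rotamer.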
In addition to the membrane translocation of L40RL40 system described above we performed similar calculations for a 79-residue poly-Ala helix with Arg in the middle, i.e. A39RA39, embedded in a CG membrane with 2 Å spacing and 20 Å thickness representing DLPC. The reason for using this model system was to provide some comparison with OmpLA experimental Ala→Arg mutant relative folding free energies,63 discussed in more detail in the next section. We considered the same set of Arg rotamers as for a L40RL40 system choosing for each z(Arg Cα) ones, which minimize the CG energies (also excluding rotamers 30 and 34 due to steric clashes with other peptide atoms). There is also a substantial dependence of the Arg self-energy on its rotameric state, ranging from 2.1 to 10.0 kcal/mol for model 0 at z(Arg Cα)=0, whereas for model P14 Arg rotamer self-energies at this position vary by less than 1 kcal/mol (from 10.2 to 10.9 kcal/mol, see Table S10). This determines the shape and height of a self-energy profile across the membrane, which has a plateau of ∼10 kcal/mol in the middle for model P14 (Fig. S7B), very similar in shape and magnitude to those for an isolated Arg side chain and a L40RL40 peptide. On the contrary, a Λ shaped self-energy profile for A39RA39 using model 0 reaches only ∼2 kcal/mol at the membrane center when using lowest-energy rotamers for each z, much smaller than 9.4 kcal/mol for an isolated Arg side chain across a membrane of the same thickness and up to ∼10.0 kcal/mol for a L40RL40 helix across a 28 Å thick membrane (see Fig. S7). Furthermore, using neutral Arg residue in an A39RA39 expectedly reduces self-energy in the middle of membrane (cf. solid and dashed curves in Fig. S7). This effect is substantial for model P14 (from 10.0 to 5.7 kcal/mol at |z|≤1 Å) but rather small using lowest-energy rotamers for model 0 (e.g. from 2.4 to 2.2 kcal/mol at z = 1 Å).
Considering the total ΔG for the A39RA39 peptide across a 20 Å thick CG membrane (dominated by self-energy, hydrophobic and/or scaled size terms, see Fig. S8 and Tables S15-S18) allows us to make a rough comparison with the experimental Ala→Arg mutant relative folding free energies63 for the OmpLA β-barrel membrane protein (cf. colored solid and black dotted curves in Fig. 4B). Model 0 (red curve in Fig. 4B) provides the best agreement with the experimental results, followed by model 1 with a downscaled hydrophobic term (green curve in Fig. 4B). The previously published model P14 (blue curve in Fig. 4B) substantially overestimates the experimental OmpLA relative mutational free energy. This can be expected, since its high Arg self-energies in the middle of the membrane are nearly insensitive to the side chain orientation (see Table S10) and thus do not reproduce the energy lowering due to Arg side chain snorkeling. Interestingly, model 2 with an increased size term (pink curve in Fig. 4B) also leads to a substantial overestimation of the OmpLA relative mutant free energies, despite accurately predicting the microscopic MD results for the L40RL40 system as described above. The overestimation seems to stem from a dominant scaled size contribution to the A39RA39 peptide energetics in model 2 (see Fig. S8 and Table S18). It is hard to say, however, without additional reference data and/or calculations, whether this is an artifact of the model or a result of very different membrane energetics for an α-helical A39RA39 peptide and the β-barrel membrane protein OmpLA. Therefore direct calculations on the OmpLA system will be discussed in the next section.
III.3 The energetics of mutations in OmpLA
A main validation test for our refined CG model was the comparison of its results with the experimental data for the relative folding free energies, ΔΔGfold, of mutants of the OmpLA β-barrel protein. Experimental data from the study of Fleming and co-workers63 on the reversible folding of this protein in DLPC membranes are available for the wild-type (WT) protein, for various A210X mutants of residue 210 located near the membrane center, and for Arg and Leu mutants at several other positions across the membrane (see Fig. 5). For our CG studies we used the available PDB crystal structure 1QD5 of the WT protein. The mutant structures were built by mutating the corresponding residue in the WT structure using MOLARIS. As in the case of Arg on the poly-Leu and poly-Ala helices, we had to consider different rotamers for the mutated residue side chains from the MOLARIS rotamer library87 and choose those with the lowest CG energy. However, as for those helices, we need to be aware of possible steric clashes of the built mutant side chain. Here, owing to the complexity of the system, it is not feasible to discard rotamers with high LJ energy values for every mutant individually. Therefore we introduced an objective energetic criterion for choosing the lowest-energy rotamer: (ΔGfold + 0.1ULJ), where ΔGfold is the CG folding free energy given by Eq. 44 and ULJ is the atomistic Lennard-Jones energy. Thus the rotamer with the lowest CG energy is favored unless there are substantial steric clashes. In the OmpLA studies we also used CG membranes with two thicknesses Wmem: 20 and 24 Å (see Fig. 5). The former corresponds roughly to the hydrophobic thickness of the DLPC bilayer, which was used in the experimental studies, whereas the latter is approximately equal to the hydrophobic thickness of OmpLA.63 It would seem more natural to use only the DLPC thickness of 20 Å for these calculations, as it would correspond to the experimental setup. 
However, since a hydrophobic mismatch can result in the adjustment of the membrane thickness we considered a 24 Å thick membrane as well. In a recent computational study no substantial membrane thickness change around OmpLA was reported during a multi-ns microscopic simulation.89 However, the authors reported membrane (and OmpLA) hydrophobic thickness of ∼23 Å,89 whereas an experimental estimate for an unperturbed DLPC membrane is 20.9 Å,90 which justifies using both 20 and 24 Å thick CG membranes in our calculations.
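The rotamer criterion (ΔGfold + 0.1ULJ) introduced above can be sketched as a simple scoring function. The function name, the tuple layout and the numerical values below are hypothetical placeholders chosen only to show how a large LJ energy overrides a slightly lower CG energy; they are not results for any OmpLA mutant.

```python
def select_rotamer(candidates, lj_weight=0.1):
    """Return the candidate minimizing dG_fold + lj_weight * U_LJ.

    candidates: iterable of (rotamer_id, dg_fold, u_lj), energies in kcal/mol
    lj_weight:  weight of the atomistic Lennard-Jones energy (0.1 in the text)
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("no rotamer candidates supplied")
    return min(candidates, key=lambda c: c[1] + lj_weight * c[2])


if __name__ == "__main__":
    # Placeholder candidates: rotamer 2 has the lowest CG energy but clashes
    rotamers = [
        (1, -30.0, 5.0),     # score -29.5
        (2, -32.0, 2000.0),  # score 168.0, rejected because of the LJ term
        (3, -29.0, 1.0),     # score -28.9
    ]
    print(select_rotamer(rotamers))
```

Setting lj_weight to zero recovers a selection based on the CG energy alone, which makes it easy to see which mutants are affected by the clash penalty.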
Figure 5.

OmpLA (cyan) in (A) a 20 Å thick (DLPC) or (B) a 24 Å thick (corresponding to the protein hydrophobic thickness) CG membrane with 2 Å resolution (small dark-gray spheres). Cα positions of mutated residues are shown by large colored spheres: A164 – red, L120 – orange, A210 – yellow, G212 – green, A223 – blue, Y214 – purple.
The relative CG folding free energies for the A210X mutants, ΔΔGfold(A210X) were calculated as differences between ΔGfold for the A210X lowest-energy rotamer and WT structure, using:
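| $\Delta\Delta G_{\mathrm{fold}}(\mathrm{A210X}) = \Delta G_{\mathrm{fold}}(\mathrm{A210X}) - \Delta G_{\mathrm{fold}}(\mathrm{WT})$ | (46) |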
The corresponding results are summarized in Table S20 (with more detailed information, including lowest-energy rotamer geometries and relative CG energy components, provided for the different models in Tables S21-S28). First, we compared the experimental results to those for the previously published model (model P14). For most side chains ΔΔGfold is substantially underestimated in magnitude, with an RMS error of ∼1.9 kcal/mol and the largest error of 4.6-4.8 kcal/mol for the Lys mutant. The use of our refined model (model 0) leads to substantially better agreement, with the RMS error dropping to ∼1.1 kcal/mol and the largest error being ∼3.0 kcal/mol for Lys using a 20 Å thick membrane and ∼2.1 kcal/mol for Arg using a 24 Å thick membrane. The correlation between the experimental and CG ΔΔGfold values also improves substantially, which is especially clear from the linear regression analysis shown in Figs. 6 and S9. The linear correlation plots for model 0 are much closer than those for model P14 to the y=x line representing perfect agreement between the experimental and CG ΔΔGfold (see Fig. S9). Furthermore, the difference between the experimental and CG ΔΔGfold decreases for most residues (see Table S20). The exceptions are Cys, Phe, Met and Tyr, but for these the difference between the two models is well within 1 kcal/mol.
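The bookkeeping behind these comparisons (relative folding free energies with respect to WT, followed by the RMS and largest errors against experiment) can be sketched as below. Function names and all numerical values are placeholders for illustration; they are not entries of Tables S19 or S20.

```python
from math import sqrt


def relative_folding_energies(dg_mutants, dg_wt):
    """ddG_fold(mutant) = dG_fold(mutant) - dG_fold(WT), in kcal/mol."""
    return {name: dg - dg_wt for name, dg in dg_mutants.items()}


def error_summary(calculated, experimental):
    """Return (RMS error, mutant with the largest |error|, that signed error)."""
    shared = sorted(set(calculated) & set(experimental))
    if not shared:
        raise ValueError("no mutants in common")
    errors = {name: calculated[name] - experimental[name] for name in shared}
    rms = sqrt(sum(e * e for e in errors.values()) / len(errors))
    worst = max(errors, key=lambda name: abs(errors[name]))
    return rms, worst, errors[worst]


if __name__ == "__main__":
    # Placeholder CG folding free energies (kcal/mol), not data from the paper
    dg_cg = {"A210K": -28.0, "A210L": -31.0}
    ddg_cg = relative_folding_energies(dg_cg, dg_wt=-30.0)
    ddg_exp = {"A210K": 5.0, "A210L": -2.0}  # placeholder reference values
    print(ddg_cg)
    print(error_summary(ddg_cg, ddg_exp))
```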
Figure 6.
Linear correlation plots between experimental (ref. 63) and CG relative folding free energies, ΔΔGfold, of OmpLA A210X mutants using our refined CG model (model 0) and its variants with a scaled down hydrophobic term (model 1) or a scaled up size term (model 2). Linear regressions are shown as solid black lines with equations (y=ax+b) and Pearson's correlation coefficients (rxy) provided as well. Dotted black lines correspond to the equation y=x. An OmpLA crystal structure (PDB: 1QD5) was used to generate rotamers for different mutants. A rotamer minimizing an objective function combining the CG energy with 0.1 of the atomistic LJ energy was chosen for each mutant residue. No structural relaxation was performed. CG membranes of 20 Å (left column) and 24 Å (right column) thicknesses (Wmem) were used. See text for more details.
Our revised model provides good agreement with the experimental ΔΔGfold for the OmpLA A210X mutants. However, the absolute folding free energies, ΔGfold, should be compared as well, as will be discussed in more detail in the next section. Unfortunately, our revised model (model 0) substantially overestimates the experimental ΔGfold for OmpLA (see e.g. Table S19). To overcome this deficiency we tested several options. In one of them, which we called model 1, we scaled down the hydrophobic term by a factor of ∼3.6 and did not consider the polar term, similarly to what was done in one of our previous studies.13 In this case we achieved good agreement with the experimental ΔGfold for the WT, within ∼2.5 kcal/mol for both membrane thicknesses (see Table S19). However, the agreement with the experimental ΔΔGfold for the different A210X mutants deteriorates substantially, almost to the level of our previously published model, P14 (see Table S20). The linear regression coefficients drop substantially as well (see Figure 6, panels C and D). Thus we explored a different option, which we called model 2, in which we removed the interfacial contributions (see Section II.3 and Eq. 26 above) and introduced a membrane dependence of the size term given by Eq. 42 in Section II.6. The membrane associated size term parameters were adjusted to provide good agreement with reference data for OmpLA WT (for a 24 Å thick membrane) as well as for poly-Ala and poly-Leu helices, as will be discussed in the next section. Thus the experimental ΔGfold for OmpLA WT in a 24 Å thick membrane is closely reproduced by model 2, whereas it is overestimated (but not nearly as much as for model 0 or P14) in a 20 Å thick membrane (see Table S19). More importantly, model 2 provides the same level of agreement as model 0 (and even somewhat better agreement for a 24 Å thick membrane) for the A210X mutant ΔΔGfold values (see Table S20).
For model 2 ΔΔGfold values in a 24 Å thick membrane, the RMS error is ∼1.0 kcal/mol and the maximum error is ∼2.4 kcal/mol (for the Tyr mutant). The linear regression coefficient a (in the y=ax+b equation) is 0.79 (see Fig. 6F), comparable to that for model 0 (0.78, Figure 6B). The agreement with experiment is worse using a 20 Å thick membrane with model 2 (see Fig. 6E and Table S20), but it is still substantially better than for model 1 (except for absolute folding free energies, Table S19). Interestingly, all the models used in our study tend to underestimate the experimental ΔΔGfold(A210X), since the linear regression coefficients a are less than 1, even for models 0 and 2. This occurs despite the fact that model 0 provides excellent agreement between CG water→membrane and experimental water→cyclohexane partitioning free energies for isolated amino acid side chains (see Fig. 2B and discussion in Section III.1 above). The largest (but still moderate, typically less than 2 kcal/mol) underestimation of experimental ΔΔGfold(A210X) seems to come from some ionizable residues such as Lys, His and Asp (as well as the polar Asn residue) and could be related to an underestimation of the electrostatic penalty for their burial in the middle of the membrane. On the other hand, ΔΔGfold(A210R) is moderately overestimated in a 24 Å thick membrane, although this could be related to the lack of mutant side chain relaxation, as described below. Our CG estimates for ΔΔGfold(A210Y), i.e. the Ala→Tyr mutation, are also 1-2 kcal/mol too unfavorable (see Table S20), which can be related to a missing favorable interaction of the mutant residue with other nearby residue(s), water molecules or polar lipid moieties. 
A similar explanation was proposed for an overestimation of similar magnitude of the A210S mutant free energy in a recent microscopic MD study.91 The reason for the underestimation of experimental folding free energies for several charged and polar residues is harder to pinpoint; it can be related to more global protein structural relaxation and/or readjustment in the membrane environment (tilting, shifting etc.) not considered in this study. Even though it might be possible to add residue-specific correction terms to provide perfect agreement between experimental and CG ΔΔGfold(A210X) values, this might lead to a loss of transferability of our model. Such a model would perfectly predict OmpLA mutant relative folding free energies but would likely result in substantial errors for absolute folding free energies of OmpLA as well as other systems (discussed in more detail in the next section). Similarly, adding the same correction term to account for a systematic underestimation of many CG ΔΔGfold(A210X) values would increase errors in those values for other important residues where such underestimation is not an issue (e.g. Arg and Tyr as discussed above) and would likely worsen the agreement for absolute folding free energies as well.
In addition to the A210X mutants we also considered Arg and Leu mutants at other positions across the membrane (see Figure 5), as was done in an experimental study.63 In this case all the ΔΔGfold values were calculated with respect to Ala residues at those positions. Therefore, if a residue other than Ala was at the mutation position in the WT protein, e.g. Gly at position 212, we considered quantities of the form ΔΔGfold(A212X) = ΔGfold(G212X) − ΔGfold(G212A), where X is Arg or Leu.
When Ala was present at the point of the mutation in the WT protein, the difference in free energies with respect to WT was calculated. The results are shown in Tables S29 and S30 (with the lowest energy rotamer geometry and relative CG energy components for each mutant provided in Tables S31-S36) and Figures 7 and S10. Interestingly, the refined model (model 0) and its modifications (models 1 and 2) resulted in a substantial overestimation of ΔΔGfold(A120R) and, for most of these models, a more moderate overestimation of ΔΔGfold(A210R) and ΔΔGfold(A214R) (see Table S29, part A and Figure S10, panel A). Such overestimation might be related to the rejection of rotamers with low CG energy because of substantial steric interactions (large ULJ). This is corroborated by the large values of ΔGfold for rotamers with the lowest value of (ΔGfold + 0.1ULJ), especially for the L120R mutant (3.9 kcal/mol for model 0 using a 24 Å thick membrane, with an overestimation of the experimental value by 4.2 kcal/mol, see Table S32). To resolve this issue we performed 10 ps of atomistic MD simulation of the mutant Arg side chains (with other protein atoms fixed) for all rotamers in the gas phase, with Arg partial atomic charges corresponding to a neutral state. Using the same criterion to choose the lowest energy rotamer (ΔGfold + 0.1ULJ; see Fig. 8 for representative structures), we found that the Arg side chain relaxation improves the agreement with the experimental results (see Table S29, part B and Figs. 7A and S10B). Even though ΔΔGfold(A120R) is still overestimated, the error drops, especially for model 0 (from ∼4.2 to ∼0.8 kcal/mol for a 24 Å thick membrane, Table S29). The worse performance of model 2 is not unexpected in this case, as a bigger Arg size term results in higher ΔΔGfold(A120R) values (e.g. an increase in the scaled size term by ∼1.2 kcal/mol and in the electrostatic term by ∼1.0 kcal/mol raises the ΔΔGfold(A120R) error to ∼3.4 kcal/mol for model 2 using a 24 Å thick membrane, see L120R(MD) entries in Tables S32 and S36). The same was observed for the A→R ΔG profile of the A39RA39 system using this model, as discussed in a previous section. Overall, model 0 provides the best agreement for A→R mutations, with RMS errors of 1.3 and 1.0 kcal/mol for 20 and 24 Å thick membranes, respectively. Models 1 and 2 show somewhat worse agreement with experiment. None of the models presented here reproduces the trend in ΔΔGfold(A→R) observed experimentally, but this is not unexpected since in the present study we ignore large-scale protein structural relaxation and/or tilting. However, the semi-quantitative agreement between computed and experimental free energy differences is reasonable.
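The rotamer selection criterion used throughout this section, i.e. choosing the rotamer with the lowest value of (ΔGfold + 0.1ULJ), can be sketched as below. The weight of 0.1 is taken from the text; the `Rotamer` container, the function names and the energy values are hypothetical and serve only to illustrate how a rotamer with the lowest CG energy can be rejected because of steric clashes.

```python
from typing import NamedTuple, Sequence

class Rotamer(NamedTuple):
    index: int         # rotamer number (hypothetical labels)
    dg_fold_cg: float  # CG folding free energy, kcal/mol
    u_lj: float        # atomistic Lennard-Jones energy, kcal/mol

LJ_WEIGHT = 0.1  # weight of the atomistic LJ term, as stated in the text

def rotamer_score(rot: Rotamer, lj_weight: float = LJ_WEIGHT) -> float:
    """Objective function: CG energy plus a fraction of the LJ energy."""
    return rot.dg_fold_cg + lj_weight * rot.u_lj

def select_rotamer(rotamers: Sequence[Rotamer],
                   lj_weight: float = LJ_WEIGHT) -> Rotamer:
    """Return the rotamer that minimizes the combined objective function."""
    return min(rotamers, key=lambda rot: rotamer_score(rot, lj_weight))

# Hypothetical rotamer energies (not from Tables S31-S36).
rotamers = [
    Rotamer(index=1, dg_fold_cg=-1.0, u_lj=60.0),  # lowest CG energy, clashing
    Rotamer(index=2, dg_fold_cg=1.5, u_lj=5.0),
    Rotamer(index=3, dg_fold_cg=3.0, u_lj=2.0),
]

best = select_rotamer(rotamers)
lowest_cg = min(rotamers, key=lambda rot: rot.dg_fold_cg)
print("selected rotamer:", best.index)
print("lowest CG energy rotamer:", lowest_cg.index)
```

In this toy case the rotamer with the lowest CG energy is not selected because of its large LJ term, which is the situation that side chain relaxation is meant to alleviate.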
Figure 7.
Comparing experimental63 (black dotted curves) and CG relative folding free energies of OmpLA Ala→Arg (panel A) and Ala→Leu (panel B) mutants, ΔΔGfold(A→R) and ΔΔGfold(A→L), for different z positions (corresponding to a Cα atom of the mutated residue) across the membrane (membrane center is at z=0). An OmpLA crystal structure (PDB: 1QD5) was used to generate rotamers for different mutants. A rotamer minimizing an objective function combining CG energy with 0.1 of atomistic LJ energy was chosen for each mutant residue. 10 ps gas-phase mutant side chain atomistic MD was performed for rotamers to get CG data for Arg mutants in panel A whereas no structural relaxation was performed for rotamers to get CG data for Leu mutants in panel B. 20 Å (dashed lines) and 24 Å (solid lines) thick CG membranes were used. See text for more details. See Tables S29B and S30 for numerical data.
Figure 8.

Showing minimum CG energy (model 2) rotamers for Arg OmpLA mutants for different z positions across a 24 Å thick membrane (after 10 ps atomistic mutant Arg side chain gas-phase relaxation). Rotamer (rot.) numbers, corresponding to those in Table S36 are shown. OmpLA protein is shown as green ribbons. A mutated Arg residue is shown in an all-atom stick representation (C is dark-gray, O – red, N – blue, H – white). Cα and CG X atoms of that residue are shown as pink and cyan balls, respectively. A CG membrane is shown as a grid composed of small gray dots. See Table S29B and a solid pink curve in Fig. 7A for corresponding relative CG folding energies. See text for calculation details.
Our study also evaluated ΔΔGfold for the A→L mutations across the membrane (see Table S30 and Figure 7B). The best agreement with the experimental results was achieved for model 0 with a fairly similar performance for model 2 (Table S30), which was achieved due to a substantial increase in for Leu and halving of compared to previously published model P14 values. However, model 1 systematically underestimates all the ΔΔGfold(A→L) values, which can be expected due to a reduced hydrophobic term (e.g. the hydrophobic term for the A210L mutant in a 24 Å thick membrane drops from -1.8 kcal/mol for model 0 to -0.5 kcal/mol for model 1 resulting in a 1.3 kcal/mol underestimation of the experimental value for the latter, see Tables S32 and S34).
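As a quick consistency check on the A210L example, the sketch below assumes that the model 1 hydrophobic term is simply the model 0 term divided by the downscaling factor (quoted as ∼3.6 above and as 3.57 in Section III.4). The function name is hypothetical, and the assumption that the relative (Ala→Leu) term scales by exactly this factor is ours.

```python
HYDROPHOBIC_SCALE = 3.57  # downscaling factor quoted for model 1

def scale_hydrophobic(term_kcal: float, factor: float = HYDROPHOBIC_SCALE) -> float:
    """Scale a hydrophobic free energy term down by a constant factor."""
    return term_kcal / factor

hyd_model0 = -1.8  # A210L hydrophobic term for model 0, 24 Å membrane (from the text)
hyd_model1 = scale_hydrophobic(hyd_model0)
loss = hyd_model1 - hyd_model0  # stabilization lost on going from model 0 to model 1

print(f"model 1 hydrophobic term: {hyd_model1:.1f} kcal/mol")
print(f"lost stabilization: {loss:.1f} kcal/mol")
```

Under this assumption the scaled term rounds to the -0.5 kcal/mol quoted for model 1, and the lost stabilization rounds to the 1.3 kcal/mol underestimation mentioned above.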
III.4 Absolute folding energies of membrane proteins
In addition to the folding free energy changes associated with mutation of particular membrane protein residues, the refinement of the CG model was also aimed to reproduce the absolute folding free energies of membrane proteins. In our previously published CG model 25 this was done successfully for a number of water-soluble proteins. We started by examining the water-membrane partitioning free energies of several small peptides, for which experimental or other computational free energy estimates are available and which were also tested in our previous works.13,25 In the next step we examined the absolute folding free energies of several β-barrel integral membrane proteins, for which experimental estimates are available.62
Before considering our results it is useful to clarify how the folding and water–membrane partitioning free energies of membrane associated peptides and proteins have been calculated. This is illustrated in Figure 9 for a poly-Ala (A20) helix. When the protein is embedded in a CG membrane, ΔGfold values computed using Eq. 44 correspond to the free energy difference between the folded peptide (or protein) in the membrane and the unfolded peptide (or protein) in bulk water, i.e. ΔGfold(wat→mem) = G(f)(mem) − G(uf)(wat). If a peptide or protein is located in bulk water, then its CG folding free energy is determined as ΔGfold(wat) = G(f)(wat) − G(uf)(wat). In both cases the reference state is the unfolded peptide or protein in bulk water. Thus, in order to calculate the partitioning free energy of a folded peptide or protein between water and membrane, ΔG(f)(wat→mem), we use the following thermodynamic cycle: ΔG(f)(wat→mem) = ΔGfold(wat→mem) − ΔGfold(wat). In other words, we calculated partitioning free energies as the difference of CG folding free energies in membrane and water. Thus, when the experimental folding free energies for membrane proteins are given with respect to the unfolded protein in bulk water, we can directly compare them with our calculated ΔGfold(wat→mem) values and we only need one calculation, in the CG membrane environment. However, if the experimental reference state is an unfolded (or partially unfolded, as shown in Fig. 9) protein in a membrane, as is often the case,62 then the experimental and our computed ΔGfold values cannot be compared directly. Based on the thermodynamic cycle in Fig. 9, the membrane protein folding free energy with a reference state of a partially unfolded protein in a membrane, ΔGfold(mem), can be computed as ΔGfold(mem) = ΔGfold(wat→mem) − ΔG(uf)(wat→mem). In other words, we would need to know the water–membrane partitioning free energy of the unfolded protein, which cannot be directly computed by our CG model. In principle, we can estimate this quantity by assuming that there are no interactions between protein residues in the unfolded state, so that its water–membrane partitioning free energy is a sum of the corresponding partitioning free energies of all the protein residues (both side chain and backbone contributions). However, this might not be the case, as experiments suggest that membrane proteins can be partially unfolded inside lipid membranes.62 Therefore in this work we focused only on membrane proteins for which reversible folding was observed with unfolded reference states in bulk water.
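The bookkeeping of the thermodynamic cycle described above can be written out as a short sketch. The three relations are those given in the text; the function names and all numerical values are hypothetical and are used only to show how the quantities combine.

```python
from typing import Iterable

def partition_folded(dg_fold_wat_mem: float, dg_fold_wat: float) -> float:
    """dG(f)(wat->mem) = dGfold(wat->mem) - dGfold(wat)."""
    return dg_fold_wat_mem - dg_fold_wat

def fold_in_membrane(dg_fold_wat_mem: float, dg_uf_wat_mem: float) -> float:
    """dGfold(mem) = dGfold(wat->mem) - dG(uf)(wat->mem)."""
    return dg_fold_wat_mem - dg_uf_wat_mem

def unfolded_partition_additive(residue_terms: Iterable[float]) -> float:
    """Additive estimate of dG(uf)(wat->mem): a sum of per-residue partitioning
    free energies, valid only if residues do not interact in the unfolded state."""
    return sum(residue_terms)

# Hypothetical values in kcal/mol (not taken from any table in this work).
dg_fold_wat_mem = -10.0  # folded in membrane vs. unfolded in bulk water
dg_fold_wat = -6.0       # folded in water vs. unfolded in bulk water
dg_uf_wat_mem = unfolded_partition_additive([-1.0, -0.5, 0.5, -2.0])

print("dG(f)(wat->mem):", partition_folded(dg_fold_wat_mem, dg_fold_wat))
print("dGfold(mem):", fold_in_membrane(dg_fold_wat_mem, dg_uf_wat_mem))
```

Only the first relation is used for the results reported here; the second requires ΔG(uf)(wat→mem), for which the additive estimate is an approximation that may fail when the membrane-bound reference state is partially unfolded.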
Figure 9.

Folding (ΔGfold) and water – membrane (wat→mem) partitioning (ΔG(f) and ΔG(uf)) free energies for poly-Ala (A20) peptide. Folded (f) α-helical state in water and membrane as well as unfolded (uf) state in water and partially unfolded state in a membrane are shown in cartoon representation (green). A 28 Å thick CG membrane with a 2 Å separation is shown by small dark-gray balls.
Partitioning free energies ΔG(f)(wat→mem) of 3 model peptides, poly-alanine, poly-leucine and the M2δ peptide (representing a crucial part of the acetylcholine receptor transmembrane domain), are shown in Table 1. Both the previously published CG parameters (model P14) and our refined CG model (model 0) substantially overestimate water–membrane partitioning free energies for all 3 systems due to a very large uncompensated hydrophobic term (see Table S37). In one of our previous CG studies this issue was solved by downscaling hydrophobic energies by a factor of 3.57, ignoring the polar term and modifying main chain solvation and hydrogen bond contributions13 compared to the standard P14 model. Here we used a similar approach but with the refined CG model (model 0) as a starting point, and without modifying the main chain solvation and hydrogen bonding term (ΔGmain in Table S37). The resultant model (model 1) provides substantially better agreement with reference values, especially for poly-Ala and the M2δ peptide. For poly-leucine the ΔG(f)(wat→mem) value obtained with model 1 (−8.5 kcal/mol) is still overestimated by almost a factor of 3, but it should be noted that the reference value (−3.2 kcal/mol) represents a microscopic MD estimate obtained at a high temperature of 80°C,92 and thus it is not clear how accurately it represents this system's thermodynamics at physiological conditions. Yet, as described in a previous section, model 1 provided poor agreement with experiment for OmpLA ΔΔGfold(A210X) as well as position-dependent ΔΔGfold(Ala→Leu) relative folding free energies. Thus we developed another modification of model 0, named model 2, which overcomes those issues. For that model we removed all the interfacial terms (see e.g. Eq. 26 in Section II.3 above) and scaled up the size term given by Eq. 42 in the presence of membrane. In particular, as described in Section II.6 above, we introduced additional scaling factors, smem,res and smem,sbond. The location of a particular residue in the membrane environment is determined by Fmem,i, which in turn depends on empirical factors (see Eq. 43 in Section II.6). The optimal values of smem,res, smem,sbond and the empirical factors entering Fmem,i for model 2 were determined by a least-squares fit to the reference values of the water-membrane partitioning free energies for poly-alanine and poly-leucine as well as the absolute folding free energy of wild-type OmpLA. Therefore model 2 provides excellent agreement with reference values for those systems (see Tables 1 and S37 for the peptides and Tables 2 and S19 for OmpLA). Unlike model 1, this model provides good agreement with experiment for OmpLA ΔΔGfold(A210X) as well as position-dependent ΔΔGfold(Ala→Leu) relative folding free energies (see Section III.3 above for more details). However, there are some shortcomings for this model as well. For instance, it predicts a positive water-membrane partitioning free energy for the M2δ peptide (see Table 1), indicating that this peptide is more stable in water. This is in disagreement with previous continuum membrane studies that provided theoretical estimates of -11.3 kcal/mol (ref. 46) and -6.1 kcal/mol (ref. 35), although to the best of our knowledge there are no experimental estimates that confirm those values. Nevertheless, the positive ΔG(f)(wat→mem) for this peptide obtained with model 2 is worrisome and most likely indicates an overestimated size term, as also suggested by the A39RA39 peptide and some OmpLA calculations above. It is likely related to the dependence of the size term on the structure of a membrane associated peptide or protein as well as some other factors. This will need to be addressed in further refinements of our CG model as more experimental data on peptide/protein membrane association and folding become available.
Table 1.
CG membrane partitioning free energies, ΔG(f)(wat→mem), for several model peptides in a 28 Å thick membrane relative to bulk water.(1)
| Peptide | Model | ΔG(f)(wat→mem) |
|---|---|---|
| Poly-alanine (A20) | reference (refs. 43, 44) | -4 |
| | model P14 | -18.3 |
| | model 0 | -25.5 |
| | model 1 | -2.8 |
| | model 2 | -4.0 |
| Poly-leucine (L12) | reference (refs. 45, 92) | -3.2 |
| | model P14 | -23.6 |
| | model 0 | -37.7 |
| | model 1 | -8.5 |
| | model 2 | -3.0 |
| M2δ peptide | reference (refs. 35, 46) | -8.7(2) |
| | model P14 | -17.7 |
| | model 0 | -30.8 |
| | model 1 | -8.1 |
| | model 2 | 21.1 |

(1) All free energies are in kcal/mol. ΔG(f)(wat→mem) were computed as differences in corresponding folding free energies for a peptide embedded in a 28 Å thick CG membrane and those for a peptide in bulk water: ΔG(f)(wat→mem) = ΔGfold(wat→mem) − ΔGfold(wat). ΔGfold were computed using Eq. 44. See Section III.4 and Fig. 9 for more details and Table S37 for ΔG(f)(wat→mem) components. Poly-alanine and poly-leucine helices were built in an ideal α-helical configuration using the Molefacture plugin in VMD (ref. 100) and were placed with the helical axis parallel to the membrane normal. For the M2δ peptide, NMR structure 9 from PDB entry 1A11 was used and its 2 N-terminal residues were removed to provide a direct comparison with a previous continuum membrane study.46
Table 2.
Absolute folding free energies, ΔGfold, for several β-barrel integral membrane proteins in CG membranes calculated using refined CG models 1 and 2 compared to experimental (exp.) values.(1)

| Protein (PDB code) | ΔGfold(exp.) | Wmem, Å | ΔGfold(CG), model 1 | ΔGfold(CG), model 2 |
|---|---|---|---|---|
| OmpLA (1QD5) | -32.45 (ref. 63) | 20 | -34.94 | -52.53 |
| | | 24 | -31.41 | -33.01 |
| OmpW (2F1V) | -18.60 (ref. 64) | 20 | -4.66 | -18.56 |
| | | 24 | -2.85 | -14.29 |
| PagP (1THQ) | -24.40 (ref. 64) | 20 | -51.49 | -58.45 |
| | | 24 | -47.01 | -37.62 |
| OmpA (1QJP) | -3.40 (ref. 71) | 20 | -26.17 | -7.72 |
| | | 24 | -24.21 | 3.87 |
| | | 28 | -19.56 | 17.19 |

(1) All ΔGfold are in kcal/mol and were computed using Eq. 44. See Section III.4 for more details and Tables S38 and S39 for ΔGfold components. X-ray structures from PDB without any structural relaxation were used.
Tables 2 and S19 compare different estimates of ΔGfold(wat→mem) for the WT OmpLA. The main findings, already mentioned in Section III.3 above, are reiterated briefly here. The calculations were done for both 20 and 24 Å thick CG membranes, the former corresponding to DLPC (the lipid used in the experimental setup) and the latter to the protein hydrophobic thickness. Both the previously published model (P14) and the refined model 0 substantially overestimate ΔGfold(wat→mem) compared to experiment63 (see Table S19). The modifications of model 0, models 1 and 2, overcome this deficiency through scaling down of the hydrophobic term and removal of the polar term (model 1), or through scaling up the size term in the presence of membrane near a protein residue and removal of the interfacial terms (model 2). Interestingly, model 1 provides good agreement with experiment (within ∼2.5 kcal/mol) for both 20 and 24 Å thick CG membranes, whereas model 2 does so only for a 24 Å thick membrane, i.e. the system that was used in fitting the membrane size correction (see above). For the 20 Å thick membrane, however, ΔGfold(wat→mem) is substantially overestimated by model 2. This indicates a substantial stabilization of OmpLA in a 20 Å thick membrane, which is not corroborated by experimental values63 and indicates a need for further model refinement.
Table 2 (with CG energy components and results for several crystal structures provided in Tables S38 and S39) compares the experimental and the CG ΔGfold(wat→mem) values obtained with models 1 and 2 for several other β-barrel membrane proteins, OmpW, PagP and OmpA, for which experimental values are available. Tables 2 and S38 indicate that model 2 provides good agreement with the corresponding experimental results (within a few kcal/mol) for the OmpW protein using 3 different crystal structures. Model 1 tends to underestimate ΔGfold(wat→mem) for this protein (see Table S39). For PagP the situation is different: both models 1 and 2 substantially overestimate ΔGfold(wat→mem) (see Tables 2, S38 and S39). For OmpA (not to be confused with OmpLA discussed above), there is again a substantial difference between the models: model 1 overestimates ΔGfold(wat→mem) for all tested membrane thicknesses, whereas for model 2 a 20 Å thickness provides a reasonable albeit overestimated result, and for thicker CG membranes ΔGfold(wat→mem)>0. For OmpA such discrepancies are not unexpected, though, as a full protein including a cytoplasmic domain was used in folding experiments71 whereas only a TM domain from an available X-ray PDB structure (1QJP) was used in our CG calculations.
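The comparisons above can be made explicit by tabulating the signed deviations ΔGfold(CG) − ΔGfold(exp.) from the Table 2 entries. The sketch below uses only the values listed in Table 2; the data layout and the function name are our own illustrative choices. A negative deviation means that the CG model predicts more stabilization than experiment (an overestimated folding free energy in the sense used in the text).

```python
# Experimental folding free energies from Table 2 (kcal/mol).
EXP_DG_FOLD = {"OmpLA": -32.45, "OmpW": -18.60, "PagP": -24.40, "OmpA": -3.40}

# CG folding free energies from Table 2, keyed by (protein, Wmem in Å);
# each value is (model 1, model 2) in kcal/mol.
CG_DG_FOLD = {
    ("OmpLA", 20): (-34.94, -52.53),
    ("OmpLA", 24): (-31.41, -33.01),
    ("OmpW", 20): (-4.66, -18.56),
    ("OmpW", 24): (-2.85, -14.29),
    ("PagP", 20): (-51.49, -58.45),
    ("PagP", 24): (-47.01, -37.62),
    ("OmpA", 20): (-26.17, -7.72),
    ("OmpA", 24): (-24.21, 3.87),
    ("OmpA", 28): (-19.56, 17.19),
}

def signed_deviations():
    """Return rows (protein, Wmem, model 1 deviation, model 2 deviation),
    with each deviation defined as CG minus experiment."""
    rows = []
    for (protein, wmem), (model1, model2) in CG_DG_FOLD.items():
        exp = EXP_DG_FOLD[protein]
        rows.append((protein, wmem, round(model1 - exp, 2), round(model2 - exp, 2)))
    return rows

for protein, wmem, dev1, dev2 in signed_deviations():
    print(f"{protein:6s} {wmem:2d} Å  model 1: {dev1:+7.2f}  model 2: {dev2:+7.2f}")
```

Listing the deviations by membrane thickness makes it easier to see where each model departs from experiment, for instance the sign change of the model 2 result for OmpA in thicker membranes.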
IV. Concluding remarks
The recent increase in interest in modeling membrane proteins (e.g. 9,13,15,18,20,93-96) reflects the growing realization of the biological importance of such systems and the emergence of key structural information. Nevertheless, major difficulties remain considering the enormously complicated landscape of the protein-membrane system. Here the use of full microscopic calculations is not yet at a level that can be expected to give quantitative results, in particular when one considers the overall stability of membrane proteins97 (see below).
One possible effective strategy is to use CG models, but such models must be calibrated carefully and validated repeatedly with the emergence of new experimental information. Here we used recent progress in benchmark studies on the energetics of amino acid residue and peptide membrane insertion and membrane protein stability to refine our previously developed coarse-grained model. This new model provides reasonable agreement with experiment for absolute folding free energies of several β-barrel membrane proteins (OmpLA, PagP, OmpW) as well as for the effects of point mutations on the relative stability of OmpLA. This point is important considering our aim of providing a model that captures the absolute free energy of proteins in different environments.
At this point it may be useful to comment on the appealing microscopic option. Obviously one would like to move to more microscopic treatments, and one may point to recent microscopic MD studies like that of Ref. 91, which used a free energy perturbation (FEP) approach to estimate the energetics of the A210R, A210L and A210S OmpLA mutations. This work reported reasonable agreement with experiment for the Arg and Leu mutations and a ∼2-3 kcal/mol overestimation for Ser. Moreover, there is a ∼2-3 kcal/mol range in the reported estimates due to an uncertainty in the reference state. The overall level of agreement with experiment is similar to that reported in our study (with many more mutational states explored in the latter). However, the main problem is that microscopic FEP studies of this kind (which are interesting and formally correct) are unlikely to yield reasonably converged estimates of absolute membrane protein stabilities, which might be implied from previous work (Ref. 98). Even an adequate sampling of membrane embedded Arg side chain rotational states and hydration is challenging and may not converge during the reported simulation length, based on previous estimates for simple systems.48,58,99 Moreover, the energetics of inserting a whole helix or larger fragments is simply unlikely to converge with current simulations, considering the need to equilibrate water penetration as well as the ionization states of the protein residues. Therefore well calibrated CG calculations of absolute stability remain of substantial value.
It is useful to point out that the validation and calibration process should remain an ongoing effort and reflect the emergence of new information. Furthermore, microscopic studies will remain a major tool in the refinement of CG models, including the attempts to improve accuracy of such models and to understand the origin of different phenomenological terms (e.g. the scaling coefficients in Eq. 42).
Another direction that requires further studies is the use of rotamer search in the CG optimization. Here we should look at efficient search approaches that will allow us to explore activation barriers in addition to the current search around available low-energy structures.
Supplementary Material
Acknowledgments
This work was supported by NIH grants GM 24492 and GM 40283, MCB 0836400, and NCI (1U19CA105010). We gratefully acknowledge the University of Southern California's High Performance Computing and Communications.
References
- 1.Alberts B. Molecular biology of the cell. New York, NY: Garland Science, Taylor and Francis Group; 2015. p. 1464. [Google Scholar]
- 2.Gennis RB. Biomembranes: molecular structure and function. New York: Springer-Verlag; 1989. p. 533. [Google Scholar]
- 3.Jiang Y, Ruta V, Chen J, Lee A, MacKinnon R. The principle of gating charge movement in a voltage-dependent K+ channel. Nature. 2003;423:42–48. doi: 10.1038/nature01581. [DOI] [PubMed] [Google Scholar]
- 4.Schmidt D, Jiang QX, MacKinnon R. Phospholipids and the origin of cationic gating charges in voltage sensors. Nature. 2006;444:775–779. doi: 10.1038/nature05416. [DOI] [PubMed] [Google Scholar]
- 5.Swartz KJ. Sensing voltage across lipid membranes. Nature. 2008;456:891–897. doi: 10.1038/nature07620. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Dryga A, Chakrabarty S, Vicatos S, Warshel A. Realistic simulation of the activation of voltage-gated ion channels. Proc Natl Acad Sci USA. 2012;109:3335–3340. doi: 10.1073/pnas.1121094109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Dryga A, Chakrabarty S, Vicatos S, Warshel A. Coarse grained model for exploring voltage dependent ion channels. Biochim Biophys Acta. 2012;1818:303–317. doi: 10.1016/j.bbamem.2011.07.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Kim I, Chakrabarty S, Brzezinski P, Warshel A. Modeling gating charge and voltage changes in response to charge separation in membrane proteins. Proc Natl Acad Sci USA. 2014;111:11353–11358. doi: 10.1073/pnas.1411573111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Kim I, Warshel A. Coarse-grained simulations of the gating current in the voltage-activated Kv1.2. channel Proc Natl Acad Sci USA. 2014;111:2128–2133. doi: 10.1073/pnas.1324014111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Hessa T, Kim H, Bihlmaier K, Lundin C, Boekel J, Andersson H, Nilsson I, White SH, von Heijne G. Recognition of transmembrane helices by the endoplasmic reticulum translocon. Nature. 2005;433:377–381. doi: 10.1038/nature03216. [DOI] [PubMed] [Google Scholar]
- 11.Rychkova A, Mukherjee S, Bora RP, Warshel A. Simulating the pulling of stalled elongated peptide from the ribosome by the translocon. Proc Natl Acad Sci USA. 2013;110:10195–10200. doi: 10.1073/pnas.1307869110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Rychkova A, Vicatos S, Warshel A. On the energetics of translocon-assisted insertion of charged transmembrane helices into membranes. Proc Natl Acad Sci USA. 2010;107:17598–17603. doi: 10.1073/pnas.1012207107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Rychkova A, Warshel A. Exploring the nature of the translocon-assisted protein insertion. Proc Natl Acad Sci USA. 2013;110:495–500. doi: 10.1073/pnas.1220361110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Junge W, Sielaff H, Engelbrecht S. Torque generation and elastic power transmission in the rotary F0F1-ATPase. Nature. 2009;459:364–370. doi: 10.1038/nature08145. [DOI] [PubMed] [Google Scholar]
- 15.Mukherjee S, Warshel A. Realistic simulations of the coupling between the protomotive force and the mechanical rotation of the F0-ATPase. Proc Natl Acad Sci USA. 2012;109:14876–14881. doi: 10.1073/pnas.1212841109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Alden RG, Parson WW, Chu ZT, Warshel A. Calculations of electrostatic energies in photosynthetic reaction centers. J Am Chem Soc. 1995;117:12284–12298.
- 17.Aqvist J, Warshel A. Energetics of ion permeation through membrane channels. Solvation of Na+ by gramicidin A. Biophys J. 1989;56:171. doi: 10.1016/S0006-3495(89)82662-1.
- 18.Jensen MØ, Jogini V, Borhani DW, Leffler AE, Dror RO, Shaw DE. Mechanism of voltage gating in potassium channels. Science. 2012;336:229–233. doi: 10.1126/science.1216533.
- 19.Grossfield A, Feller SE, Pitman MC. A role for direct interactions in the modulation of rhodopsin by ω-3 polyunsaturated lipids. Proc Natl Acad Sci USA. 2006;103:4888–4893. doi: 10.1073/pnas.0508352103.
- 20.Boiteux C, Vorobyov I, Allen TW. Ion conduction and conformational flexibility of a bacterial voltage-gated sodium channel. Proc Natl Acad Sci USA. 2014;111:3454–3459. doi: 10.1073/pnas.1320907111.
- 21.Nury H, Poitevin F, Van Renterghem C, Changeux JP, Corringer PJ, Delarue M, Baaden M. One-microsecond molecular dynamics simulation of channel gating in a nicotinic receptor homologue. Proc Natl Acad Sci USA. 2010;107:6275–6280. doi: 10.1073/pnas.1001832107.
- 22.Levitt M, Warshel A. Computer simulation of protein folding. Nature. 1975;253:694–698. doi: 10.1038/253694a0.
- 23.Kamerlin SC, Vicatos S, Dryga A, Warshel A. Coarse-grained (multiscale) simulations in studies of biophysical and chemical systems. Annu Rev Phys Chem. 2011;62:41–64. doi: 10.1146/annurev-physchem-032210-103335.
- 24.Messer BM, Roca M, Chu ZT, Vicatos S, Kilshtain AV, Warshel A. Multiscale simulations of protein landscapes: Using coarse-grained models as reference potentials to full explicit models. Proteins. 2010;78:1212–1227. doi: 10.1002/prot.22640.
- 25.Vicatos S, Rychkova A, Mukherjee S, Warshel A. An effective Coarse-grained model for biological simulations: Recent refinements and validations. Proteins. 2014;82:1168–1185. doi: 10.1002/prot.24482.
- 26.Wu C, Shea JE. Coarse-grained models for protein aggregation. Curr Opin Struct Biol. 2011;21:209–220. doi: 10.1016/j.sbi.2011.02.002.
- 27.Tozzini V. Coarse-grained models for proteins. Curr Opin Struct Biol. 2005;15:144–150. doi: 10.1016/j.sbi.2005.02.005.
- 28.Guardiani C, Livi R. Coarse grained modeling and approaches to protein folding. Current bioinformatics. 2010;5:217–240.
- 29.Hills RD, Brooks CL. Insights from coarse-grained Gō models for protein folding and dynamics. Int J Mol Sci. 2009;10:889–905. doi: 10.3390/ijms10030889.
- 30.Thorpe IF, Zhou J, Voth GA. Peptide folding using multiscale coarse-grained models. J Phys Chem B. 2008;112:13079–13090. doi: 10.1021/jp8015968.
- 31.Marrink SJ, de Vries AH, Mark AE. Coarse grained model for semiquantitative lipid simulations. J Phys Chem B. 2004;108:750–760.
- 32.Marrink SJ, de Vries AH, Tieleman DP. Lipids on the move: simulations of membrane pores, domains, stalks and curves. Biochim Biophys Acta. 2009;1788:149–168. doi: 10.1016/j.bbamem.2008.10.006.
- 33.Monticelli L, Kandasamy SK, Periole X, Larson RG, Tieleman DP, Marrink SJ. The MARTINI coarse-grained force field: extension to proteins. J Chem Theory Comput. 2008;4:819–834. doi: 10.1021/ct700324x.
- 34.Marrink SJ, Tieleman DP. Perspective on the Martini model. Chem Soc Rev. 2013;42:6801–6822. doi: 10.1039/c3cs60093a.
- 35.Kessel A, Shental-Bechor D, Haliloglu T, Ben-Tal N. Interactions of hydrophobic peptides with lipid bilayers: Monte Carlo simulations with M2δ. Biophys J. 2003;85:3431–3444. doi: 10.1016/S0006-3495(03)74765-1.
- 36.Maddox MW, Longo ML. A Monte Carlo study of peptide insertion into lipid bilayers: equilibrium conformations and insertion mechanisms. Biophys J. 2002;82:244–263. doi: 10.1016/S0006-3495(02)75391-5.
- 37.Milik M, Skolnick J. Insertion of peptide chains into lipid membranes: An off-lattice Monte Carlo dynamics model. Proteins. 1993;15:10–25. doi: 10.1002/prot.340150104.
- 38.Baumgärtner A. Insertion and hairpin formation of membrane proteins: a Monte Carlo study. Biophys J. 1996;71:1248. doi: 10.1016/S0006-3495(96)79324-4.
- 39.Mukherjee S, Warshel A. Electrostatic origin of the mechanochemical rotary mechanism and the catalytic dwell of F1-ATPase. Proc Natl Acad Sci USA. 2011;108:20550–20555. doi: 10.1073/pnas.1117024108.
- 40.Mukherjee S, Warshel A. Dissecting the role of the γ-subunit in the rotary–chemical coupling and torque generation of F1-ATPase. Proc Natl Acad Sci USA. 2015;112:2746–2751. doi: 10.1073/pnas.1500979112.
- 41.Vicatos S, Roca M, Warshel A. Effective approach for calculations of absolute stability of proteins using focused dielectric constants. Proteins. 2009;77:670–684. doi: 10.1002/prot.22481.
- 42.Bowie JU. Solving the membrane protein folding problem. Nature. 2005;438:581–589. doi: 10.1038/nature04395.
- 43.Ben-Tal N, Ben-Shaul A, Nicholls A, Honig B. Free-energy determinants of alpha-helix insertion into lipid bilayers. Biophys J. 1996;70:1803–1812. doi: 10.1016/S0006-3495(96)79744-8.
- 44.Moll TS, Thompson TE. Semisynthetic proteins: model systems for the study of the insertion of hydrophobic peptides into preformed lipid bilayers. Biochemistry. 1994;33:15469–15482. doi: 10.1021/bi00255a029.
- 45.Ulmschneider JP, Andersson M, Ulmschneider MB. Determining peptide partitioning properties via computer simulation. J Membr Biol. 2011;239:15–26. doi: 10.1007/s00232-010-9324-8.
- 46.Kessel A, Haliloglu T, Ben-Tal N. Interactions of the M2δ segment of the acetylcholine receptor with lipid bilayers: a continuum-solvent model study. Biophys J. 2003;85:3687–3695. doi: 10.1016/S0006-3495(03)74785-7.
- 47.Hessa T, White SH, von Heijne G. Membrane insertion of a potassium-channel voltage sensor. Science. 2005;307:1427. doi: 10.1126/science.1109176.
- 48.Dorairaj S, Allen TW. On the thermodynamic stability of a charged arginine side chain in a transmembrane helix. Proc Natl Acad Sci USA. 2007;104:4943–4948. doi: 10.1073/pnas.0610470104.
- 49.Grabe M, Lecar H, Jan YN, Jan LY. A quantitative assessment of models for voltage-dependent gating of ion channels. Proc Natl Acad Sci USA. 2004;101:17640–17645. doi: 10.1073/pnas.0408116101.
- 50.White SH, von Heijne G. Transmembrane helices before, during, and after insertion. Curr Opin Struct Biol. 2005;15:378–386. doi: 10.1016/j.sbi.2005.07.004.
- 51.Schow EV, Freites JA, Cheng P, Bernsel A, von Heijne G, White SH, Tobias DJ. Arginine in membranes: the connection between molecular dynamics simulations and translocon-mediated insertion experiments. J Membr Biol. 2011;239:35–48. doi: 10.1007/s00232-010-9330-x.
- 52.Choe S, Hecht KA, Grabe M. A continuum method for determining membrane protein insertion energies and the problem of charged residues. J Gen Physiol. 2008;131:563–573. doi: 10.1085/jgp.200809959.
- 53.Hristova K, Wimley WC. A look at arginine in membranes. J Membr Biol. 2011;239:49–56. doi: 10.1007/s00232-010-9323-9.
- 54.Burykin A, Kato M, Warshel A. Exploring the origin of the ion selectivity of the KcsA potassium channel. Proteins. 2003;52:412–426. doi: 10.1002/prot.10455.
- 55.Burykin A, Schutz C, Villa J, Warshel A. Simulations of ion current in realistic models of ion channels: the KcsA potassium channel. Proteins. 2002;47:265–280. doi: 10.1002/prot.10106.
- 56.Johansson AC, Lindahl E. The role of lipid composition for insertion and stabilization of amino acids in membranes. J Chem Phys. 2009;130:185101. doi: 10.1063/1.3129863.
- 57.Johansson AC, Lindahl E. Titratable amino acid solvation in lipid membranes as a function of protonation state. J Phys Chem B. 2008;113:245–253. doi: 10.1021/jp8048873.
- 58.Li L, Vorobyov I, Allen TW. Potential of mean force and pKa profile calculation for a lipid membrane-exposed arginine side chain. J Phys Chem B. 2008;112:9574–9587. doi: 10.1021/jp7114912.
- 59.Li L, Vorobyov I, Allen TW. The different interactions of lysine and arginine side chains with lipid membranes. J Phys Chem B. 2013;117:11906–11920. doi: 10.1021/jp405418y.
- 60.MacCallum JL, Bennett WD, Tieleman DP. Partitioning of amino acid side chains into lipid bilayers: results from computer simulations and comparison to experiment. J Gen Physiol. 2007;129:371–377. doi: 10.1085/jgp.200709745.
- 61.MacCallum JL, Bennett WD, Tieleman DP. Distribution of amino acids in a lipid bilayer from computer simulations. Biophys J. 2008;94:3393–3404. doi: 10.1529/biophysj.107.112805.
- 62.Fleming KG. Energetics of Membrane Protein Folding. Annu Rev Biophys. 2014;43:233–255. doi: 10.1146/annurev-biophys-051013-022926.
- 63.Moon CP, Fleming KG. Side-chain hydrophobicity scale derived from transmembrane protein folding into lipid bilayers. Proc Natl Acad Sci USA. 2011;108:10174–10177. doi: 10.1073/pnas.1103979108.
- 64.Moon CP, Zaccai NR, Fleming PJ, Gessmann D, Fleming KG. Membrane protein thermodynamic stability may serve as the energy sink for sorting in the periplasm. Proc Natl Acad Sci USA. 2013;110:4285–4290. doi: 10.1073/pnas.1212527110.
- 65.Curnow P, Booth PJ. Combined kinetic and thermodynamic analysis of α-helical membrane protein unfolding. Proc Natl Acad Sci USA. 2007;104:18970–18975. doi: 10.1073/pnas.0705067104.
- 66.Chang YC, Bowie JU. Measuring membrane protein stability under native conditions. Proc Natl Acad Sci USA. 2014;111:219–224. doi: 10.1073/pnas.1318576111.
- 67.Joh NH, Min A, Faham S, Whitelegge JP, Yang D, Woods VL, Bowie JU. Modest stabilization by most hydrogen-bonded side-chain interactions in membrane proteins. Nature. 2008;453:1266–1270. doi: 10.1038/nature06977.
- 68.Lau FW, Bowie JU. A method for assessing the stability of a membrane protein. Biochemistry. 1997;36:5884–5892. doi: 10.1021/bi963095j.
- 69.Hong H, Park S, Flores Jiménez RH, Rinehart D, Tamm LK. Role of aromatic side chains in the folding and thermodynamic stability of integral membrane proteins. J Am Chem Soc. 2007;129:8320–8327. doi: 10.1021/ja068849o.
- 70.Hong H, Szabo G, Tamm LK. Electrostatic couplings in OmpA ion-channel gating suggest a mechanism for pore opening. Nat Chem Biol. 2006;2:627–635. doi: 10.1038/nchembio827.
- 71.Hong H, Tamm LK. Elastic coupling of integral membrane protein stability to lipid bilayer forces. Proc Natl Acad Sci USA. 2004;101:4065–4070. doi: 10.1073/pnas.0400358101.
- 72.Warshel A, Sharma PK, Kato M, Parson WW. Modeling electrostatic effects in proteins. Biochim Biophys Acta. 2006;1764:1647–1676. doi: 10.1016/j.bbapap.2006.08.007.
- 73.Roca M, Messer B, Warshel A. Electrostatic contributions to protein stability and folding energy. FEBS Lett. 2007;581:2065–2071. doi: 10.1016/j.febslet.2007.04.025.
- 74.Beroza P, Fredkin D, Okamura M, Feher G. Protonation of interacting residues in a protein by a Monte Carlo method: application to lysozyme and the photosynthetic reaction center of Rhodobacter sphaeroides. Proc Natl Acad Sci USA. 1991;88:5804–5808. doi: 10.1073/pnas.88.13.5804.
- 75.Sham YY, Chu ZT, Warshel A. Consistent calculations of pKa's of ionizable residues in proteins: semi-microscopic and microscopic approaches. J Phys Chem B. 1997;101:4458–4472.
- 76.Warshel A, Russell S, Churg A. Macroscopic models for studies of electrostatic interactions in proteins: limitations and applicability. Proc Natl Acad Sci USA. 1984;81:4785–4789. doi: 10.1073/pnas.81.15.4785.
- 77.Lee FS, Chu ZT, Warshel A. Microscopic and semimicroscopic calculations of electrostatic energies in proteins by the POLARIS and ENZYMIX programs. J Comput Chem. 1993;14:161–185.
- 78.Warshel A, Russell ST. Calculations of electrostatic interactions in biological systems and in solutions. Q Rev Biophys. 1984;17:283–422. doi: 10.1017/s0033583500005333.
- 79.Li LB, Vorobyov I, Allen TW. The role of membrane thickness in charged protein–lipid interactions. Biochim Biophys Acta. 2012;1818:135–145. doi: 10.1016/j.bbamem.2011.10.026.
- 80.Warshel A. Energetics of light-induced charge separation across membranes. Isr J Chem. 1981;21:341–347.
- 81.Vorobyov I, Li L, Allen TW. Assessing atomistic and coarse-grained force fields for protein-lipid interactions: The formidable challenge of an ionizable side chain in a membrane. J Phys Chem B. 2008;112:9588–9602. doi: 10.1021/jp711492h.
- 82.Vorobyov I, Olson TE, Kim JH, Koeppe RE, Andersen OS, Allen TW. Ion-induced defect permeation of lipid membranes. Biophys J. 2014;106:586–597. doi: 10.1016/j.bpj.2013.12.027.
- 83.Tepper HL, Voth GA. Mechanisms of passive ion permeation through lipid bilayers: insights from simulations. J Phys Chem B. 2006;110:21327–21337. doi: 10.1021/jp064192h.
- 84.Wilson MA, Pohorille A. Mechanism of unassisted ion transport across membrane bilayers. J Am Chem Soc. 1996;118:6580–6587. doi: 10.1021/ja9540381.
- 85.Radzicka A, Wolfenden R. Comparing the polarities of the amino acids: side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution. Biochemistry. 1988;27:1664–1670.
- 86.de Jesus AJ, Allen TW. The role of tryptophan side chains in membrane protein anchoring and hydrophobic mismatch. Biochim Biophys Acta. 2013;1828:864–876. doi: 10.1016/j.bbamem.2012.09.009.
- 87.Lovell SC, Word JM, Richardson JS, Richardson DC. The penultimate rotamer library. Proteins. 2000;40:389–408.
- 88.Dunbrack RL, Cohen FE. Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci. 1997;6:1661–1681. doi: 10.1002/pro.5560060807.
- 89.Fleming PJ, Freites JA, Moon CP, Tobias DJ, Fleming KG. Outer membrane phospholipase A in phospholipid bilayers: A model system for concerted computational and experimental investigations of amino acid side chain partitioning into lipid bilayers. Biochim Biophys Acta. 2012;1818:126–134. doi: 10.1016/j.bbamem.2011.07.016.
- 90.Kučerka N, Liu Y, Chu N, Petrache HI, Tristram-Nagle S, Nagle JF. Structure of fully hydrated fluid phase DMPC and DLPC lipid bilayers using X-ray scattering from oriented multilamellar arrays and from unilamellar vesicles. Biophys J. 2005;88:2626–2637. doi: 10.1529/biophysj.104.056606.
- 91.Gumbart J, Roux B. Determination of membrane-insertion free energies by molecular dynamics simulations. Biophys J. 2012;102:795–801. doi: 10.1016/j.bpj.2012.01.021.
- 92.Ulmschneider JP, Smith JC, White SH, Ulmschneider MB. In silico partitioning and transmembrane insertion of hydrophobic peptides under equilibrium conditions. J Am Chem Soc. 2011;133:15487–15495. doi: 10.1021/ja204042f.
- 93.Fleishman SJ, Unger VM, Ben-Tal N. Transmembrane protein structures without X-rays. Trends Biochem Sci. 2006;31:106–113. doi: 10.1016/j.tibs.2005.12.005.
- 94.Dror RO, Dirks RM, Grossman J, Xu H, Shaw DE. Biomolecular simulation: a computational microscope for molecular biology. Annu Rev Biophys. 2012;41:429–452. doi: 10.1146/annurev-biophys-042910-155245.
- 95.Stansfeld PJ, Sansom MS. Molecular simulation approaches to membrane proteins. Structure. 2011;19:1562–1572. doi: 10.1016/j.str.2011.10.002.
- 96.Arinaminpathy Y, Khurana E, Engelman DM, Gerstein MB. Computational analysis of membrane proteins: the largest class of drug targets. Drug Discovery Today. 2009;14:1130–1135. doi: 10.1016/j.drudis.2009.08.006.
- 97.Cymer F, von Heijne G, White SH. Mechanisms of integral membrane protein insertion and folding. J Mol Biol. 2015;427:999–1022. doi: 10.1016/j.jmb.2014.09.014.
- 98.Gumbart J, Chipot C, Schulten K. Free-energy cost for translocon-assisted insertion of membrane proteins. Proc Natl Acad Sci USA. 2011;108:3596–3601. doi: 10.1073/pnas.1012758108.
- 99.Yoo J, Cui Q. Does arginine remain protonated in the lipid membrane? Insights from microscopic pKa calculations. Biophys J. 2008;94:L61–L63. doi: 10.1529/biophysj.107.122945.
- 100.Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graphics. 1996;14:33–38. doi: 10.1016/0263-7855(96)00018-5.