Abstract
Using a new semi‐empirical method for calculating molecular polarizabilities and the Clausius−Mossotti relation, we calculated the static dielectric constants of dry proteins for all structures in the protein data bank (PDB). The mean dielectric constant of more than 150,000 proteins is ϵr = 3.23 with a standard deviation of 0.04, which agrees well with previous measurements for dry proteins. The small standard deviation results from the strong correlation between the molecular polarizability and the volume of the proteins. We note that non‐amino acid cofactors such as chlorophyll may alter the dielectric environment significantly. Furthermore, our model shows anisotropies of the dielectric constant within the same molecule according to the constituent amino acids and cofactors. Finally, by changing the amino acid protonation states, we show that a change of pH does not have a significant effect on the dielectric constants of proteins.
Keywords: proteins, dielectric constants, semi-empirical methods, electrostatic interactions, molecular polarizabilities
The static dielectric constants of dry proteins for all structures in the protein data bank (PDB) have been calculated using a new semi‐empirical method for calculating molecular polarizabilities and the Clausius−Mossotti relation.
The intermolecular electrostatic interactions in proteins are scaled by their dielectric constants, which vary according to the size and composition of the proteins. The accurate determination of the dielectric constant is essential to understand a variety of biochemical interactions such as electron and proton transfer,1, 2 voltage gating,3, 4 ion channel selectivity,5 charge separation,6 and protein‐protein and protein‐ligand interactions.7 To a large extent, these interactions are governed by the electrostatic‐potential surfaces of proteins.
Direct measurements of dielectric constants ϵr of dry proteins span a range from 2.5 to 3.5. These values are determined by measuring the capacitance of crystalline samples,8, 9 and they agree with chemical shift perturbation measurements.10 However, in addition to amino acids, proteins in practice contain solvent molecules as well as organic and inorganic cofactors. These affect their dielectric constants, and in most cases the effective dielectric constant is significantly different from the measured values for the dry proteins. The effective dielectric constants are usually determined indirectly using the Poisson‐Boltzmann equation to calculate the electrostatic interactions that reproduce measured pKa’s of some amino acids. These measurements include the effect of solvent molecules on the dielectric constant.10 The contribution of the solvent to the effective dielectric constant was studied theoretically based on Kirkwood‐Fröhlich dielectric theory.11
In addition, computational studies based on continuum electrostatics and molecular dynamics simulations showed that different structural motifs within the same protein may yield significantly different values of ϵr according to the polarity of their constituent molecules.12, 13, 14
The dielectric constant ϵr, the average polarizability α, and the volume V of a molecule are related by the Clausius−Mossotti relation:15
$$\frac{\epsilon_r - 1}{\epsilon_r + 2} = \frac{4\pi\alpha}{3V} \tag{1}$$
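As a minimal sketch of how the Clausius−Mossotti relation is used in practice, the following Python function solves it for ϵr. It assumes the Gaussian‐unit form (ϵr − 1)/(ϵr + 2) = 4πα/(3V), with α given as a polarizability volume in Å3 and V in Å3; the function name `dielectric_constant` is hypothetical. The example input is the polarizability and volume quoted in this article for the peptide with PDB ID 2BTA.

```python
import math


def dielectric_constant(alpha, volume):
    """Solve (eps - 1)/(eps + 2) = 4*pi*alpha/(3*V) for eps.

    alpha  : average polarizability volume in cubic angstrom
    volume : molecular volume in cubic angstrom
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    x = 4.0 * math.pi * alpha / (3.0 * volume)
    # The relation diverges at x = 1, so only 0 <= x < 1 is meaningful here.
    if not 0.0 <= x < 1.0:
        raise ValueError("4*pi*alpha/(3*V) must lie in [0, 1)")
    return (1.0 + 2.0 * x) / (1.0 - x)


# Values quoted in the text for PDB ID 2BTA (alpha = 212.7 Å^3, V = 1879 Å^3).
eps_2bta = dielectric_constant(212.7, 1879.0)
print(round(eps_2bta, 1))  # 3.7
```

With these inputs the sketch returns a value that rounds to the maximum dielectric constant of 3.7 reported for this peptide.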
However, calculations of the molecular polarizabilities of macromolecules are challenging and computationally demanding. Previously, we proposed a model16 for calculating the complete polarizability tensor of a protein through scaling of the tensor of a perfect conductor of the same shape based on a molecular basis set. The scaling factor was obtained from a regression model that correlated the polarizabilities of the constituent molecules of the proteins, i.e., the amino acids, with those of perfect conductors of the same shape.
Here, we propose a new method for the calculation of the average (scalar) polarizabilities of proteins based on their amino acid compositions, which utilizes the fact that objects with the same volume V and dielectric constant ϵr have the same average polarizabilities α independent of shape, see also (1). The static dielectric constants are then calculated using the Clausius−Mossotti relation. This method is computationally highly efficient and facilitated the calculations of the average polarizabilities and dielectric constants of all proteins in the protein data bank (PDB).17
The average polarizability of a molecule can be calculated from the sum over hybridization configurations of the atoms in the molecule,18
$$\alpha = \frac{4}{N}\left[\sum_{A} \tau_A\right]^2 \tag{2}$$
with the number of electrons in the molecule N and the hybrid component τA of each atom A, obtained by approximating the zeroth order wavefunction by an antisymmetrized product of molecular orbitals and spin functions. Average polarizabilities predicted by this method showed a very good agreement with experimental polarizabilities for more than 400 relatively small molecules with only ∼2 % error.
Furthermore, since the atomic hybridizations of the atoms within the constituent amino acids do not change in proteins, (2) could be rearranged to obtain the average polarizability of a protein αp by summing over effective amino‐acid hybrid components:
$$\alpha_p = \frac{4}{N_p}\left[\sum_{aa} \tau_{aa}\right]^2 \tag{3}$$
Here, Np is the number of electrons in the protein and τaa are the hybridization components of amino acid aa, which are obtained as
$$\tau_{aa} = \frac{1}{2}\sqrt{N_{aa}\,\alpha_{aa}} \tag{4}$$
with the number of electrons Naa in an amino acid aa and its average polarizability αaa. The latter could be obtained from quantum‐chemical calculations and, therefore, the values of τ not only include the summation of the atomic hybrid components within the amino acids, but also exchange‐correlation interactions at the level of quantum chemistry employed.
Furthermore, for (2) to be applicable for very polar compounds, τA has to be modified to include the effect of the atoms to which A is bonded. However, τaa already includes this effect since it reproduces the exact polarizabilities calculated from first principles.
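The amino‐acid‐based summation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes that (3) has the form αp = (4/Np)[Σ τaa]² and (4) the form τaa = ½√(Naa αaa), and the function names are hypothetical. The electron counts (72 for leucine, 80 for lysine) are those of the free neutral amino acids and are an assumption, since the article does not list the counts it used; the polarizabilities are the rounded Table 1 values, and the composition (eight leucine, six lysine) is the one stated for PDB ID 6HNG.

```python
import math


def tau_from_alpha(n_electrons, alpha):
    """Effective hybrid component of one amino acid, tau = 0.5*sqrt(N*alpha)."""
    return 0.5 * math.sqrt(n_electrons * alpha)


def protein_polarizability(composition, residue_data):
    """Average polarizability alpha_p = (4/N_p) * (sum of tau_aa)**2.

    composition  : dict mapping residue code -> number of occurrences
    residue_data : dict mapping residue code -> (N_aa, alpha_aa in Å^3)
    """
    n_p = 0          # total number of electrons in the protein
    tau_sum = 0.0    # sum of tau_aa over every residue in the protein
    for aa, count in composition.items():
        n_aa, alpha_aa = residue_data[aa]
        n_p += count * n_aa
        tau_sum += count * tau_from_alpha(n_aa, alpha_aa)
    return 4.0 / n_p * tau_sum ** 2


# ASSUMPTION: electron counts of the free neutral amino acids (illustrative).
RESIDUE_DATA = {"L": (72, 13.0), "K": (80, 14.0)}

alpha_p = protein_polarizability({"L": 8, "K": 6}, RESIDUE_DATA)
print(f"{alpha_p:.1f}")  # about 188 Å^3 with these illustrative inputs
```

By construction, a chain made of a single type of amino acid returns exactly the sum of its residue polarizabilities, which is a quick consistency check on the two formulas.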
The values of τ and αaa obtained with DFT are reported in Table 1 for the 6‐31G+(d,p) and 6‐311G++( ) basis sets using the B3LYP functional. The 6‐31G+(d,p) basis set allows us to compare the predicted average polarizability against the calculated one for the Trp‐cage mini protein, whereas the DFT calculations were not feasible for the larger basis set. The average polarizability of the Trp‐cage protein calculated by DFT is 221 Å3; this calculation consumed more than 2000 CPU hours. The average polarizability of Trp cage calculated with our semi‐empirical approach is 215 Å3, with an error against DFT of 2.7 %, and was calculated in less than 200 μs. Thus, this approach allows the calculation of the average polarizabilities and hence the dielectric constants of all the structures stored in the PDB. However, for these calculations we will use the amino acid polarizabilities obtained with the larger basis set 6‐311G++( ) to get more accurate predictions; for Trp cage this approach yields 234 Å3.
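The two comparisons quoted for the Trp cage can be checked with a line of arithmetic each. The speed‐up figure is only a rough lower bound, because it compares the stated bounds "more than 2000 CPU hours" and "less than 200 μs" and treats CPU time and elapsed time as comparable, which is an assumption.

```latex
\frac{221 - 215}{221} \approx 0.027 = 2.7\,\%,
\qquad
\frac{2000~\text{h} \times 3600~\text{s/h}}{200 \times 10^{-6}~\text{s}}
  = \frac{7.2 \times 10^{6}~\text{s}}{2 \times 10^{-4}~\text{s}}
  = 3.6 \times 10^{10}
```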
Table 1.
| Amino acid | α′ (Å3) | α (Å3) | V (Å3) | 4πα/(3V) |
|---|---|---|---|---|
| G | 6 | 6 | 63 | 0.41 |
| A | 7 | 8 | 81 | 0.42 |
| S | 8 | 9 | 92 | 0.39 |
| P | 10 | 11 | 109 | 0.41 |
| V | 11 | 12 | 119 | 0.41 |
| T | 10 | 11 | 109 | 0.41 |
| C | 10 | 11 | 98 | 0.47 |
| I | 13 | 14 | 141 | 0.41 |
| L | 11 | 13 | 139 | 0.40 |
| N | 10 | 11 | 112 | 0.41 |
| D | 11 | 12 | 102 | 0.49 |
| Q | 12 | 13 | 132 | 0.41 |
| K | 13 | 14 | 158 | 0.36 |
| E | 14 | 15 | 121 | 0.52 |
| M | 14 | 15 | 139 | 0.46 |
| H | 14 | 15 | 138 | 0.45 |
| F | 17 | 18 | 160 | 0.48 |
| R | 15 | 17 | 173 | 0.40 |
| Y | 18 | 19 | 168 | 0.48 |
| W | 22 | 23 | 193 | 0.50 |
To compare with our previous method, which allows the calculation of the full polarizability tensor, we calculated the polarizability tensor for perfect conductors of the same shape as the proteins by solving Laplace's equation with Dirichlet boundary conditions and using Monte Carlo path integral methods.19 Then, all tensors are diagonalized to transform the proteins to the polarizability frame, and the average of the diagonal elements is scaled by 0.26, which was the slope of the best‐fit line that described the correlation between the amino acids and perfect conductors of their shapes.16 The polarizabilities obtained from the sum over hybridization components correlate highly with those obtained by scaling the polarizabilities of perfect conductors, with R² = 0.8 and a slope of 1.6, with the intercept set to zero. Thus, the latter method produced polarizabilities that are 60 % higher, which we ascribe to effects of the uneven concentration of the individual amino acids in each protein. Overall, the method presented here provides a computationally highly efficient method for the calculation of the scalar polarizabilities. If the tensorial properties of the polarizability are needed, the current method could be used to generate the scaling factor that is applied to the tensor elements obtained in our previous method.16
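A regression line with the intercept set to zero has a closed‐form slope, Σxy/Σx². The sketch below shows that computation only; the function name and the three data points are hypothetical and are not taken from the article, and which of the two methods is placed on which axis is left open here.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = m*x with the intercept fixed at zero."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain a non-zero value")
    return sxy / sxx


# HYPOTHETICAL data that lie exactly on a line of slope 1.6.
x_demo = [100.0, 200.0, 300.0]
y_demo = [160.0, 320.0, 480.0]
print(round(slope_through_origin(x_demo, y_demo), 6))  # 1.6
```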
In order to solve the Clausius−Mossotti equation, the volumes of the proteins are calculated as the sum of the volumes of the constituent amino acids. The volumes of the 20 amino acids are calculated using the Volume Assessor software by rolling a virtual sphere with a probe radius of 1 pm on the surface of the amino acids.20 The calculated volumes are reported in Table 1.
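The volume summation can be sketched as a lookup over a one‐letter sequence. The volumes are the Table 1 values in Å3; the names `RESIDUE_VOLUME` and `protein_volume` are hypothetical, and the example sequence only reproduces the composition stated for PDB ID 6HNG (eight leucine, six lysine), with the residue order being an assumption.

```python
# Amino acid volumes from Table 1, in cubic angstrom.
RESIDUE_VOLUME = {
    "G": 63, "A": 81, "S": 92, "P": 109, "V": 119,
    "T": 109, "C": 98, "I": 141, "L": 139, "N": 112,
    "D": 102, "Q": 132, "K": 158, "E": 121, "M": 139,
    "H": 138, "F": 160, "R": 173, "Y": 168, "W": 193,
}


def protein_volume(sequence):
    """Sum the Table 1 volumes over a one-letter amino acid sequence.

    Raises KeyError for any residue code that is not in the table, so that
    non-standard residues and cofactors are never silently ignored.
    """
    return sum(RESIDUE_VOLUME[aa] for aa in sequence)


# Composition of 6HNG as stated in the text; the ordering is illustrative.
print(protein_volume("L" * 8 + "K" * 6))  # 2060
```

Raising an error for unknown residue codes is a deliberate choice in this sketch, since cofactors have very different volumes and polarizabilities than the amino acids.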
The average static dielectric constant ϵr for more than 150,000 protein structures stored in the PDB database based on their amino acid composition is 3.23 with a standard deviation of 0.04, see Figure 1a. According to the Clausius−Mossotti relation, the ratio between the average polarizability and the volume, α/V, is the factor that determines the value of ϵr. Thus, due to the strong correlation between the average polarizability and the molecular volume with R² = 1, Figure 1b, the standard deviation of ϵr is very small. According to the regression model shown in Figure 1b, the polarizability α of proteins could be calculated from the volume by a straight‐line equation with negligible residuals. Both the volume and the average polarizabilities exhibit a skewed normal distribution, shown in Figure 1c, d.
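The link between the narrow distribution of ϵr and the tight correlation of α with V can be made quantitative with a first‐order error propagation. The estimate below assumes the Gaussian‐unit Clausius−Mossotti form with x = 4πα/(3V) and treats the spread as small; it is a rough estimate, not a result reported in the article.

```latex
\epsilon_r = \frac{1 + 2x}{1 - x}, \qquad x = \frac{4\pi\alpha}{3V}
\quad\Longrightarrow\quad
\frac{\mathrm{d}\epsilon_r}{\mathrm{d}x} = \frac{3}{(1 - x)^2} = \frac{(\epsilon_r + 2)^2}{3}

% At the mean value \epsilon_r = 3.23:
x = \frac{3.23 - 1}{3.23 + 2} \approx 0.426, \qquad
\frac{\mathrm{d}\epsilon_r}{\mathrm{d}x} \approx \frac{5.23^2}{3} \approx 9.1

% A standard deviation of 0.04 in \epsilon_r then corresponds to
\sigma_x \approx \frac{0.04}{9.1} \approx 0.0044, \qquad
\frac{\sigma_x}{x} \approx 1\,\%
```

Under these assumptions, a standard deviation of 0.04 in ϵr corresponds to a spread of only about 1 % in α/V across the proteins.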
The maximum dielectric constant of 3.7 is observed for the N‐terminal human band 3 peptide with PDB ID 2BTA,21 which has an average polarizability of 212.7 Å3 and a volume of 1879 Å3. The large polarizability of this peptide is attributed to the ASP and GLU amino acids, which represent 50 % of the constituent amino acids and have high α/V ratios. The minimum ϵr of 2.8 is observed for peptide‐membrane PDB ID 6HNG,22 which is formed by only eight leucine and six lysine amino acids. The lysine amino acid generally has a small α/V ratio, because it is positively charged, i.e., it has fewer electrons than neutral or negatively charged amino acids, and these electrons are also more strongly bound.
Within the same protein the value of ϵr may change according to the composition of the different parts. For example, in norrin, a Wnt signaling activator, PDB ID 5BPU,23 the chains A, B, D, E, and F have , while chains H and I have as they are only formed by GLU amino acids. Thus, ϵr distributions can be inhomogeneous within a protein, which agrees with previous studies based on MD simulations and continuum electrostatics simulations.12, 13, 14 Furthermore, proteins have a variety of cofactors such as chlorophyll, metal clusters, chloride ions, hemes, quinones, … These molecules are very different from the amino acids and could have a large impact on the dielectric environment of the proteins. For example, the calculated average polarizability of chlorophyll is 132.3 Å3, with a volume of 900 Å3, which results in , while for iron‐sulphur clusters of photosystem I in the oxidized state,24 and its amino acid ligands .
To study the effect of pH on the dielectric constant, we recalculated the distribution of ϵr for all proteins by replacing the average polarizabilities αaa of GLU−, ASP−, and HIS0 with the average polarizabilities of the protonated forms GLU0, ASP0, and HIS+ to simulate a low‐pH environment. The mean of the distribution decreased to 3.15 and the standard deviation is unchanged. Because the mean of ϵr changed by only 0.08, it is a reasonable assumption that proteins that experience a pH gradient across different structural motifs have the same dielectric constants.
In conclusion, we developed an empirical method for the calculation of the average polarizabilities of dry proteins based on their amino acid composition. The method is computationally highly efficient and allowed us to calculate the average polarizabilities and dielectric constants of all molecular structures in the PDB. The average dielectric constant for more than 150,000 proteins is ϵr = 3.23, with a very small standard deviation of 0.04, due to the strong correlation between the average polarizability and the molecular volume.
However, organic and inorganic cofactors could alter the dielectric environment of the proteins significantly. Thus, in order to understand the chemical reactions in proteins, the correct dielectric environment should be implemented in the biochemical/biophysical calculations.
We point out that the current approach does not take into account the molecular shape, which is valid for the scalar average polarizability, see also (1). For the computation of tensorial properties, more advanced and more expensive methods have to be employed.16
Supporting Information
We provide a compressed text file in comma‐separated‐value format that contains the polarizabilities, the volumes, and the dielectric constants for all structures in PDB (as of 01. August 2019).
Conflict of interest
The authors declare no conflict of interest.
Acknowledgements
This work has been supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007‐2013) through the Consolidator Grant COMOTION (614507) and by the Deutsche Forschungsgemeinschaft through the Cluster of Excellence “Advanced Imaging of Matter” (AIM, EXC 2056, ID 390715994).
M. Amin, J. Küpper, ChemistryOpen 2020, 9, 691.
Contributor Information
Muhamed Amin, Email: muhamed.amin@cfel.de, https://www.controlled-molecule-imaging.org.
Jochen Küpper, Email: jochen.kuepper@cfel.de.
References
- 1. Huynh M. H. V., Meyer T. J., Chem. Rev. 2007, 107, 5004–5064.
- 2. Amin M., Vogt L., Szejgis W., Vassiliev S., Brudvig G. W., Bruce D., Gunner M., J. Phys. Chem. B 2013, 119, 7366–7377.
- 3. Armstrong C. M., Hille B., Neuron 1998, 20, 371–380.
- 4. Sigworth F. J., Quart. Rev. Biophys. 1994, 27, 1–40.
- 5. Keramidas A., Moorhouse A. J., Schofield P. R., Barry P. H., Prog. Biophys. Mol. Biol. 2004, 86, 161–204.
- 6. Gray H. B., Winkler J. R., Annu. Rev. Biochem. 1996, 65, 537–561.
- 7. Sheinerman F. B., Norel R., Honig B., Curr. Opin. Struct. Biol. 2000, 10, 153–159.
- 8. Rosen D., Trans. Faraday Soc. 1963, 59, 2178–2191.
- 9. Takashima S., Schwan H. P., J. Phys. Chem. 1965, 69, 4176–4182.
- 10. Kukic P., Farrell D., McIntosh L. P., García-Moreno B., Jensen E. K. S., Toleikis Z., Teilum K., Nielsen J. E., J. Am. Chem. Soc. 2013, 135, 16968–16976.
- 11. Gilson M. K., Honig B. H., Biopolymers 1986, 25, 2097–2119.
- 12. Li L., Li C., Zhang Z., Alexov E., J. Chem. Theory Comput. 2013, 9, 2126–2136.
- 13. Hazra S. T., Siwen A. U., Emil W., Zhao A., J. Math. Chem. 2019, 2282–2294.
- 14. Simonson T., Perahia D., Proc. Natl. Acad. Sci. USA 1995, 92, 1082–1086.
- 15. Jansen L., Phys. Rev. 1958, 112, 434–444.
- 16. Amin M., Samy H., Küpper J., J. Phys. Chem. Lett. 2019, 10, 2938–2943, arXiv:1904.02504 [physics].
- 17. Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H., Shindyalov I. N., Bourne P. E., Nucleic Acids Res. 2000, 28, 10.1093/nar/28.1.235.
- 18. Miller K. J., Savchik J., J. Am. Chem. Soc. 1979, 112, 7206–7213.
- 19. Mansfield M. L., Douglas J. F., Garboczi E. J., Phys. Rev. A 2001, 64, 061401–061416.
- 20. Voss N., Gerstein M., Steitz T., Moore P., J. Mol. Biol. 2006, 360, 893–906.
- 21. Schneider M., Post C., Biochemistry 1995, 34, 16574–16584.
- 22. Schneider G., Blatter M., Mueller A., Protein Data Bank website 2018.
- 23. Chang T., Hsieh F., Zebisch M., Harlos K., Elegheert J., Jones E., Elife 2015, 4, e06554.
- 24. Golbeck J. H., Annu. Rev. Plant Physiol. Plant Mol. Biol. 1992, 43, 293–324.