Proteins. 2022 Oct 26;91(3):395–399. doi: 10.1002/prot.26437

Interplay between hydrogen and chalcogen bonds in cysteine

Oliviero Carugo
PMCID: PMC10092013  PMID: 36250971

Abstract

Protein structures are stabilized by several types of chemical interactions between amino acids, which can compete with each other. This is the case for the chalcogen and hydrogen bonds formed by the thiol group of cysteine, which can form three hydrogen bonds (with one hydrogen acceptor and two hydrogen donors) and a chalcogen bond with a nucleophile along the extension of the C‐S bond. A survey of the Protein Data Bank shows that hydrogen bonds are about 40–50 times more common than chalcogen bonds, suggesting that they are stronger and, consequently, prevail, though not always. It is also observed that a thiol group that forms a chalcogen bond is frequently also involved, as a hydrogen donor, in a hydrogen bond.

Keywords: chalcogen bond, hydrogen bond, Protein Data Bank, protein structure

1. INTRODUCTION

It was realized, long ago, that a folded protein “consists of one polypeptide chain which continues without interruptions throughout the molecule (or, in certain cases, of two or more such chains)” and that “this chain is folded into a uniquely defined configuration, in which it is held by hydrogen bonds”. 1 Later on, a series of other non‐covalent interactions were found to be responsible for protein folding, stability, plasticity, and function, like van der Waals, hydrophobic, and electrostatic interactions. 2 One of these non‐covalent interactions has received little attention so far: the chalcogen bond.

It is an attractive interaction between chalcogen atoms (sulfur, selenium, or tellurium) and nucleophiles. In molecular moieties like R‐X‐R (X = S, Se, or Te), the nucleophile tends to occupy a position along the extension of one of the R‐X covalent bonds (Figure 1). 3 Although recent publications have extensively reviewed both theoretical and experimental studies of chalcogen bonds, 4 , 5 little attention has been paid to this non‐covalent interaction in biological systems.

FIGURE 1.

Scheme of the chalcogen and hydrogen bonds that may involve the cysteine side‐chain. Nu indicates a nucleophile, AAA a hydrogen acceptor, and DH a hydrogen donor

Despite the pioneering work of Thornton, who reported the interaction between the sulfur atoms of cysteine and methionine and the aromatic rings of tryptophan, tyrosine, and phenylalanine, which are nucleophiles, 6 little was published in the early days of structural bioinformatics. No trace of chalcogen bonds involving methionine sulfur atoms emerged in a 1999 analysis of room‐temperature protein crystal structures. 7 In a subsequent study of a larger data set, some evidence of chalcogen bonds between the methionine sulfur atom and backbone or side‐chain carboxylate oxygen atoms was observed. 8 Further statistical analyses, coupled with molecular orbital ab initio calculations, confirmed that the sulfur atoms of cysteine and methionine can form chalcogen bonds with protein polar atoms. 9 , 10 More recently, a 2021 study showed numerous chalcogen bonds formed by the selenium atom of selenomethionine in low‐temperature protein crystal structures. 11

Ligand binding to proteins can be influenced by chalcogen bonds, too. 12 , 13 For example, the activity of ebselen, a glutathione peroxidase mimic, is enhanced by its ability to form chalcogen bonds through its selenium atom. 14 This non‐covalent interaction is also important in the mechanism of inhibition of maltase glucoamylase by salacinol and kotalanol. 15

In this manuscript, the chalcogen bonds that involve cysteine side‐chains in proteins are identified and compared to the hydrogen bonds that involve the same side‐chains, which can behave either as hydrogen donors or as hydrogen acceptors (Figure 1). This is an attempt to determine the relative energies of the two types of interactions by comparison of their frequency of occurrence.

2. RESULTS AND DISCUSSION

All chalcogen and hydrogen bonds were identified according to the procedures described in Section 3. Interatomic contacts that satisfied the stereochemical criteria used in this study for both a hydrogen bond and a chalcogen bond were discarded (about 20% of the chalcogen bonds may be confused with hydrogen bonds and about 1% of the hydrogen bonds may be confused with chalcogen bonds).

Details about hydrogen and chalcogen bonds are given in Supplementary Material (Tables S1–S5) and are only briefly presented here.

As expected, given its high frequency of occurrence (one per residue), the most common nucleophile that forms chalcogen bonds is the main‐chain oxygen atom. Surprisingly, sulfur atoms act rather frequently as nucleophiles, too, despite their low frequency of occurrence in proteins. 16 As already observed, cysteines tend to be hydrogen donors more frequently than hydrogen acceptors in hydrogen bonds, with a ratio ranging roughly from 2:1 4 to 5:1. 17 When the cysteine thiol group acts as a hydrogen donor, the hydrogen acceptor is often a main‐chain oxygen atom, and when it acts as a hydrogen acceptor, the hydrogen donor is frequently a main‐chain or water oxygen. Chalcogen bond average lengths (Table S5) are, as expected, larger when the nucleophile atom is sulfur than when it is oxygen; however, no statistically significant differences appear amongst different types of sulfur and oxygen atoms, likely because of the moderate accuracy of these macromolecular structures. 18

Concerning the focus of this study (the relative frequency of chalcogen and hydrogen bonds), it is interesting to observe that hydrogen bonds are about 40–50 times more common than chalcogen bonds (Table 1). This suggests that they are stronger, though obviously not 40–50 times stronger. It indicates that in most cases the hydrogen bond that a thiol group may form is more stable than the alternative chalcogen bond: from a thermodynamic perspective, even a small difference in stability would make hydrogen bonds prevail. Precise estimations of the energy of these interactions are unfortunately impossible on the basis of statistical observations, since the probability density functions of the energies are unknown. However, it is clear that hydrogen bonds are stronger, on average.
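The remark that even a small difference in stability can produce a large difference in frequency can be made concrete with a rough sketch. It assumes, purely for illustration, that the two arrangements behave as a two‐state Boltzmann equilibrium at room temperature. This assumption is not made in the study, which states that the energies cannot be estimated from the frequencies. The function name and the temperature are hypothetical choices.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def free_energy_gap(ratio, temperature=298.15):
    """Free-energy difference (kcal/mol) that would give the population
    ratio `ratio` in an idealized two-state Boltzmann model.

    Illustrative assumption only: the study does not use this model.
    """
    return R_KCAL * temperature * math.log(ratio)


if __name__ == "__main__":
    # Frequency ratios of the order reported in the text (40-50 to 1)
    for ratio in (40, 50):
        print(ratio, round(free_energy_gap(ratio), 2), "kcal/mol")
```

Under this idealized model, a 40–50‐fold prevalence would correspond to a gap of only about 2 kcal/mol, which is of the same order as a single weak non‐covalent interaction.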

TABLE 1.

Frequencies of chalcogen bonds (Cb) and hydrogen bonds (Hb) in the Single dataset and in the nine subsets of the Protein Data Bank assembled with the RaSPDB procedure (see Section 3 for details). The average values (estimated errors in parentheses) are computed on these nine subsets

Dataset        Number Cb   Number Hb   % Cb         % Hb
Single         833         31 972      2.54         97.46
raspdb_1       321         16 848      1.87         98.13
raspdb_2       347         17 141      1.98         98.02
raspdb_3       359         17 026      2.06         97.94
raspdb_4       363         17 255      2.06         97.94
raspdb_5       346         16 878      2.01         97.99
raspdb_6       340         17 288      1.93         98.07
raspdb_7       359         16 381      2.14         97.86
raspdb_8       348         17 171      1.99         98.01
raspdb_9       347         16 936      2.01         97.99
Average‐1–9                            2.01(0.03)   97.99(0.03)
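The percentages and the averaged values in Table 1 can be recomputed from the bond counts. The sketch below assumes that the "estimated error" in parentheses is the standard error of the mean over the nine subsets. This is an assumption about the author's procedure, although it is consistent with the tabulated values. Variable and function names are hypothetical.

```python
import math
import statistics

# (number of chalcogen bonds, number of hydrogen bonds) for the nine
# RaSPDB subsets, copied from Table 1
counts = [
    (321, 16848), (347, 17141), (359, 17026),
    (363, 17255), (346, 16878), (340, 17288),
    (359, 16381), (348, 17171), (347, 16936),
]


def percent_cb(n_cb, n_hb):
    """Percentage of chalcogen bonds among all bonds of both types."""
    return 100.0 * n_cb / (n_cb + n_hb)


pct = [percent_cb(cb, hb) for cb, hb in counts]
mean_pct = statistics.mean(pct)
# Assumed error estimate: sample standard deviation / sqrt(number of subsets)
std_err = statistics.stdev(pct) / math.sqrt(len(pct))

if __name__ == "__main__":
    print(f"% Cb = {mean_pct:.2f}({std_err:.2f})")
    print(f"% Hb = {100 - mean_pct:.2f}({std_err:.2f})")
```

The error on the percentage of hydrogen bonds is the same as that on the percentage of chalcogen bonds because the two percentages sum to 100 in every subset.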

Table 1 also shows that the trends evaluated by using a non‐redundant subset of the Protein Data Bank (Single dataset) or by following the RaSPDB method (raspdb_X datasets) are nearly equivalent. This supports the use of the RaSPDB method, which allows one to use a greater amount of information and to compute estimated errors.

The trends outlined above are independent of the secondary structure or the degree of solvent accessibility of the cysteines.

If chalcogen and hydrogen bonds had the same strength, one would expect two to three hydrogen bonds per chalcogen bond, since the number of hydrogen bonds that a cysteine thiol group can form is likely to be higher than the number of chalcogen bonds that it can form (Figure 1). However, the observed difference in the number of bonds is much higher.

The interaction energy of both chalcogen and hydrogen bonds can be quite variable. Both have an electrostatic component, which strongly depends on the local environment, that is, on the local dielectric constant. Consequently, chalcogen bonds may be stronger than hydrogen bonds in some cases.

It is interesting to observe that the presence of a hydrogen donor may hinder, because of steric reasons, the formation of a chalcogen bond along the extension of the C—S covalent bond. In other words, the hydrogen donor and the nucleophile roughly compete for the same position close to the sulfur atom (Figure 1).

Interestingly, the same thiol group of the cysteine side‐chain can be involved in both a chalcogen and a hydrogen bond (Table 2). About 65% of the chalcogen bonds are associated with hydrogen bonds and in about 80% of these hydrogen bonds, the cysteine is a hydrogen donor (Table 2). In this regard, it is interesting to note that it has long been observed that cysteine side chains are good hydrogen donors and poor hydrogen acceptors in hydrogen bonds 17 , 19 , 20 : this could be due, at least in part, to the formation of competing chalcogen bonds.

TABLE 2.

Fraction of chalcogen bonds associated with a hydrogen bond involving the SG cysteine atom and percentage of cases where the cysteine is a hydrogen donor in these hydrogen bonds, in the Single dataset and in the nine subsets of the Protein Data Bank assembled with the RaSPDB procedure (see Section 3 for details)

Dataset        Fraction (%)   H‐donor (%)
Single         66.3           81.7
raspdb_1       67.0           87.7
raspdb_2       64.6           82.1
raspdb_3       63.5           81.8
raspdb_4       66.7           76.4
raspdb_5       61.0           80.2
raspdb_6       65.0           82.2
raspdb_7       69.4           83.4
raspdb_8       69.3           80.2
raspdb_9       64.8           87.6
Average‐1–9    65.8(0.9)      82.4(1.2)

An example is shown in Figure 2: cysteine 31 (chain A) of human thymidylate kinase forms a chalcogen bond with the main‐chain oxygen atom of alanine 37 and forms a hydrogen bond with the main‐chain oxygen atom of valine 27 (PDB file 1e9a 21 ).

FIGURE 2.

Example of chalcogen and hydrogen bonds formed by the same cysteine thiol group (data from the file 1e9a of the Protein Data Bank)

It is possible to hypothesize a local structural rearrangement that transforms the chalcogen bond into a hydrogen bond and vice versa, in other words, an oscillation between chalcogen and hydrogen bond. This would optimize the bonding requirements and decrease the entropic cost of protein folding.

It is nevertheless important to observe that the statistical analyses presented in this manuscript cannot provide a quantitative estimation of the energy associated with chalcogen and hydrogen bonds. There are in fact several limitations. For example, although the position of the hydrogen atoms is of crucial importance for defining hydrogen bonds, it is usually unknown in protein crystal structures, especially for acidic and rotatable hydrogen atoms. Moreover, without the hydrogen position, one cannot identify chalcogen bonds that might be formed along the extension of the H‐S covalent bond. Finally, chalcogen and hydrogen bonds involving aromatic rings were not considered in the present study, for the sake of simplicity, though aromatic rings might participate in protein structure stabilization by forming both chalcogen and hydrogen bonds.

Further analyses focused on X‐ray and neutron protein crystal structures at extremely high resolution might provide additional information, as might analyses of protein structures determined with alternative methods, for example, cryo‐electron crystallography and nuclear magnetic resonance in solution.

3. MATERIALS AND METHODS

3.1. Data selection

All data were extracted from the enormous amount of information available in the Protein Data Bank. 22 , 23 Only X‐ray crystal structures determined in the 80–120 K temperature range and refined at a resolution of 2.0 Å or better were retained. This resulted in about 66 500 entries of the Protein Data Bank.
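The selection criteria above amount to a simple filter on three properties of each entry. The following minimal sketch assumes that these properties have already been parsed into a small record. The `Entry` class, its field names, the method string, and the example identifiers are all hypothetical and do not correspond to real Protein Data Bank entries.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical, already-parsed summary of one structure."""
    entry_id: str
    method: str          # experimental method
    resolution: float    # in Angstrom (lower is better)
    temperature: float   # data-collection temperature in K


def is_retained(entry):
    """Apply the three selection criteria described in the text."""
    return (
        entry.method == "X-RAY DIFFRACTION"
        and entry.resolution <= 2.0
        and 80.0 <= entry.temperature <= 120.0
    )


if __name__ == "__main__":
    # Made-up entries for illustration only
    entries = [
        Entry("demo1", "X-RAY DIFFRACTION", 1.6, 100.0),
        Entry("demo2", "X-RAY DIFFRACTION", 2.4, 100.0),
        Entry("demo3", "X-RAY DIFFRACTION", 1.8, 293.0),
        Entry("demo4", "SOLUTION NMR", 0.0, 298.0),
    ]
    print([e.entry_id for e in entries if is_retained(e)])
```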

Then two strategies were followed to extract non‐redundant sets of data.

On the one hand, the pairwise sequence redundancy was reduced with CD‐HIT (maximal sequence identity of 40%) 24 and the attention was limited to chains containing at least 50 amino acids. This resulted in the Single dataset, which contains about 14 000 protein chains.

On the other hand, the RaSPDB procedure was applied. 25 It consists of creating several subsets of the Protein Data Bank. Each subset must be large enough to be representative of the Protein Data Bank and small enough to avoid internal redundancy. Nine non‐overlapping subsets, each containing about 7000 protein chains of more than 50 amino acids, were assembled; all statistical analyses were performed on each of them and then averaged. This procedure allows one to use a much larger fraction of the Protein Data Bank and to estimate the standard error of each estimate. This results in the nine subsets raspdb_X (X = 1–9).
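The idea of drawing several non‐overlapping random subsets can be sketched as follows. This is only an illustration of disjoint random sampling, not the RaSPDB implementation. In particular, it does nothing to check that each subset is free of internal redundancy. The function name, the seed, and the chain identifiers are hypothetical.

```python
import random


def disjoint_random_subsets(chain_ids, n_subsets, subset_size, seed=0):
    """Return `n_subsets` random subsets of `subset_size` items each,
    with no item shared between subsets (illustrative sketch only)."""
    pool = list(chain_ids)
    if n_subsets * subset_size > len(pool):
        raise ValueError("not enough chains for the requested subsets")
    # Shuffle once with a fixed seed, then cut consecutive slices:
    # slices of a shuffled list cannot overlap
    random.Random(seed).shuffle(pool)
    return [
        pool[i * subset_size:(i + 1) * subset_size]
        for i in range(n_subsets)
    ]


if __name__ == "__main__":
    # Hypothetical chain identifiers
    chains = [f"chain_{i}" for i in range(100)]
    subsets = disjoint_random_subsets(chains, n_subsets=9, subset_size=10)
    print([len(s) for s in subsets])
```

Any statistic can then be computed on each subset separately and summarized by its mean and standard error over the subsets.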

3.2. Chalcogen bond detection

In previous studies of chalcogen bonds formed by selenomethionine, the position of the nucleophile relative to the selenium atom was described by means of spherical coordinates, 7 , 11 which require the atomic positions of the C—Se—C triatomic fragment of the selenomethionine side‐chain. An analogous approach is impossible here, where the attention is focused on the C—S—H triatomic fragment of cysteine, given that the coordinates of this hydrogen atom are usually unknown, since acidic and rotatable hydrogen atoms are often undetected, even at very high crystallographic resolution or in neutron diffraction studies.

In principle, it is possible to compute the position of these hydrogen atoms by optimizing their interactions with nearby atoms, 26 that is, by optimizing their hydrogen bonds. 27 Here, it is preferable to avoid the computation of the position of these hydrogen atoms, since this would inevitably bias the analysis of chalcogen bonds.

As a consequence, an S···Nu chalcogen bond was simply defined as a contact shorter than 3.4 Å (when Nu is an oxygen atom) or than 3.7 Å (when Nu is a sulfur atom) that is colinear or nearly colinear with the C‐S bond, which means that the angle α = 180° − (Cβ‐Sγ‐Nu) must be smaller than 25°. Note that this threshold is larger than 20°, the value used in chemistry and material science, since it is necessary to consider the lower accuracy of macromolecular crystal structures.
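The geometric definition above can be written as a short test on three atomic positions. The sketch below uses the distance and angle thresholds given in the text. The function names, the coordinate format (tuples in Å), and the example coordinates are hypothetical, and the removal of disulfide and metal‐bridged contacts is not shown.

```python
import math

# Maximum S...Nu distance (Angstrom) for each nucleophile element
MAX_DISTANCE = {"O": 3.4, "S": 3.7}
# Maximum deviation from colinearity with the C-S bond (degrees)
MAX_ALPHA = 25.0


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to protect acos from rounding errors
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_chalcogen_bond(cb, sg, nu, nu_element):
    """True if the Sγ...Nu contact meets the distance and angle criteria."""
    limit = MAX_DISTANCE.get(nu_element)
    if limit is None or math.dist(sg, nu) >= limit:
        return False
    alpha = 180.0 - angle_deg(cb, sg, nu)
    return alpha < MAX_ALPHA


if __name__ == "__main__":
    cb, sg = (-1.8, 0.0, 0.0), (0.0, 0.0, 0.0)
    # Oxygen on the extension of the Cβ-Sγ bond
    print(is_chalcogen_bond(cb, sg, (3.2, 0.0, 0.0), "O"))
    # Oxygen at the same distance but perpendicular to the bond
    print(is_chalcogen_bond(cb, sg, (0.0, 3.2, 0.0), "O"))
```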

Care was taken to remove from the list of chalcogen bonds the disulfide bonds and the short sulfur–sulfur contacts that may be observed for radiation‐damaged disulfide bonds. 28 , 29 Analogously, short sulfur–sulfur contacts resulting from the interactions of the sulfur atoms with the same heteroatom, typically a metal cation, were removed from the list of chalcogen bonds.

3.3. Hydrogen bond detection

Potential hydrogen bonds that involve cysteine were identified with HBPLUS 30 and filtered according to the following criteria 17 , 31 : S—A < 4.3 Å and S—A—AA > 90° when the cysteine is a hydrogen donor; and D—S < 4.1 Å when the cysteine is a hydrogen acceptor. Additional stereochemical criteria that can be used to identify hydrogen bonds and that require the knowledge of the position of the hydrogen atoms were disregarded, since the hydrogen atom position is generally unknown.
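The filtering criteria above can be sketched in the same way. The code assumes that candidate contacts have already been proposed by an external program, so the HBPLUS step itself is not reproduced. Here A is the acceptor atom, AA the atom covalently bound to it, and D the donor atom. All function names and example coordinates are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def passes_donor_filter(sg, acceptor, acceptor_antecedent):
    """Cysteine as hydrogen donor: S-A < 4.3 Angstrom and S-A-AA > 90 degrees."""
    return (
        math.dist(sg, acceptor) < 4.3
        and angle_deg(sg, acceptor, acceptor_antecedent) > 90.0
    )


def passes_acceptor_filter(donor, sg):
    """Cysteine as hydrogen acceptor: D-S < 4.1 Angstrom."""
    return math.dist(donor, sg) < 4.1


if __name__ == "__main__":
    sg = (0.0, 0.0, 0.0)
    # Acceptor 3.5 Angstrom away, antecedent pointing away from the sulfur
    print(passes_donor_filter(sg, (3.5, 0.0, 0.0), (4.7, 0.5, 0.0)))
    # Donor atom 3.9 Angstrom away from the sulfur
    print(passes_acceptor_filter((0.0, 0.0, 3.9), sg))
```

Because the hydrogen position is not used, these tests rely only on heavy‐atom geometry, as stated in the text.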

3.4. Miscellaneous

Solvent‐accessible surface areas were computed with NACCESS 32 and secondary structure assignments were performed with Stride. 33

AUTHOR CONTRIBUTIONS

Oliviero Carugo designed the procedures, executed all computations, and wrote the manuscript.

CONFLICTS OF INTEREST

The author declares no conflict of interest.

Supporting information

Table S1 Supporting information

ACKNOWLEDGMENTS

Kristina Djinović is gratefully acknowledged for her kind hospitality and Prof. A. Stradella for his constant support.

Carugo O. Interplay between hydrogen and chalcogen bonds in cysteine. Proteins. 2023;91(3):395‐399. doi: 10.1002/prot.26437

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available in the Protein Data Bank at https://www.rcsb.org. PDB files used in this study are listed in the Supplementary Material.

REFERENCES

  • 1. Mirsky AE, Pauling L. On the structure of native, denatured, and coagulated proteins. Proc Natl Acad Sci U S A. 1936;22:439‐447.
  • 2. Karshikoff A. Non‐Covalent Interactions in Proteins. Imperial College Press; 2006.
  • 3. Aakeroy CB, Bryce DL, Desiraju GR, et al. Definition of the chalcogen bond (IUPAC recommendations 2019). Pure Appl Chem. 2019;91:1889‐1892.
  • 4. Scheiner S. Participation of S and Se in hydrogen and chalcogen bonds. CrystEngComm. 2021;23:6821‐6837.
  • 5. Scilabra P, Terraneo G, Resnati G. The chalcogen bond in crystalline solids: a world parallel to halogen bond. Acc Chem Res. 2019;52:1313‐1324.
  • 6. Reid KSC, Lindley PF, Thornton JM. Sulphur‐aromatic interactions in proteins. FEBS Lett. 1985;190:209‐213.
  • 7. Carugo O. Stereochemistry of the interaction between methionine sulfur and the protein core. Biol Chem. 1999;380:495‐498.
  • 8. Pal D, Chakrabarti P. Non‐hydrogen bond interactions involving the methionine sulfur atom. J Biomol Struct Dyn. 2001;1:115‐128.
  • 9. Iwaoka M, Takemoto S, Okada M, Tomoda S. Weak nonbonded S···X (X = O, N, and S) interactions in proteins. Statistical and theoretical studies. Bull Chem Soc Jpn. 2002;75:1611‐1625.
  • 10. Junming L, Yunxiang L, Subin Y, Weiliang Z. Theoretical and crystallographic data investigations of noncovalent S…O interactions. Struct Chem. 2011;22:757‐763.
  • 11. Carugo O, Resnati G, Metrangolo P. Chalcogen bonds involving selenium in protein structures. ACS Chem Biol. 2021;16:1622‐1627.
  • 12. Kříž K, Fanfrlík J, Lepšík M. Chalcogen bonding in protein‐ligand complexes: PDB survey and quantum mechanical calculations. ChemPhysChem. 2018;19:2540‐2548.
  • 13. Fick RJ, Kroner GM, Nepal B, et al. Oxygen chalcogen bonding mediates AdoMet recognition in the lysine methyltransferase SET7/9. ACS Chem Biol. 2016;11:748‐754.
  • 14. Daolio A, Scilabra P, Di Pietro ME, Resnati C, Rissanen K, Resnati G. Binding motif of ebselen in solution: chalcogen and hydrogen bonds team up. New J Chem. 2020;44:20697‐20703.
  • 15. Galmés B, Juan‐Bals A, Frontera A, Resnati G. Charge‐assisted chalcogen bonds: CSD and DFT analyses and biological implication in glucosidase inhibitors. Chem A Eur J. 2022;26:4599‐4606.
  • 16. Carugo O. Amino acid composition and protein dimension. Protein Sci. 2008;17:2187‐2191.
  • 17. Zhou P, Tian F, Lv F, Shang Z. Geometric characteristics of hydrogen bonds involving sulfur atoms in proteins. Proteins. 2009;76:151‐163.
  • 18. Dinesh Kumar KS, Gurusaran M, Satheesh SN, et al. Online_DPI: a web server to calculate the diffraction precision index for a protein structure. J Appl Cryst. 2015;48:939‐942.
  • 19. Gregoret LM, Rader SD, Fletterick RJ, Cohen FE. Hydrogen bonds involving sulfur atoms in proteins. Proteins. 1991;9:99‐107.
  • 20. Pal D, Chakrabarti P. Different types of interaction of cysteine sulphidryl group in proteins. J Biomol Struct Dyn. 1998;15:1059‐1072.
  • 21. Ostermann N, Lavie A, Padiyar S, et al. Potentiating AZT activation: structures of wildtype and mutant human thymidylate kinase suggest reasons for the mutants' improved kinetics with the HIV prodrug metabolite AZTMP. J Mol Biol. 2000;304:43‐53.
  • 22. Bernstein FC, Koetzle TF, Williams GJB, et al. The Protein Data Bank: a computer‐based archival file for macromolecular structures. J Mol Biol. 1977;112:535‐542.
  • 23. Berman HM, Westbrook J, Feng Z, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28:235‐242.
  • 24. Fu L, Niu B, Zhu Z, Wu S, Li W. CD‐HIT: accelerated for clustering the next generation sequencing data. Bioinformatics. 2012;28:3150‐3152.
  • 25. Carugo O. Random sampling of the Protein Data Bank: RaSPDB. Sci Rep. 2021;11:24178.
  • 26. Li Y, Roy A, Zhang Y. HAAD: a quick algorithm for accurate prediction of hydrogen atoms in protein structures. PLoS One. 2009;4:e6701.
  • 27. Lippert T, Rarey M. Fast automated placement of polar hydrogen atoms in protein‐ligand complexes. J Cheminform. 2009;1:13.
  • 28. Holton JM. A beginner's guide to radiation damage. J Synchrotron Radiat. 2009;16:133‐142.
  • 29. Carugo O, Djinovic‐Carugo K. When X‐rays modify the protein structure: radiation damage at work. Trends Biochem Sci. 2005;30:213‐219.
  • 30. McDonald IK, Thornton JM. Satisfying hydrogen bonding potential in proteins. J Mol Biol. 1994;238:777‐793.
  • 31. Mazmanian K, Karen S, Cédric G, Dudev T, Carmay L. Preferred hydrogen‐bonding partners of cysteine: implications for regulating Cys functions. J Phys Chem B. 2016;120:10288‐10296.
  • 32. Hubbard SJ, Thornton JM. NACCESS. Department of Biochemistry and Molecular Biology, University College; 1993.
  • 33. Heinig M, Frishman D. STRIDE: a web server for secondary structure assignment from known atomic coordinates of proteins. Nucleic Acids Res. 2004;32:W500‐W502.


Articles from Proteins are provided here courtesy of Wiley
