Abstract
The roles of carbohydrates in nature are many and varied. However, the lack of template encoding in glycoscience distances carbohydrate structure, and hence function, from gene sequence. This challenging situation is compounded by descriptors of carbohydrate structure and function that have tended to emphasise their complexity. Herein, we suggest that revising the language of glycoscience could make interdisciplinary discourse more accessible to all interested parties.
Keywords: carbohydrates, epigenetics, glycoscience vocabulary, glycosyl hydrolase, glycosylation, glycosyltransferase
The lack of template‐encoding in glycoscience disconnects the “glyco code” from direct gene sequence control. Thus carbohydrate biosynthesis and glycan function depend upon a series of protein–carbohydrate interaction events. Descriptions of these events have tended to emphasise their complexity: here we suggest that revising the language of glycoscience could make interdisciplinary discourse more accessible to all parties.

The perception that glycoscience (the chemistry and biology of carbohydrates) is both complex and ubiquitous in nature[1, 2] has led to the notion that “carbohydrates in molecular biology are like dark matter in the universe… poorly studied yet crucial to a full understanding of how things actually work”.[3] In contrast to nucleic acids and proteins (DNA codes for RNA codes for protein), the lack of template‐encoding disconnects the “glyco code”[4] from direct gene sequence control. As a result, carbohydrate biosynthesis and the biological function of glycans depend upon a series of protein–carbohydrate interaction events. Overall, the concerted actions of lectins, glycosyltransferases and/or glycoside hydrolases achieve the integrity of mature bioactive glycan structures. The intricacies of this landscape are made worse by the tendency of the glycoscience community to emphasise the complexities of the field, perhaps making it less accessible to the casual reader (the informed non‐expert) than it needs to be. The glycoscience community is not alone in this shortcoming, as highlighted by a Comment in Nature that suggests that “Antibiotic resistance has a language problem. A failure to use words clearly undermines the global response to antimicrobials' waning usefulness”.[5] Technological[6] and informatics[7] advances in glycoscience, alongside combinations of the two,[8] are providing new ways to cut through the complexity, whilst comprehensive books on glycobiology topics provide entry points into the field.[9] The introduction of the stylized Symbol Nomenclature for Glycans (SNFG; Figure 1) also represents an important step towards simplifying communication within and between interested disciplines,[10] along with guidelines for experimental design and data curation[11] and a repository for glycan structures.[12]
Figure 1.

Representing glycan structures: simplification and standardization with stylized SNFG. Taken from ref. 10.
As discussed recently by Gabius,[13] there are notable parallels between aspects of glycobiology and the epigenetic regulation of chromatin structure and function. The latter processes, which occur with exquisite precision, are typically referred to in stripped‐down terms as a series of read, write and erase events, making the field immediately accessible to outsiders. Indeed, this approach emulates computer programming's create, read, update and delete (CRUD),[14] the four basic functions employed for persistent data storage.[15] Herein, we consider the potential to recast glycoscience language in the terms of the epigenetic vocabulary.
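The CRUD comparison can be made concrete in a few lines of Python. This is a minimal, hypothetical sketch rather than anything taken from the cited sources: the class name `MarkStore`, its method names and the site label `"lysine_a"` are assumptions, chosen only to show how write, read and erase might line up with create/update, read and delete.

```python
class MarkStore:
    """Hypothetical key-value store standing in for a modifiable biomolecule."""

    def __init__(self):
        self._marks = {}  # site label -> mark name

    def write(self, site, mark):
        """Writer: CRUD create (new site) or update (already marked site)."""
        self._marks[site] = mark

    def read(self, site):
        """Reader: CRUD read; returns None when the site carries no mark."""
        return self._marks.get(site)

    def erase(self, site):
        """Eraser: CRUD delete; erasing an unmarked site does nothing."""
        self._marks.pop(site, None)


store = MarkStore()
store.write("lysine_a", "acetyl")   # "lysine_a" is an arbitrary illustrative label
print(store.read("lysine_a"))       # acetyl
store.erase("lysine_a")
print(store.read("lysine_a"))       # None
```

In this reading, "write" covers both create and update, which is one reason three verbs may be enough where CRUD uses four.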
In simple terms, epigenetics concerns small chemical changes (marks) in the chemical structure of chromatin, typically the histone proteins that organize and package DNA in chromosomes.[16] Dynamic changes in these epigenetic protein marks impact on the physical accessibility of gene sequences for expression, rather than on the alteration of the genetic code per se. The profound biological consequences of these processes have attracted enormous attention over the past decade, given their central role in life and their disruption in disease.[17] The molecular hallmarks of epigenetic regulation comprise a dynamic series of enzymatic modification steps that introduce marks to, or remove marks from, the histone protein structure. Epigenetic writers, which introduce epigenetic marks on amino acid residues of the histones, include histone acetyltransferases (HATs, which N‐acetylate lysine), histone methyltransferases (HMTs, which N‐methylate lysine), protein arginine methyltransferases (PRMTs) and protein kinases (which O‐phosphorylate serine/threonine), amongst others. Epigenetic readers, which bind to epigenetic marks and amplify their impact on DNA packaging and hence gene accessibility for expression, include proteins containing bromodomains, chromodomains and Tudor domains. Epigenetic erasers, such as histone deacetylases (HDACs), lysine demethylases (KDMs) and phosphatases, catalyse the removal of epigenetic marks (Figure 2).
Figure 2.

Epigenetic writers, readers and erasers. DNA packaged around histones gives a condensed genomic information package (top) that can be selectively unwound by epigenetic modification (e.g., acetylation, methylation or phosphorylation) to expose genes for transcription (turn on). Abbreviations used are given in the text. Adapted from ref. 17a.
The impact of the lysine N‐acetylation epigenetic mark is perhaps simplest to appreciate. Writing this mark results in the loss of a positive charge on the lysine side chain of a histone, thus removing the potential for interaction with the negatively charged DNA backbone and loosening the DNA–histone complex. The resulting opening up of the chromosome structure enables the localized activation (turning on) of gene expression. In the opposite sense, erasing a lysine acetylation mark drives a tighter assembly of the histone–DNA complex and the silencing (turning off) of gene expression.
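The charge argument can be captured in a toy model. The following Python sketch assumes a single lysine side chain with a formal charge of +1 that drops to 0 on N‐acetylation. The function names and the two‐state "open"/"condensed" outcome are simplifying assumptions for illustration, not a quantitative description of chromatin.

```python
def lysine_charge(acetylated: bool) -> int:
    """Formal side-chain charge: +1 for free lysine, 0 once N-acetylated."""
    return 0 if acetylated else 1


def chromatin_state(acetylated: bool) -> str:
    """Toy two-state readout of the histone-DNA complex (an assumption)."""
    # A positively charged lysine can interact with the negatively charged
    # DNA backbone, favouring the tight complex; losing the charge loosens it.
    if lysine_charge(acetylated) > 0:
        return "condensed, gene off"
    return "open, gene on"


for label, acetylated in [("after writing (HAT)", True),
                          ("after erasing (HDAC)", False)]:
    print(label, "->", chromatin_state(acetylated))
```

A real histone tail carries many modifiable residues and several kinds of mark, so the single Boolean here is deliberately the smallest model that still shows writing and erasing pulling in opposite directions.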
The general principle of readers, writers and erasers prompts consideration of potential parallels between epigenetics and the control of glycan biosynthesis, structure and function. That is, does the notion of lectin readers, glycosyltransferase writers and glycosyl hydrolase erasers ring true in glycobiology? A convenient segue from epigenetics into glycoscience is provided by the reversible O‐GlcNAc modification of Ser/Thr residues in proteins.[18] This central metabolic “rheostat”[19] comprises a nutrient status‐responsive, post‐translational modification that impacts on protein–protein and protein–nucleic acid interactions, in turn regulating cellular events including transcription and signal transduction, with implications in diabetes, Alzheimer's disease and cancer.
So how does the O‐GlcNAc cycle work? O‐GlcNAc transferase (OGT) writes and O‐GlcNAcase erases, providing a simple and reversible modification cycle that is orthogonal to protein phosphorylation and which has far‐reaching physiological impact (Figure 3).[20]
Figure 3.

The O‐GlcNAc cycle and its impact on the modulation of cellular processes. Adapted from ref. 19.
In addition to glycosyltransferase writers and glycosyl hydrolase erasers, there are also potential readers in glycoscience, a function performed by lectins[21] and the carbohydrate‐binding modules (CBMs)[22] in multidomain CAZymes. The full read, write, erase combination in glycoscience is most easily exemplified by the proofreading and editing cycle associated with N‐linked glycoprotein biosynthesis. These processes are essential to ensuring the correct integrity and dynamics of cell‐surface glycoproteins, which contribute to the glycocalyx that dominates cell–cell interactions in the maintenance of healthy tissue and which underpin sperm–egg interactions during fertilisation, but which also serve as cellular receptors for a wide range of microbial pathogens.[9]
Asparagine‐linked protein N‐glycosylation starts in the endoplasmic reticulum (ER), while the peptide chain is unfolded, and proceeds through protein folding to the Golgi apparatus, where the glycan components are processed to a mature state. This requires a highly organised distribution of processing machinery to achieve the fidelity and quality control needed to ensure biological function.[23] Approximately 80 % of the proteins entering the secretory pathway are glycosylated in the ER and most of the proteins assembled in the ER feature N‐linked oligosaccharides. Most of the glycoproteins featuring mature N‐glycans are, as described by Aebi, “precisely heterogeneous” in their carbohydrate composition, a result of kinetically controlled processing.[24] Nonetheless, in the early stage of biosynthesis, before they reach their final mature and bioactive form, all N‐linked glycoproteins are homogeneously glycosylated. This is a result of a precise lectin chaperone (reader) based proofreading mechanism in the ER, which discriminates between correctly folded and misfolded glycoproteins (Figure 4).[25] Here the oligosaccharide plays a key role in presenting each glycoprotein for scrutiny by a sophisticated biological checkpoint process, which is referred to as glycoprotein quality control.[26] This process ensures that only correctly folded glycoproteins are transported to the Golgi for further glycan processing into mature glycoproteins. Unfolded and misfolded glycoproteins are retained in the ER for further folding attempts and are eventually degraded if the correctly folded status is not achieved.
Figure 4.

Carbohydrate writers, readers and erasers oversee the quality control of glycoprotein folding in the ER by modification of the N‐linked glycan high mannose oligosaccharide core structure.
The glycoprotein quality control system presents clear parallels to the read, write and erase processes of epigenetic regulation. In the first step of glycosylation, Glc3Man9GlcNAc2 is transferred en bloc from an oligosaccharyl dolichol diphosphate to the nitrogen of an asparagine side chain in the nascent polypeptide chain by the writer oligosaccharyltransferase (OST).[27] Immediately after the Glc3Man9GlcNAc2 is transferred, the eraser glucosidase‐I[28] cleaves off the terminal glucose (Glc) residue, which is necessary to prevent the glycoprotein product rebinding to the OST. Subsequently, the eraser glucosidase‐II[29] catalyses cleavage of a second glucose residue and the resulting monoglucosylated polypeptide is promptly sequestered by the calnexin (CNX)[30] and calreticulin (CRT)[31] lectin chaperone[32] readers. These chaperones prevent aggregation of the unfolded glycopolypeptide chains, and assist in their correct folding by presentation to the oxidoreductase ERp57, which is responsible for effecting correct disulfide bond formation.[33] Once the folded glycoprotein is released from the lectin chaperone readers, the eraser glucosidase‐II removes the final glucose residue and the glycoprotein undergoes inspection by the UDP‐glucose:glycoprotein glucosyltransferase (UGGT)[29] (Figure 4).
If the correct glycoprotein folding is not accomplished, UGGT serves as a writer and re‐glucosylates the misfolded glycoprotein in preparation for recycling to the chaperones/ERp57 machinery,[34] the so‐called calnexin/calreticulin cycle (Figure 4).[35] Following repeated failed folding attempts, the glycoprotein is degraded by the endoplasmic reticulum‐associated degradation (ERAD) system.[36] If correct folding is achieved, the glycoprotein is transported into the Golgi apparatus for further processing of the glycan to provide the mature glycoprotein.
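The write, erase and read steps of this quality control cycle can be laid out as a small state machine. The Python sketch below is a deliberately simplified illustration that tracks only the number of glucose residues on the glycan. The names `Glycoprotein` and `quality_control`, and the parameters `folds_on_attempt` and `max_attempts`, are hypothetical: the text specifies only "repeated failed folding attempts" before ERAD, so the cut‐off of three cycles is an arbitrary assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Glycoprotein:
    glucose: int = 3                 # Glc residues on freshly transferred Glc3Man9GlcNAc2
    fate: str = "in ER"
    log: List[str] = field(default_factory=list)


def quality_control(folds_on_attempt: int, max_attempts: int = 3) -> Glycoprotein:
    """Walk one glycoprotein through the calnexin/calreticulin cycle.

    folds_on_attempt: hypothetical cycle number on which folding succeeds.
    max_attempts: assumed number of cycles tolerated before ERAD.
    """
    gp = Glycoprotein()
    gp.log.append("write: OST transfers Glc3Man9GlcNAc2")
    gp.glucose -= 1
    gp.log.append("erase: glucosidase-I removes terminal Glc")
    gp.glucose -= 1
    gp.log.append("erase: glucosidase-II removes second Glc")

    for attempt in range(1, max_attempts + 1):
        gp.log.append("read: CNX/CRT bind monoglucosylated glycan")
        folded = attempt >= folds_on_attempt
        gp.glucose -= 1
        gp.log.append("erase: glucosidase-II removes final Glc")
        if folded:
            gp.fate = "Golgi"
            gp.log.append("UGGT inspection passed, export to Golgi")
            return gp
        if attempt < max_attempts:
            gp.glucose += 1
            gp.log.append("write: UGGT re-glucosylates misfolded glycoprotein")

    gp.fate = "ERAD"
    gp.log.append("degraded by ERAD")
    return gp


result = quality_control(folds_on_attempt=2)
print(result.fate)                   # Golgi
for step in result.log:
    print(" ", step)
```

Calling `quality_control(folds_on_attempt=10)` with the default cut‐off sends the protein to ERAD instead. Folding is reduced here to a deterministic counter, whereas in the cell it is the outcome that UGGT inspects rather than something scheduled in advance.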
Conclusion
It is widely recognised that carbohydrates play important roles in biological molecular recognition, and have a profound impact on human health and medicine. Nonetheless, there is merit in simplifying the language of glycoscience to make it more accessible to the uninitiated. In turn, this might facilitate a focus on the principles and implications of glycosylation in biology, rather than risking drowning in the detail of structural complexity. The notion of accessible vocabulary in glycoscience is not new: it was already evident in Hood, Huang and Dreyer's 1977[37] description of differentiation antigens as cell‐surface “area codes”, and in the potential of cell‐surface carbohydrates, lectins, enzymes and carbohydrate‐binding antibodies as Feizi's 1981[38] “cellular addresses” and “postmen, policeman and traffic signs” “involved in the obedient interpretation of area codes”. Similar thoughts were explored in Brandley and Schnaar's 1986[39] “potential carbohydrate ‘language’ involved in intercellular interactions”, while Hakomori's 2002[40] “glycosynapse” (microdomains of glycolipids) seeks to draw parallels to the “immune synapse” assembly that contributes to cell adhesion and signalling. As highlighted in the cross‐disciplinary article by Bertozzi and Kiessling in 2001,[41] “chemical tools have proven indispensable for studies in glycobiology”. Perhaps it is time to revisit the terminology of glycoscience, to make interdisciplinary communication more straightforward and to support marketing and engagement beyond the immediate field. Reference to lectin readers, glycosyltransferase writers, and glycosyl hydrolase erasers could therefore be worth wider (re)consideration.
Conflict of interest
The authors declare no conflict of interest.
Supporting information
As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials are peer reviewed and may be re‐organized for online delivery, but are not copy‐edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.
Acknowledgements
Given the illustrative and abbreviated nature of this article, which is intended to prompt discussion rather than to be comprehensive, we have had to be selective in the material cited. The informed reader can no doubt identify alternative or additional examples that could have been used. We thank the reviewers for their helpful suggestions. Studies at the John Innes Centre are supported by the UK BBSRC Institute Strategic Programme on Molecules from Nature (MfN) – Products and Pathways [BBS/E/J/000PR9790], the John Innes Foundation, and the ERA CAPS Design Starch program [BB/N010272/1]. Figures were created by using templates from the Library of Scientific & Medical illustration (licence CC BY‐NC‐SA 4.0; http://www.somersault1824.com/). The 3D model in the graphical abstract is based on PDB structure 1KX5 and was created by using UCSF Chimera, developed by the Resource for Biocomputing, Visualization and Informatics at the University of California, San Francisco, with support from NIH P41‐GM103311.
S. Dedola, M. D. Rugen, R. J. Young, R. A. Field, ChemBioChem 2020, 21, 423.
References
- 1. Rademacher T. W., Parekh R. B., Dwek R. A., Annu. Rev. Biochem. 1988, 57, 785–838; Dwek R. A., Chem. Rev. 1996, 96, 683–720.
- 2. Varki A., Glycobiology 1993, 3, 97–130; Varki A., Glycobiology 2017, 27, 3–49.
- 3. A. Varki quoted in Borman S., Chem. Eng. News 2012, 90, 28–29; see also: National Research Council (US) Committee on Assessing the Importance and Impact of Glycomics and Glycosciences, Transforming Glycoscience: A Roadmap for the Future, National Academies Press, Washington (DC), 2012.
- 4. Gabius H.-J., Siebert H.-C., André S., Jiménez-Barbero J., Rüdiger H., ChemBioChem 2004, 5, 740–764; The Sugar Code: Fundamentals of Glycosciences (Ed.: H.-J. Gabius), Wiley-VCH, Weinheim, 2009.
- 5. Mendelson M., Balasegaram M., Jinks T., Pulcini C., Sharland M., Nature 2017, 545, 23–25.
- 6. Turnbull J. E., Field R. A., Nat. Chem. Biol. 2007, 3, 74–77.
- 7. Cantarel B. L., Coutinho P. M., Rancurel C., Bernard T., Lombard V., Henrissat B., Nucleic Acids Res. 2009, 37, D233–D238; Lombard V., Ramulu H. G., Drula E., Coutinho P. M., Henrissat B., Nucleic Acids Res. 2014, 42, D490–D495.
- 8. Bennun S. V., Baycin Hizal D., Heffner K., Can O., Zhang H., Betenbaugh M. J., J. Mol. Biol. 2016, 428, 3337–3352.
- 9. Varki A., Cummings R. D., Esko J. D., Stanley P., Hart G. W., Aebi M., Darvill A. G., Kinoshita T., Packer N. H., Prestegard J. H., Schnaar R. L., Seeberger P. H., Essentials of Glycobiology, 3rd ed., Cold Spring Harbor Laboratory Press, New York, 2015–2017.
- 10. Varki A., Cummings R. D., Aebi M., Packer N. H., Seeberger P. H., Esko J. D., Stanley P., Hart G., Darvill A., Kinoshita T., et al., Glycobiology 2015, 25, 1323–1324; see also: Neelamegham S., Aoki-Kinoshita K., Bolton E., Frank M., Lisacek F., Lütteke T., O'Boyle N., Packer N. H., Stanley P., Toukach P., Varki A., Woods R. J., The SNFG Discussion, Glycobiology 2019, 29, 620–624.
- 11. York W. S., Agravat S., Aoki-Kinoshita K. F., McBride R., Campbell M. P., Costello C. E., Dell A., Feizi T., Haslam S. M., Karlsson N., et al., Glycobiology 2014, 24, 402–406.
- 12. Tiemeyer M., Aoki K., Paulson J., Cummings R. D., York W. S., Karlsson N. G., Lisacek F., Packer N. H., Campbell M. P., Aoki N. P., Fujita A., Matsubara M., Shinmachi D., Tsuchiya S., Yamada I., Pierce M., Ranzinger R., Narimatsu H., Aoki-Kinoshita K. F., Glycobiology 2017, 27, 915–919.
- 13. Gabius H.-J., BioSystems 2018, 164, 102–111.
- 14. Martin J., Managing the Database Environment, Prentice Hall, New Jersey, 1983, p. 381.
- 15. M. Heller, “REST and CRUD: The Impedance Mismatch”, InfoWorld, 2007, https://www.infoworld.com/article/2640739/rest-and-crud–the-impedance-mismatch.html.
- 16. Strahl B. D., Allis C. D., Nature 2000, 403, 41–45.
- 17. Reviewed in:
- 17a. Falkenberg K. J., Johnstone R. W., Nat. Rev. Drug Discovery 2014, 13, 673–691;
- 17b. Allis C. D., Jenuwein T., Nat. Rev. Genetics 2016, 17, 487–500.
- 18. Hanover J. A., Krause M. W., Love D. C., Nat. Rev. Mol. Cell Biol. 2012, 13, 312–321; Yuzwa S. A., Vocadlo D. J., Chem. Soc. Rev. 2014, 43, 6839–6858.
- 19. Hart G. D., J. Biol. Chem. 2019, 294, 2211–2231.
- 20. Little is currently known about readers for O-GlcNAc, although proteins involved in signalling have recently been implicated: Yang X., Qian K., Nat. Rev. Mol. Cell Biol. 2017, 18, 452–465; Toleman C. A., Schumacher M. A., Yu S. H., Zeng W., Cox N. J., Smith T. J., Soderblom E. J., Wands A. M., Kohler J. J., Boyce M., Proc. Natl. Acad. Sci. USA 2018, 115, 5956–5961.
- 21. Ambrosi M., Cameron N. R., Davis B. G., Org. Biomol. Chem. 2005, 3, 1593–1608; Arnaud J., Audrey A., Imberty A., Chem. Soc. Rev. 2013, 42, 4798–4813.
- 22. Boraston A. B., Bolam D. N., Gilbert H. J., Davies G. J., Biochem. J. 2004, 382, 769–781; Carvalho C., Phan N. N., Che Y. F., Reilly P. J., Biopolymers 2015, 103, 203–2145.
- 23. Zhang X., Wang Y., J. Mol. Biol. 2016, 428, 3183–3193.
- 24. Losfeld M. E., Scibona E., Lin C.-W., Villiger T. K., Gauss R., Morbidelli M., Aebi M., FASEB J. 2017, 31, 4623–4635.
- 25. Tannous A., Pisoni G. B., Hebert D. N., Molinari M., Sem. Cell Dev. Biol. 2015, 41, 79–89.
- 26. Anelli T., Sitia R., EMBO J. 2008, 27, 315–327.
- 27. Mohorko E., Glockshuber R., Aebi M., J. Inherited Metab. Dis. 2011, 34, 869–878.
- 28. Barker M. K., Rose D. R., J. Biol. Chem. 2013, 288, 13563–13574.
- 29. Alessio C. D., Caramelo J. J., Parodi A., Semin. Cell Dev. Biol. 2010, 21, 491–499.
- 30. Schrag J. D., Bergeron J. J. M., Li Y., Borisova S., Hahn M., Thomas Y. D., Cygler M., Mol. Cell 2001, 8, 633–644.
- 31. Michalak M., Groenendyk J., Szabo E., Gold L. I., Opas M., Biochem. J. 2009, 417, 651–666.
- 32. Halperin L., Jung J., Michalak M., IUBMB Life 2014, 66, 318–326.
- 33. Hettinghouse A., Liu R., Liu C. J., Pharmacol. Ther. 2018, 181, 34–48.
- 34. Ellgaard L., Frickel E.-M., Cell Biochem. Biophys. 2003, 39, 223–247.
- 35. Lamriben L., Graham J. B., Adams B. M., Hebert D. N., Traffic 2016, 17, 308–326.
- 36. Benyair R., Ogen-Shtern N., Lederkremer G. Z., Sem. Cell Dev. Biol. 2015, 41, 99–109.
- 37. Hood L., Huang H. V., Dreyer W. J., J. Supramol. Struct. 1977, 7, 531–559.
- 38. Feizi T., Trends Biochem. Sci. 1981, 6, 333–335.
- 39. Brandley B. K., Schnaar R. L., J. Leukocyte Biol. 1986, 40, 97–111.
- 40. Hakomori S.-i., Proc. Natl. Acad. Sci. USA 2002, 99, 225–232.
- 41. Bertozzi C. R., Kiessling L. L., Science 2001, 291, 2357–2364.
