Abstract
Protein glycosylation is a diverse and ubiquitous protein modification. Glycosylation provides glycoproteins with important structural and functional properties that facilitate critical biological processes. Despite the significance of protein glycosylation, comprehensive characterization of the glycoproteome, especially the O-linked glycoproteome, remains elusive because no single methodology accommodates the diversity of O-linked glycoforms. In recent years, mass spectrometry (MS) has become an indispensable tool for the characterization of O-linked glycosylated proteins across biological systems. We herein summarize recent developments in and insights from MS-based O-linked glycoproteomic technologies, databases and bioinformatic tools, with a focus on O-linked intact glycopeptide analysis.
Keywords: Protein Glycosylation, Glycoproteome, O-linked Glycosylation, Proteomics, Mass Spectrometry
Introduction
Glycosylation is one of the most ubiquitous and intricate protein modifications, and glycoproteins are estimated to account for ~50% of the proteome[1, 2]. The abundance and diversity of glycosylation reflect its indispensable roles in various physiological and pathological functions including, but not limited to, cellular signaling, protein folding, apoptosis regulation, inflammatory responses, immune escape, and cancer metastasis[3, 4]. In protein glycosylation, monosaccharides, the basic building blocks of glycans, are linked by glycosidic bonds through the enzymatic actions of glycosidases and glycosyltransferases[5, 6]. Glycans can be anchored to at least nine different amino acid side chains[7]. Oligosaccharide-modified glycoproteins are broadly classified as N-linked or O-linked. The biosynthesis of N-linked glycosylation is initiated by the covalent attachment of an oligosaccharide unit (14 monosaccharide residues) to the amide group of asparagine (Asn); the oligosaccharide unit then undergoes sequential trimming and addition of certain monosaccharides by glycosidases and glycosyltransferases in a specific manner. Moreover, N-linked glycosylation has a universal core glycan structure (GlcNAc2Man3) and occurs on a consensus peptide sequence, Asn-X-Ser/Thr, where X is any amino acid except proline (Pro). In contrast, O-linked glycosylation has neither a consensus sequence nor a universal core structure. O-linked glycosylation (Figure 1A) is built up from a first monosaccharide (including N-acetylgalactosamine/GalNAc, mannose/Man, galactose/Gal, fucose/Fuc, glucose/Glc, and N-acetylglucosamine/GlcNAc) that is attached to the hydroxyl group of the side chain of serine (Ser) or threonine (Thr). From these initiating structures, linear or branched oligosaccharide chains of various lengths can be further extended.
In both N-linked and O-linked glycosylation, glycans can be further elaborated by chemical modifications such as phosphorylation, sulfation, and acetylation, and by the addition of other monosaccharides, including Fuc and sialic acid (Sia) in α-linkages and Gal, GalNAc and GlcNAc in both α- and β-linkages[8].
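The contrast between the N-linked consensus sequence and the absence of any O-linked motif can be illustrated with a short sketch. It assumes protein sequences given as one-letter amino acid strings; the function names and the toy sequence are hypothetical and are not taken from any cited tool.

```python
import re

# N-linked sequon: Asn-X-Ser/Thr, where X is any residue except Pro.
# A lookahead is used so that overlapping sequons (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def candidate_o_sites(sequence: str) -> list[int]:
    """Return 1-based positions of every Ser/Thr residue.

    With no consensus motif, each Ser/Thr is a potential O-linked site.
    """
    return [i + 1 for i, aa in enumerate(sequence.upper()) if aa in "ST"]


# Hypothetical toy sequence, for illustration only.
toy = "MNGSAPTNPSLNKTS"
print(find_sequons(toy))       # Asn positions inside a sequon
print(candidate_o_sites(toy))  # all Ser/Thr positions
```

In the toy sequence the Asn followed by Pro is excluded from the N-linked candidates, whereas every Ser/Thr remains a candidate O-linked site, which is one reason why the O-linked search space is so much larger.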
Figure 1.
A) Core structures for mucin-type O-glycans; B) Enrichment strategies for O-linked glycopeptides: 1) Metabolic labeling; 2) Hydrazide chemistry; 3) Lectin weak affinity chromatography (LWAC); 4) EXoO; 5) Chemoenzymatic strategy; 6) HILIC; 7) "SimpleCell" strategy.
The structural diversity that arises from the biosynthesis of O-linked glycosylation makes characterization of the O-linked glycoproteome, i.e., the assignment of specific O-linked glycans to protein sites at the proteome level, challenging. Mass spectrometry-based approaches have been predominantly applied for intact O-linked glycoproteome characterization[9, 10]. However, beyond the analytical challenges posed by the inherent complexity of O-linked glycosylation at specific sites in intact glycopeptides, glycopeptides are often low in abundance, and glycan heterogeneity produces various sub-stoichiometric forms that further decrease the quantity of any unique glycoform[11]. Moreover, glycopeptides must compete for selection in mass spectrometry analysis with the far more abundant coexisting unmodified peptides[12*]. Because glycosylation also tends to lower ionization efficiency[13], glycopeptide enrichment is necessary to increase the robustness of glycopeptide analysis. Additionally, MS fragmentation methods do not cleave the glycan and peptide portions with the same efficiency. For instance, collision-induced dissociation (CID) generates mostly fragment ions from the glycan portion of a glycopeptide, whereas electron-capture dissociation (ECD) and electron-transfer dissociation (ETD) preferentially cleave the peptide backbone[14], and higher-energy collisional dissociation (HCD) cleaves both the peptide backbone and the glycan portion of intact glycopeptides[15, 16]. Hence, a hybrid fragmentation approach[17], such as electron transfer dissociation with HCD supplemental activation (EThcD)[18**], would be beneficial for profiling the intact O-linked glycoproteome. Finally, the limited availability of adequate databases and bioinformatic tools for O-linked glycoproteomic interpretation also hinders progress in O-linked glycoproteomic research.
Current Advances in O-linked Glycopeptide Enrichment and Separation
Efficient and selective enrichment and separation regimes are crucial for the characterization of the O-linked glycoproteome[19]. Currently, a range of enrichment methodologies (Figure 1B) targeting specific glycopeptide subtypes has been adopted for this purpose, including hydrazide chemical tagging[20], metabolic labeling[9, 11, 21-23*], chemoenzymatic labeling[22, 24, 25**], lectin chromatography[26, 27], HILIC[28, 29] and “SimpleCell” technology with homogenized O-linked glycans[30-32**], as well as a site-specific chemoenzymatic extraction of O-linked glycopeptides by EXoO[24, 25**, 33**].
Lectin Weak Affinity Chromatography
Apart from the traditional HILIC method, in which hydrophilic materials capture the hydrophilic glycans anchored on glycoproteins/glycopeptides, lectin weak affinity chromatography (LWAC) is the most widely adopted enrichment regime for peptides with O-linked glycosylation[19]. Lectins are non-enzymatic proteins that recognize and bind certain types of glycans linked to proteins[10]; they also exhibit different specificities towards glycans as well as towards the surrounding amino acid sequence of the peptide backbone[34]. Lectins vary in their specificity range: some recognize a wide range of glycan structures and are therefore suitable for enriching a larger portion of glycopeptides with different glycans, while others have a narrower specificity and can be used to enrich a small subset of O-linked glycopeptides[35, 36]. Based on these features, single[37], serial[38] or multi-lectin[39] affinity chromatography can be adopted for glycoprotein purification. However, the available lectins and their combinations are neither efficient nor sufficient to capture the entire glycoproteome[36]. To reduce this complexity, in an approach called “SimpleCell”, researchers homogenized O-linked glycans by blocking their elongation. By combining lectin enrichment with electron transfer dissociation (ETD)/higher-energy collisional dissociation (HCD)-based MS analysis, 600 O-linked glycoproteins with ~3000 O-linked glycosylation sites were identified in these engineered human cells[40].
Hydrazide Chemistry
Another popular enrichment method in glycoproteomics is hydrazide chemistry, which offers rather high specificity. Once oxidized (by periodate or by certain enzymes), carbohydrate chains can be covalently conjugated to hydrazide beads. Target glycoproteins/glycopeptides can then be selectively released from the beads using chemical or enzymatic approaches[41, 42]. In the study published by Zheng et al.[41], Tn antigen (O-GalNAc) was oxidized specifically using galactose oxidase. The peptides with oxidized Tn antigen were then captured by hydrazide beads and released by methoxylamine (CH3ONH2). After LC–MS/MS analysis, 96 glycoproteins with Tn antigen were identified in Jurkat cells[41]. The sialyl-Tn antigen in particular has been characterized extensively because of its pathological implications[3*]. Sialic acid residues are easily oxidized under mild periodate conditions, and the oxidized sialic acid residues can then be captured by hydrazide beads. Moreover, since the glycosidic bond between a sialic acid residue and the adjacent monosaccharide is relatively labile, sialylated O-linked glycopeptides can be released by hydrolyzing this bond with acids such as formic acid[19].
Metabolic Labeling and Chemoenzymatic Labeling
Ever since it was first introduced, metabolic labeling has been widely utilized for the characterization of O-linked glycoproteins/glycopeptides. The basic regime introduced by the Bertozzi group[21] is to replace the natural GalNAc with GalNAz, whose azido group can be modified with affinity tags such as biotin, making subsequent enrichment possible. Woo et al.[9, 11] reported a mass-independent platform, IsoTaG, for intact glycopeptide analysis based on metabolic labeling and ETD-MS/MS. The isotopic recoding group of their affinity probes contains two bromine atoms; bromine has two stable isotopes, 79Br and 81Br, of nearly equal natural abundance. IsoTaG uses an algorithm to search the MS1 spectra of enriched O-glycopeptides for the resulting “1:2:1 type” peak intensity pattern, and tandem MS analysis is then directed to these isotopically recoded species. IsoTaG stands out for its high confidence in the identification of O-linked glycopeptides. Sialic acid residues are also of interest for metabolic labeling, since they are often found at the termini of various glycan structures and are often associated with cancers[43]. A similar rationale is adopted in chemoenzymatic labeling, where a chemically active group, e.g. an azide, ketone, or alkyne, is linked to the GlcNAc residues of O-linked glycopeptides under enzymatic catalysis[22, 23*]. These chemically active groups provide docking sites for enrichment tags, which can be captured by functional resin or beads through mechanisms such as “click chemistry”[44**].
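The “1:2:1 type” signature of a dibrominated tag follows from the binomial distribution over the two stable bromine isotopes. The sketch below is only an illustration of that arithmetic and is not the IsoTaG algorithm itself: the abundance values are approximate, the contribution of the peptide's own C/N/O/S isotopes is ignored, and the filter function and its tolerance are hypothetical.

```python
from math import comb

# Natural abundances of the two stable bromine isotopes (approximate values).
ABUNDANCE_79BR = 0.5069
ABUNDANCE_81BR = 0.4931


def bromine_pattern(n_br: int) -> list[float]:
    """Relative intensities of the M, M+2, M+4, ... peaks contributed by
    n_br bromine atoms (binomial distribution over 79Br/81Br)."""
    return [
        comb(n_br, k) * ABUNDANCE_79BR ** (n_br - k) * ABUNDANCE_81BR ** k
        for k in range(n_br + 1)
    ]


def looks_like_dibromide(intensities, tolerance=0.25) -> bool:
    """Hypothetical filter: do three peaks spaced ~2 Da apart match the
    expected dibromide ratio within a relative tolerance?"""
    expected = bromine_pattern(2)
    total = sum(intensities)
    if len(intensities) != 3 or total <= 0:
        return False
    observed = [i / total for i in intensities]
    return all(abs(o - e) <= tolerance * e for o, e in zip(observed, expected))


pattern = bromine_pattern(2)
# Intensities normalized to the central (M+2) peak, close to 1:2:1.
print([round(p / pattern[1], 2) for p in pattern])
```

Because this triplet is rare among unlabeled peptides, screening MS1 spectra for it is a plausible way to direct MS/MS towards tagged species, which is the idea the text attributes to IsoTaG.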
Site-specific Chemoenzymatic Extraction (EXoO)
Recently, the discovery of a novel O-protease, OpeRATOR™ (Genovis Inc), has made possible a highly specific solid-phase enrichment of O-linked intact glycopeptides (EXoO)[24, 25**, 33**]. OpeRATOR™ specifically cleaves O-linked glycopeptides at the N-termini of O-linked glycan-anchored Ser/Thr, releasing site-specific O-linked glycopeptides with their conjugated O-linked glycans. This robust chemoenzymatic enrichment method, coupled with ETD-MS/MS and HCD-MS/MS, has been shown to characterize O-linked intact glycopeptides with rather high specificity and throughput. EXoO mapped over 3,000 O-GalNAcylation sites for the first time in complex biological samples as well as cell lines in a single experiment[33**]. Apart from OpeRATOR™, the mucinase StcE[45**] was recently employed in the “bump-hole” study of engineered GalNAc-T[46*, 47**] and in a comparison of fragmentation approaches for glycopeptides[18**].
Mass Spectrometry Fragmentation
To characterize the intact O-linked glycoproteome, detailed information is needed on both the peptide backbone and its associated glycan. We focus here on the tandem MS (MS/MS) strategies that have been developed and applied for the characterization of intact glycopeptides, including CID, ETD, and HCD (a comparison is shown in Figure 2).
Figure 2.
Comparison of different fragmentation regimes.
CID
Collision-induced/activated dissociation (CID/CAD) is the most common and robust fragmentation method used for peptide analysis[48]. CID breaks the peptide (-CO-NH-) bonds, producing b (N-terminal fragment) and y (C-terminal fragment) ions, with the positive charge retained on either fragment. For glycopeptides, carbohydrate Y-series ions are also generated, because the glycosidic bonds of a glycopeptide are more fragile than its amide bonds. CID spectra therefore predominantly reveal the structure of the glycan moiety[10]. Because of this difference in cleavage efficiency, CID is often applied in conjunction with other fragmentation regimes for intact glycopeptide characterization[49]. Alternatively, an additional fragmentation stage (MS3) can be applied to a selected Y ion generated by MS/MS to produce b and y ions from the peptide backbone[50].
ETD
Electron transfer dissociation (ETD) is an alternative MS fragmentation method[17]. In ECD and ETD, random fragmentation along the peptide backbone is induced by gas-phase reactions with thermal electrons or with reagent radical anions such as fluoranthene, respectively[14]. In ETD, a multiply protonated peptide receives an electron from a radical anion and cleaves at the N-Cα bond to produce c (N-terminal fragment) and z (C-terminal fragment) ions[51]. Unlike in CID, labile post-translational modifications, such as glycosylation, are generally retained on the peptide backbone and are largely unaffected by the ETD fragmentation process, making ETD an effective fragmentation regime for the identification of O-linked glycopeptides and glycosylation sites. ETD can be used alone for glycopeptide characterization. However, because ETD cleaves glycosidic linkages inefficiently, it yields little information on the monosaccharide units; only the monoisotopic masses of the oligosaccharides can be obtained. To tackle this issue, hybrid fragmentation regimes that provide supplemental energy have been introduced and have gained popularity, with ETD followed by supplemental HCD (EThcD) being one of the most robust glycoproteomic methods[18**].
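The relationship between the b/y ions of collision-based methods and the c/z ions of ETD, and the way a retained glycan localizes a site, can be sketched numerically. This is a minimal illustration under stated assumptions: singly charged ions only, a small table of standard monoisotopic residue masses, and a single HexNAc that stays on the fragment (realistic for ETD-type spectra, an idealization for b/y ions, where the glycan is frequently lost). The function name and the example peptide are hypothetical.

```python
# Monoisotopic residue masses (Da) for a few amino acids; extend as needed.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
PROTON, WATER, AMMONIA, HYDROGEN = 1.007276, 18.010565, 17.026549, 1.007825
HEXNAC = 203.07937  # residue mass of GalNAc/GlcNAc


def fragment_ions(peptide, glycosite=None, glycan_mass=HEXNAC):
    """Singly charged b/y and c/z-dot m/z values for each backbone cleavage.

    glycosite is a 1-based residue index carrying an intact glycan that is
    assumed to stay attached to the fragment.
    """
    masses = [RESIDUE[aa] for aa in peptide]
    if glycosite is not None:
        masses[glycosite - 1] += glycan_mass
    ions = []
    for i in range(1, len(masses)):
        b = sum(masses[:i]) + PROTON          # first i residues
        y = sum(masses[i:]) + WATER + PROTON  # remaining residues
        ions.append({
            "cleavage": i,
            "b": b,
            "y": y,
            "c": b + AMMONIA,                 # N-terminal product of N-C-alpha cleavage
            "z_dot": y - AMMONIA + HYDROGEN,  # C-terminal radical product
        })
    return ions


# Hypothetical peptide with a HexNAc on Thr4.
for ion in fragment_ions("GASTK", glycosite=4):
    print(ion["cleavage"], round(ion["c"], 4), round(ion["z_dot"], 4))
```

In the example, the c ions shift by the HexNAc mass only from the fourth cleavage onward, while the z ions carry the shift up to the third cleavage; that pattern is what pins the glycan to Thr4 rather than Ser3.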
HCD
Higher-energy collisional dissociation (HCD) is a state-of-the-art dissociation approach that is used extensively for high mass accuracy peptide analysis. HCD fragmentation also produces b and y ions. Compared to CID, which is often limited by the “one-third” m/z cutoff rule, HCD has higher trapping efficiency in the low m/z range because fragmentation is performed in a dedicated octopole collision cell at the far end of the “C-trap”[52]. Moreover, small diagnostic oxonium ions of monosaccharides can also be detected in HCD[53-55]. In addition, HCD can efficiently dissociate the peptide backbone as well as the attached labile glycans[55-57]; HCD is therefore a desirable fragmentation regime for protein and glycopeptide identification and quantification. However, HCD involves a tradeoff between proteome coverage and mass accuracy. In many cases, HCD is combined with CID and/or ETD for complementary fragmentation[55, 58, 59], which in general provides greater confidence in the identification of glycopeptides and glycosylation sites than reliance on a single fragmentation regime[17].
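The practical consequence of the “one-third” rule for glycopeptides can be made concrete with a small sketch. It assumes that ion-trap CID loses fragments below roughly one third of the precursor m/z and that HCD imposes no comparable cutoff; the oxonium ion list uses commonly quoted m/z values, and the function names are hypothetical.

```python
# Commonly used glycan oxonium (diagnostic) ions, singly charged m/z.
OXONIUM_IONS = {
    "HexNAc fragment": 138.0550,
    "Hex": 163.0601,
    "HexNAc": 204.0867,
    "Neu5Ac - H2O": 274.0921,
    "Neu5Ac": 292.1027,
    "HexHexNAc": 366.1395,
}


def low_mass_cutoff(precursor_mz: float, fraction: float = 1 / 3) -> float:
    """Approximate lowest fragment m/z retained by ion-trap CID."""
    return precursor_mz * fraction


def observable_oxonium_ions(precursor_mz: float, ion_trap_cid: bool) -> list[str]:
    """Names of oxonium ions expected to fall inside the detectable range."""
    cutoff = low_mass_cutoff(precursor_mz) if ion_trap_cid else 0.0
    return [name for name, mz in OXONIUM_IONS.items() if mz >= cutoff]


# Hypothetical precursor at m/z 900: compare ion-trap CID with HCD.
print(observable_oxonium_ions(900.0, ion_trap_cid=True))
print(observable_oxonium_ions(900.0, ion_trap_cid=False))
```

For a precursor at m/z 900 the approximate cutoff is m/z 300, so under this simple model only the HexHexNAc ion survives in ion-trap CID, whereas all listed diagnostic ions remain accessible to HCD.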
Bioinformatics
The heterogeneity of the O-linked glycoproteome is difficult to accommodate computationally. A protein containing 4 glycosylation sites, each of which can carry any of 10 different glycans, could yield 10,000 (10^4) possible glycoforms, a search space that quickly becomes impractical to incorporate into database searching algorithms. Moreover, the glycans attached to the same protein vary in different cells, tissues, organisms and physiological states[60**]. The same glycan structure on different proteins may also exhibit different functions. Therefore, the interpretation of glycosylation should be comprehensive, integrating the glycosylation sites, the site-specific glycan structures, the protein scaffolds, and the co-expression of related glycogenes.
Through the collective efforts of scientists across disciplines, tremendous improvements in analytical methods, including high-performance separation strategies and mass spectrometry-based analysis, have propelled studies on glycosylation. Consequently, the qualitative and quantitative data on glycans, glycosites, glycopeptides, and glycoproteins have increased dramatically. Along with the increase in data availability, the corresponding databases and bioinformatic tools to store, retrieve, integrate, and interpret these data have also grown substantially.
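The glycoform count quoted above is a simple product over sites, which a few lines make explicit. The sketch assumes that sites are independent; the option of leaving a site unoccupied is an added assumption that the figure in the text does not include.

```python
def count_glycoforms(n_sites: int, n_glycans: int, allow_unoccupied: bool = False) -> int:
    """Number of site-specific glycoforms when each site independently carries
    one of n_glycans glycans (optionally, no glycan at all)."""
    options_per_site = n_glycans + (1 if allow_unoccupied else 0)
    return options_per_site ** n_sites


print(count_glycoforms(4, 10))                         # 10000
print(count_glycoforms(4, 10, allow_unoccupied=True))  # 14641
```

The count grows exponentially with the number of sites, which is why exhaustive enumeration of glycoforms in a search database becomes impractical so quickly.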
Databases for O-linked glycosylation
The glycan moiety is a central part of O-linked intact glycopeptide analysis. Various glycan databases have been established since the first database in glycobiology, CarbBank[61]. CFG Glycan Structure Database[62] (http://www.functionalglycomics.org/glycomics/molecule/jsp/carbohydrate/carbMoleculeHome.jsp), Glycan Mass Spectral DataBase by JCGGDB[63] (https://jcggdb.jp/rcmg/glycodb/Ms_ResultSearc), UniCarb-DB[64] (https://unicarb-db.expasy.org/), GLYCOSCIENCES.de[65] (http://www.glycosciences.de/), GlyTouCan[66] (https://glytoucan.org/) and Carbohydrate Structure DataBase[67] (http://csdb.glycoscience.ru/database/) are integrative databases of note, providing structural, chemical, mass spectrometry, origin and glycosyltransferase information on glycans.
Compared to N-linked glycosylation, relatively few data are available for O-linked glycosylation due to the aforementioned challenges. Limited resources for O-GalNAc and O-GlcNAc proteins are available through GlycoDomain Viewer[68] (https://glycodomain.glycomics.ku.dk/) and Glyco-DIA[69*] (https://github.com/CCGMS/Glyco-DIA), both based on the “SimpleCell” technology, as well as NetOGlyc[70] (www.cbs.dtu.dk/services/NetOGlyc/) and YinOYang 1.2[71] (http://www.cbs.dtu.dk/services/YinOYang/).
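In intact glycopeptide analysis, glycan entries are commonly reduced to composition-level masses that are compared with the mass left over after subtracting the peptide from the precursor. The following sketch shows that matching step in its simplest form. It is an assumption-laden illustration, not the interface of any database listed above: it uses four monosaccharide classes with standard monoisotopic residue masses, a fixed tolerance in Da, and hypothetical function names.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common monosaccharide classes.
MONOSACCHARIDE = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def composition_mass(composition: dict[str, int]) -> float:
    """Sum of residue masses for a composition such as {"Hex": 1, "HexNAc": 1}."""
    return sum(MONOSACCHARIDE[name] * count for name, count in composition.items())


def match_compositions(glycan_mass: float, max_per_residue: int = 3, tol_da: float = 0.01):
    """Enumerate compositions whose mass matches glycan_mass within tol_da."""
    names = list(MONOSACCHARIDE)
    hits = []
    for counts in product(range(max_per_residue + 1), repeat=len(names)):
        composition = {n: c for n, c in zip(names, counts) if c}
        if composition and abs(composition_mass(composition) - glycan_mass) <= tol_da:
            hits.append(composition)
    return hits


# Mass of a Hex1HexNAc1 glycan (e.g. a core 1 disaccharide), used as the query.
print(match_compositions(365.13219))
```

A match at this level gives a composition only: Gal and Man are both "Hex", and GalNAc and GlcNAc are both "HexNAc", so isomeric structures cannot be told apart from mass alone.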
Data Analysis Software for O-linked glycosylation
The recently published O-Pair Search[72**] (https://github.com/smith-chem-wisc/MetaMorpheus) uses an ion-indexed open modification search strategy to assign O-linked glycopeptides to paired HCD and ETD spectra. O-Pair Search is reported to identify more O-glycopeptides than the most widely used commercial glycopeptide search tool, Byonic[73], while reducing search times by more than 2,000-fold[72**]. O-Pair Search also accepts user-defined glycan databases, enabling searches for O-glycosylation types other than O-GalNAcylation. MSFragger-Glyco[74**], a mode within MSFragger (http://msfragger.nesvilab.org) developed by Polasky et al., is also based on the ion-indexed search strategy. MSFragger-Glyco substantially improved the identification of labile glycan spectra and the annotation of glycopeptide spectrum matches (glycoPSMs). Both O-Pair Search and MSFragger-Glyco achieve more glycopeptide assignments in less time than traditional search algorithms. Another newly developed search tool for intact O-glycopeptides, AOGP[75*], was found to exhibit superior performance in identifying intact O-glycopeptides of single proteins.
Concluding Remarks and Future Perspectives
The field of O-linked glycoproteomics, especially the study of O-linked intact glycopeptides, has grown substantially in recent years. Advancements in sample preparation and separation, instrumentation, specialized algorithms and bioinformatic tools have brought many in-depth discoveries. However, the O-linked glycopeptides identified so far comprise only a small part of the entire O-linked glycoproteome, and challenges remain for O-linked intact glycopeptide analysis.
Although multiple enrichment methodologies are available, all of them fall short in terms of comprehensiveness or specificity. Affinity-based and labeling-based enrichment methodologies are limited to a subset of glycan structures, while hydrophilic interaction-based methodologies cannot distinguish glycopeptides from other hydrophilic non-glycosylated peptides. To address this problem, a hybrid or multiplexed enrichment methodology could be designed to improve coverage and throughput. In mass spectrometry-based analysis, the annotation of glycan structures often relies on oxonium ions and neutral losses. Since some monosaccharides have the same mass, it is difficult to pinpoint these monosaccharides and to distinguish structural isomers[10]. Additionally, a hybrid fragmentation regime based on currently available technologies could benefit the analysis of O-linked intact glycopeptides[17] by providing more detail on both glycan structures and peptide backbones. Last but not least, a lag in the development of tools for data integration and interpretation is also hindering studies of the O-linked glycoproteome. Future data analysis software will hopefully include more complete lists of glycan-specific fragments, along with the ability to incorporate spectra generated by different fragmentation regimes. We believe these challenges will be resolved in the coming years with the rapid advancement of technology.
References
- [1].Gill DJ, Clausen H, and Bard F, “Location, location, location: New insights into O-GalNAc protein glycosylation,” Trends in Cell Biology. 2011, doi: 10.1016/j.tcb.2010.11.004. [DOI] [PubMed] [Google Scholar]
- [2].Apweiler R, Hermjakob H, and Sharon N, “On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database,” Biochim. Biophys. Acta - Gen. Subj, 1999, doi: 10.1016/S0304-4165(99)00165-8. [DOI] [PubMed] [Google Scholar]
- [3*]. Reily C, Stewart TJ, Renfrow MB, and Novak J, “Glycosylation in health and disease,” Nature Reviews Nephrology. 2019, doi: 10.1038/s41581-019-0129-4. This review provides a comprehensive summary of changes in glycosylation related to various diseases.
- [4].Barchi JJ, “Mucin-type glycopeptide structure in solution: Past, present, and future,” Biopolymers. 2013, doi: 10.1002/bip.22313. [DOI] [PubMed] [Google Scholar]
- [5].Varki A, “Biological roles of glycans,” Glycobiology, 2017, doi: 10.1093/glycob/cww086. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Yarema KJ and Bertozzi CR, “Characterizing glycosylation pathways,” Genome Biology. 2001, doi: 10.1186/gb-2001-2-5-reviews0004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Pless DD and Lennarz WJ, “Enzymatic conversion of proteins to glycoproteins,” Proc. Natl. Acad. Sci. U. S. A, 1977, doi: 10.1073/pnas.74.1.134. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [8].D. ACR. D. EJ. H. FH. P. S. R. BC. W. HG. Varki EM and E., Essentials of Glycobiology, 3rd edition. 2015. [Google Scholar]
- [9].Woo CM et al. , “Development of IsoTaG, a Chemical Glycoproteomics Technique for Profiling Intact N- and O-Glycopeptides from Whole Cell Proteomes,” J. Proteome Res, 2017, doi: 10.1021/acs.jproteome.6b01053. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Cao L, Qu Y, Zhang Z, Wang Z, Prytkova I, and Wu S, “Intact glycopeptide characterization using mass spectrometry,” Expert Review of Proteomics. 2016, doi: 10.1586/14789450.2016.1172965. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Woo CM, Iavarone AT, Spiciarich DR, Palaniappan KK, and Bertozzi CR, “Isotope-targeted glycoproteomics (IsoTaG): A mass-independent platform for intact N- and O-glycopeptide discovery and analysis,” Nat. Methods, 2015, doi: 10.1038/nmeth.3366. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [12*]. Zhou Y et al. , “An Integrated Workflow for Global, Glyco-, and Phospho-proteomic Analysis of Tumor Tissues,” Anal. Chem, 2020, doi: 10.1021/acs.analchem.9b03753. An integrated workflow for multiplex analysis of global, glyco-, and phospho-proteomics was developed, with an improved protein identification and charaterization of global proteome and protein PTMs compared to other coventional approaches.
- [13].Pouria S et al. , “Glycoform composition profiling of O-glycopeptides derived from human serum IgA1 by matrix-assisted laser desorption ionization-time of flight-mass spectrometry,” Anal. Biochem, 2004, doi: 10.1016/j.ab.2004.03.053. [DOI] [PubMed] [Google Scholar]
- [14].Zhang Y, Fonslow BR, Shan B, Baek MC, and Yates JR, “Protein analysis by shotgun/bottom-up proteomics,” Chemical Reviews. 2013, doi: 10.1021/cr3003533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [15].Toghi Eshghi S, Shah P, Yang W, Li X, and Zhang H, “GPQuest: A spectral library matching algorithm for site-specific assignment of tandem mass spectra to intact n-glycopeptides,” Anal. Chem, 2015, doi: 10.1021/acs.analchem.5b00024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Yang W et al. , “Glycoform analysis of recombinant and human immunodeficiency virus envelope protein gp120 via higher energy collisional dissociation and spectral-aligning strategy,” Anal. Chem, 2014, doi: 10.1021/ac500876p. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Reiding KR, Bondt A, Franc V, and Heck AJR, “The benefits of hybrid fragmentation methods for glycoproteomics,” TrAC - Trends in Analytical Chemistry. 2018, doi: 10.1016/j.trac.2018.09.007. [DOI] [Google Scholar]
- [18**]. Riley NM, Malaker SA, Driessen MD, and Bertozzi CR, “Optimal Dissociation Methods Differ for N- A nd O-Glycopeptides,” J. Proteome Res, 2020, doi: 10.1021/acs.jproteome.0c00218. This paper discussed the optimal fragmentation approaches for N-linked and mucin-type O-linked glycopeptides analysis. Each fragmentation approach was accessed with the pros and cons laid out.
- [19].You X, Qin H, and Ye M, “Recent advances in methods for the analysis of protein o-glycosylation at proteome level,” Journal of Separation Science. 2018, doi: 10.1002/jssc.201700834. [DOI] [PubMed] [Google Scholar]
- [20].Nilsson J et al. , “Enrichment of glycopeptides for glycan structure and attachment site identification,” Nat. Methods, 2009, doi: 10.1038/nmeth.1392. [DOI] [PubMed] [Google Scholar]
- [21].Hang HC, Yu C, Kato DL, and Bertozzi CR, “A metabolic labeling approach toward proteomic analysis of mucin-type O-linked glycosylation,” Proc. Natl. Acad. Sci. U. S. A, 2003, doi: 10.1073/pnas.2335201100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [22].Khidekel N et al. , “Probing the dynamics of O-GlcNAc glycosylation in the brain using quantitative proteomics,” Nat. Chem. Biol, 2007, doi: 10.1038/nchembio881. [DOI] [PubMed] [Google Scholar]
- [23*]. Hao Y et al. , “Next-generation unnatural monosaccharides reveal that ESRRB O-GlcNAcylation regulates pluripotency of mouse embryonic stem cells,” Nat. Commun, 2019, doi: 10.1038/s41467-019-11942-y. This paper developed several novel unnatural monosaccharides for the metabolic labelling of O-GlcNAc events with improved selectivity. These new metabolic labelling monosaccharides overcome the S-glycosylation observed with previous generations of unnatural monosaccharides.
- [24].Yang W, Ao M, Hu Y, Li QK, and Zhang H, “Mapping the O-glycoproteome using site-specific extraction of O-linked glycopeptides (EXoO),” Mol. Syst. Biol, 2018, doi: 10.15252/msb.20188486. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [25**]. Yang W, Ao M, Song A, Xu Y, Sokoll L, and Zhang H, “Mass Spectrometric Mapping of Glycoproteins Modified by Tn-Antigen Using Solid-Phase Capture and Enzymatic Release,” Anal. Chem, vol. 92, no. 13, 2020, doi: 10.1021/acs.analchem.0c01564. This study focuses on the mapping of glycoproteins modified by Tn-antigen, using the novel EXoO approach to specifically immobolize, isotopically tag, release and identify Tn-antigen sites on glycopeptides.
- [26].Vosseller K et al. , “O-linked N-acetylglucosamine proteomics of postsynaptic density preparation using lectin weak affinity chromatography and mass spectrometry,” Mol. Cell. Proteomics, 2006, doi: 10.1074/mcp.T500040-MCP200. [DOI] [PubMed] [Google Scholar]
- [27].Trinidad JC, Schoepfer R, Burlingame AL., and Medzihradszky KF, “N- and O-Glycosylation in the murine synaptosome,” Mol. Cell. Proteomics, 2013, doi: 10.1074/mcp.M113.030007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [28].Hägglund P, Bunkenborg J, Elortza F, Jensen ON, and Roepstorff P, “A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation,” J. Proteome Res, 2004, doi: 10.1021/pr034112b. [DOI] [PubMed] [Google Scholar]
- [29].Yang W et al. , “Comparison of Enrichment Methods for Intact N- and O-Linked Glycopeptides Using Strong Anion Exchange and Hydrophilic Interaction Liquid Chromatography,” Anal. Chem, 2017, doi: 10.1021/acs.analchem.7b03641. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [30].Steentoft C et al. , “Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines,” Nat. Methods, 2011, doi: 10.1038/nmeth.1731. [DOI] [PubMed] [Google Scholar]
- [31].Steentoft C et al. , “Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology,” EMBO J, 2013, doi: 10.1038/emboj.2013.79. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [32**]. Bagdonaite I et al. , “O-glycan initiation directs distinct biological pathways and controls epithelial differentiation,” EMBO Rep, 2020, doi: 10.15252/embr.201948885. This is a prime example of "SimpleCell" technology where single-GalNAc/GalGalNAc glycosites for different GalNAc-transferases (GalNAc-T1, GalNAc-T2, and GalNAc-T3) were identified.
- [33**]. Yang W, Song A, Ao M, Xu Y, and Zhang H, “Large-scale site-specific mapping of the O-GalNAc glycoproteome,” Nat. Protoc, vol. 15, no. 8, 2020, doi: 10.1038/s41596-020-0345-1. This paper provides a detailed protocol for the implementation of EXoO.
- [34].Madariaga D et al. , “Serine versus threonine glycosylation with α-o-GalNac: Unexpected selectivity in their molecular recognition with lectins,” Chem. - A Eur. J, 2014, doi: 10.1002/chem.201403700. [DOI] [PubMed] [Google Scholar]
- [35].Alley WR, Mann BF, and Novotny MV, “High-sensitivity analytical approaches for the structural characterization of glycoproteins,” Chemical Reviews. 2013, doi: 10.1021/cr3003714. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [36].Fanayan S, Hincapie M, and Hancock WS, “Using lectins to harvest the plasma/serum glycoproteome,” Electrophoresis. 2012, doi: 10.1002/elps.201100567. [DOI] [PubMed] [Google Scholar]
- [37].Chowdhury S, Ray S, and Chatterjee BP, “Single step purification of polysaccharides using immobilized jackfruit lectin as affinity adsorbent,” Glycoconj. J, 1988, doi: 10.1007/BF01048329. [DOI] [Google Scholar]
- [38].Sumi S, Arai K, Kitahara S, and ichiro Yoshida K, “Serial lectin affinity chromatography demonstrates altered asparagine-linked sugar-chain structures of prostate-specific antigen in human prostate carcinoma,” J. Chromatogr. B Biomed. Sci. Appl, 1999, doi: 10.1016/S0378-4347(99)00069-9. [DOI] [PubMed] [Google Scholar]
- [39].Zeng Z et al. , “A proteomics platform combining depletion, multi-lectin affinity chromatography (M-LAC), and isoelectric focusing to study the breast cancer proteome,” Anal. Chem, 2011, doi: 10.1021/ac2002802. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [40].Steentoft C, Bennett EP, and Clausen H, “Glycoengineering of human cell lines using zinc finger nuclease gene targeting: SimpleCells with homogeneous GalNAc O-glycosylation allow isolation of the O-glycoproteome by one-step lectin affinity chromatography,” Methods Mol. Biol, 2013, doi: 10.1007/978-1-62703-465-4_29. [DOI] [PubMed] [Google Scholar]
- [41].Zheng J, Xiao H, and Wu R, “Specific Identification of Glycoproteins Bearing the Tn Antigen in Human Cells,” Angew. Chemie - Int. Ed, 2017, doi: 10.1002/anie.201702191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [42].Tian Y, Zhou Y, Elliott S, Aebersold R, and Zhang H, “Solid-phase extraction of N-linked glycopeptides,” Nat. Protoc, 2007, doi: 10.1038/nprot.2007.42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [43].Zhang Z, Wuhrer M, and Holst S, “Serum sialylation changes in cancer,” Glycoconjugate Journal. 2018, doi: 10.1007/s10719-018-9820-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [44**]. Parker CG and Pratt MR, “Click Chemistry in Proteomic Investigations,” Cell. 2020, doi: 10.1016/j.cell.2020.01.025. This review summarizes how bio-orthogonal reactions, "click chemistry", specifically azide–alkyne click chemistry, have been used for the selective tagging and functional investigation of protein-small-molecule interactions and protein modification such as O-GlcNAc, in native biological environments.
- [45**]. Malaker SA et al., “The mucin-selective protease StcE enables molecular and functional analysis of human cancer-associated mucins,” Proc. Natl. Acad. Sci. U. S. A, 2019, doi: 10.1073/pnas.1813020116. This paper is the first to demonstrate the potential of StcE mucinases as a powerful tool for the glycoproteomic study of mucin domain structure and function.
- [46*]. Choi J et al., “Engineering Orthogonal Polypeptide GalNAc-Transferase and UDP-Sugar Pairs,” J. Am. Chem. Soc, 2019, doi: 10.1021/jacs.9b04695. The first "bump–hole" chemical reporter system for studying GalNAc-T isoforms in vitro was developed in this study, laying the foundation for "bump–hole" in vivo labelling.
- [47**]. Schumann B et al., “Bump-and-Hole Engineering Identifies Specific Substrates of Glycosyltransferases in Living Cells,” Mol. Cell, 2020, doi: 10.1016/j.molcel.2020.03.030. This is the first demonstration of "bump–hole" in vivo labelling. "Bump–hole" GalNAc-Ts exhibited exciting new potential for the glycoproteomic study of mucins.
- [48]. Cody RB, Burnier RC, and Freiser BS, “Collision-Induced Dissociation with Fourier Transform Mass Spectrometry,” Anal. Chem, 1982, doi: 10.1021/ac00238a029.
- [49]. Alley WR, Mechref Y, and Novotny MV, “Characterization of glycopeptides by combining collision-induced dissociation and electron-transfer dissociation mass spectrometry data,” Rapid Commun. Mass Spectrom, 2009, doi: 10.1002/rcm.3850.
- [50]. Halim A, Nilsson J, Rüetschi U, Hesse C, and Larson G, “Human urinary glycoproteomics; attachment site specific analysis of N- and O-linked glycosylations by CID and ECD,” Mol. Cell. Proteomics, 2012, doi: 10.1074/mcp.M111.013649.
- [51]. Syka JEP, Coon JJ, Schroeder MJ, Shabanowitz J, and Hunt DF, “Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry,” Proc. Natl. Acad. Sci. U. S. A, 2004, doi: 10.1073/pnas.0402700101.
- [52]. Olsen JV, Macek B, Lange O, Makarov A, Horning S, and Mann M, “Higher-energy C-trap dissociation for peptide modification analysis,” Nat. Methods, 2007, doi: 10.1038/nmeth1060.
- [53]. Mayampurath AM, Wu Y, Segu ZM, Mechref Y, and Tang H, “Improving confidence in detection and characterization of protein N-glycosylation sites and microheterogeneity,” Rapid Commun. Mass Spectrom, 2011, doi: 10.1002/rcm.5059.
- [54]. Cao L et al., “Characterization of intact N- and O-linked glycopeptides using higher energy collisional dissociation,” Anal. Biochem, 2014, doi: 10.1016/j.ab.2014.01.003.
- [55]. Scott NE et al., “Simultaneous glycan-peptide characterization using hydrophilic interaction chromatography and parallel fragmentation by CID, higher energy collisional dissociation, and electron transfer dissociation MS applied to the N-linked glycoproteome of Campylobact,” Mol. Cell. Proteomics, 2011, doi: 10.1074/mcp.M000031-MCP201.
- [56]. Jedrychowski MP, Huttlin EL, Haas W, Sowa ME, Rad R, and Gygi SP, “Evaluation of HCD- and CID-type fragmentation within their respective detection platforms for murine phosphoproteomics,” Mol. Cell. Proteomics, 2011, doi: 10.1074/mcp.M111.009910.
- [57]. Segu ZM and Mechref Y, “Characterizing protein glycosylation sites through higher-energy C-trap dissociation,” Rapid Commun. Mass Spectrom, 2010, doi: 10.1002/rcm.4485.
- [58]. Yin X, Bern M, Xing Q, Ho J, Viner R, and Mayr M, “Glycoproteomic analysis of the secretome of human endothelial cells,” Molecular and Cellular Proteomics. 2013, doi: 10.1074/mcp.M112.024018.
- [59]. Singh C, Zampronio CG, Creese AJ, and Cooper HJ, “Higher energy collision dissociation (HCD) product ion-triggered electron transfer dissociation (ETD) mass spectrometry for the analysis of N-linked glycoproteins,” J. Proteome Res, 2012, doi: 10.1021/pr300257c.
- [60**]. Li X, Xu Z, Hong X, Zhang Y, and Zou X, “Databases and bioinformatic tools for glycobiology and glycoproteomics,” International Journal of Molecular Sciences. 2020, doi: 10.3390/ijms21186727. This review summarizes in detail the up-to-date databases and bioinformatic tools for glycobiology and glycoproteomic studies, providing resources for readers searching for information on related topics.
- [61]. Hardy BJ, Doughty SW, Parretti MF, Tennison J, and Wilson I, “Letter to the Glyco-Forum,” 1997, doi: 10.1093/oxfordjournals.glycob.a018840.
- [62]. Raman R, Venkataraman M, Ramakrishnan S, Lang W, Raguram S, and Sasisekharan R, “Advancing glycomics: Implementation strategies at the consortium for functional glycomics,” Glycobiology. 2006, doi: 10.1093/glycob/cwj080.
- [63]. Kameyama A et al., “A strategy for identification of oligosaccharide structures using observational multistage mass spectral library,” Anal. Chem, 2005, doi: 10.1021/ac048350h.
- [64]. Campbell MP et al., “Validation of the curation pipeline of UniCarb-DB: Building a global glycan reference MS/MS repository,” Biochim. Biophys. Acta - Proteins Proteomics, 2014, doi: 10.1016/j.bbapap.2013.04.018.
- [65]. Lütteke T, Bohne-Lang A, Loss A, Goetz T, Frank M, and von der Lieth CW, “GLYCOSCIENCES.de: An internet portal to support glycomics and glycobiology research,” Glycobiology. 2006, doi: 10.1093/glycob/cwj049.
- [66]. Aoki-Kinoshita K et al., “GlyTouCan 1.0 - The international glycan structure repository,” Nucleic Acids Res., 2016, doi: 10.1093/nar/gkv1041.
- [67]. Toukach PV and Egorova KS, “Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts,” Nucleic Acids Res., 2016, doi: 10.1093/nar/gkv840.
- [68]. Joshi HJ et al., “GlycoDomainViewer: A bioinformatics tool for contextual exploration of glycoproteomes,” Glycobiology, 2018, doi: 10.1093/glycob/cwx104.
- [69*]. Ye Z, Mao Y, Clausen H, and Vakhrushev SY, “Glyco-DIA: a method for quantitative O-glycoproteomics with in silico-boosted glycopeptide libraries,” Nat. Methods, 2019, doi: 10.1038/s41592-019-0504-x. This study demonstrates how to implement DIA analysis for the study of glycopeptides.
- [70]. Hansen JE, Lund O, Tolstrup N, Gooley AA, Williams KL, and Brunak S, “NetOglyc: Prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility,” Glycoconj. J, 1998, doi: 10.1023/A:1006960004440.
- [71]. Gupta R and Brunak S, “Prediction of glycosylation across the human proteome and the correlation to protein function,” Pac. Symp. Biocomput, 2002, doi: 10.1142/9789812799623_0029.
- [72**]. Lu L, Riley NM, Shortreed MR, Bertozzi CR, and Smith LM, “O-Pair Search with MetaMorpheus for O-glycopeptide characterization,” Nat. Methods, 2020, doi: 10.1038/s41592-020-00985-5. This paper reported a novel data analysis approach for the improved identification of O-glycopeptides and localization of O-glycosites, using graph theory (more commonly used in top-down proteomics) and probability-based localization.
- [73]. Bern M, Kil YJ, and Becker C, “Byonic: Advanced peptide and protein identification software,” Curr. Protoc. Bioinforma, 2012, doi: 10.1002/0471250953.bi1320s40.
- [74**]. Polasky DA, Yu F, Teo GC, and Nesvizhskii AI, “Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco,” Nat. Methods, 2020, doi: 10.1038/s41592-020-0967-9. This paper reported a new mode in the MSFragger search engine that stands out for its fast and sensitive identification of N- and O-linked glycopeptides and its support for open glycan searches.
- [75*]. Huang J et al., “Development of a Computational Tool for Automated Interpretation of Intact O-Glycopeptide Tandem Mass Spectra from Single Proteins,” Anal. Chem, 2020, doi: 10.1021/acs.analchem.0c01091. This paper reported a computational tool, AOGP, for intact O-glycopeptide analysis of single proteins, which uses de novo sequencing for O-glycans and a database search strategy for peptide backbones.


