Abstract
O-GlcNAcylation has gradually been recognized as a critically important protein post-translational modification in mammalian cells. Besides regulation of gene expression, its crosstalk with protein phosphorylation is vital for cell signaling. Despite its importance, comprehensive analysis of O-GlcNAcylation is extraordinarily challenging due to the low abundances of many O-GlcNAcylated proteins and the complexity of biological samples. Here, we developed a novel chemoenzymatic method based on a wild-type galactosyltransferase and uridine diphosphate galactose (UDP-Gal) for global and site-specific analysis of protein O-GlcNAcylation. This method integrates enzymatic reactions and hydrazide chemistry to enrich O-GlcNAcylated peptides. All reagents used are more easily accessible and cost-effective as compared to the engineered enzyme and click chemistry reagents. Biological triplicate experiments were performed to validate the effectiveness and the reproducibility of this method, and the results are comparable with the previous chemoenzymatic method using the engineered enzyme and click chemistry. Moreover, because of the promiscuity of the galactosyltransferase, 18 unique O-glucosylated peptides were identified on the EGF domain from nine proteins. Considering that effective and approachable methods are critical to advance glycoscience research, the current method without any sample restrictions can be widely applied for global analysis of protein O-GlcNAcylation in different samples.
O-GlcNAcylation is a type of protein glycosylation where a single N-acetylglucosamine (GlcNAc) is linked to the serine and threonine residues. It is added by O-GlcNAc transferase (OGT) and removed by O-GlcNAcase (OGA).1 O-GlcNAcylated proteins are primarily located in the cytoplasm and the nucleus, but extracellular O-GlcNAcylated proteins have also been reported recently.2–4 To date, many O-GlcNAcylated proteins have been identified, and this modification has been found to be extensively involved in many cellular events, including gene transcription and cell signaling.5–7 As protein O-GlcNAcylation plays a critical role in mammalian cell survival, its dysregulation is associated with multiple diseases, such as cancer, diabetes, and neurodegenerative diseases.8–10
Despite its importance, global analysis of protein O-GlcNAcylation is extremely challenging due to the low abundances of many O-GlcNAcylated proteins and the complexity of biological samples. Different enrichment methods have been reported for comprehensive analysis of O-GlcNAcylated proteins by mass spectrometry (MS),11–16 including antibody and lectin-based methods.17,18 Other chemical methods, including β-elimination Michael addition (BEMA),19 and metabolic labeling by sugar analogues, were also reported for O-GlcNAcylated protein/peptide enrichment.13
Previously, an elegant chemoenzymatic method was developed to enrich O-GlcNAcylated proteins.11 An engineered galactosyltransferase (GalT) was employed to label O-GlcNAcylated peptides/proteins, and this mutated GalT (i.e., Y289L GalT) can tolerate the bulky acetyl-azido group on C2 of UDP-N-azidoacetyl galactosamine (UDP-GalNAz). The azido group can serve as a handle for O-GlcNAcylated peptide/protein enrichment using click chemistry. This method has been applied to analyze protein O-GlcNAcylation in different samples.12,20,21 However, the engineered GalT is required, and the expression of the Y289L GalT plasmid is not trivial for many laboratories.22 Additionally, the sugar donor containing the azido group, UDP-GalNAz, is relatively difficult to synthesize, and its price is much higher than that of the corresponding regular UDP sugars without the azido group. A kit designated for this chemoenzymatic labeling was available including the engineered enzyme and UDP-GalNAz, but one kit at around 600 USD is only sufficient for labeling one sample with a few milligrams of peptides/proteins for proteomic studies.20,23 Furthermore, the use of a cleavable linker that releases enriched glycopeptides for site-specific analysis is also costly. The limited availability and the high cost of the reagents greatly hamper the applications of this method.
To solve these problems, here a novel chemoenzymatic method has been developed for global and site-specific characterization of protein O-GlcNAcylation. Instead of the engineered Y289L GalT, a wild-type galactosyltransferase from bovine milk is used. This enzyme can use UDP-galactose and transfer galactose to the glycopeptide acceptor. The labeled glycopeptides are oxidized with galactose oxidase (GAO) that specifically oxidizes the C6 hydroxyl group of galactose and GalNAc. The tagged glycopeptides then are enriched by hydrazide resins and released with methoxyamine. All reagents used are more easily accessible and cost-effective as compared to the engineered enzyme and click chemistry reagents. Biological triplicate experiments were performed to validate the effectiveness and the reproducibility of this method, and the results are comparable with the previous chemoenzymatic method using the engineered enzyme and click chemistry reagents. Moreover, because of the promiscuity of the galactosyltransferase, 18 unique O-glucosylated peptides were identified on the EGF domain from nine proteins. Without any sample restrictions, this method can be widely applied for global and site-specific analysis of protein O-GlcNAcylation in different samples.
EXPERIMENTAL SECTION
Cell Culture, Cell Lysis, and Protein Digestion.
MCF-7 cells (from the American Type Culture Collection, ATCC) were grown in high glucose Dulbecco’s Modified Eagle’s Medium (DMEM, Sigma-Aldrich) containing 10% fetal bovine serum (FBS, Corning) in a humidified incubator with 5.0% CO2 at 37 °C. Cells were harvested when the confluency reached ∼80%, washed twice with ice-cold PBS, and pelleted by centrifugation. Cells were lysed for 45 min at 4 °C in a buffer containing 50 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), pH = 7.4, 150 mM NaCl, 0.5% sodium deoxycholate (SDC), 0.1% sodium dodecyl sulfate (SDS), 1.0% Nonidet P-40 (Sigma), 25 units/mL benzonase, and 1 tablet/10 mL of EDTA-free protease inhibitor (Roche). Proteins were reduced with 5 mM dithiothreitol (DTT, Sigma) for 25 min at 56 °C and alkylated with 14 mM iodoacetamide (Sigma) for 20 min in the dark at room temperature. Proteins were purified by the methanol−chloroform protein precipitation method, and the resulting proteins were digested with trypsin for 16 h at 37 °C.
Chemoenzymatic Enrichment of O-GlcNAcylated Peptides.
After digestion, peptides were desalted using tC18 Sep-Pak cartridge (Waters). Typically, ∼2 mg of peptides was used for chemoenzymatic enrichment of glycopeptides. Briefly, purified peptides were dissolved in a reaction buffer (50 mM HEPES, pH = 8.6, 10 mM MnCl2, 1.0 mM UDP-galactose (UDP-Gal, Millipore)), and then 0.5 U galactosyltransferase from bovine milk (EC 2.4.1.22, Sigma) was added to the solution to initiate the reaction. The mixture was incubated at 30 °C for 16 h. The tagged glycopeptides were oxidized with galactose oxidase (GAO, Innovative Research) for enrichment. DMSO was added to the solution to make a final concentration of 20% (v/v). The purpose of adding DMSO was to quench the hydroxyl radicals generated during the oxidation that could damage the peptide backbone.24 16 U horseradish peroxidase (HRP, Sigma) and 25 U galactose oxidase were added to the solution, and the reaction lasted for 1 h at 35 °C. The oxidized glycopeptides were bound to Ultralink hydrazide resins (Thermo) with a reaction buffer containing 0.10 M aniline (Sigma) and 0.30 M sodium acetate, and the pH was adjusted to 4.5 by adding acetic acid. Aniline can catalyze the reaction between aldehyde and hydrazine with the highest reaction rate at pH = 4.5.25 The reaction lasted for 1 h at room temperature, and the resins were washed five times using 100 mM PBS to remove nonspecific binding peptides. The enriched glycopeptides were eluted with a buffer containing 0.20 M methoxyamine hydrochloride (Sigma), 0.10 M NaAc, and 0.10 M aniline, pH = 4.5, for 1 h at room temperature. The elution was repeated once, and the eluents were combined, desalted, and fractionated by stage-tip, and then lyophilized.
LC−MS/MS Analysis.
The dried samples were dissolved in 5 μL of solvent containing 5% ACN. Formic acid is not recommended in the solvent because the oxime bond in the labeled glycopeptides could be hydrolyzed under acidic conditions. By contrast, the bond is stable at neutral pH.26,27 Four microliters of the solution was loaded onto a microcapillary column packed with C18 beads (Magic C18AQ, 3 μm, 200 Å, 100 μm × 16 cm, Michrom Bioresources) using a Dionex WPS-3000TPLRS autosampler (UltiMate 3000 thermostated Rapid Separation Pulled Loop Wellplate Sampler). Peptides from every fraction were separated by reversed-phase high performance liquid chromatography (HPLC) with a 120 min gradient of 2−30% ACN (with 0.125% FA). Separated peptides were detected with a data-dependent Top15 method in a hybrid dual-cell quadrupole linear ion trap − Orbitrap mass spectrometer (LTQ Orbitrap Elite, ThermoFisher, with Xcalibur 3.0.63 software). For each cycle, the peptides were first analyzed by a full MS scan with a resolution of 60 000 in the Orbitrap cell. Peptides then were fragmented by higher-energy collision dissociation (HCD) at 34% normalized collision energy (NCE), and fragments were also recorded in the Orbitrap cell (resolution = 15 000). Selected precursor ions were excluded from further sequencing for 90 s. Singly charged ions and ions with unassigned charge states were not fragmented.
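The precursor-selection rules described above (Top15, 90 s dynamic exclusion, no fragmentation of singly charged or unassigned ions) can be sketched in a few lines of Python. This is a simplified illustration and not the instrument's acquisition logic: the `Peak` class, the `select_precursors` function, and the 10 ppm tolerance used to match entries on the exclusion list are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Peak:
    mz: float
    charge: Optional[int]  # None means the charge state could not be assigned
    intensity: float


def select_precursors(
    peaks: List[Peak],
    now_s: float,
    excluded: List[Tuple[float, float]],
    top_n: int = 15,
    exclusion_s: float = 90.0,
    tol_ppm: float = 10.0,  # assumed matching tolerance, not stated in the text
) -> List[Peak]:
    """Pick up to top_n precursors from one full MS scan.

    `excluded` holds (m/z, expiry time in s) pairs and is updated in place.
    """
    # Forget exclusion entries whose window has elapsed.
    excluded[:] = [(mz, expiry) for mz, expiry in excluded if expiry > now_s]

    def is_excluded(mz: float) -> bool:
        return any(abs(mz - ex_mz) / ex_mz * 1e6 <= tol_ppm for ex_mz, _ in excluded)

    # Skip singly charged and unassigned ions, and anything recently sequenced.
    candidates = [
        p for p in peaks
        if p.charge is not None and p.charge >= 2 and not is_excluded(p.mz)
    ]
    chosen = sorted(candidates, key=lambda p: p.intensity, reverse=True)[:top_n]

    # Selected precursors are excluded for the next exclusion_s seconds.
    excluded.extend((p.mz, now_s + exclusion_s) for p in chosen)
    return chosen
```

Calling the function once per full MS scan with a shared `excluded` list reproduces the intended behavior: an ion selected at time t is not selected again until t + 90 s.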
Database Search, Glycopeptide Identification, and Bioinformatic Analysis.
The resulting raw files were searched against the human (Homo sapiens) proteome database (from UniProt) using Byonic software.28 The following parameters were employed during the search: 20 ppm precursor mass tolerance; 0.025 Da product ion mass tolerance; two missed cleavages; scan type, HCD; fixed modification, alkylation of cysteine (+57.0215 Da); variable modifications, oxidation of methionine (+15.9949 Da), O-GlcNAc (HexNAc(1)Hex(1) + 27.0109 Da, or HexNAc(1)Hex(1) + 9.0003 Da (−18.0106 Da due to water loss), on S or T), and O-glucose (Hex(1)Hex(1) + 27.0109 Da, or Hex(1)Hex(1) + 9.0003 Da, on S or T). The false discovery rates (FDRs) of glycopeptides were filtered to less than 1%. To ensure the high quality of the glycopeptide identification, additional filters (|Log prob| > 4 and Score > 300) were applied.29 As glycopeptides modified with the Thomsen−Friedenreich (TF) antigen, a mucin-type O-glycan, could also be enriched and identified using this method, the spectra were manually checked to remove potential TF-bearing glycopeptides on the basis of the relative intensities of the diagnostic peaks in the low mass range.30 Briefly, a ratio of peak intensities, (m/z 138 + m/z 168)/(m/z 126 + m/z 144), was adopted to determine the identity of the glycan. If the ratio was >2.0, the glycopeptide was identified as O-GlcNAcylated. If the ratio was between 0.4 and 0.6, the glycopeptide was identified as modified with the TF antigen. The identified O-GlcNAcylated proteins were clustered using the Database for Annotation, Visualization, and Integrated Discovery (DAVID).31
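The quality filters and the diagnostic-ion ratio described above can be expressed as a short sketch. The function names and the "undetermined" label for ratios outside the two stated ranges are assumptions made for illustration; in this work the spectra were checked manually, so the code below is not the authors' pipeline.

```python
def passes_quality_filters(score: float, abs_log_prob: float) -> bool:
    """Apply the additional identification filters: Score > 300 and |Log prob| > 4."""
    return score > 300 and abs_log_prob > 4


def classify_glycan(i126: float, i138: float, i144: float, i168: float) -> str:
    """Classify a glycan from oxonium-ion intensities at m/z 126, 138, 144, and 168.

    ratio = (I138 + I168) / (I126 + I144)
      ratio > 2.0         -> O-GlcNAc
      0.4 <= ratio <= 0.6 -> TF antigen
    Anything else is reported as "undetermined" (an assumption of this sketch).
    """
    denominator = i126 + i144
    if denominator == 0:
        # No m/z 126 or 144 signal: the ratio is undefined, so do not guess.
        return "undetermined"
    ratio = (i138 + i168) / denominator
    if ratio > 2.0:
        return "O-GlcNAc"
    if 0.4 <= ratio <= 0.6:
        return "TF antigen"
    return "undetermined"
```

For instance, intensities of 60 and 40 at m/z 138 and 168 against 10 and 10 at m/z 126 and 144 give a ratio of 5.0, which falls in the O-GlcNAc range.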
RESULTS
Principle of the Chemoenzymatic Method to Enrich Glycopeptides.
The current method integrates enzymatic and hydrazide chemistry reactions to enrich O-GlcNAcylated peptides, and the principle is displayed in Figure 1. Proteins from a human cell lysate were digested into peptides. To enrich O-GlcNAcylated peptides, galactosyltransferase from bovine milk (GalT) was added along with UDP-galactose (UDP-Gal) and MnCl2 to tag O-GlcNAc with a β-1,4-linked galactose. Previously, GalT had already been applied in some pioneering studies of O-GlcNAcylation, where it transferred isotope-labeled UDP-Gal to nuclear-cytoplasmic O-GlcNAcylated proteins for detection.32,33 GalT can transfer galactose to both glucose and GlcNAc. It was reported that with α-lactalbumin, glucose became the preferred acceptor of the enzyme.34 However, without α-lactalbumin, the reaction rate for linking Gal to GlcNAc is much faster. Therefore, to increase the reaction efficiency toward O-GlcNAc, we did not add α-lactalbumin.
Figure 1. Principle for chemoenzymatic enrichment of O-GlcNAcylated peptides. R = CH3 or H.
After the addition of galactose to O-GlcNAc, the C6 hydroxyl group of the galactose can be specifically oxidized and converted to the aldehyde group by galactose oxidase (GAO). GAO is a metalloenzyme that can specifically oxidize galactose and GalNAc.35 It has been successfully exploited for the enrichment of O-glycosylated collagen,36 cell-surface N-glycoproteins,37,38 and Tn antigen-bearing glycoproteins.24 The aldehyde group generated after the oxidation serves as a handle to enrich glycopeptides using hydrazide resins. After the removal of nonspecifically bound peptides, the enriched glycopeptides were eluted with methoxyamine. The elution is fast and complete because the substitution reaction is driven by the large excess of methoxyamine in the solution and because the oxime formed is much more stable than the hydrazone (the linkage between the resins and the glycopeptides). The eluted O-GlcNAcylated peptides carry a relatively small mass tag, which enables site-specific analysis of O-GlcNAcylation by LC−MS/MS.
Optimization of the Enzymatic Labeling Reaction and Identification of O-GlcNAcylated Peptides.
To evaluate the effect of the amount of GalT, parallel experiments with different amounts of the enzyme were performed (Figure 2a). Each sample with ∼2 mg of peptides from a HEK 293T whole cell lysate was treated with a different amount of the enzyme. From 0.25 to 2.5 U, the numbers of identified glycoproteins were very similar, which indicates that an enzyme amount as low as 0.25 U is sufficient and that the enzymatic reaction is effective.
Figure 2. (a) Evaluation of the amount of GalT for the transfer of Gal to O-GlcNAc. (b) An example O-GlcNAcylated peptide identified: PLVPPVS#GHATIAR (“#” refers to the O-GlcNAcylation site). The glycopeptide was confidently identified with a Score of 601.7 and |Log prob| of 12.8.
An example O-GlcNAcylated peptide of PLVPPVS#GHATIAR (“#” refers to the O-GlcNAcylation site) is shown in Figure 2b. This glycopeptide is from nuclear receptor corepressor 2 (NCOR2), which is located in the nucleus. The presence of the modified GlcNAc was verified by the diagnostic peaks (shown in green). The peak assigned as HexNAc(1)Hex(1) + 27.0109 Da is the modified GlcNAc, which serves as a direct proof of the modification, and +27.0109 Da is due to the mass increase after the oxidation and the reaction with methoxyamine. The O-GlcNAcylation site was localized at S423 with a Delta Mod. score = 11.7. With use of HCD, fragments from glycan were often detected,39,40 and this provided us with solid experimental evidence to further confirm the identities of the glycan.
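The +27.0109 Da mass tag follows from the two chemical steps described above, and the arithmetic can be checked directly. The sketch below uses standard monoisotopic atomic masses; the variable names are ours, and treating the full modification as the sum of the HexNAc and Hex residue masses plus the tag is our reading of the "HexNAc(1)Hex(1) + 27.0109 Da" notation.

```python
# Monoisotopic atomic masses (Da).
H, C, N, O = 1.00782503, 12.0, 14.0030740, 15.9949146

water = 2 * H + O                      # H2O
methoxyamine = C + 5 * H + N + O       # CH3ONH2

# Step 1: galactose oxidase converts the C6 CH2OH of galactose to CHO (loss of 2 H).
oxidation_shift = -2 * H
# Step 2: the aldehyde condenses with methoxyamine to form an oxime (loss of H2O).
oxime_shift = methoxyamine - water

mass_tag = oxidation_shift + oxime_shift           # about +27.0109 Da
mass_tag_dehydrated = mass_tag - water             # about +9.0003 Da

# Residue masses of the two sugars (each is the free sugar minus one water).
hex_residue = 6 * C + 10 * H + 5 * O               # C6H10O5
hexnac_residue = 8 * C + 13 * H + N + 5 * O        # C8H13NO5
total_modification = hexnac_residue + hex_residue + mass_tag

print(f"{mass_tag:.4f} {mass_tag_dehydrated:.4f} {total_modification:.4f}")
```

The second value corresponds to the +9.0003 Da variant listed in the search parameters, which differs from the intact tag by one water (18.0106 Da).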
The current method was applied to profile protein O-GlcNAcylation in MCF-7 cells. In biological triplicate experiments, 179, 223, and 174 O-GlcNAcylated proteins were identified, respectively (Figure 3a and Table S1). There are 135 O-GlcNAcylated proteins identified in all three experiments, which demonstrates that this method is reasonably reproducible. Protein clustering results showed that the O-GlcNAcylated proteins identified were enriched in the nucleoplasm, cytoplasm, and cell−cell adherens junction (Figure 3b). Moreover, proteins involved in transcription, and with the functions of protein binding, DNA binding, and RNA binding, were over-represented. These results are consistent with the known locations and functions of O-GlcNAcylated proteins, which further validated that the current method is effective for global analysis of protein O-GlcNAcylation.
Figure 3. (a) Venn diagram showing the overlaps of identified O-GlcNAcylated proteins in biological triplicate experiments. (b) Clustering of identified O-GlcNAcylated proteins based on biological process, cellular component, and molecular function.
In this study, the TF antigen, which is composed of a galactose and a GalNAc, has the same molecular weight as the modified O-GlcNAc, and it is also bound to the S/T residues. However, this should not be a problem because this glycan is uncommon in normal human cells. Even when glycopeptides with the TF antigen are detected by MS, they can be distinguished from O-GlcNAcylated peptides on the basis of the fragmentation pattern differences of the two glycans in tandem MS, as reported by Halim et al.30 Through comparison of the intensities of the diagnostic peaks of the glycans (details included in the Experimental Section), around 10 TF-bearing proteins were identified (Table S2), and one example is shown in Figure S1.
Comparison of the Results from the Current Method with Those from Previous Reports.
Electron transfer dissociation (ETD) has often been used to fragment O-GlcNAcylated peptides because ETD breaks the peptide bonds while leaving the glycosidic linkage intact, which preserves the glycan attached to the amino acid residues for accurate site localization.41,42 By contrast, HCD breaks both the glycosidic bond and the peptide amide bond, and only a few peptide fragments still have the glycans attached to them. Nevertheless, glycan fragments can serve as proof of the glycan and could be used to determine the glycan structure. In this work, HCD was employed for O-glycopeptide analysis.
To demonstrate that the current method was effective for the identification of protein O-GlcNAcylation, we compared the current results with those from two very recent publications using the chemoenzymatic enrichment method based on the engineered GalT and click chemistry, in which protein O-GlcNAcylation in cancer cell lines was globally analyzed by ETD-MS.23,43 In one experiment, Qin et al. identified 196 proteins with O-GlcNAc in HeLa cells, and Li et al. found 269 O-GlcNAcylated proteins from HEK293 cells. In this work, 251 O-GlcNAcylated proteins from MCF-7 cells were identified using the current method with HCD in triplicate experiments, and, on average, 192 were detected in each experiment. The numbers of identified glycoproteins are very similar, and the differences could be due to multiple reasons, including the use of different cell lines and MS dissociation methods, besides the different enrichment methods. The overlaps between every two of the three data sets are around 50% (Figure S2). Overall, the results from the current method are comparable with those in the previous reports, but without the need of the engineered enzyme and click chemistry reagents, the current method is more accessible and cost-effective.
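The replicate numbers reported above (179, 223, and 174 proteins per experiment, 135 in all three, and 251 in total) are enough to work out how the identifications are distributed across replicates by inclusion-exclusion. The following sketch assumes that 251 is the union of the three replicates; the derived counts are our arithmetic and are not figures reported by the authors.

```python
from typing import Dict, Sequence


def replicate_overlap(replicates: Sequence[int], in_all: int, union_total: int) -> Dict[str, float]:
    """Derive overlap statistics for exactly three replicates by inclusion-exclusion.

    |A u B u C| = |A| + |B| + |C| - (sum of pairwise intersections) + |A n B n C|
    """
    if len(replicates) != 3:
        raise ValueError("this sketch handles exactly three replicates")
    pairwise_sum = sum(replicates) + in_all - union_total
    # Each protein found in all three is counted in all three pairwise intersections.
    exactly_two = pairwise_sum - 3 * in_all
    exactly_one = union_total - in_all - exactly_two
    return {
        "mean_per_replicate": sum(replicates) / 3,
        "exactly_one": exactly_one,
        "exactly_two": exactly_two,
        "in_all_three": in_all,
        "fraction_in_all_three": in_all / union_total,
    }


stats = replicate_overlap((179, 223, 174), in_all=135, union_total=251)
print(stats)
```

Under that assumption, 55 proteins were found in exactly two replicates and 61 in only one, and about 54% of all identified proteins were seen in every replicate. The mean of 192 proteins per replicate matches the value stated in the text.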
O-GlcNAcylation and Unexpected O-Glucosylation on the EGF Domain.
The EGF domain is a short sequence bound with high affinity to specific cell-surface receptors to activate some signal transduction cascades. It is primarily found in the extracellular region of membrane-bound proteins, especially notch proteins. Notch proteins are frequently regulated by glycosylation on EGF domains, including O-fucose (O-Fuc), O-glucose (O-Glc), and O-GlcNAc.44–46 Alfaro et al. first demonstrated the use of chemoenzymatic enrichment coupled with a proteomics workflow for the identification of extracellular O-GlcNAcylation on the EGF domain.20 This atypical O-GlcNAcylation is added by ER-resident OGT (EOGT), and it participates in the regulation of cell−cell and cell−matrix interactions on the cell surface.47,48 It was found that O-GlcNAcylation on the EGF domain follows a consensus sequence, CXXGX(T/S)GXXC (X refers to any amino acid residue).20
Six O-GlcNAcylation sites were found on the EGF domain in four proteins by the current method (Table 1), and one example glycopeptide from notch 2 protein is displayed in Figure 4a. All of the glycopeptides identified were manually checked to make sure that the glycosylation sites were localized on the EGF domain. Five out of six O-GlcNAcylation sites have the motif (shown in red), but for the O-GlcNAcylated peptide identified from attractin (ATRN), the glycine on the fourth position of the motif is substituted by leucine (shown in yellow). Considering the similarities between the structures of glycine and leucine, it is possible that EOGT could tolerate this slight structure difference. In fact, in the early work about EOGT, a motif of “CXXXX(T/S)G” was proposed for extracellular O-GlcNAc.47,49
Table 1.
O-GlcNAcylation Sites Identified on the EGF Domainᵃ
| Glycopeptide | Glycan | Position | Score | Delta Mod | \|Log Prob\| | Gene Name |
|---|---|---|---|---|---|---|
| K.CLTGFT#GQK.C | HexNAc(1)Hex(1) 27.0109 | 170 | 736.1 | 38.9 | 14.2 | NOTCH2 |
| R.YSCVCSPGFT#GQR.C | HexNAc(1)Hex(1) 27.0109 | 666 | 716.1 | 43.3 | 15.4 | NOTCH2 |
| R.CPPGFT#GDYCETEVDLCYSRPCGPHGR.C | HexNAc(1)Hex(1) 27.0109 | 1276 | 432.1 | 6.0 | 9.5 | CELSR2 |
| R.CVCEPGWS#GPR.C | HexNAc(1)Hex(1) 27.0109 | 718 | 581.3 | 329.8 | 10.9 | NOTCH3 |
| R.CTCPPGYT#GLR.C | HexNAc(1)Hex(1) 27.0109 | 1191 | 673.7 | 153.2 | 14.0 | NOTCH3 |
| K.CENLTT#GK.HC | HexNAc(1)Hex(1) 9.0003 | 1080 | 622.2 | 7.4 | 12.6 | ATRN |

ᵃ “#” denotes the O-GlcNAcylation site, and the consensus motif for O-GlcNAcylation on the EGF domain is highlighted in red. The sequence in yellow is similar to the consensus motif but differs at certain amino acid residues.
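The two EGF-domain motifs discussed above, the consensus CXXGX(T/S)GXXC and the earlier, looser CXXXX(T/S)G, translate directly into regular expressions. The sketch below scans a plain sequence for candidate sites; the function name is ours, and the test sequences are Table 1 peptides with the "#" removed and the C-terminal flanking residues appended, which is a simplification made for illustration.

```python
import re
from typing import List

# Lookaheads are used so that overlapping motif instances are all reported.
EGF_OGLCNAC_CONSENSUS = re.compile(r"(?=C..G.[ST]G..C)")   # CXXGX(T/S)GXXC
EOGT_EARLY_MOTIF = re.compile(r"(?=C.{4}[ST]G)")           # CXXXX(T/S)G


def candidate_sites(sequence: str, motif: "re.Pattern[str]") -> List[int]:
    """Return 1-based positions of the S/T residue in every motif instance.

    In both motifs the S/T sits five residues after the leading cysteine,
    so the 1-based site position is the 0-based match start plus 6.
    """
    return [m.start() + 6 for m in motif.finditer(sequence)]


# NOTCH2 peptide K.CLTGFT#GQK.C: fits the consensus motif.
print(candidate_sites("CLTGFTGQKC", EGF_OGLCNAC_CONSENSUS))   # [6]
# ATRN peptide K.CENLTT#GK.HC: leucine replaces the glycine at position 4.
print(candidate_sites("CENLTTGKHC", EGF_OGLCNAC_CONSENSUS))   # []
print(candidate_sites("CENLTTGKHC", EOGT_EARLY_MOTIF))        # [6]
```

The attractin peptide fails the strict consensus but matches the looser motif, which mirrors the observation made in the text.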
Figure 4. Examples of the identified O-GlcNAcylated peptide and O-glucosylated peptide with the site on the EGF domain. (a) A glycopeptide modified with O-GlcNAc. This peptide is from notch 2 protein. Score = 698.1, |Log prob| = 13.4, Delta Mod = 47.0. (b) A glycopeptide modified with O-Glc. This peptide is from notch 3 protein. Score = 956.4, |Log prob| = 19.6, Delta Mod = 697.9.
Interestingly, besides O-GlcNAcylation, O-glucosylation was also extensively identified on the EGF domain in this work. We reason that the identification of O-Glc was due to the promiscuity of GalT, which takes both terminal Glc and GlcNAc as substrates. In total, 18 unique glycopeptides modified with O-Glc were identified on the EGF domain among nine proteins (Table 2), and an example glycopeptide with O-Glc is shown in Figure 4b. O-Glc was previously identified on the EGF domain of the notch proteins, and it is critical for their folding and functions.50,51 A family of glucosyltransferases (POGLUT1, POGLUT2, and POGLUT3) was found to modify the serine residue on different EGF domains in notch proteins52 and two other proteins critical to Drosophila development.53 Similar to O-GlcNAcylation on the EGF domain, there are consensus motifs for O-Glc as well: “CXSXPC” when added by POGLUT1 and “CXNTXGSFXC” when added by POGLUT2 and POGLUT3.52 It was predicted that about 50 proteins in mammals and 14 in Drosophila could be modified with O-glucose, but the vast majority of them remain undetected.44
Table 2.
O-Glucosylation Sites Identified on the EGF Domainᵃ
| Glycopeptide | Glycan | Position | Score | Delta Mod | \|Log Prob\| | Suggested Enzymes | Gene Name |
|---|---|---|---|---|---|---|---|
| K.AGLLCHLDDACIS#NPCHK.G | Hex(1)Hex(1) 27.0109 | 369 | 832.5 | 721.9 | 16.7 | POGLUT1 | NOTCH2 |
| R.VKDGCDVDDPCTS#SPCPPNSR.C | Hex(1)Hex(1) 27.0109 | 1866 | 586.1 | 0.8 | 12.2 | POGLUT1 | CELSR1 |
| R.CQLEDPCHS#GPCAGR.G | Hex(1)Hex(1) 27.0109 | 76 | 991.2 | 803.1 | 20.3 | POGLUT1 | NOTCH3 |
| R.GFRGPDCSLPDPCLS#SPCAHGAR.C | Hex(1)Hex(1) 27.0109 | 111 | 719.0 | 1.2 | 15.8 | POGLUT1 | NOTCH3 |
| R.GPDCSLPDPCLS#SPCAHGAR.C | Hex(1)Hex(1) 9.0003 | 114 | 608.5 | 0.0 | 13.6 | POGLUT1 | NOTCH3 |
| K.TGLLCHLDDACVS#NPCHEDAICDTNPVNGR.A | Hex(1)Hex(1) 9.0003 | 345 | 566.6 | 1.6 | 11.6 | POGLUT1 | NOTCH3 |
| R.CETDVNECLS#GPCR.N | Hex(1)Hex(1) 27.0109 | 428 | 572.1 | 16.7 | 11.0 | POGLUT1 | NOTCH3 |
| R.DACES#QPCR.A | Hex(1)Hex(1) 27.0109 | 736 | 388.1 | 314.2 | 5.1 | POGLUT1 | NOTCH3 |
| R.CQTVLSPCES#QPCQHGGQCRPSPGPGGGLTFTCHCAQPFWGPR.C | Hex(1)Hex(1) 27.0109 | 1243 | 548.7 | 0.0 | 12.7 | POGLUT1 | NOTCH3 |
| R.LGESCINTVGS#FR.C | Hex(1)Hex(1) 27.0109 | 229 | 628.9 | 32.9 | 14.0 | POGLUT2/3 | FBLN1 |
| CR.NNLGS#FNCECSPGSK.L | Hex(1)Hex(1) 27.0109 | 868 | 437.4 | 9.6 | 11.0 | POGLUT2/3 | FBN2 |
| CVNSK.GS#FHCECPEGLTLDGTGR.V | Hex(1)Hex(1) 27.0109 | 976 | 675.5 | 9.4 | 11.0 | POGLUT2/3 | FBN2 |
| R.GEHCVNTLGS#FHCYK.A | Hex(1)Hex(1) 27.0109 | 733 | 694.6 | 9.5 | 14.0 | POGLUT2/3 | FBLN2 |
| R.LCQHTCENTLGS#YR.C | Hex(1)Hex(1) 27.0109 | 912 | 626.2 | 21.0 | 12.6 | POGLUT2/3 | FBLN2 |
| R.VNNGGCSSLCLATPGS#R.QC | Hex(1)Hex(1) 27.0109 | 809 | 622.4 | 22.9 | 14.1 | POGLUT2/3 | LRP1 |
| K.CDQNKFS#VK.C | Hex(1)Hex(1) 9.0003 | 1237 | 663.4 | 453.1 | 13.7 | POGLUT2/3 | LRP1 |
| K.CINTDGS#YK.C | Hex(1)Hex(1) 27.0109 | 1681 | 431.6 | 0.8 | 5.0 | POGLUT2/3 | LTBP1 |
| R.HGDCLNNPGS#YR.C | Hex(1)Hex(1) 9.0003 | 367 | 356.5 | 304.9 | 4.9 | POGLUT2/3 | LTBP3 |

ᵃ The two types of consensus motifs are highlighted in blue and red.
In this work, for O-glucosylation, the protein substrates with both types of the consensus motifs were found. Nine unique O-glucosylated peptides contain the motif of “CXSXPC”, which suggests that they are the product of POGLUT1. Among them, one is from NOTCH2, one is from CELSR1, and seven are from NOTCH3.
Glycopeptides with the motif of (or similar to) “CXNTXGSFXC” (presumably added by POGLUT2/3) have the sequence highlighted in red. Although some of them have the exact motif, others differ from it to varying degrees. We reason that the original motif proposed by Takeuchi et al., which was based on a few sites identified on notch proteins, could be expanded to a more general motif “CXXXXXSXXC” for the substrates of POGLUT2/3, based on the O-Glc-modified glycopeptides identified here. It is also possible that the O-glucosylation sites identified in this data set are products of other, as-yet unidentified enzymes, and more research is needed to further confirm the enzyme(s) responsible for the O-glucosylation sites identified here and the consensus motif of the O-glucosylation added by POGLUT2/3.
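The assignment of a suggested enzyme from the sequence around the modified serine can be sketched as follows. The function takes a peptide annotated as in Table 2 (flanking residues separated by ".", "#" after the modified residue) and tests the POGLUT1 motif “CXSXPC” and the generalized POGLUT2/3 motif “CXXXXXSXXC” proposed here. The function names and the order of the checks are assumptions of this illustration, and the assignments in Table 2 were not necessarily produced this way.

```python
import re

POGLUT1_MOTIF = re.compile(r"C.S.PC")          # "CXSXPC", modified S at position 3
POGLUT23_GENERAL = re.compile(r"C.{5}S..C")    # "CXXXXXSXXC", modified S at position 7


def _window(seq: str, start: int, end: int) -> str:
    """Return seq[start:end], or '' if the window runs past either end."""
    if start < 0 or end > len(seq):
        return ""
    return seq[start:end]


def suggest_enzyme(annotated: str) -> str:
    """Suggest the glucosyltransferase for a peptide with exactly one '#' site marker."""
    flat = annotated.replace(".", "")          # merge flanking residues into the sequence
    site = flat.index("#") - 1                 # 0-based index of the modified residue
    seq = flat.replace("#", "")
    if POGLUT1_MOTIF.fullmatch(_window(seq, site - 2, site + 4)):
        return "POGLUT1"
    if POGLUT23_GENERAL.fullmatch(_window(seq, site - 6, site + 4)):
        return "POGLUT2/3"
    return "no motif"


print(suggest_enzyme("K.AGLLCHLDDACIS#NPCHK.G"))   # POGLUT1
print(suggest_enzyme("R.LGESCINTVGS#FR.C"))        # POGLUT2/3
print(suggest_enzyme("K.CLTGFT#GQK.C"))            # no motif
```

Anchoring each motif on the marked site, rather than scanning the whole peptide, avoids assigning an enzyme when a motif occurs elsewhere in the sequence but not at the modified residue.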
Some other O-glucosylation sites on proteins without an EGF domain were also found (Table S3). Among them, a peptide from residues 612−637 of the protein HCFC1 was identified as O-glucosylated, consistent with a previous report.54 Moreover, a peptide from histone H2B (residues 110−121) was identified to be O-glucosylated. This O-glucosylation site could potentially overlap with a previously identified O-GlcNAcylation site, that is, S112. This O-GlcNAcylation promotes the monoubiquitination of K120, presumably for transcription activation.55,56 This result suggests that O-Glc may be part of the histone code.
DISCUSSION
The chemoenzymatic method using the engineered GalT and UDP-GalNAz has advanced the study of protein O-GlcNAcylation.11 Nevertheless, the engineered GalT is not easy to express in many laboratories. Moreover, UDP-GalNAz is relatively difficult to synthesize and is costly. Although the kit dedicated to this method is commercially available, the high expense may restrict many researchers from using it to study this modification. Especially for O-GlcNAcylation profiling using multiplex proteomics, where a large amount of proteins from each sample is required and multiple samples are included in each experiment, the expense will be much higher. Here, we demonstrated that the wild-type GalT and UDP-Gal can be employed to enrich O-GlcNAcylated peptides, and the results are comparable with those from the engineered GalT. More importantly, without the need of the engineered enzyme and click chemistry reagents, the current method is more accessible and cost-effective.
To further improve the coverage of protein O-GlcNAcylation, the following measures may be helpful. First, O-GlcNAcylation is a reversible modification and can be removed by OGA; inhibition of OGA (e.g., with Thiamet-G or PUGNAc) elevates the cellular protein O-GlcNAcylation level.57,58 Second, more sensitive and high-speed mass spectrometers will be more effective in identifying O-GlcNAcylated peptides, considering that many O-GlcNAcylated peptides have very low abundance even after the enrichment. Third, because the modified glycan group (O-GlcNAc) is fragile under CID and HCD, other activation methods such as electron-transfer/higher-energy collision dissociation (EThcD) and activated-ion ETD (AI-ETD) may be helpful for the detection of O-GlcNAcylated peptides.59,60 Furthermore, more advanced software will facilitate the identification of O-GlcNAcylated peptides.61
Interestingly, using the current method, we identified protein O-glucosylation on the EGF domain. This type of protein glycosylation was reported previously, but has not been well studied. One of the main reasons is that only a few sites on notch proteins were reported, and many others remain unknown. The current method allowed us to identify eight O-glucosylation sites on two notch proteins, and 10 on seven other proteins previously unknown to be O-glucosylated. For future O-glucosylated peptide enrichment, adding α-lactalbumin along with GalT may facilitate the identification of protein O-glucosylation because it changes the structure of GalT and increases the rate of catalysis for glucose.62 O-Glucosylation is critical for the folding and the functions of proteins with EGF domains, and, therefore, the current method will also advance the study of protein O-glucosylation.
Using this method, the labeled O-GlcNAc has the same molecular weight as the endogenous TF antigen. This should not be a big concern because the TF antigen is uncommon in normal human cells, and proteins modified with the TF antigen are on the cell surface and in the secretory pathway. The vast majority of proteins modified with O-GlcNAc are located in the nucleus and the cytoplasm. Additionally, these two modifications can be distinguished by the pattern differences of the oxonium ions in tandem MS.30
In conclusion, a novel method based on the wild-type GalT and UDP-Gal was developed to enrich O-GlcNAcylated peptides for global and site-specific analysis of protein O-GlcNAcylation. More than 200 O-GlcNAcylated proteins were identified, a number comparable with those from the previous chemoenzymatic method using the engineered GalT and UDP-GalNAz. The reagents used in this study are much more easily accessible and affordable. Without the engineered enzyme and click chemistry reagents, the current method can be widely applied for global analysis of protein O-GlcNAcylation. Besides O-GlcNAcylated peptides, many O-glucosylated peptides were also identified. This method will aid in a better understanding of glycoprotein functions and cellular activities.
ACKNOWLEDGMENTS
This work was supported by the National Science Foundation (CAREER award CHE-1454501) and the National Institute of General Medical Sciences of the National Institutes of Health (R01GM118803).
Footnotes
ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.analchem.0c01284.
Figure S1, an example glycopeptide modified with TF antigen; Figure S2, Venn diagram showing the overlap of O-GlcNAcylated proteins identified in this work and reported from the literature; and Table S3, other O-glucosylated peptides identified in this work (PDF)
Table S1, identified O-GlcNAcylated peptides from biological triplicate experiments (XLSX)
Table S2, glycopeptides with the TF antigen identified in triplicate experiments (XLSX)
The authors declare no competing financial interest.
REFERENCES
- (1) Lubas WA; Frank DW; Krause M; Hanover JA J. Biol. Chem 1997, 272 (14), 9316–9324.
- (2) Hart GW Annu. Rev. Biochem 1997, 66, 315–335.
- (3) Hart GW; Housley MP; Slawson C Nature 2007, 446 (7139), 1017–1022.
- (4) Sakaidani Y; Nomura T; Matsuura A; Ito M; Suzuki E; Murakami K; Nadano D; Matsuda T; Furukawa K; Okajima T Nat. Commun 2011, 2, 1.
- (5) Li X; Zhu Q; Shi X; Cheng Y; Li X; Xu H; Duan X; Hsieh-Wilson LC; Chu J; Pelletier J; Ni M; Zheng Z; Li S; Yi W Proc. Natl. Acad. Sci. U. S. A 2019, 116 (16), 7857–7866.
- (6) Liu C; Li J Front. Endocrinol. (Lausanne, Switz.) 2018, 9, 415.
- (7) Ruan HB; Han X; Li MD; Singh JP; Qian K; Azarhoush S; Zhao L; Bennett AM; Samuel VT; Wu J; Yates JR 3rd; Yang X Cell Metab 2012, 16 (2), 226–237.
- (8) Hanover JA; Chen W; Bond MR J. Bioenerg. Biomembr 2018, 50 (3), 155–173.
- (9) Yang X; Ongusaha PP; Miles PD; Havstad JC; Zhang F; So WV; Kudlow JE; Michell RH; Olefsky JM; Field SJ; Evans RM Nature 2008, 451 (7181), 964–969.
- (10) Levine PM; Galesic A; Balana AT; Mahul-Mellier AL; Navarro MX; De Leon CA; Lashuel HA; Pratt MR Proc. Natl. Acad. Sci. U. S. A 2019, 116 (5), 1511–1519.
- (11) Khidekel N; Arndt S; Lamarre-Vincent N; Lippert A; Poulin-Kerstien KG; Ramakrishnan B; Qasba PK; Hsieh-Wilson LC J. Am. Chem. Soc 2003, 125 (52), 16162–16163.
- (12) Wang ZH; Udeshi ND; O’Malley M; Shabanowitz J; Hunt DF; Hart GW Mol. Cell. Proteomics 2010, 9 (1), 153–160.
- (13) Boyce M; Carrico IS; Ganguli AS; Yu SH; Hangauer MJ; Hubbard SC; Kohler JJ; Bertozzi CR Proc. Natl. Acad. Sci. U. S. A 2011, 108 (8), 3141–3146.
- (14) Wang XS; Yuan ZF; Fan J; Karch KR; Ball LE; Denu JM; Garcia BA Mol. Cell. Proteomics 2016, 15 (7), 2462–2475.
- (15) Suttapitugsakul S; Sun FX; Wu RH Anal. Chem 2020, 92 (1), 267–291.
- (16) Xiao HP; Sun FX; Suttapitugsakul S; Wu RH Mass Spectrom. Rev 2019, 38 (4–5), 356–379.
- (17) Trinidad JC; Barkan DT; Gulledge BF; Thalhammer A; Sali A; Schoepfer R; Burlingame AL Mol. Cell. Proteomics 2012, 11 (8), 215–229.
- (18) Teo CF; Ingale S; Wolfert MA; Elsayed GA; Not LG; Chatham JC; Wells L; Boons GJ Nat. Chem. Biol 2010, 6 (5), 338–343.
- (19) Wells L; Vosseller K; Cole RN; Cronshaw JM; Matunis MJ; Hart GW Mol. Cell. Proteomics 2002, 1 (10), 791–804.
- (20) Alfaro JF; Gong CX; Monroe ME; Aldrich JT; Clauss TRW; Purvine SO; Wang ZH; Camp DG; Shabanowitz J; Stanley P; Hart GW; Hunt DF; Yang F; Smith RD Proc. Natl. Acad. Sci. U. S. A 2012, 109 (19), 7280–7285.
- (21) Wang S; Yang F; Petyuk VA; Shukla AK; Monroe ME; Gritsenko MA; Rodland KD; Smith RD; Qian WJ; Gong CX; Liu T J. Pathol 2017, 243 (1), 78–88.
- (22) Ramakrishnan B; Qasba PK J. Biol. Chem 2002, 277 (23), 20833–20839.
- (23) Qin K; Zhu Y; Qin W; Gao J; Shao X; Wang YL; Zhou W; Wang C; Chen X ACS Chem. Biol 2018, 13 (8), 1983–1989.
- (24) Zheng J; Xiao H; Wu R Angew. Chem., Int. Ed 2017, 56 (25), 7107–7111.
- (25) Dirksen A; Hackeng TM; Dawson PE Angew. Chem., Int. Ed 2006, 45 (45), 7581–7584.
- (26) Kalia J; Raines RT Angew. Chem., Int. Ed 2008, 47 (39), 7523–7526.
- (27) Kolmel DK; Kool ET Chem. Rev 2017, 117 (15), 10358–10376.
- (28) Bern M; Kil YJ; Becker C Curr. Protoc. Bioinformatics 2012, 1, DOI: 10.1002/0471250953.bi1320s40.
- (29) Xiao H; Chen W; Smeekens JM; Wu R Nat. Commun 2018, 9 (1), 1692.
- (30) Halim A; Westerlind U; Pett C; Schorlemer M; Ruetschi U; Brinkmalm G; Sihlbom C; Lengqvist J; Larson G; Nilsson J J. Proteome Res 2014, 13 (12), 6024–6032.
- (31) Huang da W; Sherman BT; Lempicki RA Nat. Protoc 2009, 4 (1), 44–57.
- (32) Torres CR; Hart GW J. Biol. Chem 1984, 259, 3308–3317.
- (33) Holt GD; Hart GW J. Biol. Chem 1986, 261, 8049–8057.
- (34) Bell JE; Beyer TA; Hill RL J. Biol. Chem 1976, 251, 3003–3013.
- (35) Whittaker JW Adv. Protein Chem 2002, 60, 1–49.
- (36) Taga Y; Kusubata M; Ogawa-Goto K; Hattori S Mol. Cell. Proteomics 2012, 11 (6), M111.010397.
- (37) Ramya TN; Weerapana E; Cravatt BF; Paulson JC Glycobiology 2013, 23 (2), 211–221.
- (38) Sun F; Suttapitugsakul S; Wu R Anal. Chem 2019, 91 (6), 4195–4203.
- (39) Khatri K; Pu Y; Klein JA; Wei J; Costello CE; Lin C; Zaia J J. Am. Soc. Mass Spectrom 2018, 29 (6), 1075–1085.
- (40) Palaniappan KK; Bertozzi CR Chem. Rev 2016, 116 (23), 14277–14306.
- (41) Syka JE; Coon JJ; Schroeder MJ; Shabanowitz J; Hunt DF Proc. Natl. Acad. Sci. U. S. A 2004, 101 (26), 9528–9533.
- (42) Darula Z; Medzihradszky KF Mol. Cell. Proteomics 2018, 17 (1), 2–17.
- (43) Li J; Li Z; Duan X; Qin K; Dang L; Sun S; Cai L; Hsieh-Wilson LC; Wu L; Yi W ACS Chem. Biol 2019, 14 (1), 4–10.
- (44) Haltom AR; Jafar-Nejad H Glycobiology 2015, 25 (10), 1027–1042.
- (45) Takeuchi H; Fernandez-Valdivia RC; Caswell DS; Nita-Lazar A; Rana NA; Garner TP; Weldeghiorghis TK; Macnaughtan MA; Jafar-Nejad H; Haltiwanger RS Proc. Natl. Acad. Sci. U. S. A 2011, 108 (40), 16600–16605.
- (46) Ogawa M; Sawaguchi S; Kawai T; Nadano D; Matsuda T; Yagi H; Kato K; Furukawa K; Okajima T J. Biol. Chem 2015, 290 (4), 2137–2149.
- (47) Sakaidani Y; Nomura T; Matsuura A; Ito M; Suzuki E; Murakami K; Nadano D; Matsuda T; Furukawa K; Okajima T Nat. Commun 2011, 2, 583.
- (48) Ogawa M; Okajima T Curr. Opin. Struct. Biol 2019, 56, 72–77.
- (49) Sakaidani Y; Ichiyanagi N; Saito C; Nomura T; Ito M; Nishio Y; Nadano D; Matsuda T; Furukawa K; Okajima T Biochem. Biophys. Res. Commun 2012, 419 (1), 14–19.
- (50) Leonardi J; Fernandez-Valdivia R; Li YD; Simcox AA; Jafar-Nejad H Development 2011, 138 (16), 3569–3578.
- (51) Takeuchi H; Kantharia J; Sethi MK; Bakker H; Haltiwanger RS J. Biol. Chem 2012, 287 (41), 33934–33944.
- (52) Takeuchi H; Schneider M; Williamson DB; Ito A; Takeuchi M; Handford PA; Haltiwanger RS Proc. Natl. Acad. Sci. U. S. A 2018, 115 (36), E8395–E8402.
- (53) Haltom AR; Lee TV; Harvey BM; Leonardi J; Chen YJ; Hong Y; Haltiwanger RS; Jafar-Nejad H PLoS Genet 2014, 10 (11), e1004795.
- (54) Darabedian N; Gao JX; Chuh KN; Woo CM; Pratt MR J. Am. Chem. Soc 2018, 140 (23), 7092–7100.
- (55) Fujiki R; Hashiba W; Sekine H; Yokoyama A; Chikanishi T; Ito S; Imai Y; Kim J; He HH; Igarashi K; Kanno J; Ohtake F; Kitagawa H; Roeder RG; Brown M; Kato S Nature 2011, 480 (7378), 557–560.
- (56) Chen Q; Chen Y; Bian C; Fujiki R; Yu X Nature 2013, 493 (7433), 561–564.
- (57) Yuzwa SA; Macauley MS; Heinonen JE; Shan X; Dennis RJ; He Y; Whitworth GE; Stubbs KA; McEachern EJ; Davies GJ; Vocadlo DJ Nat. Chem. Biol 2008, 4 (8), 483–490.
- (58) Horsch M; Hoesch L; Vasella A; Rast DM Eur. J. Biochem 1991, 197 (3), 815–818.
- (59) Buch-Larsen SC; Hendriks IA; Lodge JM; Rykær M; Furtwängler B; Shishkova E; Westphall MS; Coon JJ; Nielsen ML Mapping physiological ADP-ribosylation using Activated Ion Electron Transfer Dissociation (AI-ETD). bioRxiv 2020.
- (60) Riley NM; Hebert AS; Westphall MS; Coon JJ Nat. Commun 2019, 10 (1), 1311.
- (61) Polasky DA; Yu F; Teo GC; Nesvizhskii AI Fast and Comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. bioRxiv 2020.
- (62) Fitzgerald DK; Brodbeck U; Kiyosawa I; Mawal R; Colvin B; Ebner KE J. Biol. Chem 1970, 245, 2103–2108.
