Abstract
Protein O-linked mannose (O-Man) glycosylation is an evolutionarily conserved posttranslational modification that fulfills important biological roles during embryonic development. Three nonredundant enzyme families, POMT1/POMT2, TMTC1-4, and TMEM260, selectively coordinate the initiation of protein O-Man glycosylation on distinct classes of transmembrane proteins, including α-dystroglycan, cadherins, and plexin receptors. However, a systematic investigation of their substrate specificities is lacking, in part due to the ubiquitous expression of O-Man glycosyltransferases in cells, which precludes analysis of pathway-specific O-Man glycosylation on a proteome-wide scale. Here, we apply a targeted workflow for membrane glycoproteomics across five human cell lines to extensively map O-Man substrates and genetically deconstruct O-Man initiation by individual and combinatorial knockout of O-Man glycosyltransferase genes. We established a human cell library for the analysis of substrate specificities of individual O-Man initiation pathways by quantitative glycoproteomics. Our results identify 180 O-Man glycoproteins, demonstrate new protein targets for the POMT1/POMT2 pathway, and show that TMTC1-4 and TMEM260 pathways widely target distinct Ig-like protein domains of plasma membrane proteins involved in cell–cell and cell–extracellular matrix interactions. The identification of O-Man on Ig-like folds adds to the emerging concept of domain-specific O-Man glycosylation and opens the way for functional studies of O-Man–glycosylated adhesion molecules and receptors.
Keywords: glycobiology, post-translational modification (PTM), O-man, protein O-mannosyltransferases, mass spectrometry (MS), lectins, genetic engineering, CRISPR/cas, protein domains, cadherin, plexin, dystroglycan
Highlights
• Improved workflow for sensitive C- and O-Man glycoproteomics.
• Differential analyses of O-Man glycoproteomes in genetically engineered cells.
• Dissection of POMT1/POMT2, TMTC1-4, and TMEM260 substrate specificities.
In Brief
The biosynthetic regulation and initiation of protein O-linked mannose (O-Man) glycosylation have been dissected in genetically engineered human cell lines by a sensitive method for differential O-glycoproteomics. The results expand current knowledge on C- and O-Man glycoproteins and reveal how three distinct biosynthetic pathways (POMT1/POMT2, TMTC1-4, and TMEM260) orchestrate domain-specific O-Man biosynthesis.
Protein O-linked mannose glycosylation (O-Man) at serine (Ser) and threonine (Thr) residues is an essential protein posttranslational modification conserved from bacteria to humans (1, 2, 3, 4). Biosynthesis of O-Man is initiated in the endoplasmic reticulum (ER) lumen by integral transmembrane GT-CA enzymes (5, 6, 7) that utilize lipid-linked dolichol phosphate mannose as donor substrate (8). In yeast, seven protein O-mannosyltransferases (PMT1-PMT7) are known to orchestrate O-Man initiation on at least 25% of the proteins that traffic the ER and secretory pathway (9, 10). The mammalian orthologs POMT1 and POMT2, which are absent in plants, are distinguished from yeast PMTs by their narrow substrate specificities and dedicated functions for O-Man initiation on only a few human proteins, including KIAA1549, SUCO, and α-dystroglycan (α-DG) (11, 12, 13). Mammals have evolved a complex biosynthetic machinery involving at least 17 genes, among which POMT1, POMT2, POMGNT1, POMGNT2, MGAT5B, B3GALNT2, FKTN, FKRP, TMEM5 (RXYLT1), B4GAT1, LARGE1, and LARGE2 participate in the assembly of complex O-Man glycans on α-DG of the dystrophin-associated glycoprotein complex (14, 15, 16). Functional O-Man glycosylation of α-DG is required for interactions with extracellular matrix (ECM) components, including laminin, agrin, and perlecan, which are anchored to the dystrophin-associated glycoprotein complex and the actin cytoskeleton through a complex O-Man polysaccharide known as matriglycan (14).
More recently, we have shown that mammals have evolved multiple biosynthetic pathways for O-Man initiation based on the GT-CA type enzymes TMTC1, TMTC2, TMTC3, TMTC4, and TMEM260 (17, 18, 19, 20). The TMTC1-4 and TMEM260 enzymes, classified in the CAZy database as GT105 and GT117, respectively, serve distinct classes of extracellular immunoglobulin (Ig)-like protein domains found on transmembrane adhesion molecules (cadherins) and receptors (plexins) (17, 18). Unlike α-DG, O-Man on Ig-like domains is not elongated into complex structures but is rather present as single α-linked mannose monosaccharides on β-strands of the Ig-like domains. The TMTC1-4 enzymes are dedicated to the cadherin superfamily of adhesion molecules and selectively glycosylate highly conserved Ser/Thr residues with O-Man in the two β-strands (B and G) of extracellular cadherin (EC) domains (17), which are functional units that direct homo/heterophilic cis- and trans-binding important for cell–cell interactions (21, 22). The recently identified TMEM260 enzyme selectively serves extracellular immunoglobulin, plexin, transcription factor (IPT) domains found among a subset of structurally related plasma membrane receptors, including hepatocyte growth factor receptor (cMET), macrophage-stimulating protein receptor (MST1R or Recepteur d'Origine Nantais, RON), and members of the plexin family (17, 18). The IPT domain is a distinct subclass of the Ig-like fold, and while the functions of its O-Man glycans, located on conserved Ser/Thr residues within β-strands of IPT domains, remain unknown, TMEM260-driven O-Man glycosylation appears to be critical for receptor maturation and epithelial morphogenesis (18).
O-Man glycosylation fulfills important roles during development, and dysregulation of O-Man initiation is associated with severe developmental disorders in humans. Mutations in POMT1/POMT2 genes are linked to a subclass of congenital muscular dystrophies known as α-dystroglycanopathies, characterized by progressive muscular degeneration and developmental abnormalities in brain and eyes (23, 24, 25). Genetic defects in TMTC1-4 are primarily associated with neurological disorders, including brain malformation and hearing loss (26, 27, 28, 29, 30), while bi-allelic mutations in TMEM260 underlie the SHDRA syndrome, characterized by congenital heart defects, kidney phenotypes, and neurodevelopmental disorders (18, 31, 32).
However, the specific contributions of individual biosynthetic enzymes and O-Man initiation pathways towards the modified glycoproteome, as well as the mechanisms underlying substrate selection and potential crosstalk, remain understudied. The lack of understanding is in part due to the widespread and ubiquitous expression of O-Man glycosyltransferases across cell types (33), which precludes studies of individual biosynthetic pathways without interference from isoenzymes or other O-Man glycosyltransferases. A comprehensive investigation of enzyme specificities and their substrate selectivity may thus not only improve the understanding of receptor functions and regulations but also uncover molecular details that explain phenotypes arising from pathway-specific dysfunctions in O-Man glycosylation. In this study, we therefore aimed to dissect the biosynthetic regulation and pathway-specific O-Man initiation for POMT1/POMT2, TMTC1-4, and TMEM260 enzyme families.
We established an improved O-glycoproteomics workflow for targeted analysis of transmembrane O-Man substrates in five human cell lines representing different tissue origins and expanded the human O-Man glycoproteome. Furthermore, we established a panel of genetically engineered human HEK293 cell lines with defined O-Man glycosylation capacities for differential O-Man glycoproteomics. We identify 180 O-Man glycoproteins (of which 67 were not previously described) and dissect individual contributions of POMT1/POMT2, TMTC1-4, and TMEM260 enzymes. We report the largest O-Man glycoproteome to date, with site-specific mapping across a wide range of protein classes involved in cell–cell and cell–ECM interactions, which opens the way for further functional studies of distinct protein classes based on their O-Man glycosylation status.
Experimental Procedures
Experimental Design and Statistical Rationale
This study is based on four data packages (A–D; Figs. 1–6), reported in supplemental Data. Data package A (supplemental Data – Datasets 1–4), aimed at improving the preparative methodology for O-Man glycoproteomics, comprises data collected from experiments in HEK293 WT or similar cells and/or data re-analyzed from our previous study (17) (Fig. 1). Data package B (supplemental Data – Datasets 5–9), aimed at identifying O-Man glycoproteins across five WT cell lines (Figs. 2 and 6), is based on light diethyl-labeled (DEL) samples analyzed independently from each other by shotgun O-Man glycoproteomics. Data package C (supplemental Data – Datasets 10–19) is based on differential glycoproteomic analyses (Figs. 3, 4 and 6) of five individual datasets with light/heavy DEL. Unless otherwise specified, membrane preparations of glycoengineered cell lines were labeled with light isotopes, and control cell lines HEK293 with knock-out (KO) of COSMC and POMGNT1 genes (HEK293 WTSC) were labeled with heavy isotopes. Data package D (supplemental Data – Datasets 20–36) is based on the analyses of purified transmembrane or soluble proteins overexpressed in WT or glycoengineered cells (Fig. 5), labeled with diethyl isotopes before bottom-up analyses. All peptide spectral matches (PSMs) were identified by Proteome Discoverer 1.4 using probability-based scoring (Sequest-HT or MS-Amanda, p < 0.01) and further validated by manual inspection of glycopeptide MS/MS identifications and MS1-level quantifications to ensure the accuracy of the assignments. Unless otherwise stated (Experimental Procedures section), datasets are based on a single shotgun experiment, with one biological and one technical replicate.
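The light/heavy design above is read out per glyco-PSM as log10(Heavy/Light), where a value of 0 indicates equal O-Man capacity in the engineered and control cells. The following is a minimal sketch of that readout, assuming heavy is the control channel and light is the engineered channel. The function names, the example intensities, and the 0.5 tolerance are hypothetical illustrations and are not taken from the study.

```python
import math


def log10_heavy_light(heavy: float, light: float) -> float:
    """Return log10(Heavy/Light) for one glyco-PSM.

    Both intensities must be positive; a PSM quantified in only one
    channel has no finite ratio and needs separate handling.
    """
    if heavy <= 0 or light <= 0:
        raise ValueError("both channel intensities must be positive")
    return math.log10(heavy / light)


def classify(ratio_log10: float, tolerance: float = 0.5) -> str:
    """Label a PSM by its log10(H/L); the tolerance is an illustrative choice."""
    if ratio_log10 > tolerance:
        return "reduced in engineered cells"
    if ratio_log10 < -tolerance:
        return "increased in engineered cells"
    return "unchanged"


# Hypothetical MS1 intensities as (heavy = control, light = engineered).
psms = {"peptide_A": (1.0e6, 1.0e6), "peptide_B": (2.0e6, 2.0e4)}
for name, (heavy, light) in psms.items():
    value = log10_heavy_light(heavy, light)
    print(name, round(value, 2), classify(value))
```

Under these assumptions, a PSM that is lost in the engineered cell line shifts toward positive log10(H/L) values, while canonical substrates of an unaltered pathway stay near 0.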
Fig. 1.
An optimized workflow for O-Man glycoproteomics. A, schematics of the optimized workflow, including crude membrane preparation of cells (CMP), diethyl labeling (DEL), N-glycan removal, BC2L-A LWAC, and nLC-MS/MS. Key steps for the optimized workflow are highlighted in green. B, comparison of glycopeptide enrichment of HEK293 total cell lysate (TCL) and crude membrane preparations (CMP). CMP leads to an increase of both glyco-PSMs and glycoproteins detected. Color code as in panel (D). C, comparison of O-Man data identified in one experiment run with standard workflow in our previous work (17) and the ones with the improved workflow employed in this work. For this information, only dimethyl medium or diethyl heavy channels of the same cell line were considered. The current workflow allows a 2× identification of unique O-Man proteins. D, comparison of O-Man data of panel (C) in terms of unique proteins identified. More than 4× unique proteins were identified exclusively with the optimized compared to the standard workflow.
Fig. 2.
The O-Man glycoproteome of five mammalian cell lines. A, comparison of glyco-PSMs and glycoproteins identified across the five different cell lines analyzed. HEK293 data are more abundant in O-Man glyco-PSMs and glycoproteins, while data from SH-SY5Y cells contain more C-Man proteins. B, gene ontology enrichment analysis for biological processes underlying the proteins identified in the datasets. Cell adhesion appears to be the most enriched term across the O-Man hits identified. C, overview of identified proteins across cell lines. Proteins are divided into groups based on domains (EC, IPT, Other domains, Domain annotation not available) according to the UniProt database (May 2022). Proteins containing EC domains are subdivided according to cadherins/protocadherin classes. Ocher color indicates that the protein has been identified with at least one O-Man-glycosylated residue on the respective annotated domain. Gray signifies that the protein in the respective domain has not been detected with O-Man sugars in the specific cell line.
Fig. 3.
MS analysis of a glycoengineered HEK293 cell panel. A, graphical depiction of glycoengineered cell matrix with cell line identifier on the left and genes targeted by genetic engineering on top (italic). B, schematics for differential O-Man glycoproteomics used in this study, based on the optimized workflow in Figure 1A. C, glyco-PSMs and glycoproteins identified across these five datasets. The bar charts show the total number of identifications (glyco-PSMs and glycoproteins) in each paired analysis of case and control (HEK293SC) cell lines. The number of glyco-PSMs and glycoproteins identified in the paired analysis of HEK293nO-Man thus reflects the glycoproteome of HEK293SC. D, gene ontology enrichment analysis for biological processes in which O-Man proteins identified are involved; terms regarding cell-cell adhesion show the highest enrichment. E, glycoengineered cells where only one pathway is left unaltered show O-Man glycosylation capacity like the HEK293SC control, in line with the previous knowledge on canonical O-Man substrates. Each circle represents an O-Man PSM, and log10(Heavy/Light) = 0 signifies that the O-Man capacity for the specific PSM in the glycoengineered cell line is equal to WT. Only the canonical substrates are colored in each glycoengineered cell line for clarity. HEK293WT cells show O-Man glycosylation capacity like HEK293SC, confirming that the HEK293SC is a suitable control cell line for this experiment. The HEK293nO-Man cell line shows global loss of O-Man glycosylation compared to HEK293SC control. Each circle represents an O-Man PSM, and log10(Heavy/Light) = 0 signifies that the O-Man capacity for the specific PSM in the glycoengineered cell line is equal to that of HEK293SC.
Fig. 4.
Differential O-Man glycoproteome analysis in glycoengineered HEK293 cells. A, O-Man proteins identified with EC, IPT, or domains without annotation. Color code is reported in panel (B) and clarified in detail in the Results section. B, proteins with O-Man identified on other domains, according to UniProt annotations (May 2022). Numbering refers to the list reported in panel (C). C, list of domains identified with O-Man proteins. Numbering refers to proteins identified in panel (B).
Fig. 5.
Analysis of O-Man using recombinantly expressed reporters. Overview of constructs design and proteins expressed in cells with different genetic backgrounds. From top to bottom are the design of the constructs with the protein expressed and relative protein domains (according to UniProt information), the information on which cell line the constructs were expressed in, a representative Coomassie staining for expressed proteins (relative expression levels between different genetic backgrounds are not quantitative), a graphical report on glycosylation status of the proteins according to glycoproteomics data from crude membrane preparations and purified proteins (bottom).
Fig. 6.
C-Man proteins identified across WT and glycoengineered human cell lines. Overview of C-Man proteins identified in the datasets by BC2L-A lectin enrichment. Protein domain annotations were compiled based on UniProt (May 2022). For proteins with the same gene name but corresponding to multiple UniProt accession numbers (e.g., HLA proteins), only a single gene name is reported.
Mammalian Cell Culturing
HEK293 and Caco-2 cells were grown in Dulbecco’s modified Eagle’s medium (Sigma) with 10% fetal bovine serum (Gibco) and 1% GlutaMAX (Gibco). The same medium, with further addition of 1% MEM non-essential amino acid solution (Sigma), was used for culturing HepG2 cells. BG1 cells were cultured in RPMI 1640 medium (Sigma) supplemented with 10% fetal bovine serum (Gibco) and 1% GlutaMAX (Gibco). SH-SY5Y cells were cultured in a 50:50 mixture of Dulbecco’s modified Eagle’s medium (Sigma) and RPMI 1640 (Sigma), supplemented with 10% fetal bovine serum (Gibco) and 1% GlutaMAX (Gibco). For protein production, cells were handled as described below. An overview of cell lines generated and used in this work can be found in supplemental Table 3.
Constructs, Guides Design, and Genetic Engineering
KO cell lines were generated according to published protocols (34, 35). Parental cell lines and guide RNA plasmids used in this study have been described previously (18, 36). Parental HEK293 cells were cotransfected with 1.0 μg each of PBKS-Cas9-2A-eGFP plasmid (Addgene plasmid #68371) and a plasmid encoding the guide RNA (supplemental Table 1) using Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer’s protocol. Cells were enriched by fluorescence-activated cell sorting 48 h later based on the fluorescent signal and single-cell sorted 7 days thereafter. Indel Detection by Amplicon Analysis guided the selection of KO clones, using primers in supplemental Table 2 (34, 35), and selected clones were further validated by Sanger sequencing. For model proteins in Figure 5, constructs for soluble DAG1 (Addgene plasmid #51651), ITGA2 (Addgene plasmid #51910), ITGB1 (Addgene plasmid #51920), ITGA5 (Addgene plasmid #51909), ITGAV (Addgene plasmid #51919), SEMA4D (Addgene plasmid #51827), EPHB1 (Addgene plasmid #51750), and PTPRJ (Addgene plasmid #51816), as well as the plasmid F11R pEBio (Addgene plasmid #61486) used for further cloning, were a gift from Gavin Wright (37) and obtained through Addgene. For the generation of the data in this work, plasmid F11R pEBio (Addgene plasmid #61486) was modified by moving the 726 bp region isolated by digestion, performed with NotI-HF (NEB, cat. #R3189) and AscI (NEB, cat. #R0558) according to manufacturer’s recommendations, onto the vector for PTPRJ (Addgene plasmid #51816), previously digested with the same enzymes, using T4 ligase (NEB, cat. #M0202) according to manufacturer’s protocol. The ligated product was transformed into Stellar competent cells (TaKaRa) and plated on agar plates with carbenicillin (100 μg/ml) for selection. The final plasmid was confirmed by Sanger sequencing. Plasmid for CADH1 was described previously (17). Plasmid for CDH11 was synthesized and cloned by GenScript into the EPB71 vector (Addgene plasmid #90018).
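The NotI/AscI subcloning step can be checked in silico before digestion. The sketch below locates the NotI (GC^GGCCGC) and AscI (GG^CGCGCC) sites in a sequence and reports the top-strand length of the fragment between the two cuts. It treats the sequence as linear, whereas a plasmid is circular, and it uses a short made-up toy sequence rather than the actual F11R pEBio insert. The function names are hypothetical.

```python
# Recognition site and top-strand cut offset (cut occurs after this many bases).
NOTI = ("GCGGCCGC", 2)  # GC^GGCCGC
ASCI = ("GGCGCGCC", 2)  # GG^CGCGCC


def cut_positions(seq: str, enzyme: tuple) -> list:
    """Return top-strand cut positions for every occurrence of the site."""
    site, offset = enzyme
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_between(seq: str, first: tuple, second: tuple) -> int:
    """Top-strand length of the fragment released by a double digest.

    Assumes a linear sequence with exactly one site for each enzyme.
    """
    a = cut_positions(seq, first)
    b = cut_positions(seq, second)
    if len(a) != 1 or len(b) != 1:
        raise ValueError("expected exactly one site per enzyme")
    low, high = sorted((a[0], b[0]))
    return high - low


# Toy sequence: flank + NotI site + 20 nt spacer + AscI site + flank.
toy = "AAAA" + "GCGGCCGC" + "T" * 20 + "GGCGCGCC" + "AAAA"
print(fragment_between(toy, NOTI, ASCI))
```

Applied to the real plasmid sequence, the same check would be expected to return a length consistent with the 726 bp region stated above, provided each enzyme cuts only once.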
Plasmid for the expression of full-length DAG1 with C-terminal 3xFLAG tag was generated by the amplification of DAG1 cDNA sequence using primers in supplemental Table 2 from reverse-transcribed total mRNA from HEK293 cells. mRNA extraction was performed with RNeasy Mini kit (QIAGEN) and reverse-transcription with High-Capacity cDNA Reverse Transcription kit (Applied Biosystems), both according to manufacturers’ protocols. PCR product was ligated with InFusion ligation enzyme (TaKaRa) on EPB71 backbone (Addgene plasmid #90018) previously linearized by PCR. Ligation products were plated on Kanamycin-containing Agar plates (50 μg/ml) and final plasmids confirmed by Sanger Sequencing, highlighting that DAG1 gene presents the natural variant S14W (UniProt: VAR_024335, dbSNP: rs2131107). Stable expression of DAG1 full length construct was performed by zinc finger nuclease KI on AAVS1 safe harbor locus as previously described (18). Briefly, parental HEK293 cells were cotransfected with 3 μg of donor plasmid and 1.5 μg each of ZH001C and ZH001D using Lipofectamine 3000 (Thermo Fisher Scientific), according to manufacturer’s protocol. eGFP- and Crimson-positive cells were fluorescence activated cell sorting-enriched 48 h after transfection and single-cell sorted 7 days thereafter. Knock-in clones were identified by Junction PCR with primers and protocols as performed previously (18) and validated by Western blotting for 3xFLAG tag (Antibody M2, Sigma).
SDS-PAGE and Western Blotting
For SDS-PAGE separation of proteins, samples were mixed with 4× NuPAGE LDS Sample Buffer (Invitrogen), reduced with 5 mM DTT at 60 °C, and alkylated with 10 mM 2-iodoacetamide (IAA) at room temperature (RT) in darkness. Proteins were separated on pre-cast 10-well NuPAGE 4 to 12%, Bis-Tris, 1.0 to 1.5 mm, Mini Protein Gels (Invitrogen) using MES buffer. Prestained Protein Ladder – Broad molecular weight (10–245 kDa) (Abcam, ab116028) was used as molecular marker. Gels were stained with InstantBlue Coomassie Protein Stain (ISB1L) (Abcam, ab119211) and destained with water. For Western blotting, SDS-PAGE–resolved proteins were transferred to an ethanol-activated polyvinylidene fluoride membrane using a Tris-glycine transfer buffer (25 mM Tris pH 8.0 at RT, 192 mM glycine, 10% ethanol in water). Membranes were blocked for 30 min at RT with blocking milk, made with 5% skim-milk powder diluted in 20 mM Tris pH 7.4, 150 mM NaCl, 0.1% Tween-20 (TBS-T), and incubated for 1 h at RT with HRP-conjugated mouse anti-FLAG (M2) antibody (Sigma) diluted 1:4000 in blocking milk. After 3× washes in TBS-T (5 min each), membranes were developed with SuperSignal West Pico PLUS chemiluminescent substrate (Thermo Fisher Scientific) using ImageQuant LAS 4000 (GE Healthcare). Membranes were stripped with 1× ReBlot Plus Strong Antibody Stripping Solution (#2504, Merck Millipore), blocked for 30 min in blocking milk, and incubated overnight at 4 °C with anti-β-actin (Santa Cruz) diluted 1:1000 in blocking milk. After 3× washes in TBS-T, a 1:4000 dilution of anti-mouse HRP-conjugated secondary antibody (DAKO) in blocking milk was applied to the membrane for 1 h at RT. Membranes were then washed and imaged as above.
Glycoproteomic Samples Preparations
For glycoproteomics analyses (optimized workflow), cells were cultured in 7×T175 flasks to subconfluence and harvested by scraping in cold PBS. Cells were permeabilized in 150 mM NaCl, 50 mM Hepes pH 7.4, 25 μg/ml Digitonin (Sigma) and membranes further solubilized in 150 mM NaCl, 50 mM Hepes pH 7.4, 1% NP-40 according to Holden et al., 2009 (38). Membrane proteins were precipitated with 4× volumes of acetone overnight at −20 °C and solubilized by sonication in 1 ml of 1 mg/ml RapiGest (Waters) in 50 mM ammonium bicarbonate buffer. Samples were reduced (5 mM DTT, 30 min at 70 °C), alkylated (10 mM IAA, 30 min at RT in darkness), and unreacted IAA was quenched with additional 5 mM DTT. Protease digestion was performed overnight with 25 μg of sequencing-grade modified trypsin (Roche) per sample, and tryptic peptides were desalted using C18 Sep-Pak Cartridges (Waters). DEL of Speedvac-desiccated peptides was performed essentially as described in Jung et al., 2019 and Larsen et al., 2023 (18, 39). For differential glycoproteomics, labeled peptides were mixed 1:1 according to ratiocheck analyses by mass spectrometry. Peptide mixtures were dried using Speedvac before treatment with 10 U EndoH (NEB) in sodium acetate buffer, pH 5.5 overnight, and subsequently with 15 U peptide-N-glycosidase F (Sigma) in Tris buffer, pH 8.6 overnight at 37 °C with constant agitation at 750 RPM.
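The 1:1 mixing step relies on a ratiocheck, but the conversion from the measured ratio to pipetting volumes is not spelled out above. One plausible way to do it is sketched here, assuming that an equal-volume test mix was measured and that the median heavy/light ratio of the quantified peptides reflects the relative concentration of the two samples. The function name and the example ratios are hypothetical.

```python
from statistics import median


def mixing_volumes(heavy_light_ratios, light_volume_ul):
    """Return (light_ul, heavy_ul) expected to give a 1:1 peptide mix.

    heavy_light_ratios: per-peptide H/L ratios measured in an
    equal-volume test mix (the ratiocheck run).
    """
    ratio = median(heavy_light_ratios)
    if ratio <= 0:
        raise ValueError("median H/L ratio must be positive")
    # If heavy is `ratio` times more concentrated, use 1/ratio of its volume.
    return light_volume_ul, light_volume_ul / ratio


# Hypothetical ratiocheck: heavy sample is about twice as concentrated.
light_ul, heavy_ul = mixing_volumes([1.9, 2.0, 2.1, 2.0], 100.0)
print(light_ul, heavy_ul)
```

The median is used rather than the mean so that a few strongly regulated or poorly quantified peptides do not skew the correction.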
BC2L-A Beads and Column Preparation
BC2L-A lectin was produced using a pRSET-A construct encoding the full-length BC2L-A with an N-terminal 6xHis tag (40). The construct was electroporated into Rosetta 2 Escherichia coli competent bacteria (Novagen), and transformed cells were selected on agar plates using carbenicillin (100 μg/ml) and chloramphenicol (34 μg/ml). A 1 L culture of selected bacteria (OD600∼0.7) was induced for 3 h with 0.1 mM IPTG at 37 °C. Bacteria were harvested, washed in PBS, and lysed with TES buffer (200 mM Tris–HCl, pH 8.0, 0.5 mM ethylenediaminetetraacetic acid (EDTA), pH 8.0, 500 mM Sucrose) supplemented with 1 mM phenylmethylsulfonyl fluoride and 3 μg/ml pepstatin A. Lysate was clarified by centrifugation at 27,000g and supplemented with NaCl at a final concentration of 150 mM. Ni-NTA Agarose (Qiagen) was added to the clarified lysate for 30 min and moved to stationary column. Agarose was washed with 6 column-volumes (CVs) of 20 mM sodium phosphate, pH 8.0, 900 mM NaCl, with 6 CVs of 20 mM sodium phosphate, pH 8.0, 150 mM NaCl, 10 mM imidazole, pH 8.0, and eluted with 20 mM sodium phosphate, pH 8.0, 150 mM NaCl, 250 mM imidazole, pH 8.0. Purified lectin was assessed by SDS-PAGE, buffer exchanged to 50 mM Hepes, 150 mM NaCl, 1 mM CaCl2 using Zeba Spin Desalting Columns, 7K MWCO (Thermo Fisher Scientific), and conjugated to Pierce NHS-Activated Agarose Dry Resin (Thermo Fisher Scientific) according to manufacturer’s protocol. A 3.5 m LWAC column was prepared as previously (18, 40), equilibrated with 25 mM Tris pH 7.4 at 4 °C, 150 mM NaCl, 0.1 mM CaCl2, and maintained at 4 °C.
Lectin Weak Affinity Chromatography
Digested samples were diluted to 1 ml with BC2L-A running buffer (25 mM Tris pH 7.4 at 4 °C, 150 mM NaCl) containing 0.1 mM CaCl2 and filtered through 0.22 μm PVDF filters before lectin weak affinity chromatography (LWAC). A 3.5-m column with BC2L-A agarose beads was used for chromatographic separation at 100 μl/min using an Äkta purifier (GE Healthcare) at 4 °C. The BC2L-A LWAC column was washed with running buffer until the UV signal (A210) was lower than 5 mAU. Glycosylated peptides were eluted with a 20 mM EDTA solution in running buffer. Elution fractions were acidified with trifluoroacetic acid, and EDTA was precipitated by a freeze-thaw cycle and removed by centrifugation. Before MS/MS analysis, samples were stage tip-desalted (Empore disk-C18, 3M).
Expression and Purification of Model Proteins
HEK293 adherent cell lines with selected genetic background were seeded on 10-cm dishes previously coated with poly-L-lysine (Sigma, 0.1 mg/ml). Ten micrograms of plasmid were transfected on each 10-cm dish using 30 μg PEI MAX, both diluted in OptiMEM (Gibco). Media were changed the following day to F17 Expression Medium (Gibco) with 2% GlutaMax or OptiMEM (Gibco), and supernatants were harvested 5 days thereafter. Harvested media were filtered to remove particles and used for Ni-NTA purification. For CADH1 and CDH11, 0.5 ml Ni-NTA Agarose slurry (QIAGEN) for each 50 ml of supernatant was equilibrated in 1× binding buffer (20 mM Tris, pH 8.0, 500 mM NaCl, 3 mM CaCl2, 5 mM imidazole), added to supernatants supplemented to a final concentration of 20 mM Tris, pH 8.0, 500 mM NaCl, 3 mM CaCl2, 5 mM imidazole, and incubated overnight at 4 °C in constant rotation. Beads were then moved to a column and washed once with five CVs of binding buffer and a similar volume of wash buffer (20 mM Tris, pH 8.0, 500 mM NaCl, 3 mM CaCl2, 20 mM imidazole), and eluted with 2.5 CVs of elution buffer (20 mM Tris, pH 8.0, 500 mM NaCl, 3 mM CaCl2, 200 mM imidazole). For the remaining proteins, 0.5 ml Ni-NTA agarose slurry (QIAGEN) for each 50 ml of supernatant was equilibrated in 1× binding buffer (50 mM sodium phosphate buffer pH 8.0 at RT, 300 mM NaCl, 10 mM imidazole), added to supernatants supplemented to a final concentration of 50 mM sodium phosphate buffer pH 8.0 at RT, 300 mM NaCl, 10 mM imidazole, and incubated overnight at 4 °C in constant rotation. Beads were moved to a column and washed once with five CVs of wash buffer (50 mM sodium phosphate buffer pH 8.0 at RT, 500 mM NaCl, 20 mM imidazole) and eluted with 2.5 CVs of elution buffer (50 mM sodium phosphate buffer pH 8.0 at RT, 150 mM NaCl, 300 mM imidazole). Fractions were inspected by SDS-PAGE for purified proteins before further analyses.
Samples were buffer exchanged with Zeba Spin columns (Thermo Fisher Scientific) to 5 mM Hepes pH 7.4, 5 mM NaCl and stored at −80 °C. All proteins in Figure 5A are expressed independently, except for ITGA2, ITGA5, and ITGAV, which are co-expressed with ITGB1, as recommended in the original study using these constructs (37). This was done by cotransfecting equal amounts of plasmids (5 μg of ITGA2, ITGA5, or ITGAV plasmid together with 5 μg of ITGB1 plasmid) in each plate and performing purification as above. For full-length DAG1, purification was performed using Dynabeads conjugated in-house with anti-FLAG M2 monoclonal antibody (Sigma-Aldrich). Bead conjugation was performed with a procedure adapted from previous protocols (41). Briefly, anti-FLAG M2 monoclonal antibody (Sigma-Aldrich) was conjugated to Epoxy Dynabeads M-270 (Invitrogen) at a ratio of 20 μg per 1 mg beads in 0.1 M Na-Phosphate buffer, pH 7.4 with gradual addition of ammonium sulfate to a final concentration of 1 M. Conjugation was performed overnight at 30 °C in gentle rotation. Beads were washed with PBS and PBS+0.5% Triton-X100 and stored in 10% glycerol, 0.02% NaN3 at a final concentration of 150 μg beads/μl. Conjugated beads were washed in borate buffer (pH 8.5) and crosslinked using a 20 mM solution of dimethyl pimelimidate (Thermo Fisher Scientific) in borate buffer (pH 8.5) for 2 h. Beads were washed with 100 mM Tris pH 8.0 and PBS and used for affinity enrichment of DAG1. DAG1-expressing cells were lysed in a cell pellet:lysis buffer ratio of 1:2 using 50 mM Hepes pH 7.4, 50 mM KOAc, 2 mM MgCl2, 300 mM NaCl, 0.5% Triton X-100, 0.5% Tween-20. Input lysate was cleared by centrifugation at 21,000g and incubated for 1 h in rotation at 4 °C with anti-FLAG beads, pre-cleared in lysis buffer. Beads were washed in lysis buffer lacking Triton and eluted with 10% SDS (55 μl) for 20 min in agitation.
Approximately 10% of the elution volume was used for WB analysis, 45% was used for SDS-PAGE analysis to confirm absence of contaminants, and the remaining 45% for S-Trap digestion and proteomics analysis, as below.
S-Trap Digestion of Model Proteins
Ten microliters of purified proteins were reduced with 10 mM DTT and alkylated with 20 mM IAA. Unreacted IAA was quenched with the addition of 10 mM DTT, and SDS was added to a final concentration of 5%. Digestion was performed using Micro S-Trap columns (ProtiFi) following the manufacturer’s S-Trap micro high recovery quick card 2, using ammonium bicarbonate buffer instead of TEAB. Digestion was performed overnight at 37 °C using modified trypsin or chymotrypsin (only for FCGRIIIA). Eluted samples were desalted with in-house packed Stage tips (Empore disk-C18, 3M) before DEL. Labeled peptides from soluble constructs were mixed according to Nanodrop measurements (absorbance at 205 nm) and desalted before mass spectrometry analysis. Peptides from full-length DAG1 purified from total cell lysates were labeled with DEL, desalted, and analyzed individually using mass spectrometry (supplemental Fig. S4).
Mass Spectrometry Analyses
Samples were analyzed using mass spectrometry following protocols implemented in our previous work (18) with slight modifications. Peptides and glycopeptides were dissolved in 0.1% formic acid and injected using an EASY-nLC 1000 or -nLC 1200 (Thermo Fisher Scientific) system coupled to a Fusion Tribrid or Fusion Tribrid Lumos mass spectrometer (Thermo Fisher Scientific), respectively. The nLC systems utilized a single analytical column packed with Reprosil-Pure-AQ C18 phase for sample separation. For glycoproteome analyses, gradient elution was performed using solvent A (0.1% formic acid) and solvent B (acetonitrile with 0.1% formic acid), with a stepwise increase from 5% to 20% B over 95 min, followed by a gradient from 20% to 80% B over 10 min, and a final hold at 80% B for 15 min. For bottom-up analyses of S-Trap–digested samples, a similar gradient elution was employed, starting from 2% to 25% B over 65 min, followed by a gradient from 25% to 80% B over 10 min, and a final hold at 80% B for 15 min. For ratiochecks, a similar gradient elution was employed, starting from 2% to 25% B over 95 min, followed by a gradient from 25% to 80% B over 10 min, and a final hold at 80% B for 15 min. Mass spectrometry analysis included precursor MS1 scans (m/z 355–1700) acquired at a resolution of 120,000. Subsequently, Orbitrap higher-energy collisional dissociation (HCD)-MS/MS and electron-transfer/collision-induced dissociation (ETciD)-MS/MS were performed on multiply charged precursors (z = 2–6). Only HCD fragmentation was performed for ratiocheck samples. Data-dependent fragmentation events were triggered by a minimum MS1 signal threshold of 10,000 to 50,000 ions. The MS2 spectra were acquired at a resolution of 60,000 for both HCD and ETciD methods. The comparison of HEK293SC versus HEK293nO-Man was analyzed on Fusion Tribrid Lumos and Fusion Tribrid instruments.
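The three gradient programs described above share the same shape (ramp, steep ramp, hold) and can be written as lists of segments. The sketch below encodes the glycoproteome and bottom-up gradients and returns the solvent B percentage at a given time. It assumes linear interpolation within each segment, which is a simplification since the first segment is described as a stepwise increase. The names are hypothetical.

```python
def percent_b(t_min, segments, start_percent):
    """Return %B at time t_min for a gradient made of consecutive segments.

    segments: list of (duration_min, end_percent) tuples.
    Assumes a linear change within each segment (illustrative only).
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    current = float(start_percent)
    elapsed = 0.0
    for duration, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return current + fraction * (end - current)
        elapsed += duration
        current = float(end)
    raise ValueError("time is beyond the end of the gradient")


# Glycoproteome runs: 5 -> 20% B (95 min), 20 -> 80% B (10 min), hold 80% (15 min).
GLYCO = [(95, 20), (10, 80), (15, 80)]
# Bottom-up runs: 2 -> 25% B (65 min), 25 -> 80% B (10 min), hold 80% (15 min).
BOTTOM_UP = [(65, 25), (10, 80), (15, 80)]

print(percent_b(100, GLYCO, 5))                  # midway through the steep ramp
print(sum(duration for duration, _ in GLYCO))    # total run time in minutes
```

Writing the programs this way makes the total run times explicit: 120 min for glycoproteome analyses and 90 min for bottom-up analyses.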
Mass Spectrometry Data Search
Mass spectrometric data (.raw files) were processed with the Proteome Discoverer (PD) 1.4 software (Thermo Fisher Scientific, https://www.thermofisher.com/dk/en/home/industrial/mass-spectrometry/liquid-chromatography-mass-spectrometry-lc-ms/lc-ms-software/multi-omics-data-analysis/proteome-discoverer-software.html) using the Sequest HT or MS-Amanda nodes. Individual .raw files from total cell lysates and crude membrane preparations (Data packages A–C, supplemental Data – Datasets 1–19) were searched against a .fasta file containing the canonical human proteome (n = 23,473 entries) downloaded (January 2013) from the UniProtKB database (http://www.uniprot.org/) (42). Data package D (supplemental Data – Datasets 20–36) from purified model proteins was searched against a .fasta file containing the sequences of all model proteins in this study (available in supplemental Data). Detailed information on .raw files and processing is available through the ProteomeXchange database (see Data Availability section). Parameters for data processing are summarized in supplemental Table 4. Fragment ion mass tolerance was set to 0.02 Da, and up to n = 2 missed trypsin cleavages (both full- and semi-specific) were allowed during each search. Peptide confidence levels were determined using the Target Decoy PSM Validator node, and only identifications with high confidence (p < 0.01) were considered. A filter was applied to retain only identifications with Peptide Confidence = High, Search Engine Rank = 1, and Peptide Mass Deviation within 7.0 ppm. Data from the comparison of HEK293SC versus HEK293nO-Man acquired on the Fusion Tribrid alone were processed as above for Figure 1, D and E, while data from both the Fusion Tribrid Lumos and Fusion Tribrid instruments were analyzed jointly in Proteome Discoverer for Figures 3, 4 and 6 and supplemental Figures.
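The post-search filter described above can be illustrated with a minimal sketch. It is not the Proteome Discoverer implementation: the record layout and the field names (`confidence`, `rank`, `observed_mz`, `theoretical_mz`) are hypothetical, and reading the mass deviation setting as an absolute deviation of at most 7.0 ppm is an assumption.

```python
# Minimal sketch of the PSM filter: confidence High, search engine rank 1,
# and precursor mass deviation within 7.0 ppm (absolute value; an assumption).
# Field names and example values are hypothetical.

def ppm_deviation(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_filter(psm, max_ppm=7.0):
    """Return True if a PSM record satisfies all three criteria."""
    deviation = ppm_deviation(psm["observed_mz"], psm["theoretical_mz"])
    return (
        psm["confidence"] == "High"
        and psm["rank"] == 1
        and abs(deviation) <= max_ppm
    )


# Illustrative records only; none of these values come from the study.
psms = [
    {"id": "a", "confidence": "High", "rank": 1,
     "observed_mz": 1000.005, "theoretical_mz": 1000.0},   # about 5 ppm
    {"id": "b", "confidence": "High", "rank": 1,
     "observed_mz": 1000.010, "theoretical_mz": 1000.0},   # about 10 ppm
    {"id": "c", "confidence": "Medium", "rank": 1,
     "observed_mz": 1000.000, "theoretical_mz": 1000.0},
    {"id": "d", "confidence": "High", "rank": 2,
     "observed_mz": 1000.000, "theoretical_mz": 1000.0},
]

kept = [p["id"] for p in psms if passes_filter(p)]
```

In this toy input only record `a` meets all three criteria; the others fail on mass deviation, confidence, or rank, respectively.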
To ensure data quality and omit potential false positives that pass the p < 0.01 filtering by Proteome Discoverer 1.4, each glycopeptide PSM was manually inspected for accurate assignment, including MS1 mass accuracy (±7 ppm), monoisotopic peak assignment, MS1 peak picking for quantification, fragment ion assignment (b-, y-, c-, z-, and Y-ions in MS2), fragment ion isotope pattern, and fragment ion charge state. Glyco-PSMs that failed to meet the requirements above were excluded from the final result output (supplemental Data).
Mass Spectrometry Data Processing and Visualization
Manually inspected and validated spectral matches from glycoproteome data were exported from Proteome Discoverer and further processed in Microsoft Excel. To perform domain annotations for glycosylated sites, we relied on the “Domains” annotation provided by UniProt (May 2022) for categorization purposes, with slight modifications for laminins, integrins, and DAG1. Laminin domains annotated as “Domain I” and “Laminin Domain I” are both classified as “Laminin Domain I”; laminin “Domain II” is annotated as “Laminin Domain II”. The domain annotated as “Laminin Domain II and I” is reported separately in figures but is not counted as a separate individual domain (for instance, in the domain count in supplemental Fig. S7); for putative specificity, it follows the information obtained from PSMs in “Laminin Domain I” and “Laminin Domain II”. For integrins, regions with no annotated domain are reported with the heavy and light chain annotation, according to UniProt data. For DAG1, regions of the protein are reported according to UniProt information and previous literature (13). Both integrin and DAG1 regions are included in the count of individual domains (i.e., in supplemental Fig. S7). “Domain annotation not available” indicates a region of the protein for which there is no domain annotation in the UniProt database. Gene names are annotated according to Proteome Discoverer output, with the only exception of the protein C14orf166B, reported with its updated (August 2023) gene name LRRC74A. For proteins with the same gene name but corresponding to multiple UniProt accession numbers (e.g., HLA proteins), only a single gene name is reported according to UniProt information (August 2023). Analogously, for UniProt accession numbers corresponding to multiple gene names/isoforms (e.g., HLA proteins), only the main isoform according to the UniProt database is reported. PSMs are classified as O- or C-Man according to the “Modification” column.
To ensure accuracy of reported data, all glyco-PSMs containing “W” were manually inspected to verify site-specific assignments by HCD and/or ETciD, particularly for peptides whose sequence contains Ser/Thr proximal to a Trp residue. The glycan type of glyco-PSMs with insufficient MS2 data to support site assignment was categorized as “Ambiguous.” All inspected PSMs (including those from detected contaminant bovine proteins and those with any value in the column “# Protein Groups”) were used for PSM counts. A PSM carrying both types of sugar linkage counts as an “O- and C-Man glyco-PSM” and is excluded from the counts for “O-Man glyco-PSM” and “C-Man glyco-PSM.” For protein counts, only proteins with “# Protein Groups” = 1 were counted, excluding ambiguously identified proteins. A protein is a C- or O-Man protein if it carries the respective type of sugar on at least one PSM from the inspected list. A protein with at least one O-Man PSM and at least one C-Man PSM is considered an “O- and C-Man protein” and is excluded from the counts for “O-Man protein” and “C-Man protein.”
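The counting rules in this paragraph can be summarized in a short sketch. The counting in this study was carried out in Microsoft Excel, so the code below is only an illustration; the record fields (`accession`, `protein_groups`, `o_man`, `c_man`) and the placeholder accessions are hypothetical, and representing an “Ambiguous” glycan type as a PSM with neither linkage flagged is an assumption.

```python
from collections import Counter, defaultdict


def psm_class(psm):
    """Assign one PSM to exactly one counting class."""
    if psm["o_man"] and psm["c_man"]:
        return "O- and C-Man"
    if psm["o_man"]:
        return "O-Man"
    if psm["c_man"]:
        return "C-Man"
    return "Ambiguous"  # assumption: no linkage could be assigned


def count_psms(psms):
    """PSM counts use every inspected PSM, whatever '# Protein Groups' is."""
    return Counter(psm_class(p) for p in psms)


def count_proteins(psms):
    """Protein counts use only PSMs with '# Protein Groups' = 1."""
    linkages = defaultdict(set)
    for p in psms:
        if p["protein_groups"] != 1:
            continue  # ambiguously identified protein
        if p["o_man"]:
            linkages[p["accession"]].add("O-Man")
        if p["c_man"]:
            linkages[p["accession"]].add("C-Man")
    counts = Counter()
    for types in linkages.values():
        if types == {"O-Man", "C-Man"}:
            counts["O- and C-Man"] += 1
        else:
            counts[next(iter(types))] += 1
    return counts


# Placeholder records; accessions P1..P4 are not real identifiers.
psms = [
    {"accession": "P1", "protein_groups": 1, "o_man": True, "c_man": False},
    {"accession": "P1", "protein_groups": 1, "o_man": False, "c_man": True},
    {"accession": "P2", "protein_groups": 1, "o_man": True, "c_man": True},
    {"accession": "P3", "protein_groups": 1, "o_man": True, "c_man": False},
    {"accession": "P3", "protein_groups": 1, "o_man": False, "c_man": False},
    {"accession": "P4", "protein_groups": 2, "o_man": True, "c_man": False},
]

psm_counts = count_psms(psms)
protein_counts = count_proteins(psms)
```

Here P1 becomes an “O- and C-Man protein” through two separate PSMs, whereas P4 contributes to the PSM count but not to the protein count.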
For Figures 2C, 4, and 6, O-Man and C-Man PSMs were isolated and presented separately. For this protein visualization, only proteins with “# Protein Groups” = 1 were considered, excluding ambiguously identified proteins. PSMs were categorized into ratio groups, with each PSM assigned to only one ratio category. For each gene name, one entry was kept for each combination of domain on which the sugar was identified and ratio category, using the “Remove Duplicates” function in Excel. Proteins with more than one ratio category for a given domain were collapsed into one entry whose ratio category is the merge of the two (e.g., 10≤ x <100 and x ≥100 becomes x ≥10). In the case of two nonadjacent groups (e.g., x ≥100 and 0.1≤ x <10), the corresponding square of the heatmap has a double color. The same procedure was applied to O-Man and C-Man entries. A ratio difference of 10 was used as the threshold for a difference between case and control, based on ratiocheck results (supplemental Fig. S2) and the calculations mentioned in the Results section.
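The collapsing of ratio categories can be sketched as follows. The bin edges 0.1, 10, and 100 are taken from the categories named in this paragraph; the existence of a lowest bin (x < 0.1), the merging of more than two adjacent bins, and all function names are assumptions made for illustration, since the original processing was done in Excel.

```python
# Bin edges taken from the categories named in the text; the lowest bin
# (x < 0.1) is an assumption.
EDGES = [0.1, 10.0, 100.0]


def ratio_bin(x):
    """Index of the ratio category: 0 (x < 0.1) up to 3 (x >= 100)."""
    return sum(x >= edge for edge in EDGES)


def label(lo_bin, hi_bin):
    """Text label for a run of adjacent bins lo_bin..hi_bin."""
    lo = EDGES[lo_bin - 1] if lo_bin > 0 else None
    hi = EDGES[hi_bin] if hi_bin < len(EDGES) else None
    if lo is None and hi is None:
        return "any x"
    if lo is None:
        return f"x < {hi:g}"
    if hi is None:
        return f"x >= {lo:g}"
    return f"{lo:g} <= x < {hi:g}"


def merge_bins(bins):
    """Collapse the bins seen for one gene/domain pair.

    Adjacent bins are merged into one label; nonadjacent bins give
    several labels (a double-colored heatmap square).
    """
    runs = []
    for b in sorted(set(bins)):
        if runs and b == runs[-1][1] + 1:
            runs[-1][1] = b
        else:
            runs.append([b, b])
    return [label(lo, hi) for lo, hi in runs]


adjacent = merge_bins([ratio_bin(50), ratio_bin(500)])
nonadjacent = merge_bins([ratio_bin(1), ratio_bin(500)])
```

With these inputs, `adjacent` collapses to a single label and `nonadjacent` keeps two labels, corresponding to the merged and double-colored cases described above.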
In Figure 2, specifically for the “Other domains” category, if a protein has glycosylation on more domains classified as “Other domains,” it is represented graphically only once. Conversely, in Figure 4, to depict the specificity of each enzyme family on different domains, the protein is represented multiple times corresponding to the number of domains within each respective category of the heatmap.
For ratiocheck data processing and visualization, only PSMs with Quant Usage = “Used” were considered in the calculations. For calculation purposes, ratios equal to 0 were substituted with 0.00001.
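A minimal sketch of this preprocessing step is given below. The substitution value and the Quant Usage criterion come from the text; the log10 transformation is an assumption about why a nonzero substitute is needed, and the field names are hypothetical.

```python
import math

ZERO_SUBSTITUTE = 0.00001  # value stated in the text


def log10_ratios(psms):
    """Log-transform ratios of PSMs flagged as used for quantification.

    Ratios equal to 0 are replaced by ZERO_SUBSTITUTE so that the
    logarithm is defined (the log scale itself is an assumption).
    """
    values = []
    for p in psms:
        if p["quant_usage"] != "Used":
            continue
        ratio = p["ratio"] if p["ratio"] != 0 else ZERO_SUBSTITUTE
        values.append(math.log10(ratio))
    return values


# Illustrative records only.
example = [
    {"quant_usage": "Used", "ratio": 0},
    {"quant_usage": "Not Used", "ratio": 1.0},
    {"quant_usage": "Used", "ratio": 100.0},
]
transformed = log10_ratios(example)
```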
Data and figures for heatmaps, barplots, the ratiocheck distribution histogram, the upset plot, and the lollipop plot were processed and generated using R with the packages ggplot2 and UpSetR, among others, and further modified using Adobe Illustrator. Data intersections for Venn diagrams and barplots were performed using the online tool InteractiVenn (43) or DeepVenn (44) and further processed offline.
Gene ontology enrichment analyses were performed using ShinyGO 0.77 (45) with the human pathway database GO Biological Process. No background list was uploaded; by ShinyGO default, the gene list was therefore compared with all protein-coding genes in the genome as background. The false discovery rate cutoff was set at 0.05, “# pathways to show” was set to 10, and pathway size was restricted to between 2 and 2000, with redundancy removal and pathway abbreviation enabled. Graphs were made with the same online tool by sorting pathways by fold enrichment (x-axis), coloring by −log10(false discovery rate), and using the gene number as the size of the terminal circles.
supplemental Fig. S1 was made using data from studies in references (13, 17, 18, 19, 40). The 175 unambiguous and unique proteins identified were compiled after manual inspection following parameters in this Experimental Procedures section. Note that protein with Accession Number P01889 reported as O-Man protein in (40) has been excluded for these criteria. List of proteins with transmembrane region was acquired in August 2023 via UniProt using the filters (taxonomy_id:9606) AND (keyword:KW-0812) AND (reviewed:true). Proteins with Signal Peptide were acquired likewise with the filters (taxonomy_id:9606) AND (keyword:KW-0732) AND (reviewed:true).
supplemental Fig. S7 was made using STRING (46). All 241 O-Man proteins unambiguously (and manually inspected, following the parameters from this paragraph) identified in this study and in previous studies (13, 17, 18, 19, 40) were searched using STRING online tool against human proteome. Note that for this visualization, the data from the test experiment between crude membrane protein (CMP) and total cell lysate (TCL) (Fig. 1B) were included. The full STRING network is visualized, with network clustered using k-means clustering (8 clusters) in STRING and leaving all the parameters as default. Coloring is made in STRING; proteins with SMART domain identifier SM00112 (Cadherin repeats) are colored in red and with SMART domain identifier SM00429 (Ig-like, plexins, transcription factors) are in blue. The image was further modified using Adobe Illustrator. Two additional accession numbers reported in previous studies but not identified in STRING databases are manually added at the bottom of the figure.
Results
An Optimized Workflow for O-Man Glycoproteomics
Sensitive glycoproteomics largely hinges on optimized preparation, efficient glycopeptide enrichment, and precise quantification of the analyzed sample(s) (47). We first aimed to improve these three key steps in our established workflow (13, 17) for identification and quantification of O-Man proteins and establish an optimized workflow for O-Man glycoproteomics. Previous glycoproteomics studies have demonstrated that the majority (72%) of proteins undergoing O-Man glycosylation consist of single-pass or multi-pass transmembrane proteins (13, 17, 18, 19, 40) (supplemental Fig. S1A). To improve sensitivity in O-Man glycoproteomics analysis, we therefore envisioned that CMP (38) would enrich the transmembrane subproteome and exclude abundant cytosolic proteins that are typically extracted in TCL. We separately analyzed HEK293 TCL and CMP samples by detergent-assisted protein extraction, proteolytic digestion, and lectin enrichment of O-Man glycopeptides (Fig. 1A). Bottom-up analyses of HEK293 cells subjected to CMP extraction identified 520 O-Man glyco-PSMs originating from 54 O-Man glycoproteins, while only 275 O-Man glyco-PSMs from 39 unique O-Man glycoproteins were detected in the same sample subjected to TCL extraction (Fig. 1B), thus demonstrating that CMP extraction improves analytical sensitivity for O-Man glycoproteomics.
Second, we sought to improve O-Man glycopeptide enrichment from complex mixtures by using the α-mannose–specific Burkholderia cenocepacia lectin A (BC2L-A) (48, 49). BC2L-A, a 28 kDa dimeric and Ca2+-dependent lectin capable of binding both O-Man and C-linked mannose glycans on tryptophan residues (C-Man), has emerged as an alternative to concanavalin A (ConA) for improved enrichment of glycopeptides by LWAC (40). Moreover, BC2L-A was found to capture O-Man glycopeptides in ConA-LWAC flow-through fractions, clearly indicating that BC2L-A is a better affinity reagent for O-Man glycoproteomics compared to ConA (40). Therefore, we chose to substitute ConA with the BC2L-A lectin in our LWAC procedure for improved O-Man glycopeptide enrichment.
Furthermore, we assessed the approach for differential labeling of peptides by stable isotopes. In our most recent study (18), we employed DEL for two-plex experiments, which is a cost-effective, flexible, and readily available method that utilizes 13C isotopologues of acetaldehyde, as an alternative to dimethyl labeling (39). In agreement with previous studies (39), the use of DEL improved quantification accuracy and retention of short hydrophilic O-Man glycopeptides on C18 columns (supplemental Fig. S1B). DEL of peptides has also been reported to enhance ionization of hydrophilic peptides (39), thus improving the number of identifications in bottom-up analyses. We therefore also employed light DEL in one-plex experiments where no quantification was necessary, to improve the sensitivity in our glycoproteomic workflows.
Taken together, the optimized workflow combining CMP, DEL, and BC2L-A enrichment (Fig. 1A) enabled significant improvements in bottom-up O-Man glycoproteomics, with a 2-fold increase in the total number of O-Man proteins identified in comparison to the standard method used in our previous studies (17) (Fig. 1C). The optimized workflow (Fig. 1D) allowed identification of 60 O-Man glycoproteins in total (n = 38 exclusive) compared to the standard workflow (17), which identified 30 O-Man glycoproteins (n = 8 exclusive), clearly demonstrating the overall improved sensitivity for O-Man glycoproteomics.
An Expanded O-Man Glycoproteome in Five Human Cell Lines
We analyzed five human cell lines derived from different tissues, including BG1 (ovarian adenocarcinoma), CaCo-2 (colorectal adenocarcinoma), HEK293 (embryonic kidney), HepG2 (hepatocellular carcinoma), and SH-SY5Y (neuroblastoma), using the optimized glycoproteomics workflow described above (Fig. 1A). CMP tryptic digests from each cell line were labeled with light DEL before O-Man glycopeptide enrichment by BC2L-A LWAC and mass spectrometry. HCD/ETciD-based bottom-up analyses identified 3882 glyco-PSMs from 120 unique O-Man proteins (Fig. 2A), 33 of which have not been described as O-Man substrates before (13, 17, 18, 19, 40). Notably, we identified 79 O-Man glycoproteins in HEK293 cells alone, which represents a two-fold improvement compared to our standard workflow (17), further demonstrating the improved sensitivity of the method used here. Gene ontology enrichment analysis of the 120 proteins identified showed strong enrichment of biological process terms for cell adhesion (Fig. 2B), in agreement with the function of canonical O-Man substrates such as cadherins and protocadherins (20). Indeed, cadherins and protocadherins constitute the largest class of O-Man proteins identified in these datasets, where single O-Man monosaccharides modify EC domains of 54 unique members of the cadherin superfamily. While some O-Man proteins appear to be cell-line–specific, the majority of the O-Man proteins were detected in at least two different cell lines (Fig. 2C). As recently described, IPT domains are TMEM260 substrates (18), and here, we identified seven members (PLXNA1-4, PLXNB2-3, PLXND1) of the plexin family, as well as cMET and RON receptors, which amount to nine proteins (of n = 12 predicted) with O-Man on IPT domains.
Interestingly, we also identified 26 proteins with O-Man on distinct Ig-like domains, including Ig-like C2-type domains (n = 5), Fibronectin type-III domains (n = 4), and Ig-like V-type domains (n = 2), indicating that O-Man glycosyltransferase selectivity is not limited to the Ig-like fold subclasses defined by EC and IPT domains. While most of the identified O-Man glycopeptides map to annotated protein domains, 34 additional proteins were identified with O-Man glycosylation in regions that are unstructured or unannotated in UniProt. As expected, O-Man glycosylation in unstructured protein regions was identified on α-DG, KIAA1549, and SUCO, which are substrates of the classical POMT1–POMT2 pathway (13), but also on a subset of proteins that have not been previously reported as O-Man targets, mainly representing ER-localized or membrane proteins. Among the five human cell lines examined, we identified the largest and most diverse set of O-Man proteins in HEK293 cells. We therefore chose to glyco-engineer HEK293 cells by CRISPR/Cas9 KO to deconstruct O-Man initiation pathways and establish suitable cell models for differential O-Man glycoproteomics.
O-Man Glycosyltransferase Specificities in Genetically Deconstructed HEK293 Cells
To study specificities, relationships, and potential crosstalk between the POMT1/POMT2, TMTC1-4, and TMEM260 glycosyltransferase families responsible for initiation of O-Man biosynthesis, we generated a library of HEK293 cell lines with combinatorial CRISPR-Cas9 knock-out of selected genes for a partial and complete deconstruction of O-Man glycosylation capacity in HEK293 cells (Fig. 3A). The genetic engineering was undertaken in the HEK293 COSMC/POMGNT1 KO (SimpleCell; HEK293SC) background (13, 50), which are not capable of producing core-M1 or core-M2 type complex O-Man glycans (51, 52), thus allowing a simplified one-step (BC2L-A LWAC) enrichment and identification of O-Man glyco-sites on diverse protein classes. To investigate pathway-specific O-Man initiation, we employed a combinatorial KO design in HEK293SC cells to inactivate two out of three enzyme families in combination, which allowed us to establish individual cell lines where either POMT1/POMT2, TMTC1-4, or TMEM260 genes remained unedited. For simplicity, we hereafter refer to each HEK293 cell line by the active biosynthetic pathway: for example, a combinatorial KO of COSMC/POMGNT1/POMT1/POMT2/TMTC1-4 genes in HEK293 is denoted as “HEK293TMEM260” (Fig. 3A). For control experiments, we also established a HEK293 cell line with complete deconstruction of O-Man glycosylation capacity by COSMC/POMGNT1/POMT1/POMT2/TMTC1-4/TMEM260 KO, which we here refer to as the “HEK293nO-Man” cell line. The glycoengineered cell lines showed no gross morphological defects or growth abnormalities in cell cultures.
To study enzyme specificities, we applied the optimized workflow for differential glycoproteomics analyses (Fig. 3B) of HEK293TMEM260, HEK293TMTC1-4, HEK293POMT1/POMT2, and HEK293nO-Man for comparative analysis using HEK293SC as control. We also performed a differential analysis comparing HEK293WT and HEK293SC to assess whether COSMC/POMGNT1 KO influenced O-Man initiation (Fig. 3A). Collectively, the five comparative datasets identified a total of 9026 glyco-PSMs, 7410 of which were O-Man and 1614 were C-Man PSMs (Fig. 3C). A Gene Ontology Enrichment Analysis was performed on the 138 unique O-Man proteins identified, suggesting that biological process terms related to cell signaling and adhesion were enriched, in agreement with the functions of cadherins and plexins (Fig. 3D). Proteomics of heavy and light labeled tryptic digests analyzed after 1:1 mixing revealed a normal distribution of H/L ratios for case/control samples, and we used interquartile range (IQR) calculations for each comparison to determine biological/technical variability and outliers by the Q1 − 1.5 × IQR and Q3 + 1.5 × IQR boundaries (supplemental Fig. S2A), which formed the basis for setting a ±10-fold change as a threshold for differential regulation of O-Man glycosylation in case/control comparisons.
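The outlier boundaries used here follow the standard interquartile range rule, which can be sketched as below. The quartile estimator (Python's `statistics.quantiles` with the inclusive method) is an assumption, as the text does not state which estimator or scale (linear or logarithmic) was used, and the input values are illustrative rather than measured ratios.

```python
import statistics


def iqr_fences(values):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR) for a list of numbers.

    The quartile estimator is an assumption; other estimators give
    slightly different fences on small samples.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Values falling outside the IQR fences."""
    low, high = iqr_fences(values)
    return [v for v in values if v < low or v > high]


# Illustrative values only (not ratiocheck data from this study).
example = [1, 2, 3, 4, 5, 6, 7, 8, 100]
fences = iqr_fences(example)
flagged = outliers(example)
```

For this toy input, Q1 = 3 and Q3 = 7, so the fences are −3 and 13 and only the value 100 is flagged.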
We proceeded with analyses of differentially labeled O-Man glycopeptides from case/control comparisons to evaluate O-Man glycoproteome changes in the engineered cell lines (Fig. 3E) and first examined relative abundances of O-Man glycopeptides in the differential analysis of HEK293WT/HEK293SC cells. We observed that >90% of the O-Man glycopeptide H/L ratios were within the technical variability (±10-fold) of the measurement and could thus conclude that there is no significant difference in O-Man initiation between HEK293WT and HEK293SC cells (Fig. 3E). Notably, we observed a distinct cluster of data points that are >10-fold more abundant in HEK293WT cells (Fig. 3E), which is likely due to more abundant expression of specific O-Man glycoproteins in the bulk cell population of HEK293WT cells compared to the single cell clone of HEK293SC. For HEK293POMT1/POMT2, HEK293TMTC1-4, and HEK293TMEM260, which were all compared to O-Man initiation in the HEK293SC background, we observed H/L ratios spread across six orders of magnitude in all comparisons, indicating that the combinatorial KOs in each cell line influenced O-Man initiation of distinct substrates. More specifically, in the HEK293POMT1/POMT2/HEK293SC comparison, we observed that O-Man glycopeptides from KIAA1549, SUCO, and α-DG did not change in relative abundance (±10-fold), thus demonstrating that POMT1/POMT2 is solely responsible for the initiation of O-Man biosynthesis within unstructured regions of KIAA1549, SUCO, and α-DG. In contrast, O-Man glycopeptides derived from cadherins or plexins were >100-fold more abundant in the HEK293SC cell line (supplemental Data – Dataset 15), which demonstrated that HEK293POMT1/POMT2 cells are unable to induce O-Man initiation on EC-domains (TMTC1-4–dependent) or IPT-domains (TMEM260-dependent).
Furthermore, we confirmed TMTC1-4–dependent O-Man initiation of EC domains in the HEK293TMTC1-4/HEK293SC differential analysis, where O-Man glycopeptides derived from cadherins (n = 1453) were found to be equally abundant while selective loss of O-Man was observed on KIAA1549, SUCO, and α-DG (POMT1/POMT2-dependent) and on IPT domains (Fig. 3E and supplemental Data – Dataset 16). Moreover, the differential analysis of HEK293TMEM260/HEK293SC cells demonstrated loss of O-Man glycosylation on cadherins, KIAA1549, SUCO, and α-DG while no changes (±10-fold) were observed for O-Man glycopeptides derived from IPT domains (Fig. 3E). Finally, in the HEK293nO-Man/HEK293SC comparison, we observed a skewed distribution of H/L ratios for O-Man glycopeptides, with >98% of the ratios found above the 10-fold change threshold, thus demonstrating that O-Man glycosylation is abolished upon POMT1/POMT2/TMTC1-4/TMEM260 KO (Fig. 3E). Taken together, these results, which agree with previous studies (13, 17, 18), validated the approach and showed that our O-Man deconstruction cell library is suitable for analyses of pathway-specific substrate specificities.
The O-Man Glycoproteome in the Context of Three Biosynthetic Pathways
The use of glycoengineered cells and a targeted glycoproteomics workflow allowed us to map the most extensive human O-Man glycoproteome to date (supplemental Data – Datasets 1, 2, 5–9, 15–19 and supplemental Fig. S6). We sought to compile the glycoproteomics results, presenting all detected O-Man substrates in the context of their biosynthetic pathways and grouped by protein domains (Fig. 4), which includes the datasets from HEK293SC/HEK293nO-Man originally collected for the comparison between CMP and TCL, as presented in Figure 1B. The differential glycoproteomics workflow allows pairwise assessment against the HEK293SC control, and results can be grouped into four distinct outcomes for each domain type within every detected protein. 1) “Glycosylated” describes a particular protein domain in which all glyco-PSMs detected exhibit glycosylation in both the HEK293SC control and the case cell line at a similar ratio. This suggests that, for this domain, the genetically engineered cell line closely resembles the HEK293SC cell line. Conversely, 2) “Not glycosylated” refers to a type of domain where O-Man glycosylation was found in the control cell line but not in the case cell line. This suggests that the glycoengineering resulted in the loss of O-Man glycosylation from all O-Man PSMs detected in the control cell line for each specific type of domain within a given protein. We observed some cases, referred to as 3) “Ambiguous,” in which a protein domain displayed at least one O-Man PSM with glycosylation resembling the control cell line and at least one other displaying sensitivity to genetic engineering, or for which the data output did not allow confident identification/quantification. Lastly, we classify as 4) “Not detected” the protein domains for which no O-Man PSMs were detected in the given dataset but which were detected in any of the other datasets.
The latter can be attributed to the inherent limitations of shotgun glycoproteomics, where specific precursor ions may not be selected for MS/MS fragmentation. Thus, we consider O-Man glycoproteins consistently identified across multiple datasets as the “core O-Man glycoproteome” of HEK293 cells, while data points indicated as “Not detected” represent less abundant O-Man glycoproteins that are not consistently identified due to the inherent limitations of data-dependent acquisition.
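The four outcomes defined above can be expressed as a simple decision rule per protein domain and dataset. The sketch below is an illustration, not the procedure used to build Figure 4: it assumes ratios are expressed as control/case abundance, uses the 10-fold threshold from the text, and treats unquantified PSMs (`None`) and ratios below 0.1 as leading to “Ambiguous”, which is an assumption.

```python
THRESHOLD = 10.0  # fold-change threshold from the ratiocheck analysis


def psm_call(ratio):
    """Classify one O-Man PSM by its control/case ratio (assumed orientation)."""
    if ratio is None:
        return "unquantified"
    if ratio >= THRESHOLD:
        return "lost"       # far more abundant in the control
    if ratio >= 1 / THRESHOLD:
        return "similar"    # within the 10-fold window
    return "other"          # more abundant in the case; handling is an assumption


def domain_outcome(ratios, seen_in_other_datasets=True):
    """Outcome for one protein domain in one case/control dataset."""
    if not ratios:
        return "Not detected" if seen_in_other_datasets else None
    calls = {psm_call(r) for r in ratios}
    if calls == {"similar"}:
        return "Glycosylated"
    if calls == {"lost"}:
        return "Not glycosylated"
    return "Ambiguous"


# Illustrative ratios only.
outcomes = [
    domain_outcome([1.2, 0.8]),
    domain_outcome([150.0, 30.0]),
    domain_outcome([1.0, 200.0]),
    domain_outcome([None]),
    domain_outcome([]),
]
```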
We identified the classical O-Man substrates α-DG, SUCO, and KIAA1549, as well as 50 nonclassical protein substrates with O-Man on EC domains and 10 proteins with modification on IPT domains (Fig. 4A). HEK293POMT1/2 cells were, in addition to α-DG, SUCO, and KIAA1549, also capable of inducing O-Man glycosylation on a limited set of other substrates, predominantly on unannotated domains or unstructured/disordered regions, including some site-specific glycosylation. Notably, HEK293POMT1/2 cells exhibit glycosylation activity within POMT1/2 MIR domains and other regions of POMT1. All proteins containing EC domains are glycosylated in both the HEK293WT and HEK293TMTC1-4 cell lines, in line with previous evidence that TMTC1-4 enzymes are responsible for EC domain O-Man glycosylation (17). CDH11 and CDH13 O-Man glycosylation appears to be independent of TMTC1-4, in line with our previous study (17), and for CDH11, at least one glycosylated PSM was observed in each cell line. Additionally, while previous data (17) reported CDH13 glycosylation to be insensitive to TMTC1-4 KO, here we identify site-specific examples of glycosylation in HEK293POMT1/2, while HEK293TMEM260 cells are not capable of glycosylating EC domains of CDH13. More studies are thus warranted to elucidate the biosynthetic basis for O-Man glycosylation of the atypical CDH11 and CDH13 cadherins. Furthermore, we detect site-specific glycosylation on PCDHGC3 in the HEK293nO-Man dataset, but not in the HEK293POMT1/2 or HEK293TMEM260 datasets. Moreover, plexins, cMET, and MST1R/RON showed O-Man glycosylation at comparable levels in HEK293WT and HEK293TMEM260 as expected (18), and we also observed previously unreported cases of site-specific glycosylation in HEK293POMT1/2 (MET, PLXNA3, PLXNB2), HEK293TMTC1-4 (PLXNA3, PLXNB2), and HEK293nO-Man (PLXNB2). These results indicate an interplay between initiation pathways, with specific proteins serving as substrates for different enzyme families, which needs further investigation.
Remarkably, our platform allowed detection of O-Man glycosylation on several other proteins, both in regions with no assigned domain/unstructured regions (Fig. 4A) and in annotated domains (UniProt) (Fig. 4, B and C). A few of these candidates were also identified previously, such as integrins and, interestingly, a site on beta-dystroglycan (β-DG) where O-Man glycosylation was abolished by KO of TMEM260 (18). Here, we also identified members of the α-integrins, which were O-Man glycosylated in the HEK293TMTC1-4 cell line, while ITGB1 showed site-specific glycosylation in both HEK293TMTC1-4 and HEK293TMEM260. Overall, our results demonstrate that nonclassical enzyme families (TMTC1-4 and TMEM260) target Ig-like folds and unannotated domains that show structural similarities, while POMT1/2 appear to have specificity for unstructured regions of distinct proteins.
In-Depth Glycoproteomics Analysis of Purified Reporter Proteins
The identification of O-Man modifications on common protein folds (Fig. 4C) prompted us to investigate the enzymatic preference and specificity in detail. We sought to use the same glycoengineered cell lines to express soluble reporter proteins and assess O-Man glycosylation and occupancy at specific protein domains. We adopted a conventional approach by recombinant expression of reporter constructs, digestion with trypsin or chymotrypsin before bottom-up mass spectrometry analyses (53). We sought to analyze representative proteins to assess O-Man glycosylation, specificity, and, importantly, occupancy/stoichiometry at the site-specific level and expressed 12 constructs encoding His-tagged reporter proteins (ectodomains) in HEK293SC cells and HEK293nO-Man cells. These included both canonical substrates (DAG1, CADH1, CDH11), noncanonical substrates (ITGB1, ITGA5, ITGAV, F11R), as well as proteins not identified as O-Man proteins in this study but belonging to the same family of O-Man proteins identified here (ITGA2, SEMA4D, PTPRJ) or containing protein domains identified with O-Man in this study (FCGRIIIA, EPHB1; Fig. 5A). Purified proteins were digested with trypsin or chymotrypsin, labeled with DEL, and mixed pairwise, before mass spectrometry analysis (supplemental Fig. S4A). Glycans in α-DG were detected both in the glycoproteome data and in the model protein data for HEK293WT and HEK293SC, but not in HEK293nO-Man (supplemental Data – Datasets 20–21), in agreement with previous data in this study. The β-DG glycosylation identified in glycoproteome data was not identified on the reporter protein (supplemental Data – Datasets 20–21). Analysis of recombinantly expressed CADH1 ectodomain yielded similar data to our previous studies (17) and glycoproteome data in this study, confirming that O-Man is present in HEK293WT and HEK293SC, but not in HEK293nO-Man. 
We then sought to express CDH11 to shed light on the unique glycosylation patterns identified on this protein in this and previous datasets. Soluble ectodomain of CDH11 presented glycans in HEK293WT and HEK293SC, but not in HEK293nO-Man. We also identified O-GalNAc at the same sites for O-Man, suggesting that absence of O-Man enzymes allows O-GalNAc glycosylation in Golgi apparatus (supplemental Data – Datasets 22–25). We then focused on the noncanonical substrates, none of which showed O-Man glycosylation in either HEK293SC or HEK293nO-Man cells (supplemental Data – Datasets 26–33). This was surprising since O-Man was identified on ITGB1, ITGA5, ITGAV, F11R earlier in this study and in our previous studies (18). We further investigated these findings by expressing DAG1 in its native transmembrane form, to assess glycan pattern both on α-DG and β-DG. We therefore generated a full-length DAG1 construct with 3xFLAG tag at the C-terminal end and stably overexpressed it in isogenic HEK293WT, HEK293SC, and HEK293nO-Man cells using zinc finger nucleases KI on AAVS1 safe harbor locus (supplemental Fig. S5A). Expression of the construct was confirmed by means of Western blotting (supplemental Fig. S5B). The expressed transmembrane protein was purified (supplemental Figs. S4B and S5B) using M2-conjugated magnetic beads (anti-FLAG mouse IgG) and subjected to proteolytic digestion on S-trap, DEL, and MS analysis (supplemental Fig. S4B). We confirmed O-Man glycosylation was on α-DG but not on β-DG, indicating that the O-Man glycosylation on β-DG and other noncanonical O-Man substrates has low occupancy (supplemental Data – Datasets 34–36). Taken together, these results demonstrate that O-Man glycosylation may occur on a wider range of substrates, albeit at low stoichiometry.
BC2L-A Lectin and Membrane Glycoproteomics Expand the C-Man Glycoproteome
The targeted membrane glycoproteomics workflow used here not only led to the identification of new O-Man proteins but also expanded the knowledge of the C-Man glycoproteome, owing to the ability of the lectin BC2L-A to capture C-Man peptides and proteins (40). Collectively, this study identified 73 human and 2 bovine C-Man proteins (the latter most likely from cell culture media) across 12 datasets and on 17 protein domains (Fig. 6A). Several protein domains are targeted by C-Man glycosylation, with an overrepresentation of TSP type-1 domains as the primary target. Interestingly, we identify known O-Man proteins (α-DG, plexins, protocadherins) as carriers of C-Man glycosylation and identify a total of 43 new C-Man glycoproteins (supplemental Fig. S9), representing a >2-fold expansion of the C-Man glycoproteome compared to previous studies in humans and mice (40, 54, 55, 56, 57, 58, 59). Reassuringly, the genetic engineering of O-Man enzymes did not impact C-Man glycosylation of these proteins, in line with the specificity of DPY19 enzymes (60). Nevertheless, these results should be interpreted with care until the occupancy and biological relevance have been determined.
Discussion
O-Man glycosylation has emerged as a widespread modification among specific adhesion molecules and receptors of eukaryotic cells. The discovery of three independent biosynthetic pathways targeting distinct proteins at the cell surface, including α-DG, cadherins, and plexins, clearly points to important biological roles for O-Man in human physiology (20), and this notion is further supported by the diverse and severe developmental phenotypes in patients with defects in O-Man biosynthesis (27, 28, 61, 62).
Here, we provide further expansion of the O-Man glycoproteome in five human cell lines and map pathway-specific biosynthesis using a panel of WT and isogenic HEK293 cells with combinatorial KOs of O-Man initiation pathways. This study improves the glycoproteomic workflow for O-Man analyses, allowing us to identify a total of 180 proteins as targets for human O-Man enzymes and unveils details on substrate specificities at protein-, domain-, and site-specific levels for each O-Man glycosyltransferase family, thus advancing the knowledge on how cells orchestrate and fine-tune biosynthetic processes relevant to O-Man glycosylation (supplemental Fig. S7).
The first analysis of the mammalian O-Man glycoproteome relied on the combination of genetic engineering of the glycosylation capacities and LWAC followed by quantitative O-glycoproteomics (13, 19, 50, 63). The current workflow introduces key improvements for sensitive identification and quantification of O-Man glycosylations, including the use of a small, dimeric BC2L-A lectin for C- and O-Man glycopeptide enrichment and the replacement of deuterated dimethyl labeling with 13C2-diethyl stable isotope labeling, which improves glycopeptide retention on C18 reversed phase chromatography and ionization while eliminating the “deuterium effect” for improved quantification (39). The genetic engineering for the deconstruction of biosynthetic pathways combined with our glycoproteomic strategy reveals two major types of transmembrane proteins as acceptor substrates for O-Man glycosylation. The first type of protein substrate is characterized by unstructured, mucin-like domains and includes α-DG, KIAA1549, and SUCO, which are densely O-Man glycosylated by POMT1/POMT2 enzymes in the ER before continued biosynthesis and elongation in the secretory pathway, resulting in capped and complex O-Man glycans known as core-M1, core-M2, and core-M3/matriglycan (15). The second type of protein substrate, which includes cadherins, protocadherins, plexins, cMET, and RON receptors, is characterized by O-Man glycosylation on distinct Ig-like folds. This study confirms and further expands the knowledge on substrate specificities of the TMTC1-4 and TMEM260 enzyme families, which have unique functions for O-Man glycosylation of EC- and IPT-domains, respectively. Although both TMTC1-4 and TMEM260 enzyme families reside and perform their glycosylation function in the ER, O-Man on EC- and IPT-domains does not undergo further biosynthetic elongation into complex structures, as shown previously (17, 18, 64) and in this study, using secreted reporter proteins (Fig. 5 and supplemental Fig. S4A).
α-linked O-Man monosaccharides are substrates for POMGNT1 and POMGNT2, which catalyze the second biosynthetic step by adding GlcNAc to form GlcNAc-β1-2- or GlcNAc-β1-4-Man-O-Ser/Thr structures, respectively (15). It is thus surprising that EC- and IPT-domains, akin to C-Man–modified TSP-domains (65), can traffic through the secretory pathway and be presented at the cell surface (or secreted as reporter proteins) with O-Man monosaccharide modifications without further structural elongation and capping. Most likely, the polypeptide context to which O-Man is attached plays a decisive role for the following biosynthetic steps and impacts whether O-Man can serve as a substrate for additional glycosylation reactions. Complex O-Man glycans are found on unstructured, mucin-like regions of α-DG (19) as well as in the highly disordered extracellular domain of KIAA1549 (not shown), which indicates that O-Man located on disordered polypeptides is a preferable acceptor for the POMGNT1 enzyme. We have not observed complex O-Man glycans on the disordered SUCO protein, despite data mining using MS-Fragger (66) for open-search queries (not shown), which is reasonable considering that SUCO is an ER-localized protein that should not encounter the Golgi-resident POMGNT1 enzyme. In contrast, O-Man monosaccharides located on folded EC- and IPT-domains may interact with POMGNT1 during their traffic through the secretory pathway, but it is reasonable to believe that O-Man on folded Ig-like domains is an inaccessible substrate, potentially due to steric clashes between the active site of POMGNT1 and folded Ig-like domains.
In this study, we further confirm that POMT1/POMT2 are highly selective enzymes that target mucin-like domains of only a few protein substrates (α-DG, KIAA1549, and SUCO). The molecular basis for POMT1/POMT2 selectivity is puzzling considering that there are hundreds of disordered proteins with dense Ser/Thr content trafficking the secretory pathway, for example, mucins or proteins with mucin-like domains, yet none of these have been reported to undergo O-Man glycosylation, and we find no evidence in this study to contradict this observation. A consensus sequon for O-Man glycosylation has not been identified; however, distinct polypeptide sequences termed cis-controlling peptidic elements have previously been suggested to function as recognition determinants for protein-specific installation of, for example, polysialic acid, LacdiNAc, mannose-6-phosphate, and POMT1/POMT2-driven O-Man glycosylation (67). We envision that focused glycoproteomic studies of other mammalian cells or even tissues, using multiple proteases and alternative enrichment strategies, may expand the understanding of POMT1/POMT2 substrate selectivity and potentially unveil details on cis-controlling peptidic elements that direct specific substrates to POMT1/POMT2 enzymes. Furthermore, future structural studies focused on the interaction between POMT1/POMT2 MIR-domains and α-DG, KIAA1549 ectodomain, and SUCO substrates may provide further insight to unravel the molecular details of POMT1/POMT2 substrate selectivity.
Our differential analyses of cells with genetically deconstructed biosynthetic pathways further highlight that TMTC1-4 and TMEM260 have unique functions and dedicated roles for O-Man glycosylation of EC- and IPT-domains, respectively. EC-domains, commonly found on the ∼120 different members of the cadherin superfamily of adhesion molecules, are modified on two distinct β-strands (B and G) with O-Man projecting in opposite orientations perpendicular to the plane of each EC-domain, indicating that TMTC1-4 may regulate or fine-tune cadherin functions and binding properties, especially for the clustered protocadherins, which are known to utilize the EC1-EC4 domain interface for homophilic trans-interactions (68). The TMTC1-4 enzymes share a common N-terminal architecture including the first seven transmembrane (TM) segments that constitute the conserved GT-C module (7). Conserved acidic residues located within the first ER-luminal loop connecting TM1 and TM2 are predicted to be important for catalytic functions based on homology and structural similarities with, for example, yeast PMTs, POMTs, TMEM260, and other GT-C enzymes (18, 69). The variable GT-C module comprises the C-terminal ER-luminal domains that distinguish TMTC1-4 from other GT-C enzymes based on a variable number of tetratricopeptide repeats (TPRs) in each isoenzyme (Fig. 7). TPRs are evolutionarily conserved structural motifs that facilitate various molecular interactions between proteins, domains, and short polypeptides (70), which may explain TMTC1-4 specificity, that is, cadherin recruitment through EC-domain recognition by the TPR domains before O-Man glycans are added at the catalytic site. We believe that the same reasoning applies to TMEM260, which has an N-terminal conserved GT-C module responsible for catalytic activity (18), while the ER-luminal variable GT-C module likely recruits substrates through specific interactions between TPRs and IPT-domains.
Notably, IPT-domains are modified on B-strands, which are required for receptor maturation and traffic to the plasma membrane but may also influence cis-interactions and receptor dimerization events at the cell surface. For Ig-like folds including both EC- and IPT-domains, it however remains unknown whether O-Man glycans are added to unfolded, partially folded, or completely folded domains, and further studies are necessary to resolve the molecular details of TMTC1-4 and TMEM260 interactions with their substrates, potentially through cryo-EM or structural analysis in simplified systems using recombinantly expressed Ig-like domains and TPR repeats.
Fig. 7.
Three enzyme families for O-Man initiation in humans. O-Man glycosylation is driven by three biosynthetic pathways: POMT1/2, TMTC1-4, and TMEM260. Each of these has specificities for canonical substrates and other substrates. Figures made using AlphaFold (75) and alignments made using PyMOL (The PyMOL Molecular Graphics System, Version 2.5.2, Schrödinger, LLC.).
We also observe that all three biosynthetic pathways are capable of O-Man glycosylation on noncanonical protein substrates that do not fit the categorization described above. For example, protein disulfide isomerases (PDIA3, PDIA4, and PDIA6) are O-Man glycosylated by the TMTC1-4 family, even though they lack Ig-like domains (Fig. 4, A and B). Previous studies have shown that TMTC3 may interact with PDIA3 (71), and it is possible that PDIs and TMTC1-4 isoforms form higher order molecular assemblies in the ER lumen, thus allowing PDIs to become O-Man glycosylated due to their close proximity to the catalytic site of TMTCs. Proximity-based glycosylation, unlike reactions driven by the intrinsic specificities of glycosyltransferases, may also explain O-Man on POMT1 and POMT2 enzymes in HEK293POMT1/2 cells, which are known to form heterodimers in the ER (72) or O-Man on TMTC1 and TMTC2 in HEK293TMTC1-4 cells, which indirectly suggests that functional TMTCs assemble as dimers in the ER.
Our results further showed that a subset of noncanonical substrates, including β-DG, ITGA5, ITGAV, ITGB1, and F11R (supplemental Data – Datasets 15–19), are O-Man glycosylated on the β-strands of, for example, Ig-like C- or V-type domains. To further validate these findings, we adopted a targeted approach using a panel of recombinantly expressed and purified reporter proteins; however, we could not confirm O-Man glycosylations in bottom-up analyses (Fig. 5A), indicating that the occupancy of O-Man on noncanonical substrates is low and only detectable following enrichment procedures (Fig. 3B). We speculate that C- and V-type domains, together with other subclasses of Ig-like folds (Fig. 4, B and C), transiently interact with TPR domains due to their overall structural similarity to EC- and IPT-domains, thus allowing a subfraction of substrates to become O-Man glycosylated at low occupancy. Therefore, we advocate that O-Man, as well as other types of O-glycosylations identified by large-scale glycoproteomic analyses of complex samples, should be interpreted with care and that such findings should be validated by, for example, targeted analyses of isolated protein substrates before any major conclusions are drawn with respect to biological and/or functional relevance. This is also true for a number of POMT1/POMT2 substrates identified in HEK293POMT1/2 cells, for example, APP, ASTN1, GLA, HEXA, HSPA5, HYOU1, and TMEM43, where further studies are warranted to validate enzyme specificities and O-Man functions for these substrates.
Finally, our genetic deconstruction strategy with complete KO of O-Man initiation pathways in HEK293nO-Man cells corroborates previous studies (13, 17, 18) and aligns well with the data presented in this study, demonstrating that combinatorial KO of POMT1, POMT2, TMTC1-4, and TMEM260 abolishes O-Man glycosylation on >100 protein substrates including α-DG, KIAA1549, the classical cadherins, protocadherins, and plexin receptors (Fig. 4). Surprisingly, we note that CDH11, HNRNPA2B1, MCFD2, and EGFL2 are identified as glycoproteins in our HEK293nO-Man dataset (supplemental Data – Dataset 19). While we cannot rule out that these proteins are modified by O-Man or other glycans that match the 162.0528 atomic mass unit mass increment corresponding to hexose (e.g., glucose or galactose), we find it unlikely that a GT-C enzyme is responsible for glycosylation of these protein substrates. Further investigation is needed to resolve the biosynthetic basis for these modifications, especially for CDH11 and MCFD2, which are consistently identified with O-Hex modifications in our genetically engineered cell lines.
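The ambiguity of the 162.0528 mass increment can be illustrated with a short calculation. The sketch below assumes standard monoisotopic masses for carbon, hydrogen, and oxygen and treats a glycosidically linked hexose as C6H12O6 minus one water (C6H10O5). The helper `monoisotopic_mass` is a hypothetical name used for illustration and is not part of the data analysis pipeline of this study.

```python
# Monoisotopic masses (unified atomic mass units) of the lightest stable isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461957}


def monoisotopic_mass(composition: dict) -> float:
    """Sum element masses for a composition such as {"C": 6, "H": 10, "O": 5}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# A linked hexose residue: free hexose C6H12O6 minus H2O lost on bond formation.
HEXOSE_RESIDUE = {"C": 6, "H": 10, "O": 5}

# Mannose, glucose, and galactose are stereoisomers sharing this composition,
# so each one yields the same mass increment on a modified peptide.
for sugar in ("mannose", "glucose", "galactose"):
    print(f"{sugar}: +{monoisotopic_mass(HEXOSE_RESIDUE):.4f}")
```

Because the three hexoses share one elemental composition, the precursor mass alone cannot distinguish O-Man from O-Glc or O-Gal, which is why these modifications are referred to as O-Hex until their biosynthetic origin is resolved.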
In conclusion, the human O-Man glycoproteome now covers >240 glycoproteins (180 identified in this study) involved in various functional networks, including those associated with cell–ECM interactions, cell-cell adhesion, and receptor signaling (supplemental Fig. S8). The O-Man glycans are predominantly found on transmembrane proteins, of which a disproportionally large fraction is modified on unique protein folds, primarily Ig-like domains. This study establishes a roadmap by identifying substrates and defining specificities for the three families (POMT1/POMT2, TMTC1-4, and TMEM260) of biosynthetic enzymes, which now can be used to guide further functional studies on O-Man glycosylation in human health and disease.
Data Availability
The mass spectrometry proteomics data, including .raw, .msf, and annotated/indexed glycopeptide spectra, have been deposited to the ProteomeXchange Consortium (73) via the PRIDE (74) partner repository with the dataset identifier PXD045597. This article contains supplemental data, including all validated data reported in tabular format. Annotated data used for generation of figures, as well as all code used for generating figures and processing data, are available upon request from the corresponding author.
Supplemental Data
This article contains supplemental data.
Conflict of interest
The authors declare no competing interests.
Acknowledgments
We thank colleagues at Copenhagen Center for Glycomics (CCG) for helpful discussions, Hans Bakker (Hanover Medical School, Germany) for sharing the BC2L-A lectin expression construct, and Zhang Yang and Julie Van Coillie (CCG) for sharing the FCGRIIIA plasmid.
Author contributions
L. P., W. T., S. Y. V., and A. H. writing–review and editing; L. P. and A. H. writing–original draft; L. P. and A. H. visualization; L. P. and A. H. validation; L. P., W. T., S. Y. V., and A. H. methodology; L. P., W. T., S. Y. V., and A. H. investigation; L. P., W. T., S. Y. V., and A. H. formal analysis; L. P., W. T., and A. H. data curation; L. P. and A. H. conceptualization; S. Y. V. and A. H. resources; A. H. supervision; A. H. project administration; A. H. funding acquisition.
Funding and additional information
This work was supported by the Danish National Research Foundation Grant DNRF107, the Mizutani Foundation for Glycoscience, and a research grant (00025438) from VILLUM FONDEN.
Footnotes
Present address for Weihua Tian: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Building 224, DK-2800 Kgs. Lyngby, Denmark.
References
- 1. Lommel M., Strahl S. Protein O-mannosylation: conserved from bacteria to humans. Glycobiology. 2009;19:816–828. doi: 10.1093/glycob/cwp066.
- 2. Lehle L., Strahl S., Tanner W. Protein glycosylation, conserved from yeast to man: a model organism helps elucidate congenital human diseases. Angew. Chem. Int. Ed. Engl. 2006;45:6802–6818. doi: 10.1002/anie.200601645.
- 3. Endo T. O-mannosyl glycans in mammals. Biochim. Biophys. Acta. 1999;1473:237–246. doi: 10.1016/s0304-4165(99)00182-8.
- 4. Saxena H., Buenbrazo N., Song W.Y., Li C., Brochu D., Robotham A., et al. Toward an experimental system for the examination of protein mannosylation in actinobacteria. Glycobiology. 2023;33:512–524. doi: 10.1093/glycob/cwad023.
- 5. Gentzsch M., Tanner W. The PMT gene family: protein O-glycosylation in Saccharomyces cerevisiae is vital. EMBO J. 1996;15:5752–5759.
- 6. Manya H., Chiba A., Yoshida A., Wang X., Chiba Y., Jigami Y., et al. Demonstration of mammalian protein O-mannosyltransferase activity: coexpression of POMT1 and POMT2 required for enzymatic activity. Proc. Natl. Acad. Sci. U. S. A. 2004;101:500–505. doi: 10.1073/pnas.0307228101.
- 7. Alexander J.A.N., Locher K.P. Emerging structural insights into C-type glycosyltransferases. Curr. Opin. Struct. Biol. 2023;79 doi: 10.1016/j.sbi.2023.102547.
- 8. Strahl-Bolsinger S., Gentzsch M., Tanner W. Protein O-mannosylation. Biochim. Biophys. Acta. 1999;1426:297–307. doi: 10.1016/s0304-4165(98)00131-7.
- 9. Neubert P., Halim A., Zauser M., Essig A., Joshi H.J., Zatorska E., et al. Mapping the O-mannose glycoproteome in Saccharomyces cerevisiae. Mol. Cell Proteomics. 2016;15:1323–1337. doi: 10.1074/mcp.M115.057505.
- 10. Tanner W., Lehle L. Protein glycosylation in yeast. Biochim. Biophys. Acta. 1987;906:81–99. doi: 10.1016/0304-4157(87)90006-2.
- 11. Willer T., Valero M.C., Tanner W., Cruces J., Strahl S. O-mannosyl glycans: from yeast to novel associations with human disease. Curr. Opin. Struct. Biol. 2003;13:621–630. doi: 10.1016/j.sbi.2003.09.003.
- 12. Endo T. Structure, function and pathology of O-mannosyl glycans. Glycoconj J. 2004;21:3–7. doi: 10.1023/B:GLYC.0000043740.26062.2c.
- 13. Larsen I.S.B., Narimatsu Y., Joshi H.J., Yang Z., Harrison O.J., Brasch J., et al. Mammalian O-mannosylation of cadherins and plexins is independent of protein O-mannosyltransferases 1 and 2. J. Biol. Chem. 2017;292:11586–11598. doi: 10.1074/jbc.M117.794487.
- 14. Yoshida-Moriguchi T., Campbell K.P. Matriglycan: a novel polysaccharide that links dystroglycan to the basement membrane. Glycobiology. 2015;25:702–713. doi: 10.1093/glycob/cwv021.
- 15. Sheikh M.O., Halmo S.M., Wells L. Recent advancements in understanding mammalian O-mannosylation. Glycobiology. 2017;27:806–819. doi: 10.1093/glycob/cwx062.
- 16. Endo T. Mammalian O-mannosyl glycans: biochemistry and glycopathology. Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 2019;95:39–51. doi: 10.2183/pjab.95.004.
- 17. Larsen I.S.B., Narimatsu Y., Joshi H.J., Siukstaite L., Harrison O.J., Brasch J., et al. Discovery of an O-mannosylation pathway selectively serving cadherins and protocadherins. Proc. Natl. Acad. Sci. U. S. A. 2017;114:11163–11168. doi: 10.1073/pnas.1708319114.
- 18. Larsen I.S.B., Povolo L., Zhou L., Tian W., Mygind K.J., Hintze J., et al. The SHDRA syndrome-associated gene TMEM260 encodes a protein-specific O-mannosyltransferase. Proc. Natl. Acad. Sci. U. S. A. 2023;120 doi: 10.1073/pnas.2302584120.
- 19. Vester-Christensen M.B., Halim A., Joshi H.J., Steentoft C., Bennett E.P., Levery S.B., et al. Mining the O-mannose glycoproteome reveals cadherins as major O-mannosylated glycoproteins. Proc. Natl. Acad. Sci. U. S. A. 2013;110:21018–21023. doi: 10.1073/pnas.1313446110.
- 20. Larsen I.S.B., Narimatsu Y., Clausen H., Joshi H.J., Halim A. Multiple distinct O-Mannosylation pathways in eukaryotes. Curr. Opin. Struct. Biol. 2019;56:171–178. doi: 10.1016/j.sbi.2019.03.003.
- 21. Patel S.D., Chen C.P., Bahna F., Honig B., Shapiro L. Cadherin-mediated cell-cell adhesion: sticking together as a family. Curr. Opin. Struct. Biol. 2003;13:690–698. doi: 10.1016/j.sbi.2003.10.007.
- 22. Shapiro L., Weis W.I. Structure and biochemistry of cadherins and catenins. Cold Spring Harb Perspect. Biol. 2009;1:a003053. doi: 10.1101/cshperspect.a003053.
- 23. Barresi R., Campbell K.P. Dystroglycan: from biosynthesis to pathogenesis of human disease. J. Cell Sci. 2006;119:199–207. doi: 10.1242/jcs.02814.
- 24. Live D., Wells L., Boons G.J. Dissecting the molecular basis of the role of the O-mannosylation pathway in disease: α-dystroglycan and forms of muscular dystrophy. Chembiochem. 2013;14:2392–2402. doi: 10.1002/cbic.201300417.
- 25. Wu K.M., Hunter T.L., Proakis A.G. A dual electrophysiologic test for atrial antireentry and ventricular antifibrillatory studies. Effects of bethanidine, procainamide, and WY-48986. J. Pharmacol. Methods. 1990;23:87–95. doi: 10.1016/0160-5402(90)90036-k.
- 26. Runge C.L., Indap A., Zhou Y., Kent J.W., King E., Erbe C.B., et al. Association of TMTC2 with human nonsyndromic sensorineural hearing loss. JAMA Otolaryngol. Head Neck Surg. 2016;142:866–872. doi: 10.1001/jamaoto.2016.1444.
- 27. Guillen-Ahlers H., Erbe C.B., Chevalier F.D., Montoya M.J., Zimmerman K.D., Langefeld C.D., et al. TMTC2 variant associated with sensorineural hearing loss and auditory neuropathy spectrum disorder in a family dyad. Mol. Genet. Genomic Med. 2018;6:653–659. doi: 10.1002/mgg3.397.
- 28. Jerber J., Zaki M.S., Al-Aama J.Y., Rosti R.O., Ben-Omran T., Dikoglu E., et al. Biallelic mutations in TMTC3, encoding a transmembrane and TPR-containing protein, lead to cobblestone lissencephaly. Am. J. Hum. Genet. 2016;99:1181–1189. doi: 10.1016/j.ajhg.2016.09.007.
- 29. Farhan S.M.K.K., Nixon K.C.J.J., Everest M., Edwards T.N., Long S., Segal D., et al. Identification of a novel synaptic protein, TMTC3, involved in periventricular nodular heterotopia with intellectual disability and epilepsy. Hum. Mol. Genet. 2017;26:4278–4289. doi: 10.1093/hmg/ddx316.
- 30. Li J., Akil O., Rouse S.L., McLaughlin C.W., Matthews I.R., Lustig L.R., et al. Deletion of Tmtc4 activates the unfolded protein response and causes postnatal hearing loss. J. Clin. Invest. 2018;128:5150–5162. doi: 10.1172/JCI97498.
- 31. Ta-Shma A., Khan T.N., Vivante A., Willer J.R., Matak P., Jalas C., et al. Mutations in TMEM260 cause a pediatric neurodevelopmental, cardiac, and renal syndrome. Am. J. Hum. Genet. 2017;100:666–675. doi: 10.1016/j.ajhg.2017.02.007.
- 32. Peng M., Jing S., Duan S., Lu G., Zhou K., Hua Y., et al. A novel homozygous variant of TMEM260 induced cardiac malformation and neurodevelopmental abnormality: case report and literature review. Front Med. 2023;10 doi: 10.3389/fmed.2023.1157042.
- 33. Dworkin L.A., Clausen H., Joshi H.J. Applying transcriptomics to study glycosylation at the cell type level. iScience. 2022;25 doi: 10.1016/j.isci.2022.104419.
- 34. Lonowski L.A.A., Narimatsu Y., Riaz A., Delay C.E.E., Yang Z., Niola F., et al. Genome editing using FACS enrichment of nuclease-expressing cells and indel detection by amplicon analysis. Nat. Protoc. 2017;12:581–603. doi: 10.1038/nprot.2016.165.
- 35. Yang Z., Steentoft C., Hauge C., Hansen L., Thomsen A.L., Niola F., et al. Fast and sensitive detection of indels induced by precise gene targeting. Nucleic Acids Res. 2015;43:e59. doi: 10.1093/nar/gkv126.
- 36. Narimatsu Y., Joshi H.J., Yang Z., Gomes C., Chen Y.H., Lorenzetti F.C., et al. A validated gRNA library for CRISPR/Cas9 targeting of the human glycosyltransferase genome. Glycobiology. 2018;28:295–305. doi: 10.1093/glycob/cwx101.
- 37. Sun Y., Vandenbriele C., Kauskot A., Verhamme P., Hoylaerts M.F., Wright G.J. A human platelet receptor protein microarray identifies the high affinity immunoglobulin E receptor subunit α (FcεR1α) as an activating platelet endothelium aggregation receptor 1 (PEAR1) ligand. Mol. Cell Proteomics. 2015;14:1265–1274. doi: 10.1074/mcp.M114.046946.
- 38. Holden P., Horton W.A. Crude subcellular fractionation of cultured mammalian cell lines. BMC Res. Notes. 2009;2:243. doi: 10.1186/1756-0500-2-243.
- 39. Jung J., Jeong K., Choi Y., Kim S.A., Kim H., Lee J.W., et al. Deuterium-free, three-plexed peptide diethylation for highly accurate quantitative proteomics. J. Proteome Res. 2019;18:1078–1087. doi: 10.1021/acs.jproteome.8b00775.
- 40. Hütte H.J., Tiemann B., Shcherbakova A., Grote V., Hoffmann M., Povolo L., et al. A bacterial mannose binding lectin as a tool for the enrichment of C- and O-mannosylated peptides. Anal. Chem. 2022;94:7329–7338. doi: 10.1021/acs.analchem.2c00742.
- 41. Hakhverdyan Z., Molloy K.R., Subbotin R.I., Fernandez-Martinez J., Chait B.T., Rout M.P. Measuring in vivo protein turnover and exchange in yeast macromolecular assemblies. STAR Protoc. 2021;2 doi: 10.1016/j.xpro.2021.100800.
- 42. UniProt Consortium UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 2023;51:D523–D531. doi: 10.1093/nar/gkac1052.
- 43. Heberle H., Meirelles G.V., da Silva F.R., Telles G.P., Minghim R. InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams. BMC Bioinformatics. 2015;16:169. doi: 10.1186/s12859-015-0611-3.
- 44. Hulsen T. DeepVenn -- a web application for the creation of area-proportional Venn diagrams using the deep learning framework Tensorflow.js. arXiv. 2022;27 doi: 10.48550/arXiv.2210.04597. [preprint]
- 45. Ge S.X., Jung D., Yao R. ShinyGO: a graphical gene-set enrichment tool for animals and plants. Bioinformatics. 2020;36:2628–2629. doi: 10.1093/bioinformatics/btz931.
- 46. Szklarczyk D., Kirsch R., Koutrouli M., Nastou K., Mehryary F., Hachilif R., et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023;51:D638–D646. doi: 10.1093/nar/gkac1000.
- 47. Bagdonaite I., Malaker S.A., Polasky D.A., Riley N.M., Schjoldager K., Vakhrushev S.Y., et al. Glycoproteomics. Nat. Rev. Methods Prim. 2022;2:48.
- 48. Lameignere E., Malinovská L., Sláviková M., Duchaud E., Mitchell E.P., Varrot A., et al. Structural basis for mannose recognition by a lectin from opportunistic bacteria Burkholderia cenocepacia. Biochem. J. 2008;411:307–318. doi: 10.1042/bj20071276.
- 49. Lameignere E., Shiao T.C., Roy R., Wimmerova M., Dubreuil F., Varrot A., et al. Structural basis of the affinity for oligomannosides and analogs displayed by BC2L-A, a Burkholderia cenocepacia soluble lectin. Glycobiology. 2010;20:87–98. doi: 10.1093/glycob/cwp151.
- 50. Steentoft C., Vakhrushev S.Y., Vester-Christensen M.B., Schjoldager K.T.B.G., Kong Y., Bennett E.P., et al. Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines. Nat. Methods. 2011;8:977–982. doi: 10.1038/nmeth.1731.
- 51. Takahashi S., Sasaki T., Manya H., Chiba Y., Yoshida A., Mizuno M., et al. A new beta-1,2-N-acetylglucosaminyltransferase that may play a role in the biosynthesis of mammalian O-mannosyl glycans. Glycobiology. 2001;11:37–45. doi: 10.1093/glycob/11.1.37.
- 52. Stalnaker S.H., Aoki K., Lim J.M., Porterfield M., Liu M., Satz J.S., et al. Glycomic analyses of mouse models of congenital muscular dystrophy. J. Biol. Chem. 2011;286:21180–21190. doi: 10.1074/jbc.M110.203281.
- 53. Schjoldager K.T., Narimatsu Y., Joshi H.J., Clausen H. Global view of human protein glycosylation pathways and functions. Nat. Rev. Mol. Cell Biol. 2020;21:729–749. doi: 10.1038/s41580-020-00294-x.
- 54. Kruger R.P., Lee J., Li W., Guan K.L. Mapping netrin receptor binding reveals domains of Unc5 regulating its tyrosine phosphorylation. J. Neurosci. 2004;24:10826–10834. doi: 10.1523/JNEUROSCI.3715-04.2004.
- 55. Goto Y., Niwa Y., Suzuki T., Dohmae N., Umezawa K., Simizu S. C-mannosylation of human hyaluronidase 1: possible roles for secretion and enzymatic activity. Int. J. Oncol. 2014;45:344–350. doi: 10.3892/ijo.2014.2438.
- 56. Sasazawa Y., Sato N., Suzuki T., Dohmae N., Simizu S. C-mannosylation of thrombopoietin receptor (c-Mpl) regulates thrombopoietin-dependent JAK-STAT signaling. Biochem. Biophys. Res. Commun. 2015;468:262–268. doi: 10.1016/j.bbrc.2015.10.116.
- 57. Fujiwara M., Kato S., Niwa Y., Suzuki T., Tsuchiya M., Sasazawa Y., et al. C-mannosylation of R-spondin3 regulates its secretion and activity of Wnt/β-catenin signaling in cells. FEBS Lett. 2016;590:2639–2649. doi: 10.1002/1873-3468.12274.
- 58. Okamoto S., Murano T., Suzuki T., Uematsu S., Niwa Y., Sasazawa Y., et al. Regulation of secretion and enzymatic activity of lipoprotein lipase by C-mannosylation. Biochem. Biophys. Res. Commun. 2017;486:558–563. doi: 10.1016/j.bbrc.2017.03.085.
- 59. John A., Järvå M.A., Shah S., Mao R., Chappaz S., Birkinshaw R.W., et al. Yeast- and antibody-based tools for studying tryptophan C-mannosylation. Nat. Chem. Biol. 2021;17:428–437. doi: 10.1038/s41589-020-00727-w.
- 60. Buettner F.F.R., Ashikov A., Tiemann B., Lehle L., Bakker H. C. elegans DPY-19 is a C-mannosyltransferase glycosylating thrombospondin repeats. Mol. Cell. 2013;50:295–302. doi: 10.1016/j.molcel.2013.03.003.
- 61.Pagnamenta A.T., Jackson A., Perveen R., Beaman G., Petts G., Gupta A., et al. Biallelic TMEM260 variants cause truncus arteriosus, with or without renal defects. Clin. Genet. 2022;101:127–133. doi: 10.1111/cge.14071. [DOI] [PubMed] [Google Scholar]
- 62.Endo T. Glycobiology of α-dystroglycan and muscular dystrophy. J. Biochem. 2015;157:1–12. doi: 10.1093/jb/mvu066. [DOI] [PubMed] [Google Scholar]
- 63.Levery S.B., Steentoft C., Halim A., Narimatsu Y., Clausen H., Vakhrushev S.Y. Advances in mass spectrometry driven O-glycoproteomics. Biochim. Biophys. Acta. 2015;1850:33–42. doi: 10.1016/j.bbagen.2014.09.026. [DOI] [PubMed] [Google Scholar]
- 64.Winterhalter P.R., Lommel M., Ruppert T., Strahl S. O-glycosylation of the non-canonical T-cadherin from rabbit skeletal muscle by single mannose residues. FEBS Lett. 2013;587:3715–3721. doi: 10.1016/j.febslet.2013.09.041. [DOI] [PubMed] [Google Scholar]
- 65.Shcherbakova A., Preller M., Taft M.H., Pujols J., Ventura S., Tiemann B., et al. C-mannosylation supports folding and enhances stability of thrombospondin repeats. Elife. 2019;8:1–15. doi: 10.7554/eLife.52978. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Polasky D.A., Yu F., Teo G.C., Nesvizhskii A.I. Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat. Methods. 2020;17:1125–1132. doi: 10.1038/s41592-020-0967-9.
- 67.Hanisch F.G., Breloy I. Protein-specific glycosylation: signal patches and cis-controlling peptidic elements. Biol. Chem. 2009;390:619–626. doi: 10.1515/BC.2009.043.
- 68.Rubinstein R., Thu C.A., Goodman K.M., Wolcott H.N., Bahna F., Mannepalli S., et al. Molecular logic of neuronal self-recognition through protocadherin domain interactions. Cell. 2015;163:629–642. doi: 10.1016/j.cell.2015.09.026.
- 69.Lommel M., Schott A., Jank T., Hofmann V., Strahl S. A conserved acidic motif is crucial for enzymatic activity of protein O-mannosyltransferases. J. Biol. Chem. 2011;286:39768–39775. doi: 10.1074/jbc.M111.281196.
- 70.Perez-Riba A., Itzhaki L.S. The tetratricopeptide-repeat motif is a versatile platform that enables diverse modes of molecular recognition. Curr. Opin. Struct. Biol. 2019;54:43–49. doi: 10.1016/j.sbi.2018.12.004.
- 71.Racapé M., Duong Van Huyen J.P., Danger R., Giral M., Bleicher F., Foucher Y., et al. The involvement of SMILE/TMTC3 in endoplasmic reticulum stress response. PLoS One. 2011;6:e19321. doi: 10.1371/journal.pone.0019321.
- 72.Akasaka-Manya K., Manya H., Nakajima A., Kawakita M., Endo T. Physical and functional association of human protein O-mannosyltransferases 1 and 2. J. Biol. Chem. 2006;281:19339–19345. doi: 10.1074/jbc.M601091200.
- 73.Deutsch E.W., Bandeira N., Perez-Riverol Y., Sharma V., Carver J.J., Mendoza L., et al. The ProteomeXchange consortium at 10 years: 2023 update. Nucleic Acids Res. 2023;51:D1539–D1548. doi: 10.1093/nar/gkac1040.
- 74.Perez-Riverol Y., Bai J., Bandla C., García-Seisdedos D., Hewapathirana S., Kamatchinathan S., et al. The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res. 2022;50:D543–D552. doi: 10.1093/nar/gkab1038.
- 75.Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
Data Availability Statement
The mass spectrometry proteomics data, including .raw and .msf files and annotated/indexed glycopeptide spectra, have been deposited to the ProteomeXchange Consortium (73) via the PRIDE (74) partner repository with the dataset identifier PXD045597. This article contains supplemental data, including all validated data reported in tabular format. Annotated data used to generate the figures, as well as all code used for generating figures and processing data, are available upon request from the corresponding author.







