Abstract
N-acetylgalactosaminyl-transferases (GalNAc-Ts) initiate mucin-type O-glycosylation, an abundant and complex posttranslational modification that regulates host-microbe interactions, tissue development, and metabolism. GalNAc-Ts contain a lectin domain consisting of three homologous repeats (α, β, and γ), where α and β can potentially interact with O-GalNAc on substrates to enhance activity toward a nearby acceptor Thr/Ser. The ubiquitous isoenzyme GalNAc-T1 modulates heart development, immunity, and SARS-CoV-2 infectivity, but its substrates are largely unknown. Here, we show that both α and β in GalNAc-T1 uniquely orchestrate the O-glycosylation of various glycopeptide substrates. The α repeat directs O-glycosylation to acceptor sites carboxyl-terminal to an existing GalNAc, while the β repeat directs O-glycosylation to amino-terminal sites. In addition, GalNAc-T1 incorporates α and β into various substrate binding modes to cooperatively increase the specificity toward an acceptor site located between two existing O-glycans. Our studies highlight a unique mechanism by which dual lectin repeats expand substrate specificity and provide crucial information for identifying the biological substrates of GalNAc-T1.
Cross-talk between two sugar-binding pockets of the O-glycosyltransferase GalNAc-T1 influences its complex substrate specificity.
INTRODUCTION
Mucin-type O-glycosylation is a posttranslational modification that occurs on ~80% of proteins transported through the secretory pathway, influencing their structure, stability, and function (1). Substrates include the densely O-glycosylated mucin proteins responsible for the gel-like properties of mucus that lines and lubricates the epithelial surface to protect underlying layers from physical or chemical damage, prevent pathogen invasion, and maintain homeostasis through interactions with the microbiota. Aberrant mucin-type O-glycosylation is associated with various cancers, developmental disorders, neurological syndromes, metabolic diseases, and inflammation of the gastrointestinal tract (2–5).
A conserved family of N-acetylgalactosaminyl-transferases (GalNAc-Ts) initiate mucin-type O-glycosylation by catalyzing GalNAc transfer from uridine 5′-diphosphate (UDP)–GalNAc to a Thr/Ser on a protein to form Thr/Ser–O-GalNAc (6–10). GalNAc-Ts are Golgi-anchored type II membrane proteins containing two luminal domains that are important for activity, including an N-terminal Mn2+-dependent catalytic domain that adopts a glycosyltransferase family A (GT-A) fold (Fig. 1A). GalNAc-Ts contain a ricin B-type lectin domain that is connected to the C terminus of the catalytic domain via a ~10–amino acid linker. The lectin domain consists of three homologous repeats (α, β, and γ), where α and β, but not γ, have GalNAc-binding potential (11, 12). Lectin domain binding to an extant GalNAc on a substrate (Thr–O-GalNAc) enhances the enzymatic activity of most GalNAc-Ts toward an acceptor site 7 to 11 amino acids away (13–18). For some GalNAc-Ts, the preferred acceptor site is N-terminal to the extant Thr–O-GalNAc (N-terminal directionality), and, for others, it is C-terminal (C-terminal directionality) (15, 16).
Fig. 1. Characterization of the GalNAc-T1 lectin domain mutants.
(A) GalNAc-T1 and its closest human homolog GalNAc-T13 are the only isoenzymes containing two lectin domain repeats (α and β) that can bind GalNAc. The catalytic domains are light gray, while the lectin domain repeats are shown in pink (α repeat), blue (β repeat), and beige (γ repeat). (B) Catalytic efficiency (kcat/KM) of GalNAc-T1WT, GalNAc-T1D444A, GalNAc-T1D484A, and GalNAc-T1D444A/D484A against Muc5AC-A peptides. Putative acceptor sites on Muc5AC peptides are colored red. Yellow squares represent GalNAc. Assays were performed in triplicate in three independent experiments. Outliers were identified and eliminated using the ROUT method. Additional kinetic data are shown in fig. S1 and table S1. (C) Glycosylation N-terminal to a previous site is directed by the GalNAc-T1 β repeat, whereas the GalNAc-T1 α repeat dictates glycosylation in the C-terminal direction. (D) MD simulation results assessing side chain flexibility, one of the metrics used for comparative analysis of the dynamics upon single mutations in the α or β repeats of GalNAc-T1, showing the asymmetric changes introduced by the mutations compared to GalNAc-T1WT (blue, <0.5 Å; yellow, >0.8 Å; red, in between). The starting model of human GalNAc-T1 was produced by AlphaFold2. The catalytic domain was excluded from the analysis.
In humans, 20 GalNAc-Ts with distinct and overlapping substrate specificities and functions meet the challenge of O-glycosylating numerous and diverse substrates (19). There is no consensus sequence for O-glycosylation. Instead, each isoenzyme follows its own rules for sequence recognition near an acceptor site (20–22). The isoenzyme GalNAc-T1 is highly expressed in most tissue types and influences many biological pathways, including salivary gland function, heart development, immune system function, and viral infectivity (23–32). However, the repertoire of substrates and the mechanisms by which GalNAc-T1 recognizes those substrates are poorly understood. GalNAc-T1 and its closest homolog GalNAc-T13 are the only human isoenzymes with a conserved GalNAc-binding motif consisting of Asp, His, and Asn in both the α and β repeats of the lectin domain (Fig. 1A and fig. S1A) (33). Previous studies show that GalNAc-T1 variants D444A (α) and D484A (β), but not D525A (γ), have reduced affinity toward glycosylated peptide and protein substrates, suggesting a role for both α and β in GalNAc-T1 activity (34, 35). However, the mechanistic consequences of two GalNAc-binding lectin repeats and their role in substrate specificity have not been shown, and many of the biological substrates of GalNAc-T1 are still not known.
Here, we use biochemical, structural, and computational methods to gain insight into the mechanism of GalNAc-T1–mediated glycosylation and show that two lectin repeats influence GalNAc-T1 function and substrate specificity in multiple ways. First, we demonstrate that the α repeat directs O-glycosylation C-terminal to an existing Thr–O-GalNAc, while the β repeat directs O-glycosylation N-terminal to Thr–O-GalNAc. Unexpectedly, disrupting GalNAc binding in the α repeat influences the function of the β repeat and vice versa. We show that the effects on enzymatic activity are asymmetrical, with α and β affecting each other in distinct ways. Last, we show that the α and β repeats can interact with two Thr–O-GalNAc on a mucin 1 (Muc1) di-glycopeptide substrate through multiple binding modes to greatly enhance GalNAc-T1 activity toward an intermediate acceptor site. Overall, our results reveal the unique mechanisms of GalNAc-T1 lectin–mediated substrate recognition and expand our understanding of the diverse substrate specificities among this complex family of enzymes.
RESULTS
The GalNAc-T1 lectin domain α and β repeats have distinct directionality but function cooperatively
Studies using random peptide libraries show that GalNAc-T1 can efficiently glycosylate acceptor sites that are N- or C-terminal to a Thr–O-GalNAc (16); however, the molecular basis of this bidirectionality is not known. To investigate the roles of the GalNAc-T1 α and β repeats in glycopeptide substrate recognition and O-glycosylation directionality, we constructed lectin repeat variants to disrupt GalNAc binding in α (GalNAc-T1D444A), β (GalNAc-T1D484A) or both repeats (GalNAc-T1D444A/D484A). To simplify the kinetic analysis, we designed peptide and glycopeptide substrates based on the Muc5AC mucin repeat (GTT3PSPVPTTSTT13SAP) but with only one or two acceptor sites (highlighted in bold): Muc5AC-A (GAT3PAPVPAGAGT13PAP); Muc5AC-A3 (GAT3GalNAcPAPVPAGAGT13PAP), for measuring glycosylation in the C-terminal direction; and Muc5AC-A13 (GAT3PAPVPAGAGT13GalNAcPAP) for measuring glycosylation in the N-terminal direction (Fig. 1B). On the basis of previous structural studies with Muc5AC peptides, we predict that the length of these peptides prevents them from concurrently binding both repeats, allowing the individual probing of each repeat with a single Thr–O-GalNAc (36). Using a bioluminescent glycosyltransferase assay (UDP-Glo, Promega), we measured the activity of wild-type (WT) and lectin domain variants of GalNAc-T1 against these substrates.
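The substrate design above can be checked with a short script. The following is a minimal sketch, assuming the annotated notation used in this section (a residue letter, an optional position number, and an optional GalNAc tag); the function names `parse_annotated` and `acceptor_offsets` are hypothetical helpers introduced only for illustration. It recovers the plain sequence, the numbered sites, and the signed spacing between the free acceptor Thr and the Thr–O-GalNAc.

```python
import re

def parse_annotated(annotated):
    """Split e.g. 'GAT3GalNAcPAPVPAGAGT13PAP' into sequence and site lists."""
    sequence = ""
    numbered, glycosylated = [], []
    # Token: one residue letter, optionally a position number,
    # optionally followed by a GalNAc tag (hypothetical parsing rule).
    for residue, number, sugar in re.findall(r"([A-Z])(?:(\d+)(GalNAc)?)?", annotated):
        sequence += residue
        if number:
            # The printed number should agree with the residue's index in the chain.
            assert int(number) == len(sequence), (residue, number)
            numbered.append(int(number))
            if sugar:
                glycosylated.append(int(number))
    return sequence, numbered, glycosylated

def acceptor_offsets(annotated):
    """(acceptor, GalNAc site, offset) triples; a positive offset means the
    free acceptor lies C-terminal to the existing GalNAc."""
    _, numbered, glycosylated = parse_annotated(annotated)
    free = [p for p in numbered if p not in glycosylated]
    return [(acceptor, sugar, acceptor - sugar) for acceptor in free for sugar in glycosylated]

peptides = {
    "Muc5AC-A": "GAT3PAPVPAGAGT13PAP",
    "Muc5AC-A3": "GAT3GalNAcPAPVPAGAGT13PAP",
    "Muc5AC-A13": "GAT3PAPVPAGAGT13GalNAcPAP",
}
for name, peptide in peptides.items():
    print(name, acceptor_offsets(peptide))
```

For both mono-glycopeptides the acceptor lies 10 residues from the Thr–O-GalNAc, which falls within the 7 to 11 residue window described in the Introduction; only the sign of the offset differs between Muc5AC-A3 and Muc5AC-A13.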
The kcat/KM values show that GalNAc-T1WT prefers glycosylated substrates Muc5AC-A3 and Muc5AC-A13 over the non-glycosylated peptide Muc5AC-A (Fig. 1B and fig. S1, B and C), consistent with previous studies showing that Thr–O-GalNAc enhances GalNAc-T1 activity via interactions with the lectin domain (16, 34, 35). Disrupting α repeat GalNAc binding (GalNAc-T1D444A) results in a 16-fold reduction in kcat/KM compared to GalNAc-T1WT when using Muc5AC-A3 as a substrate due to a 10-fold increase in KM and 3-fold decrease in kcat (Fig. 1B and fig. S1, B and C). These data suggest that both substrate binding and acceptor threonine alignment in the active site are affected by disrupting α repeat GalNAc binding. There are minimal changes to the activity of GalNAc-T1D444A toward Muc5AC-A and Muc5AC-A13 compared to GalNAc-T1WT. Thus, the α repeat interacts with a Thr–O-GalNAc N-terminal to an acceptor site to specifically direct glycosylation in the C-terminal direction but does not affect O-glycosylation N-terminal to an existing Thr–O-GalNAc (Fig. 1C).
Disrupting β repeat GalNAc binding results in a 29-fold decrease in kcat/KM for GalNAc-T1D484A compared to GalNAc-T1WT when using Muc5AC-A13 as a substrate, indicating that the β repeat has specificity toward a C-terminal Thr–O-GalNAc and directs glycosylation to an N-terminal acceptor site (Fig. 1, B and C). The decrease is driven by a 20-fold increase in KM and 1.5-fold drop in kcat, suggesting that substrate binding has a more dominant role in influencing β repeat specificity than α repeat specificity (fig. S1C). Disrupting both the α and β repeats (GalNAc-T1D444A/D484A) results in a decrease in kcat/KM for both mono-glycopeptides compared to GalNAc-T1WT, due primarily to an increase in KM (Fig. 1B and fig. S1B), supporting earlier studies showing that glycopeptide specificity of GalNAc-T1 is dictated by the α and β repeats (35).
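Because catalytic efficiency is the ratio kcat/KM, a fold decrease in kcat and a fold increase in KM combine multiplicatively. The sketch below illustrates this arithmetic for GalNAc-T1D484A with Muc5AC-A13, using the rounded fold changes quoted in the text; the function names are hypothetical, and the fitted values in table S1 (not the rounded ones used here) are the authoritative numbers.

```python
import math

def efficiency_fold_decrease(kcat_fold_decrease, km_fold_increase):
    """Fold drop in kcat/KM implied by a fold drop in kcat and a fold rise in KM."""
    return kcat_fold_decrease * km_fold_increase

def km_share_of_effect(kcat_fold_decrease, km_fold_increase):
    """Fraction of the log-scale loss in kcat/KM that is attributable to KM."""
    total = math.log(efficiency_fold_decrease(kcat_fold_decrease, km_fold_increase))
    return math.log(km_fold_increase) / total

# Rounded values from the text: 1.5-fold drop in kcat, 20-fold increase in KM.
print(efficiency_fold_decrease(1.5, 20))       # 30.0, close to the reported 29-fold
print(round(km_share_of_effect(1.5, 20), 2))   # 0.88
```

On a logarithmic scale, roughly 88% of the loss in efficiency in this example comes from KM, which is one way to express the statement that substrate binding dominates the β repeat effect. Rounded fold changes need not multiply to the reported value exactly.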
Unexpectedly, disrupting β repeat GalNAc binding (GalNAc-T1D484A) results in a twofold reduction in kcat/KM compared to GalNAc-T1WT when using Muc5AC-A3 as a substrate (Fig. 1B and fig. S1, B and C), indicating that the β repeat could also influence glycosylation C-terminal to a prior Thr–O-GalNAc. The reduction is due to a threefold decrease in kcat with no substantial change to KM, indicating that correct substrate alignment, not enzyme-substrate complex formation, is affected by D484A. Thus, we hypothesize that a single–amino acid change in the β repeat could have long-range consequences on the function of the α repeat and that the two repeats function cooperatively. To assess how this cooperativity influences enzymatic function, we performed molecular dynamics (MD) simulations of GalNAc-T1WT, GalNAc-T1D444A (α), GalNAc-T1D484A (β), and GalNAc-T1D444A/D484A (α and β) using apo-GalNAc-T1WT as a template. Comparative analyses show that a mutation in one of the repeats induces side-chain reorientations and dynamic changes across the entire lectin domain with only minor backbone conformational changes, affecting the other repeat relative to GalNAc-T1WT (Fig. 1D and fig. S2). The main differences are observed in side-chain flexibility and their long-range cross-correlation of movements. The observed effects on the β repeat when α repeat GalNAc binding is reduced (GalNAc-T1D444A) are distinct from the effects on the α repeat when the β repeat GalNAc binding is decreased (GalNAc-T1D484A), revealing asymmetrical cooperation between the repeats.
Modest but potentially notable H-bond rearrangements can also be seen at the lectin (β repeat)/catalytic domain interface that could influence the correct alignment of the substrate in the active site by stabilizing a realigned catalytic domain relative to the lectin domain (Fig. 1D and fig. S2). It is worth noting that loops 476-482 and 492-502 in the β repeat are closer to the catalytic domain than the corresponding loops in the α repeat and contain several basic and acidic residues (Asp479, Asp480, Lys496, His498, and His499) poised to interact with the catalytic domain; this is not the case for the corresponding loops in the α repeat, where only loop 446-454 contains charged residues (Arg448, Lys449, Glu450, Glu452, and Lys453), and these are not well positioned to interact with the catalytic domain. Thus, any changes in their interaction patterns (H-bonds or salt bridges) are bound to have asymmetrical long-range consequences. These results suggest that disrupting one repeat even with a single mutation could trigger changes in the α/β, β/catalytic-domain, and peptide/lectin-domain interaction patterns, conceivably affecting KM or kcat independently, and explain the differences that we observed in the kcat of Muc5AC-A3 between GalNAc-T1WT and GalNAc-T1D484A (fig. S1, B and C).
To verify that single point mutations in the α or β repeats prevent glycopeptide binding through direct interactions with Thr–O-GalNAc, we performed a GalNAc competition assay by measuring the activity of the GalNAc-T1 variants against the Muc5AC-A substrates in the presence of increasing concentrations of free GalNAc (fig. S3, A to D). We predict that high concentrations of free GalNAc could reduce GalNAc-T1 activity through two possible mechanisms: by replacing UDP-GalNAc in the active site and by interacting with lectin repeat GalNAc binding pockets to block glycopeptide substrate binding. The relative activity of GalNAc-T1WT (relative to 0 mM free GalNAc) toward glycopeptides Muc5AC-A3 and Muc5AC-A13 decreases with increasing concentrations of free GalNAc, while the relative activity is not similarly perturbed toward the (unglycosylated) Muc5AC-A (fig. S3A). These observations demonstrate that GalNAc-T1WT interacts directly with Thr–O-GalNAc via its lectin repeats. If a lectin repeat mutation prevents Thr–O-GalNAc binding, then free GalNAc should not be able to inhibit binding toward that substrate to the same extent as it does in GalNAc-T1WT. For GalNAc-T1D444A, the point mutation disrupts binding to Thr–O-GalNAc on Muc5AC-A3 but has minimal effect on interactions with Thr–O-GalNAc on Muc5AC-A13. Thus, free GalNAc does not compete with Muc5AC-A3 binding because the α repeat is disrupted but competes with Muc5AC-A13 binding because the β repeat is intact (fig. S3B). A similar rationale explains the results for GalNAc-T1D484A: Because the β repeat is altered and the α repeat is intact, we see unaffected activity toward Muc5AC-A13 and Muc5AC-A and greater inhibition toward Muc5AC-A3 in the presence of free GalNAc (fig. S3C). Free GalNAc does not affect the binding of GalNAc-T1D444A/D484A to Muc5AC-A3 or Muc5AC-A13 to the same extent as GalNAc-T1WT, demonstrating that this variant binds GalNAc less efficiently through the lectin domain. 
Overall, these data support the kinetic data and show that a single–amino acid change in the α or β repeats can disrupt GalNAc binding.
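The competition assay reports activity relative to the 0 mM free GalNAc condition. A minimal sketch of that normalization is shown here; the function name, the concentrations, and the readings are all hypothetical placeholders and are not data from this study.

```python
def relative_activity(signal_by_galnac_mM):
    """Normalize each reading to the reading at 0 mM free GalNAc."""
    baseline = signal_by_galnac_mM[0]
    if baseline <= 0:
        raise ValueError("the 0 mM reading must be positive")
    return {conc: signal / baseline for conc, signal in signal_by_galnac_mM.items()}

# Hypothetical luminescence readings (arbitrary units), NOT measurements from the paper.
example = {0: 2000.0, 50: 1500.0, 100: 1000.0, 200: 500.0}
print(relative_activity(example))
```

Normalizing each enzyme and substrate pair to its own 0 mM reading allows variants with very different absolute activities to be compared on the same scale.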
β repeat binding to a C-terminal Thr–O-GalNAc correctly positions an N-terminal Thr for catalysis
To further validate our biochemical data, we attempted to crystallize both human and mouse GalNAc-T1 (98% identity overall, and 99% identity in the lectin domain) with various glycopeptide substrates but were only able to obtain crystals and an x-ray crystal structure of mouse GalNAc-T1 in complex with Muc5AC-13, UDP, and Mn2+ (Muc5AC-13: GTT3PSPVPTTSTT13GalNAcSAP; table S2 and Fig. 2A) to 2.3-Å resolution. The asymmetric unit contains a dimer with one GalNAc-T1–Mn2+–UDP–(Muc5AC-13)2 complex (A) having well-resolved electron density and one GalNAc-T1–Mn2+–UDP–(Muc5AC-13)1 complex (B) with poor electron density. The analysis centers on complex A, where GalNAc-T1 is bound to two Muc5AC-13 glycopeptides, one via the α repeat and the other via the β repeat (Fig. 2A). The Muc5AC-13 glycopeptide interacting with the β repeat GalNAc binding pocket via Thr13–O-GalNAc correctly positions the acceptor Thr3 into the active site for catalysis (Fig. 2B). Thr13–O-GalNAc bound to GalNAc-T1-β adopts the same conformation as in the conserved α repeat binding pocket, as shown by the structural alignment with GalNAc-T2-α (fig. S4A) (36–39). Unexpectedly, a second Muc5AC-13 peptide is bound to the α repeat via interactions with Thr13–O-GalNAc. However, the 2Fobs-Fcalc electron density for peptide residues beyond Thr10 is not clear and does not provide evidence that the α repeat correctly positions this glycopeptide into the active site for catalysis. Overall, the structure supports the biochemical data showing that the β repeat interacts with a C-terminal GalNAc on a glycopeptide to direct glycosylation in the N-terminal direction and correctly positions the acceptor Thr in the active site for catalysis.
Fig. 2. The x-ray crystal structure of GalNAc-T1.
(A) The structure of GalNAc-T1 showing the catalytic (gray) and lectin domains in complex with UDP (brownish red), Mn2+ (cyan), and two Muc5AC-13 peptides (green) superimposed over a Fo-Fc omit map contoured at 2σ. GalNAc-T1 can interact with C-terminal Thr13–O-GalNAc (yellow sphere) using both its α (magenta) and β (blue) repeats, but only the β repeat directs the acceptor Thr3 (red circle) into the active site. (B) A Fo-Fc omit map (brownish red) contoured at 2σ superimposed over the GalNAc-T1 active site containing UDP (brownish red), His211, Asp209, His344, and water, which coordinate Mn2+ (cyan) with octahedral geometry. A Fo-Fc omit map contoured at 2σ (green) is superimposed over the N terminus of Muc5AC-13, showing the acceptor Thr3 correctly positioned in the active site for catalysis. (C) The α and β repeats of GalNAc-T1 showing GalNAc superimposed over a Fo-Fc omit map (orange) contoured at 2.5σ. Asp444, His460, and Asn465 are involved in hydrogen bonding to GalNAc in the α repeat, while residues Asp484, His498, and Asn503 participate in hydrogen bonds with GalNAc in the β repeat. β repeat GalNAc-T1 residue Lys496 makes hydrogen bonds with peptide residues Thr9 and Thr10 and a backbone carbonyl in β-bound Muc5AC-13. Glu450 in the α repeat of GalNAc-T1 interacts with Lys496 in the β repeat and Thr10 in the β-bound Muc5AC-13, suggesting the presence of intra-lectin domain communication. (D) Comparison of GalNAc-T1 in complex with Muc5AC-13, which uses the β repeat to direct glycosylation in the N-terminal direction, with the crystal structure of GalNAc-T2 in complex with Muc5AC-13 (green), where Thr13–O-GalNAc (yellow sphere) binds to the α repeat (magenta) to direct glycosylation in the N-terminal direction [Protein Data Bank (PDB) ID: 5AJP] (36).
Comparing apo-GalNAc-T1 (40) to substrate-bound GalNAc-T1 reveals that glycopeptide binding does not considerably alter lectin domain conformation, but subtle changes occur in the side chains of several residues, most notably in non-conserved residues Lys496 and Glu450 (figs. S1A and S4B). Lys496 adopts a distinct conformation and makes hydrogen bonds with the Thr9 and Thr10 side chains and a backbone carbonyl in Muc5AC-13 (Fig. 2C). Because other substrates do not necessarily contain Thr at similar positions, Lys496 binding to Muc5AC-13 could uniquely increase specificity toward this substrate via additional side-chain interactions, whereas the interaction with the backbone carbonyl is more likely to be conserved. MD simulations show that the Lys496 H-bonding patterns with residues in the α repeat and catalytic domain are influenced by mutations in the α and β repeats. Therefore, any intra-GalNAc-T1 weakening or strengthening of this residue’s H-bond is predicted to affect affinity and kinetics of peptide binding (fig. S2 and data S2). In addition, Glu450 in the α repeat interacts with Lys496 in the β repeat and Thr10 in the β-bound Muc5AC-13. These interactions can mediate communication between the repeats and further support the observations that Thr–O-GalNAc binding in the β repeat could have long-range effects on GalNAc binding in the α repeat and vice versa.
GalNAc-T2, which has one functional lectin repeat (α), can similarly glycosylate an acceptor site N-terminal to an existing GalNAc (16). We compared the structure of GalNAc-T1–Muc5AC-13 to GalNAc-T2–Muc5AC-13 to further understand how GalNAc-T1-β and GalNAc-T2-α can perform a similar function (36). The lectin domains of GalNAc-T1 and GalNAc-T2 have distinct orientations relative to their catalytic domain, positioning the β repeat of GalNAc-T1 in a similar location in space to the α repeat of GalNAc-T2 (Fig. 2D). Consequently, Thr13–O-GalNAc in Muc5AC-13 binds to the β repeat of GalNAc-T1 to position the N-terminal acceptor of the peptide in the active site similarly to α repeat binding of Muc5AC-13 in GalNAc-T2, showing how the two isoenzymes can O-glycosylate the same glycopeptide N-terminally from an extant GalNAc using distinct lectin repeats.
The GalNAc-T1 lectin domain repeats cooperatively bind di-glycosylated Muc1
Because the α and β repeats of GalNAc-T1 simultaneously bind Thr–O-GalNAc in the crystal structure, we reasoned that a longer glycosylated substrate could simultaneously interact with α and β via GalNAc to enhance the substrate specificity of GalNAc-T1 toward an intervening acceptor site. GalNAc-T1 glycosylation of the membrane-bound protein Muc1 is well characterized and occurs primarily at a specific Thr, making the kinetics easier to interpret (41–45). Therefore, we used Muc1 as our substrate to test the effect of simultaneous binding on substrate specificity. We designed a 34–amino acid Muc1 peptide containing one acceptor site for GalNAc-T1 (bold) and two potential glycosylated sites that we predict will interact with both lectin repeats and correctly position the acceptor threonine -TSAP- into the active site (RPAPGST7APPAHGVT15SAPDTRPAPGST27APPAHGV; Fig. 3A for full panel). To verify O-glycosylation at Thr15, unglycosylated Muc1 treated with WT or variants of GalNAc-T1 was analyzed by higher-energy dissociation product ions-triggered electron-transfer/higher-energy collision dissociation (HCD-pd-EThcD) mass spectrometry followed by analysis with the software package pGlyco3 (46). Thr15 (T15SAP) was the only acceptor site identified with confidence on Muc1 after enzymatic treatment, consistent with previous results (fig. S5 and data S3). These data further show that lectin repeat mutations do not alter the acceptor site used on Muc1.
Fig. 3. GalNAc-T1 lectin domain repeats synergistically bind diglycosylated Muc1.
(A) Catalytic efficiency (kcat/KM) of GalNAc-T1WT, GalNAc-T1D444A, GalNAc-T1D484A, and GalNAc-T1D444A/D484A against Muc1 peptides. GalNAc-T1 activity is synergistically enhanced by the presence of two GalNAcs and two active lectin repeats. Additional kinetic data are given in table S3 and fig. S6. (B) Activity assay verifying that the enhancement from the dual lectin domain GalNAc binding repeats of GalNAc-T1 is synergistic. Assays were performed in duplicate and replicated three times. (C) Schematic representation of the binding modes of diglycosylated Muc1 observed in the dynamics simulations. Stable binding of both GalNAcs was observed in some of the simulations (about 20% of the time; middle panel); in others, either the C terminus (40%, left) or the N terminus (40%, right) detached early on from the GalNAc-binding pocket, whereas the other GalNAc remained bound and stable. The intermediate Thr remained well positioned in the catalytic site in all cases (fig. S9, A and B).
As observed with the Muc5AC-A peptides, the activity of GalNAc-T1WT is enhanced by the presence of GalNAc on Muc1 due to decreases in KM for all the Muc1 glycopeptide substrates compared to unglycosylated Muc1 (Fig. 3A and fig. S6, A and B). A single GalNAc at the N terminus (Muc1-7, predicted to bind the α repeat) shows a 16-fold increase in kcat/KM compared to unglycosylated Muc1, while a GalNAc at the C terminus (Muc1-27, predicted to bind the β repeat) results in a 5-fold increase in kcat/KM compared to unglycosylated Muc1 (Fig. 3A and table S3). For the di-glycopeptide (Muc1-7,27) containing GalNAc at the N and C termini, the kcat/KM is 81-fold greater than for the unglycosylated peptide, suggesting that two GalNAcs on a substrate have a synergistic effect on the activity of GalNAc-T1WT. The increase in catalytic efficiency is driven by a decrease in KM, as kcat is only slightly less than the kcat for the unglycosylated peptide, indicating that, for Muc1-7,27, complex formation is the rate-limiting step and primary driver of substrate specificity and catalytic efficiency. Like GalNAc-T1, GalNAc-T2 is a bidirectional transferase, but with a single GalNAc binding lectin repeat (α) (16). The mechanism of GalNAc-T2 bidirectionality could be a function of its flexible linker that allows its lectin domain to adopt various conformations relative to the catalytic domain and accommodate variable glycopeptides (16, 36, 47). Unlike GalNAc-T1, GalNAc-T2 has similar kcat/KM toward Muc1 mono- and di-glycopeptides (fig. S7 and table S4), providing evidence that the synergistic effect observed for GalNAc-T1 and Muc1 is unique and likely due to the presence of two GalNAc binding lectin repeats. 
We also observed a 1.5-fold difference in GalNAc-T1WT activity toward 50 μM Muc1-7,27 in comparison to a combination of mono-glycopeptides consisting of 25 μM Muc1-7 and 25 μM Muc1-27, indicating that dual binding contributes to the total activity and supporting the synergistic enhancement seen toward Muc1-7,27 in the kinetic data (Fig. 3B).
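One way to make the term "synergistic" concrete is to compare the observed enhancement for Muc1-7,27 with two simple reference expectations built from the mono-glycopeptide values reported above (16-fold and 5-fold). The sketch below is illustrative only: the choice of an additive and a multiplicative reference is an assumption made here, not a model fitted in this study, and the function names are hypothetical.

```python
from math import prod

def additive_expectation(fold_enhancements):
    """Expected fold enhancement if each GalNAc adds its increment over baseline."""
    return 1 + sum(f - 1 for f in fold_enhancements)

def multiplicative_expectation(fold_enhancements):
    """Expected fold enhancement if contributions are independent on a log scale."""
    return prod(fold_enhancements)

single = {"Muc1-7": 16, "Muc1-27": 5}   # fold increase in kcat/KM over unglycosylated Muc1
observed_di = 81                         # Muc1-7,27

additive = additive_expectation(single.values())
multiplicative = multiplicative_expectation(single.values())
print(additive, multiplicative, observed_di / additive)
```

With these rounded values, the observed 81-fold enhancement is about fourfold above the additive expectation (20-fold) and close to the product of the two single-site effects (80-fold).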
As predicted, a mutation in the α repeat (GalNAc-T1D444A) results in a fivefold decrease in kcat/KM toward Muc1-7 compared to GalNAc-T1WT that is driven by an increase in KM (Fig. 3A; fig. S6, A and B; and table S3). A 2.5-fold increase in kcat/KM toward Muc1-27 compared to GalNAc-T1WT further indicates that mutations in one repeat can have global effects on enzyme function. While we expected to see similar activity for GalNAc-T1D444A toward Muc1-27 and Muc1-7,27, the kcat/KM of GalNAc-T1D444A for the di-glycopeptide Muc1-7,27 is ~2-fold higher than for Muc1-27 due to a lower KM for the di-glycopeptide. This could be due to residual Thr–O-GalNAc binding to the α repeat pocket in the absence of Asp444, supported by a ~1.5-fold lower KM for GalNAc-T1D444A and Muc1-7 compared with unglycosylated Muc1 (Fig. 3A and fig. S6). The data do not indicate that residual binding of GalNAc occurs with the shorter Muc5AC-A3 peptide and GalNAc-T1D444A, suggesting that there may be additional interactions between amino acids in the longer Muc1 and GalNAc-T1 that help promote residual binding in the α repeat. Disrupting the β repeat in GalNAc-T1D484A results in a 5.5-fold decrease in kcat/KM toward Muc1-27 relative to GalNAc-T1WT that is largely driven by a 10-fold reduction in the kcat. Thus, unlike the α repeat, a mutation in the β repeat more likely influences substrate alignment of Muc1-27 in the active site and catalytic turnover. This contrasts with GalNAc-T1D484A activity toward Muc5AC-A13, where the decrease in kcat/KM upon disruption of the β repeat is driven mainly by an increase in KM, indicating that peptide sequence, length, and flexibility could influence catalysis. In the case of Muc1-27, interactions with the peptide could have a greater compensatory role in the absence of GalNAc binding, in contrast to Muc5AC-A13. Lys496 or other residues could be interacting with Muc1-27 and compensating for the disruption of GalNAc binding via Asp484. Consequently, GalNAc could bind to the pocket due to proximity but not in a productive manner to properly align the acceptor Thr in the active site.
We used a GalNAc competition assay to further dissect Thr–O-GalNAc–mediated Muc1 glycopeptide binding to GalNAc-T1. Increasing concentrations of free GalNAc reduce the relative activity (relative to 0 mM GalNAc) of GalNAc-T1WT toward the Muc1 glycopeptides compared with the unglycosylated Muc1 peptide (fig. S8A). Inhibition of GalNAc-T1WT activity toward unglycosylated Muc1 is more extensive than observed for Muc5AC-A; however, this trend is consistent across variants, indicating that it does not appear to be affected by mutations in the α or β repeats (fig. S8, A to D). For GalNAc-T1D444A, increasing concentrations of free GalNAc do not decrease the relative activity toward Muc1-7 to the same extent as the other Muc1 substrates, suggesting that this mutation effectively reduces α repeat binding to Thr–O-GalNAc on Muc1-7 (fig. S8B). We expected to see an effect on Muc1-7,27 with GalNAc-T1D444A, but the data for Muc1-7,27 and Muc1-27 are similar, supporting the notion that a mutation in the α repeat has long-range effects on binding to the β repeat that, in this case, appear to compensate for the decreased α-mediated binding to Muc1-7,27.
For GalNAc-T1D484A (β), we observe a decrease in the relative activity toward Muc1-27 at higher concentrations of free GalNAc than what is observed for the other substrates, showing that this mutation also prevents Thr–O-GalNAc from efficiently binding the β repeat (fig. S8C). As with GalNAc-T1D444A, a mutation in the β repeat does not seem to affect GalNAc-mediated Muc1-7,27 binding, pointing to a compensatory effect by the unaltered α repeat. The relative activity of GalNAc-T1D444A/D484A toward the Muc1 glycopeptides is greater than that of GalNAc-T1WT in our GalNAc competition assay, supporting a role for both repeats in Muc1 binding and glycosylation (fig. S8D). Together, the data suggest that disrupting one repeat leads to a compensatory role for the other repeat. Both the GalNAc-T1D444A and GalNAc-T1D484A variants show no major changes in relative activity in our GalNAc competition assay toward Muc1-7,27, providing evidence for intra-lectin domain compensation or influence. The experiments and simulations suggest that the dynamic changes in the β repeat (increased flexibility and decreased cross-correlations) are responsible for the enhanced enzymatic activity of GalNAc-T1D444A, whereas the opposite changes observed in the α repeat diminish the activity of GalNAc-T1D484A, and that these observations can be substrate specific. The changes in the H-bond network at the lectin/catalytic domain interface involving charged residues Asp479, Asp480, and Lys496, all in the β repeat, may also play a role, as they are also located at the peptide/enzyme interface.
To gain insight into the molecular and kinetic mechanisms of GalNAc-T1-Muc1 complex formation, we performed a series of MD simulations of the Muc1 di-glycopeptide bound to GalNAc-T1. The simulations suggest three main stable binding modes (Fig. 3C and fig. S9, A and B). The di-glycopeptide can bind to both repeats simultaneously, with hydrophobic and H-bond interactions stabilizing the two GalNAc in their respective binding sites. However, in most simulations, one GalNAc detaches from its pocket, whereas the other GalNAc remains bound and stable. Although there are noteworthy structural fluctuations of the bound peptides, in all the modes, the intermediate acceptor Thr tends to remain close to and correctly positioned in the catalytic site. The simulations support a kinetic model in which binding to the α repeat, the β repeat, or both directs glycosylation (fig. S9, C and D), and simultaneous binding to both GalNAc moieties is not strictly necessary to increase specificity toward an intermediate acceptor site.
DISCUSSION
GalNAc-T1 is a ubiquitous mucin-type O-glycosyltransferase that regulates critical cellular processes, including immunity and development. However, its substrate repertoire is not well characterized, and how GalNAc-T1–mediated O-glycosylation affects many of these processes is still not known. Like many of the isoenzymes in this family, GalNAc-T1 contains a conserved pocket in the active site that interacts with a proline residue three amino acids downstream of an acceptor site (+3 position) to enhance substrate binding and help align the acceptor Thr for catalysis, as seen in Muc5AC (-T3PSP- and -T13SAP-) and Muc1 (-TSAP-) (47). This gives GalNAc-T1 the ability to modify sites on unglycosylated substrates, such as the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein, where O-glycosylation of Thr678 located near the frequently mutated Pro681 adjacent to the furin cleavage site (-T678NSPRRAR-) decreases the furin-mediated cleavage of spike (31, 32). Although GalNAc-T1 can modify nascent peptides, both the α and β repeats in the lectin domain of GalNAc-T1 were shown to enhance enzymatic activity, indicating that GalNAc-T1 prefers glycopeptide substrates (34, 35). However, the molecular details of how these repeats influence catalysis and substrate specificity have yet to be described.
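The +3 proline preference can be illustrated by scanning the sequences quoted in this paper for Ser/Thr residues that have Pro three positions downstream. This is a minimal sketch with a hypothetical helper name (`plus3_proline_sites`); it lists candidate positions only and is not a predictor of glycosylation sites.

```python
def plus3_proline_sites(sequence, first_position=1):
    """Ser/Thr residues that have a Pro at the +3 position."""
    sites = []
    for i, residue in enumerate(sequence):
        if residue in "ST" and i + 3 < len(sequence) and sequence[i + 3] == "P":
            sites.append(f"{residue}{first_position + i}")
    return sites

muc5ac = "GTTPSPVPTTSTTSAP"                      # Muc5AC mucin repeat
muc1 = "RPAPGSTAPPAHGVTSAPDTRPAPGSTAPPAHGV"      # 34-residue Muc1 peptide
spike = "TNSPRRAR"                               # spike fragment starting at Thr678

print(plus3_proline_sites(muc5ac))        # ['T3', 'S5', 'T13']
print(plus3_proline_sites(muc1))          # ['S6', 'T7', 'T15', 'S26', 'T27']
print(plus3_proline_sites(spike, 678))    # ['T678']
```

The scan recovers T3 and T13 in Muc5AC, T15 in Muc1, and Thr678 in spike, but it also flags positions that were not identified as GalNAc-T1 acceptor sites on unglycosylated Muc1. This is consistent with the absence of a simple consensus sequence for O-glycosylation: a +3 proline favors a site but does not define one.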
In this study, we show that GalNAc-T1-α and GalNAc-T1-β cooperate to efficiently glycosylate substrates: The α repeat promotes glycosylation C-terminal to an extant GalNAc, while the β repeat supports N-terminal O-glycosylation to an extant GalNAc. In addition, we show that GalNAc-T1 activity is synergistically enhanced by the presence of two GalNAc groups on a di-glycopeptide substrate that can interact with both the α and β repeats of the lectin domain to correctly position an intermediate acceptor site for catalysis. The effect of mutations in α and β on GalNAc-T1 activity depends on the sequence, length, and flexibility of the substrate, as shown by the contrasting effects of lectin domain mutations on glycopeptides Muc1 and Muc5AC-A. The various substrate binding modes show how GalNAc-T1 can modify a broad substrate repertoire and influence diverse cellular pathways. The mechanism described for GalNAc-T1 can be extended to its closest homolog, GalNAc-T13, which is ~84% identical and has previously been shown to be bidirectional (Fig. 4A) (16). The two paralogs have virtually identical substrate specificity and are essentially regulated by their distinct cell expression patterns (33); GalNAc-T1 is more widely expressed, whereas GalNAc-T13 expression primarily occurs in neuronal cells (48). Understanding the mechanism of GalNAc-T13 glycosylation is likewise critical because GALNT13 is associated with many cancers (33), further highlighting the importance of this study.
Fig. 4. Lectin domains differentially influence the directionality of human GalNAc-Ts.
(A) GalNAc-T1 and GalNAc-T13 are bidirectional due to two GalNAc-binding lectin repeats, and GalNAc-T1 (and likely GalNAc-T13) simultaneously binds two Thr–O-GalNAc via its α and β repeats. GalNAc-T2 and GalNAc-T3 (and possibly GalNAc-T6) are bidirectional but have only one GalNAc-binding lectin repeat (α). GalNAc-T5 and GalNAc-T16 are bidirectional but have no known ability to bind GalNAc via the lectin repeat. The α repeats of GalNAc-T4 and GalNAc-T12 direct C-terminal glycosylation from a prior site with additional enhancement coming from catalytic domain GalNAc binding. GalNAc-T14 has N-terminal directionality but no evidence of lectin domain-mediated GalNAc binding. GalNAc-T7 and GalNAc-T10 (and possibly GalNAc-T17) have GalNAc binding capability in the lectin domain (α for T7 and β for T10 and T17), but activity enhancement is driven by catalytic domain GalNAc binding. There are no data for GalNAc-T8, GalNAc-T9, GalNAc-T11, GalNAc-T15, GalNAc-T18, and GalNAc-T19. Structural models of human GalNAc-T3, GalNAc-T5, GalNAc-T6, GalNAc-T8, GalNAc-T9, GalNAc-T11, GalNAc-T13, GalNAc-T14, GalNAc-T15, GalNAc-T16, GalNAc-T17, GalNAc-T18, and GalNAc-T19 were generated with AlphaFold2 (58), and experimental structures were used for GalNAc-T1 (this paper); GalNAc-T2, GalNAc-T4, GalNAc-T7, GalNAc-T10, and GalNAc-T12 (36, 37, 49, 50, 59); and the peptide bound to human GalNAc-T3 (green) (38). Catalytic domains are shown in light gray, α in magenta, β in blue, γ in wheat, and peptides from experimental structures in green. (B) The hierarchal model for O-glycosylation. GalNAc-T1 was previously characterized as a polypeptide-preferring, early transferase. However, its lectin domain-mediated preference for glycopeptides suggests that it can function early in the process and at intermediate stages to efficiently glycosylate acceptor sites that are positioned between two prior sites.
Apart from GalNAc-T13, these analyses cannot be generalized to other human GalNAc-Ts because the sequence identity between the lectin domain of GalNAc-T1 and the other isoenzymes is ~20 to 30%. This strongly indicates an immense degree of heterogeneity in how these repeats influence substrate binding, align the acceptor in the active site, and regulate enzyme function. GalNAc-T2, GalNAc-T3, GalNAc-T6, and GalNAc-T7 contain GalNAc binding pockets in the α repeat (Fig. 4A and fig. S1A). Both GalNAc-T2 and GalNAc-T3 are capable of glycosylating in both directions (16), and, on the basis of sequence homology, GalNAc-T6 would be expected to behave similarly to GalNAc-T3. For GalNAc-T2, bidirectionality could be due to the flexible linker between the catalytic and lectin domains, which could reorient the α repeat to promote binding to Thr–O-GalNAc with N- or C-terminal acceptor sites located 7 to 11 amino acids from the extant site (36, 39). GalNAc-T12 and GalNAc-T4 have a strong preference for α repeat–mediated C-terminal O-glycosylation and contain GalNAc-binding pockets in their catalytic domains that further enhance enzymatic activity (37, 49).
α Repeat enhancement is minimal for GalNAc-T7, which also contains a putative GalNAc binding pocket in the catalytic domain (16). GalNAc-T10 and, by extension, GalNAc-T17 have GalNAc-binding residues in the β repeat and primarily exhibit GalNAc binding ability via the catalytic domain. As with GalNAc-T7, lectin domain enhancement is present but reduced for GalNAc-T10 in comparison to other isoenzymes toward glycopeptides with lectin domain binding Thr–O-GalNAc sites. A chimeric protein consisting of GalNAc-T10 with a GalNAc-T1 linker increases enzymatic activity compared to GalNAc-T10WT, suggesting that the GalNAc-T10 linker may be preventing efficient lectin domain binding to a glycopeptide substrate (50). In contrast, GalNAc-T14 glycosylates N-terminal to a prior site on glycopeptides with lectin domain binding Thr–O-GalNAc sites, despite the lack of conserved GalNAc binding motifs within the lectin domain, and, similarly, GalNAc-T5 and GalNAc-T16 do not have GalNAc binding residues within the lectin domain but appear to be bidirectional (16). It is possible that the lectin domains of GalNAc-T14, GalNAc-T5, and GalNAc-T16 contain unidentified, distinct GalNAc binding residues or pockets, which remains to be determined. There are currently no data on the lectin domains of GalNAc-T11, GalNAc-T15, or the Y family of transferases GalNAc-T8, GalNAc-T9, GalNAc-T18, and GalNAc-T19. Thus, many unknowns remain encoded within the lectin domains of many GalNAc-Ts, and studies on individual enzymes will provide insight into the unique roles of each distinct lectin domain.
The proposed sequential and hierarchal model of O-glycosylation starts with GalNAc-Ts that readily glycosylate nascent proteins (early transferases), followed by transferases that glycosylate partially glycosylated proteins (intermediate transferases), and concludes with GalNAc-Ts that recognize and further modify densely glycosylated substrates (late transferases) (Fig. 4B). The ability of GalNAc-T1 to glycosylate nascent peptides led to its characterization as a “peptide-preferring” enzyme that functions as an early transferase (51, 52). However, our data show that, although GalNAc-T1 is active against nascent peptides, its activity is synergistically amplified toward di-glycopeptides that can simultaneously interact via GalNAc with the α and β repeats. Thus, GalNAc-T1 is “peptide capable” as opposed to peptide preferring. This contrasts with strict glycopeptide-preferring isoenzymes like GalNAc-T10, GalNAc-T7, GalNAc-T4, and GalNAc-T12, which do not modify nascent peptides. Strict glycopeptide-preferring GalNAc-Ts typically contain a GalNAc binding pocket in their catalytic domain that allows them to glycosylate one to three amino acids away from the previous site (16, 37). The data highlight a complex role for GalNAc-T1, where it functions at early and intermediate points in the glycosylation pathway depending on the substrate and the context (Fig. 4B).
Overall, this study provides an additional example of how each member of this large enzyme family, which does not recognize a consensus sequence or motif, follows its own rules for substrate recognition and activity. In doing so, the enzymes can work together to O-glycosylate a wide range of substrates. Some enzymes, like GalNAc-T1 and GalNAc-T2, function at multiple stages of O-glycosylation. Other enzymes have limited variability and acceptor site specificity. Each isoenzyme expressed within a cell type coordinates with other isoenzymes to correctly glycosylate the many proteins that will be secreted into the extracellular space. The location(s) of each individual enzyme within the Golgi adds another layer of regulation. However, it is not clear how relative GalNAc-T localization, aqueous environment, and molecular crowding influence GalNAc-T function in vivo.
Because many of the biological substrates of GalNAc-T1 have not been identified, how the lectin repeats influence site specificity in cells is still not known. We interrogated the National Cancer Institute Genomic Data Commons Data Portal and found four cases of bronchus/lung cancer associated with a D444H mutation in GALNT1, which is the most frequent cancer-associated mutation in the functional region of the enzyme, and a case of bladder cancer associated with a D484N mutation. However, how mutations in GALNT1 disrupt other biological pathways remains unclear. Thus, understanding the different ways that GalNAc-T1 O-glycosylates its substrates in vitro can help to correctly interpret glycoproteomics data from human samples, particularly samples with mutations in GALNT1, and to begin identifying the biological substrates that are affected by GalNAc-T1–mediated O-glycosylation in health and disease.
MATERIALS AND METHODS
Expression and purification of human and mouse GalNAc-T1
A glycerol stock of Pichia pastoris expressing mouse GalNAc-T1 (pKN55-His6-TEV-mGalNAc-T1-∆56) and a plasmid expressing human GalNAc-T1 (pKN55-His6-TEV-hGalNAc-T1-∆44) were gifts from the Tabak lab at the National Institute of Dental and Craniofacial Research (NIDCR) at the National Institutes of Health (NIH). GalNAc-T1D444A and GalNAc-T1D484A were generated by site-directed mutagenesis using pKN55-His6-TEV-hGalNAc-T1-∆44 as a template, and GalNAc-T1D444A/D484A was generated by site-directed mutagenesis using GalNAc-T1D444A as a template (table S5). The mutations were verified by DNA sequencing. Plasmids were isolated, linearized with PmeI, and transformed by electroporation into SMD1168 P. pastoris expression cells (Invitrogen) to generate stable GalNAc-T1–expressing strains. Transformants were selected on Minimal Dextrose (MD) plates [1.34% (w/v) Yeast Nitrogen Base (YNB), 4 × 10−5% (w/v) biotin, and 2% (w/v) dextrose]. All proteins were expressed and purified using the same protocol. Cultures of 1 liter were grown, protein expression was induced, and proteins were purified as described (36).
Transferase assays and enzyme kinetics
Peptides and glycopeptides were synthesized and purified by Anaspec. Enzyme kinetics and activity were determined using the UDP-Glo Glycosyltransferase Assay (Promega). GalNAc-T1 and GalNAc-T2 glycosylation reactions (25 μl) were performed in a master mix containing 25 mM Hepes, 5 mM MnCl2, 100 mM NaCl, 5 mM β-mercaptoethanol (βME), and 25 μM ultra-pure UDP-GalNAc (Promega) (pH 7.3) in the presence of variable concentrations of peptide to generate enzyme kinetic curves. To initiate reactions against peptides Muc5AC-A, Muc5AC-A3, and Muc5AC-A13, 1 μl of ~0.3 μM WT human GalNAc-T1 or variants was added directly to the master mix and incubated at room temperature for 30 min. To initiate glycosylation reactions against the Muc1 set of peptides, the master mix was added to the 96-well plate containing 1 μl of ~0.4 μM GalNAc-T1 or ~1 nM GalNAc-T2. The reactions were then incubated at 37°C for 15 min. All reactions were stopped using 25 μl of UDP-detection reagent and mixed for 30 s in the double orbital mode at 365 cpm. Glycosylation reactions against peptides Muc5AC-A, Muc5AC-A3, and Muc5AC-A13 were then incubated for 1 hour at room temperature. Glycosylation reactions against the Muc1 set of peptides were incubated for 1 hour at exactly 27°C in the dark. If bubbles were present following incubation, the 96-well plate was spun at 1600 rpm for 30 s. Luminescence values were obtained using a Synergy Neo2 (Biotek) with a 0.3-s integration time and automatic gain determination. To correlate the UDP concentration to luminescence and approximate the product formation, we generated a UDP standard curve in each assay consisting of 25 mM Hepes, 5 mM MnCl2, 100 mM NaCl, 5 mM βME, and 0 to 25 μM UDP (pH 7.3). The resulting kinetic values were corrected for free UDP-GalNAc hydrolysis before being fit to a Michaelis-Menten or substrate inhibition model in GraphPad Prism 9 software.
The KM and maximum velocity (Vmax) values were obtained from each biological replicate and used to calculate kcat and kcat/KM. SEs were calculated where appropriate. The ROUT (robust regression and outlier removal) method was used to identify and eliminate outliers. Reactions were performed in duplicate or triplicate and repeated two to three times as noted. Both inhibition and activity assays were carried out as outlined above. GalNAc inhibition reactions were performed using 50 μM peptide substrate.
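The data reduction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GraphPad Prism workflow used in this study: the function names and example numbers are hypothetical, the standard curve is assumed to be linear over 0 to 25 μM UDP, hydrolysis is assumed to be measured in a control lacking acceptor peptide, and the substrate inhibition equation is the common form v = Vmax[S]/(KM + [S](1 + [S]/Ki)).

```python
def fit_standard_curve(udp_uM, luminescence):
    """Least-squares line: luminescence = slope * [UDP] + intercept.

    Linearity of the UDP standard curve is an assumption of this sketch.
    """
    n = len(udp_uM)
    mean_x = sum(udp_uM) / n
    mean_y = sum(luminescence) / n
    sxx = sum((x - mean_x) ** 2 for x in udp_uM)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(udp_uM, luminescence))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def luminescence_to_udp(lum, slope, intercept):
    """Convert a luminescence reading to a UDP concentration (uM)."""
    return (lum - intercept) / slope


def initial_rate(lum_reaction, lum_no_peptide, slope, intercept, minutes):
    """Rate of UDP formation (uM/min) after subtracting UDP-GalNAc hydrolysis.

    lum_no_peptide is a hypothetical control reading without acceptor peptide.
    """
    udp = luminescence_to_udp(lum_reaction, slope, intercept)
    background = luminescence_to_udp(lum_no_peptide, slope, intercept)
    return (udp - background) / minutes


def michaelis_menten(s, vmax, km):
    """v = Vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """v = Vmax * [S] / (KM + [S] * (1 + [S] / Ki))."""
    return vmax * s / (km + s * (1.0 + s / ki))


def catalytic_constants(vmax_uM_per_min, km_uM, enzyme_uM):
    """Return (kcat in min^-1, kcat/KM in uM^-1 min^-1)."""
    kcat = vmax_uM_per_min / enzyme_uM
    return kcat, kcat / km_uM


# Illustrative numbers only (not data from this study).
slope, intercept = fit_standard_curve([0, 5, 10, 25], [100, 600, 1100, 2600])
rate = initial_rate(1100, 200, slope, intercept, minutes=30)
kcat, efficiency = catalytic_constants(1.2, 60.0, 0.012)
```

The enzyme concentration passed to catalytic_constants is the final concentration in the reaction; for example, 1 μl of ~0.3 μM enzyme diluted into a 25-μl reaction would give roughly 0.012 μM, assuming the stated reaction volume is the final volume.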
Crystallization and data collection
Around 5 to 7 ml of purified mouse GalNAc-T1 (98% identity to human GalNAc-T1) stored at −80°C was thawed and buffer exchanged into 100 mM NaCl, 20 mM Hepes, 0.25 mM EDTA, and 10 mM βME (pH 7.3) by purification over a SD200 16/60 gel filtration column. Peak fractions were collected and concentrated to ~8 to 10 mg/ml using a 10-kDa cutoff spin column (Millipore). The complex was assembled by combining GalNAc-T1 with 4 mM Muc5AC-13 (Anaspec), 5 mM UDP, and 5 mM MnCl2 to a final GalNAc-T1 concentration of 5 mg/ml. The complex was incubated at room temperature for 15 min and then placed on ice. Hanging drops were set up by mixing 1 μl of complex with an equal volume of well buffer containing 20% PEG-3350 (polyethylene glycol, molecular weight 3350) and 0.1 M MES (pH 6.5). Crystals were cryoprotected with 20% glycerol, 20% PEG-3350, and 0.1 M MES (pH 6.5) and frozen in liquid nitrogen. X-ray data were collected at the Advanced Photon Source SER-CAT BM-22 beamline (Argonne, IL). Data were processed and scaled using HKL2000 (53), and the initial structure was solved by Molecular Replacement (MolRep, CCP4i) (54, 55) using the structure of apo-mGalNAc-T1 as a search model [Protein Data Bank (PDB) ID: 1XHB]. The structure was refined in PHENIX (56) and Coot (57) to 2.3-Å resolution, and figures were generated in PyMOL (The PyMOL Molecular Graphics System, version 2.0, Schrodinger LLC).
Modeling and simulations
A model of the full-length human GalNAc-T1 (UniProt ID: Q10472) was built with AlphaFold2 (58). The poorly predicted 1-48 segment was removed, and the protein (apo-GalNAc-T1WT) was subjected to a 5-ns MD simulation at 37°C, 1 atm, 120 mM of NaCl, and pH 7 (Supplementary Materials and Methods). A conformation at the end of the simulation was used to build two complexes, one with UDP (UDP–apo-GalNAc-T1WT) and the other with UDP-GalNAc (UDP-GalNAc–apo-GalNAc-T1WT), along with Mn2+ ions, as observed in the crystal structures. Each complex was subjected to a 5-ns MD simulation under the above conditions. A conformation at the end of each of these simulations was used to build two additional complexes with a di-glycosylated human Muc1-derived peptide (repeat APGSTAPPAHGVTSAPDTRPAPGSTAPPA; with acetylated and amidated termini): UDP–apo-GalNAc-T1WT:Muc1 and UDP-GalNAc–UDP–apo-GalNAc-T1WT:Muc1-5-25, where Muc1-5-25 has α-D-GalNAc-L-Thr at positions 5 and 25 (the equivalent to Muc1-7-27; Fig. 3A). In building these protein:peptide complexes, two initial conformations of the peptide were considered: one obtained after a 1-ns MD simulation starting from a stretched backbone conformation and the other as modeled by AlphaFold in the context of the full-length hMuc1 (UniProt ID: P15941), predicted to be part of a solenoid horseshoe structure. The corresponding complexes were modeled through steered MD to position Thr5-GalNAc, Thr13, and Thr25-GalNAc close to their relative positions in the available crystal structures. The results reported here were practically the same regardless of the initial peptide conformation. The final complexes were subjected to (unbiased) 30-ns MD simulations at the above thermodynamic conditions, and all the analyses were carried out over the last 20 ns (Supplementary Materials and Methods). Ten independent simulations were performed for each complex.
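One way to picture the trajectory analysis is a per-frame classification of which lectin repeat holds a GalNAc, restricted to the last 20 ns of each 30-ns run. The sketch below is purely illustrative and is not the analysis described in the Supplementary Materials and Methods: the distance inputs, the 6-Å cutoff, and the function names are hypothetical assumptions chosen for the example.

```python
from collections import Counter


def classify_frame(d_alpha, d_beta, cutoff=6.0):
    """Label one frame by which repeat pocket has a GalNAc within the cutoff.

    d_alpha and d_beta are hypothetical GalNAc-to-pocket distances (in
    angstroms) for the alpha and beta repeats; the cutoff is an assumption.
    """
    alpha_bound = d_alpha <= cutoff
    beta_bound = d_beta <= cutoff
    if alpha_bound and beta_bound:
        return "alpha+beta"
    if alpha_bound:
        return "alpha only"
    if beta_bound:
        return "beta only"
    return "unbound"


def binding_mode_fractions(times_ns, d_alpha, d_beta, start_ns=10.0, cutoff=6.0):
    """Fraction of frames in each binding mode, counting only t >= start_ns.

    With 30-ns runs, start_ns=10.0 keeps the last 20 ns of the trajectory.
    """
    modes = [
        classify_frame(a, b, cutoff)
        for t, a, b in zip(times_ns, d_alpha, d_beta)
        if t >= start_ns
    ]
    if not modes:
        raise ValueError("no frames at or after start_ns")
    counts = Counter(modes)
    return {mode: n / len(modes) for mode, n in counts.items()}


# Illustrative input only (not data from this study).
fractions = binding_mode_fractions(
    times_ns=[0, 5, 10, 15, 20, 25, 30],
    d_alpha=[3.0, 3.0, 3.0, 3.0, 9.0, 9.0, 3.0],
    d_beta=[3.0, 3.0, 3.0, 9.0, 3.0, 9.0, 3.0],
)
```

Pooling such fractions across the ten independent simulations of a complex would indicate how often each binding mode is populated, with the caveat that the result depends on the chosen distance definition and cutoff.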
Acknowledgments:
We thank the Ten Hagen and Tabak labs at NIDCR for helpful discussions and access to resources and equipment, N. Guydosh at NIDDK for reading and editing the manuscript, A. Collette for visual illustration, and the beamline staff at the Advanced Photon Source (SER-CAT) for assistance. We thank our colleagues at NIDCR and the greater glycobiology community for insightful feedback and reviewers for time and constructive comments. Molecular graphics and analyses for MD simulations were performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases. This work used the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov).
Funding: This work was supported by National Institutes of Health grant 1-ZIA-DE000754-03 (to N.L.S.).
Author contributions: Conceptualization: N.L.S. and A.M.C. Investigation: N.L.S., S.I.S., A.M.C., W.Y., A.J.L., and S.A.H. Methodology: N.L.S., A.M.C., S.A.H., and W.Y. Resources: N.L.S. and W.Y. Funding acquisition: N.L.S. Data curation: N.L.S., A.M.C., and W.Y. Validation: N.L.S., A.M.C., and W.Y. Formal analysis: N.L.S., A.M.C., S.A.H., W.Y., and A.J.L. Visualization: N.L.S., A.M.C., S.A.H., W.Y., and A.J.L. Project administration: N.L.S., A.M.C., S.A.H., and W.Y. Supervision: N.L.S. Writing—original draft: N.L.S. Writing—review and editing: N.L.S., A.M.C., S.A.H., and W.Y.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Structure coordinates for the crystal structure of GalNAc-T1 in complex with Muc5AC-13 have been deposited in the Protein Data Bank (www.rcsb.org; PDB ID: 8V9Q; PDB DOI: 10.2210/pdb8v9q/pdb), and x-ray diffraction data have been deposited to SBGrid Data Bank (https://data.sbgrid.org; dataset number 1071, DOI:10.15785/SBGRID/1071) (60). Raw mass spectrometry data have been deposited to ProteomeXchange (www.proteomexchange.org; ID: PXD047845). Raw assay data, kinetics, and statistical calculations are found in data S1; GalNAc-T1 and Muc1 modeling and simulations are found in data S2; and mass spectrometry sample data are found in data S3.
Supplementary Materials
This PDF file includes:
Supplementary Materials and Methods
Figs. S1 to S9
Tables S1 to S5
Legends for data S1 to S3
References
Other Supplementary Material for this manuscript includes the following:
Data S1 to S3
REFERENCES AND NOTES
- 1. Wandall H. H., Nielsen M. A. I., King-Smith S., de Haan N., Bagdonaite I., Global functions of O-glycosylation: Promises and challenges in O-glycobiology. FEBS J. 288, 7183–7212 (2021).
- 2. Kudelka M. R., Ju T., Heimburg-Molinaro J., Cummings R. D., Simple sugars to complex disease—Mucin-type O-glycans in cancer. Adv. Cancer Res. 126, 53–135 (2015).
- 3. Hussain M. R. M., Hoessli D. C., Fang M., N-acetylgalactosaminyltransferases in cancer. Oncotarget 7, 54067–54081 (2016).
- 4. Kato K., Hansen L., Clausen H., Polypeptide N-acetylgalactosaminyltransferase-associated phenotypes in mammals. Molecules 26, (2021).
- 5. Zhang Y., Wang L., Ocansey D. K. W., Wang B., Wang L., Xu Z., Mucin-type O-glycans: Barrier, microbiota, and immune anchors in inflammatory bowel disease. J. Inflamm. Res. 14, 5939–5953 (2021).
- 6. Sorensen T., White T., Wandall H. H., Kristensen A. K., Roepstorff P., Clausen H., UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase. Identification and separation of two distinct transferase activities. J. Biol. Chem. 270, 24166–24173 (1995).
- 7. de Las Rivas M., Lira-Navarrete E., Gerken T. A., Hurtado-Guerrero R., Polypeptide GalNAc-Ts: From redundancy to specificity. Curr. Opin. Struct. Biol. 56, 87–96 (2019).
- 8. Bennett E. P., Mandel U., Clausen H., Gerken T. A., Fritz T. A., Tabak L. A., Control of mucin-type O-glycosylation: A classification of the polypeptide GalNAc-transferase gene family. Glycobiology 22, 736–756 (2012).
- 9. Raman J., Guan Y., Perrine C. L., Gerken T. A., Tabak L. A., UDP-N-acetyl-α-d-galactosamine:polypeptide N-acetylgalactosaminyltransferases: Completion of the family tree. Glycobiology 22, 768–777 (2012).
- 10. Ten Hagen K. G., Fritz T. A., Tabak L. A., All in the family: The UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases. Glycobiology 13, 1R–16R (2002).
- 11. Hazes B., The (QxW)3 domain: A flexible lectin scaffold. Protein Sci. 5, 1490–1501 (1996).
- 12. Imberty A., Piller V., Piller F., Breton C., Fold recognition and molecular modeling of a lectin-like domain in UDP GalNAc:polypeptide N-acetylgalactosaminyltransferases. Protein Eng. 10, 1353–1356 (1997).
- 13. Hassan H., Reis C. A., Bennett E. P., Mirgorodskaya E., Roepstorff P., Hollingsworth M. A., Burchell J., Taylor-Papadimitriou J., Clausen H., The lectin domain of UDP-N-acetyl-D-galactosamine: Polypeptide N-acetylgalactosaminyltransferase-T4 directs its glycopeptide specificities. J. Biol. Chem. 275, 38197–38205 (2000).
- 14. Raman J., Fritz T. A., Gerken T. A., Jamison O., Live D., Liu M., Tabak L. A., The catalytic and lectin domains of UDP-GalNAc:polypeptide alpha-N-Acetylgalactosaminyltransferase function in concert to direct glycosylation site selection. J. Biol. Chem. 283, 22942–22951 (2008).
- 15. Gerken T. A., Revoredo L., Thome J. J., Tabak L. A., Vester-Christensen M. B., Clausen H., Gahlay G. K., Jarvis D. L., Johnson R. W., Moniz H. A., Moremen K., The lectin domain of the polypeptide GalNAc transferase family of glycosyltransferases (ppGalNAc Ts) acts as a switch directing glycopeptide substrate glycosylation in an N- or C-terminal direction, further controlling mucin type O-glycosylation. J. Biol. Chem. 288, 19900–19914 (2013).
- 16. Revoredo L., Wang S., Bennett E. P., Clausen H., Moremen K. W., Jarvis D. L., Ten Hagen K. G., Tabak L. A., Gerken T. A., Mucin-type O-glycosylation is controlled by short- and long-range glycopeptide substrate recognition that varies among members of the polypeptide GalNAc transferase family. Glycobiology 26, 360–376 (2016).
- 17. Wandall H. H., Irazoqui F., Tarp M. A., Bennett E. P., Mandel U., Takeuchi H., Kato K., Irimura T., Suryanarayanan G., Hollingsworth M. A., Clausen H., The lectin domains of polypeptide GalNAc-transferases exhibit carbohydrate-binding specificity for GalNAc: Lectin binding to GalNAc-glycopeptide substrates is required for high density GalNAc-O-glycosylation. Glycobiology 17, 374–387 (2007).
- 18. Pedersen J. W., Bennett E. P., Schjoldager K. T., Meldal M., Holmer A. P., Blixt O., Clo E., Levery S. B., Clausen H., Wandall H. H., Lectin domains of polypeptide GalNAc transferases exhibit glycopeptide binding specificity. J. Biol. Chem. 286, 32684–32696 (2011).
- 19. Kong Y., Joshi H. J., Schjoldager K. T., Madsen T. D., Gerken T. A., Vester-Christensen M. B., Wandall H. H., Bennett E. P., Levery S. B., Vakhrushev S. Y., Clausen H., Probing polypeptide GalNAc-transferase isoform substrate specificities by in vitro analysis. Glycobiology 25, 55–65 (2015).
- 20. Gerken T. A., Jamison O., Perrine C. L., Collette J. C., Moinova H., Ravi L., Markowitz S. D., Shen W., Patel H., Tabak L. A., Emerging paradigms for the initiation of mucin-type protein O-glycosylation by the polypeptide GalNAc transferase family of glycosyltransferases. J. Biol. Chem. 286, 14493–14507 (2011).
- 21. Gerken T. A., Raman J., Fritz T. A., Jamison O., Identification of common and unique peptide substrate preferences for the UDP-GalNAc:polypeptide alpha-N-acetylgalactosaminyltransferases T1 and T2 derived from oriented random peptide substrates. J. Biol. Chem. 281, 32403–32416 (2006).
- 22. O'Connell B. C., Hagen F. K., Tabak L. A., The influence of flanking sequence on the O-glycosylation of threonine in vitro. J. Biol. Chem. 267, 25010–25018 (1992).
- 23. Block H., Ley K., Zarbock A., Severe impairment of leukocyte recruitment in ppGalNAcT-1-deficient mice. J. Immunol. 188, 5674–5681 (2012).
- 24. Phelan C. M., Tsai Y. Y., Goode E. L., Vierkant R. A., Fridley B. L., Beesley J., Chen X. Q., Webb P. M., Chanock S., Cramer D. W., Moysich K., Edwards R. P., Chang-Claude J., Garcia-Closas M., Yang H., Wang-Gohrke S., Hein R., Green A. C., Lissowska J., Carney M. E., Lurie G., Wilkens L. R., Ness R. B., Pearce C. L., Wu A. H., Van Den Berg D. J., Stram D. O., Terry K. L., Whiteman D. C., Whittemore A. S., DiCioccio R. A., McGuire V., Doherty J. A., Rossing M. A., Anton-Culver H., Ziogas A., Hogdall C., Hogdall E., Kjaer S. K., Blaakaer J., Quaye L., Ramus S. J., Jacobs I., Song H., Pharoah P. D., Iversen E. S., Marks J. R., Pike M. C., Gayther S. A., Cunningham J. M., Goodman M. T., Schildkraut J. M., Chenevix-Trench G., Berchuck A., Sellers T. A., A. C. S. Ovarian Cancer Association Consortium, G. Australian Ovarian Cancer Study, Polymorphism in the GALNT1 gene and epithelial ovarian cancer in non-Hispanic white women: The ovarian cancer association consortium. Cancer Epidemiol. Biomarkers Prev. 19, 600–604 (2010).
- 25. Sellers T. A., Huang Y., Cunningham J., Goode E. L., Sutphen R., Vierkant R. A., Kelemen L. E., Fredericksen Z. S., Liebow M., Pankratz V. S., Hartmann L. C., Myer J., Iversen E. S. Jr., Schildkraut J. M., Phelan C., Association of single nucleotide polymorphisms in glycosylation genes with risk of epithelial ovarian cancer. Cancer Epidemiol. Biomarkers Prev. 17, 397–404 (2008).
- 26. Simon E. J., Linstedt A. D., Site-specific glycosylation of Ebola virus glycoprotein by human polypeptide GalNAc-transferase 1 induces cell adhesion defects. J. Biol. Chem. 293, 19866–19873 (2018).
- 27. Tenno M., Ohtsubo K., Hagen F. K., Ditto D., Zarbock A., Schaerli P., von Andrian U. H., Ley K., Le D., Tabak L. A., Marth J. D., Initiation of protein O glycosylation by the polypeptide GalNAcT-1 in vascular biology and humoral immunity. Mol. Cell. Biol. 27, 8783–8796 (2007).
- 28. Tian E., Hoffman M. P., Ten Hagen K. G., O-glycosylation modulates integrin and FGF signalling by influencing the secretion of basement membrane components. Nat. Commun. 3, 869 (2012).
- 29. Tian E., Stevens S. R., Guan Y., Springer D. A., Anderson S. A., Starost M. F., Patel V., Ten Hagen K. G., Tabak L. A., Galnt1 is required for normal heart valve development and cardiac function. PLOS ONE 10, e0115861 (2015).
- 30. Zhang L., Lv B., Shi X., Gao G., High expression of N-acetylgalactosaminyl-transferase 1 (GALNT1) associated with invasion, metastasis, and proliferation in osteosarcoma. Med. Sci. Monit. 26, e927837 (2020).
- 31. Zhang L., Mann M., Syed Z. A., Reynolds H. M., Tian E., Samara N. L., Zeldin D. C., Tabak L. A., Ten Hagen K. G., Furin cleavage of the SARS-CoV-2 spike is modulated by O-glycosylation. Proc. Natl. Acad. Sci. U.S.A. 118, (2021).
- 32. Gonzalez-Rodriguez E., Zol-Hanlon M., Bineva-Todd G., Marchesi A., Skehel M., Mahoney K. E., Roustan C., Borg A., Di Vagno L., Kjær S., Wrobel A. G., Benton D. J., Nawrath P., Flitsch S. L., Joshi D., González-Ramírez A. M., Wilkinson K. A., Wilkinson R. J., Wall E. C., Hurtado-Guerrero R., Malaker S. A., Schumann B., O-linked sialoglycans modulate the proteolysis of SARS-CoV-2 spike and likely contribute to the mutational trajectory in variants of concern. ACS Cent. Sci. 9, 393–404 (2023).
- 33. Festari M. F., Trajtenberg F., Berois N., Pantano S., Revoredo L., Kong Y., Solari-Saquieres P., Narimatsu Y., Freire T., Bay S., Robello C., Benard J., Gerken T. A., Clausen H., Osinaga E., Revisiting the human polypeptide GalNAc-T1 and T13 paralogs. Glycobiology 27, 140–153 (2017).
- 34. Tenno M., Kezdy F. J., Elhammer A. P., Kurosaka A., Function of the lectin domain of polypeptide N-acetylgalactosaminyltransferase 1. Biochem. Biophys. Res. Commun. 298, 755–759 (2002).
- 35. Tenno M., Saeki A., Kezdy F. J., Elhammer A. P., Kurosaka A., The lectin domain of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase 1 is involved in O-glycosylation of a polypeptide with multiple acceptor sites. J. Biol. Chem. 277, 47088–47096 (2002).
- 36. Lira-Navarrete E., de Las Rivas M., Companon I., Pallares M. C., Kong Y., Iglesias-Fernandez J., Bernardes G. J., Peregrina J. M., Rovira C., Bernado P., Bruscolini P., Clausen H., Lostao A., Corzana F., Hurtado-Guerrero R., Dynamic interplay between catalytic and lectin domains of GalNAc-transferases modulates protein O-glycosylation. Nat. Commun. 6, 6937 (2015).
- 37. Fernandez A. J., Daniel E. J. P., Mahajan S. P., Gray J. J., Gerken T. A., Tabak L. A., Samara N. L., The structure of the colorectal cancer-associated enzyme GalNAc-T12 reveals how nonconserved residues dictate its function. Proc. Natl. Acad. Sci. U.S.A. 116, 20404–20410 (2019).
- 38. de Las Rivas M., Paul Daniel E. J., Narimatsu Y., Companon I., Kato K., Hermosilla P., Thureau A., Ceballos-Laita L., Coelho H., Bernado P., Marcelo F., Hansen L., Maeda R., Lostao A., Corzana F., Clausen H., Gerken T. A., Hurtado-Guerrero R., Molecular basis for fibroblast growth factor 23 O-glycosylation by GalNAc-T3. Nat. Chem. Biol. 16, 351–360 (2020).
- 39. de Las Rivas M., Lira-Navarrete E., Daniel E. J. P., Companon I., Coelho H., Diniz A., Jimenez-Barbero J., Peregrina J. M., Clausen H., Corzana F., Marcelo F., Jimenez-Oses G., Gerken T. A., Hurtado-Guerrero R., The interdomain flexible linker of the polypeptide GalNAc transferases dictates their long-range glycosylation preferences. Nat. Commun. 8, 1959 (2017).
- 40. Fritz T. A., Hurley J. H., Trinh L., Shiloach J., Tabak L. A., The beginnings of mucin biosynthesis: The crystal structure of UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferase-T1. Proc. Natl. Acad. Sci. U.S.A. 101, 15307–15312 (2004).
- 41. Coelho H., Rivas M. L., Grosso A. S., Diniz A., Soares C. O., Francisco R. A., Dias J. S., Companon I., Sun L., Narimatsu Y., Vakhrushev S. Y., Clausen H., Cabrita E. J., Jimenez-Barbero J., Corzana F., Hurtado-Guerrero R., Marcelo F., Atomic and specificity details of mucin 1 O-glycosylation process by multiple polypeptide GalNAc-transferase isoforms unveiled by NMR and molecular modeling. JACS Au 2, 631–645 (2022).
- 42. Hanisch F., Muller S., Hassan H., Clausen H., Zachara N., Gooley A. A., Paulsen H., Alving K., Peter-Katalinic J., Dynamic epigenetic regulation of initial O-glycosylation by UDP N-acetylgalactosamine:peptide N-acetylgalactosaminyltransferases. J. Biol. Chem. 274, 9946–9954 (1999).
- 43. Kirnarsky L., Nomoto M., Ikematsu Y., Hassan H., Bennett E. P., Cerny R. L., Clausen H., Hollingsworth M. A., Sherman S., Structural analysis of peptide substrates for mucin-type O-glycosylation. Biochemistry 37, 12811–12817 (1998).
- 44. Stadie T. R. E., Chai W., Lawson A. M., Byfield P. G. H., Hanisch F., Studies on the order and site specificity of GalNAc transfer to MUC1 tandem repeats by UDP-GalNAc: Polypeptide N-acetylgalactosaminyltransferase from milk or mammary carcinoma cells. Eur. J. Biochem. 229, 140–147 (1995).
- 45. Wandall H. H., Hassan H., Mirgorodskaya E., Kristensen A. K., Roepstorff P., Bennett E. P., Nielsen P. A., Hollingsworth M. A., Burchell J., Taylor-Papadimitriou J., Clausen H., Substrate specificities of three members of the human UDP-N-acetyl-α-D-galactosamine: Polypeptide N-acetylgalactosaminyltransferase family, GalNAc-T1, -T2, and -T3. J. Biol. Chem. 272, 23503–23514 (1997).
- 46. Zeng W. F., Cao W. Q., Liu M. Q., He S. M., Yang P. Y., Precise, fast and comprehensive analysis of intact glycopeptides and modified glycans with pGlyco3. Nat. Methods 18, 1515–1523 (2021).
- 47.Fritz T. A., Raman J., Tabak L. A., Dynamic association between the catalytic and lectin domains of human UDP-GalNAc:Polypeptide alpha-N-acetylgalactosaminyltransferase-2. J. Biol. Chem. 281, 8613–8619 (2006). [DOI] [PubMed] [Google Scholar]
- 48.Zhang Y., Iwasaki H., Wang H., Kudo T., Kalka T. B., Hennet T., Kubota T., Cheng L., Inaba N., Gotoh M., Togayachi A., Guo J., Hisatomi H., Nakajima K., Nishihara S., Nakamura M., Marth J. D., Narimatsu H., Cloning and characterization of a new human UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase, designated pp-GalNAc-T13, that is specifically expressed in neurons and synthesizes GalNAc alpha-serine/threonine antigen. J. Biol. Chem. 278, 573–584 (2003). [DOI] [PubMed] [Google Scholar]
- 49.de Las Rivas M., Paul Daniel E. J., Coelho H., Lira-Navarrete E., Raich L., Companon I., Diniz A., Lagartera L., Jimenez-Barbero J., Clausen H., Rovira C., Marcelo F., Corzana F., Gerken T. A., Hurtado-Guerrero R., Structural and mechanistic insights into the catalytic-domain-mediated short-range glycosylation preferences of GalNAc-T4. ACS Cent. Sci. 4, 1274–1290 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Kubota T., Shiba T., Sugioka S., Furukawa S., Sawaki H., Kato R., Wakatsuki S., Narimatsu H., Structural basis of carbohydrate transfer activity by human UDP-GalNAc: Polypeptide alpha-N-acetylgalactosaminyltransferase (pp-GalNAc-T10). J. Mol. Biol. 359, 708–727 (2006). [DOI] [PubMed] [Google Scholar]
- 51.Hagen F. K., Hazes B., Raffo R., deSa D., Tabak L. A., Structure-function analysis of the UDP-N-acetyl-d-galactosamine:polypeptide N-acetylgalactosaminyltransferase. J. Biol. Chem. 274, 6797–6803 (1999). [DOI] [PubMed] [Google Scholar]
- 52.Pratt M. R., Hang H. C., Ten Hagen K. G., Rarick J., Gerken T. A., Tabak L. A., Bertozzi C. R., Deconvoluting the functions of polypeptide N-alpha-acetylgalactosaminyltransferase family members by glycopeptide substrate profiling. Chem. Biol. 11, 1009–1016 (2004). [DOI] [PubMed] [Google Scholar]
- 53. Otwinowski Z., Minor W., “Processing of x-ray diffraction data collected in oscillation mode” in Methods in Enzymology, vol. 276 of Macromolecular Crystallography, part A, C. W. Carter Jr., R. M. Sweet, Eds. (Academic Press, 1997), pp. 307–326.
- 54. Potterton E., Briggs P., Turkenburg M., Dodson E., A graphical user interface to the CCP4 program suite. Acta Crystallogr. D Biol. Crystallogr. 59, 1131–1137 (2003).
- 55. Winn M. D., Ballard C. C., Cowtan K. D., Dodson E. J., Emsley P., Evans P. R., Keegan R. M., Krissinel E. B., Leslie A. G. W., McCoy A., McNicholas S. J., Murshudov G. N., Pannu N. S., Potterton E. A., Powell H. R., Read R. J., Vagin A., Wilson K. S., Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235–242 (2011).
- 56. Adams P. D., Afonine P. V., Bunkóczi G., Chen V. B., Davis I. W., Echols N., Headd J. J., Hung L. W., Kapral G. J., Grosse-Kunstleve R. W., McCoy A. J., Moriarty N. W., Oeffner R., Read R. J., Richardson D. C., Richardson J. S., Terwilliger T. C., Zwart P. H., PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).
- 57. Emsley P., Lohkamp B., Scott W. G., Cowtan K. D., Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
- 58. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Zidek A., Potapenko A., Bridgland A., Meyer C., Kohl S. A. A., Ballard A. J., Cowie A., Romera-Paredes B., Nikolov S., Jain R., Adler J., Back T., Petersen S., Reiman D., Clancy E., Zielinski M., Steinegger M., Pacholska M., Berghammer T., Bodenstein S., Silver D., Vinyals O., Senior A. W., Kavukcuoglu K., Kohli P., Hassabis D., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
- 59. Yu C., Liang L., Yin Y., Structural basis of carbohydrate transfer activity of UDP-GalNAc: Polypeptide N-acetylgalactosaminyltransferase 7. Biochem. Biophys. Res. Commun. 510, 266–271 (2019).
- 60. Morin A., Eisenbraun B., Key J., Sanschagrin P. C., Timony M. A., Ottaviano M., Sliz P., Collaboration gets the most out of software. eLife 2, e01456 (2013).
- 61. Brooks B. R., Brooks C. L. III, Mackerell A. D., Nilsson L., Petrella R. J., Roux B., Won Y., Archontis G., Bartels C., Boresch S., Caflisch A., Caves L., Cui Q., Dinner A. R., Feig M., Fischer S., Gao J., Hodoscek M., Im W., Kuczera K., Lazaridis T., Ma J., Ovchinnikov V., Paci E., Pastor R. W., Post C. B., Pu J. Z., Schaefer M., Tidor B., Venable R. M., Woodcock H. L., Wu X., Yang W., York D. M., Karplus M., CHARMM: The biomolecular simulation program. J. Comput. Chem. 30, 1545–1615 (2009).
- 62. Sondergaard C. R., Olsson M. H. M., Rostkowski M., Jensen J. H., Improved treatment of ligands and coupling effects in empirical calculation and rationalization of pKa values. J. Chem. Theory Comput. 7, 2284–2295 (2011).
- 63. Olsson M. H. M., Sondergaard C. R., Rostkowski M., Jensen J. H., PROPKA3: Consistent treatment of internal and surface residues in empirical pKa predictions. J. Chem. Theory Comput. 7, 525–537 (2011).
- 64. Hassan S. A., Mehler E. L., Zhang D., Weinstein H., Molecular dynamics simulations of peptides and proteins with a continuum electrostatic model based on screened Coulomb potentials. Proteins 51, 109–125 (2003).
- 65. Hassan S. A., Steinbach P., Water-exclusion and liquid-structure forces in implicit solvation. J. Phys. Chem. B 115, 14668–14682 (2011).
- 66. Meng E. C., Goddard T. D., Pettersen E. F., Couch G. S., Pearson Z. J., Morris J. H., Ferrin T. E., UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 32, e4792 (2023).
Supplementary Materials
Supplementary Materials and Methods
Figs. S1 to S9
Tables S1 to S5
Legends for data S1 to S3
References
Data S1 to S3




