Glycobiology. 2021 Sep 24:cwab102. doi: 10.1093/glycob/cwab102

Distinct shifts in site-specific glycosylation pattern of SARS-CoV-2 spike proteins associated with arising mutations in the D614G and Alpha variants

Chu-Wei Kuo 1, Tzu-Jing Yang 2, Yu-Chun Chien 3, Pei-Yu Yu 4, Shang-Te Danny Hsu 5,6, Kay-Hooi Khoo 7,8
PMCID: PMC8689840  PMID: 34735575

Abstract

Extensive glycosylation of the spike protein of severe acute respiratory syndrome coronavirus 2 not only shields the major part of it from host immune responses, but glycans at specific sites also act on its conformational dynamics and contribute to efficient host receptor binding, and hence infectivity. As variants of concern arise during the course of the coronavirus disease of 2019 pandemic, it is unclear if mutations accumulated within the spike protein would affect its site-specific glycosylation pattern. The Alpha variant derived from the D614G lineage is distinguished from others by having deletion mutations located right within an immunogenic supersite of the spike N-terminal domain (NTD) that make it refractory to most neutralizing antibodies directed against this domain. Although the overall structural conformation is largely maintained, our mass spectrometry-based site-specific glycosylation analyses of similarly produced spike proteins with and without the D614G and Alpha variant mutations reveal a significant shift in the processing state of N-glycans on one specific NTD site. Its conversion to a higher proportion of complex type structures is indicative of altered spatial accessibility attributable to mutations specific to the Alpha variant that may impact its transmissibility. This and other more subtle changes in glycosylation features detected at other sites provide crucial missing information otherwise not apparent in the available cryogenic electron microscopy-derived structures of the spike protein variants.

Keywords: cryo-EM, mass spectrometry, SARS-CoV-2, site-specific glycosylation

Introduction

In slightly more than a year since the outbreak of the coronavirus disease of 2019 (COVID-19) pandemic, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike glycoprotein (hereafter S protein) has become one of the most intensely studied glycoproteins in the shortest period of time. All possible aspects have been investigated, from its three-dimensional (3D) structure to site-specific glycosylation pattern, complemented by numerous host receptor (human angiotensin converting enzyme 2; hACE2) and neutralizing antibody binding data (Scudellari 2020; Barcena et al. 2021). The initial effort was to identify any structural difference between the trimeric S protein of SARS-CoV-2 and that of its closest relative, SARS-CoV-1, as well as other related coronaviruses (CoVs) infectious to humans. More infectious genetic variants of SARS-CoV-2 have since emerged, as the virus continues to spread and accumulate mutations (Harvey et al. 2021; Janik et al. 2021). The D614G mutation in the S protein was among the first identified in the early phase of the pandemic and became prevalent globally with increased infectivity (Korber et al. 2020; Volz et al. 2021; Zhou, Thao, Hoffmann, et al. 2021). Starting in late 2020, rapid evolution led to the simultaneous appearance of several variants of concern (CDC), including the B.1.1.7 variant that first appeared in the United Kingdom, now labeled as the Alpha variant (Konings et al. 2021), which was found to be more transmissible and likely also to cause more severe illness (Davies, Abbott, et al. 2021; Davies, Jarvis, et al. 2021). In fact, each emerging variant introduces considerable uncertainty about the effectiveness of currently available vaccines and neutralizing antibodies (Corti et al. 2021; Harvey et al. 2021; Janik et al. 2021), all of which were initially developed based on the original non-mutated spike protein sequence.

Currently available glycosylation data for the intact S protein are mostly derived from liquid chromatography with tandem mass spectrometry (LC–MS/MS)-based analysis of the recombinant trimeric samples likewise constructed based on the original non-mutated ectodomain sequence and produced in HEK293 (Wang et al. 2020; Watanabe et al. 2020; Zhao et al. 2020; Allen et al. 2021; Brun et al. 2021) or insect cells (Bangaru et al. 2020; Zhang et al. 2020; Zhou, Tian, Qi, et al. 2021), as well as a couple of studies on the S protein extracted from virions produced in African green monkey Vero cells (Yao et al. 2020; Allen et al. 2021) or human lung cancer cell lines (Brun et al. 2021). Since glycosylation is both host- and protein conformation-dependent, it is anticipated that the glycosylation pattern of HEK293-derived intact trimeric S proteins would better resemble that of the actual virus infecting and propagating through human host cells, particularly at the level of the relative amount of oligomannose versus hybrid/complex type at each of the known 22 N-glycosylation sites. Beyond this simplistic classification, further enumerating the exact terminal structural units carried on the complex glycans may not be relevant, since these would be HEK293-specific and would not reflect those of the infected cells and infecting virus per se. HEK293, e.g., is known to produce significant amounts of terminal ± sulfated GalNAcβ1-4GlcNAc (LacdiNAc), with little or no Lewis type structures (Narimatsu et al. 2019; Zhao et al. 2020), which are not universal features of all human cells. Similarly, glycosylation analyses of S proteins produced in insect cells, or of the S1 and S2 subunits (Shajahan et al. 2020; Zhang et al. 2020; Brun et al. 2021) independently expressed out of their trimeric assembly context, only matter insofar as they are considered for use as antigens for diagnostics and vaccine candidates. A serological assay (Amanat et al. 2020) has demonstrated that the S protein produced in HEK293 cells displays a significantly higher reactivity in COVID-19 plasma/serum detection compared with that produced in insect cells, despite the two sharing an identical protein sequence, thus underscoring the importance of the glycosylation patterns to host immunity and potentially vaccine efficacy.

Since the pandemic outbreak, many laboratories, including ours, have been engaged in determining the atomic structures of the aforementioned trimeric S proteins produced in HEK293, along with rapid site-specific glycosylation mapping, not least to ensure batch-to-batch consistency of antigen production for various downstream assays, diagnostics and vaccine development. Typically, to increase stability and improve yield, the soluble form of the intact S protein trimer was engineered to have the putative furin cleavage site abolished together with a tandem proline replacement in the S2 subunit to lock the S protein in a prefusion conformation (known as the 2P mutation), and the transmembrane helix was replaced by a T4 fibritin foldon trimerization motif followed by different affinity tags for purification (Pallesen et al. 2017). More recent work has demonstrated that the degree of downstream processing of glycosylation at each site closely resembles that derived from virion particles when side-by-side analysis was performed (Brun et al. 2021). This lends credence to the analysis of engineered recombinant trimers as surrogates of the trimeric S protein on the true virion. Any significant departure from the established distribution of oligomannose versus hybrid/complex type N-glycans on each site may be interpreted as altered local conformation and spatial flexibility that would affect the accessibility of various glycosyltransferases (Allen et al. 2021). This is particularly relevant as mutated variants are evolving rapidly. LC–MS/MS-based glycosylation analysis is far more sensitive, rapid, accessible and less costly than producing recombinant proteins of sufficient quality and quantity for cryo-electron microscopy (cryo-EM) structure determinations. Moreover, there are already several reports showing that the glycan coats may act on the conformational plasticity of the S protein beyond a simple shielding effect (Casalino et al. 2020; Sztain et al. 2021). 
Others have shown that glycans, particularly those on the N-terminal domain (NTD), would potentiate binding to auxiliary lectin receptors, thus enhancing infectivity (Soh et al. 2020; Lempp et al. 2021). Although none of the accumulated mutations in the variants of concern directly abolish any of the conserved 22 N-glycosylation sites, it is not known if their respective glycosylation pattern would be significantly altered by underlying conformational changes, which, in turn, may impact on the viral transmissibility, pathogenic severity of COVID-19, and the preventive solutions being deployed worldwide.

To allow a reliable site-specific glycosylation mapping comparison across different S protein constructs and mutated sequences, a standardized and reliable analytical platform is required to discern any real shift in the glycosylation pattern of the variants from random noise intrinsic to the very heterogeneous nature of protein glycosylation. Drawing from the many reported analyses and our own, we have carefully evaluated the S protein glycopeptide data acquisition and analysis settings. We then adopted a sample and data processing workflow that places a higher premium on throughput and objective consistency over comprehensive coverage in both qualitative identification and quantitative mapping. Biological triplicate analyses starting from producing each recombinant spike protein were undertaken, and glycosylation variations on each site for all triplicates were faithfully recorded and reported along with the averaged values and normalized percentage total. We further dealt with problematic technical issues, including balancing the false positives and negatives and identifying the minor population of O-glycopeptides amidst the overwhelming pool of N-glycopeptides and peptides without additional enrichment or fractionation.

We found that the mutations of the D614G and Alpha variants did not impact significantly on the gross site-specific glycosylation pattern, consistent with no major change in the 3D conformation of their trimeric spike proteins as determined by cryo-EM, except for different proportions of their receptor-binding domains (RBD) found in either an open, upward or a closed, downward conformation (Yurkovetskiy et al. 2020; Benton et al. 2021; Cai et al. 2021; Gobeil et al. 2021; Yang, Yu, Chang, Kuo, et al. 2021; Yang, Yu, Chang, Liang, et al. 2021). Focusing on the few key N-glycosylation sites in proximity to the pocket formed by the RBD and the NTD of one protomer and the neighboring RBD from another protomer, we nevertheless detected a significant shift in the relative amount of less processed oligomannose state versus more processed complex type structures at a few sites, indicative of a slight but significant alteration in the local spatial compactness and accessibility. Of high relevance is the considerable shift to more processed N-glycans at site N122 in the Alpha variant, which is directly affected by the nearby Y144 deletion (Δ144), as evident from the cryo-EM density map. Moreover, preserving the original furin processing 682RRAR685 sequence indeed brought instability and lower yield but curiously effected a change in the predominant O-glycosylation forms of one site without significantly altering the overall N-glycosylation pattern. We conclude that minor changes in local conformation brought by deletion and point mutations, within a largely stable overall footprint, would considerably affect the downstream processing efficiency of the attached glycans. Since each of the glycans covers an appreciable surface area and may effectively mask or distort a conformational epitope, as well as engage additional host receptors, its impact on the efficacy of neutralizing antibodies and the acquired cell-mediated immunity upon vaccination cannot be underestimated.

Results and discussion

Datasets generated and glycopeptide identification criteria

Site-specific glycosylation mapping was undertaken for each of the recombinant trimeric S proteins produced in HEK293F cells, purified from the culture supernatant, checked for the integrity of trimeric assembly by SEC-MALS in the case of S-fm2P (SEC elution profiles for other variants) and used for cryo-EM-based structural studies. Three biological replicates were prepared for each of the S-2P, S-fm2P, S-D614G and S-Alpha variants (S-2P and S-fm2P refer to the trimeric spike proteins from the original strain with and without the furin cleavage site, respectively, whereas both S-D614G and S-Alpha carry the same furin site mutations as in S-fm2P, see Figure 1A), yielding a total of twelve S protein samples subjected to an analytical workflow (Figure 1B) optimized for rapid and standardized quantitative assessment of potential variations in site-specific glycosylation. Our strategy did not aim for comprehensive coverage of all 22 known N-glycosylation sites, nor was it specifically geared towards uncovering as many low occupancy O-glycosylation sites as possible. Instead, a single simultaneous trypsin and chymotrypsin digestion was applied, which would reproducibly yield good quality site-specific N-glycosylation information for 19 sites, including the few critical ones for the NTD and RBD. The N-glycosylation pattern is visualized in a few different formats. First, the complete qualitative identification results as processed by Byonic search and filtered by PEP2D < 0.001 and score > 200, and the ensuing Byologic-quantifiable glycopeptides filtered by the unique peptide backbone and glycan composition derived from the higher stringency identification results, are reported as Supplementary Tables SII and SIII, respectively, in Excel sheet format. All twelve LC–MS/MS raw datasets have also been deposited to the MassIVE site. 
Second, the quantified glycopeptides normalized as percentage total of all glycan composition identified for each site were visualized as a comprehensive heatmap and presented in Supplementary Figure S1. This is further supplemented by individual zoomed-in heat maps for sites N74, N122 and N282 (Supplementary Figure S2), after removing glycopeptides not found on these particular sites.
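
The filtering and normalization steps described above can be illustrated with a minimal sketch. It is not the Byonic/Byologic implementation; the record layout, function names and example values are hypothetical, and only the two thresholds (PEP2D < 0.001 and score > 200) and the idea of reporting each glycan composition as a percentage of the total for its site are taken from the text.

```python
from collections import defaultdict

# Hypothetical record layout: (site, glycan_composition, pep2d, score, peak_area).
# The thresholds mirror the filters quoted in the text; the data are invented.
PEP2D_MAX = 0.001
SCORE_MIN = 200


def filter_hits(hits):
    """Keep identifications with PEP2D < 0.001 and score > 200."""
    return [h for h in hits if h[2] < PEP2D_MAX and h[3] > SCORE_MIN]


def percent_total_per_site(hits):
    """Sum peak areas per (site, glycan) and normalize to % total within each site."""
    area = defaultdict(float)
    site_total = defaultdict(float)
    for site, glycan, _pep2d, _score, peak_area in hits:
        area[(site, glycan)] += peak_area
        site_total[site] += peak_area
    return {key: 100.0 * a / site_total[key[0]] for key, a in area.items()}


example = [
    ("N122", "HexNAc(2)Hex(5)", 0.0001, 350.0, 3.0e6),
    ("N122", "HexNAc(4)Hex(5)Fuc(1)", 0.0002, 410.0, 1.0e6),
    ("N122", "HexNAc(5)Hex(3)", 0.0100, 500.0, 9.0e6),  # fails the PEP2D filter
    ("N234", "HexNAc(2)Hex(9)", 0.0001, 150.0, 5.0e6),  # fails the score filter
    ("N234", "HexNAc(2)Hex(8)", 0.0003, 260.0, 2.0e6),
]
result = percent_total_per_site(filter_hits(example))
for (site, glycan), pct in sorted(result.items()):
    print(f"{site}\t{glycan}\t{pct:.1f}%")
```

In this simplified form the identification filter and the quantification are applied to the same records, whereas in the workflow described here identification and quantification are handled by separate software modules.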

Fig. 1.

The construct designs for the recombinant trimeric spike proteins and the analytical workflow used to map their site-specific glycosylation profiles. (A) The trimeric prefusion S proteins analyzed in parallel for direct comparison were all stabilized by 2P mutations and RRAR to GSAS substitution at the furin cleavage site, with a foldon trimerization motif added for the wild type (fm2P), D614G and Alpha (B.1.1.7) variants, but a stabilized wild type S protein (2P) retaining the furin cleavage site was also included. Despite the deletions in S-Alpha, the residue numbers of the remaining protein sequence and the N-glycosylation sites follow those of S-WT (wild type) and the other variants without deletions. (B) The recombinant S proteins were purified from the culture supernatant and digested simultaneously by trypsin and chymotrypsin for LC–MS/MS analysis as indicated. The acquired datasets were processed by Byonic and the Byologic module under Byos® for glycopeptide identification and quantification, respectively. N- and O-glycopeptides were searched and analyzed separately for better results. (This figure is available in black and white in print and in color at Glycobiology online).

Third, normalized bar charts for the relative amount of major glycopeptide types identified for select NTD sites on each of the four S protein variant triplicates are presented to enable rapid visualization of how the site-specific glycosylation pattern, including the degree of sialylation, fucosylation and sulfation, may differ among different S protein variants and among the sample triplicates (Figure 2A). Fourth, a simplified pie chart version grouping the glycopeptides into different oligomannose types (Man5-Man9HexNAc2) and complex/hybrid types containing 3–7 HexNAc (disregarding the number of Hex, Fuc and NeuAc) for 15 sites is given for comparison across the three furin site mutated S protein variants (Figure 2C). Unlike the bar charts that faithfully show the batch-to-batch variation, these pie charts use averaged data from the triplicates for each protein sample, which is more representative but comes at the expense of degenerated information on glycosylation details and variations. Finally, the most abundant N-glycan composition for each identified site of the Alpha variant is presented in cartoon drawings and localized onto an S-protein structure rendered from the cryo-EM data to indicate how these are spatially distributed (Figure 2B).

Fig. 2.

Overall site-specific glycosylation pattern of the recombinant trimeric spike proteins derived from the original SARS-CoV-2 strain and its mutated D614G and Alpha variants. (A) Full glycosylation heterogeneity for the few NTD sites as revealed by triplicate analyses of the four samples including the S-2P sample retaining the RRAR furin cleavage site. Their respective locations on the overall 3D structure of the spike protein are indicated. (B) Side view and top view of the trimeric S proteins based on our cryo-EM structure of S-D614G (PDB ID 7EAZ), highlighting a single protomer with another protomer (faded) in the background. The most abundant N-glycan identified for each N-glycosylation site of S-Alpha is depicted by cartoon drawing adopting the recommended symbol nomenclature for glycans (SNFG; Varki et al. 2015). A white square indicates a HexNAc, which can be either GalNAc if the structure carries a terminal LacdiNAc (GalNAcβ1-4GlcNAc-) unit or GlcNAc if it is a structure with terminal non-galactosylated GlcNAc and/or bisecting GlcNAc. (C) Data from triplicate analyses were averaged and degenerated into Man3-9GlcNAc2 and HexNAc4-7X (X can be any number of Hex, Fuc, NeuAc and sulfate) to be visualized as pie charts for 15 out of 17 sites with quantifiable N-glycopeptides. The color code used for both (A) and (C) is shown as strips of color chart. In (A), any structure with sulfate (SO3), NeuAc (Neu) or more than one Fuc is color coded as such, disregarding the number of HexNAc (N) and Hex (H). The glycan compositions are arranged by increasing number of HexNAc, from bottom to top in the bar charts. Full representation in heat maps can be found in Supplementary Figures S1 and S2. (This figure is available in black and white in print and in color at Glycobiology online).

As already discussed by others, there are many sources that would contribute to discrepancies among the reported glycosylation patterns, including the exact recombinant protein constructs, host cell culture conditions, protein purification and digestion methods, the LC–MS/MS data acquisition, processing and analysis parameters, as well as the criteria used to accept or filter the computational search results. To strike a fine balance between data processing throughput and accuracy, as well as objective consistency, N-glycopeptide identifications by Byonic for S-fm2P and S-2P were first scrutinized for the numbers of peptide spectrum matches (PSM) containing at least three peptide fragment ions at PEP2D < 0.001 coupled with different score cut-offs (Supplementary Figure S3), which led to adopting the additional filtering criterion of score > 200 to be applied across all datasets. The glycan compositions thus identified on each site were then manually checked for any inconsistency against the well-known glycomic repertoire of HEK293 (Narimatsu et al. 2019). During this process, we identified a common source of error in which Byonic misassigned the mass difference between HexNAc and Fuc to carbamidomethylation on Lys, owing to the absence of the correct glycan composition in the 132 N-glycan library used (Supplementary Figure S4). This was rectified by adding eight entries to the library (Supplementary Table SI) before all datasets were processed or reprocessed. We further confirmed, by the presence of diagnostic oxonium ions at m/z 407.166, that a terminal HexNAc2 unit attributable to LacdiNAc is a common feature, along with ions that respectively verify the occurrence of sulfated HexNAc and fucosylated HexNAc2 particularly on, but not restricted to, glycopeptides from site N74 (Supplementary Figure S5). 
This glycosylation characteristic of HEK293F is reflected by several identified glycan compositions with Fuc2 and even Fuc3, or sulfate, with and without additional sialylation, as highlighted in the color code used (Figure 2A, Supplementary Figures S1 and S2).
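
The misassignment noted above has a simple arithmetic basis, which the following minimal sketch works through using standard monoisotopic atomic masses. A HexNAc residue (C8H13NO5) and a Fuc residue (C6H10O4) differ by exactly C2H3NO, the elemental composition added by carbamidomethylation, so a glycan with one more HexNAc and one fewer Fuc than a library entry is isobaric with that entry plus one carbamidomethyl group. The same arithmetic reproduces the HexNAc2 oxonium ion at m/z 407.166. The variable and function names are illustrative only.

```python
# Monoisotopic atomic masses (Da) and the proton mass; standard values.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(ATOM[element] * count for element, count in formula.items())


# Residue (anhydro) compositions as they occur within a glycan.
HEXNAC = {"C": 8, "H": 13, "N": 1, "O": 5}
FUC = {"C": 6, "H": 10, "O": 4}
# Net addition on carbamidomethylation (iodoacetamide alkylation).
CARBAMIDOMETHYL = {"C": 2, "H": 3, "N": 1, "O": 1}

delta = mass(HEXNAC) - mass(FUC)          # swap one Fuc for one HexNAc
cam = mass(CARBAMIDOMETHYL)               # one carbamidomethyl group
oxonium_hexnac2 = 2 * mass(HEXNAC) + PROTON  # singly protonated HexNAc2

print(f"HexNAc - Fuc      = {delta:.5f} Da")
print(f"Carbamidomethyl   = {cam:.5f} Da")
print(f"HexNAc2 oxonium   = {oxonium_hexnac2:.3f} m/z")
```

Because the two mass shifts are identical in elemental composition, precursor mass accuracy alone cannot separate them; in the sketch's terms, only the presence of the correct composition in the search library (or fragment-level evidence) resolves the ambiguity.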

Consistent features among the variations

Due to substantial inherent variations in sample and data processing among different laboratories, detailed comparison of identified site-specific N-glycopeptides down to the level of individual glycoforms is not possible and would be largely meaningless. On the other hand, a few emerging characteristics homing in on the overall degree of glycan processing on specific sites, as inferred from the relative amount of oligomannose versus hybrid/complex type structures, appear to be very consistently reproduced by each reported analysis to date. Allen et al. (2021) have applied the same data processing method to analyze the available datasets from five different sources of prefusion state stabilized recombinant trimeric spike proteins (Amsterdam Medical Centre, Harvard Medical School, Switzerland, the Wellcome Centre for Human Genetics and University of Southampton/University of Texas at Austin), very similar to our wild type S-fm2P used here. They further calculated the average oligomannose/hybrid versus complex type glycan compositions carried on each site from these five samples. In addition, Wang et al. (2020) analyzed the same recombinant trimeric S protein (University of Texas at Austin) first examined by Watanabe et al. (2020). More recently, Brun et al. (2021) analyzed a different stabilized source of recombinant trimeric S protein (Icahn School of Medicine at Mount Sinai, New York). All these cited works similarly used Byonic/Byologic for quantitative mapping of intact trimeric S protein ectodomain produced in mammalian cells (HEK293, with the exception of the Swiss sample produced in CHO cells). Collectively, these provide a good reference set of data to be compared against our own sample produced at Academia Sinica (Figure 3). Another report by Zhao et al. (2020) used spectral count (PSM) for quantification, which makes direct comparison difficult, although the overall picture concluded is similar. Its source of the recombinant S protein (Harvard Medical School) was, however, already included in the recent work by Allen et al. (2021). Other reports on the S1/S2 protomer or subunits (Shajahan et al. 2020; Zhang et al. 2020; Brun et al. 2021), the RBD (Zhang et al. 2020; Antonopoulos et al. 2021; Bagdonaite et al. 2021; Gstottner et al. 2021) or intact S protein produced in insect cells (Bangaru et al. 2020; Zhang et al. 2020; Zhou, Tian, Qi, et al. 2021) are not considered here since these carried distinctly different glycosylation patterns.

Fig. 3.

The relative abundance of major classes of site-specific N-glycopeptides identified on the stabilized trimeric S-protein variants. The site-specific N-glycopeptides were classified according to their assigned glycan compositions, with Man5-9GlcNAc2 defined as oligomannose type, HexNAc3-X as hybrid type and HexNAc4-7-X as complex type, where X can be any number of Hex, Fuc, NeuAc and sulfate. The definition of HexNAc3-containing N-glycans as hybrid type does not take into consideration that some HexNAc4-5-containing N-glycans can also be hybrid instead of complex type, which cannot be readily resolved by current MS2 data on the glycopeptides. The numbers in each cell refer to the % total after summing the quantified peak areas for the respective classes of identified N-glycopeptides, using the average value from triplicate analysis (see Supplementary Table SIII). Fucose, NeuAc and sulfate refer to N-glycopeptides identified as carrying one or more of these substituents and calculated as % total over all quantified N-glycopeptides for that particular site. The color intensity reflects the relative abundance. Data for S-fm2P, S-D614G and S-Alpha were generated in this work. These are compared against the average value (Ref Avg) derived by Allen et al. (2021) from datasets of five different sources of similarly prefusion state stabilized recombinant trimeric spike proteins analyzed by others. For direct comparison, the values for Hybrid, Fhybrid, HexNAc3x and HexNAc3Fx in their chart were summed together as Hybrid in this figure and the % total values for oligomannose, hybrid and complex were recalculated accordingly. (This figure is available in black and white in print and in color at Glycobiology online).
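
The composition-based classification defined in this caption can be expressed as a short rule set. The sketch below is illustrative only: the composition string format (e.g. "HexNAc(4)Hex(5)Fuc(1)"), the function names and the example percentages are assumptions, and the requirement that oligomannose entries carry no Fuc, NeuAc or sulfate is our reading of "Man5-9GlcNAc2" rather than a stated rule.

```python
import re
from collections import defaultdict


def parse_composition(text):
    """Parse a string such as 'HexNAc(4)Hex(5)Fuc(1)' into {unit: count}."""
    return {name: int(n) for name, n in re.findall(r"([A-Za-z]+)\((\d+)\)", text)}


def classify(comp):
    """Assign a glycan class from the HexNAc and Hex counts alone."""
    hexnac = comp.get("HexNAc", 0)
    hexose = comp.get("Hex", 0)
    # Assumption: oligomannose carries nothing besides HexNAc and Hex.
    extras = sum(v for k, v in comp.items() if k not in ("HexNAc", "Hex"))
    if hexnac == 2 and 5 <= hexose <= 9 and extras == 0:
        return "oligomannose"
    if hexnac == 3:
        return "hybrid"
    if 4 <= hexnac <= 7:
        return "complex"
    return "other"


def class_percentages(percent_by_glycan):
    """Sum per-glycan % total values for one site into per-class % totals."""
    totals = defaultdict(float)
    for glycan, pct in percent_by_glycan.items():
        comp = parse_composition(glycan)
        totals[classify(comp)] += pct
        if comp.get("Fuc", 0) > 0:
            totals["fucosylated"] += pct  # reported alongside, not a class
    return dict(totals)


# Invented % total values for a single hypothetical site.
site_example = {
    "HexNAc(2)Hex(5)": 40.0,
    "HexNAc(3)Hex(5)Fuc(1)": 10.0,
    "HexNAc(4)Hex(5)Fuc(1)NeuAc(1)": 50.0,
}
summary = class_percentages(site_example)
print(summary)
```

As the caption itself cautions, counting HexNAc cannot distinguish hybrid from complex structures among HexNAc4-5 compositions, so any such rule set inherits that limitation.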

All complex types are not made equal

Focusing on the NTD, the most consistent glycosylation pattern is the one at N234, which is the only site that is always >80% unprocessed, in the Man8 and Man9 oligomannose state. Like the Harvard and Swiss samples, our own S-fm2P sample yielded a small % of complex type glycan at this site, whereas samples from other sources, as well as our D614G and Alpha variants, are approaching 100% oligomannose, which may reflect a slight difference in the spatial accessibility afforded by the different trimeric assembly. At the other extreme, glycans at N74 and N282 are consistently processed almost completely to complex type structures for all samples. In fact, these two NTD sites, which are situated away from direct interaction with the RBD from other protomers, carry the most varied glycosylation heterogeneity, which differs significantly between the two sites (Supplementary Figure S2). N74 is not only more highly sialylated but is also the site carrying the highest amount of sulfation and additional Fuc, both due to the prominence of LacdiNAc (Supplementary Figure S5). In contrast, a significant proportion of glycans at N282 does not contain Fuc, and most are not sialylated. The average values provided by Allen et al. are 98% fucosylated and 43% sialylated for N74 versus 80% and 15%, respectively, for N282. Our own S-fm2P sample yielded even more contrasting values (Figure 3). This feature is largely maintained by the D614G and Alpha variants, although there is a significant decrease in the degree of fucosylation at N282 going from the wild type (both our own and the reference average at 73–80%) to the variants (45–48%). Moreover, the exact distribution and relative amount of each distinct complex type glycan also varied slightly from batch to batch and more pronouncedly from one mutated variant to another. 
The immunobiological consequence is not possible to determine since the actual glycosylation structures on the infectious virion will be dependent on the infected host cell types. Nevertheless, it points towards the possibility of carrying different glycotopes that can be recognized differently by the glycan-binding proteins on myriad immune cells in infected individuals carrying natural variations in glycomic expression, which would in turn contribute to different immune response and pathological severity.

Different shades of glycan processing

Sites N61, N122 and N165 typify another category of N-glycosylation sites, which carry a mixture of oligomannose/hybrid and complex type structures in variable amounts. The average values provided by Allen et al. for oligomannose versus hybrid/complex type structures at these three sites for the five recombinant protein samples are 39%:60%, 16%:82% and 18%:81%, respectively (Allen et al. 2021). However, there are significant inter-sample variations, whereby oligomannose structures at site N61 can be as high as 70–80% (in the Wellcome Centre, Southampton/Texas and Mount Sinai samples) and the complex type structures at site N165 can approach 100% (in the Swiss and Southampton/Texas samples). Nonetheless, in most cases, including our analysis, the single most abundant structure at N165 is Man5GlcNAc2. This is also the structure used to model how the N165 glycan modulates the conformational dynamics of the RBD, along with the Man8 or Man9 oligomannose structure at N234 (Casalino et al. 2020). We observed only a slight increase in oligomannose structures in the overall similar glycosylation pattern of N165 upon D614G and further mutations in the Alpha variant, albeit not without some variations in the actual complex type structural heterogeneity. Similarly, the single most abundant structure at site N61 in most samples is Man5GlcNAc2. Relative to N165, the degree of fucosylation and sialylation at N61 is much lower, and there also appears to be a slight shift towards more Man5GlcNAc2 and even less sialylation in the Alpha variant.

The glycosylation pattern at N122 is more similar to N165 than N61 in terms of the extent of processing to complex type structure and degree of sialylation but is significantly less core fucosylated, according to the average value by Allen et al. and also reflected in our own analysis. Interestingly, although the single D614G mutation retains much of the same glycosylation pattern, a significant shift is noted in the Alpha variant, more so than at any other site. The overall shift (Figure 4C) is consistent with its gaining a more downstream processing status from Man5 to more highly fucosylated and sialylated complex type structures. On average, the heterogeneous glycans at N122 would thus be expected to occupy a bulkier spatial volume in the Alpha variant relative to D614G or the original non-mutated strain. It is perhaps no coincidence that N122 is located near a disordered loop (N3, residues 141–156; Chi et al. 2020) that harbors the deleted Y144 in the Alpha variant and is no longer apparent in our reconstructed cryo-EM map (Figure 4A; Yang, Yu, Chang, Liang, et al. 2021). The distinct orientation of the N-glycan stubs at N122 further indicates some subtle local conformational changes not apparent from comparing the overall end state structures of the S protein variants resolved by cryo-EM single particle reconstruction. Yet, the extent of site-specific glycan processing as the trimeric spike protein transits through the Golgi apparatus, including the antennary branching, core and additional peripheral fucosylation, sialylation and sulfation, is extremely sensitive to the slightest change in the local conformational dynamics brought about by the primary sequence mutations. Some sites exemplified by N122 would become more accessible to the Golgi-resident glycosyltransferases, whereas other sites such as N61 would become less processed. The effect is not a one-way loosening of the entire compact structure; rather, individual sites are subtly affected in either direction while the overall structural similarity is maintained. These findings exemplify the sensitivity of complementary MS analysis in probing conformational changes during glycan processing that are otherwise inaccessible to other structural biology tools, including cryo-EM.

Fig. 4.

N-glycosylation at site N122 and its location at an exposed region of the NTD of the trimeric S protein variants. (A) Orthogonal views of the low-pass filtered cryo-EM maps of S-D614G and S-Alpha with the visibly resolved stems of N-glycans (first two GlcNAc moieties extending from the sidechain of Asn) at specific regions highlighted in different colors. The five N-glycans within the NTD are colored in light yellow, with the exception of N122, which is colored in gold. The N-glycans at N331 and N343 within the RBD are colored in lime. The expanded views of the NTD highlight the differences in the EM maps between the D614G and Alpha variants due to Δ69–70 and Δ144 in the latter. (B) Orthogonal views of the superposition of the atomic models of the D614G and Alpha variants, which are shown in gray (PDB ID 7EAZ) and cyan (PDB ID 7EDF), respectively. The positions of the backbone Cα atoms of the N-glycosylated asparagine residues are shown as green or gray spheres with their residue identities indicated. (C) The glycosylation bar charts for N122 on the S proteins of the D614G and Alpha variants, averaging data from triplicate analysis. The most likely N-glycan structures corresponding to a select few of the identified glycosyl compositions are annotated as cartoon drawings using the SNFG standard. These structures were not further verified experimentally. (This figure is available in black and white in print and in color at Glycobiology online).

All shifts are towards less processed states except N122

The N-glycans at N331 and N343 located in the RBD have consistently been reported as highly processed, implying that these sites are relatively exposed and accessible. In our hands, N343 does have a few oligomannose structures (~15%) but, importantly, both sites carry mostly complex type structures and are not appreciably affected by mutations in the D614G and Alpha variants. Outside the RBD, site N657 carries mostly complex type structures, whereas both N603 and N616 have a mixture of oligomannose and complex type structures with the former consistently less processed. Notably, the value for N603 (and several other sites) as determined from analyses of different S-protein sources varies considerably (Allen et al. 2021) and is not well represented by the reported average value (Figure 3), which was skewed by an atypically high amount of complex type structures at this site in the Swiss and Harvard samples (Allen et al. 2021). Getting into S2, the first few sites, from N717 and N801 through N1074, showed a decreasing proportion of oligomannose structures. This trend is equally observed in other reported samples (Allen et al. 2021), and upheld in our mutated variants. The remaining N-glycosylation sites towards the C-terminus of S2 are all dominated by complex type structures with few or no oligomannose structures. More importantly, we consistently detected substantial increases in the relative amount of oligomannose structures at N603, N616 and N1074 in the D614G and Alpha variants. The overall picture gleaned from these analyses suggests that S-D614G and S-Alpha are more similar to each other in their N-glycosylation pattern at a majority of sites and, relative to S-fm2P, generally show a shift towards less processed states for sites that originally carry a variable mixture of oligomannose and complex type structures. The changes may be related to the increased propensity of the RBD to be in an open, upward conformation in S-D614G (Yurkovetskiy et al. 2020; Benton et al. 2021; Yang, Yu, Chang, Kuo, et al. 2021) and S-Alpha (Cai et al. 2021; Gobeil et al. 2021; Yang, Yu, Chang, Liang, et al. 2021). Among the sites analyzed, only glycosylation at N122, which is relatively distant from the RBD, bucks this trend and becomes more processed into complex types in S-Alpha. On the other hand, N-glycosylation at the very N-terminal sites N17, N61, N74 and N122 is less affected by the D614G mutation alone but registers more pronounced shifts in the S-Alpha variant, which carries multiple mutations, including the two deletions in the NTD.

Impact of furin site and other mutations on O-glycosylation

It is now known that retaining the RRAR furin cleavage site adversely affects the yield of the trimeric recombinant S protein produced in HEK293 cells due to potential cleavage into S1 and S2 in the Golgi, although the two subunits may still be held together (Brun et al. 2021; Watanabe et al. 2021). All cryo-EM and glycosylation analyses of stabilized trimeric S proteins to date were thus performed on S-fm2P derivatives with the site mutated. This is despite a low endogenous furin level in HEK293 cells. In fact, to ensure efficient furin-mediated proteolytic processing of the HIV envelope protein gp160 into gp120 and gp41, it is necessary to co-transfect a furin-expression plasmid together with the HIV envelope gp160 expression plasmid (Binley et al. 2002; Chung et al. 2014; Rutten et al. 2018). Indeed, without boosting the cellular level of furin in our HEK293F cells, we found that our S-2P protein could be recovered and purified in a manner similar to other GSAS-stabilized S-fm2P proteins but at a much lower yield, and it gave bands corresponding to both uncleaved S1/S2 and cleaved S1 and S2 subunits by SDS-PAGE (Supplementary Figure S6). The amount of furin-processed S-2P was about 50% according to visual inspection. Despite using the same total protein amount for digestion and LC–MS/MS analysis, the signal intensity was significantly lower than that of the other three recombinant spike proteins with the furin cleavage site mutated. This may have unduly skewed the pattern towards slightly more oligomannose type profiles since glycopeptides with oligomannose structures are more readily identified by the analytical process. The data were therefore not used for direct comparison with those of other S protein variants. 
Nonetheless, the overall site-specific glycosylation characteristics of S-2P largely recapitulate the picture drawn from analyzing the GSAS-mutated S-fm2P, S-D614G and S-Alpha, namely N-glycans at N234 being almost all retained as oligomannose, N74 and N282 being highly processed into complex types, and N61, N122 and N165 falling somewhere in between (Figure 2A). This is consistent with the current understanding that furin-induced cleavage, if it occurs, is a rather late Golgi event (Molloy et al. 1999; Brun et al. 2021) that takes place after all glycan addition and further processing into complex type structures have largely been completed. It also indicates that having RRAR instead of GSAS did not significantly affect the overall 3D structure of the S protein.

Interestingly, however, we noted that S-2P actually yielded significantly more mono- and disialylated core 1 and 2 O-glycans at T323 relative to the non-sialylated core 1 and 2 and single GalNAc structures (Figure 5). This is relevant since sialylated glycopeptides are usually more difficult to identify than non-sialylated ones due to less favorable ionization and fragmentation properties. The fact that a higher proportion of sialylated glycopeptides was identified in a sample that yielded overall lower signal intensity adds confidence that the increase in sialylation at this O-glycosylation site is genuine. It reflects a relatively late Golgi event that coincides with potential S1/S2 cleavage. Among other O-glycosylation sites, we only detected a low level of O-glycosylation at T678, located within the loop that harbors the furin cleavage site, for the GSAS-stabilized S-fm2P but not for S-D614G or S-Alpha (Supplementary Table SIV). Incidentally, both variants afforded higher levels of sialylation on their O-glycans at T323 relative to S-fm2P, and were more similar to S-2P in this respect (Figure 5B). Collectively, these findings indicate that the spatial confinement or accessibility of T323 within the RBD is sensitive to local conformational changes and those occurring elsewhere, including the unstructured S1/S2 loop that may or may not be cleaved before the trimeric S protein exits the Golgi.

Fig. 5

Quantitative analysis of O-glycosylation at T323. (A) Extracted ion chromatograms of the peptide 320VQPTESIVR carrying the common cores 1 and 2 O-glycans for S-fm2P (upper panel) and S-2P (lower panel). Each of the putative peaks was manually verified and those not corresponding to the expected glycopeptides are marked with “X”. The overall occupancy is very low and the inset shows the amount of the most abundant glycoforms with single GalNAc relative to the non-glycosylated peptide, after 10× magnification. (B) The quantified distribution of various glycoforms for this site for all 12 samples (3 replicates each for the 4 recombinant spike proteins) analyzed, with the color code used for different O-glycan structures shown in the box above. Full data for this and other identified O-glycosylation sites are presented in Supplementary Table SIV. (C, D) HCD-MS2 and EThcD-MS2 data for the disialylated HexNAc2Hex2-carrying O-glycopeptide from S-2P. The c3 and c4 ions afforded by EThcD-MS2 unambiguously localize the disialylated O-glycan to T323. (This figure is available in black and white in print and in color at Glycobiology online).

Concluding perspectives

Despite many reports on the site-specific glycosylation pattern of the SARS-CoV-2 spike proteins in various states, including S1/S2 protomer, subunit and stabilized trimer produced either in insect or HEK293 cells, a detailed comparison is difficult. The many variations in sample preparation, analytical data acquisition and processing almost assure a significant level of differences in view of the heterogeneous nature of protein glycosylation. Nonetheless, if this heterogeneity is reduced simply to oligomannose versus hybrid/complex type as a measure of the extent of glycan processing and hence spatial accessibility, a more unifying picture informative of the glycosylation site-specific local conformation of a properly assembled trimeric S protein can be better described.
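The reduction of a site's heterogeneous profile to oligomannose versus hybrid/complex type can be illustrated with a minimal Python sketch. The composition rule used here (HexNAc2 with five or more Hex and no other residues counted as oligomannose, three or more HexNAc counted as hybrid/complex) is a common composition-based heuristic and an assumption of this sketch, not a criterion stated in this work; the function names and the peak areas in the usage note are hypothetical.

```python
import re


def parse_composition(composition):
    """Turn 'HexNAc4Hex5Fuc1NeuAc2' into {'HexNAc': 4, 'Hex': 5, 'Fuc': 1, 'NeuAc': 2}."""
    return {name: int(count)
            for name, count in re.findall(r"([A-Za-z]+)(\d+)", composition)}


def processing_class(composition):
    """Assign a glycosyl composition to a coarse processing class.

    Assumed heuristic (not taken from the paper):
      HexNAc2 + Hex>=5 and nothing else  -> 'oligomannose'
      HexNAc>=3                          -> 'hybrid/complex'
      anything else                      -> 'other' (e.g. truncated cores)
    Compositions cannot distinguish isomers, so this is only a proxy.
    """
    counts = parse_composition(composition)
    hexnac = counts.get("HexNAc", 0)
    hexose = counts.get("Hex", 0)
    extras = sum(n for name, n in counts.items() if name not in ("HexNAc", "Hex"))
    if hexnac == 2 and hexose >= 5 and extras == 0:
        return "oligomannose"
    if hexnac >= 3:
        return "hybrid/complex"
    return "other"


def oligomannose_fraction(peak_areas):
    """Fraction of the summed peak area at one site carried by oligomannose glycans."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    oligo = sum(area for comp, area in peak_areas.items()
                if processing_class(comp) == "oligomannose")
    return oligo / total
```

For example, calling `oligomannose_fraction({"HexNAc2Hex5": 30.0, "HexNAc2Hex9": 10.0, "HexNAc4Hex5Fuc1NeuAc2": 60.0})` on these made-up areas treats the first two compositions as oligomannose and the third as hybrid/complex.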

In that respect, our glycosylation analysis of the S proteins is largely consistent with the currently accepted model (Allen et al. 2021). More importantly, this is the first report that attempts to identify any departure from this archetypal view as mutated SARS-CoV-2 variants of concern evolve. It is made possible by having tight quality control over the in-house produced S protein sample source in conjunction with applying the same analytical workflow, data analysis parameters and objective identification criteria for side-by-side comparison. Based on the overall very similar protein structures determined by companion cryo-EM analysis, large differences in glycosylation pattern are not expected. We nevertheless identified several distinct trends, including a more processed state at N122 in the S-Alpha variant when most other sites shifted instead towards a less processed status. Among the sites that are mostly decorated with complex type structures, such as N74 and N282, the exact N-glycans carried and their overall degree of sialylation and fucosylation vary not only between sites but also between the original and mutated variants for the same site. This extends further to O-glycosylation profiles at T323 and T678, although their very low level of occupancy may mean that they have much less impact.

The few critical mutations in the RBD of S-Alpha appear not to have impacted its glycosylation pattern significantly more than the single D614G mutation does. This may be related to the positive selection pressure to maintain or enhance productive infectivity via engaging the host receptor, hACE2. We observed that N-glycans at N331 and N343 at the RBD remain highly processed while N234 and N165 also retain their Man9 and Man5 structures, respectively, consistent with glycosylation of these few sites playing active structural roles in modulating the conformational dynamics of the spike RBD (Casalino et al. 2020; Sztain et al. 2021). In contrast, both the unique ΔY144 and Δ69–70 deletions in the NTD of the Alpha variant have been associated with immune escape from potent neutralizing antibodies (Collier et al. 2021; McCallum et al. 2021; Wang et al. 2021) against the single immunogenic supersite formed primarily by the disordered N1–N5 loops and bordered by glycans at N17, N74, N122 and N149 (Chi et al. 2020; Cerutti et al. 2021). Glycans at N74 and N149 could not be visualized in our cryo-EM density map, consistent with their location on fairly flexible loops, where they are unhindered from access and carry fully processed complex type structures. N122, on the other hand, is located near the N3 loop (residues 141–156) that is visibly affected by ΔY144 in the Alpha variant (Figure 4A). This may have significantly affected the accessibility of N122, and hence the shift in the glycosylation status observed. The more exposed complex type N-glycans at N122, along with those at N74, N149 and N282, would collectively present an assorted range of host cell-dependent glyco-epitopes that contribute to engaging the host glycan-binding auxiliary receptors (Soh et al. 2020; Lempp et al. 2021; Reis et al. 2021), thereby facilitating infectivity.

Although a more definitive functional consequence cannot be readily delineated here, our findings attest to the sensitivity of glycosylation to point and/or deletion mutations in the primary sequence that may or may not lead to obvious conformational changes. More specifically, MS-based site-specific glycosylation mapping can detect these subtle alterations otherwise not informed by common protein structural analysis, at higher throughput and sensitivity once a standardized workflow is established.

Material and methods

Plasmid construction and purification of SARS-CoV-2 spike

The codon-optimized nucleotide sequence of full-length SARS-CoV-2 S protein and that of the Alpha variant were kindly provided by Dr. Che Alex Ma (Genomics Research Center, Academia Sinica) and Dr. Mi-Hua Tao (Institute of Biomedical Sciences, Academia Sinica), respectively. The DNA sequence corresponding to residues 1–1208 of the S protein was subcloned from the full-length S sequence, appended with the T4 fibritin foldon sequence at the C-terminus, followed by a c-Myc sequence and a hexahistidine (His6) tag and inserted into the mammalian expression vector pcDNA3.4-TOPO (Thermo Fisher Scientific, Carlsbad, CA). The D614G mutant was generated by site-directed mutagenesis with specific primers as described in our previous study (Yang, Yu, Chang, Liang, et al. 2021). Two variants were subsequently generated by introducing a tandem proline stabilization mutation (2P, 986KV987 → 986PP987), with and without furin cleavage site mutation (fm, 682RRAR685 → 682GSAG685), designated as S-fm2P and S-2P, respectively. The same construct design of S-fm2P was used for the D614G and Alpha variants (hereafter designated as S-D614G and S-Alpha, respectively).

The expression vectors encoding all S proteins were transiently transfected into HEK293 Freestyle (HEK293F) cells with polyethylenimine (PEI, linear, 25 kDa, Polysciences, Warrington, PA) at a DNA:PEI ratio of 1:2. The transfected cells were incubated at 37°C with 8% CO2 for six days. The cells were pelleted by centrifugation at 4000 rpm for 30 min, and the supernatant containing the overexpressed S proteins was collected and filtered through a 0.22 μm cutoff membrane (Sartorius, Germany). The supernatant was incubated with HisPur Cobalt Resin (Thermo Fisher Scientific, Rockford, IL) in 50 mM Tris–HCl (pH 8.0), 300 mM NaCl, 5 mM imidazole and 0.02% NaN3 at 4°C overnight. The resin was sedimented by gravity in an open column (Glass Econo-Column® Chromatography Column, Bio-Rad, Hercules, CA) and washed with 50 mM Tris–HCl (pH 8.0), 300 mM NaCl, 10 mM imidazole, and the target protein was eluted with 50 mM Tris–HCl (pH 8.0), 150 mM NaCl, 150 mM imidazole. The eluent was concentrated and further purified using a size-exclusion chromatography (SEC) column (Superose 6 Increase 10/300 GL, GE Healthcare, Pittsburgh, PA) in 50 mM Tris–HCl (pH 8.0), 150 mM NaCl, 0.02% NaN3. The purity of the samples was confirmed by SDS-PAGE. The protein concentrations were determined from the UV absorbance at 280 nm using a UV–Vis spectrometer (Nano-photometer N60, IMPLEN, Germany).

SEC-MALS

The integrity of the trimeric assembly of S-fm2P and its degree of glycosylation were confirmed by SEC coupled with multiangle light scattering (SEC-MALS) as described previously (Yang et al. 2020). The purified S protein was separated by an SEC column (BioSEC-3, Agilent, Santa Clara, CA) connected to an HPLC system (Analytical HPLC 1260 LC system, Agilent), and the corresponding static light scattering signals were detected by a Wyatt Dawn Heleos II multiangle light scattering detector (Wyatt Technology, Santa Barbara, CA). To dissect the contributions of amino acids and glycans to the overall molecular weight of the SEC elution peak, the refractive index increments (dn/dc) of protein and protein conjugate (glycan moiety) were defined as 0.185 and 0.140 mL/g, respectively. The buffer viscosity (η) was set to 0.8945 cP at 25°C based on the theoretical estimate using SEDNTERP. These parameters were applied in ASTRA 6.0 software (Wyatt Technology).
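As a rough illustration of why two separate dn/dc values are needed, the Python sketch below applies the standard weight-averaging relation for the refractive index increment of a two-component conjugate. It is only a sketch of that relation under the dn/dc values quoted above; the conjugate analysis itself was carried out in ASTRA, whose internal procedure is not reproduced here, and the function names and the numbers in the usage note are hypothetical.

```python
DNDC_PROTEIN = 0.185  # mL/g, value stated in the text for the polypeptide
DNDC_GLYCAN = 0.140   # mL/g, value stated in the text for the glycan moiety


def conjugate_dndc(protein_fraction):
    """Weight-averaged dn/dc of a glycoprotein with the given protein mass fraction."""
    if not 0.0 <= protein_fraction <= 1.0:
        raise ValueError("protein_fraction must lie between 0 and 1")
    return protein_fraction * DNDC_PROTEIN + (1.0 - protein_fraction) * DNDC_GLYCAN


def split_molar_mass(total_mass_kda, protein_fraction):
    """Split a total molar mass into (protein, glycan) contributions in kDa."""
    if not 0.0 <= protein_fraction <= 1.0:
        raise ValueError("protein_fraction must lie between 0 and 1")
    protein = total_mass_kda * protein_fraction
    return protein, total_mass_kda - protein
```

With a placeholder protein mass fraction of 0.8 (not a measured value), `conjugate_dndc(0.8)` gives the effective increment of the conjugate, and `split_molar_mass(500.0, 0.8)` divides a placeholder 500 kDa peak into its protein and glycan parts.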

In-solution proteolytic digestion of the S proteins

Proteins were reduced with 10 mM dithiothreitol at 37°C for 1 h, then alkylated with 50 mM iodoacetamide in 25 mM ammonium bicarbonate and 4 M urea for 1 h in the dark at room temperature. Dithiothreitol was then added to a final concentration of 50 mM, and the sample was buffer-exchanged into 25 mM ammonium bicarbonate buffer using Amicon Ultra-0.5, 10 kDa (Merck Millipore, Ireland) and treated overnight with sequencing grade trypsin and chymotrypsin (Promega, Madison, WI) simultaneously at an enzyme-to-substrate ratio of 1:30 at 37°C. The digested products were then diluted with formic acid to a final concentration of 0.1% for further experiments.

Glycopeptides analysis by liquid chromatography-mass spectrometry

The digested peptides were cleaned up using ZipTip C18 (Merck Millipore), dried down and redissolved in 0.1% formic acid (Solvent A). Data were acquired on an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific, San Jose, CA) fitted with an Easy-nLC 1200 system (Thermo Fisher Scientific). For each LC–MS/MS analysis, an equivalent of 1 μg glycoprotein digest was loaded onto an Acclaim PepMap RSLC C18 column (Thermo Fisher Scientific, Lithuania) and separated at a flow rate of 300 nL/min using a gradient of 5% to 40% solvent B (80% acetonitrile with 0.1% formic acid) in 200 min. The mass spectrometer was operated in the Top speed (3 s) data-dependent acquisition mode. Briefly, survey full scan MS spectra were acquired in the Orbitrap from 400 to 1800 m/z at a mass resolution of 120,000. The highest charge state ions within charge state 2–6 were sequentially isolated for MS2 analysis using the following settings: HCD MS2 with AGC target at 5 × 10⁴, isolation window 2, Orbitrap resolution 30,000, stepped collision energy (%): 25, 28 and 32. HCD-pd-EThcD was triggered by detection of at least one of the product ions at m/z 138.0545, 204.0867, 274.0926 or 366.1396 within the top 20 ions. For EThcD, the calibrated charge-dependent parameter was used, the supplemental activation collision energy was 15% and the Orbitrap resolution was 60,000. For each of the SARS-CoV-2 S protein variants, three biological replicates were prepared and analyzed by LC–MS.
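The product-dependent trigger described above can be sketched in a few lines of Python: an HCD spectrum qualifies for a follow-up EThcD scan if any of the four diagnostic ions is among its 20 most intense peaks. This is a simplified illustration, not the instrument's acquisition logic; the function name and the 15 ppm matching tolerance are assumptions of this sketch, since the text does not state the tolerance used.

```python
# Diagnostic product ions (m/z) listed in the acquisition settings.
DIAGNOSTIC_MZ = (138.0545, 204.0867, 274.0926, 366.1396)


def triggers_ethcd(peaks, tol_ppm=15.0, top_n=20):
    """Return True if a diagnostic ion is among the top_n most intense peaks.

    peaks   -- iterable of (mz, intensity) pairs from one HCD MS2 spectrum
    tol_ppm -- matching tolerance in ppm (hypothetical value, not from the text)
    top_n   -- number of most intense peaks inspected (20 in the text)
    """
    most_intense = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:top_n]
    return any(
        abs(mz - ref) / ref * 1e6 <= tol_ppm
        for mz, _intensity in most_intense
        for ref in DIAGNOSTIC_MZ
    )
```

A spectrum whose HexNAc oxonium ion at m/z 204.0867 is intense would return `True`, whereas the same ion falling outside the 20 most intense peaks would not trigger the follow-up scan.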

Glycopeptide identification and quantification

The MS and MS2 data were first evaluated by Preview (Protein Metrics Inc.) to determine the optimum search parameters for Byonic (Protein Metrics Inc., Cupertino, CA). The HCD and EThcD MS2 data were then processed by Byonic (v 3.10.10 for S-2P/fm2P/D614G O-glycopeptide identification, and v 3.11.3 for S-Alpha O-glycopeptide and all N-glycopeptide identification) using the following general parameters: search against the SARS-CoV-2 spike protein sequence with semi-specific cleavages at F, Y, W, L, K, R residues, allowing up to two missed cleavages, with the precursor ion mass tolerance set at 4 ppm and the fragment ion mass tolerance at 10 ppm. The fixed modification was cysteine carbamidomethylation (+57.0215 Da, at C), whereas the variable common modifications considered were carbamidomethylation (+57.0215 Da, at H, K), dithiothreitol (+151.9966 Da, at C), oxidation (+15.9949 Da, at M), deamidation (+0.9840 Da, at N) and carbamylation (+43.0058 Da, at N-terminus, K, R); the variable rare modifications considered were Gln to pyro-Glu (−17.0265 Da, at N-terminal Q) and Glu to pyro-Glu (−18.0106 Da, at N-terminal E). N-glycopeptide and O-glycopeptide identifications were processed by Byonic separately. For N-glycopeptide identification, the built-in N-glycan library of “132 human” was used as a base but modified by (i) removing N-glycans smaller than the trimannosyl core structure (HexNAc2Hex3) and (ii) adding in 8 N-glycans not listed in the library and selected N-glycan compositions carrying an extra sulfate (see Supplementary Table SI for the exact N-glycan library used in this work). For O-glycopeptide identification, the built-in O-glycan library of “O-glycan 9 common” was used. The criteria used in additional manual filtering of positive matches were PEP2D < 0.001, score > 200, and retaining only those identified by such criteria in at least 3 out of the 12 datasets analyzed. 
The filtered Byonic identification results were output into an Excel file (Supplementary Table SII) along with curated nonredundant lists of identified glycan compositions and peptide backbones to be used as one of the filtering criteria for subsequent quantification output.
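The filtering criteria stated above (PEP2D < 0.001, score > 200, and identification in at least 3 of the 12 datasets) can be expressed as a short Python sketch. It assumes each match is a dictionary with hypothetical keys `peptide`, `glycan`, `dataset`, `pep2d` and `score`, and that a glycopeptide is identified by its peptide backbone plus glycan composition; neither the record layout nor the grouping key is specified in the text, and the original filtering was done manually on the Byonic output.

```python
from collections import defaultdict


def filter_identifications(matches, max_pep2d=0.001, min_score=200.0, min_datasets=3):
    """Keep matches that pass the score cut-offs and recur in enough datasets.

    matches -- list of dicts with keys 'peptide', 'glycan', 'dataset',
               'pep2d' and 'score' (hypothetical record layout)
    """
    # Step 1: per-match thresholds.
    passing = [m for m in matches
               if m["pep2d"] < max_pep2d and m["score"] > min_score]

    # Step 2: count the distinct datasets in which each glycopeptide passed.
    datasets_seen = defaultdict(set)
    for m in passing:
        datasets_seen[(m["peptide"], m["glycan"])].add(m["dataset"])

    # Step 3: retain only glycopeptides seen in at least min_datasets datasets.
    recurrent = {key for key, seen in datasets_seen.items()
                 if len(seen) >= min_datasets}
    return [m for m in passing if (m["peptide"], m["glycan"]) in recurrent]
```

Counting distinct datasets rather than matches means that several spectra of the same glycopeptide within one LC–MS/MS run contribute only once to the recurrence requirement.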

The unfiltered Byonic search results were fed to the Byologic module of the Byos suite (v3.11, Protein Metrics Inc.) for quantification based on peak areas of extracted ion chromatograms at 5 ppm accuracy. In cases where the same glycan composition at a particular N-glycosylation site occurs on more than one unique peptide backbone due to different peptide modifications and/or missed cleavages, the peak areas for those glycopeptides starting with the same N-terminal amino acids were summed and treated as one quantified entry for that site. Only those quantifiable peaks identified at more relaxed criteria (PEP2D < 0.01) but carrying either the glycan and/or peptide backbone identified initially by PEP2D < 0.001 and score > 200 were exported into an Excel file (Supplementary Table SIII), from which the bar charts and heat maps were generated. For these final outputs, peak areas of all glycopeptides carrying the same glycan composition on a particular site but different peptide backbones were summed as one unique quantified entry. To generate the pie chart for each site, values from three biological replicates of the same CoV-2 variant were averaged. The O-glycopeptide quantifications (reported in Supplementary Table SIV) were manually performed using the peak areas of the extracted ion chromatograms within 5 ppm accuracy by Xcalibur (Thermo Fisher Scientific), with spurious or non-verifiable peaks excluded.
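The aggregation steps in this paragraph (summing peak areas over peptide backbones that carry the same glycan composition at a site, then averaging over replicates) can be sketched as follows. Expressing each glycan as a percentage of the site total within a replicate before averaging is an assumption of this sketch, as the text does not say at which stage the values were normalized; the function name and the record keys `replicate`, `site`, `glycan` and `area` are likewise hypothetical.

```python
from collections import defaultdict


def site_glycan_profile(rows):
    """Average relative abundance (%) of each glycan composition per site.

    rows -- list of dicts with keys 'replicate', 'site', 'glycan', 'area';
            several rows may share (replicate, site, glycan) when the same
            glycan sits on different peptide backbones.
    Returns {(site, glycan): mean percentage across all replicates}.
    """
    summed = defaultdict(float)      # (replicate, site, glycan) -> summed area
    site_total = defaultdict(float)  # (replicate, site) -> total area
    replicates = set()
    for row in rows:
        rep, site, glycan = row["replicate"], row["site"], row["glycan"]
        summed[(rep, site, glycan)] += row["area"]  # merge peptide backbones
        site_total[(rep, site)] += row["area"]
        replicates.add(rep)

    percent_sum = defaultdict(float)
    for (rep, site, glycan), area in summed.items():
        percent_sum[(site, glycan)] += 100.0 * area / site_total[(rep, site)]

    # A glycan not detected in a replicate contributes 0% to the mean.
    return {key: total / len(replicates) for key, total in percent_sum.items()}
```

Dividing by the total number of replicates, rather than by the number of replicates in which a glycan was detected, keeps the averaged percentages for a site summing to 100 when the site is quantified in every replicate.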

Supplementary Material

Supplemental_Data_0915_cwab102
Table_SI_NGlycopeptide_Library_for_Byonic_Search_cwab102
Table_SII_NGlycopeptide_Byonic_ID_Result_cwab102
Table_SIII_Quantified_NGlycopeptides_cwab102
Table_SIV_OGlycopeptide_Byonic_ID_Result_cwab102

Acknowledgements

We thank the mammalian cell culture facility and the biophysics facility of Institute of Biological Chemistry, Academia Sinica, for supporting the protein production and characterizations, respectively. The technical support team of Protein Metrics is also gratefully acknowledged for helpful discussion in using the Byos software suite during the early stage of this work.

Contributor Information

Chu-Wei Kuo, Institute of Biological Chemistry, Academia Sinica, 128 Academia Road Sec 2, Nankang, Taipei 11529, Taiwan.

Tzu-Jing Yang, Institute of Biochemical Sciences, National Taiwan University, 1 Roosevelt Road Sec 4, Daan, Taipei 10617, Taiwan.

Yu-Chun Chien, Institute of Biochemical Sciences, National Taiwan University, 1 Roosevelt Road Sec 4, Daan, Taipei 10617, Taiwan.

Pei-Yu Yu, Institute of Biological Chemistry, Academia Sinica, 128 Academia Road Sec 2, Nankang, Taipei 11529, Taiwan.

Shang-Te Danny Hsu, Institute of Biological Chemistry, Academia Sinica, 128 Academia Road Sec 2, Nankang, Taipei 11529, Taiwan; Institute of Biochemical Sciences, National Taiwan University, 1 Roosevelt Road Sec 4, Daan, Taipei 10617, Taiwan.

Kay-Hooi Khoo, Institute of Biological Chemistry, Academia Sinica, 128 Academia Road Sec 2, Nankang, Taipei 11529, Taiwan; Institute of Biochemical Sciences, National Taiwan University, 1 Roosevelt Road Sec 4, Daan, Taipei 10617, Taiwan.

Data availability statement

The mass spectrometry data underlying this article are available at the Center for Computational Mass Spectrometry, MassIVE and can be accessed via the dataset identifier MSV000087696.

Funding

This work was supported by intramural funding of Academia Sinica (to S.T.D.H. and K.H.K.); an Academia Sinica Career Development Award (AS-CDA-109-L08 to S.T.D.H.); and the Ministry of Science and Technology (MOST 109-3114-Y-001-001 to S.T.D.H.). We thank the Academia Sinica Common Mass Spectrometry Facilities for Proteomics and Protein Modification Analysis (AS-CFII-108-107), and the Academia Sinica Cryo-EM Facility (AS-CFII-108-110) for data collection, both of which are funded by the Academia Sinica Core Facility and Innovative Instrument Project grants.

Conflict of interest statement

All authors declare that there is no conflict of interest.

References

  1. Allen  JD, Chawla  H, Samsudin  F, Zuzic  L, Shivgan  AT, Watanabe  Y, He  WT, Callaghan  S, Song  G, Yong  P, et al.  2021. Site-specific steric control of SARS-CoV-2 spike glycosylation. Biochemistry. 60:2153–2169. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Amanat  F, Stadlbauer  D, Strohmeier  S, Nguyen  THO, Chromikova  V, McMahon  M, Jiang  K, Arunkumar  GA, Jurczyszak  D, Polanco  J, et al.  2020. A serological assay to detect SARS-CoV-2 seroconversion in humans. Nat Med. 26:1033–1036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Antonopoulos  A, Broome  S, Sharov  V, Ziegenfuss  C, Easton  RL, Panico  M, Dell  A, Morris  HR, Haslam  SM. 2021. Site-specific characterization of SARS-CoV-2 spike glycoprotein receptor-binding domain. Glycobiology. 31:181–187. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bagdonaite  I, Thompson  AJ, Wang  X, Sogaard  M, Fougeroux  C, Frank  M, Diedrich  JK, Yates  JR  3rd, Salanti  A, Vakhrushev  SY, et al.  2021. Site-specific O-glycosylation analysis of SARS-CoV-2 spike protein produced in insect and human cells. Viruses. 13:551. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bangaru  S, Ozorowski  G, Turner  HL, Antanasijevic  A, Huang  D, Wang  X, Torres  JL, Diedrich  JK, Tian  JH, Portnoff  AD, et al.  2020. Structural analysis of full-length SARS-CoV-2 spike protein from an advanced vaccine candidate. Science. 370:1089–1094. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Barcena  M, Barnes  CO, Beck  M, Bjorkman  PJ, Canard  B, Gao  GF, Gao  Y, Hilgenfeld  R, Hummer  G, Patwardhan  A, et al.  2021. Structural biology in the fight against COVID-19. Nat Struct Mol Biol. 28:2–7. [DOI] [PubMed] [Google Scholar]
  7. Benton  DJ, Wrobel  AG, Roustan  C, Borg  A, Xu  P, Martin  SR, Rosenthal  PB, Skehel  JJ, Gamblin  SJ. 2021. The effect of the D614G substitution on the structure of the spike glycoprotein of SARS-CoV-2. Proc Natl Acad Sci U S A. 118:e2022586118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Binley  JM, Sanders  RW, Master  A, Cayanan  CS, Wiley  CL, Schiffner  L, Travis  B, Kuhmann  S, Burton  DR, Hu  SL, et al.  2002. Enhancing the proteolytic maturation of human immunodeficiency virus type 1 envelope glycoproteins. J Virol. 76:2606–2616. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Brun  J, Vasiljevic  S, Gangadharan  B, Hensen  M, A  VC, Hill  ML, Kiappes  JL, Dwek  RA, Alonzi  DS, Struwe  WB, et al.  2021. Assessing antigen structural integrity through glycosylation analysis of the SARS-CoV-2 viral spike. ACS Cent Sci. 7:586–593. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Cai  Y, Zhang  J, Xiao  T, Lavine  CL, Rawson  S, Peng  H, Zhu  H, Anand  K, Tong  P, Gautam  A, et al.  2021. Structural basis for enhanced infectivity and immune evasion of SARS-CoV-2 variants. Science. 373:642–648. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Casalino  L, Gaieb  Z, Goldsmith  JA, Hjorth  CK, Dommer  AC, Harbison  AM, Fogarty  CA, Barros  EP, Taylor  BC, McLellan  JS, et al.  2020. Beyond shielding: The roles of glycans in the SARS-CoV-2 spike protein. ACS Cent Sci. 6:1722–1734. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. CDC . SARS-CoV-2 Variant Classifications and Definitions. Centers for Disease Control and Prevention (CDC) of the USA Department of Health & Human Services. https://www.cdc.gov/coronavirus/2019-ncov/variants/variant-info.html
  13. Cerutti  G, Guo  Y, Zhou  T, Gorman  J, Lee  M, Rapp  M, Reddem  ER, Yu  J, Bahna  F, Bimela  J, et al.  2021. Potent SARS-CoV-2 neutralizing antibodies directed against spike N-terminal domain target a single supersite. Cell Host Microbe. 29:819, e817–833. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Chi  X, Yan  R, Zhang  J, Zhang  G, Zhang  Y, Hao  M, Zhang  Z, Fan  P, Dong  Y, Yang  Y, et al.  2020. A neutralizing human antibody binds to the N-terminal domain of the spike protein of SARS-CoV-2. Science. 369:650–655. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chung  NP, Matthews  K, Kim  HJ, Ketas  TJ, Golabek  M, de  Los  RK, Korzun  J, Yasmeen  A, Sanders  RW, Klasse  PJ, et al.  2014. Stable 293 T and CHO cell lines expressing cleaved, stable HIV-1 envelope glycoprotein trimers for structural and vaccine studies. Retrovirology. 11:33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Collier  DA, De Marco  A, Ferreira  I, Meng  B, Datir  RP, Walls  AC, Kemp  SA, Bassi  J, Pinto  D, Silacci-Fregni  C, et al.  2021. Sensitivity of SARS-CoV-2 B.1.1.7 to mRNA vaccine-elicited antibodies. Nature. 593:136–141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Corti  D, Purcell  LA, Snell  G, Veesler  D. 2021. Tackling COVID-19 with neutralizing monoclonal antibodies. Cell. 184:3086–3108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Davies  NG, Abbott  S, Barnard  RC, Jarvis  CI, Kucharski  AJ, Munday  JD, Pearson  CAB, Russell  TW, Tully  DC, Washburne  AD, et al.  2021a. Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science. 372:eabg3055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Davies  NG, Jarvis  CI, Group CC-W, Edmunds  WJ, Jewell  NP, Diaz-Ordaz  K, Keogh  RH. 2021b. Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7. Nature. 593:270–274. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Gobeil  SM, Janowska  K, McDowell  S, Mansouri  K, Parks  R, Stalls  V, Kopp  MF, Manne  K, Li  D, Wiehe  K, et al.  2021. Effect of natural mutations of SARS-CoV-2 on spike structure, conformation, and antigenicity. Science. 373:eabi6226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Gstottner  C, Zhang  T, Resemann  A, Ruben  S, Pengelley  S, Suckau  D, Welsink  T, Wuhrer  M, Dominguez-Vega  E. 2021. Structural and functional characterization of SARS-CoV-2 RBD domains produced in mammalian cells. Anal Chem. 93:6839–6847. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Harvey  WT, Carabelli  AM, Jackson  B, Gupta  RK, Thomson  EC, Harrison  EM, Ludden  C, Reeve  R, Rambaut  A, Consortium C-GU  et al.  2021. SARS-CoV-2 variants, spike mutations and immune escape. Nat Rev Microbiol. 19:409–424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Janik  E, Niemcewicz  M, Podogrocki  M, Majsterek  I, Bijak  M. 2021. The emerging concern and interest SARS-CoV-2 variants. Pathogens. 10:633. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Konings  F, Perkins  MD, Kuhn  JH, Pallen  MJ, Alm  EJ, Archer  BN, Barakat  A, Bedford  T, Bhiman  JN, Caly  L, et al.  2021. SARS-CoV-2 variants of interest and concern naming scheme conducive for global discourse. Nat Microbiol. 6:821–823. [DOI] [PubMed] [Google Scholar]
  25. Korber  B, Fischer  WM, Gnanakaran  S, Yoon  H, Theiler  J, Abfalterer  W, Hengartner  N, Giorgi  EE, Bhattacharya  T, Foley  B, et al.  2020. Tracking changes in SARS-CoV-2 spike: Evidence that D614G increases infectivity of the COVID-19 virus. Cell. 182:812, e819–827. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Lempp  FA, Soriaga  L, Montiel-Ruiz  M, Benigni  F, Noack  J, Park  YJ, Bianchi  S, Walls  AC, Bowen  JE, Zhou  J, et al.  2021. Lectins enhance SARS-CoV-2 infection and influence neutralizing antibodies. Nature. 10.1038/s41586-021-03925-1. [DOI] [PubMed] [Google Scholar]
  27. McCallum  M, De Marco  A, Lempp  FA, Tortorici  MA, Pinto  D, Walls  AC, Beltramello  M, Chen  A, Liu  Z, Zatta  F, et al.  2021. N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2. Cell. 184:2332, e2316–2347. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Molloy  SS, Anderson  ED, Jean  F, Thomas  G. 1999. Bi-cycling the furin pathway: From TGN localization to pathogen activation and embryogenesis. Trends Cell Biol. 9:28–35. [DOI] [PubMed] [Google Scholar]
  29. Narimatsu Y, Joshi HJ, Nason R, Van Coillie J, Karlsson R, Sun L, Ye Z, Chen YH, Schjoldager KT, Steentoft C, et al. 2019. An atlas of human glycosylation pathways enables display of the human glycome by gene engineered cells. Mol Cell. 75:394–407.e5.
  30. Pallesen J, Wang N, Corbett KS, Wrapp D, Kirchdoerfer RN, Turner HL, Cottrell CA, Becker MM, Wang L, Shi W, et al. 2017. Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen. Proc Natl Acad Sci U S A. 114:E7348–E7357.
  31. Reis CA, Tauber R, Blanchard V. 2021. Glycosylation is a key in SARS-CoV-2 infection. J Mol Med (Berl). 99:1023–1031.
  32. Rutten L, Lai YT, Blokland S, Truan D, Bisschop IJM, Strokappe NM, Koornneef A, van Manen D, Chuang GY, Farney SK, et al. 2018. A universal approach to optimize the folding and stability of prefusion-closed HIV-1 envelope trimers. Cell Rep. 23:584–595.
  33. Scudellari M. 2020. The sprint to solve coronavirus protein structures - and disarm them with drugs. Nature. 581:252–255.
  34. Shajahan A, Supekar NT, Gleinich AS, Azadi P. 2020. Deducing the N- and O-glycosylation profile of the spike protein of novel coronavirus SARS-CoV-2. Glycobiology. 30:981–988.
  35. Soh WT, Liu Y, Nakayama EE, Ono C, Torii S, Nakagami H, Matsuura Y, Shioda T, Arase H. 2020. The N-terminal domain of spike glycoprotein mediates SARS-CoV-2 infection by associating with L-SIGN and DC-SIGN. bioRxiv. 10.1101/2020.11.05.369264.
  36. Sztain T, Ahn SH, Bogetti AT, Casalino L, Goldsmith JA, Seitz E, McCool RS, Kearns FL, Acosta-Reyes F, Maji S, et al. 2021. A glycan gate controls opening of the SARS-CoV-2 spike protein. Nat Chem. 13:963–968.
  37. Varki A, Cummings RD, Aebi M, Packer NH, Seeberger PH, Esko JD, Stanley P, Hart G, Darvill A, Kinoshita T, et al. 2015. Symbol nomenclature for graphical representations of glycans. Glycobiology. 25:1323–1324.
  38. Volz E, Hill V, McCrone JT, Price A, Jorgensen D, O'Toole A, Southgate J, Johnson R, Jackson B, Nascimento FF, et al. 2021. Evaluating the effects of SARS-CoV-2 spike mutation D614G on transmissibility and pathogenicity. Cell. 184:64–75.e11.
  39. Wang D, Baudys J, Bundy JL, Solano M, Keppel T, Barr JR. 2020. Comprehensive analysis of the glycan complement of SARS-CoV-2 spike proteins using signature ions-triggered electron-transfer/higher-energy collisional dissociation (EThcD) mass spectrometry. Anal Chem. 92:14730–14739.
  40. Wang P, Nair MS, Liu L, Iketani S, Luo Y, Guo Y, Wang M, Yu J, Zhang B, Kwong PD, et al. 2021. Antibody resistance of SARS-CoV-2 variants B.1.351 and B.1.1.7. Nature. 593:130–135.
  41. Watanabe Y, Allen JD, Wrapp D, McLellan JS, Crispin M. 2020. Site-specific glycan analysis of the SARS-CoV-2 spike. Science. 369:330–333.
  42. Watanabe Y, Mendonca L, Allen ER, Howe A, Lee M, Allen JD, Chawla H, Pulido D, Donnellan F, Davies H, et al. 2021. Native-like SARS-CoV-2 spike glycoprotein expressed by ChAdOx1 nCoV-19/AZD1222 vaccine. ACS Cent Sci. 7:594–602.
  43. Yang TJ, Yu PY, Chang YC, Hsu SD. 2021a. COVID-19 dominant D614G mutation in the SARS-CoV-2 spike protein desensitizes its temperature-dependent denaturation. J Biol Chem. 10.1016/j.jbc.2021.101238.
  44. Yang TJ, Chang YC, Ko TP, Draczkowski P, Chien YC, Chang YC, Wu KP, Khoo KH, Chang HW, Hsu SD. 2020. Cryo-EM analysis of a feline coronavirus spike protein reveals a unique structure and camouflaging glycans. Proc Natl Acad Sci U S A. 117:1438–1446.
  45. Yang TJ, Yu PY, Chang YC, Liang KH, Tso HC, Ho MR, Chen WY, Lin HT, Wu HC, Hsu SD. 2021b. Effect of SARS-CoV-2 B.1.1.7 mutations on spike protein structure and function. Nat Struct Mol Biol. 28:731–739.
  46. Yao H, Song Y, Chen Y, Wu N, Xu J, Sun C, Zhang J, Weng T, Zhang Z, Wu Z, et al. 2020. Molecular architecture of the SARS-CoV-2 virus. Cell. 183:730–738.e13.
  47. Yurkovetskiy L, Wang X, Pascal KE, Tomkins-Tinch C, Nyalile TP, Wang Y, Baum A, Diehl WE, Dauphin A, Carbone C, et al. 2020. Structural and functional analysis of the D614G SARS-CoV-2 spike protein variant. Cell. 183:739–751.e8.
  48. Zhang Y, Zhao W, Mao Y, Chen Y, Wang S, Zhong Y, Su T, Gong M, Du D, Lu X, et al. 2020. Site-specific N-glycosylation characterization of recombinant SARS-CoV-2 spike proteins. Mol Cell Proteomics. 10.1074/mcp.RA120.002295.
  49. Zhao P, Praissman JL, Grant OC, Cai Y, Xiao T, Rosenbalm KE, Aoki K, Kellman BP, Bridger R, Barouch DH, et al. 2020. Virus-receptor interactions of glycosylated SARS-CoV-2 spike and human ACE2 receptor. Cell Host Microbe. 28:586–601.e6.
  50. Zhou B, Thao TTN, Hoffmann D, Taddeo A, Ebert N, Labroussaa F, Pohlmann A, King J, Steiner S, Kelly JN, et al. 2021a. SARS-CoV-2 spike D614G change enhances replication and transmission. Nature. 592:122–127.
  51. Zhou D, Tian X, Qi R, Peng C, Zhang W. 2021b. Identification of 22 N-glycosites on spike glycoprotein of SARS-CoV-2 and accessible surface glycopeptide motifs: Implications for vaccination and antibody therapeutics. Glycobiology. 31:69–80.

Associated Data

Supplementary Materials

Supplemental_Data_0915_cwab102
Table_SI_NGlycopeptide_Library_for_Byonic_Search_cwab102
Table_SII_NGlycopeptide_Byonic_ID_Result_cwab102
Table_SIII_Quantified_NGlycopeptides_cwab102
Table_SIV_OGlycopeptide_Byonic_ID_Result_cwab102

Data Availability Statement

The mass spectrometry data underlying this article are available from the MassIVE repository of the Center for Computational Mass Spectrometry under the dataset identifier MSV000087696.


Articles from Glycobiology are provided here courtesy of Oxford University Press
