Abstract
Designing prefusion-stabilized SARS-CoV-2 spike is critical for the effectiveness of COVID-19 vaccines. All COVID-19 vaccines in the US encode spike with K986P/V987P mutations to stabilize its prefusion conformation. However, contemporary methods for engineering prefusion-stabilized spike immunogens involve tedious experimental work and rely heavily on structural information. Here, we establish a systematic and unbiased method of identifying mutations that concomitantly improve expression and stabilize the prefusion conformation of the SARS-CoV-2 spike. Our method integrates a fluorescence-based fusion assay, mammalian cell display technology, and deep mutational scanning. As a proof-of-concept, we apply this method to a region in the S2 domain that includes the first heptad repeat and central helix. Our results reveal that besides K986P and V987P, several mutations simultaneously improve expression and significantly lower the fusogenicity of the spike. As prefusion stabilization is a common challenge for viral immunogen design, this work will help accelerate vaccine development against different viruses.
Subject terms: High-throughput screening, SARS-CoV-2
Designing vaccine immunogens is often a tedious process. Here the authors develop a deep mutational scanning-based method to rapidly and comprehensively identify prefusion stabilizing mutations of SARS-CoV-2 spike as a vaccine immunogen.
Introduction
SARS-CoV-2 spike (S) glycoprotein, a homotrimeric class I fusion protein, naturally exists in a metastable, prefusion conformation on the virion surface1. Once the receptor-binding domain (RBD) of S transitions to an ‘up’ state and binds to the human angiotensin-converting enzyme II (hACE2) receptor2–4, a cascade of conformational changes is triggered to promote virus-host membrane fusion, and hence virus entry1,5–8. This conformational change, which involves structural rearrangement of the first heptad repeat (HR1) and central helix (CH), as well as the shedding of the S1 subunit, converts S into the postfusion conformation5–10. To inhibit virus entry and fusion, neutralizing antibodies target a variety of mainly conformational epitopes on the prefusion conformation of S11–15. Many of these conformational epitopes disappear or rearrange in the postfusion conformation, which instead can expose non-neutralizing epitopes that are immunodominant1. Consistently, antibody titer to the prefusion conformation has a strong correlation with neutralization potency, whereas that to the postfusion conformation does not16. Therefore, effective COVID-19 vaccines require S to be locked in the prefusion conformation to preserve the neutralizing epitopes.
The rapid development of prefusion-stabilized SARS-CoV-2 S during the early phase of the COVID-19 pandemic benefited tremendously from prior studies on prefusion-stabilizing mutations in the S proteins of related betacoronaviruses, namely MERS-CoV17,18 and SARS-CoV18. These studies employed a structure-based approach to identify two prefusion-stabilizing mutations (K986P/V987P, SARS-CoV-2 numbering) at the HR1-CH junction17–19. Due to the structural similarities among the S proteins of MERS-CoV, SARS-CoV, and SARS-CoV-2, K986P/V987P were directly applied to engineer the prefusion-stabilized SARS-CoV-2 S during COVID-19 vaccine development. For example, K986P/V987P are included in many nucleic acid- and protein subunit-based COVID-19 vaccines, such as those from Moderna20, Pfizer-BioNTech21, Johnson & Johnson-Janssen22, and Novavax23. Subsequent studies, which also used a structure-based approach, identified additional mutations that further improve the expression and prefusion stability of SARS-CoV-2 S24–27. Nevertheless, identifying prefusion-stabilizing mutations using a structure-based approach is time-consuming and likely not comprehensive, because it relies on low-throughput characterization of individual candidate mutants. Thus, viral vaccine immunogen design remains challenging28.
To this end, we develop here a method to identify prefusion-stabilizing mutations of SARS-CoV-2 S in a high-throughput and systematic manner, by coupling a fluorescence-based fusion assay, mammalian cell display technology, and deep mutational scanning (DMS). As a proof-of-concept, we screen all possible amino-acid mutations across the entire region spanning HR1 and CH. In addition to the K986P and V987P mutations that are used in current COVID-19 vaccines, we identify several mutations that simultaneously improve expression and stabilize the prefusion conformation of both membrane-bound and soluble S. In this regard, our method circumvents the limitations of using structure-based approaches to engineer prefusion-stabilized S immunogens.
Results
Establishing a high-throughput fusion assay for SARS-CoV-2 S
High-throughput assays for measuring protein mutant expression level in human cells have been developed in previous studies by one of our authors using landing pad cells29–31, which enable one cell to express one mutant, thereby providing a genotype-phenotype linkage32,33. Such assays have also been adopted to study the impact of N-terminal domain (NTD) mutations on SARS-CoV-2 S expression34. However, there is no similar assay for measuring fusogenicity. Conventional approaches for quantifying fusogenicity often rely on split fluorescent protein systems35–40, such as the split GFP system that consists of GFP1-10 and GFP1141. For example, when cells that express hACE2 and GFP1-10 are mixed with cells expressing SARS-CoV-2 S and GFP11, fusion occurs, and the resultant syncytia fluoresce green. In this study, we pioneered an approach by combining this fluorescence-based fusion assay with the use of landing pad cells to establish a high-throughput fusion assay that is compatible with DMS42.
Specifically, we constructed a DMS library of membrane-bound S that was expressed by HEK293T landing pad cells, such that each cell would encode and express one S mutant. The DMS library contained all possible amino acid mutations from residues 883 to 1034, which covers HR1 (residues 912-984) and CH (residues 985-1034). All S-expressing cells also expressed mNeonGreen2(11) (mNG2(11)), which belongs to the split monomeric NeonGreen2 system43. At the same time, a stable cell line that expressed hACE2 and mNG2(1-10) was generated (Fig. S1). For the rest of the study, unless otherwise stated, HEK293T landing pad cells that expressed S and mNG2(11) are abbreviated as “S-expressing cells” and those that expressed hACE2 and mNG2(1-10) are abbreviated as “hACE2-expressing cells”.
When S-expressing cells were mixed with hACE2-expressing cells, S-expressing cells that encoded fusion-competent mutants would fuse with hACE2-expressing cells to form green-fluorescent syncytia (Fig. 1a, c, see Methods). In contrast, no fusion would occur with S-expressing cells that encoded fusion-incompetent mutants. Subsequently, fluorescence-activated cell sorting (FACS) was performed to separate the unfused cells and green-fluorescent syncytia, both of which were then analyzed by next-generation sequencing. The fusogenicity of each mutant could be quantified by comparing its frequency between the green-fluorescent syncytia sample and the unfused cell sample. In parallel, the expression level of each mutant was measured in a high-throughput manner as described previously29,34 (see Methods).
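As a rough illustration of the comparison described above, the following Python sketch computes a per-mutant enrichment ratio from read counts in the two sorted populations. The function names, mutant labels, and counts are hypothetical, and the plain frequency ratio is an assumption made for illustration only; the score actually used in this study is defined in Methods.

```python
def frequencies(counts):
    """Convert raw read counts per mutant into frequencies within one sample."""
    total = sum(counts.values())
    return {mutant: count / total for mutant, count in counts.items()}


def enrichment(syncytia_counts, unfused_counts):
    """Ratio of a mutant's frequency in syncytia to its frequency in unfused cells.

    Assumption: a simple ratio, with mutants absent from either sample skipped.
    """
    syncytia_freq = frequencies(syncytia_counts)
    unfused_freq = frequencies(unfused_counts)
    return {
        mutant: syncytia_freq[mutant] / unfused_freq[mutant]
        for mutant in syncytia_freq
        if unfused_freq.get(mutant, 0) > 0
    }


# Hypothetical counts for two illustrative mutants.
syncytia = {"mut_fusogenic": 30, "mut_nonfusogenic": 10}
unfused = {"mut_fusogenic": 10, "mut_nonfusogenic": 30}
ratios = enrichment(syncytia, unfused)
```

In this toy input, a mutant that is over-represented in syncytia receives a ratio above 1, and one that is depleted receives a ratio below 1.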
Prior to performing the DMS experiments above, the expression of membrane-bound S in HEK293T landing pad cells was verified via flow cytometry analysis using the RBD antibody CC12.344 (Fig. 1b). Moreover, the formation of green-fluorescent syncytia due to the fusion of S-expressing cells and hACE2-expressing cells was also verified by microscopy and flow cytometry (Fig. 1c, d, Fig. S2a). We further showed that such fusion can be inhibited by CC40.8, a neutralizing antibody to the stem helix of the S fusion machinery45, at the highest concentration tested (Fig. S2b). This result confirmed that the fusion of S-expressing cells and hACE2-expressing cells was mediated by the S protein. We optimized the fusion assay to maximize the formation of green-fluorescent syncytia while minimizing the risk of clogging the cell sorter (Fig. S2c–e).
Identification of fusion-incompetent S mutations with high expression level
From the DMS results, we computed the fusion score and expression score for each of the 2736 missense mutations, 152 nonsense mutations, and 152 silent mutations (see Methods). A higher expression score indicates a higher S expression level. Similarly, a higher fusion score indicates higher fusogenicity. Both expression score and fusion score were normalized such that the average score of silent mutations was 1 and that of nonsense mutations was 0. Three and two biological replicates were performed for the high-throughput expression and fusion assays, respectively. The Pearson correlation coefficient of expression scores among replicates ranged from 0.72 to 0.79, whereas that of fusion scores between replicates was 0.61, confirming the reproducibility of our DMS experiments (Fig. S3a, b). In addition, the expression score distribution and fusion score distribution of silent mutations were significantly different from those of nonsense mutations (Fig. S3c, d), indicating that our DMS experiments could distinguish mutants with different expression and fusogenicity levels. The expression score and fusion score for individual mutations are shown in Fig. S4 and Supplementary Data 1.
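The normalization described above can be sketched as a linear rescaling that places the mean silent-mutation score at 1 and the mean nonsense-mutation score at 0. The Python sketch below is one rescaling that satisfies this property; the function name, the mutant labels, and the raw values are hypothetical and are not taken from the data.

```python
from statistics import mean


def normalize_scores(raw_scores, silent_mutants, nonsense_mutants):
    """Rescale scores so that mean(silent) = 1 and mean(nonsense) = 0."""
    silent_mean = mean(raw_scores[m] for m in silent_mutants)
    nonsense_mean = mean(raw_scores[m] for m in nonsense_mutants)
    span = silent_mean - nonsense_mean
    return {m: (score - nonsense_mean) / span for m, score in raw_scores.items()}


# Hypothetical raw scores for illustration.
raw = {"silent_a": 2.0, "silent_b": 4.0, "stop_a": -1.0, "stop_b": 1.0, "mut_x": 4.5}
normalized = normalize_scores(raw, ["silent_a", "silent_b"], ["stop_a", "stop_b"])
```

Under this rescaling, a missense mutation with a normalized score above 1 scores higher than the average silent mutation, and a score near 0 resembles a nonsense mutation.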
Since our fusion assay measured the fusogenicity at the cell level rather than at the single molecule level, the fusion score would be influenced by the expression level even if the fusogenicity per S molecule remained constant. Consistently, the fusion score positively correlated with the expression score (Fig. 2a). To correct for the effect of S expression level on fusogenicity, we computed an adjusted fusion score, which represented the residual of a linear regression model of fusion score on expression score (Fig. 2b). Mutations that had a low adjusted fusion score and a high expression score included the well-known prefusion-stabilizing mutations K986P and V987P that were used in current COVID-19 vaccines46,47 (Fig. 2b), substantiating that our method could identify prefusion-stabilizing mutations.
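To make the adjustment concrete, the Python sketch below fits an ordinary least-squares line of fusion score on expression score and returns the residual for each mutant. It is a minimal illustration with hypothetical names and input values; any weighting or filtering applied in the actual analysis is not reproduced here.

```python
def adjusted_fusion_scores(expression_scores, fusion_scores):
    """Residuals of an ordinary least-squares fit: fusion ~ slope * expression + intercept."""
    n = len(expression_scores)
    mean_x = sum(expression_scores) / n
    mean_y = sum(fusion_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in expression_scores)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(expression_scores, fusion_scores)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A negative residual means lower fusogenicity than expected for that expression level.
    return [
        y - (slope * x + intercept)
        for x, y in zip(expression_scores, fusion_scores)
    ]


# Hypothetical scores: the last mutant expresses well but fuses poorly.
expression = [0.0, 0.5, 1.0, 1.2]
fusion = [0.0, 0.5, 1.0, 0.1]
residuals = adjusted_fusion_scores(expression, fusion)
```

A mutant with high expression and a strongly negative residual, like the last entry in this toy input, is the pattern expected of a prefusion-stabilizing candidate.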
Previous studies have shown that the expression of S with K986P/V987P can be improved by additional mutations24–27, as exemplified by an S construct known as HexaPro, which contains mutations F817P, A892P, A899P, A942P on top of K986P and V987P. Except for F817P, the other mutations in HexaPro were all present in our DMS library. Consistent with the original report of HexaPro24, our DMS data showed that A899P had minimal influence on S expression, whereas A892P and A942P noticeably increased S expression (Fig. 2a, b). These observations further validated our DMS data.
Validation and combinations of prefusion-stabilizing mutations
Besides K986P and V987P, we also identified other mutations in HR1 and CH that had a low adjusted fusion score and a high expression score, particularly T961F, D994E, D994Q and Q1005R (Fig. 2b, c). Of note, D994E and D994Q were at the same residue position and chemically similar. By expressing these four mutations individually using HEK293T landing pad cells, we validated that they indeed improved the surface expression of S (Figs. 3a, S5a) and prevented the formation of syncytia when incubated with hACE2-expressing cells (Figs. 3d, S6a, b). Consistent with the DMS data (Fig. 2), the effects of T961F, D994E, D994Q and Q1005R on S expression and fusogenicity were comparable to K986P and V987P in the validation experiments. As a control, we also selected two mutations that had a high adjusted fusion score and a high expression score, namely S943H and A944S (Fig. 2b), and validated their enhancement in S expression and fusogenicity (Figs. 3b, e, S5b, S6c, d).
Subsequently, we combined the validated fusion-incompetent mutations K986P, V987P, D994Q and Q1005R to generate double (K986P/V987P: ‘2P’), triple (K986P/V987P/D994Q: ‘2PQ’, K986P/V987P/Q1005R: ‘2PR’) and quadruple (K986P/V987P/D994Q/Q1005R: ‘2PQR’) mutants of membrane-bound S. Surface expression of these mutation combinations was higher than that of WT, but comparable with each other (Figs. 3c, S5c). As expected, none of these S mutation combinations fused with hACE2-expressing cells (Figs. 3f, S6e, f). We further tested the expression of soluble S ectodomain with different mutants. Interestingly, addition of D994Q to 2P improved expression of soluble S ectodomain by approximately three-fold, whereas Q1005R drastically reduced expression of soluble S (Fig. S7). Q1005R seemed to increase the formation of higher order oligomers of soluble S ectodomain, as observed by a peak higher than the expected size of trimeric S ectodomain in size exclusion chromatography of all mutants that contained Q1005R (Fig. S7b). These observations indicate that certain mutations can improve the expression level of S in membrane-bound form but not soluble ectodomain form.
Structural and biophysical characterization of 2PQ spike
Due to the improvement of 2PQ over 2P in soluble S ectodomain expression, we proceeded with biophysical characterization of 2PQ to rationalize the prefusion-stabilization mechanism of D994Q. The prefusion conformation of 2PQ was confirmed by low-resolution cryogenic electron microscopy (Figs. 4a, b, S8a). While the structure could not be resolved at atomic resolution, the result allowed us to confirm that the 2PQ mutant is in the prefusion-stabilized conformation. In addition, the electron density of the protein backbone clearly showed that the helix containing residue 994 is shifted towards the helix containing residue Q755 (Fig. S8b). This observation is corroborated by in silico mutagenesis using Rosetta, which showed that the helices are brought together in proximity so that D994Q forms an intraprotomer hydrogen bond with Q755 to stabilize the prefusion conformation (Fig. 4d). Differential scanning fluorimetry revealed that both 2P and 2PQ had an apparent melting temperature at approximately 46.5 °C, similar to the previously reported value for 2P24. Nevertheless, 2PQ had another peak at approximately 62 °C, suggesting that the additional D994Q mutation prevents immediate, complete unfolding of S (Fig. 4c). The stabilizing effect of D994Q, however, was not as pronounced as that of the combination of F817P, A892P, A899P, and A942P used in HexaPro, which not only showed two peaks in the differential scanning fluorimetry analysis, but also shifted the first apparent melting temperature by 5 °C24.
Finally, we tested whether D994Q altered the antigenicity of the S protein. We compared the binding of 2P and 2PQ to various S antibodies, including CC12.3 (RBD)44, S2M28 (NTD)48, CC40.8 (S2 stem helix)45, and COVA1-07 (S2 HR1)49, using flow cytometry. 2P and 2PQ showed similar binding affinity to CC12.3 and S2M28 (Figs. 4e, S9a, b). However, when assayed for binding with COVA1-07 or with CC40.8, 2PQ had weaker binding than 2P (Figs. 4e, S9c, d). Of note, COVA1-07 only binds efficiently when S is in an open conformation that has transitioned away from the prefusion conformation49. Similarly, the binding of CC40.8 to S requires partial disruption of the prefusion S trimer and is shown to be weakened by prefusion-stabilizing mutations45. Therefore, our result substantiates that D994Q can further enhance the prefusion stability of 2P, which is known to insufficiently stabilize the prefusion conformation24,25,50. Collectively, these data reveal a prefusion-stabilization mechanism of D994Q and demonstrate its minimal impact on the antigenicity of the head domain of S. Future studies should explore whether D994Q or other prefusion-stabilizing mutations identified in this study can further improve the stability of other prefusion-stabilized S constructs, such as S-Closed25, HexaPro24, and VFLIP27, while retaining its antigenicity.
Discussion
Structure-based design28 of prefusion-stabilized class I viral fusion proteins has been successfully applied to HIV51–54, RSV55, Nipah56, Lassa57, Ebola58, and more recently SARS-CoV-224–27. Although structure-based design is an effective approach for prefusion-stabilization, it requires structural determination and subsequent expression, purification, and characterization of each candidate mutation individually. This laborious experimental process limits the comprehensiveness of using a structure-based approach to identify prefusion-stabilizing mutations. In this study, we established a high-throughput approach to measure the fusogenicity of thousands of mutations in parallel. This approach enables systematic identification of prefusion-stabilizing mutations without relying on structural information. While we only provide a proof-of-concept using the SARS-CoV-2 S protein, our approach can be adopted to fusion proteins of other viruses with known cell surface receptors. Given that prefusion-stabilization is critical for viral immunogen design28,59, our work here should advance the process of viral vaccine development.
One interesting finding in this study is that the expression of membrane-bound (i.e. full-length) S protein does not necessarily correlate with the expression of soluble S ectodomain, as exemplified by Q1005R. In addition, our results show that S943G, A944G and A944P mutations, which have been shown to increase the expression of soluble S ectodomain25, do not increase the expression of membrane-bound S protein. These observations indicate that the ectodomain of the S protein has some long-range interactions with its native transmembrane domain. As a result, caution is needed when extrapolating the results obtained from full-length S protein to soluble S ectodomain, or vice versa. However, since most COVID-19 vaccines on the market are based on the full-length membrane-bound S protein60, the results from our high-throughput fusion and expression assays, which are also based on full-length membrane-bound S protein, are directly applicable to COVID-19 vaccine development.
Although most SARS-CoV-2 neutralizing antibodies target RBD61, recent studies have shown that antibodies to S2 can also neutralize, albeit often at a lower potency45,62–65. As a result, understanding the evolutionary constraints of S2 is relevant to SARS-CoV-2 antigenic drift and to design of more universal coronavirus vaccines. While many mutations in HR1 and CH, including those of major SARS-CoV-2 variants (Table S1), do not negatively impact the expression or fusogenicity of the S protein (Fig. 2b), HR1 and CH show high degrees of evolutionary conservation among betacoronaviruses (Fig. S10). This observation could be due to low levels of positive selection pressure on HR1 and CH, since most neutralizing antibodies are directed towards the RBD61. Alternatively, besides S protein expression and fusogenicity, other evolutionary constraints on HR1 and CH may be present in vivo. Future studies of the relationship among S protein expression, fusogenicity, and virus replication fitness will provide important biophysical insights into the evolution of SARS-CoV-2.
Since RBD is present in the prefusion conformation but not the postfusion conformation5–7 and is the major target of neutralizing antibodies61, this study used an RBD antibody to probe for surface expression. Nevertheless, we acknowledge that the folding of prefusion S can be more comprehensively probed by antibodies to conformational epitopes in the NTD and S2 subunit. Furthermore, due to the technical difficulties in sorting large syncytia, co-culturing of S-expressing cells and hACE2-expressing cells could only be performed for relatively short durations before sorting, leading to fewer syncytia and potentially lower reproducibility of results across replicates. Alternative strategies including microfluidics-based fusion experiments can be explored to obviate the kinetic limitations of the current fusion assay.
If the prefusion-stabilizing mutations of betacoronavirus S protein had not been reported in the late 2010s17,66, it would not have been possible to develop an effective COVID-19 vaccine at the speed that occurred, even with the availability and utilization of the mRNA vaccine technology. It is unclear whether the next pandemic will be caused by a virus that we have prior knowledge about. Consequently, while the speed of vaccine manufacturing has been revolutionized by the mRNA vaccine technology67, it is equally important to maximize the speed of immunogen design so that we are fully prepared for the next pandemic. We believe our work here provides an important step in that regard.
Methods
Cell culture
Human embryonic kidney 293T (HEK293T) landing pad cells obtained from Dr. Kenneth A. Matreyek (Case Western Reserve University) were grown and maintained in complete growth medium: Dulbecco’s modified Eagle medium (DMEM) with high glucose (Gibco), supplemented with 10% v/v fetal bovine serum (FBS; VWR), 1× non-essential amino acids (Gibco), 100 U/mL penicillin and 100 μg/mL streptomycin (Gibco), 1× GlutaMAX (Gibco) and 2 μg/mL doxycycline (Thermo Scientific) at 37 °C, 5% CO2 and 95% humidity. Expi293F cells (Gibco, catalog number A14527) were grown and maintained in Expi293 expression medium (Gibco) at 37 °C, 8% CO2, 95% humidity and 125 rpm according to the manufacturer’s instructions.
Landing pad plasmids
attB plasmids each encoding (hACE2, an internal ribosomal entry site [IRES], and hygromycin resistance: attB-hACE2), (hACE2, an IRES, general control nonderepressible 4 [GCN4] leucine zipper fused to mNG2(1-10), a (GSG) P2A self-cleaving peptide, and hygromycin resistance: attB-hACE2-mNG2-1-10), and (S with the PRRA motif in the furin cleavage site deleted, an IRES, GCN4 leucine zipper fused to mNG2(11), a (GSG) P2A self-cleaving peptide, and puromycin resistance: attB-S-mNG2-11) were constructed and assembled via polymerase chain reaction (PCR). The sequence of S used in this study was the ancestral (Wuhan-Hu-1) strain (GenBank accession ID: MN908947.3)68. The PRRA motif in the furin cleavage site was deleted to prevent spontaneous fusion of S-expressing cells with each other69. For experimental validation, mutants of S were individually constructed using PCR-based site-directed mutagenesis. Pairs of primers used for PCR-based site-directed mutagenesis are listed in Table S2.
Deep mutational scanning library construction
Cassette primers for DMS library construction are listed in Table S3. Cassette primers were resuspended in MilliQ H2O such that the final concentration of each primer was 10 μM. Forward cassette primers, named CassetteX_N (X = 1, 2, …, 19; N = 1, 2, …, 8), that belong to the same cassette (i.e., the same value of X) were mixed in equimolar ratios. Each forward cassette primer also carried unique silent mutations (i.e. synonymous mutations) to help distinguish between sequencing errors and true mutations in downstream sequencing data analysis as described previously70. For the first round of PCR, two sets of reactions were set up. The first set had the mixed cassette primers and 5′-ACG ACG TCT CCT TCT CTA GGA AAG TGG GCT TTG C-3′ as forward and reverse primers, respectively. The second set had 5′-TGC TCG TCT CCA AAG TGA CAC TGG CCG ACG CCG G-3′ and CassetteX_R primers (X = 1, 2, …, 19) as forward and reverse primers, respectively. Since we had 19 cassettes, there were 19 PCRs for each of the two sets of reactions. For both sets, the template used was attB-S-mNG2-11. Thereafter, products corresponding to the correct size were excised and purified using Monarch DNA Gel Extraction kit (NEB). For the second round of PCR, 10 ng of PCR product from each of the first and second sets in the same cassette were mixed. 5′-TGC TCG TCT CCA AAG TGA CAC TGG CCG ACG CCG G-3′ and 5′-ACG ACG TCT CCT TCT CTA GGA AAG TGG GCT TTG C-3′ were used as the forward and reverse primers, respectively. PCR products corresponding to the correct size were excised and purified using Monarch DNA Gel Extraction kit (NEB). 100 ng of each of the 19 gel-purified PCR products was mixed and digested with BsmBI restriction enzyme (NEB) for 2 h at 55 °C. Then, the product was purified using PureLink PCR Purification kit (Invitrogen) and served as the insert.
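The primer naming scheme above is consistent with the size of the mutated region: 19 cassettes with 8 forward primers each gives 152, which equals the number of residues from 883 to 1034. The Python sketch below maps a residue to a forward primer name under two assumptions that are not stated in the text: that the cassettes tile the region consecutively from residue 883, and that primer N of a cassette targets the Nth codon of that cassette. The actual primer assignments are those listed in Table S3.

```python
FIRST_RESIDUE = 883
LAST_RESIDUE = 1034
N_CASSETTES = 19
PRIMERS_PER_CASSETTE = 8


def forward_primer_name(residue):
    """Name of the forward cassette primer assumed to mutate the given residue.

    Assumption: consecutive tiling, 8 codons per cassette, one codon per primer.
    """
    if not FIRST_RESIDUE <= residue <= LAST_RESIDUE:
        raise ValueError(f"residue {residue} is outside the mutated region")
    offset = residue - FIRST_RESIDUE
    cassette = offset // PRIMERS_PER_CASSETTE + 1
    primer = offset % PRIMERS_PER_CASSETTE + 1
    return f"Cassette{cassette}_{primer}"


# Number of mutated residues implied by the region boundaries.
n_residues = LAST_RESIDUE - FIRST_RESIDUE + 1
```

Under these assumptions, `forward_primer_name(883)` gives `Cassette1_1` and `forward_primer_name(1034)` gives `Cassette19_8`.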
To amplify the vector, attB-S-mNG2-11, 5′-CAC TCG TCT CGA GAA GGC GTG TTC GTG TCC AAC G-3′, and 5′-GGC CCG TCT CAC TTT GTT GAA CAG CAG GTC CTC G-3′ were used as template, forward primer, and reverse primer, respectively. The PCR product was digested with DpnI (NEB) for 2 h at 37 °C, purified with PureLink PCR Purification kit (Invitrogen), digested with BsmBI restriction enzyme (NEB) for 2 h at 55 °C, and purified again using a PureLink PCR Purification kit (Invitrogen). All PCRs were performed using PrimeSTAR Max DNA Polymerase (Takara) according to the manufacturer’s instructions.
BsmBI-digested vector and insert were ligated in a molar ratio of 1:100 to a total of 1 μg using T4 DNA ligase (NEB) for 2 h at room temperature. A control ligation reaction was set up by only having the BsmBI-digested vector (no insert). 1 μL of each ligation reaction was transformed into chemically competent DH5α Escherichia coli cells and plated onto agar plates with 100 μg/mL ampicillin. The ligation mixture that contained vector and insert had at least 10 times more colonies than the control reaction. Subsequently, the ligation mixture was column-purified using a PureLink PCR Purification kit and eluted in 10 μL of MilliQ H2O. 1 μL of the purified ligated product was mixed with 30 μL MegaX DH10β T1R electrocompetent E. coli cells (Invitrogen) in an electroporation cuvette with a 1 mm gap (BTX). Electroporation was performed at 2.0 kV, 200 Ω and 25 μF using an ECM 830 square wave electroporation system (BTX). 1 mL of SOC recovery medium (NEB) was added to the cells immediately after electroporation. Two electroporation reactions were performed. Cells were recovered for 1 h at 37 °C with shaking at 250 rpm. After recovery, cells were collected via centrifugation, resuspended in 400 μL lysogeny broth (LB), plated onto 150 mm agar plates supplemented with 100 μg/mL ampicillin, and incubated overnight at 37 °C. At least 1 × 10⁶ colonies were scrape-harvested with LB broth and plasmids were extracted using a PureLink Plasmid Midiprep kit (Invitrogen).
Landing pad cell transfection
6.0 × 10⁵ HEK293T landing pad cells in 1.35 mL of complete growth medium were seeded per well of a 6-well plate. 1.7 μg of attB-hACE2-mNG2-1-10 plasmid or attB-S-mNG2-11 plasmid was added into 5 μL FuGENE 6 transfection reagent (Promega) and OptiMEM (Gibco) to a total volume of 240 μL. The transfection mixture was subsequently added dropwise into cells. Transfection was carried out on the same day as seeding. One day post-transfection, 500 μL of complete growth medium was added to cells. Three days post-transfection, medium was discarded, cells were washed with 1× PBS, and incubated in negative selection medium (complete growth medium supplemented with 10 nM AP1903) for one day at 37 °C, 5% CO2 and 95% humidity. Then, the medium was discarded, cells were washed with 1× PBS, and recovered in complete growth medium for two days at 37 °C, 5% CO2 and 95% humidity. Cells were then trypsinized and grown in positive selection medium indefinitely: hACE2- and S-expressing cells were maintained in hygromycin medium (complete growth medium supplemented with 100 μg/mL hygromycin B [Invivogen]) and puromycin medium (complete growth medium supplemented with 1 μg/mL puromycin [Invivogen]), respectively.
To construct the S2 HR1/CH DMS cell line, the above protocol was used with modifications: 3.5 × 10⁶ cells in 8 mL of complete growth medium in a T75 flask were transfected with 7.1 μg of the DMS plasmid library and 29 μL of FuGENE 6 transfection reagent in 1.4 mL of OptiMEM. For positive selection and regular maintenance, puromycin medium was used.
Flow cytometry
To validate hACE2 surface expression after transfection, landing pad cells were harvested via centrifugation at 300 × g for 5 min at 4 °C, resuspended in ice-cold FACS buffer (2% v/v FBS, 50 mM EDTA in DMEM supplemented with high glucose, L-glutamine and HEPES, without phenol red [Gibco]), and incubated with 2 μg/mL of SARS-CoV-2 S RBD-IgG Fc for 1 h at 4 °C. Then, cells were washed once, and resuspended with ice-cold FACS buffer. Cells were incubated with 1 μg/mL of phycoerythrin (PE)-conjugated anti-human IgG Fc (BioLegend, clone M1310G05, catalog number 410708). Cells were washed once and resuspended in ice-cold FACS buffer. Cells were analyzed using an Accuri C6 flow cytometer (BD Biosciences). Data was collected using BD Accuri C6 software v264 (BD Biosciences).
For verification and quantification of S surface expression, the above protocol was used except that cells were incubated with 5 μg/mL of CC12.344, an RBD antibody, instead of SARS-CoV-2 S RBD-IgG Fc, for 1 h at 4 °C. To quantify fold change in surface expression of S relative to WT based on median fluorescence intensity (MFI), Eq. (1) was used in the plot of FSC-A against PE:
$$\text{Fold change in surface expression} = \frac{\text{MFI}_{\text{mutant}}}{\text{MFI}_{\text{WT}}} \qquad (1)$$
MFI values were obtained after plotting data in FCS Express Flow Cytometry software v6 (De Novo Software). Gating strategy is shown in Fig. S11a.
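As a small worked example of the fold-change calculation, the Python sketch below takes per-event PE intensities for a mutant and for WT and returns the ratio of their medians. Reading the fold change as a plain ratio of MFI values is an assumption based on the description above; the function name and the intensity values are hypothetical, and in practice the MFI values came from FCS Express after gating.

```python
from statistics import median


def fold_change_in_expression(mutant_pe_events, wt_pe_events):
    """Ratio of median PE fluorescence intensity (MFI) of a mutant to that of WT."""
    return median(mutant_pe_events) / median(wt_pe_events)


# Hypothetical per-event PE intensities after gating.
wt_events = [50.0, 100.0, 150.0]
mutant_events = [100.0, 200.0, 300.0]
fold_change = fold_change_in_expression(mutant_events, wt_events)
```

A value above 1 indicates higher surface expression than WT under this reading.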
To assess fusogenicity of S (WT or mutants), equal numbers of cells expressing hACE2 and mNG2(1-10) and cells expressing S and mNG2(11) were mixed such that the total cell number was 5.0 × 10⁵ cells per mL of complete growth medium. Cells were co-cultured for 3 h at 37 °C, 5% CO2 and 95% humidity. Cells were then harvested and resuspended in ice-cold FACS buffer. Cells were analyzed using an Accuri C6 flow cytometer (BD Biosciences). Data was collected using BD Accuri C6 software v264 (BD Biosciences). Gating strategy is shown in Fig. S11b. The percentage of mNG2-positive events of mutants relative to that of WT S was calculated.
For titration of S-2P or S-2PQ, HEK293T landing pad cells stably expressing membrane-bound S-2P or S-2PQ were incubated with 0, 0.1, 0.3, 1.0, 3.0 or 10.0 μg/mL of S2M28, CC12.3, COVA1-07, or CC40.8 antibody for 1 h in ice-cold FACS buffer. Cells were washed and then incubated with 1 μg/mL PE-conjugated anti-human IgG Fc for 1 h at 4 °C. Cells were washed, resuspended in ice-cold FACS buffer, and analyzed for levels of PE using an Accuri C6 flow cytometer. Gating strategy is shown in Fig. S11a. The MFI value of the negative control (0 μg/mL of antibody) was subtracted from each MFI value, and the result was plotted against antibody concentration (Figs. 4e, S9).
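The background correction for the titration can be sketched as follows, assuming that the 0 μg/mL sample serves as the negative control for every other concentration of the same antibody. The function name and the MFI values are hypothetical.

```python
def background_subtracted_mfi(mfi_by_concentration):
    """Subtract the MFI of the 0 ug/mL control from the MFI at every other concentration."""
    background = mfi_by_concentration[0.0]
    return {
        concentration: mfi - background
        for concentration, mfi in sorted(mfi_by_concentration.items())
        if concentration != 0.0
    }


# Hypothetical MFI readings keyed by antibody concentration in ug/mL.
readings = {0.0: 120.0, 0.1: 300.0, 0.3: 650.0, 1.0: 1400.0, 3.0: 2600.0, 10.0: 3100.0}
corrected = background_subtracted_mfi(readings)
```

The corrected values can then be plotted against antibody concentration to compare binding between constructs.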
Expression sorting
Cells expressing the S2 HR1/CH DMS library of S were harvested via centrifugation at 300 × g for 5 min at 4 °C. Supernatant was discarded, and cells were resuspended in ice-cold FACS buffer. Cells were incubated with 5 μg/mL of CC12.3 for 1 h at 4 °C. Then, cells were washed once, and resuspended with ice-cold FACS buffer. Cells were incubated with 2 μg/mL of PE anti-human IgG Fc. Cells were washed once, resuspended in ice-cold FACS buffer, and filtered through a 40 μm strainer. Cells were sorted via a four-way sort using a BD FACS Aria II cell sorter and BD FACS Diva software v8.0 (BD Biosciences), or a BigFoot spectral cell sorter and Sasquatch software firmware v888 (Invitrogen) according to PE fluorescence at 4 °C. Cells expressing the highest PE fluorescence were sorted into “bin 3”, then the next highest into “bin 2”, followed by “bin 1” and then “bin 0”. Each bin had ~25% of the singlet population. Gating strategy is shown in Fig. S11c. Number of cells collected per bin per replicate is shown in Table S4. Of note, since CC12.3 binds to the RBD44, an independently folded region of S that is present only in the prefusion but not postfusion conformation1,71, our sort was based on the expression of prefusion S.
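To illustrate what four bins of roughly equal population mean, the Python sketch below derives quartile cut points from a set of PE intensities and assigns each event to bin 0 through bin 3. This is only a numerical analogy: the real gates were drawn on the cell sorter, and the function names and intensity values here are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def quartile_gates(pe_values):
    """Three cut points that split the events into four bins of about 25% each."""
    return quantiles(pe_values, n=4)


def assign_bin(pe_value, gates):
    """Bin index from 0 (lowest PE) to 3 (highest PE) for a single event."""
    return bisect_right(gates, pe_value)


# Hypothetical PE intensities for 100 singlet events.
pe_intensities = list(range(1, 101))
gates = quartile_gates(pe_intensities)
bin_counts = Counter(assign_bin(value, gates) for value in pe_intensities)
```

With this toy input, each of the four bins receives 25 of the 100 events, mirroring the ~25% per bin described above.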
Fusion sorting
Cells expressing the HR1/CH DMS library of S, and cells expressing hACE2 were resuspended in complete growth medium and filtered through a 40 μm cell strainer to obtain single cell suspensions. 2.5 × 10⁶ cells of each were mixed in a T-75 flask and complete growth medium was added to a total volume of 10 mL. Six co-cultures were set up, with one of the co-cultures acting as a negative, non-fluorescent control by mixing hACE2- and S-expressing cells that do not have split mNG2. Co-cultures were incubated for 3 h at 37 °C, 5% CO2 and 95% humidity. Subsequently, cells were harvested and pelleted via centrifugation at 300 × g for 5 min at 4 °C. Supernatant was discarded, and cells were resuspended in ice-cold FACS buffer. Cells were sorted via a two-way sort using a BigFoot spectral cell sorter (Invitrogen) according to presence or absence of mNG2 fluorescence at 4 °C. Gating strategy is shown in Fig. S11d. Number of cells collected per bin per replicate is shown in Table S5.
Post-sorting genomic DNA extraction
After FACS, cell pellets were obtained via centrifugation at 300 × g for 15 min at 4 °C, and the supernatant was discarded. Genomic DNA was extracted using a DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer’s instructions with a modification: resuspended cells were incubated and lysed at 56 °C for 30 min instead of 10 min.
Deep sequencing
After genomic DNA extraction, the region of interest was amplified via PCR using 5′-CAC TCT TTC CCT ACA CGA CGC TCT TCC GAT CTA CAT CTG CCC TGC TGG CCG GCA CA-3′ and 5′-GAC TGG AGT TCA GAC GTG TGC TCT TCC GAT CTG CAA AAG TCC ACT CTC TTG CTC TG-3′ as forward and reverse primers, respectively. A maximum of 500 ng of genomic DNA per 50 μL PCR reaction was used as template; 4 μg of genomic DNA per expression or fusion bin, per replicate, was used as template. PCR was performed using KOD DNA polymerase (Takara) with the following settings: 95 °C for 2 min, 25 cycles of (95 °C for 20 s, 56 °C for 15 s, 68 °C for 20 s), 68 °C for 2 min, 12 °C indefinitely. All eight 50 μL reactions per bin per replicate were mixed after PCR. 100 μL of product per bin per replicate was used for purification using a PureLink PCR Purification kit. Subsequently, 10 ng of the purified PCR product per bin per replicate was appended with Illumina deep sequencing barcodes via PCR using KOD DNA polymerase with the following settings: 95 °C for 2 min, 9 cycles of (95 °C for 25 s, 56 °C for 15 s, 68 °C for 20 s), 68 °C for 2 min, 12 °C indefinitely. Barcoded products were mixed and sequenced with a MiSeq PE300 v3 flow cell (Illumina).
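The template amounts above imply eight first-round reactions per bin per replicate (4 μg split into portions of at most 500 ng). The short sketch below reproduces that arithmetic; the function and its parameter names are hypothetical helpers, not part of the published protocol.

```python
import math

def plan_first_round_pcr(total_template_ng, max_template_ng_per_rxn=500, rxn_volume_ul=50):
    """Split a genomic DNA template across reactions of limited template load."""
    n_reactions = math.ceil(total_template_ng / max_template_ng_per_rxn)
    return {
        "reactions": n_reactions,
        "template_ng_per_reaction": total_template_ng / n_reactions,
        "pooled_volume_ul": n_reactions * rxn_volume_ul,
    }

# 4 μg of genomic DNA per bin per replicate, as stated in the text.
plan = plan_first_round_pcr(total_template_ng=4000)
print(plan)
```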
Analysis of deep sequencing data
Forward and reverse reads were merged via PEAR72. Using custom Python code, the merged reads were translated and matched to the corresponding mutant. Counts for expression and fusion bins for each replicate were tabulated. For each replicate, the frequency of each mutant was calculated as the count of that mutant divided by the total number of counts in that bin, as shown in Eq. (2):
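$$F_{\mathrm{mut},\,\mathrm{bin}} = \frac{\mathrm{count}_{\mathrm{mut},\,\mathrm{bin}}}{\sum_{m} \mathrm{count}_{m,\,\mathrm{bin}}} \qquad (2)$$
where the sum runs over all mutants $m$ counted in that bin.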
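As a rough illustration of the read-to-mutant matching and frequency calculation, the sketch below translates merged reads, keeps reads with at most one changed codon, and divides each mutant's count by the total count in the bin. It is not the authors' deposited code: the function names, the mutant labels (for example "K986P", "K986*" for nonsense, "V987V" for silent), and the toy reference codons are all assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate in-frame DNA; unknown codons (e.g. containing N) become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

def call_mutant(read_dna, ref_dna, first_residue):
    """Return 'WT', a single-codon mutant label, or None if the read is unusable."""
    read_dna = read_dna.upper()
    if len(read_dna) != len(ref_dna):
        return None
    changed = [i for i in range(0, len(ref_dna), 3) if read_dna[i:i + 3] != ref_dna[i:i + 3]]
    if not changed:
        return "WT"
    if len(changed) > 1:
        return None  # more than one codon changed: not a single mutant
    i = changed[0]
    ref_aa, mut_aa = translate(ref_dna[i:i + 3]), translate(read_dna[i:i + 3])
    return f"{ref_aa}{first_residue + i // 3}{mut_aa}"

def mutant_frequencies(reads, ref_dna, first_residue):
    """Frequency of each mutant = its count / total count in the bin."""
    counts = Counter()
    for read in reads:
        mutant = call_mutant(read, ref_dna, first_residue)
        if mutant is not None:
            counts[mutant] += 1
    total = sum(counts.values())
    return {m: n / total for m, n in counts.items()} if total else {}

# Toy three-codon reference (K-V-E); the codon choices are made up.
ref = "AAAGTGGAA"
reads = ["AAAGTGGAA", "CCTGTGGAA", "CCTGTGGAA", "AAAGTCGAA",
         "TAAGTGGAA", "CCTCCTGAA", "AAAGTGGA"]
freqs = mutant_frequencies(reads, ref, first_residue=986)
print(freqs)
```

The last two toy reads are discarded (a double mutant and a truncated read), so the frequencies are computed over five reads.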
For each replicate, the weighted expression score for each mutant (Wmut) was calculated using Eq. (3):
3 |
The weighted expression scores were normalized such that the average of nonsense mutations equals 0, and the average of silent mutations equals 1 using Eq. (4):
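$$W_{\mathrm{mut}}^{\mathrm{norm}} = \frac{W_{\mathrm{mut}} - \overline{W}_{\mathrm{nonsense}}}{\overline{W}_{\mathrm{silent}} - \overline{W}_{\mathrm{nonsense}}} \qquad (4)$$
where $\overline{W}_{\mathrm{silent}}$ and $\overline{W}_{\mathrm{nonsense}}$ denote the average weighted expression scores of silent and nonsense mutations, respectively.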
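The normalization described for Eq. (4) is a linear rescaling that pins the nonsense average at 0 and the silent average at 1. A minimal sketch follows, assuming scores are held in a dictionary keyed by mutant label and that each mutant has already been classified. The function name, the labels, and the toy values are illustrative only.

```python
import numpy as np

def normalize_scores(scores, mutant_class):
    """Rescale scores so that mean(nonsense) = 0 and mean(silent) = 1.

    scores:       dict mapping mutant label -> raw score
    mutant_class: dict mapping mutant label -> 'silent', 'nonsense' or 'missense'
    """
    silent_mean = np.mean([s for m, s in scores.items() if mutant_class[m] == "silent"])
    nonsense_mean = np.mean([s for m, s in scores.items() if mutant_class[m] == "nonsense"])
    span = silent_mean - nonsense_mean
    return {m: (s - nonsense_mean) / span for m, s in scores.items()}

# Toy raw scores (made-up numbers).
raw = {"K986K": 0.80, "V987V": 0.70, "K986*": 0.30, "V987*": 0.20, "K986P": 0.90}
classes = {"K986K": "silent", "V987V": "silent",
           "K986*": "nonsense", "V987*": "nonsense", "K986P": "missense"}
normalized = normalize_scores(raw, classes)
print(normalized)
```

The same rescaling applies unchanged to the fusion scores, since only the silent and nonsense averages enter the formula.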
The final expression score for each mutant was calculated using Eq. (5):
5 |
Fusion scores (Umut) were calculated for each replicate by the formula shown in Eq. (6):
6 |
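Fusion scores were normalized such that the average of silent mutations equals 1, and the average of nonsense mutations equals 0 using Eq. (7):
$$U_{\mathrm{mut}}^{\mathrm{norm}} = \frac{U_{\mathrm{mut}} - \overline{U}_{\mathrm{nonsense}}}{\overline{U}_{\mathrm{silent}} - \overline{U}_{\mathrm{nonsense}}} \qquad (7)$$
where $\overline{U}_{\mathrm{silent}}$ and $\overline{U}_{\mathrm{nonsense}}$ denote the average fusion scores of silent and nonsense mutations, respectively.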
Then, the final average score for each mutant was calculated using Eq. (8):
8 |
Adjusted fusion score of each mutant is equal to the residual of that mutant in a linear regression model of fusion score against expression score. The linear regression model and residuals were calculated using the ‘lm’ and ‘resid’ functions in RStudio v2022.12.0+353.
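The adjusted fusion score is the part of the fusion score that expression does not explain linearly. The analysis itself used R's 'lm' and 'resid'; the snippet below is a Python analogue using an ordinary least-squares line from NumPy, offered as a sketch with made-up scores rather than the authors' code.

```python
import numpy as np

def adjusted_fusion_scores(expression, fusion):
    """Residuals of a least-squares fit of fusion score against expression score."""
    x = np.asarray(expression, dtype=float)
    y = np.asarray(fusion, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # fusion ~ intercept + slope * expression
    return y - (slope * x + intercept)

# Toy scores for four mutants (hypothetical values).
expression = [0.0, 0.5, 1.0, 1.2]
fusion = [0.0, 0.6, 1.0, 0.4]
adjusted = adjusted_fusion_scores(expression, fusion)
print(adjusted)
```

A negative residual marks a mutant that fuses less than its expression level would predict, which is the pattern of interest for prefusion stabilization.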
Sequence conservation analysis
Sequences were obtained from GenBank or GISAID (Tables S1, S7). A BLAST database was created, and the reference sequence of the DMS region (residues 883-1034) was used to run tblastn to generate BlastXML files. The reference sequence used was the founder strain of SARS-CoV-2 (GenBank accession number: MN908947.3)68. Extracted information was obtained by running ‘XML_Extraction.py’73. Multiple alignment using MAFFT was then performed74. Sequence conservation was calculated based on the residue conservation at each position relative to the reference sequence. Mean expression score and mean fusion score were calculated by taking the average of the expression scores and fusion scores of all mutants, respectively, at that position.
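The per-position quantities above can be sketched as follows. Conservation is taken here as the fraction of aligned sequences that match the reference residue at each position, with gaps counted as mismatches; that treatment of gaps, the function names, and the mutant-label format (such as "K986P") are assumptions of this sketch.

```python
import re
from collections import defaultdict

def conservation_per_position(reference, aligned_sequences):
    """Fraction of aligned sequences matching the reference at each column.

    Assumes every sequence has the same length as the (ungapped) reference.
    """
    n = len(aligned_sequences)
    return [sum(seq[i] == ref_aa for seq in aligned_sequences) / n
            for i, ref_aa in enumerate(reference)]

def mean_score_by_position(scores):
    """Average the scores of all mutants that share a residue position."""
    by_position = defaultdict(list)
    for mutant, score in scores.items():
        position = int(re.fullmatch(r"[A-Z*](\d+)[A-Z*]", mutant).group(1))
        by_position[position].append(score)
    return {pos: sum(vals) / len(vals) for pos, vals in by_position.items()}

# Toy alignment and scores (made-up data).
conservation = conservation_per_position("KVD", ["KVD", "KVE", "RVD", "KV-"])
mean_scores = mean_score_by_position({"K986P": 1.2, "K986A": 0.8, "V987P": 1.0})
print(conservation, mean_scores)
```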
Fluorescence microscopy
Images were captured with an ECHO Revolve epifluorescence microscope (ECHO) with a UPLANFL N 10×/0.30 NA objective (Olympus) using the FITC channel for mNG2 fluorescence. Brightfield images were also obtained using white light. Fluorescent and brightfield images were then overlaid. Identical exposure and intensity settings were used to capture images. Scale bars correspond to 100 μm for all micrographs.
Cryogenic electron microscopy
To prepare the cryoEM grid, an aliquot of 3.5 µL purified protein at ~1 mg/mL concentration was applied to a 300-mesh Quantifoil R1.2/1.3 Cu grid pre-treated with glow-discharge, blotted in a Vitrobot Mark IV machine (force -5, time 3 s), and plunge-frozen in liquid ethane. The grid was loaded in a Titan Krios microscope equipped with Gatan BioQuantum K3 imaging filter and camera. A 10-eV slit was used for the filter. Data collection was done with SerialEM v4.0 (ref. 75). Images were recorded at 130,000× magnification, corresponding to a pixel size of 0.33 Å/pix at super-resolution mode of the camera. A defocus range of -0.8 μm to -1.5 μm was set. A total dose of 50 e⁻/Å² of each exposure was fractionated into 50 frames. The first two frames of the movie stacks were not included in motion-correction. CryoEM data processing was performed on the fly with cryoSPARC Live v3.3.2 (Structura Biotechnology)76 following regular single-particle procedures. Statistics are provided in Table S8. Structures were visualized using UCSF ChimeraX v1.5 (UCSF).
Rosetta-based mutagenesis
The structure of S was obtained from the Protein Data Bank (PDB ID: 6ZGE). N-acetyl-D-glucosamine and water molecules were removed using PyMOL v2.4.0 (Schrödinger), and amino acids were renumbered using pdb-tools77. The ‘fixbb’ application in Rosetta v3.11 (RosettaCommons) was used to generate the D994Q mutation in all protomers. One-hundred poses were obtained, and the lowest scoring pose was used for further processing. A constraint file was generated using the lowest-scoring pose from fixed backbone mutagenesis as input, and the ‘minimize_with_cst’ application in Rosetta. Fast relax was subsequently performed using the ‘relax’ application78 with the constraint file. The lowest scoring pose out of thirty was used for structural analysis.
Antibody expression and purification
Codon-optimized oligonucleotides encoding the heavy chain and light chain of the indicated antibodies were cloned into phCMV3 plasmids in an IgG1 Fc format with a mouse immunoglobulin kappa signal peptide. Plasmids encoding the heavy chain and light chain of antibodies were transfected into Expi293F cells using an Expifectamine 293 transfection kit (Gibco) in a 2:1 mass ratio following the manufacturer’s protocol. Supernatant was harvested 6 days post-transfection and centrifuged at 4000 × g for 30 min at 4 °C to remove cells and debris. The supernatant was subsequently clarified using a polyethersulfone membrane filter with a 0.22 μm pore size (Millipore).
CaptureSelect CH1-XL beads (Thermo Scientific) were washed with MilliQ H2O thrice, and resuspended in 1× PBS. The clarified supernatant was incubated with washed beads overnight at 4 °C with gentle rocking. Then, flowthrough was collected, and beads washed once with 1× PBS. Beads were incubated in 60 mM sodium acetate, pH 3.7 for 10 min at 4 °C. The eluate containing antibody was buffer-exchanged into 1× PBS using a centrifugal filter unit with a 30 kDa molecular weight cut-off (Millipore) four times. Antibodies were stored at 4 °C.
Soluble S protein expression and purification
SARS-CoV-2 S ectodomain (residues 1-1213, which includes the native signal peptide) with the PRRA motif in the furin cleavage site deleted, C-terminal SGGGG linker, biotinylation site, thrombin cleavage site, Foldon trimerization sequence, and 6×His-tag were all cloned in-frame into a phCMV3 vector via PCR. Site-directed mutagenesis via PCR was performed to generate the indicated mutants of soluble S protein.
Expi293F cells were transfected with vectors encoding the indicated soluble spike protein mutant using an Expifectamine 293 transfection kit following the manufacturer’s protocol. Cells were harvested six days post-transfection. The supernatant was collected via centrifugation at 4000 × g for 30 min at 4 °C, and further clarified using a polyethersulfone membrane with a 0.22 μm pore size (Millipore). The clarified supernatant was incubated with washed Ni sepharose excel His-tagged protein purification resin (Cytiva) with gentle rocking overnight at 4 °C. Flow-through was collected. Beads were washed once with 20 mM imidazole in 1× PBS, then washed once with 40 mM imidazole in 1× PBS, and finally eluted with 300 mM imidazole in 1× PBS thrice. Wash and elution fractions were subjected to denaturing sodium dodecyl sulfate-polyacrylamide gel electrophoresis (Fig. S7a). All elution fractions were combined and concentrated using a centrifugal filter unit with a 30 kDa molecular weight cut-off (Millipore) via centrifugation at 4000 × g and 4 °C for 15 min. The concentrated protein mixture was passed through a Superdex 200 XK 16/100 column in 20 mM Tris-HCl pH 8.0 and 150 mM NaCl for size-exclusion chromatography (Fig. S7b, c). Data was collected using ChromLab software v6.1 (Bio-Rad). Fractions corresponding to ~540 kDa were pooled and concentrated using a centrifugal filter unit with a 30 kDa molecular weight cut-off (Millipore) via centrifugation at 4000 × g and 4 °C for 15 min.
Differential scanning fluorimetry
200 ng/μL of purified S protein and 5× SYPRO orange (Thermo Fisher Scientific) were added into 20 mM Tris-HCl pH 8.0, 150 mM NaCl in optically clear tubes. SYPRO orange fluorescence intensity in relative fluorescence units (RFU) was measured over temperatures ranging from 10 °C to 95 °C using a CFX Connect Real-Time PCR Detection System (Bio-Rad). Melting temperature (Tm) was calculated as the temperature at which the first derivative of fluorescence intensity with respect to temperature was minimum.
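A derivative-based Tm estimate can be sketched as below. The sign convention is an assumption: the sketch minimizes −d(RFU)/dT, the form in which melt-curve derivatives are commonly plotted, because the text does not spell out the sign. The function name and the simulated unfolding curve are likewise illustrative.

```python
import numpy as np

def melting_temperature(temps_c, rfu):
    """Temperature at which -d(RFU)/dT is at its minimum (assumed sign convention)."""
    temps_c = np.asarray(temps_c, dtype=float)
    rfu = np.asarray(rfu, dtype=float)
    neg_derivative = -np.gradient(rfu, temps_c)  # numerical first derivative, negated
    return temps_c[np.argmin(neg_derivative)]

# Simulated sigmoidal unfolding transition centred at 62 °C (made-up data).
temps = np.arange(10.0, 95.5, 0.5)
signal = 1000.0 / (1.0 + np.exp(-(temps - 62.0) / 2.0))
tm = melting_temperature(temps, signal)
print(tm)
```

Real traces are noisier than this simulated curve, so smoothing before differentiation may be needed in practice.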
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We thank the Roy J. Carver Biotechnology Center at the University of Illinois at Urbana-Champaign for assistance with fluorescence-activated cell sorting and deep sequencing. We thank the cryogenic-electron microscopy core facility at the Case Western Reserve University School of Medicine. This work was supported by National Institutes of Health (NIH) R01 AI167910 (N.C.W.), DP2 AT011966 (N.C.W.), R35 GM142886 (K.A.M.), the Michelson Prizes for Human Immunology and Vaccine Research (N.C.W.), the Searle Scholars Program (N.C.W.), and the Bill and Melinda Gates Foundation INV-004923 (I.A.W.).
Author contributions
T.J.C.T. and N.C.W. conceived and designed the study. T.J.C.T. established the fusion assay and performed the deep mutational scanning experiments. T.J.C.T. and N.C.W. analyzed the deep mutational scanning data. T.J.C.T., R.L. and W.O.O. expressed and purified recombinant proteins. Z.M. and X.D. performed cryo-EM analysis. K.A.M. provided the landing pad cells and helped establish the fusion assay. M.Y. and I.A.W. provided the CC12.3 antibody; G.S. and R.A. provided the CC40.8 antibody. T.J.C.T. and C.K. performed the microscopy analysis. T.J.C.T. and N.C.W. wrote the paper and all authors reviewed and/or edited the paper.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Structures from the following identifiers from the Protein Data Bank (PDB) were used in this study: 6VXX and 6VYB. The cryoEM map of 2PQ spike can be accessed at the Electron Microscopy Data Bank (EMDB) using accession code EMD-29374. Raw deep sequencing data generated in this study have been submitted to the NIH Sequence Read Archive under accession number: PRJNA826665. Source data are provided with this paper.
Code availability
Custom codes to analyze deep mutational scanning, thermal stability, and flow cytometry data have been deposited to 10.5281/zenodo.7742830 (ref. 79).
Competing interests
N.C.W., K.A.M. and T.J.C.T. have filed a provisional patent application (IL0042US.L) with the University of Illinois covering the deep mutational scanning-based method to identify prefusion-stabilizing mutations for vaccine design and SARS-CoV-2 spike with the K986P/V987P/D994Q mutations described in this article. N.C.W. serves as a consultant for HeliXon. The remaining authors declare no other competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-37786-1.
References
- 1. Cai Y, et al. Distinct conformational states of SARS-CoV-2 spike protein. Science. 2020;369:1586–1592. doi: 10.1126/science.abd4251.
- 2. Zhou P, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7.
- 3. Letko M, Marzi A, Munster V. Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nat. Microbiol. 2020;5:562–569. doi: 10.1038/s41564-020-0688-y.
- 4. Lan J, et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020;581:215–220. doi: 10.1038/s41586-020-2180-5.
- 5. Chiliveri SC, Louis JM, Ghirlando R, Bax A. Transient lipid-bound states of spike protein heptad repeats provide insights into SARS-CoV-2 membrane fusion. Sci. Adv. 2021;7:eabk2226. doi: 10.1126/sciadv.abk2226.
- 6. Marcink TC, et al. Intermediates in SARS-CoV-2 spike–mediated cell entry. Sci. Adv. 2022;8:eabo3153. doi: 10.1126/sciadv.abo3153.
- 7. Dodero-Rojas E, Onuchic JN, Whitford PC. Sterically confined rearrangements of SARS-CoV-2 Spike protein control cell invasion. eLife. 2021;10:e70362. doi: 10.7554/eLife.70362.
- 8. Walls AC, et al. Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion. Proc. Natl. Acad. Sci. USA. 2017;114:11157–11162. doi: 10.1073/pnas.1708727114.
- 9. Gorgun D, Lihan M, Kapoor K, Tajkhorshid E. Binding mode of SARS-CoV-2 fusion peptide to human cellular membrane. Biophys. J. 2021;120:2914–2926. doi: 10.1016/j.bpj.2021.02.041.
- 10. Koppisetti RK, Fulcher YG, Van Doren SR. Fusion peptide of SARS-CoV-2 spike rearranges into a wedge inserted in bilayered micelles. J. Am. Chem. Soc. 2021;143:13205–13211. doi: 10.1021/jacs.1c05435.
- 11. Cerutti G, et al. Potent SARS-CoV-2 neutralizing antibodies directed against spike N-terminal domain target a single supersite. Cell Host Microbe. 2021;29:819–833.e7. doi: 10.1016/j.chom.2021.03.005.
- 12. Qing E, et al. Inter-domain communication in SARS-CoV-2 spike proteins controls protease-triggered cell entry. Cell Rep. 2022;39:110786. doi: 10.1016/j.celrep.2022.110786.
- 13. Barnes CO, et al. SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies. Nature. 2020;588:682–687. doi: 10.1038/s41586-020-2852-1.
- 14. Liu L, et al. Potent neutralizing antibodies against multiple epitopes on SARS-CoV-2 spike. Nature. 2020;584:450–456. doi: 10.1038/s41586-020-2571-7.
- 15. Yuan M, Liu H, Wu NC, Wilson IA. Recognition of the SARS-CoV-2 receptor binding domain by neutralizing antibodies. Biochem. Biophys. Res. Commun. 2021;538:192–203. doi: 10.1016/j.bbrc.2020.10.012.
- 16. Bowen JE, et al. SARS-CoV-2 spike conformation determines plasma neutralizing activity. bioRxiv. 2021. doi: 10.1101/2021.12.19.473391.
- 17. Pallesen J, et al. Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen. Proc. Natl. Acad. Sci. USA. 2017;114:E7348–E7357. doi: 10.1073/pnas.1707304114.
- 18. Yuan Y, et al. Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains. Nat. Commun. 2017;8:15092. doi: 10.1038/ncomms15092.
- 19. Kirchdoerfer RN, et al. Pre-fusion structure of a human coronavirus spike protein. Nature. 2016;531:118–121. doi: 10.1038/nature17200.
- 20. Gilbert PB, et al. Immune correlates analysis of the mRNA-1273 COVID-19 vaccine efficacy clinical trial. Science. 2022;375:43–50. doi: 10.1126/science.abm3425.
- 21. Skowronski DM, De Serres G. Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine. New Engl. J. Med. 2021;384:1576–1577. doi: 10.1056/NEJMc2036242.
- 22. Sadoff J, et al. Safety and efficacy of single-dose Ad26.COV2.S vaccine against Covid-19. New Engl. J. Med. 2021;384:2187–2201. doi: 10.1056/NEJMoa2101544.
- 23. Heath PT, et al. Safety and efficacy of NVX-CoV2373 Covid-19 vaccine. New Engl. J. Med. 2021;385:1172–1183. doi: 10.1056/NEJMoa2107659.
- 24. Hsieh C-L, et al. Structure-based design of prefusion-stabilized SARS-CoV-2 spikes. Science. 2020;369:1501–1505. doi: 10.1126/science.abd0826.
- 25. Juraszek J, et al. Stabilizing the closed SARS-CoV-2 spike trimer. Nat. Commun. 2021;12:244. doi: 10.1038/s41467-020-20321-x.
- 26. Riley TP, et al. Enhancing the prefusion conformational stability of SARS-CoV-2 spike protein through structure-guided design. Front. Immunol. 2021;12:660198. doi: 10.3389/fimmu.2021.660198.
- 27. Olmedillas E, et al. Structure-based design of a highly stable, covalently-linked SARS-CoV-2 spike trimer with improved structural properties and immunogenicity. bioRxiv. 2021. doi: 10.1101/2021.05.06.441046.
- 28. Sanders RW, Moore JP. Virus vaccines: proteins prefer prolines. Cell Host Microbe. 2021;29:327–333. doi: 10.1016/j.chom.2021.02.002.
- 29. Matreyek KA, et al. Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 2018;50:874–882. doi: 10.1038/s41588-018-0122-z.
- 30. Suiter CC, et al. Massively parallel variant characterization identifies NUDT15 alleles associated with thiopurine toxicity. Proc. Natl. Acad. Sci. USA. 2020;117:5394–5401. doi: 10.1073/pnas.1915680117.
- 31. Chiasson MA, et al. Multiplexed measurement of variant abundance and activity reveals VKOR topology, active site and human variant impact. eLife. 2020;9:e58026. doi: 10.7554/eLife.58026.
- 32. Matreyek KA, Stephany JJ, Chiasson MA, Hasle N, Fowler DM. An improved platform for functional assessment of large protein libraries in mammalian cells. Nucleic Acids Res. 2020;48:e1. doi: 10.1093/nar/gkz910.
- 33. Matreyek KA, Stephany JJ, Fowler DM. A platform for functional assessment of large variant libraries in mammalian cells. Nucleic Acids Res. 2017;45:e102. doi: 10.1093/nar/gkx183.
- 34. Ouyang WO, et al. Probing the biophysical constraints of SARS-CoV-2 spike N-terminal domain using deep mutational scanning. bioRxiv. 2022. doi: 10.1101/2022.06.20.496903.
- 35. Kondo N, Miyauchi K, Meng F, Iwamoto A, Matsuda Z. Conformational changes of the HIV-1 envelope protein during membrane fusion are inhibited by the replacement of its membrane-spanning domain. J. Biol. Chem. 2010;285:14681–14688. doi: 10.1074/jbc.M109.067090.
- 36. Baviskar PS, Hotard AL, Moore ML, Oomens AG. The respiratory syncytial virus fusion protein targets to the perimeter of inclusion bodies and facilitates filament formation by a cytoplasmic tail-dependent mechanism. J. Virol. 2013;87:10730–10741. doi: 10.1128/JVI.03086-12.
- 37. Atanasiu D, et al. Dual split protein-based fusion assay reveals that mutations to herpes simplex virus (HSV) glycoprotein gB alter the kinetics of cell-cell fusion induced by HSV entry glycoproteins. J. Virol. 2013;87:11332–11345. doi: 10.1128/JVI.01700-13.
- 38. Meng B, et al. SARS-CoV-2 Spike N-Terminal Domain modulates TMPRSS2-dependent viral entry and fusogenicity. Cell Rep. 2022;40:111220. doi: 10.1016/j.celrep.2022.111220.
- 39. Meng B, et al. Altered TMPRSS2 usage by SARS-CoV-2 Omicron impacts infectivity and fusogenicity. Nature. 2022;603:706–714. doi: 10.1038/s41586-022-04474-x.
- 40. Bradel-Tretheway BG, et al. Nipah and Hendra virus glycoproteins induce comparable homologous but distinct heterologous fusion phenotypes. J. Virol. 2019;93:e00577–19. doi: 10.1128/JVI.00577-19.
- 41. Cabantous S, Terwilliger TC, Waldo GS. Protein tagging and detection with engineered self-assembling fragments of green fluorescent protein. Nat. Biotechnol. 2005;23:102–107. doi: 10.1038/nbt1044.
- 42. Fowler DM, Fields S. Deep mutational scanning: a new style of protein science. Nat. Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027.
- 43. Feng S, et al. Improved split fluorescent proteins for endogenous protein labeling. Nat. Commun. 2017;8:370. doi: 10.1038/s41467-017-00494-8.
- 44. Yuan M, et al. Structural basis of a shared antibody response to SARS-CoV-2. Science. 2020;369:1119–1123. doi: 10.1126/science.abd2321.
- 45. Zhou P, et al. A human antibody reveals a conserved site on beta-coronavirus spike proteins and confers protection against SARS-CoV-2 infection. Sci. Transl. Med. 2022;14:eabi9215. doi: 10.1126/scitranslmed.abi9215.
- 46. Krammer F. SARS-CoV-2 vaccines in development. Nature. 2020;586:516–527. doi: 10.1038/s41586-020-2798-3.
- 47. Klasse PJ, Nixon DF, Moore JP. Immunogenicity of clinically relevant SARS-CoV-2 vaccines in nonhuman primates and humans. Sci. Adv. 2021;7:eabe8065. doi: 10.1126/sciadv.abe8065.
- 48. McCallum M, et al. N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2. Cell. 2021;184:2332–2347.e16. doi: 10.1016/j.cell.2021.03.028.
- 49. Claireaux M, et al. A public antibody class recognizes an S2 epitope exposed on open conformations of SARS-CoV-2 spike. Nat. Commun. 2022;13:4539. doi: 10.1038/s41467-022-32232-0.
- 50. Henderson R, et al. Controlling the SARS-CoV-2 spike glycoprotein conformation. Nat. Struct. Mol. Biol. 2020;27:925–933. doi: 10.1038/s41594-020-0479-4.
- 51. Binley JM, et al. A recombinant human immunodeficiency virus type 1 envelope glycoprotein complex stabilized by an intermolecular disulfide bond between the gp120 and gp41 subunits is an antigenic mimic of the trimeric virion-associated structure. J. Virol. 2000;74:627–643. doi: 10.1128/JVI.74.2.627-643.2000.
- 52. Sanders RW, et al. Stabilization of the soluble, cleaved, trimeric form of the envelope glycoprotein complex of human immunodeficiency virus type 1. J. Virol. 2002;76:8875–8889. doi: 10.1128/JVI.76.17.8875-8889.2002.
- 53. Kong L, et al. Uncleaved prefusion-optimized gp140 trimers derived from analysis of HIV-1 envelope metastability. Nat. Commun. 2016;7:12040. doi: 10.1038/ncomms12040.
- 54. Sanders RW, et al. A next-generation cleaved, soluble HIV-1 Env Trimer, BG505 SOSIP.664 gp140, expresses multiple epitopes for broadly neutralizing but not non-neutralizing antibodies. PLoS Path. 2013;9:e1003618. doi: 10.1371/journal.ppat.1003618.
- 55. McLellan JS, et al. Structure of RSV fusion glycoprotein trimer bound to a prefusion-specific neutralizing antibody. Science. 2013;340:1113–1117. doi: 10.1126/science.1234914.
- 56. Loomis RJ, et al. Structure-based design of Nipah virus vaccines: a generalizable approach to paramyxovirus immunogen development. Front. Immunol. 2020;11:842. doi: 10.3389/fimmu.2020.00842.
- 57. Hastie KM, et al. Structural basis for antibody-mediated neutralization of Lassa virus. Science. 2017;356:923–928. doi: 10.1126/science.aam7260.
- 58. Rutten L, et al. Structure-based design of prefusion-stabilized filovirus glycoprotein trimers. Cell Rep. 2020;30:4540–4550.e3. doi: 10.1016/j.celrep.2020.03.025.
- 59. Caradonna TM, Schmidt AG. Protein engineering strategies for rational immunogen design. npj Vaccines. 2021;6:154. doi: 10.1038/s41541-021-00417-1.
- 60. Heinz FX, Stiasny K. Distinguishing features of current COVID-19 vaccines: knowns and unknowns of antigen presentation and modes of action. npj Vaccines. 2021;6:104. doi: 10.1038/s41541-021-00369-6.
- 61. Premkumar L, et al. The receptor-binding domain of the viral spike protein is an immunodominant and highly specific target of antibodies in SARS-CoV-2 patients. Sci. Immunol. 2020;5:eabc8413. doi: 10.1126/sciimmunol.abc8413.
- 62. Wang C, et al. A conserved immunogenic and vulnerable site on the coronavirus spike protein delineated by cross-reactive monoclonal antibodies. Nat. Commun. 2021;12:1715. doi: 10.1038/s41467-021-21968-w.
- 63. Pinto D, et al. Broad betacoronavirus neutralization by a stem helix-specific human antibody. Science. 2021;373:1109–1116. doi: 10.1126/science.abj3321.
- 64. Hurlburt NK, et al. Structural definition of a pan-sarbecovirus neutralizing epitope on the spike S2 subunit. Commun. Biol. 2022;5:342. doi: 10.1038/s42003-022-03262-7.
- 65. Zhou P, et al. Broadly neutralizing anti-S2 antibodies protect against all three human betacoronaviruses that cause severe disease. Immunity. 2023;56:669–686.
- 66. Kirchdoerfer RN, et al. Stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis. Sci. Rep. 2018;8:15701. doi: 10.1038/s41598-018-34171-7.
- 67. Pardi N, Hogan MJ, Porter FW, Weissman D. mRNA vaccines—a new era in vaccinology. Nat. Rev. Drug Discov. 2018;17:261–279. doi: 10.1038/nrd.2017.243.
- 68. Wu F, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–269. doi: 10.1038/s41586-020-2008-3.
- 69. Hoffmann M, Kleine-Weber H, Pöhlmann S. A multibasic cleavage site in the spike protein of SARS-CoV-2 is essential for infection of human lung cells. Mol. Cell. 2020;78:779–784.e5. doi: 10.1016/j.molcel.2020.04.022.
- 70. Olson CA, Wu NC, Sun R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 2014;24:2643–2651. doi: 10.1016/j.cub.2014.09.072.
- 71. Yan R, et al. Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2. Science. 2020;367:1444–1448. doi: 10.1126/science.abb2762.
- 72. Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics. 2014;30:614–620. doi: 10.1093/bioinformatics/btt593.
- 73. Ream D, Kiss AJ. NCBI/GenBank BLAST Output XML Parser Tool. 2013. https://www.semanticscholar.org/paper/NCBI-GenBank-BLAST-Output-XML-Parser-Tool-Ream-Kiss/3ead0ae31b91d3096369de11f3488024f752bdc5
- 74. Katoh K, Misawa K, Kuma KI, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–3066. doi: 10.1093/nar/gkf436.
- 75. Mastronarde DN. Automated electron microscope tomography using robust prediction of specimen movements. J. Struct. Biol. 2005;152:36–51. doi: 10.1016/j.jsb.2005.07.007.
- 76. Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods. 2017;14:290–296. doi: 10.1038/nmeth.4169.
- 77. Rodrigues JP, Teixeira JM, Trellet M, Bonvin AM. pdb-tools: a swiss army knife for molecular structures. F1000Res. 2018;7:1961. doi: 10.12688/f1000research.17456.1.
- 78. Conway P, Tyka MD, DiMaio F, Konerding DE, Baker D. Relaxation of backbone bond geometry improves protein energy landscape modeling. Protein Sci. 2014;23:47–55. doi: 10.1002/pro.2389.
- 79. Tan TJC, et al. High-throughput identification of prefusion-stabilizing mutations in SARS-CoV-2 spike (this paper). SARS2_S_fusogenicity_DMS-main. 2023. doi: 10.5281/zenodo.7742830.