Cell. 2020 Sep 3;182(5):1295–1310.e20. doi: 10.1016/j.cell.2020.08.012

Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding

Tyler N Starr 1,8, Allison J Greaney 1,2,3,8, Sarah K Hilton 1,2, Daniel Ellis 4,5,6, Katharine HD Crawford 1,2,3, Adam S Dingens 1, Mary Jane Navarro 5, John E Bowen 5, M Alejandra Tortorici 5, Alexandra C Walls 5, Neil P King 4,5, David Veesler 5, Jesse D Bloom 1,2,7,9
PMCID: PMC7418704  PMID: 32841599

Summary

The receptor binding domain (RBD) of the SARS-CoV-2 spike glycoprotein mediates viral attachment to ACE2 receptor and is a major determinant of host range and a dominant target of neutralizing antibodies. Here, we experimentally measure how all amino acid mutations to the RBD affect expression of folded protein and its affinity for ACE2. Most mutations are deleterious for RBD expression and ACE2 binding, and we identify constrained regions on the RBD’s surface that may be desirable targets for vaccines and antibody-based therapeutics. But a substantial number of mutations are well tolerated or even enhance ACE2 binding, including at ACE2 interface residues that vary across SARS-related coronaviruses. However, we find no evidence that these ACE2-affinity-enhancing mutations have been selected in current SARS-CoV-2 pandemic isolates. We present an interactive visualization and open analysis pipeline to facilitate use of our dataset for vaccine design and functional annotation of mutations observed during viral surveillance.

Keywords: deep mutational scanning, SARS-CoV-2, receptor-binding domain, ACE2


Highlights

  • Measured effects on folding and ACE2 binding of all mutations to the SARS-CoV-2 RBD

  • Provide open data and interactive visualization for vaccine design and surveillance

  • Identify constrained surfaces as ideal targets for vaccines and antibody therapeutics

  • Mutations that enhance ACE2 affinity exist but are not selected in pandemic isolates


Starr et al. systematically change every amino acid in the receptor binding domain (RBD) of the SARS-CoV-2 spike protein and determine the effects of the substitutions on RBD expression, folding, and ACE2 binding. The work identifies structurally constrained regions of the spike RBD that would be ideal targets for COVID-19 countermeasures and demonstrates that mutations in the virus that enhance ACE2 affinity can be engineered but have not, to date, been naturally selected during the pandemic.

Introduction

The SARS-related (sarbecovirus) subgenus of betacoronaviruses comprises a diverse lineage of viruses that circulate in bat reservoirs and spill over into other mammalian species (Figure 1A; Bolles et al., 2011; Cui et al., 2019). Sarbecoviruses initiate infection by binding to receptors on host cells via the viral spike protein. The entry receptor for both SARS-CoV-2 and the original SARS-CoV (which we refer to here as SARS-CoV-1) is the human cell-surface protein angiotensin converting enzyme 2 (ACE2). The receptor binding domain (RBD) of spike from both these viruses binds ACE2 with high affinity (Hoffmann et al., 2020; Letko et al., 2020; Li et al., 2003; Walls et al., 2020; Wrapp et al., 2020a). Because of its role in viral entry, the RBD is a major determinant of cross-species transmission and evolution (Becker et al., 2008; Frieman et al., 2012; Letko et al., 2020; Li, 2008; Li et al., 2005b; Qu et al., 2005; Ren et al., 2008; Sheahan et al., 2008a, 2008b; Wu et al., 2012). In addition, the RBD is the target of the most potent anti-SARS-CoV-2-neutralizing antibodies identified to date (Cao et al., 2020; Ju et al., 2020; Pinto et al., 2020; Rogers et al., 2020; Seydoux et al., 2020; Shi et al., 2020; Wu et al., 2020b; Zost et al., 2020), and several promising vaccine candidates use the RBD as the sole antigen (Chen et al., 2020a, 2020b; Mulligan et al., 2020; Quinlan et al., 2020; Ravichandran et al., 2020; Yang et al., 2020; Zang et al., 2020).

Figure 1.


Yeast Display of RBDs from SARS-CoV-2 and Related Sarbecoviruses

(A) Maximum likelihood phylogeny of sarbecovirus RBDs. RBDs included in the present study are in bold colored text. Node labels indicate bootstrap support.

(B) RBD yeast-surface display enables fluorescent detection of RBD expression and ACE2 binding.

(C) Yeast displaying the indicated RBD were incubated with varying concentrations of human ACE2, and binding was measured via flow cytometry. Binding constants are reported as KD,app from the illustrated titration curve fits.

(D) Comparison of yeast-display binding with previous measurements of the capacity of viral particles to enter ACE2-expressing cells. Relative binding is Δlog10(KD,app) measured in the current study; relative cellular entry is infection of ACE2-expressing cells by vesicular stomatitis virus (VSV) pseudotyped with spike containing the indicated RBD, reported by Letko et al. (2020) in arbitrary luciferase units relative to SARS-CoV-1 RBD; n.d., not determined.

Despite its important function, the RBD is highly variable among sarbecoviruses (Hu et al., 2017), reflecting the complex selective pressures shaping its evolution (Demogines et al., 2012; Frank et al., 2020; MacLean et al., 2020). Furthermore, RBD mutations have already appeared among SARS-CoV-2 pandemic isolates, including some near the ACE2-binding interface—but their impacts on receptor recognition and other biochemical phenotypes remain largely uncharacterized. Therefore, comprehensive knowledge of how mutations impact the SARS-CoV-2 RBD would aid efforts to understand viral evolution and guide the design of vaccines and other countermeasures.

To address this need, we used a quantitative deep mutational scanning approach (Adams et al., 2016; Fowler and Fields, 2014; Weile and Roth, 2018) to experimentally measure how all possible SARS-CoV-2 RBD amino acid mutations affect ACE2-binding affinity and protein expression (a correlate of protein folding stability). The resulting sequence-phenotype maps illuminate the forces that shape RBD evolution, quantify constraint on antibody epitopes, and suggest that purifying selection is the main force acting on RBD mutations observed in human SARS-CoV-2 isolates to date. To facilitate use of our measurements in immunogen design and viral surveillance, we provide interactive visualizations, an open analysis pipeline, and complete raw and processed data.

Results

Yeast Display of RBDs from SARS-CoV-2 and Related Sarbecoviruses

To enable rapid functional characterization of thousands of RBD variants, we developed a yeast-surface-display platform for measuring expression of folded RBD protein and its binding to ACE2 (Adams et al., 2016; Boder and Wittrup, 1997). This platform enables RBD expression on the cell surface of yeast (Figure 1B), where it can be assayed for ligand-binding affinity or protein expression levels, a close correlate of protein folding efficiency and stability (Kowalski et al., 1998a, 1998b; Shusta et al., 1999). Because yeast have protein-folding quality control and glycosylation machinery similar to mammalian cells, they add N-linked glycans at the same RBD sites as human cells (Chen et al., 2014), although these glycans are more mannose rich than mammalian-derived glycans (Hamilton et al., 2003). The yeast-expressed RBD from SARS-CoV-1 has similar antigenic and structural properties to the RBD expressed in mammalian cells (Chen et al., 2014, 2017, 2020a) and binds to ACE2 as expected (Chen et al., 2014).

To validate the yeast-display platform, we selected RBDs from the Wuhan-Hu-1 SARS-CoV-2 isolate and six related sarbecoviruses (Figure 1A). These other sarbecoviruses include the closest known relatives of SARS-CoV-2 from bats and pangolins (RaTG13 and GD-Pangolin), SARS-CoV-1 (Urbani strain) and a close bat relative (LYRa11), and two more distantly related bat sarbecoviruses (BM48-31 and HKU3-1). Based on prior work, all these RBDs are expected to bind human ACE2 except those from BM48-31 and HKU3-1 (Lam et al., 2020; Letko et al., 2020; Shang et al., 2020). We cloned the RBDs into a vector for yeast display, induced RBD expression, and incubated with varying concentrations of fluorescently labeled human ACE2 (Figure 1B). We then used flow cytometry to measure ACE2 binding across 11 ACE2 concentrations, enabling the calculation of a dissociation constant for the binding of each RBD to ACE2 (Figure 1C). Because we used ACE2 in its native dimeric form (Yan et al., 2020), we refer to the measured constants as apparent dissociation constants (KD,app), which are affected by binding avidity. We report log binding constants Δlog10(KD,app) relative to the wild-type SARS-CoV-2 RBD, polarized such that a positive value reflects stronger binding (Figure 1D).
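The titration analysis described above can be illustrated with a minimal sketch. It assumes a simple one-site binding curve, signal = a + b·c/(c + KD,app), and finds KD,app by a grid search over log10(KD,app); the function names, the grid range, and the synthetic data are illustrative assumptions, not the authors' fitting code (see the STAR Methods for the actual procedure).

```python
import math

def fit_kd_app(concs, signal):
    """Grid-search KD,app for signal = a + b * c / (c + KD).

    For each candidate KD the baseline a and amplitude b are obtained by
    ordinary least squares; the KD with the smallest squared error wins.
    The search range (1e-14 to 1e-5 M) is an assumption for illustration.
    """
    best_sse, best_kd = None, None
    for step in range(-1400, -499):  # log10(KD) from -14.00 to -5.00, step 0.01
        kd = 10 ** (step / 100)
        x = [c / (c + kd) for c in concs]  # fraction bound at each concentration
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(signal) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            continue
        b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal)) / sxx
        a = mean_y - b * mean_x
        sse = sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, signal))
        if best_sse is None or sse < best_sse:
            best_sse, best_kd = sse, kd
    return best_kd

def delta_log10_kd(kd_variant, kd_wildtype):
    """Delta-log10(KD,app) relative to wild type; positive means stronger binding."""
    return math.log10(kd_wildtype) - math.log10(kd_variant)

# Synthetic, noise-free titration across 11 concentrations (1e-13 to 1e-6 M).
concs = [10 ** (-13 + 0.7 * i) for i in range(11)]
true_kd = 3.9e-11
signal = [1.0 + 3.0 * c / (c + true_kd) for c in concs]
fitted_kd = fit_kd_app(concs, signal)
```

The sign convention in `delta_log10_kd` follows the text: a variant with a larger (weaker) KD,app than wild type receives a negative value.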

All RBDs exhibited ACE2-binding affinities consistent with prior knowledge. We measure KD,app = 3.9 × 10⁻¹¹ M for the SARS-CoV-2 RBD (Figure 1C), which is tighter than affinities reported for monomeric ACE2 (Shang et al., 2020; Walls et al., 2020; Wrapp et al., 2020a) due to avidity effects caused by our use of native dimeric ACE2. Consistent with previous studies (Shang et al., 2020; Walls et al., 2020; Wrapp et al., 2020a), the SARS-CoV-1 RBD binds ACE2 with lower affinity than SARS-CoV-2 (Figures 1C and 1D). The SARS-CoV-1-related bat strain LYRa11 binds with even lower affinity, while the more distant bat RBDs (HKU3-1 and BM48-31) have no detectable binding. These measurements are consistent with the ability of these RBDs to enable viral particles to enter cells expressing human ACE2 (Letko et al., 2020; Figure 1D). Within the newly described SARS-CoV-2 clade, GD-Pangolin binds ACE2 with slightly higher affinity than SARS-CoV-2, while the bat isolate RaTG13 binds with two orders of magnitude lower affinity, consistent with prior reports (Shang et al., 2020; Wrobel et al., 2020). These results validate our yeast-surface-display platform for RBD affinity measurements and map variation in ACE2 affinity within the SARS-CoV-2 clade and the broader sarbecovirus subgenus.

Deep Mutational Scanning of All Amino Acid Mutations to the SARS-CoV-2 RBD

We next integrated the yeast-display platform with deep mutational scanning to determine how all amino acid mutations to the SARS-CoV-2 RBD impact expression and binding affinity for ACE2. We constructed two independent mutant libraries of the RBD using a PCR-based mutagenesis method that introduces all 19 mutant amino acids at each position (Bloom, 2014). To facilitate sequencing and obtain linkage among amino acid mutations within a single variant, we appended 16-nucleotide barcodes downstream of the coding sequence (Hiatt et al., 2010), bottlenecked each library to ∼100,000 barcoded variants, and linked each RBD variant to its barcode via long-read PacBio SMRT sequencing (Matreyek et al., 2018; Figure S1A). By examining the concordance of RBD variant sequences for barcodes sampled by multiple PacBio reads, we validated that this process correctly determined the sequence of >99.8% of the variants (Figure S1B). RBD variants contained an average of 2.7 amino acid mutations, with the number of mutations per variant roughly following a Poisson distribution (Figure S1C). Our libraries covered 3,804 of the 3,819 possible RBD amino acid mutations, of which 95.7% were present as the sole amino acid mutation in at least one barcoded variant (Figures S1D and S1E). To provide internal standards for our measurements, we spiked the mutant libraries with a barcoded panel of 11 unmutated sarbecovirus RBD homologs (strains in color in Figure 1A), including those tested in Figure 1C.
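The library statistics quoted above can be checked with a few lines of arithmetic. The sketch below derives the number of mutated positions implied by 3,819 possible mutations at 19 alternatives per site, the fractional coverage, and the variant composition expected if mutation counts followed an ideal Poisson distribution with mean 2.7. The Poisson figures are an idealization (the text says the distribution is only roughly Poisson), not measured values.

```python
import math

ALTERNATIVE_AMINO_ACIDS = 19   # mutant amino acids per position
TOTAL_POSSIBLE = 3819          # possible RBD amino acid mutations (from the text)
COVERED = 3804                 # mutations present in the libraries (from the text)
MEAN_MUTATIONS = 2.7           # average amino acid mutations per variant

# Number of mutated positions implied by the totals above.
n_sites = TOTAL_POSSIBLE // ALTERNATIVE_AMINO_ACIDS

# Fraction of all possible mutations covered by the libraries.
coverage = COVERED / TOTAL_POSSIBLE

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Idealized share of variants carrying 0..5 amino acid mutations.
expected_fraction = {k: poisson_pmf(k, MEAN_MUTATIONS) for k in range(6)}

print(f"sites: {n_sites}, coverage: {coverage:.2%}")
for k, p in expected_fraction.items():
    print(f"{k} mutations: {p:.1%}")
```

Under this idealization fewer than one variant in five is a single mutant, which is one reason pooling information across multiply mutated variants is useful.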

Figure S1.


SARS-CoV-2 RBD Mutant Libraries, Related to Figure 2

(A) Scheme of the library generation and sequencing approach. SARS-CoV-2 RBD mutant libraries were constructed in fully independent duplicates, and variants were linked to barcodes by long-read PacBio sequencing. (B) PacBio sequencing stats on duplicate SARS-CoV-2 mutant libraries. Comparison of RBD sequences among independent circular consensus sequences (CCSs) of the same barcode enables calculation of an empirical accuracy, which describes the minimal expected accuracy of the barcode:RBD linkage for barcodes with a single CCS (see STAR Methods for details). Most barcodes were represented by multiple CCSs, which further increases the accuracy of barcode:RBD linkage. (C) Statistics on mutation rates in mutant libraries. Top, average number of mutations of different types across variants in each library. Bottom, distribution of number of amino acid mutations per variant. (D, E) Mutation coverage in mutant libraries. Cumulative distribution plots (D) give the fraction of all possible amino acid mutations observed in the indicated number of variants, including all variants (left) or only variants with a single mutation (right). Minimum coverage statistics from these curves are tabulated in (E).

To determine how mutations affect RBD expression and ACE2 binding, we combined fluorescence-activated cell sorting (FACS) with deep sequencing of variant barcodes (Adams et al., 2016; Peterman and Levine, 2016). To measure expression, we fluorescently labeled the RBD protein on the yeast surface via a C-terminal epitope tag and used FACS to collect ∼15 million cells from each library, partitioned into four bins from low to high expression (Figures 2A and S2A). We sequenced the barcodes from each bin and reconstructed each variant’s mean fluorescence intensity (MFI) from its distribution of reads across bins (Figure S2C). We represent expression as Δlog(MFI) relative to the unmutated SARS-CoV-2 RBD, such that a positive Δlog(MFI) indicates increased expression. To measure ACE2-binding affinity, we incubated yeast libraries that had been pre-sorted for RBD expression with 16 concentrations of fluorescently labeled ACE2 (10⁻⁶ to 10⁻¹³ M, and 0 M ACE2) and used FACS to collect >5 million RBD+ yeast cells at each concentration, partitioned into four bins from low to high ACE2 binding (Figures 2B and S2B). We again sequenced the barcodes from each bin, reconstructed the mean ACE2 binding of each variant at each concentration (Figure S2C), and used the resulting titration curves to infer dissociation constants KD,app (Figure S2D), which we represent as Δlog10(KD,app) relative to the unmutated SARS-CoV-2 RBD, with positive values indicating stronger binding.
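As a rough illustration of how a per-variant phenotype can be reconstructed from barcode reads across four sort bins, the sketch below converts reads to estimated cell counts and computes a cell-weighted mean bin. Both the read-to-cell scaling and the simple weighted average are assumptions made for illustration; the authors' estimator is described in the STAR Methods and may differ. All function names are hypothetical.

```python
def estimated_cells(variant_reads, total_reads_per_bin, cells_sorted_per_bin):
    """Scale a variant's reads in each bin to an estimated number of sorted cells.

    Assumes reads in a bin are proportional to the cells collected in that bin.
    """
    return [
        reads / total_reads * cells
        for reads, total_reads, cells in zip(
            variant_reads, total_reads_per_bin, cells_sorted_per_bin
        )
    ]

def mean_bin(cells_per_bin):
    """Cell-weighted mean bin, with bins numbered 1 (low) to N (high)."""
    total = sum(cells_per_bin)
    if total == 0:
        raise ValueError("variant was not observed in any bin")
    return sum(i * c for i, c in enumerate(cells_per_bin, start=1)) / total

# Hypothetical variant: reads in bins 1-4, with sequencing depth and
# sorted-cell counts per bin chosen only to make the arithmetic easy.
cells = estimated_cells(
    variant_reads=[5, 15, 60, 20],
    total_reads_per_bin=[1000, 1000, 1000, 1000],
    cells_sorted_per_bin=[2000, 2000, 2000, 2000],
)
print(mean_bin(cells))
```

For binding, one such value per ACE2 concentration gives a single point on the variant's titration curve.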

Figure 2.


Deep Mutational Scanning of All Amino Acid Mutations to the SARS-CoV-2 RBD

(A and B) FACS approach for deep mutational scans for expression (A) and binding (B). Cells were sorted into four bins from low to high expression or binding, with separate sorts for each ACE2 concentration. The frequency of each library variant in each bin was determined by Illumina sequencing of the barcodes of cells collected in that bin, enabling reconstruction of per-variant expression and binding phenotypes. Bin boundaries were drawn based on distributions of expression or binding for unmutated SARS-CoV-2 controls (blue), and gray shows the distribution of library variants for library replicate 1 in these bins.

(C and D) Distribution of library variant phenotypes for expression (C) and binding (D), with variants classified by the types of mutations they contain. Internal control RBD homologs are indicated with vertical lines, colored by clade as in Figure 1A. Stop-codon-containing variants were purged by an RBD+ pre-sort prior to ACE2 binding measurements and so are not sampled in (D).

(E and F) Correlation in single-mutant effects on expression (E) and binding (F), as determined from independent mutant library replicates.

See also Figures S1 and S2 and Table S1.

Figure S2.


Deep Mutational Scanning of the SARS-CoV-2 RBD, Related to Figure 2

(A, B) Representative sorting gates used to select cells for expression (A) and binding (B) FACS experiments. FSC and SSC gates select for single cells (P1-P3), and FITC labeling of an RBD C-terminal epitope tag defines RBD+ gates (P4), when necessary. Tables show the nested hierarchy of sort gates, with final bins 1-4 for expression and binding shown in Figures 2A and 2B, respectively. For (A), the P4 RBD+ gate was used to enrich the library for expressing variants, which were grown up and re-induced for binding experiments as in (B). (C) Empirical estimates of variance in FACS-seq measurements. Barcodes encoding wild-type SARS-CoV-2 RBD were grouped by total cell count across sort bins, and the variance in estimates of expression mean fluorescence (left) or binding mean bin (right, corresponding to a single point in the subsequent titration curve fit) was determined. Black dashed lines indicate the median cell count for which each phenotype was measured among library genotypes. (D) Example variant-specific titration curves inferred from the deep mutational scanning experiment. Randomly selected titration curves are illustrated across the range of fit KD,app binding constants, with variant genotype listed above each panel. Because curves that were fit with KD,app between 10⁻⁴ and 10⁻⁶ M were virtually indistinguishable non-responsive curves, we truncated all KD,app measurements in this range to a censored >10⁻⁶ M cutoff. (E-K) Global epistasis models were fit to decompose single-mutant effects from variant backgrounds containing variable numbers of mutations. These models invoke an underlying latent scale on which mutations combine additively. The latent scale is linked to the experimental scale by a flexible nonlinear curve fit that accounts for limits in dynamic range and other nonlinearities. See the STAR Methods for more details. (E, H) Global epistasis fits. Plots illustrate, for each library variant, its experimentally determined phenotype for expression (E) or binding (H) versus its latent phenotype predicted by the global epistasis model. Red lines indicate the shape of the nonlinear curve fit. For the expression global epistasis models, mutations to stop codons are fit to a latent-scale effect of approximately −16.5. The separated clusters of points toward increasingly deleterious latent scale phenotypes reflect genotypes containing 1, 2, 3, etc. nonsense mutations. (F, I) Correlation in mutation effects on expression (F) and binding (I) between replicates, for mutations that were sampled directly as single mutants with no global epistasis decomposition. (J) Correlation in mutation effects on binding between replicates, for all global-epistasis-decomposed single-mutant effect terms on the observed phenotype scale. Equivalent plot for expression is Figure 2E. (G, K) Correlation in mutation effects on expression (G) and binding (K) averaged across replicates, for directly sampled single-mutant measurements versus global-epistasis-decomposed mutation effects. For expression, global epistasis averaging of single-mutant effects across all variants (Figure 2E) improved replicate correlations beyond the directly sampled measurements (F), so global-epistasis-decomposed values were used for all single-mutant terms. For binding, directly sampled single-mutant effects (I) were better correlated than the values decomposed from global epistasis models (J), so global epistasis models were used to interpolate single-mutant measurements only for mutations that were not observed on any directly sampled single-mutant variant backgrounds.

These high-throughput measurements of expression and ACE2 binding were consistent with expectations about the effects of mutations. RBD variants containing premature stop codons universally failed to express folded full-length protein (Figure 2C). Unmutated variants and those with synonymous mutations had a tight distribution of neutral expression and binding measurements (Figures 2C and 2D). Variants containing amino acid mutations had a wide range of expression and binding phenotypes, with variants containing just one mutation tending to have milder functional defects than those with multiple mutations (Figures 2C and 2D). These trends are consistent with the fact that most mutations are deleterious to protein folding or function (Soskine and Tawfik, 2010). However, some mutated variants exhibit expression or binding that is comparable to or even higher than that of the parental SARS-CoV-2 RBD. The panel of RBD homologs from other sarbecovirus strains all expressed well but exhibited a wide range of ACE2-binding affinities (Figures 2C and 2D; Table S1), as expected since only some are derived from viruses that can enter cells using human ACE2 (Letko et al., 2020).

These measurements show that the RBD possesses considerable mutational tolerance (Figures 2C and 2D). For instance, 46% of single amino acid mutations to SARS-CoV-2 RBD maintain an affinity to ACE2 at least as high as that of SARS-CoV-1, suggesting that there is a substantial mutational space consistent with sufficient affinity to maintain human infectivity. Many single amino acid mutants also maintain expression comparable to unmutated SARS-CoV-2, indicating that a large mutational space is compatible with properly folded RBD protein.

We next aggregated the measurements on all variants to quantify the effects of individual amino acid mutations. Because many variants contain multiple mutations, we used global epistasis models to determine the effects of individual mutations from all singly and multiply mutated variants (Otwinowski et al., 2018; Figures S2E–S2K). The resulting single-mutant Δlog(MFI) and Δlog10(KD,app) measurements correlated well between the independent library duplicates (R² = 0.93 and 0.95, respectively; Figures 2E and 2F). Throughout the rest of this paper, we report single-mutant effects as the average of the duplicate measurements. Overall, we obtained expression measurements for 99.5% and binding measurements for 99.6% of all 3,819 single amino acid mutations.
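The idea behind a global epistasis model can be sketched in a few lines: mutation effects add on an unobserved latent scale, and a monotonic nonlinear curve maps the latent value onto the measured scale. The logistic link, the floor and ceiling values, and the mutation names below are assumptions chosen for illustration only; the models used in this study fit a flexible nonlinear curve and are described in the STAR Methods.

```python
import math

def latent_phenotype(mutations, latent_effects, wildtype_latent=0.0):
    """Latent phenotype: wild-type value plus the sum of per-mutation effects."""
    return wildtype_latent + sum(latent_effects[m] for m in mutations)

def observed_phenotype(latent, floor, ceiling):
    """Map a latent value onto the bounded measurement scale.

    A logistic curve is used here purely as a stand-in for the flexible
    nonlinear link; floor and ceiling mimic the limits of dynamic range.
    """
    return floor + (ceiling - floor) / (1.0 + math.exp(-latent))

# Hypothetical latent effects for two deleterious mutations.
effects = {"mutA": -3.0, "mutB": -3.0}
FLOOR, CEILING = 6.0, 10.0  # illustrative bounds on the observed scale

wild_type = observed_phenotype(latent_phenotype([], effects), FLOOR, CEILING)
single = observed_phenotype(latent_phenotype(["mutA"], effects), FLOOR, CEILING)
double = observed_phenotype(latent_phenotype(["mutA", "mutB"], effects), FLOOR, CEILING)

# The effects add on the latent scale (-3, then -6), but the second mutation
# changes the observed value far less than the first because of the floor.
print(wild_type - single, single - double)
```

Fitting such a model to many multiply mutated variants is what allows single-mutant effects to be estimated even for mutations that were rarely or never sampled alone.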

Visualization of Sequence-to-Phenotype Maps

The complete measurements of how amino acid mutations affect expression and ACE2 binding represent rich sequence-to-phenotype maps for the RBD. We visualize the data in several ways. Figure 3 provides heatmaps that show how each mutation affects expression or ACE2 binding, with sites annotated by whether they contact ACE2, their relative solvent accessibility, and their amino acid identities in SARS-CoV-2 and SARS-CoV-1. Interactive versions of these heatmaps are in Data S1 and at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS and enable zooming, subsetting by functional annotations, and mouse-selection-based readouts of numerical measurements. As an alternative representation, Figure S3 provides logo plots that enable side-by-side comparison of how mutations affect expression and ACE2 binding. Finally, interactive structure-based visualizations using dms-view (Hilton et al., 2020) are at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS/structures/ and project the effects of mutations onto a crystal structure of the ACE2-bound RBD (Lan et al., 2020) and a cryoelectron microscopy (cryo-EM) structure of the full spike ectodomain (Walls et al., 2020). The underlying raw data are in Table S2.

Figure 3.


Sequence-to-Phenotype Maps of the SARS-CoV-2 RBD

(A and B) Heatmaps illustrating how all single mutations affect RBD expression (A) and ACE2-binding affinity (B). Interactive versions of these heatmaps are at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS and in Data S1. Squares are colored by mutational effect according to scale bars on the left, with red indicating deleterious mutations. The SARS-CoV-2 amino acid is indicated with an “x” and the SARS-CoV-1 amino acid, if different, is indicated with an “o”. Black boxes in top overlay indicate residues that contact ACE2 in the SARS-CoV-2 or SARS-CoV-1 crystal structures. The purple overlay represents the relative solvent accessibility (RSA) of a residue in the ACE2-bound SARS-CoV-2 crystal structure. See also Figure S3, Table S2, and Data S1.

Figure S3.


Logo Plot Representation of Mutation Effects on Binding and Expression, Related to Figure 3

Letter height indicates preference of each site for individual amino acids with respect to ACE2 binding (height above the center line) or RBD expression (height below the center line). Blue letters indicate the unmutated SARS-CoV-2 amino acid, and, where applicable, green letters indicate differences found in SARS-CoV-1. Yellow highlights mark residues that contact ACE2 in the SARS-CoV-2 or SARS-CoV-1 crystal structures. See the STAR Methods for details of how the amino acid preferences are calculated from the experimental measurements.

The sequence-phenotype maps reveal tremendous heterogeneity in mutational constraint across the RBD. Many sites are highly tolerant of mutations with respect to one or both of expression and ACE2 binding, while other sites are constrained to the wild-type amino acid. A substantial number of sites (e.g., 382–395) are tolerant of mutations with respect to ACE2 binding but are constrained with respect to expression—consistent with folding and stability being global constraints common to many sites (Fane et al., 1991; Poteete et al., 1997). There are also a handful of sites where ACE2 binding imposes strong constraint but expression does not (e.g., 489, 502, and 505). Moreover, at some sites there are mutations that clearly enhance expression or ACE2-binding affinity (blue colors in Figure 3).

Validation of Deep Mutational Scanning Measurements

We performed a series of experiments to confirm the dynamic range of our assays and their relevance for RBD expressed in mammalian cells or full spike trimer on pseudotyped lentiviral particles (Figures 4 and S4).

Figure 4.


Validation of Deep Mutational Scanning Measurements

(A) Titration curves for select mutations that were re-cloned and validated in isogenic yeast cultures, as in Figure 1C.

(B and C) Correlation in binding (B) and expression (C) effects between deep mutational scanning and isogenic yeast validations, including mutants shown in (A) and Figure 7C.

(D) Comparisons of dissociation constants measured for mammalian-expressed purified RBD binding to monomeric human ACE2 (Figures S4A–S4F) and yeast displayed RBD binding to natively dimeric ACE2 from our deep mutational scan.

(E–G) Validation of expression-enhancing mutations.

(E and F) Expression-enhancing mutations increase soluble yield of mammalian-expressed RBD. Reducing SDS-PAGE gel of transfection supernatant and RBD protein at various stages of purification (E). Analytical size exclusion chromatography (SEC) trace of protein variants (F). Inset, relative quantitation of protein yield from SEC. Open bar reflects the relative quantity of the earlier eluting peak, which corresponds to oxidized dimer (Figure S4G).

(G) Thermal stability of RBD variants. See Figure S4H for raw melting curves.

(H) Effects of mutations on transduction of ACE2-expressing cells by lentiviral particles pseudotyped with a SARS-CoV-2 spike. Mutants are colored by their effects on ACE2 binding as measured in the deep mutational scan (Figure 3B). Titers that fell below the limit of detection (dashed horizontal line) are plotted on the x axis. Measurements were made in biological triplicate and reflect the integrated effects of mutations on pseudovirus production and cellular entry; transduction efficiency normalized by pseudovirus production is presented in Figure S4J and gives highly similar results.

See also Figure S4.

Figure S4.


Validation of Deep Mutational Scanning Measurements, Related to Figure 4

(A-F) Human ACE2 binds to various sarbecovirus RBDs with distinct affinities. Biolayer interferometry (BLI) binding of various concentrations of human ACE2 to the indicated RBDs immobilized at the surface of biosensors. Global fit curves are shown as black lines. The vertical dashed lines indicate the transition between association and dissociation phases. Binding to dimeric human ACE2, which incorporates avidity effects, was also analyzed for the RBDs that did not bind monomeric ACE2 (D-F, right). (G) Reducing (top) and non-reducing (bottom) SDS-PAGE gels of expression-enhancing mutant RBDs illustrate that the early SEC peak (Figure 4F) is an oxidized dimer species. (H) Raw thermal melting traces for determination of non-equilibrium thermal stability, summarized in Figure 4G. Top plots show the barycentric mean (BCM) of intrinsic tryptophan fluorescence as a function of increasing temperature; bottom plots show the first derivative of BCM with respect to temperature, the maximum of which is the reported melting temperature (colored line). Black line illustrates the wild-type melting temperature, for reference. (I) BLI of immobilized mutant RBDs for binding to ACE2 (top) or CR3022 (bottom), indicating that all mutations maintain ACE2 and CR3022 binding, though kinetics of CR3022 binding may be slightly modified by some mutations. (J) Pseudovirus transduction efficiency normalized by pseudovirus yield in the transfection supernatant. p24 levels (pg/mL) in the transfection supernatant were determined via ELISA. Titers of transducing units determined by flow cytometry were normalized by p24 levels in the same supernatant to calculate transducing particles per pg p24. Measurements were performed in biological triplicate, with p24 quantitation performed in technical duplicate.
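The melting-temperature readout in (H), the temperature at which the first derivative of BCM is greatest, can be sketched with central finite differences. The function name and the synthetic sigmoidal trace are assumptions for illustration; real traces are noisy and are usually smoothed before differentiation.

```python
import math

def melting_temperature(temps, bcm):
    """Return the temperature where d(BCM)/dT is largest.

    The derivative is estimated with central finite differences at interior
    points, so the first and last temperatures are never returned.
    """
    if len(temps) != len(bcm) or len(temps) < 3:
        raise ValueError("need at least three paired temperature/BCM points")
    best_temp, best_slope = None, None
    for i in range(1, len(temps) - 1):
        slope = (bcm[i + 1] - bcm[i - 1]) / (temps[i + 1] - temps[i - 1])
        if best_slope is None or slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp

# Synthetic trace: BCM rises sigmoidally with a midpoint at 55 degrees C.
temps = list(range(40, 71))
bcm = [340.0 + 10.0 / (1.0 + math.exp(-(t - 55) / 2.0)) for t in temps]
tm = melting_temperature(temps, bcm)
```

A stabilizing mutation shifts the midpoint of the transition, and hence the derivative peak, to a higher temperature.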

To validate the dynamic range of our deep mutational scanning, we re-cloned and tested RBD mutants in isogenic yeast-display assays. These experiments recapitulated the deep mutational scanning (Figures 4A–4C), including confirmation that some mutations enhance expression (V367F and G502D) or ACE2 affinity (N501F, N501T, and Q498Y) in the context of yeast-expressed RBD.

We next compared our deep mutational scanning to measurements on mammalian-expressed RBDs. We purified mammalian-expressed RBDs from six sarbecoviruses (SARS-CoV-2, SARS-CoV-1, WIV1, RaTG13, ZXC21, and ZC45) and measured their 1:1 binding affinities for monomeric human ACE2 using biolayer interferometry, which agreed with the measurements from our deep mutational scan (Figures 4D and S4A–S4F). Moreover, we observed that using a natively dimeric ACE2 enables detection of binding by the RaTG13 RBD, which can support ACE2-mediated cell entry (Shang et al., 2020) even though the 1:1 affinity is too weak to detect (Figure S4D).

We also validated that mutations enhancing yeast surface expression improve soluble yield and stability of mammalian-expressed RBD protein. We tested five expression-enhancing mutations and found that each greatly increased soluble RBD yield (2.3- to 4.8-fold increase; Figures 4E, 4F, and S4G). Four of the mutations also increased RBD stability (Figures 4G and S4H), including one (V367F) that increased the melting temperature by 3.9°C. All five mutations also maintained ACE2 binding and antigenicity (Figure S4I), suggesting they could be useful for enhancing production of RBD-based vaccine immunogens.

Finally, we validated the deep mutational scanning measurements in the context of spike-pseudotyped lentiviral particles (Figures 4H and S4J; Crawford et al., 2020). The trends observed for entry by the spike-pseudotyped lentiviral particles generally confirmed the deep mutational scanning: three of four mutations that were detrimental for RBD expression or ACE2 binding reduced pseudovirus entry, while a mutation that had little phenotypic effect in the deep mutational scan did not affect viral entry. We also tested two ACE2 affinity-enhancing mutations and found that both increased pseudovirus entry. Note that this result with single-cycle pseudovirus does not necessarily imply that these mutations would increase growth of authentic SARS-CoV-2, since multi-cycle viral replication often involves tuning of receptor affinity to simultaneously optimize viral attachment and release (Callaway et al., 2018; Hensley et al., 2009; Lang et al., 2020). Taken together, these experiments help validate the accuracy and relevance of the deep mutational scanning.

Interpreting Mutation Effects in the Context of the RBD Structure

To relate our sequence-phenotype maps to the RBD structure, we mapped the effects of mutations onto the ACE2-bound SARS-CoV-2 RBD crystal structure (Lan et al., 2020), coloring each residue’s Cα by the mean effect of mutations at that site on expression (Figure 5A) or binding (Figure 5B). Interactive structure-based visualizations of specific residue sets discussed in the following sections can be found at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS/structures/.
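The per-site summary used for this coloring can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name, the `(site, mutant, effect)` tuple layout, and the numeric values are all hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean


def mean_effect_per_site(mutation_effects):
    """Average the effects of all measured mutations at each site.

    mutation_effects: iterable of (site, mutant_amino_acid, effect) tuples.
    Returns a dict mapping site -> mean effect across its mutations.
    """
    by_site = defaultdict(list)
    for site, _mutant, effect in mutation_effects:
        by_site[site].append(effect)
    return {site: mean(values) for site, values in by_site.items()}


# Hypothetical values, for illustration only (not taken from the dataset).
example = [(502, "A", -2.0), (502, "D", -4.0), (439, "K", 0.0), (439, "R", -0.5)]
site_means = mean_effect_per_site(example)
```

A more negative site mean corresponds to stronger constraint, which would map to the red end of the color scale described for Figure 5.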

Figure 5.

Mutation Effects in the Context of RBD Structure and Implications for Sarbecovirus Evolution

(A and B) Mutational constraint mapped to the SARS-CoV-2 RBD structure. A sphere at each site Cα is colored according to the mean effect of mutations with respect to expression (A) or binding (B), with red indicating more constraint. RBD structural features and the ACE2 K31 and K353 interaction hotspot residues are labeled. Yellow sticks indicate disulfide bridges. Interactive structure-based visualizations of these data are at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS/structures/.

(C) Relationship between mutational constraint on binding and expression. The structural view shows sites that are under strong constraint for ACE2 binding but are tolerant of mutations for expression (cyan spheres).

(D) Heatmap as in Figure 3B, subsetted on sites that directly contact ACE2 in the SARS-CoV-2 or SARS-CoV-1 RBD structures, plus interface site 494, which is a key site of adaptation in SARS-CoV-1.

(E) RBD sites 493, 498, and 501, which have many affinity-enhancing mutations, participate in polar contact networks involving the ACE2 interaction hotspot residues K31 and K353.

(F) Variation at ACE2 contact sites in sarbecovirus RBDs. Circles show the effects of individual mutations that differentiate a virus ACE2 interface from SARS-CoV-2, while x shows the mean effect of all mutations at that site. The sum of individual mutation effects at interface residues is shown, compared to the actual RBD binding relative to unmutated SARS-CoV-2.

See also Figure S5.

The two subdomains of the RBD differ in mutational constraint on expression and binding. The core-RBD subdomain consists of a central beta sheet flanked by alpha-helices and presents a stably folded scaffold for the receptor binding motif (RBM, residues 437–508; Li et al., 2005a), which encodes ACE2 binding and receptor specificity (Letko et al., 2020). The RBM consists of a concave surface anchored by a β-hairpin and a disulfide bond stabilizing one of the lateral loops, which cradles the ACE2 α1 helix and a β-hairpin centered on ACE2 residue K353. Consistent with the modularity of core-RBD-encoded stability and RBM-encoded binding, constraint on expression primarily focuses on buried residues within the core-RBD (Figure 5A), while constraint on binding focuses on the RBM-proximal core-RBD in addition to the RBM itself (Figure 5B), particularly on RBM residues that contact ACE2 residues K31 and K353, which are “hotspots” of binding for SARS-CoV-1 and SARS-CoV-2 (Li, 2008; Shang et al., 2020; Wu et al., 2012).

Several ACE2-contact residues exhibit binding-stability tradeoffs, as has been seen in the active sites and binding interfaces of other proteins (Julian et al., 2017; Tokuriki et al., 2008; Wang et al., 2002). For example, several mutations to G502 enhance RBD expression (Figure 3A) but abolish binding (Figure 3B) due to steric clashes with ACE2 (Figure S5A). Similarly, mutations to polar amino acids enhance expression at interface residues Y449, L455, F486, and Y505 (Figure 3A), consistent with the destabilizing effect of surface-exposed hydrophobic patches (Schwehm et al., 1998)—but these hydrophobic residues form ACE2-packing contacts and are required for binding (Figures 3B and S5B).

Figure S5.

Additional Structural Analyses of Mutation Effects, Related to Figure 5

(A, B) Structural depictions of sites exhibiting stability-binding tradeoffs. (A) RBD residue G502 requires small amino acid side chains for ACE2 binding (Figure 3B), consistent with its close proximity to ACE2 residue G354 in the bound structure. (B) Mutations to polar residues at positions Y449, L455, F486, and Y505 would enhance expression but reduce binding, consistent with specific geometric constraints imposed by the close packing of these residues at the ACE2 surface. (C) Relationship between barcode expression and titration response plateau parameters. The correlation between mutation effects on binding and expression in Figure 5C could emerge from trivial correlation between phenotypes (e.g., yeast with higher RBD surface expression can bind more ACE2). However, our multiple-concentration titration approach should in principle remove this trivial correlation (Adams et al., 2016), because each binding phenotype is determined from a self-referenced titration curve, for which the free plateau response parameter can vary to account for different levels of saturated binding due to RBD expression (see Figure S2D). Consistent with this premise, the response parameter from the titration fit for library variants with K_D,app < 10^−7 (as lower-affinity titration curves do not adequately sample the titration plateau) correlates with its expression phenotype. (D) Relationship between mutational constraint on binding and residue relative solvent accessibility (RSA). Black dots indicate RSA in the full ACE2-bound RBD structure, and when sites have changes in RSA in the unbound structure, then their RSA in that structure is also shown in orange. (E) Mutation effects on binding (left) and expression (right) at disulfide cysteine residues. Details as in Figure 3. RBD sites are grouped by disulfide pair and labeled according to location in the core-RBD or RBM sub-domains. (F) Mutation effects on expression at N-linked glycosylation sites (NLGS). RBD sites are grouped by NLGS motif (NxS/T, where x is any amino acid except proline). Boxed amino acids indicate those that encode a NLGS motif. NLGS motifs are labeled according to whether they are present in both the SARS-CoV-2 and SARS-CoV-1 RBD (N331 and N343 glycans), or in SARS-CoV-1 only (N370 glycan). Introduction of the N370 glycan in SARS-CoV-2 is mildly deleterious for stability. (G) Effects of putative N-linked glycosylation site (NLGS) knock-in mutations. Heatmap details as in Figure 3. There are 10 surface-exposed asparagines for which RBD expression is unaffected or enhanced (top) when an NLGS motif is introduced via mutations to S or T at the i+2 site; for eight of these putative NLGS knock-ins (blue labels), the putative glycan is also tolerated for ACE2 binding (bottom), but for two (red labels), introduction of the NLGS motif is not tolerated for ACE2 binding. (H) Mapping of these ten asparagines to the RBD structure illustrates that these two binding-constrained asparagines (red) map to the ACE2 interface. (I) For mutation effects on expression (left) and binding (right), comparison of phenotypic impacts of mutations that knock in new NLGS motifs (NxS/T) versus single mutations to N, S, or T at all positions. There is no trend for NLGS knock-in mutations to be more deleterious than typical mutations to N, S, or T.

However, our data also indicate that global RBD stability contributes to ACE2-binding affinity. In general, mutation effects on RBD binding and expression are correlated (Figures 5C and S5C), with residues that deviate from this trend clustering at the ACE2 interface (Figure 5C, cyan points). This correlation between expression and binding is consistent with studies on antibodies, where mutations that improve stability and rigidity accompany increases in binding affinity (Davenport et al., 2016; Ovchinnikov et al., 2018; Schmidt et al., 2013). Because ACE2 binding is influenced by both global RBD stability and interface-specific constraints, a site’s tolerance to mutation is better explained by its extent of burial in the ACE2-bound RBD structure than its burial in the free RBD structure alone (Figure S5D). The contribution of RBD stability to ACE2 binding may be influenced by other factors in the full spike trimer, though our measurements on pseudotyped lentiviral particles (Figure 4H) indicate that a destabilizing RBD mutation (C432D) reduces ACE2-mediated cellular entry in the context of spike trimer.

Our data also reveal the importance of other sequence features. For example, the four disulfide bonds in the RBD have varying tolerance to mutation (Figures 5A, 5B, and S5E), with the RBM C480:C488 disulfide completely constrained for ACE2 binding. The two RBD N-linked glycans contribute to RBD stability, as mutations that ablate the NxS/T glycosylation motif decrease RBD expression (Figure S5F). The SARS-CoV-1 RBD contains a third glycan, but its introduction at the homologous N370 in SARS-CoV-2 is mildly deleterious for expression (Figure S5F). However, there are other surface positions where introduction of NxS/T glycosylation motifs is tolerated or even beneficial for RBD expression (Figures S5G–S5I); adding glycans at some of these sites could be useful in resurfacing RBDs as antibody probes (Wu et al., 2010; Zhou et al., 2020c) or epitope-focused immunogens (Duan et al., 2018; Eggink et al., 2014; Jardine et al., 2016; Kulp et al., 2017; Weidenbacher and Kim, 2019).
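The NxS/T motif logic discussed above (x being any amino acid except proline, with knock-ins made by mutating the i+2 position of an existing asparagine to S or T) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the toy sequence is made up rather than taken from the RBD, and surface exposure is not considered.

```python
import re

# A lookahead is used so that overlapping motifs are all reported.
NLGS_PATTERN = re.compile(r"(?=N[^P][ST])")


def find_nlgs(seq, first_site=1):
    """Return the site numbers of asparagines that begin an NxS/T motif."""
    return [m.start() + first_site for m in NLGS_PATTERN.finditer(seq)]


def nlgs_knockins(seq, first_site=1):
    """List single mutations to S or T at the i+2 position of an existing N
    that would create a new NxS/T motif (x must not be proline)."""
    knockins = []
    for i, aa in enumerate(seq[:-2]):
        if aa == "N" and seq[i + 1] != "P" and seq[i + 2] not in "ST":
            for new in "ST":
                knockins.append(f"{seq[i + 2]}{i + 2 + first_site}{new}")
    return knockins


# Toy sequence for illustration only (hypothetical, not RBD sequence).
toy = "ANITNPSNGA"
existing = find_nlgs(toy)
candidates = nlgs_knockins(toy)
```

For real RBD numbering, `first_site` would be set to the spike residue number of the first position in the sequence passed in; each candidate could then be looked up in the expression and binding maps.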

Mutation Effects at ACE2 Contact Sites and Implications for Sarbecovirus Evolution

An initially surprising feature of SARS-CoV-2 was that its RBD tightly binds ACE2 despite differing in sequence from SARS-CoV-1 at many residues that had been defined as important for ACE2 binding (Andersen et al., 2020; Wan et al., 2020). Our map of mutational effects explains this observation by revealing remarkable degeneracy at ACE2 contact positions, with many interface mutations being tolerated or even enhancing affinity (Figure 5D). Mutations that enhance affinity are notable at RBD sites Q493, Q498, and N501. Although these residues are involved in a dense network of polar contacts with ACE2 (Shang et al., 2020; Figure 5E), our measurements show there is substantial plasticity in this network, as mutations that reduce the polar character of these residues can enhance affinity.

Within the SARS-CoV-2 clade of sarbecoviruses, our maps of mutational effects on binding explain variation in ACE2 affinity among different viruses. For example, GD-Pangolin has higher affinity for ACE2 than SARS-CoV-2 (Figures 1C and 2D), and this can be explained by the affinity-enhancing Q498H mutation present in this virus’s RBD sequence (Figure 5F). In contrast, RaTG13 has substantially lower affinity for ACE2 than SARS-CoV-2 (Figures 1C and 2D), consistent with the presence of affinity-decreasing mutations including Y449F and N501D (Figure 5F). The fact that differences in binding affinity of GD-Pangolin and RaTG13 are well explained by summing the effects of individual mutations relative to SARS-CoV-2 suggests that our deep mutational scanning is useful for sequence-based predictions of the ACE2-binding potential of future viruses isolated from the SARS-CoV-2 clade.
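The additive prediction described here, summing the measured effects of the individual mutations that distinguish a related RBD from SARS-CoV-2, can be sketched as below. The function name and the numeric effect values are hypothetical placeholders chosen only to match the direction of the effects named in the text; real values would come from the published mutation-effect tables.

```python
def predict_binding_shift(differences, effects):
    """Sum single-mutation binding effects for the mutations separating a
    related RBD from SARS-CoV-2. This assumes effects combine additively
    (no epistasis), which the text suggests holds best for close relatives.

    differences: list of mutation strings such as "Q498H".
    effects: dict mapping mutation string -> measured effect on binding.
    """
    missing = [m for m in differences if m not in effects]
    if missing:
        raise KeyError(f"no measurement for: {missing}")
    return sum(effects[m] for m in differences)


# Placeholder effect values (hypothetical, for illustration only).
effects = {"Q498H": 0.3, "Y449F": -0.5, "N501D": -2.0}
ratg13_like_shift = predict_binding_shift(["Y449F", "N501D"], effects)
```

A negative total predicts weaker binding than unmutated SARS-CoV-2 and a positive total predicts stronger binding. The prediction would be expected to degrade as more substitutions accumulate.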

In contrast, the ACE2-binding interface of SARS-CoV-1 has many more mutations relative to SARS-CoV-2, and this increased divergence causes shifts in the actual effects of mutations on ACE2 binding. In particular, our deep mutational scanning shows that most SARS-CoV-1 amino acid states are individually deleterious in SARS-CoV-2, despite being compatible with high-affinity binding by SARS-CoV-1 (Figure 5F). This shift in the effects of mutations between more distantly related RBDs is consistent with studies of protein evolution demonstrating that epistatic entrenchment causes amino acid preferences to change as proteins diverge (Hilton and Bloom, 2018; Lee et al., 2018; Pollock et al., 2012; Povolotskaya and Kondrashov, 2010; Shah et al., 2015; Starr and Thornton, 2016; Starr et al., 2018). Therefore, our current SARS-CoV-2 deep mutational scanning data are likely to be most useful for predicting the effects of mutations to RBDs closely related to that of SARS-CoV-2.

Mutational Constraint of Antibody Epitopes

The RBD is the dominant target of neutralizing antibodies to SARS-CoV-2 (Brouwer et al., 2020; Cao et al., 2020; Ju et al., 2020; Pinto et al., 2020; Premkumar et al., 2020; Rogers et al., 2020; Suthar et al., 2020; Yuan et al., 2020a; Zhang et al., 2020; Zost et al., 2020). It is unclear to what extent the RBD will evolve to escape such antibodies in a manner reminiscent of some other viruses (Smith et al., 2004; Trkola et al., 2005), although in vitro studies suggest that SARS-CoV-2 and SARS-CoV-1 RBDs are capable of fixing mutations that escape neutralizing antibodies (Baum et al., 2020; Rockx et al., 2010). To better define the RBD’s evolutionary capacity for antibody escape, we examined mutational constraint in the epitopes of antibodies that bind the SARS-CoV-1 or SARS-CoV-2 RBD (Figures 6A, S6A, and S6B; Hwang et al., 2006; Pak et al., 2009; Pinto et al., 2020; Prabakaran et al., 2006; Walls et al., 2019; Wrapp et al., 2020b; Wu et al., 2020b; Yuan et al., 2020b).

Figure 6.

Mutational Constraint of Antibody Epitopes

(A) For ACE2 and each of 8 RBD-directed antibodies, black outlines indicate the epitope structural footprint, with surfaces colored by mutational constraint (red indicates more constrained). Names of antibodies capable of neutralizing SARS-CoV-2 are boxed. Constraint is illustrated as mutational effects on binding for RBM-directed antibodies (blue, top) and expression for core-RBD-directed antibodies (orange, bottom). The N343 glycan, which is present in the S309 epitope and is constrained with respect to expression, is shown only on this surface for clarity.

(B) Average mutational constraint for binding and expression within each epitope. Points are colored according to the RBM versus core-RBD designation in (A).

(C) Identification of a patch of mutational constraint surrounding RBD residue E465, which has not yet been targeted by any described antibodies. Surface is colored according to mutational effects on expression, as in (A, bottom). Residues in this constrained E465 patch are listed.

See also Figure S6.

Figure S6.

Mutational and Evolutionary Constraint of Antibody Epitopes, Related to Figure 6

(A, B) Surface representations of antibody epitopes colored by mutational effects on expression (A) and binding (B). Representations as described in Figure 6A. (C, D) Mutational constraint and observed antibody escape mutations. Baum et al. (2020) selected SARS-CoV-2 escape mutations from RBD-directed antibodies. We compare the average mutational tolerance of the sites at which these escape mutations accrue (C), and the effects of the specific escape mutations themselves (D) to all RBM and ACE2-contact sites/mutations. The antibody escape involved mutations that were better tolerated than typical mutations in the RBM or ACE2-binding interface. (E) Evolutionary diversity in antibody epitopes and our newly described E465-centered surface patch among the sarbecoviruses in Figure 1A. Diversity is summarized as the effective number of amino acids (Neff), which scales from 1 for a site that is invariant, to 20 for a site in which all amino acids are at equal frequency.
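One common way to compute an effective number of amino acids with the stated range (1 for an invariant site, 20 when all amino acids are equally frequent) is the exponential of the Shannon entropy of the amino acid frequencies at a site. The sketch below assumes that definition, which the caption does not spell out; the function name is hypothetical, and gap characters are assumed to have been removed from the alignment column beforehand.

```python
import math
from collections import Counter


def n_eff(column):
    """Effective number of amino acids at one alignment column, computed as
    exp(Shannon entropy) of the observed amino acid frequencies.

    column: string or list of one-letter amino acid codes, one per sequence.
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


invariant = n_eff("AAAAAAAA")             # one amino acid only
two_equal = n_eff("AAAADDDD")             # two amino acids at equal frequency
all_twenty = n_eff("ACDEFGHIKLMNPQRSTVWY")  # all 20 at equal frequency
```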

Many antibodies have epitopes that overlap the RBD ACE2-contact interface and are therefore strongly constrained by mutation effects on binding. For instance, antibodies B38 and 80R engage the two constrained patches that comprise the ACE2-binding interface, while S230, F26G19, and m396 engage either one of these ACE2-binding patches. However, none of the currently characterized antibodies have epitopes as constrained as the ACE2-contact surface itself (Figure 6B), suggesting further epitope focusing could be achieved. The importance of such focusing is demonstrated by a recent study that identified RBD mutations enabling escape from RBM-directed neutralizing antibodies (Baum et al., 2020)—our data indicate that the escape occurs at sites that have high mutational tolerance (Figures S6C and S6D).

Epitopes of core-RBD-directed antibodies tend to be mutationally constrained with respect to expression rather than binding (Figures 6A and 6B). These core-RBD epitopes are conserved across the sarbecovirus alignment (Figure S6E), explaining the possible cross-reactivity of these antibodies between SARS-CoV-1 and SARS-CoV-2 (Huo et al., 2020; Pinto et al., 2020; Wrapp et al., 2020b). Although residues in these epitopes are constrained for stability even in our measurements on the isolated RBD, some of them likely exhibit additional constraint due to quaternary contacts in the full spike trimer (Walls et al., 2020; Wrapp et al., 2020a; Yuan et al., 2020b). We identified an additional core-RBD patch centered on residue E465 that is also mutationally constrained (Figure 6C) and evolutionarily conserved (Figure S6E) but is not targeted by any currently known antibody and might represent a promising target.

Taken together, our results identify multiple mutationally constrained patches on the RBD surface that can be targeted by antibodies. These findings provide a framework that could inform the formulation of antibody cocktails aiming to limit the emergence of viral escape mutants (Baum et al., 2020; Pinto et al., 2020; Wu et al., 2020b; Zost et al., 2020), particularly if deep mutational scanning approaches like our own are extended to define antibody epitopes in functional as well as structural terms (Dingens et al., 2019).

Using Sequence-Phenotype Maps to Interpret Genetic Variation in SARS-CoV-2

An important question is whether any mutations that have appeared in circulating SARS-CoV-2 isolates have functional consequences. Despite intense interest in this question, experimental work to characterize the effects of SARS-CoV-2 mutations has lagged far behind their identification in viral sequences. Our comprehensive maps of the phenotypic effects of mutations provide a direct way to interpret the impact of current and future genetic variation in the SARS-CoV-2 RBD.

To assess the phenotypic impacts of mutations that have appeared in the SARS-CoV-2 RBD to date, we downloaded all 31,570 spike sequences available from GISAID (Elbe and Buckland-Merrett, 2017) on May 27, 2020 and identified RBD amino acid mutations present in high-quality clinical isolates. All observed RBD mutations are at low frequency, with 56 of the 98 observed mutations present only in a single sequence. The observed mutations are significantly less deleterious for ACE2 binding and RBD expression than random single-nucleotide-accessible mutations (Figures 7A, S7A, and S7B, p < 10^−6, permutation tests), consistent with the action of purifying selection. Purifying selection against deleterious mutations is especially apparent for mutations that are observed multiple times in circulating variants, with a substantial number of singletons being mildly or moderately deleterious, whereas mutations observed multiple times are largely neutral. This general pattern of increased purifying selection on more common mutations is consistent with theoretical expectation and empirical patterns observed for other viruses (Pybus et al., 2007; Xue and Bloom, 2020).
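A permutation test of this kind can be sketched as follows: draw random subsets of the background mutation effects at the same size as the observed set, and report the fraction of draws whose median is at least as favorable as the observed median. This is a minimal sketch rather than the authors' code. The function name, default draw count, and seed handling are assumptions, and the example values are invented.

```python
import random
from statistics import median


def permutation_p_value(observed, background, n_draws=10_000,
                        statistic=median, seed=0):
    """Fraction of random sub-samples of `background` (same size as
    `observed`, drawn without replacement) whose statistic is equal to or
    greater than the statistic of the observed set.

    observed: effects of the mutations seen in circulating sequences.
    background: list of effects of all single-nucleotide-accessible mutations.
    """
    rng = random.Random(seed)
    actual = statistic(observed)
    k = len(observed)
    hits = sum(
        statistic(rng.sample(background, k)) >= actual
        for _ in range(n_draws)
    )
    return hits / n_draws


# Invented example: most background mutations are deleterious (-3.0),
# while the observed set is entirely neutral (0.0).
background = [-3.0] * 95 + [0.0] * 5
p = permutation_p_value([0.0, 0.0, 0.0], background, n_draws=2000)
```

Passing `statistic=max` instead of the median gives the analogous test for whether the observed set contains unusually strong affinity-enhancing mutations. Many more draws than the default here would be needed to resolve very small P-values.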

Figure 7.

Phenotypic Impacts of Genetic Variation in the SARS-CoV-2 RBD

(A) Distribution of effects on ACE2 binding of mutations observed among circulating SARS-CoV-2 isolates. The distribution of mutation effects is shown for all amino acid mutations accessible via single-nucleotide mutation from the SARS-CoV-2 Wuhan-Hu-1 gene sequence, compared to the distributions for subsets of mutations that are observed in sequenced SARS-CoV-2 isolates deposited in GISAID at increasing observation count thresholds. n, number of mutations in each subset.

(B) Summary of most frequent mutations among GISAID sequences, reporting our deep mutational scanning measured effect on binding and expression, the number of GISAID sequences containing the mutation, and the number of geographic regions from which a mutation has been reported.

(C and D) Validation of the mutational effects on binding (C) and expression (D) for 4 of the 5 most frequent circulating RBD variants. S477N rose to high frequency after we began our validation experiments, and so was not included. Error bars in (D) are standard error from 11 samples.

See also Figure S7.

Figure S7.

Genetic Variation and Selection in SARS-CoV-2, Related to Figure 7

(A) Distribution of expression effects of mutations observed among circulating SARS-CoV-2 isolates. Details as in Figure 7A. (B) Permutation tests indicating the action of purifying selection on binding (top) and expression (bottom) among circulating SARS-CoV-2 mutations. For each threshold of GISAID observation counts, 1 million random sub-samples of single-nucleotide-accessible amino acid changes were generated at the same sample size as the true mutation set (n = 98, 42, and 13 for the ≥ 1, ≥ 2, and ≥ 6 thresholds). A P-value was determined as the fraction of sub-samples with median mutational effect on binding or expression equal to or greater than that of the actual GISAID mutation set (dashed vertical line). The observation that the set of mutations observed in GISAID have a more favorable median mutational effect on binding and expression than randomly sampled mutations indicates the action of purifying selection for ACE2 binding and RBD stability. (C) Heatmaps depicting effects of mutations on ACE2 binding, indicating only those mutations that are accessible via single-nucleotide mutation from the SARS-CoV-2 Wuhan-Hu-1 isolate gene sequence. Amino acid mutations that require more than one nucleotide change are in gray. (D) Permutation tests for positive selection for enhanced ACE2 affinity. Random sub-samples were generated as in (B), and the maximum affinity-enhancing effect of mutations in each sub-sample was compared to that in the actual GISAID mutation set. A P-value was determined as the fraction of sub-samples with a maximum effect on binding equal to or greater than in the actual GISAID mutation set (vertical dashed line). We do not see evidence for selection for enhanced ACE2 binding, as randomly sampled mutations generally contain mutations with stronger affinity-enhancing effects than observed in the GISAID mutation set.

Our discovery of affinity-enhancing mutations to the SARS-CoV-2 RBD raises the question of whether positive selection favors such mutations, since the relationship between receptor affinity and fitness can be complex for viruses that are well adapted to their hosts (Callaway et al., 2018; Hensley et al., 2009; Lang et al., 2020). Affinity-enhancing mutations are accessible via single-nucleotide mutation from SARS-CoV-2 (Figure S7C), but none are observed among circulating viral sequences (Figure 7A), and observed mutations do not enhance ACE2 affinity more than randomly drawn samples of single-nucleotide mutations (Figure S7D). Taken together, we see no clear evidence of selection for stronger ACE2 binding, consistent with SARS-CoV-2 already possessing adequate ACE2 affinity at the beginning of the pandemic.
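Which amino acid mutations count as accessible via single-nucleotide mutation depends on the wild-type codon at each site. The sketch below enumerates them using the standard genetic code; the function name is hypothetical, and the example codon is an arbitrary asparagine codon rather than a statement about any particular RBD site.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nt_accessible(codon):
    """Return the set of amino acids reachable from `codon` by changing
    exactly one nucleotide, excluding synonymous changes and stop codons."""
    codon = codon.upper()
    wildtype = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa not in (wildtype, "*"):
                reachable.add(aa)
    return reachable


# Example: amino acids reachable from the asparagine codon AAT.
from_aat = single_nt_accessible("AAT")
```

Applying this to every codon of a reference gene sequence gives the background set of mutations against which observed mutations are compared.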

Last, we validated our deep mutational scanning for mutations that are especially prevalent among naturally occurring sequences in GISAID. The deep mutational scanning suggests small phenotypic effects for the most prevalent mutations, with the exception of V367F, which substantially enhances expression (Figure 7B). We re-cloned and tested most of these prevalent mutations for expression and ACE2 binding in isogenic yeast-display assays. Consistent with the deep mutational scanning, the only large phenotypic effect was increased expression of V367F (Figures 7C and 7D), which we also validated enhances thermal stability of mammalian-expressed RBD (Figures 4G and S4H). The relevance of V367F’s stability-enhancing effect for viral fitness is unclear, though this mutation has independently arisen multiple times (van Dorp et al., 2020). We also validated that N439K, the most prevalent RBD mutation, which may have a very slight affinity-enhancing effect (Figures 7B and 7C), has no measurable impact on entry of spike-pseudotyped lentiviral particles (Figure 4H). Taken together, our results suggest that there is little phenotypic diversity in ACE2 binding among circulating variants at this early stage of the pandemic—although it will be interesting to use our maps to continually assess the phenotypic effects of future mutations as the virus evolves.

Discussion

Vast numbers of viral genomes have been sequenced in almost real-time during the SARS-CoV-2 pandemic. These genomic sequences have been useful for understanding viral emergence and spread (Andersen et al., 2020; Bedford et al., 2020; Fauver et al., 2020), but the lack of corresponding high-throughput functional characterization means that speculation has outpaced experimental data when it comes to understanding the phenotypic consequences of mutations. Here, we take a step toward providing phenotypic maps commensurate with the scale of genomic data by experimentally characterizing how all amino acid mutations to the RBD affect the expression of folded protein and its affinity for ACE2, two key factors for viral fitness. These maps show that RBD mutations that have appeared in SARS-CoV-2 to date are nearly neutral with respect to these two biochemical phenotypes, with the exception of one mutation (V367F) that increases RBD stability. Notably, there has been no selection to date for any of the evolutionarily accessible mutations that enhance ACE2 binding affinity. The genetic diversity of SARS-CoV-2 is likely to increase as it continues to circulate in the human population, and so our phenotypic maps should become increasingly valuable for viral surveillance as mutations accumulate over time.

It is important to remember that our maps define biochemical phenotypes of the RBD, not how these phenotypes relate to viral fitness. There are many complexities in the relationship between biochemical phenotypes of yeast-displayed RBD and viral fitness. First, there are subtle differences in glycan structures between yeast and human cells (Hamilton et al., 2003), though the overall role of glycans in RBD stability is preserved in yeast systems (Chen et al., 2014). Second, the RBD is just one domain of the viral spike, which engages in complex dynamic movements to mediate viral entry (Huo et al., 2020; Walls et al., 2019, 2020; Wrapp et al., 2020b). Finally, spike-mediated entry is just one component of fitness, which involves a myriad of incompletely understood factors that determine how well a virus spreads from one human to another (Kutter et al., 2018). To some degree, these caveats apply to all experimental studies, as even sophisticated animal models are imperfect proxies for true fitness (Louz et al., 2013), but they apply especially to basic biochemical phenotypes like the ones we measure. However, on a hopeful note, our measurements correlate well with cellular entry by spike-pseudotyped viral particles expressing sarbecovirus RBD homologs (Figure 1D) and single mutants of the SARS-CoV-2 RBD (Figure 4H). Fitness ultimately arises from the concerted action of biochemical phenotypes, which are in turn determined by genotype (Dean and Thornton, 2007; Harms and Thornton, 2013; Russell et al., 2014). By making the first link from mutations to biochemical phenotypes, we have taken a step toward enabling better interpretation of viral genetic variation.

One important area where our maps do have clear relevance is assessing the potential for SARS-CoV-2 to undergo antigenic drift by fixing mutations at sites targeted by antibodies, as occurs for some other viruses such as influenza (Smith et al., 2004). The RBD is the dominant target of neutralizing antibodies (Cao et al., 2020; Ju et al., 2020; Pinto et al., 2020; Rogers et al., 2020; Seydoux et al., 2020; Shi et al., 2020; Wu et al., 2020b; Zost et al., 2020), and so any antigenic drift will be constrained by its mutational tolerance. Our results show that many mutations to the RBD are well tolerated with respect to both protein folding and ACE2 binding. However, the ACE2 binding interface is more constrained than most of the RBD’s surface, which could limit viral escape from antibodies that target this interface (Rockx et al., 2010). In this respect, our maps enable several important observations. First, no characterized antibodies have epitopes as constrained as the actual RBD surface that contacts ACE2, suggesting that there is room for epitope focusing to minimize viral escape. Second, there are a number of RBD mutations that enhance ACE2 affinity, which implies evolutionary potential for compensation of deleterious mutations in the ACE2 interface in a manner reminiscent of multi-step escape pathways that have been described for other viruses (Bloom et al., 2010; Friedrich et al., 2004; Gong et al., 2013; Lynch et al., 2015; Wu et al., 2017). It should be possible to shed further experimental light on the potential for antigenic drift by extending our deep mutational scanning methodology to directly map immune-escape mutations as has been done for other viruses (Dingens et al., 2019; Lee et al., 2019; Wu et al., 2020a).

RBD-based antigens represent a promising vaccine approach (Chen et al., 2020a, 2020b; Mulligan et al., 2020; Quinlan et al., 2020; Ravichandran et al., 2020; Zang et al., 2020). Our sequence-phenotype maps can directly inform efforts to engineer such vaccines in several ways. First, we identify many mutations that enhance RBD expression and thermal stability, a desirable property in vaccine immunogens. Second, our maps show which mutations can be introduced into the RBD without disrupting key biochemical phenotypes, thereby opening the door to resurfacing immunogens to focus antibodies on specific epitopes (Duan et al., 2018; Eggink et al., 2014; Jardine et al., 2016; Kulp et al., 2017; Weidenbacher and Kim, 2019; Wu et al., 2010). Finally, our maps show which surfaces of the RBD are under strong constraint and might thereby be targeted by structure-guided vaccines to stimulate immunity with breadth across the sarbecovirus clade: in addition to the ACE2 interface itself, these surfaces include several core-RBD patches targeted by currently described antibodies and a previously undescribed core-RBD patch surrounding residue E465.

Finally, our work should be useful for understanding the evolution of sarbecoviruses more broadly, including the potential for more spillovers into the human population. There is a dizzying diversity of RBD genotypes and phenotypes among sarbecoviruses within bat reservoirs (Boni et al., 2020; Demogines et al., 2012; Frank et al., 2020; Hu et al., 2017; Latinne et al., 2020; Letko et al., 2020; MacLean et al., 2020). A prerequisite for these viruses to jump to humans is the ability to efficiently bind human receptors (Becker et al., 2008; Letko et al., 2020; Menachery et al., 2015, 2016). Our maps are immediately useful in assessing the effects on ACE2 binding of mutations to viruses within the SARS-CoV-2 clade, and extensions to account for epistasis and genetic background could further inform understanding of the evolutionary trajectories that enable sarbecoviruses to efficiently infect human cells.

STAR★Methods

Key Resources Table

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies

FITC-conjugated chicken anti-cMyc antibody Immunology Consultants Laboratory, Inc. Cat# CMYC-45F
PE-conjugated streptavidin Thermo Fisher Cat# S866

Chemicals, Peptides, and Recombinant Proteins

Biotinylated human ACE2 ACROBiosystems Cat# AC2-H82E6

Critical Commercial Assays

HIV-1 p24 Antigen Capture Assay Advanced Bioscience Laboratories, Inc. Cat# 5421

Deposited Data

Raw sequencing data This paper NCBI SRA: BioProject PRJNA639956
GISAID EpiCoV SARS-CoV-2 sequence isolates GISAID Full list of contributing labs and accessions: https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/alignments/Spike_GISAID/gisaid_hcov-19_acknowledgement_table.xls
Sarbecovirus homolog RBD sequences NCBI NCBI GenBank: MN908947, MN996532, AY278741, KF367457, KF569996, DQ071615, DQ022305, DQ412042, MG772934, MG772933, NC014470
GD-Pangolin RBD sequence Lam et al., 2020 N/A
ACE2-bound RBD structures Li et al., 2005a; Lan et al., 2020 PDB 2AJF, 6M0J
Antibody-bound RBD structures Yuan et al., 2020b; Wrapp et al., 2020b; Prabakaran et al., 2006; Pak et al., 2009; Hwang et al., 2006; Wu et al., 2020b; Walls et al., 2019; Pinto et al., 2020 PDB 6W41, 6WAQ, 2DD8, 3BGF, 2GHW, 7BZ5, 6NB6, 6NB7, 6WPS

Experimental Models: Cell Lines

Saccharomyces cerevisiae strain AWY101 Wentz and Shusta 2007 AWY101
Human Embryonic Kidney cells (HEK293T) ATCC ATCC CRL-3216
Human Embryonic Kidney cells expressing human ACE2 (HEK293T-hACE2) BEI BEI NR-52511
Expi293F cells Thermo Fisher Cat# A14527
FreeStyle 293F cells Thermo Fisher Cat# R79007

Oligonucleotides

primers for RBD library mutagenesis Integrated DNA Technologies sequences given at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/primers/mutational_lib/SARS-CoV-2_RBD_NNSprimers.txt
primers for RBD library construction and Illumina sequencing Integrated DNA Technologies sequences given at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/primers/primers.csv

Recombinant DNA

plasmid 2649: pETcon_SARS-CoV-2_RBD This paper sequence at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/plasmid_maps/2649_pETcon-SARS-CoV-2-RBD-201aa.gb
plasmid 2736: HDM_IDTSpike_EcoKozak This paper sequence at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/plasmid_maps/2736_HDM_IDTSpike_EcoKozak.gb
pHAGE2-CMV-ZsGreen-W BEI BEI Resources NR-52520
HDM-Hgpm2 BEI BEI Resources NR-52517
pRC-CMV-Rev1b BEI BEI Resources NR-52519
HDM-tat1b BEI BEI Resources NR-52518

Software and Algorithms

ccs, version 4.2.0 Pacific Biosciences https://github.com/PacificBiosciences/ccs
alignparse, version 0.1.3 Crawford and Bloom, 2019 https://github.com/jbloomlab/alignparse
minimap2, version 2.17 Li 2018 https://github.com/lh3/minimap2
dms_variants, version 0.6.0 GitHub https://jbloomlab.github.io/dms_variants/
custom code This paper all analyses provided on github: https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS

Other

SARS-CoV-2 RBD mutant libraries This paper N/A

Resource Availability

Lead Contact

Further information and requests for reagents and resources should be directed to and will be fulfilled by the Lead Contact, Jesse Bloom (jbloom@fredhutch.org).

Materials Availability

SARS-CoV-2 mutant libraries generated in this study will be made available on request by the Lead Contact with a completed Materials Transfer Agreement.

Data and Code Availability

We provide all data and code in the following ways:

Experimental Model and Subject Details

Saccharomyces cerevisiae strain AWY101 (Wentz and Shusta, 2007) was cultured at 30°C (except where indicated) in baffled flasks while shaking at 275 rpm. Selective media contained 6.7 g/L Yeast Nitrogen Base, 5.0 g/L Casamino acids, 1.065 g/L MES, and 2% w/v carbon source (dextrose for routine maintenance, galactose supplemented with 0.1% dextrose for RBD induction). HEK293T cells (ATCC CRL-3216) were cultured in D10 growth media (DMEM with 10% heat-inactivated FBS, 2 mM l-glutamine, 100 U/mL penicillin, and 100 μg/mL streptomycin) at 37°C in a humidified 5% CO2 incubator. Expi293F (Thermo Fisher Cat No. A14527) and FreeStyle 293F (Thermo Fisher Cat No. R79007) suspension cells were grown at 37°C in a humidified 8% CO2 incubator rotating at 130 rpm. Cell lines were not authenticated.

Method Details

RBD cloning

The Spike receptor binding domain (RBD) from SARS-CoV-2 (isolate Wuhan-Hu-1, GenBank: MN908947, residues N331-T531) and additional sarbecovirus homologs (RaTG13, GenBank: MN996532; GD-Pangolin consensus from Lam et al. (2020); SARS-CoV-1 Urbani, GenBank: AY278741; WIV1, GenBank: KF367457 (identical RBD sequence to WIV16); LYRa11, GenBank: KF569996; Rp3, GenBank: DQ071615; HKU3-1, GenBank: DQ022305; Rf1, GenBank: DQ412042; ZXC21, GenBank: MG772934; ZC45, GenBank: MG772933; and BM48-31, GenBank: NC014470) were ordered as yeast codon-optimized gBlocks (IDT) and cloned into the pETcon yeast surface-display expression vector. The destination vector was modified downstream of the yeast surface-display fusion construct to include a barcode landing pad for subsequent library generation, along with Illumina sequencing priming handles for downstream barcode sequencing and NotI digestion sites for downstream PacBio sequencing preparation. This plasmid sequence is provided on GitHub at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/plasmid_maps/2649_pETcon-SARS-CoV-2-RBD-201aa.gb.

Isogenic yeast display induction and titration

RBD variant plasmids were transformed into the AWY101 Saccharomyces cerevisiae strain (Wentz and Shusta, 2007), selecting for the plasmid Trp auxotrophic marker on SD-CAA selective plates (6.7g/L Yeast Nitrogen Base, 5.0g/L Casamino acids, 1.065 g/L MES acid, and 2% w/v dextrose). Single colonies were inoculated into 1.5mL liquid SD-CAA media, and grown overnight at 30°C. Then 1 OD unit of yeast were back-diluted into 1.5mL SG-CAA+0.1%D induction media (2% w/v galactose supplemented with 0.1% dextrose), and incubated for 16-18 hours at room temperature.

Induced cells were spun down at 250,000 cells per sample and washed in PBS-BSA (0.2 mg/mL). Samples were resuspended in primary labeling solutions across a range of concentrations of biotinylated human ACE2 ectodomain (ACROBiosystems AC2-H82E6), which contains its natural dimerization domain. Primary labeling reactions were conducted in sufficient reaction volumes for each concentration to avoid ligand depletion effects of greater than 10%. For instance, the lowest sample concentration of 10^−13 M was scaled to 25 mL, at which volume 2.9% of total ligand molecules are estimated to be titrated in RBD:ACE2 complexes given the wild-type KD,app and an estimated 50,000 surface RBDs per cell (Boder and Wittrup, 1997). Following overnight equilibration of ACE2 binding at room temperature, cells were washed in ice-cold PBS-BSA, and resuspended in PBS-BSA containing 1:200 diluted FITC-conjugated anti-c-Myc antibody (Immunology Consultants Lab, CMYC-45F) to label for RBD surface expression via a C-terminal c-Myc epitope tag, and 1:200 diluted PE-conjugated streptavidin (Thermo Fisher S866) to detect bound biotinylated ACE2 ligand. Following 1 hour of secondary labeling at 4°C, cells were washed twice in ice-cold PBS-BSA, and resuspended in PBS.
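
The ligand-depletion check described in this paragraph can be sketched numerically. The following is a minimal illustration rather than the authors' calculation: it assumes simple 1:1 equilibrium binding, and the function name and the example KD value are hypothetical placeholders (the measured wild-type KD,app should be substituted).

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def fraction_ligand_bound(ligand_m, volume_l, n_cells, rbd_per_cell, kd_m):
    """Fraction of total ligand held in RBD:ACE2 complexes at equilibrium.

    Assumes simple 1:1 binding and solves the exact quadratic, so the result
    remains valid when displayed RBD is in excess of ligand.
    """
    receptor_m = n_cells * rbd_per_cell / AVOGADRO / volume_l
    total = ligand_m + receptor_m + kd_m
    # Numerically stable root of C^2 - total*C + ligand*receptor = 0
    complex_m = 2 * ligand_m * receptor_m / (
        total + math.sqrt(total**2 - 4 * ligand_m * receptor_m)
    )
    return complex_m / ligand_m


# Values from the text: 250,000 cells, 50,000 RBDs per cell, 1e-13 M ACE2 in 25 mL.
# kd_m is a hypothetical placeholder, not a value stated in this section.
depletion = fraction_ligand_bound(1e-13, 0.025, 250_000, 50_000, kd_m=4e-11)
print(f"fraction of ligand bound: {depletion:.3f}")
```

Under these assumptions, shrinking the labeling volume raises the effective RBD concentration and therefore the depleted fraction, which is why the lowest concentrations require the largest volumes.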

RBD surface expression and ACE2-binding levels were determined via flow cytometry using a BD LSRFortessa X-50. For flow cytometry, 10,000 cells were analyzed at each ACE2 concentration across a titration series. Cells were gated to select for singleton events, FITC labeling was used to subset RBD+ cells, and PE labeling was measured within this FITC+ population. To mimic the subsequent library sorting experiments in which we are blinded to exact PE fluorescence within a given PE fluorescence bin (since we only sequence barcodes within a bin), we analyzed isogenic titration data by drawing equivalent bins of PE fluorescence that capture 95% of unbound unmutated SARS-CoV-2 cells (bin1), 95% of saturated SARS-CoV-2 cells (bin4), and a bin2/bin3 boundary evenly spaced on the log-scale between the boundaries of the bin1 and bin4 partitions (see Figure 2B). For each ACE2 concentration, we determine the mean bin of PE fluorescence as a simple weighted mean value across integer-weighted bins:

$$\mathrm{bin}_{[\mathrm{ACE2}]} = \frac{\sum_{i=1}^{4} n_{i,[\mathrm{ACE2}]}\, i}{\sum_{i=1}^{4} n_{i,[\mathrm{ACE2}]}}$$

where $n_{i,[\mathrm{ACE2}]}$ is the number of cells that fall into bin $i$ at a given ACE2 concentration, and $i$ is the simple integer value of a bin from 1 to 4.
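
As a small illustration of the mean-bin calculation defined above, the following sketch computes the weighted mean from a list of per-bin cell counts. The function name is hypothetical and the counts in the example are made up.

```python
def mean_bin(cell_counts):
    """Weighted mean bin for cell counts ordered from bin 1 to bin 4."""
    total = sum(cell_counts)
    if total == 0:
        raise ValueError("no cells observed at this concentration")
    # Each bin contributes its integer index (1..4) weighted by its cell count.
    return sum(i * n for i, n in enumerate(cell_counts, start=1)) / total


# Made-up counts for bins 1-4 at a single ACE2 concentration.
example = mean_bin([120, 300, 450, 130])
print(example)
```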

We determined the binding constant KD,app describing the affinity of each RBD variant for human ACE2 ligand along with free parameters a (titration response range) and b (titration curve baseline) via nonlinear least-squares regression using a standard non-cooperative Hill equation relating the mean bin response variable to the ACE2 labeling concentration:

$$\mathrm{bin}_{[\mathrm{ACE2}]} = a \cdot \frac{[\mathrm{ACE2}]}{[\mathrm{ACE2}] + K_{D,\mathrm{app}}} + b$$

We report apparent KD values (KD,app) that do not take into account the stoichiometry of the multivalent yeast-displayed RBD interaction with dimeric ACE2. Following this “apparent” nomenclature, we report ACE2 concentrations as molarity of the monomeric subunit. Computational notebooks detailing the fits of all isogenic RBD titrations are provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/isogenic_titrations/homolog_validations.md and https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/isogenic_titrations/point-mut-validations.md).
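
The Hill-equation fit described above can be sketched with SciPy. This is a minimal illustration, not the code in the linked notebooks: fitting KD,app on a log10 scale and the data-driven starting guesses are assumptions made here for numerical convenience, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def hill(conc, a, b, log10_kd):
    """Non-cooperative Hill curve: mean bin as a function of ACE2 molarity."""
    return a * conc / (conc + 10.0 ** log10_kd) + b


def fit_kd_app(conc, mean_bins):
    """Least-squares fit returning (a, b, KD,app); conc must be positive."""
    conc = np.asarray(conc, dtype=float)
    mean_bins = np.asarray(mean_bins, dtype=float)
    # Starting guesses: baseline, response range, and the concentration
    # whose response is closest to the half-maximal value.
    b0 = mean_bins.min()
    a0 = mean_bins.max() - b0
    kd0 = conc[np.argmin(np.abs(mean_bins - (b0 + a0 / 2)))]
    popt, _ = curve_fit(hill, conc, mean_bins, p0=(a0, b0, np.log10(kd0)))
    a, b, log10_kd = popt
    return a, b, 10.0 ** log10_kd


# Synthetic, noise-free titration generated from known parameters.
conc = np.logspace(-13, -6, 15)
a_fit, b_fit, kd_fit = fit_kd_app(conc, hill(conc, 2.5, 1.1, -10.0))
print(a_fit, b_fit, kd_fit)
```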

Library mutagenesis

Mutagenesis of the SARS-CoV-2 RBD was performed in two independent replicates via the method described in Bloom (2014) with the modification that primer lengths were adjusted to ensure equal melting temperatures as described in Dingens et al. (2017) and we used NNS rather than NNN primers. Our general library generation and sequencing workflow is outlined in Figure S1A. Briefly, we designed mutagenic primers containing degenerate NNS codons that tile across the SARS-CoV-2 RBD, which were ordered as oPools from Integrated DNA Technologies. The script used to design the mutagenic primers and the resulting primer sequences are available at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/tree/master/data/primers/mutational_lib. We conducted three rounds of mutagenesis, each consisting of 7 mutagenic PCR cycles and 20 joining PCR cycles. The final joined products were amplified for 10 cycles with primers that append a unique identifier N16 barcode sequence to the 3′ end of each mutagenized insert, downstream of the RBD stop codon and mRNA 3′ UTR. Barcodes were also PCR appended to the un-mutagenized RBD homologs via the same primer addition PCR. Primers used in library assembly are provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/tree/master/data/primers).
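
To illustrate what the choice of NNS over NNN codons implies, the short sketch below enumerates the NNS codon set (S = G or C) under the standard genetic code. It is an independent illustration and is not the primer-design script linked above; the variable names are chosen here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNS: any base at positions 1 and 2, G or C at position 3.
nns_codons = [codon for codon in CODON_TABLE if codon[2] in "GC"]
encoded_amino_acids = {CODON_TABLE[c] for c in nns_codons} - {"*"}
nns_stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]

print(len(nns_codons), len(encoded_amino_acids), nns_stop_codons)
```

The NNS set halves the codon space while still encoding all 20 amino acids, and it retains only one of the three stop codons.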

Mutagenized SARS-CoV-2 libraries and pooled wild-type homolog RBDs were cloned into EcoRI-HF/SacI-HF digested pETcon 2649 vector (sequence linked above) using NEBuilder HiFi DNA Assembly (NEB E2621). Assembled products were Ampure purified and electroporated into electrocompetent NEB10-beta cells. Electroporated cells were plated on 15cm LB+ampicillin plates at an estimated bottleneck of 100,000 (SARS-CoV-2 mutant libraries) or 1,000 (pooled RBD homologs) colony forming units to limit library size. After approximately 18 hours of outgrowth, colonies were scraped into liquid LB+ampicillin, and grown for 2.5 hours in liquid culture prior to plasmid purification.

Plasmid pools were transformed into the AWY101 strain of Saccharomyces cerevisiae via the protocol of Gietz and Schiestl (2007). SARS-CoV-2 mutant libraries were transformed at 50 μg scale and the pooled RBD homolog controls were transformed at 10 μg scale. Colony forming unit counts from plated serial dilutions indicated transformation yields of > 1 million cfus. Transformed yeast grew for 14 hours post-transformation in 100 mL selective SD-CAA media, and were subsequently back-diluted into 100 mL fresh SD-CAA at 1 OD600 for an additional 9 hours passage, to enable further resolution of multiple vector transformants (Scanlon et al., 2009). Transformed yeast libraries were flash frozen in 1e8 cfu aliquots and stored at −80°C.

PacBio library sequencing and analysis

PacBio sequencing was used to acquire long sequence reads spanning the N16 barcode and the RBD gene sequence. PacBio sequencing inserts were prepared from bacterially-purified plasmid pools via NotI-HF restriction digest followed by gel purification and SMRTbell ligation. The use of restriction digest rather than PCR eliminates the possibility of PCR strand exchange scrambling barcodes. Each SARS-CoV-2 RBD mutant library was spiked to 1% frequency with the internal standard pool of RBD homologs. Each replicate library was sequenced in two SMRT Cells on a PacBio Sequel using 20-hour movie collection times. PacBio circular consensus sequences (CCSs) were generated from the raw subreads using the ccs program (https://github.com/PacificBiosciences/ccs, version 4.2.0), setting the parameters to require 99.9% accuracy and a minimum of 3 passes. The resulting CCSs are available on the NCBI Sequence Read Archive at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA639956.

We then processed the CCSs to identify the RBD sequence (SARS-CoV-2 or one of the 11 homologs), call any mutations in the RBD sequence, and determine the associated 16-nucleotide barcode. To do this, we used alignparse (Crawford and Bloom, 2019), version 0.1.3, which in turn makes use of minimap2 (Li, 2018), version 2.17. We only retained CCSs that matched the parental RBD sequence with no more than 45 nucleotide mutations (corresponding to up to 15 codon mutations), had a barcode of the expected 16 nucleotide length, and had no more than one mismatch in the flanking regions expected in the sequenced amplicon. A computational notebook providing full details is available on GitHub at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/process_ccs.md.

We next used these processed CCSs to generate a codon-variant lookup table that links each barcode to its associated codon mutations in the RBD sequence. To do this, we first filtered only for CCSs where the PacBio ccs-reported accuracy was at least 99.99% in both the RBD gene sequence and the barcode (the vast majority of CCSs passed this filter). We then determined the empirical accuracy of the CCSs by determining the concordance between the RBD gene sequence called by CCSs with the same barcode using the method implemented at https://jbloomlab.github.io/alignparse/alignparse.consensus.html#alignparse.consensus.empirical_accuracy. For both libraries, the empirical accuracy of the entire region of the CCS covering the RBD sequence was 99.8% if we ignored those with indels (Figure S1B). Most barcodes were covered by multiple CCSs (Figure S1B), and in that case we built a consensus of these CCSs after discarding any barcodes for which the CCSs differed often or at many sites using the method implemented at https://jbloomlab.github.io/alignparse/alignparse.consensus.html#alignparse.consensus.simple_mutconsensus. Finally, we discarded any variants with indels in the RBD. Therefore, more than 99.8% of the final barcode-linked variants should have the correctly determined RBD sequence, since 99.8% is the accuracy for those covered by just one CCS and most variants were called by the consensus of multiple CCSs. For further analysis of the barcoded variants, we then created a codon variant table using dms_variants (https://jbloomlab.github.io/dms_variants/, version 0.6.0). The final barcode-variant lookup table (which associates each barcode with its RBD sequence) is at https://media.githubusercontent.com/media/jbloomlab/SARS-CoV-2-RBD_DMS/master/results/variants/codon_variant_table.csv. 
Some summary statistics about the final composition of the libraries are in Figure S1, and the complete code used to generate the barcode-variant lookup table and many additional plots characterizing the composition of the libraries are on GitHub at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/build_variants.md.

Deep mutational scanning library yeast surface-display induction and labeling

Yeast libraries were thawed and grown overnight at 30°C in 180mL SD-CAA media at an initial OD600 of 0.1. We spiked our SARS-CoV-2 mutant libraries with the barcoded RBD homolog pool at a total fraction of 0.6% yeast density, such that each RBD homolog barcode should be present at a frequency on the same order of magnitude as the typical SARS-CoV-2 variant barcode. To induce RBD surface expression, yeast were back-diluted to 50mL (expression experiments) or 200mL (binding experiments) SG-CAA+0.1%D induction media at 0.67 OD600 and incubated at room temperature for 16-18 hours with mild agitation.

For library expression experiments, 45 OD units yeast were washed twice with PBS-BSA and labeled in 3 mL 1:100 diluted anti-Myc-FITC antibody for 1 hr at 4°C with gentle mixing. Labeled cells were washed twice in PBS-BSA and resuspended in 5 mL PBS for FACS. For library binding experiments, 8 OD units yeast per titration concentration (10^−13 M to 10^−6 M ACE2 at half-log intervals, and a 0 M ACE2 sample) were washed twice with PBS-BSA, and incubated with ACE2 ligand overnight at room temperature with gentle agitation. Labeling volumes were scaled at low ACE2 concentration to limit ligand depletion effects, as with isogenic titrations described above. Following equilibration of ACE2 labeling, cells were kept chilled while washing once with PBS-BSA, labeling for one hour in 1 mL PBS-BSA with 1:100 diluted Myc-FITC and 1:200 Streptavidin-PE, washed two more times with PBS-BSA, and resuspended to 1 mL in PBS.

Fluorescence-activated cell sorting (FACS) of yeast libraries

Yeast libraries were sorted into bins of FITC or PE fluorescence using a BD FACS Aria II. Cells were sorted into 5mL FACS tubes containing 1mL of 2xYPAD supplemented with 1% BSA. Tubes were pre-wet with collection media prior to sample collection, to reduce sticking and improve post-sort yield.

For expression sorts, cells were gated for singleton events (Figure S2A), followed by partitioning into four bins of FITC fluorescence (Figure 2A): bin 1 captures 99% of unstained cells, and bins 2-4 split the remaining library fraction into tertiles. We sorted > 50 million cells from each library into these bins. From these same inductions, we also sorted 15 million RBD+ cells from each library (P4 population, Figure S2A), to enrich RBD-expressing cells within our libraries for our titration sorting experiments.

For ACE2-binding titrations, we gated cells for singleton events and RBD+ expression (Figure S2B). For each ACE2 concentration sample, we sorted cells into four bins of PE fluorescence as described above: bin1 captures 95% of unmutated SARS-CoV-2 cells incubated with 0M ACE2, bin4 captures 95% of unmutated cells at saturating ACE2 ligand, and the bin2/bin3 boundary evenly splits the log-MFI scale between the bin1 and bin4 boundaries (Figure 2B). We sorted each ACE2 concentration sample into these four bins for approximately 15 minutes, capturing 5-6 million cells per ACE2 concentration.
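
The bin boundaries described above can be sketched as follows. This is one reading of the gating scheme and not the sorter configuration itself: it assumes that the bin1 upper edge sits at the 95th percentile of unbound control cells, that the bin4 lower edge sits at the 5th percentile of saturated control cells, and that the bin2/bin3 boundary is their geometric mean (the midpoint on a log scale). The function names are hypothetical.

```python
import numpy as np


def titration_bin_edges(unbound_pe, saturated_pe):
    """Return the three PE boundaries separating bins 1|2, 2|3, and 3|4."""
    lower = np.percentile(unbound_pe, 95)   # 95% of unbound cells fall below
    upper = np.percentile(saturated_pe, 5)  # 95% of saturated cells fall above
    middle = np.sqrt(lower * upper)         # midpoint on the log scale
    return np.array([lower, middle, upper])


def assign_bin(pe_value, edges):
    """Map a PE fluorescence value to an integer bin from 1 to 4."""
    return int(np.searchsorted(edges, pe_value, side="right")) + 1


# Made-up control distributions, for illustration only.
edges = titration_bin_edges(np.arange(1, 101), np.arange(1000, 1100))
print(edges, assign_bin(200.0, edges))
```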

Following each sort, cells from each collection tube were spun for 5 min at 3,000 g in a tabletop centrifuge, yielding a visible pellet for any sample with at least ∼500,000 collected cells. Collection supernatant was removed, and cells were resuspended in SD-CAA media supplemented with 1:100 penicillin-streptomycin. Cells were resuspended to an estimated 2e6 cells/mL in 15 mL culture tubes or baffled flasks for expression post-sort samples, 5e5 cells/mL in baffled flasks for RBD+ sort samples, and 1 mL (< 1e6 cells) or 1.5 mL (> 1e6 cells) in 96-deep-well plates for titration samples. For expression FACS experiments, total cell recovery from all samples was measured via serial dilution and plating on YPD and SD-CAA plates for each sample, which showed average cellular recovery of 85% (range 79%–94%), with 62% (range 52%–77%) of cells retaining plasmid, with the exception of the FITC-negative bin 1 populations, which showed 20% plasmid retention. These per-sample cell recovery counts were used to calibrate downstream sequencing numbers for the actual number of cells that grew out from each sort bin. For titration sorts, we did not titer all 64 post-sort samples, but instead spot checked 6 samples to ensure normal levels of cell recovery, which showed an average 66% cell recovery and 46% plasmid retention. As we did not titer all samples, we used the cell count from the FACS log as the estimate of the number of cells collected in each bin. This assumes that there are no systematic differences in post-sort cell yield across bins, an assumption that is more appropriate for these titration sorts, where the ACE2-binding gates are nested within an overall RBD+ selection gate that selects for even plasmid retention (Figure S2B).

Post-sort samples were grown overnight in liquid media at 30°C. Plasmids were purified from post-sort yeast samples of < 4e7 cfu using Zymo Yeast Miniprep kits (single column or 96-well plate formats) according to kit instructions, but with the addition of > 2 hours Zymolyase treatment and a −80°C freeze/thaw cycle prior to cell lysis.

Illumina Sequencing

Post-sort plasmid samples were PCR amplified from 10 μL plasmid template input using primers flanking the N16 barcode that append remaining Illumina sequencing handles that are not already plasmid encoded, and unique NextFlex sample indices (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/tree/master/data/primers). PCRs were conducted with KOD polymerase for 20 cycles, except for titration sort samples of less than 10,000 cells, where 28 cycles were necessary to obtain sufficient PCR product due to low sample input:

  1. 95°C, 2 min
  2. 95°C, 20 s
  3. 58°C, 10 s
  4. 70°C, 10 s
  5. Return to step 2, 19x (27x for low-input samples)

PCR products were Ampure purified, quantified via PicoGreen, and pooled to mirror desired sample frequencies given cell counts in each FACS sample. Pooled samples were gel purified, Ampure purified, and submitted for 2 lanes of 50bp single end Illumina HiSeq sequencing per library.

Demultiplexed reads were aligned to library barcodes determined from PacBio sequencing, yielding a count of the number of times each library barcode was sequenced within each FACS partition. Read counts for each FACS sample were downweighted by the ratio of total reads from a bin compared to the number of cells that were actually sorted into that bin. For one bin in which the number of HiSeq reads was less than the number of cells sorted into a bin, we re-amplified PCR product from a newly purified plasmid aliquot, and obtained reads via a single lane of MiSeq 50bp single end sequencing. Computational notebooks providing additional details on our Illumina sequencing processing and statistics are provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/count_variants.md and https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/analyze_counts.md).
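
The read-count downweighting described above can be sketched as a simple rescaling. This is a minimal illustration with a hypothetical function name and made-up barcodes; the linked notebooks contain the actual processing.

```python
def reads_to_cell_counts(barcode_reads, cells_sorted):
    """Rescale per-barcode read counts so they sum to the cells sorted into a bin."""
    total_reads = sum(barcode_reads.values())
    if total_reads == 0:
        raise ValueError("no reads observed for this FACS sample")
    return {
        barcode: n_reads * cells_sorted / total_reads
        for barcode, n_reads in barcode_reads.items()
    }


# Made-up example: 400 reads from a bin into which 40 cells were sorted.
scaled = reads_to_cell_counts({"AAAACCCCGGGGTTTT": 300, "TTTTGGGGCCCCAAAA": 100}, 40)
print(scaled)
```

Rescaling in this way keeps deeply sequenced bins from appearing to contain more independent observations than the number of cells actually collected.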

Calculating variant phenotypes for expression

For each library variant, we estimated mean expression based on its distribution of cell counts across FITC sort bins and the known censored fluorescence boundaries of each sort bin using a maximum likelihood approach (Peterman and Levine, 2016), enacted in the fitdistrplus R package (Delignette-Muller and Dutang, 2015), assuming the uncensored log-transformed fluorescence values for a genotype follow a normal distribution. Expression measurements were retained for barcodes for which at least 20 cells were sampled across the four sort bins, resulting in measured expression phenotypes for 92.9 and 90.5% of variants in libraries 1 and 2, respectively.
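
The interval-censored maximum likelihood estimate described above can be sketched in Python. The authors used the fitdistrplus R package; the version below swaps in SciPy and is only an illustration of the idea, with a hypothetical function name and synthetic bin edges and counts.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_censored_normal(counts, log_edges):
    """ML estimate of (mean, sd) of log fluorescence from binned cell counts.

    counts: cells observed in each of the len(log_edges) + 1 sort bins.
    log_edges: increasing log-fluorescence boundaries between adjacent bins.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(log_edges, dtype=float)
    lower = np.concatenate(([-np.inf], edges))
    upper = np.concatenate((edges, [np.inf]))

    def neg_log_lik(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)  # optimizing log(sigma) keeps sigma positive
        # Probability mass of the normal distribution falling inside each bin.
        prob = norm.cdf(upper, mu, sigma) - norm.cdf(lower, mu, sigma)
        return -np.sum(counts * np.log(np.clip(prob, 1e-300, None)))

    result = minimize(neg_log_lik, [edges.mean(), 0.0], method="Nelder-Mead")
    mu_hat, log_sigma_hat = result.x
    return float(mu_hat), float(np.exp(log_sigma_hat))


# Synthetic check: expected counts from a known normal distribution.
edges = np.array([8.0, 9.0, 10.0])
true_prob = np.diff(np.concatenate(([0.0], norm.cdf(edges, 9.4, 0.8), [1.0])))
mu_hat, sd_hat = fit_censored_normal(10_000 * true_prob, edges)
print(mu_hat, sd_hat)
```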

Expression measurements were represented as the difference in log-mean fluorescence intensity (MFI) relative to wild-type (Δlog(MFI) = log(MFI)_variant − log(MFI)_wild-type), such that a positive value indicates higher RBD expression. A very small fraction of wild-type and synonymous barcodes were ascribed non-fluorescing phenotypes, likely reflecting expression-abolishing mutations that occurred outside of the PacBio sequencing window. These variants were selected out prior to titration measurements by our RBD+ pre-sort, but remain in the expression measurements. To avoid artificially depressing the wild-type SARS-CoV-2 expression measurement and therefore miscalibrating this Δlog(MFI) scale, potentially annotating slightly deleterious mutational effects as beneficial, we computed the mean wild-type expression excluding these outliers (logMFI < 10.2 or 10.1 in lib1 and lib2, respectively). We note that we are unable to do the same for any library mutants for which we observe non-fluorescence, because we are unable to a priori determine whether a lack of expression is due to the library mutation versus external, unobserved factors. This uncertainty makes our calling of expression-enhancing mutations conservative, as mutational effects, if biased by these outliers, will tend to be pulled slightly down in their measurement. The global epistasis approach we explain below can mitigate the influence of these outlier observations on our final estimates of mutational effects. A computational notebook presenting our calculation of expression phenotypes and results is included on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/compute_expression_meanF.md).

Calculating variant phenotypes for ACE2-binding affinity

For each library barcode at each ACE2 sample concentration, we determined its simple mean bin of ACE2-binding via the equation used above in isogenic titrations. We fit titration curves as above to determine barcode-specific KD,app from the series of FACS-seq derived mean bin measurements across ACE2 concentration (Figure S2D). Because a barcode’s mean bin might be measured with varying certainty across different ACE2 concentrations, we used weighted least-squares nonlinear regression, weighting each mean bin estimate by an empirical estimate of variance based on the per-sample cell count, derived from estimates of variance in repeated wild-type/synonymous barcode measurements grouped by sampling depth (Figure S2C, right panels). To avoid fits of errant titration curves, we constrained the baseline parameter b to be fit between 1 and 1.5, and the response parameter a to be fit between 1.5 and 3. Through initial curve fit constraints and subsequent QC filtering, our fit KD,app binding constants were constrained to be within the concentration range of our titration (10^−13 to 10^−6 M), and therefore many barcodes are censored at the upper limit with true KD,app ≥ 10^−6 M. We filtered out titration curve fits for variants with an average cell count < 5 across sample concentrations, or with cell count < 2 in 7 or more of the 16 ACE2 concentration samples. Finally, we filtered out the 5% of curves with the highest normalized mean square residual, where residuals are normalized from 0 to 1 by the fit response parameter a, such that titration curves that plateau at lower levels of saturated binding do not have systematically smaller mean square residuals. This process yielded KD,app estimates for 75.2 and 75.4% of variants in libraries 1 and 2, respectively.
Binding measurements were represented as the difference in log10(KD,app) relative to wild-type (Δlog10(KD,app) = log10(KD,app)_wild-type − log10(KD,app)_variant), polarized such that a positive value indicates higher variant ACE2 affinity. A computational notebook presenting our calculation of binding phenotypes and results is included on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/compute_binding_Kd.md).
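
The weighted, bounded curve fit and the Δlog10(KD,app) polarization described in this section can be sketched with SciPy. This is a hedged illustration rather than the authors' code: fitting KD,app on a log10 scale, the starting guesses, and the function names are assumptions made here, while the parameter bounds follow the text.

```python
import numpy as np
from scipy.optimize import curve_fit


def hill(conc, a, b, log10_kd):
    """Non-cooperative Hill curve: mean bin as a function of ACE2 molarity."""
    return a * conc / (conc + 10.0 ** log10_kd) + b


def fit_weighted_titration(conc, mean_bins, variances):
    """Weighted least-squares fit returning (a, b, log10 KD,app)."""
    sigma = np.sqrt(np.asarray(variances, dtype=float))
    # Bounds from the text: a in [1.5, 3], b in [1, 1.5], KD,app in [1e-13, 1e-6] M.
    lower = (1.5, 1.0, -13.0)
    upper = (3.0, 1.5, -6.0)
    popt, _ = curve_fit(
        hill, conc, mean_bins, p0=(2.25, 1.25, -9.5),
        sigma=sigma, absolute_sigma=True, bounds=(lower, upper),
    )
    return tuple(float(p) for p in popt)


def delta_log10_kd(log10_kd_variant, log10_kd_wildtype):
    """Positive values indicate higher ACE2 affinity than wild-type."""
    return log10_kd_wildtype - log10_kd_variant


# Synthetic titration: a 0 M sample plus 15 half-log concentrations.
conc = np.concatenate(([0.0], np.logspace(-13, -6, 15)))
a_fit, b_fit, log10_kd_fit = fit_weighted_titration(
    conc, hill(conc, 2.6, 1.1, -10.3), np.full(conc.size, 0.01)
)
print(a_fit, b_fit, log10_kd_fit)
```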

Decomposing single-mutant effects from multiple-mutant genotypes

Barcodes in our libraries contain a Poisson-distributed number of mutations (Figure S1C). Though most mutations are sampled in at least one barcode as a unique single mutant (Figure S1E), most library genotypes contain multiple amino acid mutations, and some amino acid mutations are only sampled on many of these multiple-mutant backgrounds. Therefore, we used global epistasis models (Otwinowski et al., 2018) to decompose single-mutation effects from the set of all single- and multi-mutant backgrounds (Figures S2E–S2K). Briefly, we fit regression models that represent the phenotype of each library variant as a sum of latent-scale effects of all component amino acid mutations, which are transformed by a flexible nonlinear curve to the observed experimental scale; the shape of the nonlinear curve and the single-mutant effect terms are fit simultaneously to all of the data. For variance estimates on each library variant, we used the standard error of the estimate on KD,app to estimate a variance for our per-variant binding measurements; for expression, we calculated empirical estimates of variance as a function of cell count, based on binning replicate wild-type/synonymous mutant barcodes present in the library across bins of sampling depth (Figure S2C, left panels). Our analysis, implemented in the dms_variants package (see https://jbloomlab.github.io/dms_variants/dms_variants.globalepistasis.html), is as described by Otwinowski et al., except we used a Cauchy likelihood model to relate observed measurements to the global epistasis modeled phenotype, which should be more tolerant of outliers than the Gaussian likelihood used by Otwinowski et al., and we transformed our single-mutant effect latent-scale coefficients back to the experimentally measured observed scale, to facilitate comparison with additional measurements made on this scale such as the RBD homologs spiked into each library.
Computational notebooks detailing the global epistasis fits are provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/global_epistasis_expression.md and https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/global_epistasis_binding.md).
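
The ingredients of a global epistasis model with a Cauchy likelihood can be sketched as below. This is not the dms_variants implementation: the sigmoid is a hypothetical stand-in for the flexible nonlinear curve, combining the Cauchy scale with the per-variant variance as sqrt(gamma^2 + variance) is an assumption of this sketch, and the mutation labels and effect sizes are made up. No fitting is performed; the block only evaluates the model and its likelihood.

```python
import numpy as np


def latent_phenotype(mutations, effects, latent_wildtype=0.0):
    """Additive latent phenotype: wild-type value plus per-mutation effects."""
    return latent_wildtype + sum(effects[m] for m in mutations)


def sigmoid_nonlinearity(latent, floor, ceiling):
    """Hypothetical monotonic map from the latent to the observed scale."""
    return floor + (ceiling - floor) / (1.0 + np.exp(-latent))


def cauchy_neg_log_lik(observed, predicted, variances, gamma):
    """Negative log-likelihood of observations under a Cauchy error model."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    scale = np.sqrt(gamma**2 + np.asarray(variances, dtype=float))
    z = (observed - predicted) / scale
    # Cauchy density is 1 / (pi * scale * (1 + z^2)).
    return float(np.sum(np.log(np.pi * scale) + np.log1p(z**2)))


# Made-up effects for two hypothetical mutations.
effects = {"mutA": -1.0, "mutB": -0.5}
latent = latent_phenotype(["mutA", "mutB"], effects)
predicted = sigmoid_nonlinearity(latent, floor=-4.0, ceiling=0.0)
print(latent, predicted, cauchy_neg_log_lik([-3.0], [predicted], [0.04], gamma=0.5))
```

Because the Cauchy penalty grows only logarithmically with the residual, a single errant barcode pulls the fitted effects far less than it would under a Gaussian likelihood.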

For our binding titration measurements, directly measured single-mutant phenotypes correlated extremely well between replicates (R² = 0.97, Figure S2I), and this correlation was not further improved by the global epistasis decomposition (Figure S2J); therefore, we retained all directly measured single-mutant effects, and only used global epistasis decomposition to interpolate the 14% of single mutants in each library that were not directly measured on any single-mutant backgrounds (which together comprise the measurements correlated in Figure 2F). It is important to note that the shape of global epistasis nonlinearity that was fit to the data disallows mutations from increasing affinity relative to wild-type (Figures S2H and S2K). This prevents us from ascribing affinity-enhancing effects to any of the mutations that we did not directly measure as single mutants (only 5.7% of mutants were not sampled as single mutants in either library), which we accept as an appropriately conservative approach.

In the case of our expression measurements, directly sampled single mutants correlated moderately well between replicates (R² = 0.88, Figure S2F), but this correlation was improved between the global epistasis estimates derived from each library (R² = 0.93, Figure 2E). This may be in part because the expression phenotype is a more widely distributed phenotype with smaller relative shifts in the mean caused by mutation, and because of the errant outliers that we could not account for as discussed above with regards to wild-type barcodes, such that measurements of mutational effects are improved when integrating across many different backgrounds instead of taking a single observed barcode at face value. Therefore, for expression phenotypes, we used the global epistasis estimates for all mutations. We filtered out four coefficients from library 1 and three from library 2 that had nonsensically high model estimates, likely due to partial collinearities among some low-coverage mutations. Our final binding and expression single-mutant phenotypes were determined from the average effect across the two independent library replicates. A computational notebook detailing the full derivation of our final single-mutant phenotypic scores for binding and expression is on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/single_mut_effects.md#assessing-global-epistasis-models-for-binding-data).

Data visualization

The interactive heatmap of mutational effects shown at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS/ (Data S1) was made using the altair Python package (VanderPlas et al., 2018).

For the logo plot representation of the data in Figure S3, the experimental measurements of Δlog(MFI) and Δlog10(KD,app) were converted to letter heights as follows. We first computed a Boltzmann-like weighting factor for each amino acid a at site r as $w_{r,a} = \exp(\alpha x_{r,a})$, where $x_{r,a}$ is the experimental measurement for the effect of the mutation of site r to amino acid a, in other words the Δlog(MFI) or Δlog10(KD,app) value. The $\alpha$ parameter is a temperature-like scaling factor, which was set to 1.4 for the binding values and chosen for the expression values so that the range of exponents for expression is the same as for binding. The letter heights were then computed by re-scaling the weighting factors at each site to sum to one, so that the letter height is $p_{r,a} = w_{r,a} / \sum_{a'} w_{r,a'}$. The logo plots themselves were rendered using Logomaker (Tareen and Kinney, 2020). The code that creates these logo plots is on GitHub at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/logoplots_of_muteffects.md.
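The letter-height transformation described above can be illustrated with a minimal Python sketch. It is not the code from the linked notebook; the function name `letter_heights` and the toy effect values are hypothetical, and only the exponential weighting and per-site normalization come from the text.

```python
import math

def letter_heights(effects, alpha):
    """Convert per-amino-acid mutational effects at one site to letter heights.

    effects: dict mapping amino acid -> measured effect x (hypothetical toy values here)
    alpha:   temperature-like scaling factor (1.4 for binding in the text)
    """
    # Boltzmann-like weight for each amino acid: w = exp(alpha * x)
    weights = {aa: math.exp(alpha * x) for aa, x in effects.items()}
    # Re-scale so the heights at this site sum to one
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}

# Toy example: three amino acids at one site, with made-up effect values
site_effects = {"A": 0.0, "C": -1.0, "D": -2.0}
heights = letter_heights(site_effects, alpha=1.4)
```

Because the heights are normalized per site, they convey the relative preference among amino acids at that site rather than the absolute magnitude of any effect.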

The interactive structure-based visualizations at https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS/structures were built using dms-view (Hilton et al., 2020). In these visualizations, the logo plot letter heights were computed as for Figure S3 (see paragraph immediately above). Number of effective amino acids was calculated as the exponentiated preferences. Mean, minimum, and maximum mutational effects per site were calculated from the set of Δlog(MFI) or Δlog10(KD,app) measurements of all missense mutations at a site.

Structural analyses

Structural analyses of the ACE2-bound SARS-CoV-2 and SARS-CoV-1 RBDs used the crystal structures from PDB: 6M0J (Lan et al., 2020) and PDB: 2AJF (Li et al., 2005a), respectively. ACE2 contacts were annotated as residues with any non-hydrogen atom within 4 Angstroms of any ACE2 residue. Solvent-accessible surface area was calculated from the 6M0J structure using dssp (Kabsch and Sander, 1983), with and without the ACE2 ligand present. Relative solvent accessibilities were determined by normalizing to the maximum theoretical solvent accessibility of a residue (Tien et al., 2013). Structural images were rendered in PyMol. Full analyses of our mutational measurements in the context of structural and evolutionary features are provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/structure_function.md and https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/sarbecovirus_diversity.md).
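As a rough illustration of the distance-cutoff contact annotation, the following Python sketch flags RBD residues that have a non-hydrogen atom within 4 Angstroms of any ACE2 atom. The atom-tuple layout, the function name `contact_residues`, and the toy coordinates are assumptions for illustration; the sketch also assumes that hydrogens are ignored on both chains and that the cutoff is inclusive, neither of which is specified in the text. It does not parse PDB files.

```python
import math

def contact_residues(rbd_atoms, ace2_atoms, cutoff=4.0):
    """Return the set of RBD residue numbers in contact with ACE2.

    Each atom is a tuple (residue_number, element, (x, y, z)); this layout
    is a hypothetical stand-in for atoms parsed from a structure file.
    """
    # Assumption: hydrogens are skipped on the ACE2 side as well
    ace2_heavy = [xyz for _res, element, xyz in ace2_atoms if element != "H"]
    contacts = set()
    for res, element, xyz in rbd_atoms:
        if element == "H":
            continue  # only non-hydrogen atoms count toward contacts
        # Assumption: "within" is treated as an inclusive cutoff
        if any(math.dist(xyz, other) <= cutoff for other in ace2_heavy):
            contacts.add(res)
    return contacts

# Toy coordinates, not taken from 6M0J
rbd = [(501, "N", (0.0, 0.0, 0.0)), (502, "C", (10.0, 0.0, 0.0)), (503, "H", (0.0, 0.0, 1.0))]
ace2 = [(353, "O", (0.0, 3.0, 0.0))]
contacts = contact_residues(rbd, ace2)
```

The same function could be reused for the epitope annotation by swapping in antibody atoms, although the 8 Angstrom Cɑ-based rule for backbone-only models would need a separate filter.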

Antibody epitopes were mapped from crystal structures PDB: 6W41 (Yuan et al., 2020b), PDB: 6WAQ (Wrapp et al., 2020b), PDB: 2DD8 (Prabakaran et al., 2006), PDB: 3BGF (Pak et al., 2009), PDB: 2GHW (Hwang et al., 2006), PDB: 7BZ5 (Wu et al., 2020b), and cryo-EM structures PDB: 6NB6, 6NB7 (Walls et al., 2019), and PDB: 6WPS (Pinto et al., 2020). RBD residues were annotated as being in an antibody epitope if any non-hydrogen atom was within 4 Angstroms of an antibody residue, with the exception of the backbone-only models of 6NB6 and 6NB7, where epitopes were defined as RBD residues with Cɑ within 8 Angstroms of any antibody residue. Our full analysis of mutational constraint in antibody epitopes is provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/antibody_epitopes.md).

Analysis of circulating variants

All 31,570 spike sequences on GISAID as of 27 May 2020 were downloaded and aligned via mafft (Katoh and Standley, 2013). Sequences from non-human origins and sequences containing any gap characters were removed. All amino-acid mutations among GISAID sequences were enumerated. Some low-coverage spike sequences contain undetermined ‘X’ characters. We excluded any mutation from our curated set of GISAID mutations if it was solely observed on sequence backgrounds containing at least one undetermined X character in the RBD sequence; however, sequences with X characters were allowed to contribute to observations of mutation count for mutations that were observed on at least one other high-coverage RBD sequence. To characterize patterns of selection on amino-acid mutations observed among GISAID sequences, we conducted permutation tests as described in the Figure S7 legend. Our full analysis of mutational effects of circulating variants is provided on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/results/summary/circulating_variants.md). We acknowledge all GISAID contributors for their sharing of sequencing data (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/alignments/Spike_GISAID/gisaid_hcov-19_acknowledgement_table.xls).
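The rule for handling undetermined X characters can be sketched as follows. This is a simplified illustration rather than the analysis code in the linked notebook: the function name `curate_mutations`, the toy four-residue reference, and the position numbering are hypothetical, and the sketch assumes gap-free sequences of equal length that are already aligned to the reference.

```python
from collections import Counter

def curate_mutations(reference, sequences):
    """Count amino-acid mutations, applying the X-character rule from the text.

    A mutation is kept only if it is seen on at least one sequence without any X;
    once kept, sequences containing X still contribute to its count.
    """
    total = Counter()      # observations on all sequences
    x_free = Counter()     # observations on sequences with no X character
    for seq in sequences:
        has_x = "X" in seq
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa and aa != "X":
                mutation = f"{ref_aa}{pos}{aa}"
                total[mutation] += 1
                if not has_x:
                    x_free[mutation] += 1
    return {mut: n for mut, n in total.items() if x_free[mut] > 0}

# Toy data: N1K appears on one clean and one X-containing sequence; Y4F only with an X
counts = curate_mutations("NLKY", ["NLKY", "KLKY", "KLXY", "NLXF"])
```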

Alignment and phylogeny

We used the curated RBD sequence set from Letko et al. (2020), adding newly described RBD sequences from sarbecovirus strains RaTG13 (Zhou et al., 2020b), RmYN02 (Zhou et al., 2020a), GD-Pangolin and GX-Pangolin (Lam et al., 2020), and the additional non-Asian bat sarbecovirus isolate BtKY72 (Tong et al., 2009). RBD nucleotide sequences were aligned via mafft with a gap opening penalty of 4.5, and the maximum likelihood phylogeny was inferred in RAxML (Stamatakis, 2014) under the GTR model with 4 gamma-distributed discrete categories of among-site rate variation.

Pseudotyped lentiviral particle infection assays

We selected seven single mutations from our deep mutational scanning measurements for validation of phenotypic effects in a spike-pseudotyped lentivirus assay (Crawford et al., 2020). Mutations were selected that exhibited deleterious effects on RBD expression (C432D) or ACE2 binding (L455Y, N501D and G502), no strong phenotypic effect on either binding or expression (N439K), and affinity-enhancing effects (Q498Y and N501F). These point mutations were introduced via site-directed mutagenesis (New England Biolabs E0554S) into the HDM vector containing codon-optimized SARS-CoV-2 Spike from Wuhan-Hu-1, with an upstream Kozak sequence. The full sequence of this plasmid is available at https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS/blob/master/data/plasmid_maps/2736_HDM_IDTSpike_EcoKozak.gb.

Pseudotyped lentiviral particles were generated as previously described (Crawford et al., 2020). Viruses were rescued in triplicate (i.e., independent transfections), which should average out variation in transfection efficiency such that viral entry phenotypes are reflective of both pseudovirus production and entry efficiency. Briefly, 2.5e5 293T cells per well were seeded in 12-well plates in 1 mL D10 growth media (DMEM with 10% heat-inactivated FBS, 2 mM L-glutamine, 100 U/mL penicillin, and 100 μg/mL streptomycin). 24 h later, cells were transfected using BioT transfection reagent (Bioland Scientific, Paramount, CA, USA) with 0.5 μg of the ZsGreen lentiviral backbone pHAGE2-CMV-ZsGreen-W (BEI Resources NR-52520), 0.11 μg each of the lentiviral helper plasmids HDM-Hgpm2 (BEI Resources NR-52517), pRC-CMV-Rev1b (BEI Resources NR-52519), and HDM-tat1b (BEI Resources NR-52518), and 0.17 μg wild-type or mutant SARS-CoV-2 Spike plasmids. Media was changed to fresh D10 at 24 h post-transfection. At 60 h post-transfection, the viral supernatant was collected, filtered through a 0.45 μm SFCA low protein-binding filter, and stored at −80°C. To quantify efficiency of pseudovirus production, we quantified p24 levels (in pg/mL) in viral transfection supernatants via ELISA, in technical duplicate, per kit instructions (Advanced Bioscience Laboratories Cat. # 5421).

The resulting viruses were titered as previously described (Crawford et al., 2020). 293T cells stably expressing ACE2 (BEI NR-52511) were seeded at 1e4 cells per well in poly-L-lysine coated 96-well plates (Greiner 655930). 24 h later, 3 wells were counted and averaged to determine the number of cells per well at time of infection. Media was removed from the 293T-ACE2 cells and replaced with fresh D10 containing 50 μL of pseudovirus supernatant in a final volume of 150 μL. Polybrene (TR-1003-G, Sigma Aldrich, St. Louis, MO, USA) was added to a final concentration of 5 μg/mL. 60 h post-infection, cells were analyzed by flow cytometry. Titers were calculated using the Poisson formula. If P is the percentage of cells that are ZsGreen positive, as determined by drawing a ZsGreen+ gate from uninfected controls, then the titer per mL is: −ln(1 − P/100) × (number of cells/well)/(volume of virus per well in mL). Titers are only accurate when the percentage of ZsGreen+ cells is relatively low, i.e., ∼1%–10%. Titers are reported relative to the mean of the wild-type, which had similar titers as Crawford et al. of ∼10^4 infectious particles per mL (Crawford et al., 2020; Figure 4H), and normalized by p24 levels in transfection supernatants (Figure S4J). The dashed horizontal line in Figure 4H showing the limit of detection was calculated as the minimum titer that would be determined in the case of a single positive event.
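The Poisson titer formula above translates directly into a few lines of Python. The function names and the example inputs below are illustrative only; in particular, `events_analyzed` (the number of cells acquired by flow cytometry) is a hypothetical parameter that the text implies but does not state a value for.

```python
import math

def titer_per_ml(percent_positive, cells_per_well, virus_volume_ml):
    """Infectious titer per mL from the percentage of ZsGreen+ cells (Poisson formula)."""
    return -math.log(1 - percent_positive / 100) * cells_per_well / virus_volume_ml

def limit_of_detection(events_analyzed, cells_per_well, virus_volume_ml):
    """Titer corresponding to a single positive event among the analyzed events.

    events_analyzed is an assumed input; the text does not give its value.
    """
    percent_single_event = 100 / events_analyzed
    return titer_per_ml(percent_single_event, cells_per_well, virus_volume_ml)

# Illustrative inputs: 5% ZsGreen+ cells, 2e4 cells per well, 50 uL (0.05 mL) of virus
example_titer = titer_per_ml(5, 2e4, 0.05)
```

At low percentages, −ln(1 − P/100) is close to P/100, so the formula reduces to roughly one infectious unit per positive cell; the logarithm corrects for cells infected more than once as P rises.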

RBD homolog purifications and binding assays

Receptor binding domains of SARS-CoV-2 (328-531) (Walls et al., 2020), WIV1 (316-518), RaTG13 (359-562), ZC45 (324-508), and ZXC21 (323-507) were synthesized by GenScript into vector pcDNA3.1- with a preceding mu-phosphatase signal peptide and a C-terminal octahistidine tag. SARS-CoV-1 (306-575) was subcloned from a GenArt synthesized SARS-CoV-1 Spike ectodomain (Walls et al., 2020). Human ACE2-Fc was synthesized and cloned by Twist with a C-terminal human Fc tag. The ACE2 construct begins with 19STIEE and ends with PYAD615.

The RBD constructs were transfected into 150mL suspension Expi293F (Thermo Fisher Cat No. A14527) cells at 37°C in a humidified 8% CO2 incubator rotating at 130rpm and harvested 3 days later. Clarified supernatants were purified in batch over Talon resin (Takara) prior to buffer exchanging into 20mM Tris pH 8, 150mM NaCl and flash freezing.

Expression and purification of human ACE2 was performed as previously described (Walls et al. 2020). Briefly, human ACE2-Fc was produced in FreeStyle 293F cells (ThermoFisher Cat# R79007) grown in suspension using FreeStyle 293 expression medium (Life Technologies) at 37°C in a humidified 8% CO2 incubator rotating at 130rpm. The cultures were transfected using PEI-MAX (Polyscience) with cells grown to a density of 2.5 million cells per mL and cultivated for 6 days. Human ACE2-Fc from clarified supernatants was affinity purified using a protein A column (GE Healthcare). The Fc tag was removed by thrombin cleavage in a reaction mixture containing 6mg of recombinant ACE2-Fc and 20μg thrombin in 20mM Tris-HCl pH 8.0, 150mM NaCl and 2.5mM CaCl2 for 4h at room temperature. The reaction mixture was re-purified using a protein A column to remove uncleaved protein and cleaved Fc tag. The cleaved ACE2 protein was further purified by gel filtration using a Superdex 200 column 10/300 GL (GE Life Sciences) equilibrated in a buffer containing 20mM Tris pH 8.0 and 150mM NaCl. Purified protein was quantified using absorption at 280nm, and concentrated to approximately 1mg/mL.

Binding measurements were performed on an Octet Red instrument (Forte Bio) at 30°C with shaking at 1,000 RPM. For monomeric ACE2 affinity measurements, Ni-NTA biosensors were hydrated in water for 10min and placed into 10X Kinetics Buffer (ForteBio). 10-50 μg/mL of RBD was loaded for a 1.5nm threshold prior to baseline stabilization in 10X Kinetics Buffer. The sensors were immersed in a 1:3 serial dilution of monomeric Fc-cleaved ACE2 ranging from 1,000 to 4.11nM in 10X Kinetics Buffer. For measurements of RaTG13, ZC45, and ZXC21 binding to dimeric ACE2-Fc, ARG2 biosensors were hydrated in water then activated for 300 s with an NHS-EDC solution (ForteBio) prior to amine coupling. 5-10 μg/mL of RBD in 10mM pH6 sodium acetate was loaded onto ARG2 tips (ForteBio) for 600 s and then quenched into 1M ethanolamine for 300 s. A baseline in 10X Kinetics Buffer was collected for 120 s prior to immersing the sensors in a 1:3 serial dilution of dimeric ACE2-his (SinoBiological # 10108-H08H, residues 1-740) ranging from 1,000 to 4.11nM in 10X Kinetics Buffer. Curve fitting was performed using a 1:1 binding model and the ForteBio data analysis software when applicable. Mean kon and koff values were determined with a global fit applied to all data.
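For orientation, the dilution series and the standard 1:1 (Langmuir) binding model can be sketched in Python. This is not the ForteBio fitting software; the function names, the six-point series length (inferred from the stated 1,000 to 4.11 nM range at 1:3), and the rate constants in the example are assumptions used only for illustration.

```python
import math

def serial_dilution(top_nM, fold, n_points):
    """Concentrations (nM) of a serial dilution starting at top_nM."""
    return [top_nM / fold**i for i in range(n_points)]

def dissociation_constant(k_on, k_off):
    """Equilibrium K_D (M) for a 1:1 model, from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on

def association_response(t, conc_M, k_on, k_off, r_max):
    """Response at time t (s) during association under a 1:1 binding model."""
    k_obs = k_on * conc_M + k_off                # observed association rate
    r_eq = r_max * conc_M / (conc_M + k_off / k_on)  # equilibrium response
    return r_eq * (1 - math.exp(-k_obs * t))

# Assumed six-point 1:3 series from 1,000 nM; the last point is about 4.1 nM
concentrations = serial_dilution(1000, 3, 6)
# Hypothetical rate constants, not values from the paper
kd = dissociation_constant(k_on=1e5, k_off=1e-3)
```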

Expression-enhancing mutant RBD purifications, binding, and stability assays

Codon-optimized RBDs of SARS-CoV-2 with its unmutated sequence or with single mutations (I358F, Y365F, Y365W, V367F or F392W) were synthesized by IDT as gBlocks with an N-terminal EGT secretion signal (MGILPSPGMPALLSLVSLLSVLLMGCVA) and C-terminal Avi- and octa-histidine tags (GLNDIFEAQKIEWHEHHHHHHHH) and cloned into the CMV/R (VRC 8400) mammalian expression vector. Plasmids were transfected into 200mL suspension Expi293F cells at 37°C in a humidified 8% CO2 incubator rotating at 130 rpm and harvested 3 days later. Clarified supernatants were purified in batch over Talon resin (Takara). After elution at 125mL in 20mM Tris (pH 8.0), 300mM NaCl, and 300mM imidazole, concentrated solutions of L-arginine (pH 8.0), CHAPS and glycerol were added to eluate to final concentrations of 100mM, 0.75%, and 5%, respectively, to prevent adhesion to concentrator membranes. To quantify yield, each sample was concentrated to a final volume of 1,500 μL, and 1,000 μL was applied to a Superdex 75 Increase 10/300 GL column (GE) pre-equilibrated with 50mM Tris (pH 8.0), 185mM NaCl, 100mM L-arginine, 0.75% CHAPS and 5% glycerol. Peak integration was quantified using UNICORN software (GE), and relative quantity from the SEC trace was corrected for unique extinction coefficients and molecular weights of each RBD mutant. Purified peaks from monomeric species were dialyzed three times into 25mM Tris (pH 8.0), 150mM NaCl and 5% glycerol at 4°C.

BLI binding assays were performed on an Octet Red instrument at 25°C with shaking at 1,000 RPM in the presence of 25mM Tris pH 8.0, 150mM NaCl and 5% glycerol. Anti-hIgG Capture (AHC) tips were loaded with human ACE2-Fc or CR3022 at 0.02mg/mL for 300 s prior to a baseline for 60 s, association with monomeric RBDs at 500nM for 600 s, and dissociation for 300 s.

Non-equilibrium measurements of melting temperatures were determined from thermal denaturation melt curves using an UNcle (UNchained Labs) based on the barycentric mean of intrinsic tryptophan fluorescence, with data collected from 20-95°C using a thermal ramp of 1°C per minute in a background of 25mM Tris pH 8.0, 150mM NaCl and 5% glycerol. Melting temperatures were defined as the maximum point of the first derivative of the melting curve, with first derivatives calculated using GraphPad Prism software after smoothing with four neighboring points using 2nd order polynomial settings.
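The definition of melting temperature as the maximum of the first derivative can be sketched in Python as below. This is a simplification rather than a reproduction of the GraphPad Prism procedure: it uses plain central finite differences and omits the polynomial smoothing step, and the function name and the synthetic sigmoidal curve are hypothetical.

```python
import math

def melting_temperature(temps, signal):
    """Temperature at which the first derivative of the melt curve is maximal.

    Uses central finite differences on interior points; no smoothing is applied,
    unlike the Prism workflow described in the text.
    """
    best_slope, best_temp = None, None
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if best_slope is None or slope > best_slope:
            best_slope, best_temp = slope, temps[i]
    return best_temp

# Synthetic curve: a sigmoid with its midpoint at 55 degrees C, sampled every 1 degree
temps = list(range(20, 96))
signal = [1 / (1 + math.exp(-(t - 55) / 2)) for t in temps]
tm = melting_temperature(temps, signal)
```

On real data, noise in the fluorescence signal can create spurious derivative maxima, which is presumably why smoothing precedes differentiation in the described workflow.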

Quantification and Statistical Analysis

Quantitative analyses were performed using custom code, available on GitHub (https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS).

For quantitative analysis of deep mutational scanning expression phenotypes (see Method Details section, “Calculating variant phenotypes for expression”), we determined per-variant expression via maximum likelihood inference using the fitdistrplus R package (Delignette-Muller and Dutang, 2015).

For quantitative analysis of deep mutational scanning binding phenotypes (see Method Details section, “Calculating variant phenotypes for ACE2-binding affinity”), we determined per-variant titration curves via weighted least-squares nonlinear regression in R.

To quantitatively decompose single-mutant effects on expression and binding (see Figure S2 legend and Method Details section, “Decomposing single-mutant effects from multiple-mutant genotypes”), we fit global epistasis regression models (Otwinowski et al., 2018) using the dms_variants Python package (https://jbloomlab.github.io/dms_variants/dms_variants.globalepistasis.html).

For quantification of binding via Biolayer Interferometry (see Figures 4D, S4A–S4F, and S4I), global curve fitting to determine kon and koff was performed using a 1:1 binding model in the ForteBio data analysis software.

To quantify thermal stability from melting curves (see Figures 4G and S4H), the GraphPad Prism software was used to identify the maximum point of the first derivative of the melting curve.

For the statistical analysis of mutations observed among circulating SARS-CoV-2 isolates described in Figure S7, we used permutation tests to assess significant trends in effects of observed mutations compared to the distribution of randomly sampled mutation subsets.
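A generic permutation test of this kind can be sketched in Python. The exact test statistic and number of permutations are given in the Figure S7 legend, not in this section, so the choices below (mean effect as the statistic, a one-sided comparison, 1,000 resamples, and the function name) are assumptions for illustration only.

```python
import random

def permutation_p_value(all_effects, observed_effects, n_perm=1000, seed=0):
    """One-sided p value: how often a random mutation subset has a mean effect
    at least as high as the observed mutations.

    all_effects:      effects of all candidate mutations (the sampling pool)
    observed_effects: effects of the mutations seen among circulating isolates
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = len(observed_effects)
    observed_mean = sum(observed_effects) / k
    at_least_as_high = 0
    for _ in range(n_perm):
        subset = rng.sample(all_effects, k)  # random subset of the same size
        if sum(subset) / k >= observed_mean:
            at_least_as_high += 1
    # Add-one correction keeps the estimate away from exactly zero
    return (at_least_as_high + 1) / (n_perm + 1)

# Toy pool of 100 made-up effects; the "observed" set is the five least deleterious
pool = [-(i / 10) for i in range(100)]
p_value = permutation_p_value(pool, pool[:5])
```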

Acknowledgments

We thank Keara Malone for experimental assistance, Katherine Xue for helpful suggestions, and Frederick Matsen for intellectual support and hospitality. We thank the Flow Cytometry and Genomics core facilities at the Fred Hutchinson Cancer Research Center for experimental support as well as Mike Murphy, Deleah Pettie, and the Mammalian Production Core at the Institute for Protein Design for assistance with protein purification. This work was supported by the NIAID/NIH (R01AI141707 and R01AI12893 to J.D.B., HHSN272201700059C to D.V., F30AI149928 to K.H.D.C., and T32AI083203 to A.J.G.), NIGMS/NIH (R01GM120553 to D.V.), a Pew Biomedical Scholars Award to D.V., Burroughs Wellcome Investigators in the Pathogenesis of Infectious Diseases awards to D.V. and J.D.B., the Bill & Melinda Gates Foundation (OPP1156262 to D.V. and N.P.K.), Fast Grants to N.P.K. and D.V., and a generous gift from the Open Philanthropy Project to N.P.K. T.N.S. is a Washington Research Foundation Innovation Fellow at the University of Washington Institute for Protein Design and a Howard Hughes Medical Institute Fellow of the Damon Runyon Cancer Research Foundation (DRG-2381-19). J.D.B. is an Investigator of the Howard Hughes Medical Institute.

Author Contributions

Conceptualization, T.N.S., D.V., and J.D.B.; Methodology, T.N.S. and J.D.B.; Investigation, T.N.S. and A.J.G.; Code, T.N.S., S.K.H., K.H.D.C., and J.D.B.; Formal Analysis, T.N.S. and J.D.B.; Validation, A.J.G., D.E., K.H.D.C., A.S.D., M.J.N., J.E.B., M.A.T., and A.C.W.; Visualization, T.N.S., S.K.H., and J.D.B.; Writing – Original Draft, T.N.S. and J.D.B.; Writing – Review and Editing, all authors; Supervision, N.P.K., D.V., and J.D.B.

Declaration of Interests

N.P.K. is a co-founder, shareholder, and chair of the scientific advisory board of Icosavax, Inc.

Published: August 11, 2020, corrected online August 26, 2020

Footnotes

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.08.012.

Supplemental Information

Table S1. Binding and Expression Measurements for RBD Homologs in the Deep Mutational Scanning Libraries, Related to Figure 2
mmc1.csv (1.2KB, csv)
Table S2. Single-Mutant Effects on Expression and Binding, Related to Figure 3
mmc2.csv (263KB, csv)
Data S1. Interactive Heatmap of Mutational Effects on Expression and Binding, Related to Figure 3
mmc3.zip (75.5KB, zip)

References

  1. Adams R.M., Mora T., Walczak A.M., Kinney J.B. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. eLife. 2016;5:e23156. doi: 10.7554/eLife.23156.
  2. Andersen K.G., Rambaut A., Lipkin W.I., Holmes E.C., Garry R.F. The proximal origin of SARS-CoV-2. Nat. Med. 2020;26:450–452. doi: 10.1038/s41591-020-0820-9.
  3. Baum A., Fulton B.O., Wloga E., Copin R., Pascal K.E., Russo V., Giordano S., Lanza K., Negron N., Ni M. Antibody cocktail to SARS-CoV-2 spike protein prevents rapid mutational escape seen with individual antibodies. Science. 2020:eabd0831. doi: 10.1126/science.abd0831. Published online June 15, 2020.
  4. Becker M.M., Graham R.L., Donaldson E.F., Rockx B., Sims A.C., Sheahan T., Pickles R.J., Corti D., Johnston R.E., Baric R.S., Denison M.R. Synthetic recombinant bat SARS-like coronavirus is infectious in cultured cells and in mice. Proc. Natl. Acad. Sci. USA. 2008;105:19944–19949. doi: 10.1073/pnas.0808116105.
  5. Bedford T., Greninger A.L., Roychoudhury P., Starita L.M., Famulare M., Huang M.-L., Nalla A., Pepper G., Reinhardt A., Xie H. Cryptic transmission of SARS-CoV-2 in Washington State. medRxiv. 2020 doi: 10.1101/2020.04.02.20051417.
  6. Bloom J.D. An experimentally determined evolutionary model dramatically improves phylogenetic fit. Mol. Biol. Evol. 2014;31:1956–1978. doi: 10.1093/molbev/msu173.
  7. Bloom J.D., Gong L.I., Baltimore D. Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science. 2010;328:1272–1275. doi: 10.1126/science.1187816.
  8. Boder E.T., Wittrup K.D. Yeast surface display for screening combinatorial polypeptide libraries. Nat. Biotechnol. 1997;15:553–557. doi: 10.1038/nbt0697-553.
  9. Bolles M., Donaldson E., Baric R. SARS-CoV and emergent coronaviruses: viral determinants of interspecies transmission. Curr. Opin. Virol. 2011;1:624–634. doi: 10.1016/j.coviro.2011.10.012.
  10. Boni M.F., Lemey P., Jiang X., Lam T.T.-Y., Perry B.W., Castoe T.A., Rambaut A., Robertson D.L. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nat. Microbiol. 2020 doi: 10.1038/s41564-020-0771-4. Published online July 28, 2020.
  11. Brouwer P.J.M., Caniels T.G., van der Straten K., Snitselaar J.L., Aldon Y., Bangaru S., Torres J.L., Okba N.M.A., Claireaux M., Kerster G. Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability. Science. 2020;369:643–650. doi: 10.1126/science.abc5902.
  12. Callaway H.M., Welsch K., Weichert W., Allison A.B., Hafenstein S.L., Huang K., Iketani S., Parrish C.R. Complex and Dynamic Interactions between Parvovirus Capsids, Transferrin Receptors, and Antibodies Control Cell Infection and Host Range. J. Virol. 2018;92:e00460-e18. doi: 10.1128/JVI.00460-18.
  13. Cao Y., Su B., Guo X., Sun W., Deng Y., Bao L., Zhu Q., Zhang X., Zheng Y., Geng C. Potent neutralizing antibodies against SARS-CoV-2 identified by high-throughput single-cell sequencing of convalescent patients’ B cells. Cell. 2020;182:73–84. doi: 10.1016/j.cell.2020.05.025.
  14. Chen W.-H., Du L., Chag S.M., Ma C., Tricoche N., Tao X., Seid C.A., Hudspeth E.M., Lustigman S., Tseng C.-T.K. Yeast-expressed recombinant protein of the receptor-binding domain in SARS-CoV spike protein with deglycosylated forms as a SARS vaccine candidate. Hum. Vaccin. Immunother. 2014;10:648–658. doi: 10.4161/hv.27464.
  15. Chen W.-H., Chag S.M., Poongavanam M.V., Biter A.B., Ewere E.A., Rezende W., Seid C.A., Hudspeth E.M., Pollet J., McAtee C.P. Optimization of the Production Process and Characterization of the Yeast-Expressed SARS-CoV Recombinant Receptor-Binding Domain (RBD219-N1), a SARS Vaccine Candidate. J. Pharm. Sci. 2017;106:1961–1970. doi: 10.1016/j.xphs.2017.04.037.
  16. Chen W.-H., Tao X., Peng B.-H., Pollet J., Strych U., Bottazzi M.E., Hotez P.J., Lustigman S., Du L., Jiang S. Yeast-Expressed SARS-CoV Recombinant Receptor-Binding Domain (RBD219-N1) Formulated with Alum Induces Protective Immunity and Reduces Immune Enhancement. bioRxiv. 2020 doi: 10.1101/2020.05.15.098079.
  17. Chen W.-H., Strych U., Hotez P.J., Bottazzi M.E. The SARS-CoV-2 Vaccine Pipeline: an Overview. Curr. Trop. Med. Rep. 2020;7:1–4. doi: 10.1007/s40475-020-00201-6.
  18. Crawford K.H.D., Bloom J.D. alignparse: A Python package for parsing complex features from high-throughput long-read sequencing. J. Open Source Softw. 2019;4:1915. doi: 10.21105/joss.01915.
  19. Crawford K.H.D., Eguia R., Dingens A.S., Loes A.N., Malone K.D., Wolf C.R., Chu H.Y., Tortorici M.A., Veesler D., Murphy M. Protocol and Reagents for Pseudotyping Lentiviral Particles with SARS-CoV-2 Spike Protein for Neutralization Assays. Viruses. 2020;12:513. doi: 10.3390/v12050513.
  20. Cui J., Li F., Shi Z.-L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 2019;17:181–192. doi: 10.1038/s41579-018-0118-9.
  21. Davenport T.M., Gorman J., Joyce M.G., Zhou T., Soto C., Guttman M., Moquin S., Yang Y., Zhang B., Doria-Rose N.A. Somatic Hypermutation-Induced Changes in the Structure and Dynamics of HIV-1 Broadly Neutralizing Antibodies. Structure. 2016;24:1346–1357. doi: 10.1016/j.str.2016.06.012.
  22. Dean A.M., Thornton J.W. Mechanistic approaches to the study of evolution: the functional synthesis. Nat. Rev. Genet. 2007;8:675–688. doi: 10.1038/nrg2160.
  23. Delignette-Muller M., Dutang C. fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software. Articles. 2015;64:1–34.
  24. Demogines A., Farzan M., Sawyer S.L. Evidence for ACE2-utilizing coronaviruses (CoVs) related to severe acute respiratory syndrome CoV in bats. J. Virol. 2012;86:6350–6353. doi: 10.1128/JVI.00311-12.
  25. Dingens A.S., Haddox H.K., Overbaugh J., Bloom J.D. Comprehensive Mapping of HIV-1 Escape from a Broadly Neutralizing Antibody. Cell Host Microbe. 2017;21:777–787. doi: 10.1016/j.chom.2017.05.003.
  26. Dingens A.S., Arenz D., Weight H., Overbaugh J., Bloom J.D. An Antigenic Atlas of HIV-1 Escape from Broadly Neutralizing Antibodies Distinguishes Functional and Structural Epitopes. Immunity. 2019;50:520–532. doi: 10.1016/j.immuni.2018.12.017.
  27. Duan H., Chen X., Boyington J.C., Cheng C., Zhang Y., Jafari A.J., Stephens T., Tsybovsky Y., Kalyuzhniy O., Zhao P. Glycan Masking Focuses Immune Responses to the HIV-1 CD4-Binding Site and Enhances Elicitation of VRC01-Class Precursor Antibodies. Immunity. 2018;49:301–311. doi: 10.1016/j.immuni.2018.07.005.
  28. Eggink D., Goff P.H., Palese P. Guiding the immune response against influenza virus hemagglutinin toward the conserved stalk domain by hyperglycosylation of the globular head domain. J. Virol. 2014;88:699–704. doi: 10.1128/JVI.02608-13.
  29. Elbe S., Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob. Chall. 2017;1:33–46. doi: 10.1002/gch2.1018.
  30. Fane B., Villafane R., Mitraki A., King J. Identification of global suppressors for temperature-sensitive folding mutations of the P22 tailspike protein. J. Biol. Chem. 1991;266:11640–11648.
  31. Fauver J.R., Petrone M.E., Hodcroft E.B., Shioda K., Ehrlich H.Y., Watts A.G., Vogels C.B.F., Brito A.F., Alpert T., Muyombwe A. Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States. Cell. 2020;181:990–996. doi: 10.1016/j.cell.2020.04.021.
  32. Fowler D.M., Fields S. Deep mutational scanning: a new style of protein science. Nat. Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027.
  33. Frank H.K., Enard D., Boyd S.D. Exceptional diversity and selection pressure on SARS-CoV and SARS-CoV-2 host receptor in bats compared to other mammals. bioRxiv. 2020 doi: 10.1101/2020.04.20.051656.
  34. Friedrich T.C., Frye C.A., Yant L.J., O’Connor D.H., Kriewaldt N.A., Benson M., Vojnov L., Dodds E.J., Cullen C., Rudersdorf R. Extraepitopic compensatory substitutions partially restore fitness to simian immunodeficiency virus variants that escape from an immunodominant cytotoxic-T-lymphocyte response. J. Virol. 2004;78:2581–2585. doi: 10.1128/JVI.78.5.2581-2585.2004.
  35. Frieman M., Yount B., Agnihothram S., Page C., Donaldson E., Roberts A., Vogel L., Woodruff B., Scorpio D., Subbarao K., Baric R.S. Molecular determinants of severe acute respiratory syndrome coronavirus pathogenesis and virulence in young and aged mouse models of human disease. J. Virol. 2012;86:884–897. doi: 10.1128/JVI.05957-11.
  36. Gietz R.D., Schiestl R.H. High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nat. Protoc. 2007;2:31–34. doi: 10.1038/nprot.2007.13.
  37. Gong L.I., Suchard M.A., Bloom J.D. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife. 2013;2:e00631. doi: 10.7554/eLife.00631.
  38. Hamilton S.R., Bobrowicz P., Bobrowicz B., Davidson R.C., Li H., Mitchell T., Nett J.H., Rausch S., Stadheim T.A., Wischnewski H. Production of complex human glycoproteins in yeast. Science. 2003;301:1244–1246. doi: 10.1126/science.1088166.
  39. Harms M.J., Thornton J.W. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat. Rev. Genet. 2013;14:559–571. doi: 10.1038/nrg3540.
  40. Hensley S.E., Das S.R., Bailey A.L., Schmidt L.M., Hickman H.D., Jayaraman A., Viswanathan K., Raman R., Sasisekharan R., Bennink J.R., Yewdell J.W. Hemagglutinin receptor binding avidity drives influenza A virus antigenic drift. Science. 2009;326:734–736. doi: 10.1126/science.1178258.
  41. Hiatt J.B., Patwardhan R.P., Turner E.H., Lee C., Shendure J. Parallel, tag-directed assembly of locally derived short sequence reads. Nat. Methods. 2010;7:119–122. doi: 10.1038/nmeth.1416.
  42. Hilton S.K., Bloom J.D. Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral sequence divergence. Virus Evol. 2018;4:vey033. doi: 10.1093/ve/vey033.
  43. Hilton S.K., Huddleston J., Black A., North K., Dingens A.S., Bedford T., Bloom J.D. dms-view: Interactive visualization tool for deep mutational scanning data. Journal of Open Source Software. 2020;5:2353. doi: 10.21105/joss.02353.
  44. Hoffmann M., Kleine-Weber H., Schroeder S., Krüger N., Herrler T., Erichsen S., Schiergens T.S., Herrler G., Wu N.-H., Nitsche A. SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor. Cell. 2020;181:271–280. doi: 10.1016/j.cell.2020.02.052.
  45. Hu B., Zeng L.-P., Yang X.-L., Ge X.-Y., Zhang W., Li B., Xie J.-Z., Shen X.-R., Zhang Y.-Z., Wang N. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus. PLoS Pathog. 2017;13:e1006698. doi: 10.1371/journal.ppat.1006698.
  46. Huo J., Zhao Y., Ren J., Zhou D., Duyvesteyn H.M.E., Ginn H.M., Carrique L., Malinauskas T., Ruza R.R., Shah P.N.M. Neutralization of SARS-CoV-2 by Destruction of the Prefusion Spike. Cell Host Microbe. 2020 doi: 10.1016/j.chom.2020.06.010. Published online June 19, 2020.
  47. Hwang W.C., Lin Y., Santelli E., Sui J., Jaroszewski L., Stec B., Farzan M., Marasco W.A., Liddington R.C. Structural basis of neutralization by a human anti-severe acute respiratory syndrome spike protein antibody, 80R. J. Biol. Chem. 2006;281:34610–34616. doi: 10.1074/jbc.M603275200.
  48. Jardine J.G., Kulp D.W., Havenar-Daughton C., Sarkar A., Briney B., Sok D., Sesterhenn F., Ereño-Orbea J., Kalyuzhniy O., Deresa I. HIV-1 broadly neutralizing antibody precursor B cells revealed by germline-targeting immunogen. Science. 2016;351:1458–1463. doi: 10.1126/science.aad9195.
  49. Ju B., Zhang Q., Ge J., Wang R., Sun J., Ge X., Yu J., Shan S., Zhou B., Song S. Human neutralizing antibodies elicited by SARS-CoV-2 infection. Nature. 2020;584:115–119. doi: 10.1038/s41586-020-2380-z. Published online May 26, 2020.
  50. Julian M.C., Li L., Garde S., Wilen R., Tessier P.M. Efficient affinity maturation of antibody variable domains requires co-selection of compensatory mutations to maintain thermodynamic stability. Sci. Rep. 2017;7:45259. doi: 10.1038/srep45259.
  51. Kabsch W., Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637. doi: 10.1002/bip.360221211.
  52. Katoh K., Standley D.M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010.
  53. Kowalski J.M., Parekh R.N., Mao J., Wittrup K.D. Protein folding stability can determine the efficiency of escape from endoplasmic reticulum quality control. J. Biol. Chem. 1998;273:19453–19458. doi: 10.1074/jbc.273.31.19453.
  54. Kowalski J.M., Parekh R.N., Wittrup K.D. Secretion efficiency in Saccharomyces cerevisiae of bovine pancreatic trypsin inhibitor mutants lacking disulfide bonds is correlated with thermodynamic stability. Biochemistry. 1998;37:1264–1273. doi: 10.1021/bi9722397.
  55. Kulp D.W., Steichen J.M., Pauthner M., Hu X., Schiffner T., Liguori A., Cottrell C.A., Havenar-Daughton C., Ozorowski G., Georgeson E. Structure-based design of native-like HIV-1 envelope trimers to silence non-neutralizing epitopes and eliminate CD4 binding. Nat. Commun. 2017;8:1655. doi: 10.1038/s41467-017-01549-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Kutter J.S., Spronken M.I., Fraaij P.L., Fouchier R.A., Herfst S. Transmission routes of respiratory viruses among humans. Curr. Opin. Virol. 2018;28:142–151. doi: 10.1016/j.coviro.2018.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Lam T.T.-Y., Jia N., Zhang Y.-W., Shum M.H.-H., Jiang J.-F., Zhu H.-C., Tong Y.-G., Shi Y.-X., Ni X.-B., Liao Y.-S. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins. Nature. 2020;583:282–285. doi: 10.1038/s41586-020-2169-0. [DOI] [PubMed] [Google Scholar]
  58. Lan J., Ge J., Yu J., Shan S., Zhou H., Fan S., Zhang Q., Shi X., Wang Q., Zhang L., Wang X. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020;581:215–220. doi: 10.1038/s41586-020-2180-5. [DOI] [PubMed] [Google Scholar]
  59. Lang Y., Li W., Li Z., Koerhuis D., van den Burg A.C.S., Rozemuller E., Bosch B.-J., van Kuppeveld F.J.M., Boons G.-J.P.H., Huizinga E.G. Coronavirus hemagglutinin-esterase and spike proteins co-evolve for functional balance and optimal virion avidity. bioRxiv. 2020 doi: 10.1101/2020.04.03.003699. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Latinne A., Hu B., Olival K.J., Zhu G., Zhang L., Li H., Chmura A.A., Field H.E., Zambrana-Torrelio C., Epstein J.H. Origin and cross-species transmission of bat coronaviruses in China. bioRxiv. 2020 doi: 10.1101/2020.05.31.116061. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Lee J.M., Huddleston J., Doud M.B., Hooper K.A., Wu N.C., Bedford T., Bloom J.D. Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants. Proc. Natl. Acad. Sci. USA. 2018;115:E8276–E8285. doi: 10.1073/pnas.1806133115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Lee J.M., Eguia R., Zost S.J., Choudhary S., Wilson P.C., Bedford T., Stevens-Ayers T., Boeckh M., Hurt A.C., Lakdawala S.S. Mapping person-to-person variation in viral mutations that escape polyclonal serum targeting influenza hemagglutinin. eLife. 2019;8:e49324. doi: 10.7554/eLife.49324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Letko M., Marzi A., Munster V. Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nat. Microbiol. 2020;5:562–569. doi: 10.1038/s41564-020-0688-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Li F. Structural analysis of major species barriers between humans and palm civets for severe acute respiratory syndrome coronavirus infections. J. Virol. 2008;82:6984–6991. doi: 10.1128/JVI.00442-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–3100. doi: 10.1093/bioinformatics/bty191. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Li W., Moore M.J., Vasilieva N., Sui J., Wong S.K., Berne M.A., Somasundaran M., Sullivan J.L., Luzuriaga K., Greenough T.C. Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus. Nature. 2003;426:450–454. doi: 10.1038/nature02145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Li F., Li W., Farzan M., Harrison S.C. Structure of SARS coronavirus spike receptor-binding domain complexed with receptor. Science. 2005;309:1864–1868. doi: 10.1126/science.1116480. [DOI] [PubMed] [Google Scholar]
  68. Li W., Zhang C., Sui J., Kuhn J.H., Moore M.J., Luo S., Wong S.-K., Huang I.-C., Xu K., Vasilieva N. Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2. EMBO J. 2005;24:1634–1643. doi: 10.1038/sj.emboj.7600640. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Louz D., Bergmans H.E., Loos B.P., Hoeben R.C. Animal models in virus research: their utility and limitations. Crit. Rev. Microbiol. 2013;39:325–361. doi: 10.3109/1040841X.2012.711740. [DOI] [PubMed] [Google Scholar]
  70. Lynch R.M., Wong P., Tran L., O’Dell S., Nason M.C., Li Y., Wu X., Mascola J.R. HIV-1 fitness cost associated with escape from the VRC01 class of CD4 binding site neutralizing antibodies. J. Virol. 2015;89:4201–4213. doi: 10.1128/JVI.03608-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. MacLean O.A., Lytras S., Singer J.B., Weaver S., Kosakovsky Pond S.L., Robertson D.L. Evidence of significant natural selection in the evolution of SARS-CoV-2 in bats, not humans. bioRxiv. 2020 doi: 10.1101/2020.05.28.122366. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Matreyek K.A., Starita L.M., Stephany J.J., Martin B., Chiasson M.A., Gray V.E., Kircher M., Khechaduri A., Dines J.N., Hause R.J. Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 2018;50:874–882. doi: 10.1038/s41588-018-0122-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Menachery V.D., Yount B.L., Jr., Debbink K., Agnihothram S., Gralinski L.E., Plante J.A., Graham R.L., Scobey T., Ge X.-Y., Donaldson E.F. A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence. Nat. Med. 2015;21:1508–1513. doi: 10.1038/nm.3985. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Menachery V.D., Yount B.L., Jr., Sims A.C., Debbink K., Agnihothram S.S., Gralinski L.E., Graham R.L., Scobey T., Plante J.A., Royal S.R. SARS-like WIV1-CoV poised for human emergence. Proc. Natl. Acad. Sci. USA. 2016;113:3048–3053. doi: 10.1073/pnas.1517719113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Mulligan M.J., Lyke K.E., Kitchin N., Absalon J., Gurtman A., Lockhart S.P., Neuzil K., Raabe V., Bailey R., Swanson K.A. Phase 1/2 Study to Describe the Safety and Immunogenicity of a COVID-19 RNA Vaccine Candidate (BNT162b1) in Adults 18 to 55 Years of Age: Interim Report. medRxiv. 2020 doi: 10.1101/2020.06.30.20142570. [DOI] [Google Scholar]
  76. Otwinowski J., McCandlish D.M., Plotkin J.B. Inferring the shape of global epistasis. Proc. Natl. Acad. Sci. USA. 2018;115:E7550–E7558. doi: 10.1073/pnas.1804015115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Ovchinnikov V., Louveau J.E., Barton J.P., Karplus M., Chakraborty A.K. Role of framework mutations and antibody flexibility in the evolution of broadly neutralizing antibodies. eLife. 2018;7:e33038. doi: 10.7554/eLife.33038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Pak J.E., Sharon C., Satkunarajah M., Auperin T.C., Cameron C.M., Kelvin D.J., Seetharaman J., Cochrane A., Plummer F.A., Berry J.D., Rini J.M. Structural insights into immune recognition of the severe acute respiratory syndrome coronavirus S protein receptor binding domain. J. Mol. Biol. 2009;388:815–823. doi: 10.1016/j.jmb.2009.03.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Peterman N., Levine E. Sort-seq under the hood: implications of design choices on large-scale characterization of sequence-function relations. BMC Genomics. 2016;17:206. doi: 10.1186/s12864-016-2533-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Pinto D., Park Y.-J., Beltramello M., Walls A.C., Tortorici M.A., Bianchi S., Jaconi S., Culap K., Zatta F., De Marco A. Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody. Nature. 2020;583:290–295. doi: 10.1038/s41586-020-2349-y. [DOI] [PubMed] [Google Scholar]
  81. Pollock D.D., Thiltgen G., Goldstein R.A. Amino acid coevolution induces an evolutionary Stokes shift. Proc. Natl. Acad. Sci. USA. 2012;109:E1352–E1359. doi: 10.1073/pnas.1120084109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Poteete A.R., Rennell D., Bouvier S.E., Hardy L.W. Alteration of T4 lysozyme structure by second-site reversion of deleterious mutations. Protein Sci. 1997;6:2418–2425. doi: 10.1002/pro.5560061115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Povolotskaya I.S., Kondrashov F.A. Sequence space and the ongoing expansion of the protein universe. Nature. 2010;465:922–926. doi: 10.1038/nature09105. [DOI] [PubMed] [Google Scholar]
  84. Prabakaran P., Gan J., Feng Y., Zhu Z., Choudhry V., Xiao X., Ji X., Dimitrov D.S. Structure of severe acute respiratory syndrome coronavirus receptor-binding domain complexed with neutralizing antibody. J. Biol. Chem. 2006;281:15829–15836. doi: 10.1074/jbc.M600697200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Premkumar L., Segovia-Chumbez B., Jadi R., Martinez D.R., Raut R., Markmann A., Cornaby C., Bartelt L., Weiss S., Park Y. The receptor binding domain of the viral spike protein is an immunodominant and highly specific target of antibodies in SARS-CoV-2 patients. Sci. Immunol. 2020;5:eabc8413. doi: 10.1126/sciimmunol.abc8413. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Pybus O.G., Rambaut A., Belshaw R., Freckleton R.P., Drummond A.J., Holmes E.C. Phylogenetic evidence for deleterious mutation load in RNA viruses and its contribution to viral evolution. Mol. Biol. Evol. 2007;24:845–852. doi: 10.1093/molbev/msm001. [DOI] [PubMed] [Google Scholar]
  87. Qu X.-X., Hao P., Song X.-J., Jiang S.-M., Liu Y.-X., Wang P.-G., Rao X., Song H.-D., Wang S.-Y., Zuo Y. Identification of two critical amino acid residues of the severe acute respiratory syndrome coronavirus spike protein for its variation in zoonotic tropism transition via a double substitution strategy. J. Biol. Chem. 2005;280:29588–29595. doi: 10.1074/jbc.M500662200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Quinlan B.D., Mou H., Zhang L., Guo Y., He W., Ojha A., Parcells M.S., Luo G., Li W., Zhong G. The SARS-CoV-2 receptor-binding domain elicits a potent neutralizing response without antibody-dependent enhancement. bioRxiv. 2020 doi: 10.1101/2020.04.10.036418. [DOI] [Google Scholar]
  89. Ravichandran S., Coyle E.M., Klenow L., Tang J., Grubbs G., Liu S., Wang T., Golding H., Khurana S. Antibody signature induced by SARS-CoV-2 spike protein immunogens in rabbits. Sci. Transl. Med. 2020;12:eabc3539. doi: 10.1126/scitranslmed.abc3539. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Ren W., Qu X., Li W., Han Z., Yu M., Zhou P., Zhang S.-Y., Wang L.-F., Deng H., Shi Z. Difference in receptor usage between severe acute respiratory syndrome (SARS) coronavirus and SARS-like coronavirus of bat origin. J. Virol. 2008;82:1899–1907. doi: 10.1128/JVI.01085-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Rockx B., Donaldson E., Frieman M., Sheahan T., Corti D., Lanzavecchia A., Baric R.S. Escape from human monoclonal antibody neutralization affects in vitro and in vivo fitness of severe acute respiratory syndrome coronavirus. J. Infect. Dis. 2010;201:946–955. doi: 10.1086/651022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Rogers T.F., Zhao F., Huang D., Beutler N., Burns A., He W.-T., Limbo O., Smith C., Song G., Woehl J. Isolation of potent SARS-CoV-2 neutralizing antibodies and protection from disease in a small animal model. Science. 2020:eabc7520. doi: 10.1126/science.abc7520. Published online June 15, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Russell C.A., Kasson P.M., Donis R.O., Riley S., Dunbar J., Rambaut A., Asher J., Burke S., Davis C.T., Garten R.J. Improving pandemic influenza risk assessment. eLife. 2014;3:e03883. doi: 10.7554/eLife.03883. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Scanlon T.C., Gray E.C., Griswold K.E. Quantifying and resolving multiple vector transformants in S. cerevisiae plasmid libraries. BMC Biotechnol. 2009;9:95. doi: 10.1186/1472-6750-9-95. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Schmidt A.G., Xu H., Khan A.R., O’Donnell T., Khurana S., King L.R., Manischewitz J., Golding H., Suphaphiphat P., Carfi A. Preconfiguration of the antigen-binding site during affinity maturation of a broadly neutralizing influenza virus antibody. Proc. Natl. Acad. Sci. USA. 2013;110:264–269. doi: 10.1073/pnas.1218256109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Schwehm J.M., Kristyanne E.S., Biggers C.C., Stites W.E. Stability effects of increasing the hydrophobicity of solvent-exposed side chains in staphylococcal nuclease. Biochemistry. 1998;37:6939–6948. doi: 10.1021/bi9725069. [DOI] [PubMed] [Google Scholar]
  97. Seydoux E., Homad L.J., MacCamy A.J., Parks K.R., Hurlburt N.K., Jennewein M.F., Akins N.R., Stuart A.B., Wan Y.-H., Feng J. Analysis of a SARS-CoV-2-Infected Individual Reveals Development of Potent Neutralizing Antibodies with Limited Somatic Mutation. Immunity. 2020;53:98–105.e5. doi: 10.1016/j.immuni.2020.06.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Shah P., McCandlish D.M., Plotkin J.B. Contingency and entrenchment in protein evolution under purifying selection. Proc. Natl. Acad. Sci. USA. 2015;112:E3226–E3235. doi: 10.1073/pnas.1412933112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Shang J., Ye G., Shi K., Wan Y., Luo C., Aihara H., Geng Q., Auerbach A., Li F. Structural basis of receptor recognition by SARS-CoV-2. Nature. 2020;581:221–224. doi: 10.1038/s41586-020-2179-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Sheahan T., Rockx B., Donaldson E., Sims A., Pickles R., Corti D., Baric R. Mechanisms of zoonotic severe acute respiratory syndrome coronavirus host range expansion in human airway epithelium. J. Virol. 2008;82:2274–2285. doi: 10.1128/JVI.02041-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Sheahan T., Rockx B., Donaldson E., Corti D., Baric R. Pathways of cross-species transmission of synthetically reconstructed zoonotic severe acute respiratory syndrome coronavirus. J. Virol. 2008;82:8721–8732. doi: 10.1128/JVI.00818-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Shi R., Shan C., Duan X., Chen Z., Liu P., Song J., Song T., Bi X., Han C., Wu L. A human neutralizing antibody targets the receptor-binding site of SARS-CoV-2. Nature. 2020;584:120–124. doi: 10.1038/s41586-020-2381-y. [DOI] [PubMed] [Google Scholar]
  103. Shusta E.V., Kieke M.C., Parke E., Kranz D.M., Wittrup K.D. Yeast polypeptide fusion surface display levels predict thermal stability and soluble secretion efficiency. J. Mol. Biol. 1999;292:949–956. doi: 10.1006/jmbi.1999.3130. [DOI] [PubMed] [Google Scholar]
  104. Smith D.J., Lapedes A.S., de Jong J.C., Bestebroer T.M., Rimmelzwaan G.F., Osterhaus A.D.M.E., Fouchier R.A.M. Mapping the antigenic and genetic evolution of influenza virus. Science. 2004;305:371–376. doi: 10.1126/science.1097211. [DOI] [PubMed] [Google Scholar]
  105. Soskine M., Tawfik D.S. Mutational effects and the evolution of new protein functions. Nat. Rev. Genet. 2010;11:572–582. doi: 10.1038/nrg2808. [DOI] [PubMed] [Google Scholar]
  106. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Starr T.N., Thornton J.W. Epistasis in protein evolution. Protein Sci. 2016;25:1204–1218. doi: 10.1002/pro.2897. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Starr T.N., Flynn J.M., Mishra P., Bolon D.N.A., Thornton J.W. Pervasive contingency and entrenchment in a billion years of Hsp90 evolution. Proc. Natl. Acad. Sci. USA. 2018;115:4453–4458. doi: 10.1073/pnas.1718133115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Suthar M.S., Zimmerman M., Kauffman R., Mantus G., Linderman S., Vanderheiden A., Nyhoff L., Davis C., Adekunle S., Affer M. Rapid generation of neutralizing antibody responses in COVID-19 patients. Cell Reports Medicine. 2020;1:100040. doi: 10.1016/j.xcrm.2020.100040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Tareen A., Kinney J.B. Logomaker: beautiful sequence logos in Python. Bioinformatics. 2020;36:2272–2274. doi: 10.1093/bioinformatics/btz921. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Tien M.Z., Meyer A.G., Sydykova D.K., Spielman S.J., Wilke C.O. Maximum allowed solvent accessibilites of residues in proteins. PLoS ONE. 2013;8:e80635. doi: 10.1371/journal.pone.0080635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Tokuriki N., Stricher F., Serrano L., Tawfik D.S. How protein stability and new functions trade off. PLoS Comput. Biol. 2008;4:e1000002. doi: 10.1371/journal.pcbi.1000002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  113. Tong S., Conrardy C., Ruone S., Kuzmin I.V., Guo X., Tao Y., Niezgoda M., Haynes L., Agwanda B., Breiman R.F. Detection of novel SARS-like and other coronaviruses in bats from Kenya. Emerg. Infect. Dis. 2009;15:482–485. doi: 10.3201/eid1503.081013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Trkola A., Kuster H., Rusert P., Joos B., Fischer M., Leemann C., Manrique A., Huber M., Rehr M., Oxenius A. Delay of HIV-1 rebound after cessation of antiretroviral therapy through passive transfer of human neutralizing antibodies. Nat. Med. 2005;11:615–622. doi: 10.1038/nm1244. [DOI] [PubMed] [Google Scholar]
  115. van Dorp L., Acman M., Richard D., Shaw L.P., Ford C.E., Ormond L., Owen C.J., Pang J., Tan C.C.S., Boshier F.A.T. Emergence of genomic diversity and recurrent mutations in SARS-CoV-2. Infect. Genet. Evol. 2020;83:104351. doi: 10.1016/j.meegid.2020.104351. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. VanderPlas J., Granger B., Heer J., Moritz D., Wongsuphasawat K., Satyanarayan A., Lees E., Timofeev I., Welsh B., Sievert S. Altair: Interactive Statistical Visualizations for Python. JOSS. 2018;3:1057. [Google Scholar]
  117. Walls A.C., Xiong X., Park Y.-J., Tortorici M.A., Snijder J., Quispe J., Cameroni E., Gopal R., Dai M., Lanzavecchia A. Unexpected Receptor Functional Mimicry Elucidates Activation of Coronavirus Fusion. Cell. 2019;176:1026–1039.e15. doi: 10.1016/j.cell.2018.12.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Walls A.C., Park Y.-J., Tortorici M.A., Wall A., McGuire A.T., Veesler D. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell. 2020;181:281–292.e6. doi: 10.1016/j.cell.2020.02.058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Wan Y., Shang J., Graham R., Baric R.S., Li F. Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. J. Virol. 2020;94:e00127-20. doi: 10.1128/JVI.00127-20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  120. Wang X., Minasov G., Shoichet B.K. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J. Mol. Biol. 2002;320:85–95. doi: 10.1016/S0022-2836(02)00400-X. [DOI] [PubMed] [Google Scholar]
  121. Weidenbacher P.A., Kim P.S. Protect, modify, deprotect (PMD): A strategy for creating vaccines to elicit antibodies targeting a specific epitope. Proc. Natl. Acad. Sci. USA. 2019;116:9947–9952. doi: 10.1073/pnas.1822062116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  122. Weile J., Roth F.P. Multiplexed assays of variant effects contribute to a growing genotype-phenotype atlas. Hum. Genet. 2018;137:665–678. doi: 10.1007/s00439-018-1916-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  123. Wentz A.E., Shusta E.V. A novel high-throughput screen reveals yeast genes that increase secretion of heterologous proteins. Appl. Environ. Microbiol. 2007;73:1189–1198. doi: 10.1128/AEM.02427-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Wrapp D., Wang N., Corbett K.S., Goldsmith J.A., Hsieh C.-L., Abiona O., Graham B.S., McLellan J.S. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science. 2020;367:1260–1263. doi: 10.1126/science.abb2507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Wrapp D., De Vlieger D., Corbett K.S., Torres G.M., Wang N., Van Breedam W., Roose K., van Schie L., Hoffmann M., Pöhlmann S., VIB-CMB COVID-19 Response Team Structural Basis for Potent Neutralization of Betacoronaviruses by Single-Domain Camelid Antibodies. Cell. 2020;181:1004–1015.e15. doi: 10.1016/j.cell.2020.04.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Wrobel A.G., Benton D.J., Xu P., Roustan C., Martin S.R., Rosenthal P.B., Skehel J.J., Gamblin S.J. SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects. Nat. Struct. Mol. Biol. 2020;27:763–767. doi: 10.1038/s41594-020-0468-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Wu X., Yang Z.-Y., Li Y., Hogerkorp C.-M., Schief W.R., Seaman M.S., Zhou T., Schmidt S.D., Wu L., Xu L. Rational design of envelope identifies broadly neutralizing human monoclonal antibodies to HIV-1. Science. 2010;329:856–861. doi: 10.1126/science.1187659. [DOI] [PMC free article] [PubMed] [Google Scholar]
  128. Wu K., Peng G., Wilken M., Geraghty R.J., Li F. Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus. J. Biol. Chem. 2012;287:8904–8911. doi: 10.1074/jbc.M111.325803. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Wu N.C., Xie J., Zheng T., Nycholat C.M., Grande G., Paulson J.C., Lerner R.A., Wilson I.A. Diversity of Functionally Permissive Sequences in the Receptor-Binding Site of Influenza Hemagglutinin. Cell Host Microbe. 2017;21:742–753.e8. doi: 10.1016/j.chom.2017.05.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. Wu N.C., Thompson A.J., Lee J.M., Su W., Arlian B.M., Xie J., Lerner R.A., Yen H.-L., Bloom J.D., Wilson I.A. Different genetic barriers for resistance to HA stem antibodies in influenza H3 and H1 viruses. Science. 2020;368:1335–1340. doi: 10.1126/science.aaz5143. [DOI] [PMC free article] [PubMed] [Google Scholar]
  131. Wu Y., Wang F., Shen C., Peng W., Li D., Zhao C., Li Z., Li S., Bi Y., Yang Y. A noncompeting pair of human neutralizing antibodies block COVID-19 virus binding to its receptor ACE2. Science. 2020;368:1274–1278. doi: 10.1126/science.abc2241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  132. Xue K.S., Bloom J.D. Linking influenza virus evolution within and between human hosts. Virus Evol. 2020;6:veaa010. doi: 10.1093/ve/veaa010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  133. Yan R., Zhang Y., Li Y., Xia L., Guo Y., Zhou Q. Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2. Science. 2020;367:1444–1448. doi: 10.1126/science.abb2762. [DOI] [PMC free article] [PubMed] [Google Scholar]
  134. Yang J., Wang W., Chen Z., Lu S., Yang F., Bi Z., Bao L., Mo F., Li X., Huang Y. A vaccine targeting the RBD of the S protein of SARS-CoV-2 induces protective immunity. Nature. 2020 doi: 10.1038/s41586-020-2599-8. Published online July 29, 2020. [DOI] [PubMed] [Google Scholar]
  135. Yuan M., Liu H., Wu N.C., Lee C.-C.D., Zhu X., Zhao F., Huang D., Yu W., Hua Y., Tien H. Structural basis of a shared antibody response to SARS-CoV-2. Science. 2020 doi: 10.1126/science.abd2321. Published online 13 July 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  136. Yuan M., Wu N.C., Zhu X., Lee C.D., So R.T.Y., Lv H., Mok C.K.P., Wilson I.A. A highly conserved cryptic epitope in the receptor binding domains of SARS-CoV-2 and SARS-CoV. Science. 2020;368:630–633. doi: 10.1126/science.abb7269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  137. Zang J., Gu C., Zhou B., Zhang C., Yang Y., Xu S., Zhang X., Zhou Y., Bai L., Wu Y. Immunization with the receptor-binding domain of SARS-CoV-2 elicits antibodies cross-neutralizing SARS-CoV-2 and SARS-CoV without antibody-dependent enhancement. bioRxiv. 2020 doi: 10.1101/2020.05.21.107565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  138. Zhang B.-Z., Hu Y.-F., Chen L.-L., Yau T., Tong Y.-G., Hu J.-C., Cai J.-P., Chan K.-H., Dou Y., Deng J. Mining of epitopes on spike protein of SARS-CoV-2 from COVID-19 patients. Cell Res. 2020;30:702–704. doi: 10.1038/s41422-020-0366-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  139. Zhou H., Chen X., Hu T., Li J., Song H., Liu Y., Wang P., Liu D., Yang J., Holmes E.C. A Novel Bat Coronavirus Closely Related to SARS-CoV-2 Contains Natural Insertions at the S1/S2 Cleavage Site of the Spike Protein. Curr. Biol. 2020;30:2196–2203. doi: 10.1016/j.cub.2020.05.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  140. Zhou P., Yang X.-L., Wang X.-G., Hu B., Zhang L., Zhang W., Si H.-R., Zhu Y., Li B., Huang C.-L. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  141. Zhou T., Teng I.-T., Olia A.S., Cerutti G., Gorman J., Nazzari A., Shi W., Tsybovsky Y., Wang L., Wang S. Structure-Based Design with Tag-Based Purification and In-Process Biotinylation Enable Streamlined Development of SARS-CoV-2 Spike Molecular Probes. bioRxiv. 2020 doi: 10.1101/2020.06.22.166033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  142. Zost S.J., Gilchuk P., Case J.B., Binshtein E., Chen R.E., Nkolola J.P., Schäfer A., Reidy J.X., Trivette A., Nargi R.S. Potently neutralizing and protective human antibodies against SARS-CoV-2. Nature. 2020 doi: 10.1038/s41586-020-2548-6. Published online July 15, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Table S1. Binding and Expression Measurements for RBD Homologs in the Deep Mutational Scanning Libraries, Related to Figure 2
mmc1.csv (1.2KB, csv)
Table S2. Single-Mutant Effects on Expression and Binding, Related to Figure 3
mmc2.csv (263KB, csv)
Data S1. Interactive Heatmap of Mutational Effects on Expression and Binding, Related to Figure 3
mmc3.zip (75.5KB, zip)

Data Availability Statement

We provide all data and code in the following ways:

RESOURCES