Protein Engineering, Design and Selection. 2022 Mar 24;35:gzac002. doi: 10.1093/protein/gzac002

Stabilization of the SARS-CoV-2 receptor binding domain by protein core redesign and deep mutational scanning

Alison C Leonard, Jonathan J Weinstein, Paul J Steiner, Annette H Erbse, Sarel J Fleishman, Timothy A Whitehead
PMCID: PMC9077414  PMID: 35325236

Abstract

Stabilizing antigenic proteins as vaccine immunogens or diagnostic reagents is a stringent case of protein engineering and design as the exterior surface must maintain recognition by receptor(s) and antigen-specific antibodies at multiple distinct epitopes. This is a challenge, as stability-enhancing mutations must be focused on the protein core, whereas successful computational stabilization algorithms typically select mutations at solvent-facing positions. In this study, we report the stabilization of SARS-CoV-2 Wuhan Hu-1 Spike receptor binding domain using a combination of deep mutational scanning and computational design, including the FuncLib algorithm. Our most successful design encodes I358F, Y365W, T430I, and I513L receptor binding domain mutations, maintains recognition by the receptor ACE2 and a panel of different anti-receptor binding domain monoclonal antibodies, is between 1 and 2°C more thermally stable than the original receptor binding domain using a thermal shift assay, and is less proteolytically sensitive to chymotrypsin and thermolysin than the original receptor binding domain. Our approach could be applied to the computational stabilization of a wide range of proteins without requiring detailed knowledge of active sites or binding epitopes. We envision that this strategy may be particularly powerful for cases when there are multiple or unknown binding sites.

Keywords: deep mutational scanning, FuncLib, immunogen design, Rosetta, SARS-CoV-2

Graphical Abstract


Introduction

Many natural proteins are marginally stable (Goldenzweig and Fleishman, 2018), and improving their stability is a common prerequisite for diverse industrial and medical applications, from engineering thermostable proteases for laundry detergents (Wintrode et al., 2000; Zhao and Arnold, 1999) to immunogen design like the pre-fusion stabilizing coronavirus Spike protein mutations used in several regulatory-approved COVID vaccines (Baden et al., 2021; Mulligan et al., 2020; Pallesen et al., 2017). However, many stabilizing mutations can reduce activity or function in the engineered protein (Beadle and Shoichet, 2002). Thus, balancing the tradeoff between improved stability and maintaining function remains a major challenge for the protein designer. Previous work to predict non-disruptive stabilizing mutations has incorporated factors such as evolutionary conservation (Goldenzweig et al., 2016), distance to an active site or binding site (Tokuriki et al., 2007), and the local packing density (Klesmith et al., 2017; Wrenbeck et al., 2019). These and other prediction methods can successfully identify stabilizing mutations with neutral impact on activity without large-scale directed evolution campaigns.

However, such prediction methods identify potential stabilizing mutations that are located predominantly at or peripheral to the solvent-contacting surface of the protein. In addition, the set of mutations identified typically are not in direct contact and thus are usually independent of other mutations (Goldenzweig et al., 2016). Yet for many applications, like immunogen or diagnostic reagent design, the surface of the protein must remain unchanged. Design strategies then would be akin to remodeling a historic building, in which the interior may be reinforced and updated as needed within the requirement that the exterior structure is preserved intact. In this stringent design case, the surface structure can be held in place during computational design with constraints on movement, whereas interior, non-solvent-exposed residues are allowed to vary as necessary barring major deformation of the protein backbone. Stabilizing protein vaccine immunogens is an example of a relevant application because effective stabilized antigens must retain the ability to bind many different antibodies at distinct epitopes across the protein surface and only a small fraction of the epitopes are known. Rather than risk damaging these epitopes, protein core redesign can be used on the protein from the inside out without making significant changes to the surface structure.

One such potential protein of immediate global importance is the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) receptor binding domain (RBD), a diagnostic antigen (Premkumar et al., 2020) and vaccine immunogen (Chen et al., 2017; Tai et al., 2020). RBD is critically important as titers of anti-RBD antibodies are the major correlates of protection in SARS-CoV-2 (Feng et al., 2021; Rogers et al., 2020; Yu et al., 2020). The RBD is a 205 amino acid domain of the SARS-CoV-2 Spike protein that contains distinct epitopes for anti-Spike neutralizing antibodies (Barnes et al., 2020; Dejnirattisai et al., 2021; Francino-Urdaniz et al., 2021). The protein structure is composed primarily of beta sheets with connecting loops and helices and contains 9 cysteine residues, 8 of which form 4 disulfide bonds: C336–C361, C379–C432, C480–C488, and C391–C525 (Lan et al., 2020) (Fig. 1A). The receptor binding motif (RBM) spans residues 438–506 and contains the binding surface for the cell receptor ACE2 (Lan et al., 2020; Wrapp et al., 2020). Away from the RBM, the protein core contains several underpacked regions, most notably a cryptic linoleic acid binding pocket surrounded by positions F338, V341, F342, I358, C361, A363, L368, F374, C379, L387, and F392 (Toelzer et al., 2020) and a buried unsatisfied polar group at Y365. The presence of underpacked hollow cavities and side chains with unsatisfied hydrogen bonds in the core suggests that mutations could improve stability, as successfully demonstrated by the introduction of key RBD mutations into this fatty acid binding pocket by the King lab (Ellis et al., 2021). In a different approach, the King lab also reduced protein aggregation by mutating hydrophobic patches on the surface (Dalvie et al., 2021), though this could potentially remove critical epitopes recognized by the immune system.

Fig. 1.


Evaluation of SARS-CoV-2 S RBD computational designs using yeast display. (A) Structure of SARS-CoV-2 S RBD (PDB ID 6M0J). The RBM for ACE2 spanning positions 438–506 is colored in cyan, residues adjacent to the cryptic fatty acid binding pocket are colored in green, and disulfide bonds are shown as spheres and colored by element. Residues T430 and Y365 containing buried unsatisfied hydrogens are colored in blue. (B) RBD structure with sequence differences for the RBD1 design shown as orange sticks. (C) Sequence diversity of the 5 RBD designs. (D) Cartoon of yeast surface display assay and representative cytograms of yeast displayed WT RBD in the presence and absence of biotinylated ACE2 at the indicated concentrations. (E) Relative RBD surface expression of WT (n = 12) and RBD1 (n = 11) as measured by relative FITC mean fluorescence intensity using an anti-myc FITC antibody (P-value = 2e-7). (F) Apparent KD (KD,app) for the interactions between ACE2 (P-value = 0.13) and CR3022 (P-value = 1.2e-4) and surface-displayed WT and RBD1 (n = 4). (G) Cartoon and cytograms showing protease susceptibility assay for protein stability in yeast surface display. Displayed RBD is treated with protease (chymotrypsin) to denature the RBD before binding with ACE2. Treatment with higher concentrations of protease will result in more highly denatured RBD protein that is less capable of binding ACE2, leading to lower PE signal. (H) Relative binding signal for WT and RBD1 after treatment with 1111, 3333, and 10 000 U/ml chymotrypsin, measured by relative PE mean fluorescence intensity (P-value = 0.0039 at 1111 U/ml, P-value = 0.096 at 3333 U/ml, P-value = 0.039 at 10 000 U/ml chymotrypsin, n = 4).

In this work, we present the stabilization of RBD without mutating key surface epitopes, using computational design and deep mutational scanning to identify stabilizing core mutations on RBD. Combining a subset of these mutations led to at least 1 improved RBD design with greater comparative proteolytic and thermal stability while maintaining recognition by a panel of antibodies binding multiple distinct epitopes. These designs may be useful as diagnostic and vaccine immunogens. More fundamentally, our work advances our understanding of critical factors influencing the successful computational redesign of cores of existing proteins.

Materials and Methods

Plasmid constructs

All plasmids and primers used for this work are listed in Supplementary Tables S1 and S2, respectively. All plasmids were verified by Sanger sequencing. The yeast display plasmid pETconV4 (production_PETCON_V4_B1A2), optimized for use in the protease susceptibility assay, was received as a gift from the University of Washington (Maguire et al., 2021). pJS699 (Wuhan-Hu-1 S-RBD(333-537)-N343Q for fusion to the C-terminus of AGA2) was previously described (Banach et al., 2021).

To construct plasmids pACL002–pACL006 and pACL009–pACL015 containing all RBD design sequences for yeast surface display, DNA sequences were ordered as gBlocks (Integrated DNA Technologies) and cloned into pETconV4 using restriction enzymes NdeI and XhoI (New England Biolabs, NEB).

To construct plasmid pACL007, the wild-type (WT) RBD sequence was amplified from plasmid pJS699 using primers forward_pJS699_RBD_pETconV4 and reverse_pJS699_RBD_pETconV4 to add regions of pETconV4 homology surrounding the RBD sequence. Plasmid pETconV4 was linearized by digestion with NdeI and XhoI (NEB). Both fragments were run on a 1% agarose gel and purified using a Monarch DNA Gel Extraction kit (NEB). The fragments were then joined using the NEBuilder HiFi DNA Assembly protocol (NEB).

DNA sequences for plasmids pACL002–pACL006 and Design1_pETconV4–Design7_pETconV4 containing all RBD design sequences were ordered as gBlocks (Integrated DNA Technologies) and cloned into pETconV4 using restriction enzymes NdeI and XhoI (NEB).

Plasmid pACL008 was created by inserting the BbvCI restriction enzyme site into the pACL005 plasmid using site-directed mutagenesis. Polymerase chain reaction (PCR) primers add_BbvCI_forw_pETconV4 and add_BbvCI_rev_pETconV4 were used to insert the BbvCI restriction enzyme site using KAPA HiFi HotStart ReadyMix (Roche). The PCR product was run on a 1% agarose gel and extracted to purify. About 1 μL of the purified product was ligated into a circular plasmid using KLD reaction mix (NEB) incubated for 10 min at room temperature.

RBD designs to be tested further as soluble proteins were codon optimized for Pichia pastoris (Integrated DNA Technologies), ordered as gBlocks, and cloned into the pPICZαA secreted expression vector (Thermo Fisher V19520) using EcoRI and SacII restriction sites. RBD designs in P. pastoris do contain the N343 glycan that was removed from designs optimized for yeast surface display via an N343Q mutation. N343 was reintroduced to RBD sequences cloned into P. pastoris plasmids by Q5 site-directed mutagenesis (NEB) using primers Q343N_SDM_WT_D1_D3_F and Q343N_SDM_R.

Computational protein design

Initial designs were generated using 2 separate Rosetta-based methods (Leman et al., 2020). RBD1 and RBD2 were designed using FuncLib (Khersonsky et al., 2018) run on PDB entry 6VSB, using a position-specific scoring matrix (PSSM) threshold of 0 and a ΔΔG threshold of +3 Rosetta energy units computed using the Rosetta energy function 2015 (ref_2015). The RBD was manually split into 2 spatial sub-domains to reduce the combinatorial complexity. Positions in sub-domain 1: 341, 342, 358, 365, 368, 377, 387, 392, 395, 397, 431, 434, 511, 513, 515, 524. Positions in sub-domain 2: 350, 398, 400, 401, 402, 410, 418, 419, 423, 425, 430, 433, 438, 442, 495, 497, 507, 510, 512. FuncLib results for each sub-domain were sorted by Rosetta energy, and the best designs separated by at least 3 mutations from one another were chosen. All chosen mutants from both sub-domains were combined, modeled, and ranked by energy. The resulting combined designs were clustered according to sequence difference as aforementioned. The 30 best scoring combined designs were manually inspected, and RBD1 and RBD2 were chosen for experimental testing.
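The selection step described above (sort by energy, then keep only designs that differ from one another by at least 3 mutations) can be illustrated with a short greedy filter. This is a minimal sketch under stated assumptions, not the FuncLib implementation: the function names, the energies, and the example sequences are hypothetical and serve only to show the logic.

```python
from typing import Dict, List, Tuple

# A design maps each designed position to its amino acid (hypothetical representation).
Design = Dict[int, str]


def n_differences(a: Design, b: Design) -> int:
    """Count designed positions at which two designs carry different residues."""
    positions = set(a) | set(b)
    return sum(1 for p in positions if a.get(p) != b.get(p))


def pick_diverse(scored: List[Tuple[float, Design]], min_diff: int = 3) -> List[Tuple[float, Design]]:
    """Walk designs from lowest to highest energy and keep a design only if it
    differs from every design already kept at min_diff or more positions."""
    kept: List[Tuple[float, Design]] = []
    for energy, design in sorted(scored, key=lambda item: item[0]):
        if all(n_differences(design, other) >= min_diff for _, other in kept):
            kept.append((energy, design))
    return kept


# Hypothetical energies (Rosetta energy units) and sequences, for illustration only.
designs = [
    (-310.2, {358: "F", 365: "W", 513: "L"}),
    (-309.8, {358: "F", 365: "W", 513: "I"}),  # 1 difference from the best design: skipped
    (-305.1, {358: "L", 365: "F", 513: "V"}),  # 3 differences from the best design: kept
]
selected = pick_diverse(designs)
for energy, design in selected:
    print(energy, design)
```

A greedy pass of this kind favors the lowest-energy representative of each group of similar sequences; other clustering schemes would be equally compatible with the description in the text.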

Proteins RBD3, RBD4, and RBD5 were designed using FastDesign (Maguire et al., 2021) (Rosetta code access date 15 April 2020) on an RBD structure (PDB ID: 6M0J) prepacked using FastRelax, with alternating cycles of repacking and minimization with design. Instead of fixing the backbone, coordinate constraints were applied to non-core residues, with constraints scaled to the B-factor of that atom. Core residues were identified by the layer selection command, with the exclusion of cysteines.

To introduce additional mutational variation into designs RBD7-12, additional rounds of Rosetta design were performed using different RBD structures (PDB ID: 6M0J, 7JMO, and 6LZG), as well as varying scaling of B-factor constraints to allow more or less flexibility to the protein surface. Allowed mutations were selected using a resfile in which the default was allowing packing but not design (NATAA). Positions with high yeast surface display expression identified in the deep mutational scanning experiments described here and/or in Starr et al. (2020) were allowed to mutate to either the WT identity or any of the possible beneficial mutations identified by deep mutational scanning (PIKAA).
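The resfile logic described above (NATAA as the default, PIKAA at positions informed by deep mutational scanning) can be sketched by generating the file programmatically. The positions, allowed residues, and chain identifier below are hypothetical placeholders, and the helper name is not taken from the original work; only the NATAA, PIKAA, and `start` keywords are standard Rosetta resfile syntax.

```python
from typing import Dict


def build_resfile(allowed: Dict[int, str], chain: str = "E") -> str:
    """Return resfile text in which every residue defaults to NATAA (repack only)
    and each listed position may adopt any of the listed one-letter residues."""
    lines = ["NATAA", "start"]
    for position in sorted(allowed):
        lines.append(f"{position} {chain} PIKAA {allowed[position]}")
    return "\n".join(lines) + "\n"


# Hypothetical example: WT identity plus one candidate mutation at each of two positions.
# The chain ID "E" is an assumption and must match the chain of the input structure.
allowed_mutations = {358: "IF", 365: "YW"}
resfile_text = build_resfile(allowed_mutations)
print(resfile_text)
```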

Recombinant protein production, purification, and preparation

ACE2-Fc, produced and purified following Walls et al. (2020), and CR3022 (ter Meulen et al., 2006) were kind gifts from Neil King’s lab at the University of Washington. The anti-SARS-CoV-2 RBD antibody panel used (CC6.29, CC6.32, CC6.33, CC12.1, CC12.7) was a kind gift from Dennis Burton’s lab at Scripps and was produced and purified according to Rogers et al. (2020). Background binding level in enzyme-linked immunosorbent assays (ELISAs) was measured using Human IgG Isotype Control (Thermo Fisher No. 02-7102).

RBD designs were produced recombinantly in P. pastoris as follows. pPICZα vectors (Thermo Fisher V19520) containing WT RBD or RBD designs were linearized by SacI and greater than 5 μL were transformed into electrocompetent P. pastoris X-33 (Thermo Fisher C18000) at 2000 V using a 2 mm electroporation cuvette (Bulldog Bio) and Eppendorf electroporator and then plated on yeast extract peptone dextrose plus sorbitol (YPDS) plates (YPDS: 1% w/v yeast extract, 2% w/v peptone, 2% v/v glucose, plus 1.0 M sorbitol) supplemented with 100 μg/ml zeocin (Thermo Fisher 25001). Single colonies were selected and streaked on minimal dextrose (1.34% w/v yeast nitrogen base, 4 × 10^−5% w/v biotin, 2% v/v glucose), minimal methanol (1.34% w/v yeast nitrogen base, 4 × 10^−5% w/v biotin, 0.5% v/v methanol), and YPDS plus 400 μg/ml zeocin to identify the colonies with MutS phenotype that are most resistant to zeocin. Integration of the RBD sequence was confirmed using colony PCR using primer colonyPCR_pichiaRBD_rev.

For RBD production, single colonies were used to inoculate buffered complex glycerol (BMGY) medium (1% w/v yeast extract, 2% w/v peptone, 1.34% w/v YNB, 400 μg/L biotin, 0.1 M potassium phosphate, pH 6.0, 1% v/v glycerol), scaling up to 1 L volume grown at 30°C with 250 rpm agitation until the culture reached OD600 = 2–6. Cells were harvested, resuspended in buffered methanol complex medium (BMMY: 1% w/v yeast extract, 2% w/v peptone, 1.34% v/v 10× yeast nitrogen base, 400 μg/L biotin, 0.1 M potassium phosphate, pH 6.0, 0.5% v/v methanol), and incubated at 30°C with 250 rpm agitation for 4 days, adding 0.5% v/v methanol daily to maintain induction. Cells were removed by centrifugation at 3200 × g for 10 min.

RBD harvest and Ni2+-NTA column purification were done as described (Argentinian AntiCovid Consortium, 2020). After elution with imidazole, the purified protein was concentrated using a 10 kDa MW cutoff Amicon centrifugal filter (Sigma), buffer exchanged into phosphate buffered saline (PBS) (10 mM Na2HPO4, 1.8 mM KH2PO4, 2.7 mM KCl, 137 mM NaCl, pH 7.4) using PD-10 desalting columns (GE Healthcare), and stored at 4°C. Protein was quantified by absorbance at 280 nm using the theoretical extinction coefficient derived from the protein sequence when all 4 disulfide bonds are intact (WT: 33850 M−1 cm−1, RBD6: 37860 M−1 cm−1, RBD8: 37860 M−1 cm−1, RBD10: 36370 M−1 cm−1). To visualize protein bands, samples were denatured for 10 min at 99°C in SDS sample buffer (188 mM Tris-Cl, 3% w/v SDS, 30% v/v glycerol, 0.01% v/v bromophenol blue) plus 100 mM dithiothreitol (DTT) and separated by 4–20% gradient SDS-PAGE gel electrophoresis (BioRad 4568096). To visualize the N343 glycan, samples were incubated for 1 h at 37°C with 1 μL Endo H in GlycoBuffer 3 (NEB P0702S) after denaturing in SDS sample buffer before running SDS-PAGE.
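The theoretical extinction coefficients quoted above can be reproduced from sequence composition. The sketch below assumes the widely used per-residue values at 280 nm (5500 M−1 cm−1 per Trp, 1490 per Tyr, 125 per disulfide), which the text does not state explicitly; the function name and the toy peptide are hypothetical, and no RBD sequence is included here.

```python
def extinction_coefficient(sequence: str, n_disulfides: int) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)
    from Trp and Tyr counts plus the number of intact disulfide bonds.
    The per-residue coefficients are assumed standard values."""
    return (5500 * sequence.count("W")
            + 1490 * sequence.count("Y")
            + 125 * n_disulfides)


# Toy peptide (hypothetical): one Trp and two Tyr, no disulfides.
toy = extinction_coefficient("AWYYG", n_disulfides=0)

# Effect of a single Tyr-to-Trp substitution on the coefficient.
tyr_to_trp_shift = extinction_coefficient("W", 0) - extinction_coefficient("Y", 0)
print(toy, tyr_to_trp_shift)
```

Under these assumed coefficients, one Tyr-to-Trp substitution raises the value by 4010 M−1 cm−1, which equals the difference between the WT (33850) and RBD6 (37860) values listed above.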

Yeast display titrations and protease susceptibility screening

For initial screening of the designs against wild type, EBY100 Saccharomyces cerevisiae harboring the various RBD display plasmids were grown in 1 mL M19D (5 g/l casamino acids, 40 g/l dextrose, 80 mM MES free acid, 50 mM citric acid, 50 mM phosphoric acid, 6.7 g/l yeast nitrogen base, adjusted to pH 7 with 9 M NaOH, 1 M KOH) overnight at 30°C. Expression was induced by resuspending the M19D culture to OD600 = 1 in M19G (5 g/l casamino acids, 40 g/l galactose, 80 mM MES free acid, 50 mM citric acid, 50 mM phosphoric acid, 6.7 g/l yeast nitrogen base, adjusted to pH 7 with 9 M NaOH, 1 M KOH) and growing 22 h at 22°C with shaking at 300 rpm. Yeast surface display titrations for hACE2-Fc and CR3022 IgG were performed as described by Chao et al. (2006) with an incubation time of 3 h at room temperature and using secondary labels 0.6 μL anti-c-myc-FITC (Miltenyi Biotec) and 0.25 μL Goat anti-Human IgG Fc PE conjugate (Thermo Fisher No. 12-4998-82). Titrations were performed in biological replicates (cultures from independent colonies grown on separate days) and technical replicates (n = 4). The levels of display and binding were assessed by fluorescence measurements for fluorescein (FITC) and phycoerythrin (PE) using a Sony SH800 cell sorter equipped with a 70 μm sorting chip and 488 nm laser. Cells for protease susceptibility measurements were grown and prepared for titrations as described by Chao et al. (2006). Cells were first treated with 1000–4000 U/mL chymotrypsin (Sigma No. XC4129) for 5 min as described in Rocklin et al. (2017), using a 200 μL volume. Chymotrypsin activity was determined relative to trypsin using the Pierce fluorescent protease assay (Thermo Fisher), and trypsin concentration was quantified using an enzymatic assay with Nα-Benzoyl-L-arginine ethyl ester (Sigma-Aldrich) exactly according to Rocklin et al. (2017) except with volumes scaled to be read on a 96-well plate reader. After chymotrypsin treatment, cells were incubated with 200 nM hACE2-Fc for 3 h at room temperature, washed with Tris-buffered saline + BSA (TBSF), and labeled with 0.6 μL anti-c-myc-FITC (Miltenyi Biotec) and 0.25 μL Goat anti-Human IgG Fc PE conjugate (Thermo Fisher No. 12-4998-82). Protease susceptibility measurements were performed in biological replicates (n = 2) and technical replicates (n = 2).

For screening of subsequent designs, S. cerevisiae EBY100 harboring the various RBD display plasmids were grown from −80°C cell stocks in 1 mL SDCAA (SDCAA yeast minimal media with dextrose: 2% w/v glucose, 0.67% yeast nitrogen base, 0.5% casamino acids, 0.54% di-sodium phosphate, 0.86% monosodium phosphate) for 4–6 h at 30°C. Expression was induced by resuspending the SDCAA culture to OD600 = 1 in SGCAA (SGCAA yeast minimal media with galactose: 2% w/v galactose, 0.67% yeast nitrogen base, 0.5% casamino acids, 0.54% di-sodium phosphate, 0.86% monosodium phosphate) and growing for 22 h at 22°C with shaking at 300 rpm. Yeast surface display titrations and protease susceptibility measurements were performed as aforementioned, except with an incubation time with hACE2-Fc or CR3022 of 4 h at room temperature.

Preparation and screening of mutagenic library

We identified 90 positions where the RBD1 residue has a relative solvent accessibility of less than or equal to 20%. Relative solvent accessibility was calculated by normalizing the solvent-accessible surface area of the RBD1 design, calculated using DSSP (Kabsch and Sander, 1983), to the maximum theoretical solvent accessibility of each residue (Tien et al., 2013). These 90 positions were mutated to every other amino acid plus stop codon by comprehensive nicking mutagenesis (Wrenbeck et al., 2016) using NNK primers (Supplementary Table S2) and template plasmid pACL008. For compatibility with 250 bp paired end Illumina sequencing, the mutagenic library was divided into 2 tiles. Tile 1 encompassed positions 336–430 and tile 2 encompassed positions 431–524. Biological replicates of each library were prepared by separate nicking mutagenesis reactions. Library plasmids were transformed into chemically competent EBY100 cells as described (Medina-Cucurella and Whitehead, 2018). Yeast stocks were stored in yeast storage buffer (20% w/v glycerol, 200 mM NaCl, 20 mM HEPES pH 7.5) at −80°C. Serial dilutions were plated on SDCAA and incubated 3 days to calculate the transformation efficiency.
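The core-position filter described above amounts to dividing each residue's solvent-accessible surface area by a per-residue maximum and applying the 20% cutoff. The following is a minimal sketch: the maximum values shown are those we understand the theoretical scale of Tien et al. (2013) to list and should be checked against that reference, and the per-position surface areas are hypothetical numbers rather than DSSP output for RBD1.

```python
from typing import Dict, List, Tuple

# Maximum solvent accessibility in square angstroms for a few residue types.
# Assumed values from the theoretical scale of Tien et al. (2013); verify before use.
MAX_ASA = {"A": 129.0, "F": 240.0, "I": 197.0, "L": 201.0, "Y": 263.0}


def relative_accessibility(residue: str, sasa: float) -> float:
    """Relative solvent accessibility: observed SASA over the residue's maximum."""
    return sasa / MAX_ASA[residue]


def core_positions(sasa_by_position: Dict[int, Tuple[str, float]], cutoff: float = 0.20) -> List[int]:
    """Return positions whose relative solvent accessibility is at or below the cutoff."""
    return [
        position
        for position, (residue, sasa) in sorted(sasa_by_position.items())
        if relative_accessibility(residue, sasa) <= cutoff
    ]


# Hypothetical SASA values (square angstroms), for illustration only.
example = {358: ("I", 5.0), 365: ("Y", 30.0), 449: ("Y", 120.0)}
buried = core_positions(example)
print(buried)
```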

To screen the core residue mutagenic library for protease-resistant mutations, library yeast stocks for each tile and biological replicate were thawed, centrifuged for 3 min at 2500 × g, resuspended in 1 mL of SDCAA, and grown for 4–6 h at 30°C. Expression was then induced by resuspending the SDCAA culture to OD600 = 1 in 1 mL SGCAA and growing for 18 h at 22°C with shaking at 300 rpm, after which cells were resuspended in 1 mL TBSF at OD600 = 2 (2 × 10^7 cells). Cells in TBSF were treated with either 0 (reference population), 1000, 2000, or 4000 U/ml chymotrypsin as aforementioned, incubating for 5 min at room temperature with an occasional vortex, then spun down at 2500 × g and washed with 2 mL TBSF 3 times. After chymotrypsin treatment, cells were incubated with 300 pM hACE2-Fc in a 10:1 hACE2-Fc/displayed protein ratio (Medina-Cucurella and Whitehead, 2018) for 4 h at room temperature, washed with TBSF, labeled with 50 μL Goat anti-Human IgG Fc PE conjugate (Thermo Fisher Scientific Invitrogen Catalog No. 12-4998-82) diluted in 1.95 mL TBSF for 10 min covered on ice, and washed again with TBSF. Labeled cells were sorted on a Sony SH800 cell sorter, with an SSC-A/FSC-A gate used to identify cells, a FSC-H/FSC-A gate to select only single cells, and a PE-A/FITC-A gate used to identify cells that bind hACE2-Fc (Supplementary Fig. S1). For both chymotrypsin-treated populations and the no-protease reference population, cells binding hACE2-Fc were collected using the PE/FITC gate with a PE fluorescence signal greater than 2000. A minimum of 170 000 cells for tile 1 and 130 000 cells for tile 2 were collected to compile at least 100-fold more cells than the theoretical sub-library size (Medina-Cucurella and Whitehead, 2018). Collected cells were recovered in SDCAA with 1× PenStrep for 30 h then frozen at −80°C in yeast storage buffer in 1 mL aliquots at OD600 = 4.
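The sorting targets above follow from simple arithmetic if the theoretical sub-library size is counted at the codon level, with 32 NNK codons per mutated position. That counting convention is an assumption, since the text does not spell it out; the per-tile position counts (52 and 38) are those reported in the Results.

```python
NNK_CODONS = 32  # N = A/C/G/T at two positions, K = G/T at the third: 4 * 4 * 2


def theoretical_library_size(n_positions: int) -> int:
    """Number of distinct single-codon NNK variants across the mutated positions."""
    return n_positions * NNK_CODONS


def minimum_cells(n_positions: int, fold: int = 100) -> int:
    """Cells to collect so that the sort oversamples the library by the given fold."""
    return fold * theoretical_library_size(n_positions)


tile1_min = minimum_cells(52)  # tile 1: 52 core positions
tile2_min = minimum_cells(38)  # tile 2: 38 core positions
print(tile1_min, tile2_min)
```

Under this assumption the thresholds come to 166 400 and 121 600 cells, which sit just below the 170 000 and 130 000 cells collected for tiles 1 and 2.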

Deep sequencing preparation

Libraries were prepared for deep sequencing as described in Medina-Cucurella and Whitehead (2018), using a Zymo Yeast Plasmid Miniprep II kit (Zymo Research) and a Monarch PCR & DNA Cleanup kit (NEB) with the following changes. Inner primers RBD-F1_tile1_F_Illumina and RBD-F1_tile1_R_Illumnia were used to amplify tile 1 with an annealing temperature of 70°C; inner primers RBD-F1_tile2_F_Illumina and RBD-F1_tile2_R_Illumnia were used to amplify tile 2 with an annealing temperature of 63°C. About 5 μL of PCR product from the inner primer amplification was cleaned using 0.5 μL Exonuclease I (NEB) and 1 μL rSAP (NEB), incubating for 15 min at 37°C then 15 min at 80°C. Then 2 μL of purified DNA was carried forward to the second PCR reaction. Samples were purified using Agencourt Ampure XP beads (Beckman Coulter), quantified using PicoGreen (Thermo Fisher), pooled, and sequenced on an Illumina MiSeq using 2 × 250 bp paired-end reads at the BioFrontiers Sequencing Core (University of Colorado, Boulder, CO). Library statistics are listed in Supplementary Table S3.

Deep sequencing analysis

All deep sequencing data analysis was performed using the Protein Analysis and Classifier Toolkit (Klesmith and Hackel, 2019), available on GitHub (https://github.com/JKlesmith/PACT).

Analysis was performed using the ‘fitness’ protocol with ‘mutationtype: single’, ‘mutthreshold: 1’, ‘min_coverage: 0.2’, ‘qaverage: 20’, ‘ref_count_threshold: 5’ and ‘sel_count_threshold: 5’.
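For orientation, the kind of quantity such an analysis produces can be sketched as a log2 enrichment ratio with minimum read-count filters. This is a generic illustration and not PACT's implementation: the function, its arguments, and the way the two count thresholds are applied are assumptions, and the read counts are hypothetical.

```python
import math
from typing import Optional


def log2_enrichment(sel_count: int, sel_total: int,
                    ref_count: int, ref_total: int,
                    min_count: int = 5) -> Optional[float]:
    """Log2 ratio of a variant's frequency in the selected population to its
    frequency in the reference population. Returns None when either count
    falls below the minimum, mirroring a read-count threshold of 5."""
    if sel_count < min_count or ref_count < min_count:
        return None
    selected_frequency = sel_count / sel_total
    reference_frequency = ref_count / ref_total
    return math.log2(selected_frequency / reference_frequency)


# Hypothetical read counts: the variant doubles in frequency after selection.
enriched = log2_enrichment(200, 100_000, 100, 100_000)
too_few_reads = log2_enrichment(3, 100_000, 100, 100_000)
print(enriched, too_few_reads)
```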

ELISA binding affinity measurements

About 50 μL of each purified soluble RBD variant at a protein concentration of 2 μg/ml was immobilized to Microlon clear plates (Greiner Bio-One 655081) at 4°C overnight. Plates were sealed using Microseal B adhesive sealers (BioRad MSB-1001). The following day, the solutions were decanted by flicking and all wells were washed 3 times with 200 μL of PBS plus milk (PBSM) (PBS + 0.1% Tween-20 + 3% non-fat milk) followed by vigorous tamping on a pad of paper towels to remove residual liquid. Plates were blocked using 100 μL PBSM for 1 h at room temperature. Blocked plates were decanted by flicking. Binding reactions were assembled using serial dilutions of hACE2-Fc in PBSM from 100 pM to 12.8 nM, or appropriate concentrations of binding antibody. Plates were transferred to a plate shaker (Heidolph Titramax 1000) and incubated at room temperature for 4 h with shaking at 400 rpm. Plates were decanted by flicking and were washed 3 times with 200 μL PBSM followed by vigorous tamping on a pad of paper towels. Detection was enabled by the addition of 100 μL Goat anti-Human IgG Fc secondary antibody conjugated to horseradish peroxidase (Thermo Fisher No. A18817), diluted 1:50 000 into PBSM, and incubated in the above plate shaker for 1 h at room temperature. Binding was visualized through the addition of 50 μL 1-Step™ Ultra TMB-ELISA Substrate Solution (Thermo Fisher, No. 34028), incubated in the above plate shaker for approximately 5 min. TMB development was quenched by the addition of 50 μL of 2 M sulfuric acid and plates were read at 450 nm using a Biotek Synergy H1 Hybrid multimode plate reader. The KD,app for each reaction was calculated using non-linear least squares regression performed using custom Python scripts.
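The custom scripts are not reproduced here. As an illustration of the fitting step, the sketch below fits a one-site binding model, signal = amplitude × c / (KD + c) + background, by least squares. The model form, the function names, the twofold dilution series, and the synthetic data are all assumptions. For each trial KD the amplitude and background are solved by ordinary linear regression, and KD itself is found by a golden-section search on a log scale.

```python
import math
from typing import List, Tuple


def _linear_fit(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = slope * x + intercept.
    Returns (slope, intercept, sum of squared residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sse


def fit_kd(conc: List[float], signal: List[float],
           kd_low: float, kd_high: float, steps: int = 60) -> Tuple[float, float, float]:
    """Fit signal = amplitude * c / (KD + c) + background.
    Returns (KD, amplitude, background); KD is in the units of conc."""
    def occupancy(kd: float) -> List[float]:
        return [c / (kd + c) for c in conc]

    def sse_at(log_kd: float) -> float:
        return _linear_fit(occupancy(10 ** log_kd), signal)[2]

    # Golden-section search over log10(KD); assumes a single minimum in the bracket.
    low, high = math.log10(kd_low), math.log10(kd_high)
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(steps):
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if sse_at(left) < sse_at(right):
            high = right
        else:
            low = left
    kd = 10 ** ((low + high) / 2)
    amplitude, background, _ = _linear_fit(occupancy(kd), signal)
    return kd, amplitude, background


# Synthetic, noise-free data (hypothetical): twofold series from 0.1 to 12.8 nM,
# generated with KD = 1.0 nM, amplitude = 2.0, background = 0.05.
concentrations = [0.1 * 2 ** i for i in range(8)]
absorbance = [2.0 * c / (1.0 + c) + 0.05 for c in concentrations]
kd_fit, amp_fit, bg_fit = fit_kd(concentrations, absorbance, kd_low=0.01, kd_high=100.0)
print(kd_fit, amp_fit, bg_fit)
```

With real absorbance data, a library routine such as a general non-linear least squares solver would be a reasonable alternative and would also provide parameter uncertainties.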

Thermal melt assay

WT and RBD design apparent melting temperatures, Tm,app, were measured using a SYPRO Orange thermal shift assay (Huynh and Partch, 2015). About 2 μL of 200× SYPRO Orange (Life Technologies) was added to 2 μg purified RBD protein in 18 μL PBS in a MicroAmp Fast 96-Well Reaction Plate (0.1 mL) (Applied Biosystems), for a total reaction volume of 20 μL. Plates were sealed with Microseal ‘B’ seals (BioRad). Thermal melt analysis was performed in a QuantStudio 6 Flex Real-Time PCR device (Thermo Fisher) over a range of 25–85°C with 1°C change per minute and with a 2-min incubation at the first and last temperatures. Tm,app was determined by calculating the minimum of the first derivative of measured fluorescence. Replicate experiments A and B were performed 1 week apart on protein samples from the same purification batch.
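The Tm,app determination can be sketched with a finite-difference derivative. One assumption needs stating plainly: the sketch takes the derivative to be reported as −dF/dT, a common convention for melt curves, so that its minimum marks the steepest rise in SYPRO Orange fluorescence. The text does not specify the sign convention, and the melt curve below is synthetic.

```python
import math
from typing import List


def apparent_tm(temperatures: List[float], fluorescence: List[float]) -> float:
    """Return the temperature at which -dF/dT, estimated by central differences,
    reaches its minimum. Assumes the negative-derivative sign convention."""
    best_temperature = temperatures[1]
    best_value = math.inf
    for i in range(1, len(temperatures) - 1):
        slope = ((fluorescence[i + 1] - fluorescence[i - 1])
                 / (temperatures[i + 1] - temperatures[i - 1]))
        if -slope < best_value:
            best_value = -slope
            best_temperature = temperatures[i]
    return best_temperature


# Synthetic sigmoidal melt curve (hypothetical) centered at 52 degrees C,
# sampled every 1 degree C from 25 to 85 as in the protocol above.
temps = list(range(25, 86))
signal = [1.0 / (1.0 + math.exp(-(t - 52) / 1.5)) for t in temps]
tm_app = apparent_tm(temps, signal)
print(tm_app)
```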

In vitro proteolysis assay

The proteolytic stability of RBD designs was tested as described in Whitehead et al. (2009) with the following changes. About 1 μg RBD protein was incubated in 50 mM Tris–HCl pH 8.0, 0.5 mM CaCl2 at 37°C for 1 h in the presence and absence of 0.0067 mg/ml thermolysin, for a total reaction volume of 15 μL. Reactions were inactivated with 5 μL of 50 mM EDTA, then 10 μL of the reaction was mixed with 5 μL 3× SDS loading buffer with 100 mM DTT, denatured at 99°C for 10 min, and run on an SDS-PAGE gel. Gel bands were quantified using ImageJ software (Abràmoff et al., 2004) to determine the relative density of each band compared with the average of 3 control (no thermolysin) samples on the same gel.

Results

Initial computational design and characterization of RBD variants

We hypothesized that computational design of the protein core could improve the stability of the RBD from the original Wuhan-Hu-1 SARS-CoV-2 virus without impacting its ability to bind ACE2 or a panel of neutralizing antibodies. We used 2 different Rosetta design protocols (see Methods) to introduce mutations into the RBD protein core, leaving the 4 disulfide bonds intact. The first method used the FuncLib automated design method (Khersonsky et al., 2018) to target mutations to 2 distinct core regions away from the RBM, separated by an internal beta sheet. FuncLib starts with a phylogenetic analysis, restricting design at each of the selected amino acid positions only to mutations that are commonly observed among natural homologs. Then, the method models each mutation in Rosetta, allowing the structure to adapt to the mutation using whole-protein minimization and filtering mutations that are highly destabilizing. Finally, all combinations of mutations at the allowed positions are enumerated using Rosetta atomistic design, relaxed using whole-protein minimization and ranked by energy. Unlike other stability design methods that do not relax all possible mutants (Goldenzweig et al., 2016), FuncLib finds rare combinations of stabilizing constellations of amino acids that can stabilize the core of the protein (VanDrisse et al., 2021; Warszawski et al., 2019). The resulting designs RBD1 and RBD2 mainly increase hydrophobic packing and remove buried unsatisfied hydrogen bonds like those on Y365 and T430 (Fig. 1B), resulting in 8 and 10 mutations, respectively, from the WT RBD sequence (Fig. 1C). The second method used the Rosetta FastDesign packing method (Maguire et al., 2021) to target all core residues while restricting surface residue movement using constraints. The resulting designs, RBD3–RBD5, were more aggressive than RBD1–2 as they contained a combination of either 18 or 19 mutations from WT spread more evenly throughout the core (Fig. 1C). All designs concentrated on several small to large mutations in the underpacked cryptic linoleic acid binding pocket, including at V341, I358, L368, L387, and F392 (Toelzer et al., 2020) (Fig. 1B).

The 5 designs were ordered as synthetic genes, subcloned into a plasmid optimized for the protease susceptibility assay (Maguire et al., 2021), and expressed using a yeast surface display platform (Fig. 1D) previously demonstrated to display properly folded aglycosylated S RBD [Wuhan Hu-1 S RBD (333–537)-N343Q; WT] (Banach et al., 2021; Francino-Urdaniz et al., 2021). Expression of the aglycosylated constructs was induced at 22°C for 22 h in M19G media (see Methods). Protein surface display level assessed by the expression of a C-terminal c-myc epitope tag showed that, of the 5 initial designs tested, only RBD1 displayed on the yeast surface (Fig. 1E; data not shown), suggesting major structural changes of the other designs leading to improper folding. In addition, RBD1 displays at a significantly lower level than WT (P-value = 2e-7, n = 11) (Fig. 1E). This suggests that RBD1 is less stable than WT, since surface expression is known to correlate with stability (Klesmith et al., 2017).

Next, we assessed whether RBD1 maintains fidelity of critical surface epitopes by screening for maintenance of binding to both ACE2 and the antibody CR3022 (Yuan et al., 2020), which binds at a distinct epitope from ACE2. Titrations of ACE2 and CR3022 showed that the apparent dissociation constant, KD,app, did not differ significantly between the wild-type sequence and RBD1 for ACE2 at the 95% confidence level (P-value = 0.12, n = 4), whereas the CR3022 KD,app was slightly but significantly lower (P-value = 1.2 × 10^−4, n = 4) for RBD1 than WT (Fig. 1F). Relative protein stability was assessed using a yeast surface protease susceptibility assay (Rocklin et al., 2017) in which incubation with chymotrypsin cleaves RBD, resulting in diminished ACE2 binding under saturating ACE2 concentrations (Fig. 1G). Treatment with increasing concentrations of chymotrypsin showed that yeast displayed RBD1 is more proteolytically sensitive than WT (P-value = 0.0039 at 1111 U/ml chymotrypsin and P-value = 0.039 at 10 000 U/ml chymotrypsin, n = 4) (Fig. 1H). The higher proteolytic sensitivity and the lower overall surface expression of RBD1 relative to WT suggest that although RBD1 maintains the surface topology of WT, it is a comparatively less stable protein.

Deep mutational scanning identifies additional stabilizing mutations

We hypothesized that deep mutational scanning could identify mutations that improve the stability of the RBD1 design. Although RBD1 is less stable than WT RBD, we hypothesized that most of the 8 mutations were stabilizing. Deep mutational scanning could then identify the few destabilizing mutations while also finding additional stabilizing mutations that complement the mutations introduced into the RBD1 sequence (Fig. 2A). We first identified all 90 positions where an RBD1 residue had a relative solvent accessibility of 20% or less. Next, a site-saturation mutagenesis library of these positions was created using nicking mutagenesis (Wrenbeck et al., 2016). The full library was split into 2 sub-libraries (‘tiles’), with tile 1 encompassing 52 positions between positions 336 and 430 and tile 2 encompassing 38 positions between positions 431 and 524. In addition, replicates (labeled ‘A’ and ‘B’) for each tile were prepared, resulting in a total of 4 separate libraries.
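The position-selection and tiling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy solvent-accessibility values, and the example positions are hypothetical and are not taken from the study's data; only the 20% cutoff and the tile boundaries (336–430 and 431–524) come from the text.

```python
# Hypothetical sketch: pick buried positions by relative solvent accessibility
# (RSA) and assign them to sequencing tiles. RSA values below are made up.

def select_core_positions(rsa_by_position, cutoff=0.20):
    """Return sorted positions whose RSA is at or below the cutoff."""
    return sorted(pos for pos, rsa in rsa_by_position.items() if rsa <= cutoff)


def split_into_tiles(positions, boundaries):
    """Assign each position to the first tile whose inclusive range contains it."""
    tiles = {name: [] for name in boundaries}
    for pos in positions:
        for name, (start, end) in boundaries.items():
            if start <= pos <= end:
                tiles[name].append(pos)
                break
    return tiles


# Toy RSA table (illustrative values only, not measured data).
toy_rsa = {340: 0.05, 350: 0.62, 400: 0.18, 430: 0.20, 431: 0.01, 500: 0.45}

# Tile boundaries as stated in the text.
tile_boundaries = {"tile1": (336, 430), "tile2": (431, 524)}

core = select_core_positions(toy_rsa)
tiles = split_into_tiles(core, tile_boundaries)
print(core)
print(tiles)

# A site-saturation library counted at 20 residues per position for
# 90 positions gives the 1800-variant denominator used in the text.
print(90 * 20)
```

Keeping the cutoff and tile boundaries as parameters makes it easy to see how a stricter burial threshold or a different sequencing read length would change the library composition.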

Fig. 2.

Deep mutational scanning of RBD1 using a yeast-based proteolysis assay. (A) A flow diagram of the protease susceptibility deep mutational scanning protocol. (B) Correlation coefficient R of enrichment ratios between biological replicates as a function of average depth of coverage. (C) Scatterplot of enrichment ratios between replicates at an average depth of coverage of 60 or more reads. Mutations with an average enrichment ratio greater than 1.0 and positive enrichment ratios for both replicates A and B are shown as red closed circles, with all other mutations shown as closed gray circles. (D) Heatmap of hits from the deep mutational scan. Point mutations with a read depth of <60 are colored gray. Hits are color coded by average enrichment ratio in shades of red. All other mutations are colored white. Blue boxes indicate positions where the WT residue is distinct from the RBD1 residue (colored orange).

Libraries were transformed into yeast [S. cerevisiae EBY100 (Boder and Wittrup, 1997)] and induced by galactose to display RBD1 mutants. Cells were then treated with 4 different concentrations of chymotrypsin (a reference at 0 and samples at 1000, 2000, 4000 U/ml), and then screened for maintenance of ACE2 binding. For each sample, cells maintaining ACE2 binding were collected by fluorescence-activated cell sorting. After outgrowth, plasmid DNA was purified, prepared, and deep sequenced. Coverage of all possible single nonsynonymous mutations ranged from 71.9% to 77.7% depending on the tile and replicate. In all, 1533/1800 (74%) possible single point mutants at the 90 core positions were present in the screened libraries, with complete library statistics reported in Supplementary Table S3.

Our deep mutational scanning protease resistance protocol differs from the originally described method (Rocklin et al., 2017), in which Rocklin et al. assessed the maintenance of surface display of de novo designed mini proteins by labeling a C-terminal epitope tag. By contrast, our output from deep mutational scanning is the normalized enrichment ratio of each mutant i (εi) observed in the population relative to the RBD1 sequence (εRBD1), in which the reference population included all cells bound to ACE2 after no chymotrypsin treatment and the selected population included all cells bound to ACE2 after treatment with a specified chymotrypsin concentration, using the following equation:

[Equation 1: present only as an image (DmEquation1.gif) in the source; not reproduced here.]
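The normalization described in the preceding paragraph can be illustrated with a short Python sketch. Because the equation itself is not available in this copy, the form used here is an assumption: a log2 ratio of each variant's frequency in the selected versus the reference population, minus the same quantity for the parental RBD1 sequence. The function names and read counts are hypothetical, and the authors' exact expression may differ.

```python
import math

# Assumed convention (not confirmed by the source): enrichment is the log2
# ratio of a variant's frequency after selection to its frequency in the
# reference population, and normalization subtracts the parental value.


def enrichment_ratio(sel_count, sel_total, ref_count, ref_total):
    """log2 of (frequency in selected population / frequency in reference)."""
    sel_freq = sel_count / sel_total
    ref_freq = ref_count / ref_total
    return math.log2(sel_freq / ref_freq)


def normalized_enrichment(variant_counts, parent_counts):
    """Variant enrichment minus the enrichment of the parental sequence."""
    return enrichment_ratio(*variant_counts) - enrichment_ratio(*parent_counts)


# Hypothetical read counts: (selected count, selected total, reference count, reference total).
variant = (400, 1_000_000, 100, 1_000_000)   # 4-fold enriched after protease treatment
parent = (2000, 1_000_000, 1000, 1_000_000)  # 2-fold enriched after protease treatment

print(normalized_enrichment(variant, parent))
```

With these toy counts the variant is enriched about twofold more than the parent, so the normalized value is close to 1 on the assumed log2 scale. A positive value indicates a variant that survives protease treatment better than RBD1 while still binding ACE2.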

We used this experimental design for 2 major reasons. First, the C-terminal c-myc epitope tag was proteolytically removed from the cell surface before the basal RBD construct began to lose ACE2 binding activity—note Fig. 1G where increasing protease concentration first removes the fluorescence channel corresponding to the loss of the c-myc epitope tag. Second, it is important for any potential end use that mutations conferring stability also confer maintenance of function. In our screen, libraries were labelled with an ACE2-Fc concentration (300 pM) close to the apparent dissociation constant on the yeast surface.

Because our reference population included a selection for maintenance of ACE2 binding, the 1533 mutations observed by deep sequencing are likely enriched in functional binders. Higher frequency mutants are thus likely correlated with ACE2 function, and mutants that are less proteolytically sensitive are likely to be concentrated in this population. We evaluated the correlation coefficient R between biological replicates ‘A’ and ‘B’ as a function of the average depth of coverage (Fig. 2B). As expected, R decreases as the average depth of coverage of included mutations decreases. We note that the correlation observed here (0.61–0.76 depending on the depth of coverage) is worse than typically observed in library sequencing experiments (>0.9) and limits our ability to make inferences about mutants that are more proteolytically susceptible than WT. Although we did not further investigate the reasons for the lower correlation between replicates, one possible explanation is the stringent selection for maintenance of ACE2 binding in the reference population. Nevertheless, we found that an average depth of coverage of 60 reads was sufficient to identify 40 mutants across 20 positions that had reproducible increases in their normalized enrichment ratios (Fig. 2C; a heatmap of these hits is shown in Fig. 2D). These hits are counted more frequently in the chymotrypsin-treated population than in the reference population, likely because they are resistant to chymotrypsin degradation.
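The hit-calling logic can be made concrete with a small Python sketch. The thresholds follow the text and the Fig. 2 caption (average depth of at least 60 reads, average enrichment ratio above 1.0, and a positive enrichment ratio in both replicates); the record layout, field names, and toy values are hypothetical, and the Pearson correlation is written out by hand only to keep the example dependency-free.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def call_hits(records, min_depth=60, min_avg_enrichment=1.0):
    """Return mutations passing the depth and replicate-agreement filters."""
    hits = []
    for rec in records:
        avg_er = (rec["er_a"] + rec["er_b"]) / 2
        if (
            rec["avg_depth"] >= min_depth
            and avg_er > min_avg_enrichment
            and rec["er_a"] > 0
            and rec["er_b"] > 0
        ):
            hits.append(rec["mutation"])
    return hits


# Toy records with placeholder mutation names (not real data).
records = [
    {"mutation": "mut1", "avg_depth": 150, "er_a": 1.4, "er_b": 1.2},   # passes all filters
    {"mutation": "mut2", "avg_depth": 40, "er_a": 2.0, "er_b": 1.8},    # too few reads
    {"mutation": "mut3", "avg_depth": 90, "er_a": 2.5, "er_b": -0.2},   # replicates disagree in sign
    {"mutation": "mut4", "avg_depth": 75, "er_a": 0.6, "er_b": 0.8},    # average enrichment too low
]

hits = call_hits(records)
replicate_r = pearson_r([r["er_a"] for r in records], [r["er_b"] for r in records])
print(hits)
print(replicate_r)
```

Requiring a positive value in both replicates guards against hits driven by a single noisy replicate, which matters when the replicate correlation is as modest as reported here.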

The 40 hits are largely distal to the RBM, with the exception of multiple hydrophobic mutations at N501 (N501F/W/M) (Figs. 2D and 3A). We disregarded these hits because mutations at N501 are known to impact ACE2 affinity, including the N501Y mutation found in several Variants of Concern. The greatest surprise from the protease screen was that, at 5 of the 8 positions mutated in the RBD1 design (368, 410, 418, 438, 513), the WT residue was at least somewhat enriched, suggesting that reversion of the RBD1 mutation back to the WT sequence is stabilizing at that position (Figs. 2D and 3A). The overwhelming majority of other hits are concentrated in the underpacked core where the cryptic linoleic acid binding pocket occurs, most notably at positions 363 and 365 (Figs. 2D and 3A). There are a number of small to large aliphatic or aromatic substitutions occurring at this pocket, including A363WPMILV, A397ML, I434F, and F392W. RBD1 contains a Y365W mutation; both Y and W are disfavored relative to smaller aliphatics without hydrogen bond acceptors or donors. By contrast, the partially surface exposed V362 prefers the isosteric substitution threonine with hydrogen bonding potential.

Fig. 3.

Sequence and structural identification of screening hits informs the next round of computational design. (A) Selected stabilizing mutations identified by library protease screen were either reversions to WT of mutations incorporated into RBD1 (blue sticks; left inserts) or concentrated at or adjacent to the cryptic fatty acid binding pocket (pocket shown as green cartoon, and hits are shown as purple sticks; right insert). Orange sticks represent the original RBD1 residue. (B) Distribution of identified stabilizing mutations incorporated into designs RBD6-RBD12. Hits with an average enrichment ratio > 1.0 and > 60 average read depth from library protease screen are colored red. (C) Flowchart of process for distributing mutations among new designs.

Generation and characterization of the second set of RBD designs

We used the outputs from this deep mutational scan to inform the next set of computational designs (the full list of mutations used is shown as a Venn diagram in Fig. 3B). The parsimonious design RBD6 is encoded by the reversion of 4 mutations at positions 368, 410, 418, and 438. All other designs used this reversion as a base. We used Rosetta FastDesign on multiple RBD structures (see Methods), allowing either the RBD6 residue or a residue sourced from deep mutational scanning (Fig. 3C). Two of the designs used all 40 of the deep mutational scanning hits along with other potential hits just below the average depth of coverage threshold. To increase the diversity of mutations, additional mutations were mined from a related deep mutational scanning dataset from Starr et al. (2020) that quantified the effect of mutation on protein expression rather than stability. Since expression in yeast surface display is correlated with protein stability (Klesmith et al., 2017), core mutations that increase expression are likely to be stabilizing. Two designs incorporated mutations sourced from the Starr dataset. Finally, 2 designs considered mutations from both the Starr et al. dataset and the deep mutational scan from the current work.

All told, we created 7 new designs containing a combination of 19 different predicted stabilizing mutations (Fig. 4A). Synthetic gBlocks for the 7 new designs were ordered and plasmids constructed as before. All designs displayed a c-myc epitope tag on the yeast surface, as determined by labeling with an anti-c-myc FITC secondary antibody. Two designs (RBD9, RBD12) bound neither ACE2 nor CR3022 at saturating concentrations, suggesting misfolding defects (Fig. 4A). Two other designs (RBD7, RBD11) bound ACE2 but not CR3022, suggesting localized misfolding around the CR3022 epitope (Fig. 4A). All 4 of these designs share the V382I and F515W mutations directly underneath the recognition surface of CR3022, suggesting that this combination of mutations likely results in local perturbation of the CR3022 epitope.

Fig. 4.

Core redesign produces a protein design with greater proteolytic stability compared with WT while maintaining recognition by binding proteins. (A) The sequence diversity and binding assay results for the second round of computational designs. Binding was assessed using cell surface labeling by ACE2-Fc (1 nM) and CR3022 (1 nM) of yeast displayed RBD designs. Only RBD6, RBD8, and RBD10 bound both ACE2 and CR3022. (B) Relative apparent dissociation constants KD,app for ACE2 and CR3022 for RBD6, RBD8, and RBD10 (n = 4). KD,app was assessed relative to WT RBD. (C) ACE2 binding signal for WT, RBD6, RBD8, and RBD10 after treatment with increasing protease concentrations relative to no protease, measured by relative PE mean fluorescence intensity (RBD6 P-value = 7e-3 at 2000 U/ml, 5.3e-4 at 4000 U/ml chymotrypsin, RBD8 P-value = 0.011 at 4000 U/ml chymotrypsin, n = 4).

Three of the 7 new designs (RBD6, RBD8, RBD10) bound both ACE2 and CR3022. Cell surface titrations were performed as before, with all binding ACE2 comparable to WT and with slightly lower affinity for CR3022 (Fig. 4B). Testing by the protease susceptibility assay shows that RBD6 (the RBD1 reversion) is significantly more stable than WT at both chymotrypsin concentrations tested (P-value = 7e-3 and 5.3e-4 at 2000 and 4000 U/ml chymotrypsin, respectively, n = 4) (Fig. 4C). Design RBD8 was significantly more stable than WT only at the higher protease concentration of 4000 U/ml (P-value = 0.011, n = 4), whereas any differences in stability between design RBD10 and WT were not statistically significant at a significance threshold of 0.05 (Fig. 4C). Based on this analysis, we decided to express RBD6 and RBD8 recombinantly to determine whether in vitro properties correlated with the yeast surface display assays.

In vitro expression and characterization of successful RBD designs

His-tagged WT, RBD6, and RBD8 constructs [S-RBD(333–537)] were expressed in P. pastoris and purified by affinity chromatography (Argentinian AntiCovid Consortium, 2020). All constructs purified as a single band at an apparent molecular weight of 31 kDa as visualized by SDS-PAGE, with Endo H treatment resulting in a single species migrating slightly faster at the predicted aglycosylated molecular weight of 28 kDa (Supplementary Fig. S1A). From this we conclude that all constructs are glycosylated at least at the lone N-linked glycosylation site at position N343, consistent with previous reports of Pichia recombinant expression (Argentinian AntiCovid Consortium, 2020). For all constructs, we noticed the appearance of a slightly lower MW species visible by SDS-PAGE after extended storage (~3–4 weeks) at 4°C (Supplementary Fig. S1B). We hypothesize that this additional species is a minor proteolysis product, as limited treatment with small amounts of thermolysin could reproduce the appearance of this band (Supplementary Fig. S1C and D).

We first assessed whether recombinant RBD variants could maintain binding to ACE2 and a panel of diverse antibodies that bind at distinct epitopes (CC12.7, CC12.1, CC6.29, CC6.32) (Rogers et al., 2020). Designs were screened for binding using an ELISA, in which recombinant RBD was immobilized on microtiter plates, incubated with varying concentrations of IgG or ACE2-Fc, and then secondarily labeled with anti-human Fc-HRP conjugate. Titration of WT and the designs against ACE2-Fc gave Kd,app values between 1 and 2 nM (Fig. 5A and B), whereas labeling with an IgG isotype control gave a Kd,app that could not be determined (>50 nM). RBD6 and RBD8 bound all antibodies on the panel with similar apparent affinities to WT (Fig. 5A and B), suggesting strong structural conservation of the tertiary structure of the RBD protein domain.

Fig. 5.

Recombinant RBD6 maintains binding to a wide range of neutralizing antibodies and has improved thermal and proteolytic stability compared with wild type RBD. (A) hACE2-Fc titration curves for WT, RBD6, and RBD8 designs by ELISA (n = 3). (B) Calculated KD,app for WT, RBD6, and RBD8 binding hACE2-Fc and antibodies CC12.7, CC12.1, CC6.29, and CC6.32 by ELISA (n = 3). (C) Fluorescence as a function of temperature using a SYPRO Orange thermal shift assay. (D) Apparent melting temperature, Tm,app, calculated from melting curves for WT, RBD6, and RBD8. Both designs show significantly higher Tm,app than WT for both replicates (RBD6: P-value = 1e-3, 5.7e-4; RBD8: P-value = 3.1e-3, 0.011) (n = 4 each of 2 independent replicates). (E) Representative SDS-PAGE gels showing WT, RBD6, and RBD8 protein after 1 h incubation with the protease thermolysin. Thermolysin activity was quenched by addition of EDTA, and samples were then denatured and reduced. Gels were stained with SimplyBlue stain to visualize proteins. (F) Quantification of protein band density after thermolysin relative to untreated control samples. RBD6 was significantly less sensitive to thermolysin treatment than WT (P-value = 0.00023), whereas RBD8 was more sensitive than WT to thermolysin treatment (P-value = 0.00090) (n = 6).

We next assessed the apparent thermal stability (Tm,app) of RBD6 and RBD8 compared with WT using a SYPRO Orange thermal shift assay (Fig. 5C). Both RBD6 and RBD8 reported higher Tm,app than WT, by 1–2°C (P-value = 1e-3, 5.7e-4) and 0.5–1°C (P-value = 3.1e-3, 0.011), respectively (n = 4 in each replicate) (Fig. 5D). Finally, we measured the relative proteolytic stability of the 3 constructs by incubating 1 μg samples for 1 h at 37°C with 193 μM thermolysin. Denatured samples were visualized on an SDS-PAGE gel (Fig. 5E). Quantification of the band intensity shows that more RBD6 protein remained after digestion with thermolysin than WT protein (P-value = 0.00023, n = 6), demonstrating that design RBD6 is more proteolytically stable than WT in addition to possessing increased thermal stability. By contrast, design RBD8 was significantly less proteolytically stable than WT (P-value = 0.00090, n = 6), despite a modest increase in thermal stability (Fig. 5F). From these sets of in vitro experiments, we conclude that the design RBD6 maintains the tertiary surface topology of WT RBD, is more thermally stable, and is less proteolytically sensitive.
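As an illustration of how an apparent melting temperature might be read off a thermal shift curve, the Python sketch below takes Tm,app as the temperature of the steepest fluorescence increase. This is a common convention and is assumed here; the fitting procedure actually used in this study is described in its Methods, which are not part of this section. The melt curves are synthetic sigmoids with made-up midpoints, and all function names are hypothetical.

```python
import math


def apparent_tm(temps, fluorescence):
    """Temperature at the maximum central-difference slope of the melt curve."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_melt_curve(temps, midpoint, width=1.5):
    """Idealized two-state unfolding signal (logistic curve); not real data."""
    return [1.0 / (1.0 + math.exp(-(t - midpoint) / width)) for t in temps]


# Temperature ramp from 40 to 70 degrees C in 0.5 degree steps.
temps = [40 + 0.5 * i for i in range(61)]

# Made-up midpoints chosen only to demonstrate a shift between two constructs.
reference_curve = synthetic_melt_curve(temps, midpoint=52.0)
design_curve = synthetic_melt_curve(temps, midpoint=53.5)

tm_reference = apparent_tm(temps, reference_curve)
tm_design = apparent_tm(temps, design_curve)
print(tm_reference, tm_design, tm_design - tm_reference)
```

Real SYPRO Orange traces are noisier and often decline after the transition, so a production analysis would usually smooth the signal or fit a model rather than take a raw finite difference.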

Discussion

In this work we describe a case study on the thermal stabilization and proteolytic resistance of a SARS-CoV-2 S RBD variant engineered by computational design and deep mutational scanning. The RBD6 variant was still able to bind ACE2 and a panel of antibodies binding at distinct epitopes, showing that the increase in melting temperature and protease resistance did not appreciably change the exterior surface. As such, this is a stringent case of protein design and engineering distinct from other stabilization design and engineering strategies typically targeting surface or surface-proximal positions. Our design strategy is also distinct from the ‘S2P’ and ‘hexapro’ mutations on Spike (Hsieh et al., 2020; Pallesen et al., 2017), which stabilize certain protein conformations over others rather than necessarily stabilizing the folded state of the protein. Our approach could be applied to the computational stabilization of a wide range of proteins without requiring detailed knowledge of active sites or binding epitopes, and may be particularly powerful for cases when there are multiple or unknown binding sites.

The 4 mutations encoded in the successful design RBD6 were I358F, Y365W, T430I, and I513L. Although RBD6 resulted from a multi-point mutant reversion of 4 other nonoptimal residues from RBD1, our deep mutational scanning data indicate that many of the 4 remaining mutations may not be optimal. Although tryptophan seems preferable to tyrosine at position 365, other aromatic or aliphatic mutations like F or M without hydrogen bond donors or acceptors are superior to both. The mutation T430I removes a buried unsatisfied polar group. Our DMS data indicate that the isosteric aliphatic substitution of valine is slightly preferred over isoleucine at this position. Finally, both isoleucine and phenylalanine are slightly preferred over our originally designed I513L mutation in RBD1. These shortcomings could explain our only modest improvement in thermal stability compared with similar design efforts by other groups.

Similar to our work, the King group also found success mining the deep mutational scanning dataset of Starr et al. (Ellis et al., 2021; Starr et al., 2020). Their choice to focus more narrowly on linoleic acid binding pocket mutations led to top designs containing either the single mutation F392W or the three mutations Y365F, F392W, and V395I, with melting temperatures of 1.9–2.4°C and 3.8–5.3°C above wild type, respectively (Ellis et al., 2021). All of these mutations except V395I were identified at a high read depth by the protease library screen and showed convergent strategies for RBD redesigns, even with distinct mutations. Thus, we view our work as complementary to, and confirmatory of, the RBD redesign strategy employed by Ellis and colleagues. The Schreiber group used yeast surface display of RBD and found stabilizing mutations, including I358F found in our RBD1 and RBD6-12 designs (Zahradník et al., 2021).

This case study highlighted several areas in need of improvement for both computational and high-throughput protein engineering. In some respects, these issues are intertwined: higher throughput measurements combined with facile assembly of gene-length designs would allow for testing a richer diversity of design concepts than the 2 methods implemented here. On the computational side, the implementation of FuncLib led to a workable design, whereas the 3 more aggressive FastDesign constructs did not display on the yeast surface. Rather than considering all core positions for computational design, FuncLib reduced the design space by focusing all mutations into 2 designated pockets. The number of mutations considered from the original sequence (8–18) is in the range of successful PROSS designs, as each mutation contributes perhaps 0.5–1°C increase in Tm. In this specific case of the S RBD with only 80 or so mutable core residues, the number of mutations considered in the first round of designs was possibly too aggressive. We would argue for the importance of parsimony in the future sets of initial designs, potentially including even fewer mutations than contained in successful PROSS designs due to the complexity of interactions between core residues. This could be accomplished easily by constraining the maximum number of mutations per design to perhaps 6. Previous FuncLib design applications also exhibited the largest improvements in designs exhibiting 3–6 mutations (Khersonsky et al., 2018; Netzer et al., 2018; VanDrisse et al., 2021; Warszawski et al., 2019). The 2-tiered screening strategy (design, experimentally characterize, then design again) resulted in a more stable RBD6 construct. However, RBD6 was a minimal reversion of the RBD1 construct, and more aggressive designs considering additional mutations ranged from minimally successful to failures. Two-tiered screening has a higher probability of success with a large diversity of the designs, such that potential failure modes are not shared between different designs.

On the experimental side, improving the hit identification from the protease assay used in the deep mutational scan would have improved the quality of the second set of designs. First, our designs used clear hits with an average enrichment ratio greater than 1, but also included some mutations with much lower average read depth (below 60) to increase the diversity of designs. These borderline mutations were generally unsuccessful. Second, our scan considered only single point mutations from the RBD1 design. The majority of the hits were located at or adjacent to the linoleic acid binding pocket, and the experiment did not allow us to determine whether coupling of these individually stabilizing mutations was beneficial. Of course, introducing combinations of mutations increases the size of the library exponentially, and more clever library design would be needed to keep the number of variants to a manageable level. For this example, clear ways to limit library diversity would be to remove inviolable positions like the 8 cysteines involved in the 4 disulfide bonds, to remove positions like N501 where many residues are solvent exposed, and to consider a reduced hydrophobic alphabet (e.g. F, L, I, M, V, Y, A) rather than all 20 amino acids at completely buried positions. Such mutations could be efficiently programmed by existing combinatorial mutagenesis protocols (Kirby et al., 2021).
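The scaling argument in the preceding paragraph can be checked with simple arithmetic. The Python sketch below counts library sizes for a full combinatorial library and for all double mutants, comparing a 20-letter alphabet with the 7-letter hydrophobic alphabet suggested in the text. The function names are hypothetical, and treating the wild-type residue as one of the 7 letters (so 6 substitutions per position) is an assumption made only for this illustration.

```python
from math import comb


def full_combinatorial_size(n_positions, alphabet_size):
    """Number of sequences when every position samples the whole alphabet."""
    return alphabet_size ** n_positions


def double_mutant_count(n_positions, substitutions_per_position):
    """Number of distinct double mutants across all pairs of positions."""
    return comb(n_positions, 2) * substitutions_per_position ** 2


# Fully randomizing 5 pocket positions: all 20 amino acids vs a 7-letter alphabet.
print(full_combinatorial_size(5, 20))  # 3200000
print(full_combinatorial_size(5, 7))   # 16807

# All double mutants across 90 core positions.
# 19 substitutions per position for the full alphabet; 6 per position for the
# reduced alphabet, assuming the wild-type residue is one of its 7 letters.
print(double_mutant_count(90, 19))     # 1445805
print(double_mutant_count(90, 6))      # 144180
```

Even for double mutants alone, the reduced alphabet shrinks the library about tenfold under these assumptions, and dropping invariant positions such as the disulfide cysteines would shrink it further.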

Conclusion

In this work we used computational design and high throughput protein engineering to improve the thermal stability and reduce the proteolytic sensitivity of Wuhan Hu-1 S RBD without impacting its ability to recognize ACE2 and diverse antibodies targeting distinct epitopes.

Data Availability

Raw sequencing reads for this work have been deposited in the SRA under accession number PRJNA797453. All processed deep sequencing runs and all raw data used for main text figures are available as a supplementary Microsoft Excel file (Leonard_complete_data.xlsx).

Author Contributions

Designed proteins computationally: J.W., P.J.S., A.C.L., S.J.F., T.A.W.

Designed bench research: A.C.L., T.A.W.

Performed bench research: A.C.L., A.H.E.

Wrote manuscript: A.C.L., T.A.W.

Funding

This work was supported by the National Science Foundation (CBET Award No. 2030221 to T.A.W.) and the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R01AI141452 to T.A.W., the National Science Foundation Graduate Research Fellowship to A.C.L., and the NIH/CU Molecular Biophysics Program and NIH Biophysics Training Grant T32 GM-065103 to A.C.L. The Fleishman lab was supported by a Consolidator Grant from the European Research Council (815379), the Dr. Barry Sherman Institute of Medicinal Chemistry, and a charitable donation in memory of Sam Switzer. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflicts of Interest

None noted.

Supplementary Material

Leonard_complete_data_gzac002

Contributor Information

Alison C Leonard, Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA.

Jonathan J Weinstein, Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel.

Paul J Steiner, Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA.

Annette H Erbse, Department of Biochemistry, University of Colorado, Boulder, CO 80303, USA.

Sarel J Fleishman, Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel.

Timothy A Whitehead, Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA.

References

  1. Abràmoff, M.D., Magalhães, P.J. and Ram, S.J. (2004) Biophotonics Int., 11, 36–42.
  2. Argentinian AntiCovid Consortium (2020) Sci. Rep., 10, 21779.
  3. Baden, L.R., el Sahly, H.M., Essink, B. et al. (2021) N. Engl. J. Med., 384, 403–416.
  4. Banach, B.B., Cerutti, G., Fahad, A.S. et al. (2021) Cell Rep., 37, 109771.
  5. Barnes, C.O., West, A.P.Jr., Huey-Tubman, K.E. et al. (2020) Cell, 182, 828–842.e16.
  6. Beadle, B.M. and Shoichet, B.K. (2002) J. Mol. Biol., 321, 285–296.
  7. Boder, E.T. and Wittrup, K.D. (1997) Nature Biotech., 15, 553–557.
  8. Chao, G., Lau, W.L., Hackel, B.J., Sazinsky, S.L., Lippow, S.M. and Wittrup, K.D. (2006) Nat. Protoc., 1, 755–768.
  9. Chen, W., Chag, S.M., Poongavanam, M.V. et al. (2017) Pharm. Biotechnol., 106, 1961–1970.
  10. Dalvie, N.C., Rodriguez-Aponte, S.A., Hartwell, B.L. et al. (2021) Proc. Natl. Acad. Sci. U. S. A., 118, e2106845118.
  11. Dejnirattisai, W., Zhou, D., Ginn, H.M. et al. (2021) Cell, 184, 2183–2200.e22.
  12. Ellis, D., Brunette, N., Crawford, K.H.D. et al. (2021) Front. Immunol., 12, 2605.
  13. Feng, S., Phillips, D.J., White, T. et al. (2021) Nat. Med., 27, 2032–2040.
  14. Francino-Urdaniz, I., Steiner, P.J., Kirby, M.B. et al. (2021) Cell Rep., 36, 109627.
  15. Goldenzweig, A. and Fleishman, S.J. (2018) Annu. Rev. Biochem., 87, 105–129.
  16. Goldenzweig, A., Goldsmith, M., Hill, S.E. et al. (2016) Mol. Cell, 63, 337–346.
  17. Hsieh, C., Goldsmith, J.A., Schaub, J.M. et al. (2020) Science, 369, eabd0826.
  18. Huynh, K. and Partch, C.L. (2015) Curr. Protoc. Protein Sci., 79, 28.9.1–28.9.14.
  19. Kabsch, W. and Sander, C. (1983) Biopolymers, 22, 2577–2637.
  20. Khersonsky, O., Lipsh, R., Avizemer, Z. et al. (2018) Mol. Cell, 72, 178–186.
  21. Kirby, M.B., Medina-Cucurella, A.V., Baumer, Z.T. and Whitehead, T.A. (2021) Protein Eng. Des. Select., 34, gzab017.
  22. Klesmith, J.R., Bacik, J.P., Wrenbeck, E.E., Michalczyk, R. and Whitehead, T.A. (2017) Proc. Natl. Acad. Sci. U. S. A., 114, 2265–2270.
  23. Klesmith, J.R. and Hackel, B.J. (2019) Bioinformatics, 35, 2707–2712.
  24. Lan, J., Ge, J., Yu, J. et al. (2020) Nature, 581, 215–220.
  25. Leman, J.K., Weitzner, B.D., Lewis, S.M. et al. (2020) Nat. Methods, 17, 665–680.
  26. Maguire, J.B., Haddox, H.K., Strickland, D. et al. (2021) Proteins, 89, 436–449.
  27. Medina-Cucurella, A.V. and Whitehead, T.A. (2018) Methods Mol. Biol., 1764, 101–121.
  28. ter Meulen, J., van den Brink, E., Poon, L.L. et al. (2006) PLoS Med., 3, e237.
  29. Mulligan, M.J., Lyke, K.E., Kitchin, N. et al. (2020) Nature, 586, 589–593.
  30. Netzer, R., Listov, D., Lipsh, R., Dym, O., Albeck, S., Knop, O., Kleanthous, C. and Fleishman, S.J. (2018) Nat. Commun., 9, 5286.
  31. Pallesen, J., Wang, N., Corbett, K.S. et al. (2017) Proc. Natl. Acad. Sci. U. S. A., 114, E7348–E7357.
  32. Premkumar, L., Segovia-Chumbez, B., Jadi, R. et al. (2020) Sci. Immunol., 5, eabc8413.
  33. Rocklin, G.J., Chidyausiku, T.M., Goreshnik, I. et al. (2017) Science, 357, 168–175.
  34. Rogers, T.F., Zhao, F., Huang, D. et al. (2020) Science, 369, 956–963.
  35. Starr, T.N., Greaney, A.J., Hilton, S.K. et al. (2020) Cell, 182, 1295–1310.
  36. Tai, W., He, L., Zhang, X., Pu, J., Voronin, D., Jiang, S., Zhou, Y. and Du, L. (2020) Cell. Mol. Immunol., 17, 613–620.
  37. Tien, M.Z., et al. (2013) PLoS ONE, 8, e80635.
  38. Toelzer, C., Gupta, K., Yadav, S.K.N. et al. (2020) Science, 370, 725–730.
  39. Tokuriki, N., Stricher, F., Schymkowitz, J., Serrano, L. and Tawfik, D.S. (2007) J. Mol. Biol., 369, 1318–1332.
  40. VanDrisse, Lipsh-Sokolik, R., Khersonsky, O., Fleishman, S.J. and Newman, D.K. (2021) Proc. Natl. Acad. Sci. U. S. A., 118, e2022012118.
  41. Walls, A.C., Park, Y.J., Tortorici, M.A., Wall, A., McGuire, A. and Veesler, D. (2020) Cell, 180, 281–292.
  42. Warszawski, S., Borenstein Katz, A., Lipsh, R. et al. (2019) PLoS Comput. Biol., 15, e1007207.
  43. Whitehead, T.A., Bergeron, L.M. and Clark, D.S. (2009) Protein Eng. Des. Select., 22, 607–613.
  44. Wintrode, P.L., Miyazaki, K. and Arnold, F.H. (2000) J. Biol. Chem., 275, 31635–31640.
  45. Wrapp, D., Wang, N., Corbett, K.S., Goldsmith, J.A., Hsieh, C.L., Abiona, O., Graham, B.S. and McLellan, J.S. (2020) Science, 367, 1260–1263.
  46. Wrenbeck, E.E., Bedewitz, M.A., Klesmith, J.R., Noshin, S., Barry, C.S. and Whitehead, T.A. (2019) ACS Syn. Bio., 8, 474–481.
  47. Wrenbeck, E.E., Klesmith, J.R., Stapleton, J.A., Adeniran, A., Tyo, K.E.J. and Whitehead, T.A. (2016) Nat. Methods, 13, 928–930.
  48. Yu, J., Tostanoski, L.H., Peter, L. et al. (2020) Science, 369, 806–811.
  49. Yuan, M., Wu, N.C., Zhu, X., Lee, C.C.D., So, R.T.Y., Lv, H., Mok, C.K.P. and Wilson, I.A. (2020) Science, 368, 630–633.
  50. Zahradník, J., Marciano, S., Shemesh, M. et al. (2021) Nat. Microbiol., 6, 1188–1198.
  51. Zhao, H. and Arnold, F.H. (1999) Protein Eng. Des. Select., 12, 47–53.

Articles from Protein Engineering, Design and Selection are provided here courtesy of Oxford University Press
