iScience. 2025 Sep 17;28(10):113584. doi: 10.1016/j.isci.2025.113584

Integrated antibody language model accelerates IgG screening and design for broad-spectrum antiviral therapy

Hannah F Almubarak 1,2,11, Wuwei Tan 3,11, Andrew D Hoffmann 1,11, Yuanfei Sun 3,11, Juncheng Wei 4,11, Lamiaa El-Shennawy 1, Joshua R Squires 1,4, Nurmaa K Dashzeveg 1, Brooke Simonton 1, Yuzhi Jia 1, Radhika Iyer 4, Yanan Xu 4, Vlad Nicolaescu 5, Derek Elli 5, Glenn C Randall 5, Matthew J Schipma 6, Suchitra Swaminathan 7,8, Michael G Ison 9, Huiping Liu 1,7,10,∗, Deyu Fang 4,7,∗∗, Yang Shen 3,12,∗∗∗
PMCID: PMC12539245  PMID: 41126882

Summary

Identifying highly efficacious, broad-spectrum antibodies against fast-mutating viral variants remains a major challenge in therapeutic development. Here, we developed AbGen, a machine learning-assisted antibody generation pipeline powered by an antibody language model (AbLM), to accelerate antibody screening and re-design. AbLM, pretrained on protein domain sequences and fine-tuned on paired VH-VL sequences, enables the analysis and prediction of neutralization activity against viruses (specifically SARS-CoV-2 in this study), targeting both wild-type (through antigen interaction prediction [docking]) and emerging variants (through Gaussian process regression [Kriging]). Screening over 1300 RBD-binding IgG sequences from convalescent patients, AbGen efficiently prioritized candidates for experimental validation and/or redesign against wild-type, Delta, and Omicron variants, preventing viral infections in vitro and in vivo. AbLM outperformed other language models in predicting IgGs with low variant susceptibility. Our work advances artificial intelligence-based antibody discovery by synergizing data-driven language models and Kriging with physics-driven docking and design.

Subject areas: Immunology, Immunological methods, Bioinformatics

Highlights

  • Machine learning-assisted antibody generation pipeline AbGen accelerates IgG screen

  • The Ab language model with paired VH-VL sequences outperforms others in prediction

  • AbGen-predicted and re-designed anti-viral antibodies are validated experimentally

  • In silico prioritized IgGs neutralize broad variants of SARS-CoV-2 in mice


Introduction

One of the most demanding challenges in medical care and clinical investigation is to fight against constantly evolving threats, such as pathogens responsible for infectious diseases and abnormal cells found in cancers, that adapt under selective therapeutic pressure. Monoclonal antibodies (mAbs) have emerged as an important class of therapeutics, accounting for most of the top-selling treatments for immune diseases and cancers over the past years.1,2,3 However, the conventional development of therapeutic antibodies remains time-consuming and labor-intensive, often requiring extensive experimental screening and validation. Many antibody screening strategies for neutralization efficacy, such as phage display, ribosome display, and mammalian cell surface display, still face limitations in efficiency.4 In this study, we aimed to accelerate the discovery of broad-spectrum antiviral therapeutics through an integrated antibody generation (AbGen) pipeline that combines computational modeling with experimental validation. We developed computational models that incorporate a tailored antibody language model (AbLM) alongside predictions of antibody structure, antibody-antigen interaction, and viral neutralization activity. We applied AbGen to the rapidly mutating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as a model system, demonstrating its potential to identify potent neutralizing antibodies against diverse viral variants.

SARS-CoV-2 enters host cells via the binding of its viral spike (S) protein to the host cell receptor, angiotensin-converting enzyme 2 (ACE2),5,6 which is highly expressed on the cell membranes of various human organs, including the lung, heart, and kidney.7 During the outbreak of the coronavirus disease 2019 (COVID-19) pandemic, neutralizing antibodies were developed as one of the first approaches to effectively treat early infections before vaccines were developed and before SARS-CoV-2 evolved to evade neutralization.8,9,10,11,12 Most SARS-CoV-2 neutralizing antibodies disrupt spike interactions with human ACE2, thereby preventing viral entry for prophylactic and therapeutic applications.8,9,10,11,12,13,14,15

The constantly evolving viral sequences, especially the variants of concern (Delta and Omicron), began to escape neutralizing antibodies and vaccination immunity,16,17 partially contributing to the death of nearly seven million people worldwide.18 For instance, the spike L452R and T478K mutations in the Delta variant, also known as B.1.617 lineage, are located at the periphery or the epitope region of the receptor binding domain (RBD) and are found to reduce antibody neutralizing activity.19,20 The Omicron variant, which is characterized by around 15 mutations in the RBD, escapes most therapeutic neutralizing antibodies and, to a large extent, vaccine-elicited antibodies.17

To better prepare for inevitable pandemics or epidemic diseases, the rapid development and design of broad-spectrum antibodies would facilitate the early response against infectious pathogens.21 The emergence of artificial intelligence (AI) has shown great potential in expediting and transforming antibody production pipelines.21,22 However, it is unclear whether and how therapeutic development could be accelerated in early-response, low-data regimes, where limited data on epitope identification, structure characterization, and variant responses constrain the effectiveness of data-intensive computational approaches such as machine learning. Despite these critical barriers, we integrated experimental data, physics simulations, and machine learning into an antibody-screening and optimization platform. Exploiting limited experimental data, we built and validated such methods in predicting and ranking patient-derived antibodies against broad SARS-CoV-2 variants.

To improve the efficiency of experimental screening, we built our computational pipeline AbGen on a tailored antibody language model, AbLM, which was pretrained on over twelve million generic protein domain sequences and fine-tuned on over four thousand paired VH-VL sequences, with antibody-specific CDR masking and VH-VL cross-attention. Training AbLM required no activity data. In the latent space of IgG sequence embeddings learned by AbLM, we first used physics-driven IgG antibody structure prediction and IgG-RBD docking to predict the IgG neutralization landscape against the wild-type (WT) SARS-CoV-2 strain; this step also required no activity data. Second, leveraging activity data from just 14 early-response antibodies, we constructed Gaussian process regressors23 in the latent space to effectively predict the IgG neutralization landscape across the viral variants of concern. Furthermore, we applied computational protein design, with the input of protein-docking structural models, to redesign IgGs with enhanced neutralization efficacy against the Delta variant. The screened or redesigned IgGs were subsequently validated in virus infection studies, both in vitro and in vivo.

Results

Overview of the machine learning-accelerated antibody production platform

Using SARS-CoV-2 as a model target pathogen and the spike receptor binding domain (RBD) as a bait antigen, we developed a pipeline tailored to low-data regimes that facilitates the screening and redesign of neutralizing antibodies (Figure 1A). To identify and produce anti-spike antibodies, we first collected blood specimens from 42 patients recovered from early COVID-19 as described,24 flow-sorted the RBD-bound IgM-negative memory B cells for subsequent single-cell VDJ sequencing via the 10× Genomics 5′ platform, and retrieved 1376 heavy-light chain pairs of IgG antibody sequences.

Figure 1.

Schematic illustration of our study flow and the AbGen pipeline

(A) Peripheral blood samples were collected from 42 patients with convalescent COVID-19, and RBD+ memory B-cells were sorted and underwent single-cell VDJ sequencing using the 10× Genomics platform.

(B) A total of 1366 anti-RBD antibody sequences were retrieved and included in our machine-learning model to predict antibodies with high SARS-CoV-2 neutralization capacities.

(C) Prioritized antibodies were then tested for their ability to neutralize broad-spectrum SARS-CoV-2 variants in both in-vitro and in-vivo settings.

Facing low-data regimes in which activity data are available for few or no antibodies, we harnessed two complementary approaches, physics-driven protein docking and data-driven machine learning, to predict antibody effectiveness in viral neutralization (Figure 1B with details in Figure S1). We first predicted the antibody structures from sequences derived from convalescent patient samples. Using both the predicted antibody structures and the crystal structure of the spike RBD, we generated multiple docking configurations of antibody-RBD structures, each accompanied by uncertainty estimates. Based on the docking configurations, we derived a confidence score for each IgG, defined as the predicted probability of covering or blocking the ACE2-binding RBD residues, which serves as a physics-driven predictor of neutralization against the WT virus. Higher coverage of the ACE2-binding residues predicts better neutralization. This structure-based protein docking approach has been previously validated for its accuracy in modeling antigen-antibody complex structures.25

To predict neutralization against variants of concern such as the Delta and Omicron variants, we calculated the variograms (covariance-based dissimilarities) among sequenced antibodies and clinically validated antibodies, within the latent space learned by an antibody language model. Using these variograms and the variant responses of 14 clinical antibodies, we applied Kriging (Gaussian process regression)23 to predict how robustly the sequenced antibodies neutralize variants. This sequence-based antibody language model, AbLM, is customized from pretrained protein language models for paired VH-VL chains with cross-chain attention (evaluated against other language models in subsequent sections).

Based on the confidence scores against the WT virus and the predictions of the robustness against viral variants, the IgG ranking facilitated their prioritization and re-design for further experimental validation. After all sequenced IgG antibodies were prioritized into categories of high, medium, and low confidence, we randomly selected 19 IgG pairs of heavy and light chains spanning these categories, along with a few computationally redesigned antibodies, for cloning and experimental evaluation. We tested these antibodies for functional neutralization against SARS-CoV-2 infections both in vitro and in vivo (Figure 1C).

Diversity of sequenced antiviral antibodies from human B cells

After obtaining the sequences of 1376 heavy-light chain pairs of IgG antibodies via Cell Ranger (10× Genomics), we identified the individual V, D, and J genes (sequences) for each antibody as well as the type of light chain (Kappa or Lambda). Frequency analyses of individual V and J genes revealed an uneven distribution, as certain genes appeared frequently (60–600 counts) whereas others appeared only a few times (Figures S2A and S2B). D genes were not detected in the light chains and were rarely seen in the heavy chain (Figure S2C). V genes were relatively diverse, whereas J genes were skewed toward a few sequences, such as IGHJ4 with over 600 counts. This pattern suggests that while the IgG antibody candidates in our sample group share general IgG characteristics, they exhibit distinct specificities in how individual IgGs recognize and bind to the spike RBD. We detected a few C genes in the heavy chains and Lambda light chains but only one within the Kappa chains (Figure S2D). Every CDR3 sequence within the H chains was identified once, similar to most of the CDR3 sequences in K and L chains, with only a small portion repeating 2–10 times (Figures S2E and S2F).

Since it is labor-intensive and inefficient to test over 1000 IgG candidates experimentally for functional efficacy, we next leveraged computational predictions (integrating physics-driven protein docking and data-driven machine learning) to accelerate the experimental screening and redesign of efficacious SARS-CoV-2 neutralizing antibodies.

Antibody language model provides a latent space for IgG analysis and design

To analyze the sequence distribution of IgG candidates at a large scale without time-consuming sequence alignment, and to further improve the prediction of antibody activities, we designed and trained the customized antibody language model, AbLM26 (Figure 2A). AbLM utilizes readily available, unlabeled protein sequence data for pretraining, paired antibody heavy and light chain sequences with cross-chain attention for finetuning, and CDR-informed masking (see details in STAR Methods). For the convalescent IgG antibodies we sequenced, as well as 14 clinically used COVID-19 antibodies, we embedded their variable region sequences into 1536-dimensional vectors (768 dimensions for either heavy or light chain) using the paired sequence encoders of AbLM, and we visualized their 2D distributions using Uniform Manifold Approximation and Projection (UMAP)27 (Figure 2B). We noticed that the 14 clinically proven antibodies were clustered with (or “covered” by) some convalescent IgG clusters, which suggests the clinical potential to prioritize and improve selected IgG antibodies. As demonstrated in the following subsections, AbLM proved effective in harnessing this potential.

Figure 2.

Our tailored Antibody Language Model (AbLM) embeds IgG sequences in a latent space for sequence analysis and susceptibility prediction

(A) Architecture, training, and application of our tailored antibody language model, AbLM.

(B) Clusters of 1366 patient-derived convalescent antibodies shown in dots along with 14 clinical antibodies shown in crosses, based on the sequence embeddings from AbLM and dimensionality reduction using UMAP. 19 patient antibodies highlighted in circles were functionally tested.

(C) Predicted 2D landscape of patient antibodies’ neutralization of wild-type (a confidence score between 0 and 1), using antibody-antigen structure predictions, along with a 1D histogram. Labeled circles are patient antibodies whose live virus neutralization was detected in functional tests (Figure 3B).

(D) Predicted 2D landscape of patient antibodies’ robustness to variants (log fold improvement in neutralization compared to that of the wild-type/Washington, ranging from −3 to +1), using Gaussian Process Regression (Kriging) based on 14 clinical antibodies’ known variant responses. Labeled crosses represent clinical antibodies with EC50-fold improvements greater than 1 (i.e., no worse than wild-type) for Delta and Omicron BA.1, or greater than 0.1 (i.e., no worse than 10-fold of wild-type) for Omicron BA.5. Labeled circles indicate tested patient antibodies with EC50-fold improvements exceeding 0.1 for Delta and Omicron BA.1, or 0.01 for Omicron BA.5.

In the 2D distribution, the 19 randomly selected IgG candidates for experimental validation (red circles in Figure 2B) showed diversity among themselves and coverage of the cluster sets of over 1000 antibodies. Based on the AbLM analysis, IgG 106/107, named after H/L indices, resembled the sequence of clinical antibodies C144 (ref. 8) and IMD (ref. 28; imdevimab or REGN10987), whose neutralization mechanism involves direct competition with ACE2, whereas IgG 88/89 resembled CAS (ref. 28; casirivimab or REGN10933). Sequence alignment confirmed this resemblance: 106/107 and C144 share 11 of 12 CDR3 amino acids in the light chain (CSSYTSSSTGVF versus CSSYTSSSTRVF), although C144 has a much longer CDR3 sequence than 106/107 (14 amino acids for 106/107 versus 23 amino acids for C144) (Figure S3A).

Physics-driven prediction of IgG structures, receptor binding domain interactions, and wild-type viral neutralization landscape

To mechanistically and virtually assess the neutralization capabilities of these uncharacterized IgG candidates, we established a physics-driven prediction pipeline (Figure S1B). It simulates IgG apo-structures (H/L pairs) via AbodyBuilder-ML29 and predicts IgG holo-structures bound to spike RBD, using HADDOCK25 for initial docking and employing Bayesian Active Learning (BAL)30 for refinement and uncertainty quantification. We hypothesized that IgG antibodies with higher neutralization potency more effectively compete with ACE2 for binding to the viral RBD, thereby blocking viral entry as a key mechanism of neutralization. Therefore, we used the fraction of ACE2-binding RBD residues with which the IgG competes (0.0–1.0) as a confidence indicator to predict the IgG’s neutralization against the WT virus (Data S1 [Tab 1 “All IgGs (WT and variant pred.)”]). The predicted neutralization landscape is visualized in the AbLM-learned latent space (2D heatmap in Figure 2C and 3D landscape in Figure S3C). Out of the original 1376 IgG candidates, 10 IgGs were excluded due to failed apo-structure predictions, leaving 1366 for analysis. Among these, the majority (849 antibodies, >62%) had predicted confidence scores below 0.4, implying a low neutralization potential against the WT and therefore a low priority. A further 107 IgGs (7.8%) had predicted scores above 0.6 (Figure 2C), and these high-priority antibodies were dispersed in the latent space (Figure S3A), including regions not covered by the 14 clinical antibodies, potentially suggesting novel modes of clinical intervention.
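The confidence indicator described above, the fraction of ACE2-binding RBD residues that a docked IgG covers, can be illustrated with a minimal sketch. This is not the authors' implementation: the residue indices are toy values, the function names are hypothetical, and weighting the docking models by a probability is an assumption made here to reflect the statement that each docking configuration carries an uncertainty estimate. The "medium" band between 0.4 and 0.6 is likewise inferred from the low (<0.4) and high (>0.6) thresholds quoted in the text.

```python
from typing import Sequence, Set


def coverage_fraction(ace2_site: Set[int], igg_contacts: Set[int]) -> float:
    """Fraction of ACE2-binding RBD residues that the IgG also contacts."""
    if not ace2_site:
        raise ValueError("ace2_site must not be empty")
    return len(ace2_site & igg_contacts) / len(ace2_site)


def confidence_score(
    ace2_site: Set[int],
    model_contacts: Sequence[Set[int]],
    model_weights: Sequence[float],
) -> float:
    """Weighted mean coverage over several docking models (assumed weighting)."""
    if len(model_contacts) != len(model_weights):
        raise ValueError("one weight is required per docking model")
    total = sum(model_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        w * coverage_fraction(ace2_site, contacts)
        for contacts, w in zip(model_contacts, model_weights)
    )
    return weighted / total


def priority_class(score: float) -> str:
    """Map a score to a priority band; the medium band is an inferred assumption."""
    if score > 0.6:
        return "high"
    if score < 0.4:
        return "low"
    return "medium"


if __name__ == "__main__":
    # Toy residue indices, not real RBD numbering.
    site = {1, 2, 3, 4}
    models = [{1, 2, 3, 4}, {1, 2}]
    score = confidence_score(site, models, [0.75, 0.25])
    print(score, priority_class(score))
```

In this toy case the first docking model covers the whole site and the second covers half of it, so the weighted score falls in the high-priority band.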

Data-driven prediction of neutralization profiles against the viral variants in low-data regimes

To predict the neutralization profiles against variants of concern for the uncharacterized IgG candidates, we followed an early-response scenario of “few-shot learning” by selecting a few IgGs for experimental characterization and providing a few labeled data points for machine learning. The IgG selection can be based on the diversity (clusters covering the latent space) and the predicted WT neutralization. Before any of our IgG candidates were tested, we emulated the selection process using the 14 clinical antibodies that satisfied the criteria and had been experimentally characterized for certain SARS-CoV-2 variants. As machine learning models, particularly deep learning models, often demand large datasets that are unavailable in the early response scenarios or other disease settings, we chose a Gaussian process regressor (Kriging), the best linear unbiased estimator, in combination with a latent space learned from a protein language model, to predict variant-response robustness. This robustness was defined as the log-fold improvement in EC50 values for each IgG candidate when neutralizing viral variants compared to the wild-type, and predicted using only the neutralization profiles of 14 early-response clinical antibodies.
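The few-shot setup described above, Gaussian process regression (Kriging) from a handful of labeled antibodies to unlabeled ones in an embedding space, can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the authors' method: the squared-exponential covariance, the length scale, the nugget, and the plug-in constant mean are all choices made here, and the 2D points stand in for the 1536-dimensional AbLM embeddings.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def rbf_kernel(a: Vector, b: Vector, length_scale: float = 1.0) -> float:
    """Squared-exponential covariance between two embeddings (assumed kernel)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_kriging(
    train_x: Sequence[Vector],
    train_y: Sequence[float],
    length_scale: float = 1.0,
    nugget: float = 1e-6,
) -> Callable[[Vector], float]:
    """Return a predictor for the GP posterior mean around a constant mean."""
    mean_y = sum(train_y) / len(train_y)  # plug-in constant mean (simplification)
    cov = [
        [rbf_kernel(a, b, length_scale) + (nugget if i == j else 0.0)
         for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    alpha = solve(cov, [y - mean_y for y in train_y])

    def predict(x: Vector) -> float:
        return mean_y + sum(
            a * rbf_kernel(x, t, length_scale) for a, t in zip(alpha, train_x)
        )

    return predict


if __name__ == "__main__":
    # Hypothetical labeled antibodies: toy embeddings and log-fold robustness.
    labeled_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    labeled_y = [0.5, -1.0, -2.5]
    predict = fit_kriging(labeled_x, labeled_y)
    print(predict((0.2, 0.1)))
```

With a small nugget the predictor nearly interpolates the labeled points, and far from every labeled antibody it falls back to the mean of the labels, which is one reason coverage of the latent space by the labeled set matters in this setting.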

For all 1366 IgG candidates, we visualized the predicted robustness landscape against Delta and Omicron BA.1/BA.5 variants (evolving variants compared to the WT viral infection that the convalescent patients experienced) in the latent space (2D heatmap in Figure 2D, 2D scatterplots in Figure S3B, and 3D landscape in Figure S3D, with data provided in Data S1 [Tab 1 “All IgGs (WT and variant pred.)”]). We observed a few regions enriched in better robustness for the Delta variant (warmer colors). For example, those regions around IgG 106/107 were predicted to exhibit neutralization profiles similar to nearby IMD (Imdevimab/REGN10987), neutralizing the Delta variant well but losing robustness against the Omicron BA.1 or BA.5 variants drastically (Figure S8). The prediction revealed an overall cooler colored landscape against the Omicron variants versus the Delta variants, echoing the observations from other studies17,31 that the Omicron variants are more prone than the Delta variant to elicit antibody escape.

Functional tests of IgG candidates in neutralizing receptor binding domain and viral infections

To validate our predictions of IgG candidates in neutralizing the WT and variant strains of SARS-CoV-2, we randomly selected 19 IgG antibodies for experimental tests, including three [88/89, 106/107, and 94/95] in the high priority class, four in the medium priority class, and 12 in the low priority class against WT SARS-CoV-2 (Figure 3A and Data S1 [Tab 2 “Validated IgGs (WT pred.)”]). Following heavy and light chain cloning, plasmid DNA was transfected into HEK293T cells for IgG overexpression, and the IgGs were purified to functionally assess their robustness in both blocking RBD binding to host cells and neutralizing live viral infections (Data S5).

Figure 3.

Experimental tests of antibody neutralization against SARS-CoV-2 WT and variants

(A) Schematic depicting experimental workflow. Variable regions of the heavy and light chains of the selected antibodies were cloned into AbVec IgG vectors, followed by transfection into HEK293T cells. Purified antibodies were then tested for their ability to neutralize RBD in an ACE2 cell-binding assay as well as live SARS-CoV-2 neutralization.

(B) Live SARS-CoV-2 virus neutralization by selected anti-RBD antibodies against the wild-type and SARS-CoV-2 variants. Viability of A549 cells overexpressing human ACE2 was evaluated 72–96 h post viral infection (n = 3). GraphPad Prism 9.3.1 was used to calculate EC50s. Data are presented as mean ± SEM.

(C) The true positive rate of our antibody language model (AbLM) compared to other state-of-the-art pretrained language models.

(D) Performance of our tailored Antibody Language Model (AbLM) in comparison to state-of-the-art protein language models in predicting antibody robustness to spike variants (log fold improvement in viral neutralization EC50 compared to the wild-type). Boxes span the interquartile range (IQR; Q1–Q3); orange horizontal lines mark the median. Whiskers extend to the most extreme values within 1.5×IQR of the quartiles; points beyond these limits are plotted as outliers.

With flow cytometry-based measurement of RBD binding to human ACE2-expressing cells,32 we evaluated the ability of each IgG to block RBD binding to the cellular ACE2 receptor (binding inhibition IC50) (Figure S4A). Seven out of 19 randomly selected IgGs (001/002, 019/020, 025/026, 031/032, 088/089, 106/107, and 108/109) inhibited RBD binding to ACE2+ HEK293 cells in a dose-dependent manner (Figure S4A and Data S2). Among the seven neutralizing IgGs, two were extremely potent (106/107 and 031/032) with IC50s < 0.5 nmol/L, of which 106/107 had been predicted with a high confidence score in the high priority IgG category (Figure 2). The remaining 12 antibodies did not show any effective RBD blocking at the concentration tested (Data S2), largely consistent with their predicted low confidence scores (Figure 2C).

Consistently, nearly half of the antibodies showed neutralizing effects on live SARS-CoV-2 infection of ACE2-overexpressing A549 cells (Figure 3B). Specifically, the IgG clones 106/107, 011/012, and 031/032 obtained the lowest EC50 values against the WT Washington strain, demonstrating the strongest neutralizing functions (Figure 3B). Interestingly, although the IgGs were derived from patients with COVID-19 infected with the WT Washington strain (samples collected before the variants of concern emerged and became widespread), many of the IgGs showed neutralizing capacities against the Delta and Omicron variants as well. In fact, three of the clones (001/002, 031/032, and 108/109) were about ten times more effective in neutralizing the Delta variant than the wild-type (Figures 3B, S4B and Data S3).

Additionally, as antibody-antigen binding affinity is commonly used to accelerate the screening and characterization of therapeutic antibodies,33,34,35 we evaluated the binding affinity of our tested antibodies to the WT SARS-CoV-2 RBD using biolayer interferometry (BLI) (Figure S5 and Data S4). As anticipated, for the antibodies that showed live-virus neutralization, there was a positive correlation between the dissociation constant (KD) and the neutralization EC50 values, where highly neutralizing antibodies with the lowest EC50s also showed low KD values (high binding affinity) (Figure S6A). A positive correlation between the off-rate constant (Koff) and EC50 was observed, while a negative correlation between the on-rate constant (Kon) and EC50 was observed (Figure S6A). Importantly, many of the antibodies that failed to neutralize wild-type SARS-CoV-2 showed high binding affinities to RBD, indicating that the binding affinity of the antibody to RBD on its own is a poor predictor of SARS-CoV-2 neutralization (Figures S5 and S6B). These findings highlight the strength of our physics-driven antibody structure prediction and IgG-RBD docking approach, which emphasizes epitope recognition to identify highly neutralizing anti-SARS-CoV-2 antibodies.

Computational predictions align with experimental neutralization profiles

The experimental data above validated our computational pipelines’ predictive power in screening and prioritizing potent and broad-spectrum antibodies, as detailed below.

First, independent of IgG activity data, our physics-driven confidence prediction for IgGs in WT neutralization, based on the predicted blocking portion of ACE2-binding RBD residues, showed robust prioritization of WT neutralizing antibodies (Figures S7A and 2C). Two of the three high-priority IgGs (88/89 and 106/107) showed high potency and efficacy in blocking RBD binding and neutralizing viral infection, a 67% (2/3) success rate compared with 37% (7/19) for the random subset of antibodies. In particular, the WT neutralization profile of 106/107 reflects its sequence and epitope similarity to clinical antibodies C144 and IMD, as previously identified in our AbLM analysis (Figures 2B and S3A). Conversely, among the twelve predicted low-priority antibodies, 75% and 67% were proven to have no activity in blocking RBD binding and neutralizing viral infection, respectively (Figure S7A). These results suggest that the neutralization efficacy of our successful IgGs was primarily driven by targeting the ACE2-binding epitope.

Meanwhile, additional factors beyond epitope specificity, such as binding affinity, may also play a role. Notably, our high-priority antibodies exhibited slightly lower dissociation constants (KD), indicating tighter binding compared to the low-priority group. When converted to Gibbs free energy of binding (ΔG = RT ln(KD), where ΔG is in kcal/mol for KD in mol/L, R is the gas constant, T is the absolute temperature, and RT ≈ 0.592 kcal/mol at 298 K), the high-priority antibodies exhibited an average binding energy of −13.1 kcal/mol, compared to −11.8 kcal/mol for the low-priority group (Figure S7A). Interestingly, the only false positive [94/95] among the top three high-priority IgGs showed relatively weak binding affinity (KD = 13.1 nmol/L and ΔG = −10.75 kcal/mol), suggesting that it might have been excluded had binding affinity been considered as an additional selection criterion alongside epitope targeting.
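The conversion from dissociation constant to binding free energy quoted above can be checked with a few lines of Python. The function name is illustrative only; the constant RT ≈ 0.592 kcal/mol at 298 K and the KD of 13.1 nmol/L for IgG 94/95 are taken from the text.

```python
import math

RT_KCAL_PER_MOL = 0.592  # RT at 298 K, as quoted in the text


def binding_free_energy(kd_molar: float, rt: float = RT_KCAL_PER_MOL) -> float:
    """Return dG = RT * ln(KD) in kcal/mol for a KD given in mol/L."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return rt * math.log(kd_molar)


if __name__ == "__main__":
    # KD = 13.1 nmol/L for IgG 94/95; the result is close to -10.75 kcal/mol.
    print(round(binding_free_energy(13.1e-9), 2))
```

Because the logarithm compresses the scale, the roughly 1.3 kcal/mol gap between the high- and low-priority group averages corresponds to about a 9-fold difference in KD (exp(1.3/0.592) ≈ 9).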

Second, using data from as few as 14 antibodies (Figure S8), our data-driven prediction profiles of antibodies on variant neutralization also showed agreement with the experimental inhibition of variant infection in differentiating narrow- and broad-spectrum antibodies (Figures 2D and 3C). Among 22 IgGs including the 19 previously described ones, 3 out of 4 predicted to have improved neutralization efficacy (better robustness compared to the WT virus) were validated, representing a success rate of 75% (3/4), whereas 14 of the remaining 18 predicted to lose efficacy were also validated, a success rate of 78% (14/18) (Figures 3D and S7B). For example, as predicted, IgG 106/107 exhibited markedly reduced neutralization against Omicron BA.1 and BA.5, similar to the profiles previously observed for IMD and C144.

Our antibody language model, AbLM, demonstrated significantly higher success rates, particularly for robust responses (75%), compared to state-of-the-art pretrained language models. These include all five versions of ESM-1 (ref. 36; 44%–57% for robustness), the three largest versions of ESM-2 (ref. 37; 45%–67%), ProtT5 (ref. 38; 44%), and the antibody language model AbLang (ref. 39; 50%), as shown in Figures 3C, 3D, and S7B. Compared to the next best performer, ESM-2 with 15 billion parameters, our AbLM with 92 million parameters (about 163 times smaller) is accurate yet lightweight, thanks to antibody-inspired design choices, including pairing VH-VL encoders with cross-attention and training with CDR masking.

Redesign of a prioritized antibody improves neutralization of the Delta variant

Given the demonstrated ability of our platform to prioritize potent and broad-spectrum antibodies, we next tested its ability to (re)design prioritized IgGs for better variant neutralization profiles. We selected IgG 106/107, which was predicted to be a high-priority candidate, and then verified its neutralizing efficacy against WT and the Delta variant. To enhance binding to the Delta variant RBD (positive design) while preserving affinity to the WT RBD (negative design), we employed the multistate, physics-driven protein design tool iCFN40 and redesigned IgG 106/107 variants (termed 106/107m) computationally based on its predicted IgG-RBD structures. Ten single amino acid substitutions near suggested binding sites of T478K were proposed by iCFN, including three in the heavy chain (all at F66) and seven in the light chain (four at Y38, two at Y55, and one at D56) (Figures 4A and S9). All four designs at Y38 were calculated to exhibit markedly improved binding energy (reflected by highly negative ΔΔG values), driven by enhanced electrostatic interactions. Among the four Y38 variants, two involved large-to-small apolar substitutions (Y38A and Y38G), while the other two introduced hydrophobic-to-polar changes (Y38S and Y38Q).

Figure 4.

Antibody 106/107 redesign with improved efficacy against the Delta variant

(A) Visualization of the binding interface between the spike RBD (in wheat cartoon), including two positions mutated in the Delta variant (L452 and T478), and the patient antibody 106/107 (in blue cartoons). Four positions on the antibody were selected for redesign and shown in sticks, including F66 of the heavy chain (in darker blue) and Y38, Y55, and D56 of the light chain (in lighter blue).

(B) Live SARS-CoV-2 virus neutralization by our redesigned antibodies. Data showing fold-improvement of EC50 for the redesigned antibodies compared to the WT 106/107 antibody. Viability of A549 cells overexpressing human ACE2 was evaluated 72–96 h post viral infection. GraphPad Prism 9.3.1 was used to calculate EC50s.

(C) Computationally designed single amino acid substitutions of antibody 106/107: computationally predicted binding-energy improvements (-ddG) showed high correlations with experimentally measured EC50-fold improvements, whether tested against WT (Washington) or the Delta variant.

(D) Computationally predicted 3D structures of RBD-Ig complexes suggested the WT-neutralizing mechanism of the antibody 106/107 (where Y38(L) and D56(L) interacted with T478 on the RBD wild-type), the antibody-escaping mechanism of the Delta variant (where Y38(L) and D56(L) lost interactions with K478 on the RBD Delta variant and Y38(L) paid more desolvation penalty), and the variant-neutralization enhancing mechanism (where A38(L) paid almost no desolvation penalty).

We cloned the ten re-designed IgG 106/107m antibodies and measured their neutralization abilities against live viruses of the wild-type and Delta variant. Strikingly, three of the four top-ranked Y38 designs, including Y38A, Y38S, and Y38G, showed the greatest improvement in neutralization against the Delta variant, with Y38A exhibiting over 2-fold increase in neutralization efficacy (Figures 4B and S10). Y38Q did not show a similar enhancement. Importantly, their neutralization activities for the WT virus were not significantly impaired, reflecting the success of the multistate computational design strategy, which explicitly incorporated both positive design for Delta binding and negative design to preserve WT binding.

Our multistate, principle-driven protein design approach yielded predicted binding energy changes (ΔΔG) that proved powerful in ranking designs. Specifically, Spearman’s rank correlation coefficients between predicted negative ΔΔG values and measured EC50-fold improvements across the ten IgG designs were 0.515 for the wild-type SARS-CoV-2 and 0.806 for the Delta variant (Figure 4C). This approach also provided mechanistic explanations for the enhanced binding of IgG 106/107 Y38 designs to the Delta RBD: T478K in the Delta variant may cause Y38 in IgG 106/107 to become buried from the solvent, incurring a desolvation penalty to binding (Figure 4D, middle panel). Substituting Y38 in IgG 106/107 with a smaller hydrophobic residue, such as alanine or glycine, can reduce this penalty, thereby improving binding affinity to the Delta RBD (Figure 4D, right panel). With the proof of concept from single substitution designs, we anticipate that higher-order redesigns or even de novo designs would further improve the broad-spectrum neutralization profile.
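As an illustration of the ranking comparison described above, the following minimal Python sketch computes Spearman's rank correlation between predicted −ΔΔG values and measured fold improvements. The function names are hypothetical, and the numbers are placeholders chosen for illustration only; they are not data from this study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical placeholder values (NOT measurements from this study).
neg_ddg = [2.1, 1.8, 1.5, 0.9, 0.4]            # predicted -ddG per design
fold_improvement = [2.3, 1.9, 1.2, 1.4, 0.8]   # measured EC50 fold change
rho = spearman(neg_ddg, fold_improvement)
```

A rank-based coefficient is a natural choice here because the goal is to order designs for testing rather than to predict absolute EC50 values.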

Prioritized IgG candidates inhibit SARS-CoV-2 infection of transgenic mice

We next evaluated the therapeutic efficacy of the two best neutralizing IgG candidates (106/107 and 011/012), prioritized by our integrated computational-experimental platform AbGen, for their ability to prevent multi-strain SARS-CoV-2 infection in the well-established hACE2 transgenic mouse model in vivo (Figure 5A). Antibodies were administered to mice by nasal inhalation to determine their effects on preventing infection by the three viral variants (WT Washington, Delta, and Omicron). In concordance with the in vitro cell survival after viral infections, both antibodies prevented weight loss and health deterioration in mice infected with the Washington strain, whereas only IgG 106/107, but not IgG 011/012, protected the mice from Delta variant infection-caused weight loss and clinical symptoms (Figures 5B and 5C). Furthermore, the viral titers in the lungs of infected mice were significantly lower after treatment with the 106/107 antibody, which also neutralized the viral variants, consistent with the live virus neutralization results in vitro (Figure 5D).

Figure 5.

Antibody neutralization and inhibition of broad-spectrum SARS-CoV-2 infection in hACE2 transgenic mice

(A) Schematic of viral neutralization and infection design. SARS-CoV-2 viral variants and select antibodies (1.2 μg IgG for Washington and Delta or 8.5 μg IgG for Omicron BA.1) were administered intranasally into K18-hACE2 mice. Animals were monitored for health twice daily and weighed once per day. Seven days post infection (3 days for Omicron), mice were euthanized, and lungs were analyzed for viral load and pathohistological characteristics.

(B) Percent weight loss of K18-hACE2 mice post viral infection (n = 9 per group).

(C) Clinical scores post SARS-CoV-2 infection. Larger clinical scores indicate increased disease severity and reduced physical fitness (n = 9 per group).

(D) Lung genomic viral load on day 7 (Washington and Delta) or day 3 (Omicron) post antibody neutralization and viral infection measured by qRT-PCR (n = 9 per group).

(E) Representative H&E images of mice lungs at experimental endpoint. Scale bar = 1 mm.

(F) Lung acute and chronic inflammation scores based on histopathological analysis of H&E-stained slides (n = 9 per group).

(G) Alveolar hemorrhage and necrosis scores based on histopathological analysis of H&E-stained slides (n = 9 per group). Statistical significance was tested with unpaired t-tests (B, C, F, and G) or the Mann-Whitney test (D). Data are presented as mean ± SEM. ∗∗∗∗ indicates p < 0.0001. ns means non-significant difference.

Notably, while the low pathogenic Omicron variant did not cause severe disease in mice, treatment with either IgG 106/107 or IgG 011/012 resulted in lower levels of viral genome in the lungs of infected animals when compared to the isotype IgG control group (Figures 5C and 5D). As expected, Omicron infection did not cause significant changes in weight, clinical scores or lung histology even in the control mice without candidate treatment (Figures 5C, 5D, S11, and S12). Histology analyses of the lungs of infected mice reveal that the 106/107 and 011/012 neutralizing IgGs were able to reduce acute lung inflammation following infection with the Washington and Delta strains, albeit with no significant effect on chronic inflammation (Figures 5E, 5F, S11, and S12). Additionally, the IgG 011/012 antibody treatment significantly inhibited necrosis in the lungs of mice infected with the Washington strain as compared to control isotype IgG (Figure 5G). These data demonstrated the robustness of our IgG prioritization strategies in pre-clinical studies.

Discussion

Therapeutic antibody development faces the dual challenges of expensive experimental screening and constantly evolving targets under selective pressure. While many clinical antibodies are used to neutralize other pathogens, treat autoimmune diseases, and combat cancer, there is an urgent call to prepare efficacy-improving strategies in advance or in the early response phase before therapy resistance develops. For instance, over the last few years, while the convalescent plasma treatment has been proven to be effective in treating patients with COVID-19,41 almost all licensed monoclonal antibodies have eventually failed to neutralize the Omicron variant.42

Using SARS-CoV-2 as a testing model, our extensive, sizable-scale RBD-specific IgG sequencing of patients with COVID-19 in the early stages of the pandemic reveals that a significant number of clones effective against the WT virus lose effectiveness against variants of concern, such as Delta and Omicron. Significantly, our computational analysis pinpoints certain IgG antibody clones with potential enhancements in blocking variant infections in ACE2+ cells, as confirmed through both in vitro and in vivo virus infection studies. These findings offer a potential explanation for why certain patients with COVID-19 exhibit greater resistance to reinfection by SARS-CoV-2 variants compared to others, as frequently observed.43 Furthermore, we have demonstrated that replacing a single residue in the IgG core for RBD binding with a smaller, hydrophobic alanine (Y38A), as suggested by computational protein design, improves its neutralization activity against the Delta variant by over 2-fold. These discoveries underscore the potential of leveraging computational power through platforms such as AbGen to accelerate virtual IgG screening and re-design, addressing urgent clinical demand for therapies targeting emerging pathogen variants and evolving oncogenic mutations.

Similar to our machine learning strategies, a few methods have been proposed: (1) to efficiently select broadly neutralizing antibodies based on the structural information of a specific epitope,44 and (2) to identify antibody variants of high potency using the evolutionary information of antibody sequences alone, which is encoded in generic protein language models.45 While these approaches are powerful, we emphasize the urgency of advanced preparation and early response in therapeutic antibody optimization, especially in scenarios where experimental data such as epitope characterization, structure determination,44 or activities of antibody variants45 are limited or unavailable. Also, unlike methods that focus on selecting or combining single substitutions for a given antibody,45 our approach starts with predicting the neutralization activity landscapes for a large and diverse set of IgG antibodies. This capacity is driven by our tailored antibody language model AbLM, which, despite a training set 10–100 times smaller, outperforms competing protein or antibody language models in variant response prediction. Its success stems from a tailored computing architecture that integrates protein pretraining and antibody fine-tuning with VH-VL cross-attention and CDR masking.

Future therapeutic development strategies shall further integrate computational and experimental approaches. This synergy will enhance both accuracy and efficiency in discovering and developing clinically effective antibodies and other drugs against quickly evolving targets, such as viral pathogens and oncogenic antigens.

Limitations of the study

First, our structure-based prediction for wild-type viral neutralization requires the known structural characterization of the viral antigen target (i.e., SARS-CoV-2 spike protein in this study), its host receptor target (ACE2 receptor here), and their binding interface. The prioritization scoring platform in ongoing studies will include epitopes outside the interface and/or binding affinity. Second, our machine learning-based variant neutralization prediction requires known sequences of variants and a small number of mAbs’ responses to such variants. Lastly, the current validation study was limited in scale; a larger set of mAbs can be tested in iterations in which new experimental data are used to train and update the computational models, and the updated models in turn make improved predictions.

Resource availability

Lead contact

Requests for further information and resources should be directed to and will be fulfilled by the lead contact, Yang Shen (yshen@tamu.edu).

Materials availability

  • Plasmids generated in this study have not been deposited in public repositories. We are glad to share plasmids with reasonable compensation by the requester for their processing and shipping.

  • Antibody heavy chain and light chain sequences analyzed in this study are available in the supporting data tables.

Data and code availability

Acknowledgments

We are most grateful to Drs. Patrick Wilson and Jenna Guthmiller at the University of Chicago for sharing their antibody generation protocols and IgG constructs. We thank Dr. Dominique Missiakas at the University of Chicago for assistance with A/BSL3 research at the Howard T. Ricketts Laboratory (BSL3 management and practice of infectious disease research core, NIAID grant UC7AI180312). We are also thankful to the team of Northwestern COVID-19 Antibody and Cancer Collaborative Group and advisory members, especially Drs. Alfred L. George Jr., Leonidas C. Platanias, Rex L. Chishom, and William A. Muller for their scientific input and resourceful support for the project. The work was partially funded by the National Institute of General Medical Sciences (NIH/NIGMS) R35GM124952 (Y.S.), the National Institute of Allergy and Infectious Diseases (NIH/NIAID) R01AI167272 and Chicago Biomedical Consortium Accelerator Award A-017 (H.L. and D.F.), Northwestern University Feinberg School of Medicine Emerging and Re-emerging Pathogens Program (EREPP) (H.L. and D.F.), and the R.H. Lurie Comprehensive Cancer Center Blood Biobank fund and Northwestern University NUCATs grant UL1TR001422 (M.I.). We gratefully acknowledge the support from the Northwestern University NUseq Core Facility and the RHLCCC Flow Cytometry Facility (Cancer Center Support Grant NCI CA060553). Flow Cytometry Cell Sorting was performed on a BD FACSAria SORP system and BD FACSymphony S6 SORP system, purchased through the support of NIH 1S10OD011996-01 and 1S10OD026814-01. Portions of this research were conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing.

Author contributions

H.F.A., W.T., A.D.H., Y.Sun, and J.W. co-led the bench experiments and computational analyses, prepared figures, and contributed to the article writing. L.E. and J.R.S. participated in the studies and contributed to figure preparation and writing. N.K.D., B.S., Y.J., R.I., Y.X., V.N., D.E., G.C.R., M.J.S., and S.S. provided technical support or conducted bench experiments and analyzed data. M.G.I. provided critical clinical resources and supervised the research project. H.L., D.F., and Y.S. designed experiments, analyzed data, supervised the work, and wrote the article.

Declaration of interests

Northwestern University, Texas A&M University, and H. L., D. F., Y. S., H. F. A., W. T., L. E., A. D. H., and N. K. D. hold issued and/or provisional patents in IgG and exosome therapeutics. Texas A&M University, Y. S., Y. Sun, and W. T. hold provisional patents in AbLM and its use. H. L., D. F., and A. D. H. are scientific co-founders of ExoMira Medicine Inc.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies

Anti-CD19 Mouse Monoclonal Antibody (Pacific Blue) [clone: HIB19] | BioLegend | Cat# 302223; RRID: AB_493652
Brilliant Violet 605™ anti-human CD27 [O323] | BioLegend | Cat# 302829; RRID: AB_11204431
PE/Cy7 anti-human CD38 [HIT2] | BioLegend | Cat# 303515; RRID: AB_1279235
Mouse Anti-Human IgM-PE; Clone UHB | SouthernBiotech | Cat# 9022-09; RRID: AB_2796586
TotalSeq™-C 0251 anti-human Hashtag 1 | BioLegend | Cat# 394661; RRID: AB_2801031
TotalSeq™-C 0252 anti-human Hashtag 2 | BioLegend | Cat# 394663; RRID: AB_2801032
TotalSeq™-C 0253 anti-human Hashtag 3 | BioLegend | Cat# 394665; RRID: AB_2801033
TotalSeq™-C 0254 anti-human Hashtag 4 | BioLegend | Cat# 394667; RRID: AB_2801034
TotalSeq™-C 0255 anti-human Hashtag 5 | BioLegend | Cat# 394669; RRID: AB_2801035
TotalSeq™-C 0256 anti-human Hashtag 6 | BioLegend | Cat# 394671; RRID: AB_2820042
TotalSeq™-C 0257 anti-human Hashtag 7 | BioLegend | Cat# 394673; RRID: AB_2820043
TotalSeq™-C 0258 anti-human Hashtag 8 | BioLegend | Cat# 394675; RRID: AB_2820044
TotalSeq™-C 0259 anti-human Hashtag 9 | BioLegend | Cat# 394677; RRID: AB_2820045
TotalSeq™-C 0260 anti-human Hashtag 10 | BioLegend | Cat# 394679; RRID: AB_2820046

Bacterial and virus strains

NEB® 5-alpha Competent E. coli (High Efficiency) | New England Biolabs | Cat# C2987H
WT Washington nCoV/Washington/1/2020 | BEI Resources | Cat# NR-52281
Delta SARS-Related Coronavirus 2, Isolate hCoV-19/USA/MD-HP05647/2021 (Lineage B.1.617.2; Delta variant) | BEI Resources | Cat# NR-55672
Omicron BA.1 SARS-Related Coronavirus 2, Isolate hCoV-19/USA/GA-EHC-2811C/2021 (Lineage B.1.1.529; Omicron Variant) | BEI Resources | Cat# NR-56481
Omicron BA.5 Isolate hCoV-19/USA/COR-22-063113/2022 (Lineage BA.5) | BEI Resources | Cat# NR-58496

Biological samples

Human blood specimens from convalescent COVID-19 patients | Northwestern Medicine | N/A

Chemicals, peptides, and recombinant proteins

Recombinant SARS-CoV-2, S1 Subunit Protein (RBD) | RayBiotech | Catalog # 230-30162
Streptavidin, Alexa Fluor™ 647 Conjugate | Invitrogen | Cat# S21374
2M Calcium Chloride | Quality Biological | Cat# 351-130-721
HEPES Buffer | Corning | Cat# 25-060-Cl
Sodium Chloride | Fisher Chemical | Cat# S2713
Sodium Phosphate Dibasic Heptahydrate | Fisher Chemical | Cat# S373-500
DMEM - high glucose | Sigma-Aldrich | Cat# D6429
Dulbecco’s Phosphate Buffered Saline | Sigma-Aldrich | Cat# D8537
0.25% Trypsin-EDTA | Gibco | Cat# 25200-056
Serum Plus | Sigma-Aldrich | Cat# 14009C
UltraPure™ Distilled Water (DNase, RNase, Free) | Invitrogen | Cat# 10977-015
Glycine hydrochloride | Sigma-Aldrich | Cat# G2879-100G
Tris Base | Fisher Chemicals | Cat# BP152
LB Broth, Miller | Fisher Chemicals | Cat# BP1426
LB Agar, Miller | Fisher Chemicals | Cat# BP1425
10% Formalin | Fisher HealthCare™ | Cat# 23-305510
Crystal Violet | Sigma-Aldrich | Cat# C0775
IsoThesia Isoflurane, USP | Covetrus | Cat# 029405

Critical commercial assays

EasySep™ Human B Cell Isolation Kit | Stemcell Technologies | Cat#17954
EZ-Link™ NHS-PEG4-Biotin | Thermo Scientific | Cat# 21330
Zeba™ Spin Desalting Columns | Thermo Scientific | Cat# 89882
Chromium Single Cell V(D)J Enrichment Kit, Human B Cell, 96 rxns | 10X Genomics | PN-1000016
Pierce™ Protein A Agarose | Thermo Scientific | Cat# 20333
NEBuilder® HiFi DNA Assembly Master Mix | New England Biolabs | Cat# E2621
MilliporeSigma™ Amicon™ Ultra-4 Centrifugal Filter Units 30 kDa MWT | Millipore | Cat# UFC803024
Human IgG ELISA Kit | Abcam | Cat# ab195215
NucleoSpin 96 RNA, 96-well kit for RNA purification | Macherey-Nagel | Cat# 740709.4
SuperScript™ III Platinum™ One-Step qRT-PCR Kit w/ROX | Invitrogen | Cat# 11745500
Octet anti-human IgG FC biosensors (AHC2) | Sartorius | Cat# 18-5142
MycoAlert mycoplasma detection kit | Lonza | LT07-318

Deposited data

Pfam-RP15-v32 | Mistry et al.46 | http://pfam.xfam.org/
SabDab | Dunbar et al.47 | https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabdab
Coronavirus Resistance Database | Tzou et al.48 | https://covdb.stanford.edu/susceptibility-data/table-mab-susc/

Experimental models: Cell lines

HEK 293T cells | ATCC | CRL-3216
ACE2+ HEK-293 cells | Gift from Dr. Daniel Batlle and Dr. Jan Wysocki, Northwestern University | N/A
A549-hACE2 cells | Benjamin TenOever, New York University | N/A

Experimental models: Organisms/strains

B6.Cg-Tg(K18-ACE2)2Prlmn/J | The Jackson Laboratory | Strain #:034860

Oligonucleotides

Primer: SP6 (GAT TTA GGT GAC ACT ATA G) | ACGT inc. | N/A
nCOV_N2 Forward Primer Aliquot, 50 nmol | IDT | Cat# 10006824
nCOV_N2 Reverse Primer Aliquot, 50 nmol | IDT | Cat# 10006825
nCOV_N2 (FAM) Probe Aliquot, 25 nmol | IDT | Cat# 10006826

Recombinant DNA

AbVec2.0-IGHG1 | Tiller et al.49 | Addgene #80795
AbVec1.1-IGLC2-Xho1 | Tiller et al.49 | Addgene #99575
AbVc2.0-1.1-IGKC | Tiller et al.49 | Addgene #80796

Software and algorithms

AbodyBuilder | Leem et al.29 | https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred/abodybuilder/
HADDOCK | Ambrosetti et al.25 | https://rascar.science.uu.nl/haddock2.4/
Bayesian Active Learning (BAL) | Cao et al.30 | https://github.com/Shen-Lab/BAL
PyMol | Schrodinger, LLC | https://www.pymol.org/#page-top
Python version 3.8 | Python Software Foundation | https://www.python.org

Experimental model and study participant details

Human subject study and biosafety approvals for blood draws

Human blood specimens from convalescent COVID-19 patients and related research activities were implemented under NIH guidelines and the protocols approved by the Northwestern University Institutional Review Board (STU00205299) as well as the Institutional Biosafety Committee for COVID-19 research. For collecting human blood specimens, patients and donors were recruited at Northwestern Memorial Hospital based on their availability and written informed consent was obtained. The blood was collected prior to COVID-19 vaccination. All convalescent COVID-19 patients enrolled in our study met the following criteria: testing positive for SARS-CoV-2, at least three weeks following recovery; some patients were also enrolled if they had a positive RBD-specific IgG in a community screening study. 20 mL of blood was drawn into EDTA tubes and transported on ice blocks to the Robert H. Lurie Comprehensive Cancer Center Flow Cytometry Core Facility at Northwestern University.

Experimental animals

Mice used in our animal studies were housed in specific pathogen-free facilities at The University of Chicago Howard T. Ricketts Regional Biocontainment Laboratory. Experiments with SARS-CoV-2 were performed in biosafety level 3 (BSL3) and animal BSL3 (ABSL3) containment in accordance with the institutional guidelines following experimental protocol review and approval by the Institutional Biosafety Committee (IBC) and the Institutional Animal Care and Use Committee (IACUC) at the University of Chicago (IACUC protocol ACUP #72642).

Three experimental groups of 15-week-old hACE2 transgenic B6.Cg-Tg(K18-ACE2)2Prlmn/J (K18-hACE2) mice were set up with nine mice per group (four or five females and males in each group, for an approximately 50%/50% split). Three groups of IgG (1.2 μg for Washington and Delta or 8.5 μg for Omicron) were each mixed with SARS-CoV-2 viral strains (Washington at 10,000 pfu, Delta at 10,000 pfu, or Omicron at 20,000 pfu) and incubated for 1 h at 37°C. After incubation, the mice were anesthetized with isoflurane and 25 μL of the mixture was administered intranasally. Following viral infection challenges, animals were monitored for health twice daily and weighed once per day. The clinical scoring system was as follows:

  • Score 0 (pre-inoculation): mice are bright, alert, and active, with normal fur coat and posture.

  • Score 1 (post-inoculation, pi): mice are bright, alert, and active, with normal fur coat and posture and no weight loss.

  • Score 1.5: mice present with slightly ruffled fur but are active OR weight loss might occur but does not reach 2.5%; recovery can be expected.

  • Score 2 (pi): ruffled fur OR less active OR <5% weight loss; recovery might occur.

  • Score 2.5 (pi): ruffled fur OR not active but moves when touched OR hunched posture OR difficulty breathing OR weight loss 5–10%; recovery is unlikely but still might occur.

  • Score 3 (pi): ruffled fur OR inactive but moves when touched OR difficulty breathing OR weight loss at 11–20%; recovery is not expected.

  • Score 4 (pi): ruffled fur OR positioned on its side or back OR dehydrated OR difficulty breathing OR weight loss >20% OR labored breathing; recovery is not expected.

  • Score 5 (pi): death.

Three days post infection, mice challenged with the Omicron variant were euthanized, while the mice challenged with the Washington and Delta variants were euthanized on day seven post infection. One lung was used to determine SARS-CoV-2 viral genome levels, and the other lung was fixed with formalin.

Cell lines

HEK 293T cells were purchased from the American Type Culture Collection (ATCC, Cat #CRL-3216) and ACE2+ HEK-293 cells were provided by Dr. Daniel Batlle and Dr. Jan Wysocki at Northwestern University. The A549-hACE2 cells were provided by Benjamin TenOever at New York University. Cells were maintained in complete Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 4.5 g/L glucose, L-glutamine, sodium pyruvate, 10% Fetal Bovine Serum (FBS) and 1% Penicillin-Streptomycin. Cells were cultured under sterile conditions at 37°C and 5% CO2. Cell lines were regularly tested for mycoplasma contamination using MycoAlert mycoplasma detection kit (Lonza, LT07-318).

Sex as a biological variable

Human blood specimens were collected from both male and female patients. Additionally, our study examined male and female mice, and similar findings are reported for both sexes.

Method details

B-cell sequencing (VDJ), bioinformatic analysis

B cells were isolated from blood of convalescent COVID-19 donors using the EasySep B kit (Stemcell Technologies, cat no. 17954). SARS-CoV-2 Spike RBD (RayBiotech, catalog no. 230-30162) was biotinylated via a commercially available kit (Thermo Scientific, cat no. 21330) and the resulting biotin-labeled protein was purified through a Zeba quick spin column (Thermo Scientific, cat no. 89882). The RBD-biotin was prebound with streptavidin-AlexaFluor-647 to make RBD-647 (Invitrogen, cat no. S21374). B cells were stained with CD19, CD27, CD38, anti-IgM, RBD-647, and a TotalSeq-C hashtag oligo (BioLegend). RBD-647+ B cells were sorted on a FACS Aria (BD Biosciences) in the Robert H. Lurie Cancer Center Flow Cytometry core facility. Due to the small number of RBD-647+ B cells isolated from each patient, we combined B cells with monocytes isolated from the same patients, labeled with a different hashtag oligo.24

B cells were partitioned using a 10x Genomics Chromium Controller for GEM generation followed by single-cell library construction using the 10x Chromium Next GEM Single Cell V(D)J Library reagent kit. Libraries were sequenced at the Northwestern University Sequencing Core Facility on an Illumina HiSeq 4000 using the sequencing parameters indicated by the manufacturer. Sequences were aligned using CellRanger (10x Genomics). Aligned antibody sequences from CellRanger were extracted for downstream analysis. VDJ sequences were ordered from IDT and cloned into the AbVec vectors (Addgene).

WT-neutralization prediction based on antibody structure prediction and docking

From each antibody sequence, we first predicted its 3D structure using the web application ABodyBuilder-ML.50 We then determined the “active” residues for each antibody’s predicted structure using the web server proABC251 and the “passive” residues for an RBD crystal structure (PDB ID: 6W41, Chain C) using the criterion of DSSP-calculated relative solvent accessibility above 0.4. For each pair of antibody heavy (H) and light (L) chains and RBD antigen structures, we performed initial protein docking using the web server HADDOCK25 (“Scenario 2”, for when a loose definition of the epitope is known) together with the aforementioned active and passive residues. We then refined the resulting top ten or fewer HADDOCK structural models (cluster representatives) and estimated their confidence weights using the computer program Bayesian Active Learning (BAL).30 For each antibody, we calculated a confidence indicator of WT virus neutralization (0.0–1.0) by using these protein-docking models to identify the RBD residues that the antibody blocks from ACE2 binding. A total of 26 ACE2-binding residues on the RBD were determined from ACE2-RBD co-crystal structures (PDB IDs: 6M17, 6M0J, and 6LZG), based on proximity within 5 Å of ACE2; their residue indices are 417, 446, 447, 449, 453, 455, 456, 473, 475, 476, 477, 484, 486, 487, 489, 490, 493, 494, 495, 496, 498, 500, 501, 502, 503, and 505. Specifically, in each structural model, a given antibody is predicted to interact and interfere with a portion (0.0–1.0) of the 26 ACE2-binding residues on the RBD (within 5 Å of any RBD heavy atom); the WT-neutralizing confidence predictor is then calculated as the weighted average of these portions across all structural models for that antibody.
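The final scoring step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the coordinate-based input format are hypothetical, and a residue is assumed to count as blocked when any of its heavy atoms lies within 5 Å of any antibody heavy atom.

```python
import math

# The 26 ACE2-binding RBD residue indices listed in the text.
ACE2_BINDING_RESIDUES = [
    417, 446, 447, 449, 453, 455, 456, 473, 475, 476, 477, 484, 486,
    487, 489, 490, 493, 494, 495, 496, 498, 500, 501, 502, 503, 505,
]


def blocked_fraction(rbd_atoms, antibody_atoms, cutoff=5.0):
    """Fraction of ACE2-binding residues contacted by the antibody.

    rbd_atoms: dict mapping residue index -> list of (x, y, z) heavy atoms.
    antibody_atoms: list of (x, y, z) antibody heavy-atom coordinates.
    The contact definition (any atom pair within `cutoff` angstroms) is an
    assumption made for this sketch.
    """
    blocked = 0
    for res in ACE2_BINDING_RESIDUES:
        coords = rbd_atoms.get(res, [])
        if any(math.dist(a, b) <= cutoff
               for a in coords for b in antibody_atoms):
            blocked += 1
    return blocked / len(ACE2_BINDING_RESIDUES)


def wt_neutralization_confidence(models):
    """Weighted average of blocked fractions over docking models.

    models: list of (weight, rbd_atoms, antibody_atoms) tuples, where the
    weight plays the role of the model confidence weight.
    """
    total = sum(w for w, _, _ in models)
    return sum(w * blocked_fraction(r, a) for w, r, a in models) / total


# Toy geometry for illustration only (not real structural data).
toy_rbd = {417: [(0.0, 0.0, 0.0)], 505: [(100.0, 0.0, 0.0)]}
toy_models = [
    (0.75, toy_rbd, [(3.0, 0.0, 0.0)]),   # contacts residue 417 only
    (0.25, toy_rbd, [(50.0, 0.0, 0.0)]),  # contacts no listed residue
]
confidence = wt_neutralization_confidence(toy_models)
```

Because each per-model fraction lies between 0 and 1 and the weights are normalized, the resulting confidence also lies between 0.0 and 1.0.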

Antibody language model (AbLM)

The architecture, training, and application of AbLM26 are illustrated in Figure 2A. We adopted a bidirectional self-attention-based transformer encoder (12 layers and 12 heads per layer; please refer to the model ‘RP15_B1’52 for more details). We pretrained the encoder using over 12 million non-redundant protein-domain sequences from Pfam-RP15-v3246 and fine-tuned two weight-tied sequence encoders, one for the heavy chain (VH) and the other for the light chain (VL), using 4,196 paired heavy and light-chain antibody sequences (variable region) from SabDab.47 During fine-tuning, a VH-VL cross-attention module was added after the weight-tied VH and VL encoders. While random masking following BERT was applied in pre-training, antibody-specific masking of one random CDR region per training example was adopted in fine-tuning. For a given input of paired heavy and light-chain sequences, the output of the resulting antibody language model is the “embedding” of the antibody sequence, that is, a 1536-dimensional vector consisting of 768 dimensions for each of the heavy and light chains. The embeddings lie in a 1536-dimensional space dubbed the “latent” space.
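The antibody-specific masking used in fine-tuning can be pictured with a short sketch. This is an illustration under stated assumptions rather than the AbLM code: the function name, the mask token, the toy sequence, and the CDR index ranges are all hypothetical, and the sketch assumes that every residue of the chosen CDR is masked.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def mask_one_cdr(tokens, cdr_spans, rng=random):
    """Mask all residues of one randomly chosen CDR.

    tokens: list of residue tokens for one chain.
    cdr_spans: list of (start, end) half-open index ranges, one per CDR.
    Returns the masked token list and the span that was masked.
    """
    start, end = rng.choice(cdr_spans)
    masked = list(tokens)  # copy so the input is left untouched
    for i in range(start, end):
        masked[i] = MASK
    return masked, (start, end)


# Toy sequence and CDR ranges for illustration only.
tokens = list("ACDEFGHIKLMNPQRSTVWY")
cdr_spans = [(2, 5), (8, 11), (15, 18)]
masked, span = mask_one_cdr(tokens, cdr_spans, random.Random(0))
```

Masking a whole CDR, rather than scattered positions, asks the model to reconstruct a contiguous hypervariable loop from its framework context and from the partner chain.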

Variant response prediction based on machine learning

To emulate low-data regimes, we used variant response profiles of 14 clinical antibodies as of late 2022 (including 9 single antibodies under emergency use authorization and 5 single antibodies under clinical trials), as listed in the “susceptibility summaries” of the Coronavirus Resistance Database (CoV-RDB)48 and Figure S6. Specifically, we used their log fold improvements against Delta, Omicron/BA.1, and Omicron/BA.4/5, ranging between −3 (≥1000-fold reduced neutralizing activity) and +1 (≥10-fold increased neutralizing activity). In the 1536-dimensional latent space, we constructed covariance kernels using Euclidean distances among the embeddings of 1366 uncharacterized IgG candidates and 14 experimentally profiled clinical antibodies, and accordingly built three Kriging regressors for log fold improvements against Delta, Omicron/BA.1, and Omicron/BA.4/5, respectively. We made variant response predictions for all 1366 IgG candidates and visualized the landscape by interpolation (inverse distance weighting). We then validated these predictions using 19 IgG candidates.
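The regression step can be illustrated with a small, dependency-free sketch of Kriging (the posterior mean of zero-mean Gaussian process regression). The squared-exponential kernel form, the length scale, the nugget term, and all function names are assumptions made for this example; the text specifies only that the covariance kernels were built from Euclidean distances between embeddings. The toy inputs are two-dimensional placeholders, not AbLM embeddings.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance of the Euclidean distance (assumed)."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def kriging_predict(train_x, train_y, query_x, length_scale=1.0, nugget=1e-8):
    """Predict responses at query points from a few profiled antibodies."""
    n = len(train_x)
    K = [[rbf_kernel(train_x[i], train_x[j], length_scale)
          + (nugget if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(K, train_y)  # alpha = K^{-1} y
    return [sum(rbf_kernel(q, train_x[i], length_scale) * alpha[i]
                for i in range(n)) for q in query_x]


# Toy 2-D "embeddings" and log fold changes (placeholders, not study data).
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
train_y = [-3.0, 0.0, 1.0]
preds_at_train = kriging_predict(train_x, train_y, train_x)
pred_far = kriging_predict(train_x, train_y, [(100.0, 100.0)])
```

With a near-zero nugget the predictor interpolates the profiled antibodies almost exactly, while candidates far from all profiled antibodies in the latent space revert toward the prior mean of zero.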

Antibody re-design against the Delta variant

We chose IgG 106/107 as the “seed” for improved neutralization against a SARS-CoV-2 variant (Delta). Taking the highest-weight structural model of the antibody-RBD (WT) complex as the input, we used the computational protein design program iCFN40 first to predict the antibody-RBD (Delta variant) complex structure and then to design amino acid substitutions for the antibody to gain neutralization against the Delta variant. Specifically, both steps involve multistate design with a single substrate per state (see more details in Karimi and Shen40) and find the optimal structures (and sequences, when applicable) to minimize the energy difference between a positive and a negative state.

When predicting the antibody-RBD (Delta variant) complex structure, we maximally disrupt binding by minimizing the folding-energy difference between the separate (positive-state) and bound (negative-state) antibody-RBD, as detailed previously.53 All residues within 5 Å of RBD residues L452 and T478 were treated as flexible during design, while the amino acid substitutions L452R and T478K were enforced.

When designing the optimal single amino acid substitutions for IgG 106/107 and simultaneously predicting the structures, we maximally enhance RBD binding by minimizing the folding-energy difference between the bound (positive-state) and the separate (negative-state) antibody-RBD while constraining the folding stability, as detailed in Karimi and Shen40 (XRCC1 design). The antibody redesign positions are all residues within 5 Å of RBD residues 452 and 478, including 1 on the heavy chain near residue 452 of RBD and 7 on the light chain near residue 478 of RBD.
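The two multistate objectives described above can be written compactly as follows. The notation is introduced here for illustration and is an assumption, not the formulation used in iCFN: G_bound and G_sep denote folding energies of the bound and separated antibody-RBD states, s the antibody sequence, x the conformation, and the stability threshold τ is a hypothetical symbol for the folding-stability constraint.

```latex
% Assumed notation (not from the original text):
%   G_bound(s, x): folding energy of the bound antibody-RBD complex
%   G_sep(s, x):   folding energy of the separated antibody and RBD
%   tau:           hypothetical threshold for the folding-stability constraint
\begin{align}
\Delta G_{\mathrm{bind}}(s, x) &= G_{\mathrm{bound}}(s, x) - G_{\mathrm{sep}}(s, x) \\
\text{Delta complex prediction:}\quad
  & \min_{x}\; \bigl[ G_{\mathrm{sep}}(s_{\mathrm{WT}}, x) - G_{\mathrm{bound}}(s_{\mathrm{WT}}, x) \bigr]
    = \min_{x}\; \bigl[ -\Delta G_{\mathrm{bind}}(s_{\mathrm{WT}}, x) \bigr] \\
\text{Antibody redesign:}\quad
  & \min_{s,\,x}\; \Delta G_{\mathrm{bind}}(s, x)
    \quad \text{subject to} \quad G_{\mathrm{sep}}(s, x) \le \tau
\end{align}
```

In both cases the positive state is the one whose energy is minimized and the negative state the one subtracted, which is why the roles of the bound and separate states swap between the two steps.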

Cloning of antibody expression plasmids

The VDJ sequences of heavy and light chains of prioritized antibody pairs were obtained from the single cell VDJ sequencing. DNA fragments harboring the specific variable sequences were then synthesized by gBlock Gene Fragment technology by Integrated DNA Technologies (IDT). Synthesized double stranded DNA fragments were then cloned into AbVec2.0-IGHG1 (Addgene #80795), AbVc2.0-1.1-IGKC (Addgene #80796), and AbVec1.1-IGLC2-Xho1 (Addgene #99575)49 using the NEBuilder HiFi DNA Assembly Master Mix (Cat# E2621) to generate the heavy chain, kappa light chain, and lambda light chains, respectively. The IgG expression vectors and protocols were generously shared by Drs. Jenna Guthmiller and Patrick Wilson at the University of Chicago. Proper insertions of cloned DNA fragments were confirmed by Sanger sequencing.

Antibody production and purification

Plasmids expressing the heavy and light chain of each tested antibody were transfected into HEK293T cells (ATCC CRL-3216) by a calcium chloride transfection method. Briefly, 60–80% confluent HEK293T cells were transfected with about 19 μg each of the heavy and light chain expression plasmids (brought up to 1044 μL with nuclease-free H2O) mixed with 188 μL of 2 M CaCl2, followed by dropwise addition of 1.25 mL of 2× HBS buffer, pH 7.12 (50 mM HEPES acid, 280 mM NaCl, 1.5 mM Na2HPO4), while introducing bubbles to the mix. Following incubation for 20 min at room temperature, 15.5 mL cDMEM was added to the mix, which was subsequently transferred to cells and incubated at 37°C and 5% CO2 for 16 h, after which cells were washed with 1× PBS, fresh cDMEM was added, and the cells were incubated for three more days. Supernatants were collected and spun at 2,000 × g for 10 min at 4°C to pellet cell debris. Antibodies were purified from culture supernatants using protein A agarose beads (Thermo Fisher Pierce™ Protein A Agarose, 20333). Briefly, cell culture supernatant was added to pre-washed beads (500 μL) and incubated on a tabletop rocker for 2 h at room temperature followed by overnight incubation at 4°C. Bead-bound antibodies were collected by centrifugation at 1,800 × g for 10 min at 4°C (brake off) and washed in 1 M NaCl followed by two 1× PBS washes. Antibodies were eluted by adding 3 mL of 0.1 M glycine-HCl (pH 7.12) and rocking at room temperature for 10 min. Following centrifugation at 1,800 × g for 10 min at 4°C, supernatants containing the antibodies were neutralized by adding 200 μL of 1 M Tris-HCl (pH 8.8). Antibody solutions were then concentrated using an Amicon protein concentrator (4 mL capacity, 30 kDa MWT cutoff) and buffer exchanged with 1× PBS, and antibodies were stored at 4°C. The production yield for each antibody can be found in Data S5.

Quantification of antibody concentrations using ELISA

The Human IgG ELISA Kit (ab195215) was used to quantify the concentration of purified IgGs following the manufacturer’s instructions and optical density was measured using the SpectraMax iD5 plate reader.

Cell-based neutralization assay

To generate antibody-neutralized Spike receptor-binding domain (RBD), purified antibodies were incubated with the RBD-biotin-AF647 bait (3.3 nmol/L) for 45 min on ice, and the mixture was then incubated with ACE2-expressing HEK-293 (ACE2+ HEK-293) cells (200,000 cells in 100 μL 2% FBS/PBS) for 45 min on ice. RBD bait incubated with PBS instead of antibody and non-fluorescent RBD bait (mock control) served as controls. Cells were then washed twice with 2% FBS/PBS (300 × g for 5 min). DAPI was added to stain dead cells, and analysis was performed using a BD FACSymphony A5-laser analyzer. Viable singlets were gated for the percentage of the RBD-AF647+ population. Flow cytometry data were analyzed using FlowJo v10.6.2. IC50 values were estimated by fitting a log(inhibitor) vs. normalized response model using GraphPad Prism 9.3.1.
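To illustrate the curve fitting named above, the sketch below implements the standard-slope normalized-response equation, Y = 100/(1 + 10^(X − LogIC50)), and estimates LogIC50 by least squares on a progressively refined grid. It is a minimal Python sketch rather than the Prism implementation: the function names, search bounds, and data points are hypothetical, and it assumes the responses have already been normalized so that the no-antibody control equals 100%.

```python
def normalized_response(log_conc, log_ic50):
    """Standard-slope inhibition curve: 100% at low dose, 0% at high dose."""
    return 100.0 / (1.0 + 10.0 ** (log_conc - log_ic50))


def fit_log_ic50(log_concs, responses, lo=-12.0, hi=-3.0, rounds=6, steps=200):
    """Least-squares estimate of log10(IC50) by repeated grid refinement."""
    def sse(log_ic50):
        return sum((normalized_response(x, log_ic50) - y) ** 2
                   for x, y in zip(log_concs, responses))

    best = lo
    for _ in range(rounds):
        width = (hi - lo) / steps
        grid = [lo + i * width for i in range(steps + 1)]
        best = min(grid, key=sse)
        # Narrow the search window around the current best grid point.
        lo, hi = best - width, best + width
    return best


# Hypothetical data: log10 molar antibody concentration vs. normalized
# % RBD-AF647+ cells (noise-free, generated from a known LogIC50 of -8.5).
log_concs = [-10.0, -9.5, -9.0, -8.5, -8.0, -7.5, -7.0]
responses = [normalized_response(x, -8.5) for x in log_concs]

log_ic50 = fit_log_ic50(log_concs, responses)
ic50_nM = 10.0 ** log_ic50 * 1e9
print(f"Estimated IC50: {ic50_nM:.2f} nM")
```

A grid search is used here only to keep the example dependency-free; a nonlinear least-squares routine would normally be preferred for real, noisy data.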

Live SARS-CoV-2 virus infection of A549 cells

The live virus neutralization experiments were conducted at the NIAID-supported BSL-3 facility at the University of Chicago Howard T. Ricketts Regional Biocontainment Laboratory.

One day prior to viral infection, A549 cells overexpressing human ACE2 (A549-hACE2; obtained from tenOever and colleagues)54 were seeded onto 96-well plates at a density of 10,000 cells per well. Antibody dilutions were made in Infection Media with 2% FBS. Antibody dilutions were mixed with 500 pfu of a SARS-CoV-2 strain: WT Washington (nCoV/Washington/1/2020); variant Delta (NR-55672, SARS-Related Coronavirus 2, Isolate hCoV-19/USA/MD-HP05647/2021, Lineage B.1.617.2); variant Omicron (NR-56481, SARS-Related Coronavirus 2, Isolate hCoV-19/USA/GA-EHC-2811C/2021, Lineage B.1.1.529); or variant Omicron Isolate hCoV-19/USA/COR-22-063113/2022 (Lineage BA.5), and incubated at 37°C in the dark for 1 h. Culture media was then replaced with the virus-containing antibody dilutions, and cells were incubated for 72–96 h or until positive control wells showed at least 50% cytopathic effect (CPE), at which point the infectious media was removed and cells were fixed with 100 μL of formalin solution (10% Formalin, Fisher 23305510). Cells were incubated at room temperature with formalin for at least 15 min to ensure viral inactivation. The formalin was then removed, and cells were stained in 0.25% crystal violet solution (0.25% w/v Crystal Violet, Sigma C0775, in 20% EtOH) for 15–30 min, after which the crystal violet was washed off under gently flowing tap water. Plates were then allowed to dry uncovered on the benchtop for at least 24 h prior to analysis using the Infinite 200 Pro TECAN plate reader. A no-virus mock control and a non-antibody-treated control were included. The value of maximal death caused by the virus was determined from the virally infected but non-treated control wells, while maximal normal cell growth was determined from the mock-infected control wells. The average absorbance value of the non-treated control wells was subtracted from the remaining absorbance values to establish “0” values for non-treated wells. Next, all absorbance values were divided by the average absorbance value of the mock-infected control, thereby setting the value of the mock-infected control to 100. To estimate EC50 values, the neutralization curves were fit using the log(agonist) vs. response – Variable slope (four parameter) model in GraphPad Prism 9.3.1.
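The normalization and curve model described above can be sketched as follows. This is a hedged Python illustration, not the authors' analysis script: the function names and absorbance readings are hypothetical, and it assumes that the divisor is the background-subtracted mock-infected mean scaled by 100, which is what maps the mock control to exactly 100 and the non-treated control to 0. The curve function is the standard variable-slope form, Y = Bottom + (Top − Bottom)/(1 + 10^((LogEC50 − X)·HillSlope)).

```python
from statistics import mean


def normalize_absorbance(values, untreated_ctrl, mock_ctrl):
    """Scale absorbance so virus-only wells map to 0 and mock-infected wells to 100."""
    floor = mean(untreated_ctrl)      # maximal virus-induced cell death
    span = mean(mock_ctrl) - floor    # assumption: divisor is background-subtracted
    return [100.0 * (v - floor) / span for v in values]


def four_parameter_response(log_conc, bottom, top, log_ec50, hill_slope):
    """Variable-slope (four-parameter) dose-response curve."""
    return bottom + (top - bottom) / (
        1.0 + 10.0 ** ((log_ec50 - log_conc) * hill_slope))


# Hypothetical crystal violet absorbance readings.
untreated = [0.10, 0.12, 0.11]   # virus, no antibody
mock = [1.00, 1.04, 0.99]        # no virus
samples = [0.11, 0.35, 0.80, 1.01]

print([round(v, 1) for v in normalize_absorbance(samples, untreated, mock)])
# At X = LogEC50 the curve sits midway between Bottom and Top.
print(four_parameter_response(-9.0, bottom=0.0, top=100.0,
                              log_ec50=-9.0, hill_slope=1.0))
```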

Biolayer interferometry assay (BLI)

BLI assays were carried out using the Sartorius Octet R8 system. The binding affinity of the functionally validated antibodies for recombinant wild-type RBD was measured as follows. A solution of antibody at 12.5 nmol/L was loaded onto the Sartorius anti-human IgG Fc capture biosensor (Octet AHC2, 18–5142). After baseline measurements, the loaded biosensors were introduced into increasing concentrations of WT recombinant RBD (RayBiotech, 230–30162) in PBS containing 0.02% BSA and 0.002% Tween 20. Biosensors were then returned to baseline buffer (PBS containing 0.02% BSA and 0.002% Tween 20) to measure antibody dissociation rates. Association (Kon) and dissociation (Koff) rates as well as the dissociation constants (KD) were evaluated with the Octet Analysis Studio software using the 1:1 binding model. Double reference normalization was performed to eliminate signal resulting from possible non-specific binding of the RBD analyte to the biosensors.
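For readers unfamiliar with the 1:1 binding model, the relations it implies can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and rate constants are hypothetical (they are not values from Data S4), and the Octet software's actual fitting procedure is not reproduced. Under this model KD = Koff/Kon, the association phase follows R(t) = Req·(1 − exp(−(Kon·C + Koff)·t)) with Req = Rmax·C/(C + KD), and the dissociation phase decays as R(t) = R0·exp(−Koff·t).

```python
import math


def dissociation_constant(k_on, k_off):
    """K_D (M) from k_on (1/(M*s)) and k_off (1/s) under a 1:1 binding model."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Sensor response after t seconds of association at analyte concentration conc (M)."""
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, k_off):
    """Sensor response t seconds after return to buffer, starting from r0."""
    return r0 * math.exp(-k_off * t)


# Hypothetical rate constants chosen for illustration.
k_on, k_off = 2.0e5, 1.0e-3
kd = dissociation_constant(k_on, k_off)
print(f"KD = {kd * 1e9:.1f} nM")
```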

Lung fixation and histopathological analysis

Lung tissue was submerged in 1 mL formalin for 48 h. The formalin was then removed, and 1 mL of fresh formalin was added for an additional 12 days of incubation. Tissue was tested for viral inactivation and released. Fixed lungs were then embedded in paraffin and sectioned by routine procedures, followed by H&E staining. Stained slides were scanned and then analyzed using the NDP view2 software. Double-blinded evaluation of the percentage of total lung surface area involved was performed by a pathologist following a graded scheme adopted from a previous report.55

Lung viral genome level measurement using qRT-PCR

RNA was extracted from mouse lungs using the Nucleospin 96 RNA extraction kit as per the written instructions (Macherey-Nagel 740709.4). Prior to being run through the binding columns, the lung tissue was collected in the R1A buffer provided with the kit and homogenized using a FastPrep FP120 homogenizer with 1.4 mm ceramic beads (Omni International SKU 19–645). RNA was eluted in RNase-free water and used for qRT-PCR, which was performed on an Applied Biosystems StepOnePlus Real-Time PCR System using the SuperScript III Platinum One-Step qRT-PCR Kit with ROX (Invitrogen 11745–500). Sample RNA was quantified against a standard curve made by extracting viral RNA from lab viral stocks. The quantity of RNA in the viral stock was measured with a Nanodrop 2000. The CDC-recommended N2 primers and probe used in the qRT-PCR were purchased from IDT (10006824, 10006825, and 10006826).
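Quantification against a standard curve, as described above, typically relies on a linear relationship between the threshold cycle (Ct) and the log10 of the input RNA quantity. The Python sketch below shows that calculation under stated assumptions: the function names, dilution series, and Ct values are hypothetical, and the units of the standards are arbitrary because the source does not specify them.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ct_values) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ct_values))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the RNA quantity in a sample."""
    return 10.0 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of viral stock RNA (arbitrary units).
standards = [1e7, 1e6, 1e5, 1e4, 1e3]
cts = [14.1, 17.5, 20.8, 24.2, 27.6]

slope, intercept = fit_standard_curve(standards, cts)
# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10.0 ** (-1.0 / slope) - 1.0
sample_quantity = quantity_from_ct(22.0, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.2f}, sample={sample_quantity:.3g}")
```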

Quantification and statistical analysis

GraphPad Prism 9.3.1 software was used to perform statistical analyses and to calculate EC50 and IC50 values. Parametric unpaired t-tests or non-parametric Mann–Whitney tests were used as appropriate to assess differences between groups (as reported in the figure legends). Results were considered significant if p < 0.05. Data are presented as mean ± standard error of the mean (SEM) (as reported in the figure legends). Measurements were taken from distinct samples in all experiments with biological and/or technical replicates.
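As a small illustration of the summary statistics named above, the following Python sketch computes the mean ± SEM of a group and the Mann–Whitney U statistic for two groups. The function names and measurements are hypothetical, and the sketch deliberately stops at the U statistic: it does not compute p-values, which the authors obtained from GraphPad Prism.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    return mean(values), stdev(values) / sqrt(len(values))


def mann_whitney_u(a, b):
    """U statistic for group a versus group b; tied pairs contribute 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)


# Hypothetical measurements for two groups.
treated = [2.0, 4.0, 6.0]
control = [7.0, 8.0, 9.0]

m, sem = mean_sem(treated)
u = mann_whitney_u(treated, control)
print(f"treated: {m:.2f} ± {sem:.2f} (SEM), U = {u}")
```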

Published: September 17, 2025

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2025.113584.

Contributor Information

Huiping Liu, Email: huiping.liu@northwestern.edu.

Deyu Fang, Email: fangd@northwestern.edu.

Yang Shen, Email: yshen@tamu.edu.

Supplemental information

Document S1. Figures S1–S12
mmc1.pdf (2.6MB, pdf)
Data S1. Machine learning predictions
mmc2.xlsx (163KB, xlsx)
Data S2. Cell-based neutralization results
mmc3.xlsx (18KB, xlsx)
Data S3. Live virus neutralization data including EC50 values
mmc4.xlsx (46.3KB, xlsx)
Data S4. BLI kinetics values including Koff, Kon, and KD
mmc5.xlsx (10.8KB, xlsx)
Data S5. Recombinant IgG production yield for all functionally validated antibodies
mmc6.xlsx (9.7KB, xlsx)

References

  • 1. Urquhart L. Top companies and drugs by sales in 2020. Nat. Rev. Drug Discov. 2021;20:253. doi: 10.1038/d41573-021-00050-6.
  • 2. Urquhart L. Top companies and drugs by sales in 2019. Nat. Rev. Drug Discov. 2020;19:228. doi: 10.1038/d41573-020-00047-7.
  • 3. Urquhart L. Top drugs and companies by sales in 2018. Nat. Rev. Drug Discov. 2019;18:245. doi: 10.1038/d41573-019-00049-0.
  • 4. Jaroszewicz W., Morcinek-Orlowska J., Pierzynowska K., Gaffke L., Wegrzyn G. Phage display and other peptide display technologies. FEMS Microbiol. Rev. 2022;46 doi: 10.1093/femsre/fuab052.
  • 5. Zhou P., Yang X.L., Wang X.G., Hu B., Zhang L., Zhang W., Si H.R., Zhu Y., Li B., Huang C.L., et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7.
  • 6. Lan J., Ge J., Yu J., Shan S., Zhou H., Fan S., Zhang Q., Shi X., Wang Q., Zhang L., Wang X. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020;581:215–220. doi: 10.1038/s41586-020-2180-5.
  • 7. Hamming I., Timens W., Bulthuis M.L.C., Lely A.T., Navis G.J., van Goor H. Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis. J. Pathol. 2004;203:631–637. doi: 10.1002/path.1570.
  • 8. Barnes C.O., Jette C.A., Abernathy M.E., Dam K.M.A., Esswein S.R., Gristick H.B., Malyutin A.G., Sharaf N.G., Huey-Tubman K.E., Lee Y.E., et al. SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies. Nature. 2020;588:682–687. doi: 10.1038/s41586-020-2852-1.
  • 9. Chen P., Nirula A., Heller B., Gottlieb R.L., Boscia J., Morris J., Huhn G., Cardona J., Mocherla B., Stosor V., et al. SARS-CoV-2 Neutralizing Antibody LY-CoV555 in Outpatients with Covid-19. N. Engl. J. Med. 2021;384:229–237. doi: 10.1056/NEJMoa2029849.
  • 10. Jiang S., Hillyer C., Du L. Neutralizing Antibodies against SARS-CoV-2 and Other Human Coronaviruses. Trends Immunol. 2020;41:355–359. doi: 10.1016/j.it.2020.03.007.
  • 11. Ju B., Zhang Q., Ge J., Wang R., Sun J., Ge X., Yu J., Shan S., Zhou B., Song S., et al. Human neutralizing antibodies elicited by SARS-CoV-2 infection. Nature. 2020;584:115–119. doi: 10.1038/s41586-020-2380-z.
  • 12. Wajnberg A., Amanat F., Firpo A., Altman D.R., Bailey M.J., Mansour M., McMahon M., Meade P., Mendu D.R., Muellers K., et al. Robust neutralizing antibodies to SARS-CoV-2 infection persist for months. Science. 2020;370:1227–1230. doi: 10.1126/science.abd7728.
  • 13. Ge J., Wang R., Ju B., Zhang Q., Sun J., Chen P., Zhang S., Tian Y., Shan S., Cheng L., et al. Antibody neutralization of SARS-CoV-2 through ACE2 receptor mimicry. Nat. Commun. 2021;12:250. doi: 10.1038/s41467-020-20501-9.
  • 14. Han P., Li L., Liu S., Wang Q., Zhang D., Xu Z., Han P., Li X., Peng Q., Su C., et al. Receptor binding and complex structures of human ACE2 to spike RBD from omicron and delta SARS-CoV-2. Cell. 2022;185:630–640.e10. doi: 10.1016/j.cell.2022.01.001.
  • 15. Regeneron Pharmaceuticals Inc. Fact sheet for health care providers: emergency use authorization (EUA) of casirivimab and imdevimab. https://www.fda.gov/media/155054/download
  • 16. Weisblum Y., Schmidt F., Zhang F., DaSilva J., Poston D., Lorenzi J.C., Muecksch F., Rutkowska M., Hoffmann H.H., Michailidis E., et al. Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants. eLife. 2020;9 doi: 10.7554/eLife.61312.
  • 17. Cao Y., Wang J., Jian F., Xiao T., Song W., Yisimayi A., Huang W., Li Q., Wang P., An R., et al. Omicron escapes the majority of existing SARS-CoV-2 neutralizing antibodies. Nature. 2022;602:657–663. doi: 10.1038/s41586-021-04385-3.
  • 18. World Health Organization. Coronavirus (COVID-19) Dashboard. https://covid19.who.int
  • 19. Liu C., Ginn H.M., Dejnirattisai W., Supasa P., Wang B., Tuekprakhon A., Nutalai R., Zhou D., Mentzer A.J., Zhao Y., et al. Reduced neutralization of SARS-CoV-2 B.1.617 by vaccine and convalescent serum. Cell. 2021;184:4220–4236.e13. doi: 10.1016/j.cell.2021.06.020.
  • 20. Planas D., Veyer D., Baidaliuk A., Staropoli I., Guivel-Benhassine F., Rajah M.M., Planchais C., Porrot F., Robillard N., Puech J., et al. Reduced sensitivity of SARS-CoV-2 variant Delta to antibody neutralization. Nature. 2021;596:276–280. doi: 10.1038/s41586-021-03777-9.
  • 21. Kim J., McFee M., Fang Q., Abdin O., Kim P.M. Computational and artificial intelligence-based methods for antibody development. Trends Pharmacol. Sci. 2023;44:175–189. doi: 10.1016/j.tips.2022.12.005.
  • 22. Shan S., Luo S., Yang Z., Hong J., Su Y., Ding F., Fu L., Li C., Chen P., Ma J., et al. Deep learning guided optimization of human antibody against SARS-CoV-2 variants with broad neutralization. Proc. Natl. Acad. Sci. USA. 2022;119 doi: 10.1073/pnas.2122954119.
  • 23. Park C., Apley D. Patchwork kriging for large-scale gaussian process regression. J. Mach. Learn. Res. 2018;19:269–311.
  • 24. Hoffmann A.D., Weinberg S.E., Swaminathan S., Chaudhuri S., Almubarak H.F., Schipma M.J., Mao C., Wang X., El-Shennawy L., Dashzeveg N.K., et al. Unique molecular signatures sustained in circulating monocytes and regulatory T cells in convalescent COVID-19 patients. Clin. Immunol. 2023;252 doi: 10.1016/j.clim.2023.109634.
  • 25. Ambrosetti F., Jandova Z., Bonvin A.M.J.J. Information-Driven Antibody-Antigen Modelling with HADDOCK. Methods Mol. Biol. 2023;2552:267–282. doi: 10.1007/978-1-0716-2609-2_14.
  • 26. Sun Y. Protein Multimodal Learning for Variant Effect Prediction and Protein Engineering (Doctoral dissertation). Texas A&M University, College Station, TX, USA; 2023. https://hdl.handle.net/1969.1/203042
  • 27. McInnes L.A., Healy J., Saul N., Großberger L. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 2018;3:861. doi: 10.21105/joss.00861.
  • 28. Hansen J., Baum A., Pascal K.E., Russo V., Giordano S., Wloga E., Fulton B.O., Yan Y., Koon K., Patel K., et al. Studies in humanized mice and convalescent humans yield a SARS-CoV-2 antibody cocktail. Science. 2020;369:1010–1014. doi: 10.1126/science.abd0827.
  • 29. Leem J., Dunbar J., Georges G., Shi J., Deane C.M. ABodyBuilder: Automated antibody structure prediction with data-driven accuracy estimation. mAbs. 2016;8:1259–1268. doi: 10.1080/19420862.2016.1205773.
  • 30. Cao Y., Shen Y. Bayesian Active Learning for Optimization and Uncertainty Quantification in Protein Docking. J. Chem. Theory Comput. 2020;16:5334–5347. doi: 10.1021/acs.jctc.0c00476.
  • 31. Planas D., Saunders N., Maes P., Guivel-Benhassine F., Planchais C., Buchrieser J., Bolland W.H., Porrot F., Staropoli I., Lemoine F., et al. Considerable escape of SARS-CoV-2 Omicron to antibody neutralization. Nature. 2022;602:671–675. doi: 10.1038/s41586-021-04389-z.
  • 32. El-Shennawy L., Hoffmann A.D., Dashzeveg N.K., McAndrews K.M., Mehl P.J., Cornish D., Yu Z., Tokars V.L., Nicolaescu V., Tomatsidou A., et al. Circulating ACE2-expressing extracellular vesicles block broad strains of SARS-CoV-2. Nat. Commun. 2022;13:405. doi: 10.1038/s41467-021-27893-2.
  • 33. Choi J.R., Kim M.J., Tae N., Wi T.M., Kim S.H., Lee E.S., Kim D.H. BLI-Based Functional Assay in Phage Display Benefits the Development of a PD-L1-Targeting Therapeutic Antibody. Viruses. 2020;12 doi: 10.3390/v12060684.
  • 34. Noy-Porat T., Mechaly A., Levy Y., Makdasi E., Alcalay R., Gur D., Aftalion M., Falach R., Leviatan Ben-Arye S., Lazar S., et al. Therapeutic antibodies, targeting the SARS-CoV-2 spike N-terminal domain, protect lethally infected K18-hACE2 mice. iScience. 2021;24 doi: 10.1016/j.isci.2021.102479.
  • 35. Jug A., Bratkovič T., Ilaš J. Biolayer interferometry and its applications in drug discovery and development. TrAC, Trends Anal. Chem. 2024;176 doi: 10.1016/j.trac.2024.117741.
  • 36. Rives A., Meier J., Sercu T., Goyal S., Lin Z., Liu J., Guo D., Ott M., Zitnick C.L., Ma J., Fergus R. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl. Acad. Sci. USA. 2021;118 doi: 10.1073/pnas.2016239118.
  • 37. Lin Z., Akin H., Rao R., Hie B., Zhu Z., Lu W., Smetanin N., Verkuil R., Kabeli O., Shmueli Y., et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. 2023;379:1123–1130. doi: 10.1126/science.ade2574.
  • 38. Elnaggar A., Heinzinger M., Dallago C., Rehawi G., Wang Y., Jones L., Gibbs T., Feher T., Angerer C., Steinegger M., et al. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2022;44:7112–7127. doi: 10.1109/TPAMI.2021.3095381.
  • 39. Olsen T.H., Moal I.H., Deane C.M. AbLang: an antibody language model for completing antibody sequences. Bioinform. Adv. 2022;2 doi: 10.1093/bioadv/vbac046.
  • 40. Karimi M., Shen Y. iCFN: an efficient exact algorithm for multistate protein design. Bioinformatics. 2018;34:i811–i820. doi: 10.1093/bioinformatics/bty564.
  • 41. Lang-Meli J., Fuchs J., Mathe P., Ho H.E., Kern L., Jaki L., Rusignuolo G., Mertins S., Somogyi V., Neumann-Haefelin C., et al. Case Series: Convalescent Plasma Therapy for Patients with COVID-19 and Primary Antibody Deficiency. J. Clin. Immunol. 2022;42:253–265. doi: 10.1007/s10875-021-01193-2.
  • 42. Rijnders B.J.A., Huygens S., Mitja O. Evidence-based dosing of convalescent plasma for COVID-19 in future trials. Clin. Microbiol. Infect. 2022;28:667–671. doi: 10.1016/j.cmi.2022.01.026.
  • 43. Ren X., Zhou J., Guo J., Hao C., Zheng M., Zhang R., Huang Q., Yao X., Li R., Jin Y. Reinfection in patients with COVID-19: a systematic review. Glob. Health Res. Policy. 2022;7:12. doi: 10.1186/s41256-022-00245-3.
  • 44. Rouet R., Henry J.Y., Johansen M.D., Sobti M., Balachandran H., Langley D.B., Walker G.J., Lenthall H., Jackson J., Ubiparipovic S., et al. Broadly neutralizing SARS-CoV-2 antibodies through epitope-based selection from convalescent patients. Nat. Commun. 2023;14:687. doi: 10.1038/s41467-023-36295-5.
  • 45. Hie B.L., Shanker V.R., Xu D., Bruun T.U.J., Weidenbacher P.A., Tang S., Wu W., Pak J.E., Kim P.S. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 2024;42:275–283. doi: 10.1038/s41587-023-01763-2.
  • 46. Mistry J., Chuguransky S., Williams L., Qureshi M., Salazar G.A., Sonnhammer E.L.L., Tosatto S.C.E., Paladin L., Raj S., Richardson L.J., et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 2021;49:D412–D419. doi: 10.1093/nar/gkaa913.
  • 47. Dunbar J., Krawczyk K., Leem J., Baker T., Fuchs A., Georges G., Shi J., Deane C.M. SAbDab: the structural antibody database. Nucleic Acids Res. 2014;42:D1140–D1146. doi: 10.1093/nar/gkt1043.
  • 48. Tzou P.L., Tao K., Pond S.L.K., Shafer R.W. Coronavirus Resistance Database (CoV-RDB): SARS-CoV-2 susceptibility to monoclonal antibodies, convalescent plasma, and plasma from vaccinated persons. PLoS One. 2022;17 doi: 10.1371/journal.pone.0261045.
  • 49. Tiller T., Meffre E., Yurasov S., Tsuiji M., Nussenzweig M.C., Wardemann H. Efficient generation of monoclonal antibodies from single human B cells by single cell RT-PCR and expression vector cloning. J. Immunol. Methods. 2008;329:112–124. doi: 10.1016/j.jim.2007.09.017.
  • 50. Abanades B., Georges G., Bujotzek A., Deane C.M. ABlooper: fast accurate antibody CDR loop structure prediction with accuracy estimation. Bioinformatics. 2022;38:1877–1880. doi: 10.1093/bioinformatics/btac016.
  • 51. Ambrosetti F., Olsen T.H., Olimpieri P.P., Jiménez-García B., Milanetti E., Marcatilli P., Bonvin A.M.J.J. proABC-2: PRediction of AntiBody contacts v2 and its application to information-driven docking. Bioinformatics. 2020;36:5107–5108. doi: 10.1093/bioinformatics/btaa644.
  • 52. Sun Y., Shen Y. Structure-informed protein language models are robust predictors for variant effects. Hum. Genet. 2025;144:209–225. doi: 10.1007/s00439-024-02695-w.
  • 53. Irani S., Tan W., Li Q., Toy W., Jones C., Gadiya M., Marra A., Katzenellenbogen J.A., Carlson K.E., Katzenellenbogen B.S., et al. Somatic estrogen receptor alpha mutations that induce dimerization promote receptor activity and breast cancer proliferation. J. Clin. Invest. 2024;134 doi: 10.1172/JCI163242.
  • 54. Blanco-Melo D., Nilsson-Payant B.E., Liu W.C., Uhl S., Hoagland D., Møller R., Jordan T.X., Oishi K., Panis M., Sachs D., et al. Imbalanced Host Response to SARS-CoV-2 Drives Development of COVID-19. Cell. 2020;181:1036–1045.e9. doi: 10.1016/j.cell.2020.04.026.
  • 55. Zheng J., Wong L.Y.R., Li K., Verma A.K., Ortiz M.E., Wohlford-Lenane C., Leidinger M.R., Knudson C.M., Meyerholz D.K., McCray P.B., Jr., Perlman S. COVID-19 treatments and pathogenesis including anosmia in K18-hACE2 mice. Nature. 2021;589:603–607. doi: 10.1038/s41586-020-2943-z.

Articles from iScience are provided here courtesy of Elsevier