Computational design of therapeutic antibodies with improved developability: efficient traversal of binder landscapes and rescue of escape mutations

Frédéric A Dreyer; Constantin Schneider; Aleksandr Kovaltsuk; Daniel Cutting; Matthew J Byrne; Daniel A Nissley; Henry Kenlay; Claire Marks; David Errington; Richard J Gildea; David Damerell; Pedro Tizei; Wilawan Bunjobpol; John F Darby; Ieva Drulyte; Daniel L Hurdiss; Sachin Surade; Newton Wahome; Douglas EV Pires; Charlotte M Deane

doi:10.1080/19420862.2025.2511220

. 2025 Jun 3;17(1):2511220. doi: 10.1080/19420862.2025.2511220

PMCID: PMC12164381 PMID: 40458889

ABSTRACT

Developing therapeutic antibodies is a challenging endeavor, often requiring large-scale screening to produce initial binders that often still require optimization for developability. We present a computational pipeline for the discovery and design of therapeutic antibody candidates, which incorporates physics- and AI-based methods for the generation, assessment, and validation of candidate antibodies with improved developability against diverse epitopes, via efficient few-shot experimental screens. We demonstrate that these orthogonal methods can lead to promising designs. We evaluated our approach by experimentally testing a small number of candidates against multiple SARS-CoV-2 variants in three different tasks: (i) traversing sequence landscapes of binders, where we identify highly sequence-dissimilar antibodies that retain binding to the Wuhan strain; (ii) rescuing binding from escape mutations, where we show that up to 54% of designs gain binding affinity to a new subvariant; and (iii) improving developability characteristics of antibodies while retaining binding properties. These results together demonstrate an end-to-end antibody design pipeline with applicability across a wide range of antibody design tasks. We experimentally characterized binding against different antigen targets, developability profiles, and cryo-EM structures of designed antibodies. Our work demonstrates how combined AI and physics computational methods improve productivity and viability of antibody designs.

KEYWORDS: Antibody design, artificial intelligence, immunology, mab, structural biology

1. Introduction

Antibodies are an important and growing class of therapeutics,¹ representing the largest and most successful class of biotherapeutics, with a market size estimated to reach $450B by 2028.² To date, more than 100 therapies have been approved by the FDA against a wide range of disease indications,³ with particular success in oncology and immuno-oncology, and with revenues projected to surpass that of small molecule therapies in the next few years.⁴

Modern antibody therapeutic discovery and design processes are heavily supplemented and accelerated by computational and machine learning (ML) driven methods.⁵ These approaches include the in silico prediction of epitope and paratope regions,^6,7 various developability and affinity characteristics^8–10 as well as generative approaches to the development of novel candidate antibodies.^11–13

Despite recent advances in computational approaches,^14–17 the majority of successful antibody drug discovery campaigns rely heavily on extensive experimental screening, initially to identify binders and in subsequent rounds to further optimize specificity, affinity, humanness, stability, and other characteristics needed for downstream development.¹ Alternatively, it is also common to start from highly developable frameworks and adapt these antibodies to new targets through high-throughput assays.^18,19 This practice of optimizing binding affinity and developability characteristics separately is typical of most discovery pipelines and tends to drive up overall costs and prolong discovery time.²⁰

We propose a design pipeline that combines (i) in silico biophysical property assessment, (ii) machine learning-based antibody design approaches and (iii) sample-efficient experimental validation to enable the design of high affinity and developable therapeutic antibodies from starting point antibody candidates. We use our pipeline to design antibodies against multiple variants of the SARS-CoV-2 receptor binding domain (RBD) and show that our methodology is able to discover binders from naturally occurring repertoires, improve developability metrics of existing binders, and rescue antibodies from escape mutations on the target antigen.

In this pilot study, we experimentally validated a small number of designs from three different computational design methods in a single-shot approach. Starting with known SARS-CoV-2 wild-type RBD-binding antibodies, we designed and characterized 285 novel antibodies against different strains of the virus. An overview of our computational design methods and characterization pipeline is shown in Figure 1.

Figure 1. Overview of the computational pipeline described in the article, drawn as a cycle with an antibody-antigen complex starting point, a computational design stage, a characterization stage and an experimental validation stage. (a) An antibody-antigen complex used as the starting point for optimization. (b) Overview of the computational antibody design platform, showing the three candidate generation methods. (c) The characterization pipeline used to filter and rank candidates to select a small subset. (d) Experimental validation of selected candidates to measure properties such as binding affinity to the target, aggregation propensity and thermostability.

We demonstrate our computational pipeline is capable of (i) efficiently traversing sequence landscapes of binders, achieving high expression and hit rates through the use of effective computational screening; (ii) rescuing developability issues of existing binders, such as aggregation, through the use of language model-guided modifications; and (iii) re-epitoping²¹ antibodies to mutated antigens through the use of an antibody-specific inverse folding model.

Our computational design pipeline represents a significant step toward ML-driven design of potent biotherapeutics.

2. Results

2.1. Selection of viable antibody starting points for design

We first sought a pool of template antibodies known to bind the SARS-CoV-2 RBD to serve as the basis for design. It was previously reported that antibodies binding the Class 3 epitope on RBD tend to be tolerant to mutations and thus maintain neutralization across RBD strains.^22,23 We characterized a set of 192 Class 3 antibodies, of which 182 have no known structure and were selected from a larger pool of Class 3 antibodies on the basis of their broad neutralization in a previous study.²⁴

To assess this pool of antibodies, we measured binding affinity against the Wuhan SARS-CoV-2 strain, the XBB.1 strain and the XBB.1.5 strain with a single-concentration surface plasmon resonance (SPR) experiment, as described in Section 4.1.4. A summary of this study is given in Appendix A. With these results, we confirmed that eight of the 10 antibodies with known structure were able to bind the original Wuhan SARS-CoV-2 strain but failed to bind, or only bound weakly, to the XBB.1.5 RBD. We selected five of these antibodies, BD55–5840,²⁵ DXP-604,²⁶ LY-CoV1404,²⁷ REGN10987,²⁸ and S2K146,²⁹ which have CDR H3 loop lengths up to 15 residues, as starting points for our hit expansion and re-epitoping pipelines described in Sections 2.2 and 2.4, respectively. For our developability pipeline, in Section 2.3, we also use as an additional starting point the antibody S309,³⁰ which exhibits binding to the Wuhan, XBB.1 and XBB.1.5 strains, but has poor developability characteristics, notably a propensity to aggregate and a low melting temperature. The binding affinity and developability properties of all antibodies used as starting points are given in Table 1. We resolved the structure of two of these starting point antibodies, REGN10987 and S309, using cryogenic electron microscopy as described in Section 4.3. These are shown in Figure 5a,b. Further details of the starting points used in each computational design pipeline are given in Appendix B.

Table 1.

Sequence of the CDR loops, kinetic characterization with multiple antigen concentrations against three SARS-CoV-2 variants (except for S2K146, where only a single antigen concentration measurement was performed), as described in Section 4.1.4, and developability characterization, as defined in Section 4.1.6, for each starting point antibody used in this study.

Antibody	CDR H1	CDR H2	CDR H3	CDR L1	CDR L2	CDR L3	$pK_{D}$ Wuhan	$pK_{D}$ XBB.1	$pK_{D}$ XBB.1.5	Developable
BD55–5840	GHSFTSNA	INTDTGTP	ARERDYSDYFFDY	ASLGISTD	GAS	QQYSNWPLT	10.12	NB	6.14	✓
DXP-604	GIIVSSNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLNSDLYT	9.67	NB	5.84	✓
LY-CoV1404	GFSLSISGVG	IYWDDDK	AHHSISTIFDH	SSDVGDYNY	EVS	SSYTTSSAV	9.01	NB	NB	✓
REGN10987	GFTFSNYA	ISYDGSNK	ASGSDYGDYLLVY	SSDVGGYNY	DVS	NSLTSISTWV	7.84	NB	NB	✓
S2K146	GFTFSNYA	ISYDGSNK	ASGSDYGDYLLVY	SSDVGGYNY	DVS	NSLTSISTWV	10.34	NB	NB	✓
S309	GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDN	QTVSSTS	GAS	QQHDTSLT	8.89	8.93	9.22	✗

Figure 5. Cryo-electron microscopy epitope mapping of four selected antibodies in complex with the SARS-CoV-2 spike protein, with one panel per antibody-antigen complex. (a) and (b) show the REGN10987 and S309 starting points, while (c) and (d) are selected designs from the inverse folding method. In each panel, the left-hand side shows central slices of 3D electron microscopy volumes colored by local resolution in angstroms, and the right-hand side displays locally refined interfaces of the fragment antigen-binding region and receptor-binding domain (in blue).

2.2. Efficiently navigating the antibody space enables the identification of diverse binders

We took five starting point antibodies (REGN10987, BD55–5840, DXP-604, S2K146 and LY-CoV1404) with strong binding affinity against the Wuhan strain, as defined in Section 4.1.5, but no or weak binding against the XBB.1.5 strain and moderate CDR H3 loop length. These starting points were used as seeds to generate libraries of candidate antibodies from paired and unpaired sequences in the Observed Antibody Space (OAS) dataset^31,32 as detailed in Section 4.2.2. Using our characterization pipeline, described in Section 4.2.1, to select antibodies for experimental validation, we computationally screened 11,389 candidate antibodies, of which 148 were submitted for experimental validation: 67 drawn directly from OAS and 81 designs elaborated using an inverse folding model. None of the 67 candidates drawn from OAS showed binding against the XBB.1 and XBB.1.5 strains, while 18 showed binding against the Wuhan strain in single concentration antigen SPR measurements as outlined in Section 4.1.4, with 14 strong binders and 4 medium binders, as defined in Section 4.1.5. A summary of the binding and developability profiles is given in Table 3, which also shows the high dissimilarity of the designs to the starting point, with median edit distances from 11 to 75 depending on the starting point. About 95% of designs pass our developability criterion described in Section 4.1.6. We achieved a hit rate of 21% across the four starting points where our characterization pipeline was able to identify binding antibodies. Of these binders, five derived from the DXP-604 and REGN10987 starting points were submitted for kinetic SPR characterization with multiple antigen concentrations, with three of these being confirmed as strong binders, as displayed in Table 2 and in Figure 2(a). Structures of the antibody-antigen complex for two designs for each starting point are displayed in Figure 2(c). We also show the full chromatograms of size exclusion chromatography in Figure 2(b). These antibodies are highly sequence-dissimilar to their respective starting point antibody, with up to 74 edits across the whole sequence. These results demonstrate that, given a starting point structure, our pipeline is able to identify sequence-distant binders with sample-efficient experimental screening.
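The edit distances quoted above count sequence differences between a design and its starting point. This section does not state which metric was used, so the following is only a minimal sketch that assumes a standard Levenshtein distance (substitutions, insertions and deletions each cost one); the function name `edit_distance` is illustrative. It is applied to the CDR H3 and CDR L1 loops of DXP-604 and one of its OAS-derived designs listed in Table 2.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two amino-acid sequences.

    Assumption: unit cost for substitution, insertion and deletion.
    The metric actually used in the study is not specified in this section.
    """
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        curr = [i]
        for j, residue_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if residue_a == residue_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # CDR H3 and CDR L1 of DXP-604 versus one design from Table 2.
    print(edit_distance("ARDLGPYGMDV", "ARDLIADGMDV"))
    print(edit_distance("QGISSD", "QGIGSW"))
```

Summing such per-region distances over the CDRs and framework would give a whole-sequence value comparable to the figures reported in the text, under the stated assumption about the metric.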

Table 2.

Sequence of the CDR loops, as well as kinetic characterization with multiple antigen concentrations and developability characterization of candidate antibodies designed with the OAS scanning strategy. The first row for each starting point contains the sequence and experimental validation of the starting point antibody itself, with mutations introduced underlined in subsequent rows.

Starting point	CDR H1	CDR H2	CDR H3	CDR L1	CDR L2	CDR L3	$pK_{D}$ Wuhan	$pK_{D}$ XBB.1	$pK_{D}$ XBB.1.5	Developable
REGN10987	GFTFSNYA	ISYDGSNK	ASGSDYGDYLLVY	SSDVGGYNY	DVS	NSLTSISTWV	7.84	NB	NB	✓
	GFTFNYA	ISYDGSNK	ARVHGSGSYHFDS	QSVFYSSTNKNY	LAS	HQYFNTPDS	NB	NB	NB	✓
	GFTFSSYG	ISYDGSNK	AKGHLAVAGVFDY	QSVLYSSNNKNY	WAS	QQYYSTPPFYT	8.53	NB	NB	✓
	GFTFSSYG	ISYDGSNK	AKGDSSGYYLLDY	QSVLYSSNNKNY	WAS	QQYYSTPPT	NB	NB	NB	✓
DXP-604	GIIVSSNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLNSDLYT	9.67	NB	5.84	✓
	GIIVSRNY	IYSGGST	ARDLIADGMDV	QGIGSW	AAS	QQLNSDLYT	9.07	5.35	5.87	✓
	GLIVSSNY	IYPGGST	VRDLYYYGMDV	QGISSY	AAS	QQLNSDLYT	9.38	NB	NB	✓

Table 3.

Summary of experimental characterization of candidate antibodies generated with our design strategies. Developable denotes designs that pass our HPLC quality threshold as detailed in section 4.1.6. Hits are designs that show measurable binding in a single concentration antigen screen, regardless of strength of the binding interaction.

Method	Starting point	Designs	Median CDR edit dist.	Median FW edit dist.	Hits Wuhan	Hits XBB.1	Hits XBB.1.5	Develop.
OAS scanning	BD55–5840	43	38	22	1	0	0	43
OAS scanning	DXP-604	26	7	4	22	1	1	23
OAS scanning	LY-CoV1404	23	12	5	0	0	0	20
OAS scanning	REGN10987	33	39	35	1	0	0	32
OAS scanning	S2K146	23	41	34	2	0	0	22
Inverse folding	BD55–5840	13	11	-	1	0	0	3
Inverse folding	DXP-604	13	4	-	13	7	7	13
Inverse folding	LY-CoV1404	13	3	-	12	0	2	13
Inverse folding	REGN10987	13	10	-	2	0	0	12
Inverse folding	S2K146	13	5	-	3	0	0	11
Efficient evolution	BD55–5840	12	2	5	12	0	0	12
Efficient evolution	DXP-604	12	3	5	12	0	0	12
Efficient evolution	LY-CoV1404	12	5	6	12	0	0	12
Efficient evolution	REGN10987	12	5	5	2	0	0	10
Efficient evolution	S2K146	12	3	4	10	0	0	12
Efficient evolution	S309	12	3	6	12	12	12	6

Figure 2. Main results of the OAS scanning design strategy. (a) Full titration SPR measurement of binding affinity to the Wuhan strain for selected designs and starting points. (b) Size-exclusion chromatography experiments for all designs and starting points, with the HPLC response curves shown separately for the two starting points. (c) Selected examples of design structures (four shown), highlighting mutations to the starting point in orange.

2.3. Protein language model approaches enable improvement of developability properties for existing candidate antibodies

Beyond the generation of novel binders against a target of interest, improving the developability profile of existing binders while maintaining affinity to the target is also an important task in computational antibody design. It has been shown previously that protein language models can provide a useful avenue for developability optimization.^11,33,34 Using an ensemble of six ESM language models³⁵ to identify sequence mutations with high likelihood, we generated 12,000 designs (2,000 for each of the six starting points) as described in Section 4.2.4 and characterized and ranked the designs using our characterization pipeline.

This combination of energetics assessment with language-guided evolution allows us to improve developability properties of starting point antibodies in a single-shot in silico study while maintaining binding affinity with the antigen. This is in contrast with the original study,³³ which required two rounds of experiments with an initial assessment of single-point mutations, and for which many antibodies lost their binding affinity to the antigen.

We submitted the 12 top ranked designs for each starting point for experimental validation, testing 72 designs in total, for which a summary is given in Table 3. Of these, 57 designs maintained binding against the Wuhan strain (51 strong binders, 6 medium binders), achieving a hit rate of 79% with a median edit distance of 8. Of particular interest, the starting point antibody S309 has strong binding to the Wuhan strain as well as the XBB.1 and the XBB.1.5 strains, but has unfavorable developability characteristics, namely a propensity for aggregation,³⁶ as measured by the size-exclusion chromatography approach described in Section 4.1.6, as well as poor thermostability. All 12 designs derived from the S309 starting point that were experimentally validated maintained binding against all three strains of the virus. We ran full-titration SPR on all 12 of the S309-derived designs and confirmed them as either strong (9) or medium (3) binders against the XBB.1.5 strain and strong binders against the Wuhan strain, as is shown in Table 4 and Figure 3(a). The antibody-antigen complex structure for a few selected designs is shown in Figure 3(c). We further ran size exclusion chromatography columns on these 12 designs. All designs showed significant improvements in aggregation and 10 of the 12 designs showed improvement in thermostability compared to the S309 starting point, as shown in the chromatograms and melting point comparisons in Figure 3(b). This demonstrates the ability of our pipeline to generate and identify candidate antibodies with favorable developability properties while maintaining existing binding affinities.
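The counts quoted for the S309-derived designs can be reproduced directly from the values in Table 4. The short sketch below does this; the melting points and XBB.1.5 $pK_{D}$ values are copied from that table, the strong and medium thresholds follow Section 4.1.5, and the helper names are illustrative rather than part of the original analysis.

```python
from typing import List

# Values copied from Table 4 (first row of that table is the S309 starting point).
S309_TM1 = 64.61
DESIGN_TM1 = [70.41, 70.81, 69.88, 71.08, 70.31, 70.40,
              55.25, 57.26, 80.21, 70.34, 70.15, 71.86]
DESIGN_PKD_XBB15 = [8.89, 8.93, 8.46, 8.6, 9.12, 7.99,
                    8.79, 8.83, 8.8, 7.92, 7.89, 8.4]


def count_above(values: List[float], reference: float) -> int:
    """Number of designs whose value exceeds that of the starting point."""
    return sum(1 for v in values if v > reference)


def count_strong(pkd_values: List[float]) -> int:
    """Strong binders: pK_D >= 8 (Section 4.1.5)."""
    return sum(1 for p in pkd_values if p >= 8)


def count_medium(pkd_values: List[float]) -> int:
    """Medium binders: 6 <= pK_D < 8 (Section 4.1.5)."""
    return sum(1 for p in pkd_values if 6 <= p < 8)


if __name__ == "__main__":
    print("improved Tm1:", count_above(DESIGN_TM1, S309_TM1), "of", len(DESIGN_TM1))
    print("strong vs XBB.1.5:", count_strong(DESIGN_PKD_XBB15))
    print("medium vs XBB.1.5:", count_medium(DESIGN_PKD_XBB15))
```

With the tabulated values this gives 10 of 12 designs with a higher melting temperature than S309, and 9 strong plus 3 medium binders against XBB.1.5, matching the text.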

Table 4.

Sequence of the CDR loops, as well as kinetic characterization with multiple antigen concentrations, thermostability and developability characterization of candidate antibodies designed with the efficient evolution strategy. The first row contains the sequence and experimental validation of the S309 starting point antibody itself, with mutations introduced underlined in subsequent rows.

CDR H1	CDR H2	CDR H3	CDR L1	CDR L2	CDR L3	FW edit distance	$pK_{D}$ Wuhan	$pK_{D}$ XBB.1	$pK_{D}$ XBB.1.5	Developable	Melting point (°C)
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDN	QTVSSTS	GAS	QQHDTSLT	–	8.89	8.93	9.22	✗	64.61
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSNS	GAS	QQHDTSLT	5	8.7	8.7	8.89	✓	70.41
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QSVSSNS	GAS	QQHDTSLT	5	9.3	8.5	8.93	✓	70.81
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSTS	GAS	QQHDTSLT	6	8.7	8.4	8.46	✗	69.88
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QSVSSNS	GAS	QQHDTSLT	6	8.2	8.5	8.6	✗	71.08
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSNS	GAS	QQHDTSLT	6	8.8	8.6	9.12	✗	70.31
GYTFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSNS	GAS	QQHDTSLT	5	8.7	7.9	7.99	✓	70.40
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSTS	GAS	QQHDTSLT	5	8.8	8.6	8.79	✓	55.25
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QSVSSTS	GAS	QQHDTSLT	6	8.6	8.5	8.83	✗	57.26
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QSVSSNS	GAS	QQHDTSLT	4	8.4	8.4	8.8	✓	80.21
GYTFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QSVSSNS	GAS	QQHDTSLT	6	8.5	7.8	7.92	✗	70.34
GYTFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSNS	GAS	QQHDTSLT	6	8.8	7.8	7.89	✗	70.15
GYPFTSYG	ISTYNGNT	ARDYTRGAWFGESLIGGFDV	QTVSSNS	GAS	QQHDTSLT	4	8.5	8.3	8.4	✓	71.86

Figure 3. Main results of the efficient evolution design strategy. (a) Full titration SPR measurement of binding affinity to the three virus strains, one panel per strain, for selected designs and the S309 starting point. (b) Size-exclusion chromatography experiments (HPLC response curves) for all designs and the starting point, as well as the distribution of melting temperatures from thermostability measurements. (c) Selected examples of design structures (four shown), highlighting mutations to the starting point in orange.

2.4. Inverse folding enables rescue of candidate antibodies from escape mutations

Recovering binding activity of therapeutic antibodies after escape mutations on the target antigen is a particularly important task for computational antibody design.^37,38 We demonstrate that using an antibody-specific, antigen-aware inverse folding model and our characterization pipeline, we can recover binding to the antigen after viral escape mutations.

To this end, we generated 25,000 designs using AbMPNN,³⁹ as described in Section 4.2.3, using five starting point antibodies (5,000 each for REGN10987, BD55–5840, DXP-604, S2K146 and LY-CoV1404). We ranked and filtered them using our characterization pipeline. We experimentally validated the top 13 designs for each starting point through a single concentration antigen SPR measurement, of which a summary is given in Table 3. Of these 65 designs, 31 maintained binding against the Wuhan strain, with 25 strong binders and 6 medium binders, achieving a hit rate against the Wuhan strain of 48%. Amongst the designs binding to the Wuhan strain, the median edit distance is 4 with a maximum edit distance of 13.

For the starting points LY-CoV1404 and DXP-604, we demonstrate using a full-titration SPR measurement that 9 designs maintained strong binding against the Wuhan strain and gained weak (2) or medium (7) binding against XBB.1.5, while displaying favorable developability characteristics, as shown in Table 5 and in Figure 4. For the DXP-604 starting point, we achieve a hit rate of 54% against the XBB.1 and XBB.1.5 strains, while the LY-CoV1404 starting point designs have a hit rate of 15% against the XBB.1.5 strain. Furthermore, we selected one design per starting point for structural characterization with the cryo electron microscopy methodology described in Section 4.3 to verify the binding pose of our designed antibodies against the Wuhan SARS-CoV-2 RBD. These experimentally resolved structures are shown in Figure 5c,d.
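The hit rates quoted for the inverse folding designs are simple ratios of hits to tested designs. As a worked check, the sketch below recomputes them from the inverse folding rows of Table 3; the dictionary layout and the `hit_rate` helper are illustrative only.

```python
# Inverse folding rows of Table 3:
# starting point -> (designs, hits Wuhan, hits XBB.1, hits XBB.1.5)
INVERSE_FOLDING = {
    "BD55-5840": (13, 1, 0, 0),
    "DXP-604": (13, 13, 7, 7),
    "LY-CoV1404": (13, 12, 0, 2),
    "REGN10987": (13, 2, 0, 0),
    "S2K146": (13, 3, 0, 0),
}


def hit_rate(hits: int, designs: int) -> float:
    """Fraction of tested designs that showed measurable binding."""
    if designs <= 0:
        raise ValueError("number of designs must be positive")
    return hits / designs


if __name__ == "__main__":
    total_designs = sum(row[0] for row in INVERSE_FOLDING.values())
    total_wuhan_hits = sum(row[1] for row in INVERSE_FOLDING.values())
    print(f"Wuhan, all starting points: {hit_rate(total_wuhan_hits, total_designs):.0%}")

    designs, _, _, xbb15 = INVERSE_FOLDING["DXP-604"]
    print(f"DXP-604 vs XBB.1.5: {hit_rate(xbb15, designs):.0%}")

    designs, _, _, xbb15 = INVERSE_FOLDING["LY-CoV1404"]
    print(f"LY-CoV1404 vs XBB.1.5: {hit_rate(xbb15, designs):.0%}")
```

The three ratios are 31/65, 7/13 and 2/13, which round to the 48%, 54% and 15% reported in the text.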

Table 5.

Sequence of the CDR loops, as well as kinetic characterization with multiple antigen concentrations and developability characterization of candidate antibodies designed with the inverse folding strategy. The first row for each starting point contains the sequence and experimental validation of the starting point antibody itself, with mutations introduced underlined in subsequent rows.

Starting point	CDR H1	CDR H2	CDR H3	CDR L1	CDR L2	CDR L3	$pK_{D}$ Wuhan	$pK_{D}$ XBB.1	$pK_{D}$ XBB.1.5	Developable
DXP-604	GIIVSSNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLNSDLYT	9.67	NB	5.84	✓
	GIIVSRNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLMDTLYT	10.01	5.78	6.33	✓
	GIIVSRNY	IYSGGTT	ARDLAVYGMDV	QGIGSD	AAS	QQLNSDLYT	10.17	5.99	6.15	✓
	GIIVSRNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLNDLLYT	9.38	5.12	6.01	✓
	GIIVSRNY	IYSGGTT	ARDLGPYGMDV	QGIGSD	AAS	QQLNSDLYT	9.53	5.84	6.35	✓
	GIIVSRNY	IYPGGST	ARDLGVYGMDV	QGISSD	AAS	QQLNDYLYT	10.76	5.91	6.33	✓
	GIIVSRNY	IYSGGST	ARDLGPYGMDV	QGIGSD	AAS	QQLNSDLYT	10.02	6.37	6.28	✓
	GIIVSRNY	IYSGGST	ARDLGPYGMDV	QGISSD	AAS	QQLNSDLYT	10.64	6.15	6.07	✓
LY-CoV1404	GFSLSISGVG	IYWDDDK	AHHSISTIFDH	SSDVGDYNY	EVS	SSYTTSSAV	9.01	NB	NB	✓
	GFSLSTSGVG	IYWDDDK	AHHSISTIFDH	SSDVGNYNY	EVS	SSYTSSSVV	9.57	NB	4.87	✓
	GFSLSISGVG	IYWDDDK	AHHSISTIFDH	SSDVGIYNY	EVS	SSYTTSAAV	9.48	NB	4.88	✓

Figure 4. Main results of the inverse folding design strategy. (a) Full titration SPR measurement of binding affinity to the three virus strains, one panel per strain, for selected designs and starting points. (b) Size-exclusion chromatography experiments (HPLC response curves) for all designs and both starting points. (c) Selected examples of design structures (four shown), highlighting mutations to the starting point in orange.

With these results, we demonstrate that our candidate antibody generation and characterization pipeline is not only able to improve binding and developability for existing antibody binders but is also able to yield antibodies binding to previously non-bound epitopes.

3. Discussion

We introduce a novel pipeline for the discovery and design of potent and developable therapeutic antibody candidates. Our approach, which integrates physics-based and ML-driven antibody design and characterization, successfully identifies effective and sequence-diverse binders with favorable developability profiles from the natural antibody repertoire, using existing antibodies as starting points. Our pipeline presents a valuable opportunity for hit elaboration during antibody design campaigns. We demonstrate that combining our characterization pipeline with a language model for sequence-based elaboration enhances the developability profiles of candidate antibodies while maintaining binding potency in a single round of in silico screening. This suggests a viable strategy for the lead optimization of antibody therapeutics. Lastly, we show that using an inverse folding model in combination with our characterization pipeline can allow for the restoration of binding activity in candidate antibodies after antigen escape mutations, using low-sample experimental screening.

Our computational design pipeline could be combined with a Bayesian optimization approach^40–44 to improve properties of candidate antibodies over multiple design cycles with experimental validation. Future directions for research also include incorporating recent advancements in antibody-antigen structure prediction using modern ML techniques^45,46 to facilitate faster and more accurate generation of antibody-antigen complex structures. Structure-based design with antibody-specific diffusion models^47–49 may potentially deliver rational de novo biologics design against novel targets, although these models have seen limited experimental success to date, see e.g. Appendix E.

The findings of our study have practical implications for the development of therapeutic antibodies and pave the way for the computational design of developable therapeutic antibodies. By improving the efficiency and accuracy of antibody design, our pipeline has the potential to accelerate the development of new therapeutics and improve their success rates in clinical applications.

4. Materials and methods

4.1. Experimental methods

4.1.1. Antibody protein production and purification

Protein production and gene synthesis were conducted by GenScript. After synthesis, the selected VH/VL sequences were cloned onto an IgG1 backbone, facilitating the expression of the full antibody structure. The recombinant antibodies were then transfected into GenScript’s proprietary TurboCHO-HT expression system, a high-performance mammalian cell line optimized for antibody production. This system offers high yield and consistent expression levels, critical for subsequent purification steps.

Post-expression, the antibodies were subjected to purification processes involving affinity chromatography, typically using Protein A or Protein G columns, to isolate the antibodies from the cell culture supernatant. The purified, eluted protein fractions were pooled to ensure batch consistency and underwent buffer exchange into phosphate-buffered saline (PBS) at pH 7.4. The final protein concentration was measured using the A280 method, which relies on absorbance at 280 nm to quantify protein concentration based on the presence of aromatic residues, ensuring accurate and reproducible concentration determination.

This approach ensures high purity and functionality of the antibodies, ready for subsequent experimental use.

4.1.2. Antigen protein production

GenScript provided pcDNA3.4 plasmids encoding the original Spike RBD variant as well as the XBB.1 and XBB.1.5 mutant strains with additional C-terminal hexahistidine (6×His) tags for affinity purification. Plasmids were transiently transfected into Expi293F cells (ThermoFisher). The cell density was adjusted to $3 \times 10^{6}$ cells/mL, with a final volume of 400 mL of Expi293 expression medium in 2-L non-baffled Erlenmeyer flasks (Corning, cat. No. 431255). Next, 400 mg of plasmid DNA was diluted with 24 mL of Opti-MEM medium (ThermoFisher, cat. No. 31985062). In a separate tube, 1.3 mL of Expifectamine 293 reagent (GIBCO, cat. No. 13385544) was diluted in 23.7 mL of the same medium. The diluted DNA and Expifectamine were then mixed and incubated at room temperature for 20 min. Cells with this mixture added were then incubated for 3 days at 37 °C and 8% CO₂. The supernatant from the culture was collected by centrifugation at 4000×g for 30 min. Affinity purification was then performed using AmMag Ni Magnetic beads (GenScript Biotech, cat. No. U3108HC180). Beads were washed 4 times with 50 mM HEPES at pH 7.5, 500 mM NaCl, and 5% glycerol buffer to remove nonspecifically bound protein. Protein was then eluted with 50 mM HEPES at pH 7.5, 500 mM NaCl, 5% glycerol, and 500 mM imidazole. Eluted Spike RBD was then buffer exchanged into 50 mM HEPES and 150 mM NaCl buffer with a PD-10 Sephadex G-25 desalting column (Cytiva, cat. No. 17085101). The purity and concentration of the obtained protein were determined by SDS-PAGE and the A280 method, respectively.

4.1.3. Differential scanning fluorimetry experiments

Monoclonal antibody thermal stability was assessed with a Prometheus Panta instrument (NanoTemper Technologies) using differential scanning fluorimetry (DSF), with the temperature increased from 25 to 95 °C at a rate of 1 °C/min. High sensitivity capillaries supplied by NanoTemper Technologies (Cat. No. PR-C006) were used in all experiments. Output data were processed using the Prometheus Panta Control software (NanoTemper Technologies). We report the lower melting temperature, Tm1, for all starting point and designed antibodies.

4.1.4. Surface plasmon resonance experiments

Surface plasmon resonance (SPR) analysis of purified mAbs was performed with a Bruker Sierra Sensors MASS-2 instrument (Bruker Corp), following the methodology described previously.⁵⁰ In these experiments, goat anti-human Fc antibody (Jackson ImmunoResearch, Product #109-005-088) was covalently coupled to an HCA sensor chip (Bruker) by first activating the spots with a mixture of 400 mM EDC and 100 mM NHS for 10 minutes at 10 µL/min. This step was followed by coupling of the anti-human Fc antibody at 25 µg/mL in sodium acetate buffer (10 mM, pH 5.0) for 6 minutes at 15 µL/min. The remaining uncoupled sites were blocked with 1 M ethanolamine at pH 8.0 injected for 6 minutes at 10 µL/min. In general, 13,000 RU of antibody was captured on each spot. After surface preparation, human mAb capture was performed at 20 µg/mL for 30 seconds at 10 µL/min to give on average 250 RU. Binding to three variants of SARS-CoV-2 RBD, the original strain and the XBB.1 and XBB.1.5 mutants, was measured in independent experiments. For initial SPR experiments, a single-shot injection of the three antigens was performed at 100 nM for 90 sec with a flow rate of 50 µL/min and the dissociation phase was monitored for 600 sec. Confirmation of binders via titration experiments was performed with the same injection periods but with a 10-point three-fold serial dilution from 1 µM for each antigen. To regenerate the surface, 200 mM phosphoric acid was injected for 30 sec three times. The mAbs and antigens were diluted in the running buffer, which consisted of HBS-EP+ (10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.05% v/v Tween-20). Data were double referenced by subtracting reference data obtained simultaneously with the active injection at a blank spot and subtracting data from a preceding blank buffer injection obtained at the active spot. Data processing was carried out with Sierra Analyser (Bruker, ver3.4.1) and fitting was carried out using a 1:1 model with Scrubber (BioLogic Software, ver2.0c).
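The antigen concentrations used in the titration experiments follow from the stated dilution scheme. A minimal sketch, assuming the 10-point three-fold series starts at the stated top concentration of 1 µM and that each point is the previous one divided by three; the function name `serial_dilution_nm` and the choice of nM as working unit are illustrative and not part of the original protocol.

```python
from typing import List


def serial_dilution_nm(top_nm: float = 1000.0, fold: float = 3.0,
                       points: int = 10) -> List[float]:
    """Concentrations (in nM) of a serial dilution, highest first."""
    if top_nm <= 0 or fold <= 1 or points < 1:
        raise ValueError("expected top_nm > 0, fold > 1 and points >= 1")
    return [top_nm / fold**i for i in range(points)]


if __name__ == "__main__":
    for conc in serial_dilution_nm():
        print(f"{conc:10.4f} nM")
```

Under these assumptions the series spans 1 µM down to roughly 0.05 nM, so the 100 nM single-shot concentration sits between the third and fourth titration points.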

4.1.5. Surface plasmon resonance results analysis

The equilibrium dissociation constant $K_{D}$ is calculated from the ratio of the dissociation rate constant $k_{d}$ and the association rate constant $k_{a}$ , which are fitted from the corresponding slopes in the sensorgram obtained through the procedure described in section 4.1.4. In practice, we report the negative logarithm of $K_{D}$ ,

$$pK_{D} = -\log_{10} K_{D}. \qquad (1)$$

To enable easy comparison of binding results, we refer to binding detected in SPR experiments as weak ($4.5 < pK_{D} < 6$), medium ($6 \leq pK_{D} < 8$) or strong ($pK_{D} \geq 8$). Further, for single-concentration antigen SPR measurements, we introduce a quality assurance criterion: any data point for which a binding signal was observed but whose response was less than 30% of the theoretical maximum response is treated as non-binding.
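The conversion and binning described in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, $K_{D}$ is assumed to be expressed in molar units, values with $pK_{D} \leq 4.5$ are assumed to count as non-binding, and the theoretical maximum response is taken as an input because its calculation is not specified here.

```python
import math


def pkd_from_rates(k_a: float, k_d: float) -> float:
    """Return pK_D = -log10(K_D), with K_D = k_d / k_a (assumed to be in M)."""
    if k_a <= 0 or k_d <= 0:
        raise ValueError("rate constants must be positive")
    return -math.log10(k_d / k_a)


def classify_binding(pkd: float) -> str:
    """Bin a pK_D value into the weak / medium / strong categories."""
    if pkd >= 8:
        return "strong"
    if pkd >= 6:
        return "medium"
    if pkd > 4.5:
        return "weak"
    # Assumption: anything at or below 4.5 is reported as non-binding.
    return "non-binding"


def passes_response_qc(response_ru: float, theoretical_rmax_ru: float,
                       min_fraction: float = 0.30) -> bool:
    """Single-concentration QC: require at least 30% of the theoretical Rmax."""
    return response_ru >= min_fraction * theoretical_rmax_ru
```

For example, rate constants of $k_{a} = 10^{5}$ M⁻¹s⁻¹ and $k_{d} = 10^{-3}$ s⁻¹ correspond to $K_{D} = 10$ nM, which falls at the lower edge of the strong category.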

4.1.6. Size-exclusion chromatography high-performance liquid chromatography experiments

Measurements were performed with an Agilent 1260 Infinity II system using AdvanceBio SEC 200 Å 1.9 µm 4.6 × 150 mm columns (Cat. No. PL1580–3201) and a PBS pH 7.4 mobile phase at a flow rate of 0.4 mL/min for 7.2 minutes per injection. A diode array monitoring absorbance at 205 nm was used as the detector and peaks were integrated in the window of 1–4.1 minutes retention time. For each mAb, 100 ng of total protein was injected in a maximum injection volume of 10 µl, with 10 µl being injected for proteins that did not reach a purified concentration higher than 10 ng/µl. The fraction of the total integrated peak area contained within the highest peak was reported as “ratio in main peak”. Using a set of benchmark antibodies as reference, we considered antibody candidates as developable if the highest peak was observed between two and three minutes retention time and had a ratio in the main peak above 0.5. This set of parameters was chosen to ensure that the highest peak was the monomeric peak and that the majority of the antibody was monomeric in solution. Note that here monomeric refers to the intact antibody protein in its native state. We use this metric to assess the aggregation state and hydrophobicity of antibodies.⁵¹
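The developability criterion above reduces to two checks on the integrated chromatogram. The sketch below is a hypothetical illustration, assuming that peaks have already been detected and integrated within the 1–4.1 minute window and are supplied as (retention time, height, area) tuples; the function names and input layout are not taken from the analysis software used in this study.

```python
def main_peak_ratio(peaks):
    """Return (retention time, area fraction) of the highest peak.

    peaks: list of (retention_time_min, height, area) tuples, assumed to be
    restricted to the integration window already.
    """
    rt, _, area = max(peaks, key=lambda p: p[1])  # highest peak by height
    total_area = sum(p[2] for p in peaks)
    return rt, area / total_area


def is_developable(peaks, rt_window=(2.0, 3.0), min_ratio=0.5):
    """Apply the SEC criterion: main peak at 2-3 min with ratio above 0.5."""
    rt, ratio = main_peak_ratio(peaks)
    return rt_window[0] <= rt <= rt_window[1] and ratio > min_ratio
```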

4.2. Computational methods

4.2.1. Antibody characterisation pipeline

In order to enable assessment, ranking and filtering of computationally suggested candidate antibodies prior to experimental validation, we implemented a computational pipeline to predict a range of metrics, both with and without the presence of the corresponding antigen, as shown in Figure 1. This pipeline is hereafter referred to as the characterisation pipeline. The pipeline takes as input a set of antibody sequences to be characterized as well as a reference complex structure of a starting point antibody against the target of interest. Antibody structures and antibody-antigen complexes are predicted and potential liabilities (sequence- and structure-based) filtered out, prior to candidate ranking. To this end, antibody sequences are first annotated, characterized and filtered. For the remaining antibody sequences, structural models are predicted and assessed for developability as well as stability. Using target epitope and antibody structures, complex structures are predicted and the resulting complexes are ranked and filtered based on predicted interactions to determine candidate antibodies to take forward to experimental assessment.

4.2.1.1. Sequence-based characterisation

Antibody sequences are first numbered using ANARCI.⁵² Candidate sequences are then scanned for any motifs that are known to negatively affect developability,⁵³ and any that contain these liabilities are discarded at this stage in order to reduce downstream computational cost.
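A motif scan of this kind can be sketched with regular expressions. The three motifs below (N-linked glycosylation sequon, Asn-Gly deamidation site, Asp-Gly isomerisation site) are commonly cited examples chosen for illustration only; they are an assumption and not the liability list of ref. 53, and the function names are hypothetical. In practice such scans are often restricted to particular regions using the antibody numbering, which this sketch omits.

```python
import re

# Illustrative motifs only (assumption); not the list used in this study.
LIABILITY_MOTIFS = {
    "N-linked glycosylation": re.compile(r"N[^P][ST]"),
    "Asn deamidation": re.compile(r"NG"),
    "Asp isomerisation": re.compile(r"DG"),
}


def find_liabilities(sequence):
    """Return (motif name, start index, matched text) for every motif hit."""
    hits = []
    for name, pattern in LIABILITY_MOTIFS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group()))
    return hits


def filter_candidates(sequences):
    """Keep only the sequences without any flagged motif."""
    return [seq for seq in sequences if not find_liabilities(seq)]
```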

4.2.1.2. Structure-based antibody characterisation

Antibody structures are predicted using ABodyBuilder2.^54,55 Predicted structures are then characterized with the Therapeutic Antibody Profiler (TAP)⁵⁶ and structures with red TAP flags are discarded from further analysis. In addition, interface stability of the heavy and light chains is characterized using Rosetta as described in Section 4.2.6, and structures with predicted interface energy above 5 kcal/mol, used as a proxy for antibody instability, are discarded from downstream analysis.

4.2.1.3. Structure-based complex characterisation

The remaining candidate antibodies are characterized for their predicted interactions with the target antigen. ChimeraX⁵⁷ is used to generate a complex structure between each candidate antibody and the target antigen, using the antibody in the reference complex to guide the complex generation by calculating a low resolution map (6 Å) of the reference antibody and fitting the candidate antibody into that map. The resulting complex is relaxed using OpenMM.⁵⁸

Following complex generation, the predicted complex is characterized using Rosetta. Complexes with shape complementarity score below 0.45 and complexes with positive interaction energy at mutated residues (see Section 4.2.6) are discarded at this point. The remaining structures are ranked as described below.
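The structure-based thresholds stated in Sections 4.2.1.2 and 4.2.1.3 can be combined into a single filter. The sketch below is a hypothetical illustration: the `Candidate` container and its field names are assumptions, and the metric values are assumed to have been computed beforehand with the tools named in the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    hl_interface_energy: float     # heavy/light interface energy, kcal/mol
    shape_complementarity: float   # antibody-antigen shape complementarity
    mutated_residue_energy: float  # interaction energy at mutated residues


def passes_structure_filters(c: Candidate,
                             max_hl_energy: float = 5.0,
                             min_sc: float = 0.45) -> bool:
    """Apply the three discard rules stated in the text."""
    if c.hl_interface_energy > max_hl_energy:
        return False  # proxy for antibody instability
    if c.shape_complementarity < min_sc:
        return False  # poor fit at the antibody-antigen interface
    if c.mutated_residue_energy > 0:
        return False  # unfavourable interaction with mutated residues
    return True
```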

4.2.1.4. Ranking of antibody candidates

In order to balance overall structure stability and interface energetics, we employ a bi-variate Gaussian fit for the ranking of candidate antibodies after structure characterization. To this end, we fit the bi-variate Gaussian to the overall Rosetta score and the Rosetta predicted free energy change. We omit from the fit any sequences which have either total score or free energy change further than 1.5 interquartile ranges from the upper or lower quartile boundaries. For designs where both the overall score and the free energy change are lower than the starting point antibody, we then compute the Mahalanobis distance, while all others are discarded. Designs are then ranked by decreasing distance.
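The ranking procedure can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the released pipeline code: the function names are hypothetical, quartiles use the default convention of `statistics.quantiles`, the covariance is the sample covariance, and outliers are excluded from the fit only, so they can still be scored if they fall in the lower-left quadrant.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (Q1 - k*IQR, Q3 + k*IQR) for a list of values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def fit_gaussian(points):
    """Fit mean and sample covariance (sxx, sxy, syy) to 2D points."""
    n = len(points)
    mx = statistics.fmean(p[0] for p in points)
    my = statistics.fmean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point, mean, cov):
    """Mahalanobis distance using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


def rank_designs(designs, reference):
    """designs: name -> (total_score, dG); reference: starting antibody."""
    lo_t, hi_t = iqr_bounds([v[0] for v in designs.values()])
    lo_g, hi_g = iqr_bounds([v[1] for v in designs.values()])
    inliers = [v for v in designs.values()
               if lo_t <= v[0] <= hi_t and lo_g <= v[1] <= hi_g]
    mean, cov = fit_gaussian(inliers)
    # Keep only designs better than the starting point on both axes.
    scored = {name: mahalanobis(v, mean, cov) for name, v in designs.items()
              if v[0] < reference[0] and v[1] < reference[1]}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

Ranking by decreasing distance within the lower-left quadrant favours designs that improve on the starting antibody in a direction that is unusual given the correlation between the two scores.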

4.2.1.5. Confirmation of high-ranking candidates

Rosetta calculations are stochastic, with multiple runs with the same input structure leading to an ensemble of results. Prior to experimental validation, we therefore repeat the Rosetta score generation for the 50 highest-ranking candidate antibodies with 10 replicates. We then consider the two replicates with the lowest total score, as well as the two replicates with the lowest free energy change. Final Rosetta scores are calculated by averaging over this set of two to four replicates, and we perform an identical ranking procedure to select the designs with highest Mahalanobis distance in the left lower quadrant for experimental verification. Further details are given in Appendix D.
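The replicate consolidation step can be illustrated as follows. The sketch assumes each replicate is summarised as a (total score, free energy change) pair; the function name is hypothetical. Because the two lowest-total-score replicates and the two lowest-free-energy replicates may overlap, the averaged set contains between two and four replicates.

```python
def consensus_scores(replicates, n_best=2):
    """Average scores over the union of the best replicates on each metric.

    replicates: list of (total_score, free_energy_change) tuples.
    Returns (mean_total_score, mean_free_energy_change, chosen_indices).
    """
    indices = range(len(replicates))
    best_total = sorted(indices, key=lambda i: replicates[i][0])[:n_best]
    best_dg = sorted(indices, key=lambda i: replicates[i][1])[:n_best]
    chosen = sorted(set(best_total) | set(best_dg))
    mean_total = sum(replicates[i][0] for i in chosen) / len(chosen)
    mean_dg = sum(replicates[i][1] for i in chosen) / len(chosen)
    return mean_total, mean_dg, chosen
```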

4.2.2. Candidate library generation from OAS

We scanned both paired and unpaired OAS³² for potential candidates for assessment with the antibody characterization pipeline outlined above. For paired OAS, where both light and heavy chain of each deposited antibody are known, we determined the V and J gene of the starting point antibody and considered as candidates all non-redundant, paired antibodies in OAS that had matching V and J gene in the heavy chain and a CDR H3 length within one residue of the starting antibody length. For unpaired OAS, we searched for heavy chains in OAS with the same V and J gene as the starting point antibody heavy chain, same length CDR H3 and at least 50% sequence identity to the starting point CDR H3. The heavy chains identified through this scan were paired with the starting point antibody light chain. Candidates are then assessed according to the characterization pipeline described in Section 4.2.1. We further expanded the libraries through two rounds of elaboration using the inverse folding model described in Section 4.2.3. In each round, we generated 25 variants for each of the 100 top-ranked designs from each library using the inverse folding model, and assessed them according to our characterization pipeline.
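The two selection rules can be written as simple predicates. The sketch below is illustrative only: the `HeavyChain` record and its field names are hypothetical, gene annotations are assumed to be available as strings, and sequence identity is assumed to be the fraction of identical positions between equal-length CDR H3 sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeavyChain:
    v_gene: str
    j_gene: str
    cdrh3: str


def sequence_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def same_genes(candidate: HeavyChain, start: HeavyChain) -> bool:
    return candidate.v_gene == start.v_gene and candidate.j_gene == start.j_gene


def matches_paired(candidate: HeavyChain, start: HeavyChain) -> bool:
    """Paired rule: same V/J genes, CDR H3 length within one residue."""
    return (same_genes(candidate, start)
            and abs(len(candidate.cdrh3) - len(start.cdrh3)) <= 1)


def matches_unpaired(candidate: HeavyChain, start: HeavyChain,
                     min_identity: float = 0.5) -> bool:
    """Unpaired rule: same V/J genes, same CDR H3 length, >= 50% identity."""
    return (same_genes(candidate, start)
            and len(candidate.cdrh3) == len(start.cdrh3)
            and sequence_identity(candidate.cdrh3, start.cdrh3) >= min_identity)
```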

4.2.3. Candidate generation with an inverse folding model

To generate candidates with similar structural features but with novel sequences, we used the inverse folding model AbMPNN.³⁹ AbMPNN is a fine-tuned version of the ProteinMPNN⁵⁹ inverse folding model that specializes in antibody sequence prediction. The model takes an antibody-antigen complex structure as input and produces a sequence for the antibody that is predicted to fold into the complex structure. It can take partial sequence information for conditioning on both antibody and antigen sequence, and has been shown to achieve competitive results in several recent benchmarks.^60–62

To select which residues to design, we identify all paratope residues in the CDR loops for the starting antibody structures. Here we define paratope residues as those with a heavy atom within 4 Å of an antigen heavy atom. We then selected the four CDR loops with the most paratope residues and redesigned the paratope residues in these loops. This was intended to reduce the edit distance to the starting antibody in comparison to modifying the entire paratope. We then generated 5,000 sequences per starting antibody at a temperature of T = 0.2 for 25,000 sequences in total. Candidates are then assessed according to the characterization pipeline described in Section 4.2.1.
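The paratope definition and loop selection can be sketched with plain distance calculations. This is a hypothetical illustration: the residue labels, the mapping from residues to CDR loops, and the input layout are assumptions, coordinates are assumed to contain heavy atoms only, and "within 4 Å" is interpreted as a distance of at most 4 Å.

```python
import math
from collections import Counter


def paratope_residues(antibody_atoms, antigen_atoms, cutoff=4.0):
    """Return antibody residues with a heavy atom within cutoff of the antigen.

    antibody_atoms: list of (residue_id, (x, y, z)) heavy-atom records.
    antigen_atoms: list of (x, y, z) heavy-atom coordinates.
    """
    return {
        residue
        for residue, xyz in antibody_atoms
        if any(math.dist(xyz, ag) <= cutoff for ag in antigen_atoms)
    }


def loops_to_design(paratope, residue_to_cdr, n_loops=4):
    """Pick the CDR loops containing the most paratope residues."""
    counts = Counter(
        residue_to_cdr[r] for r in sorted(paratope) if r in residue_to_cdr
    )
    return [loop for loop, _ in counts.most_common(n_loops)]
```

A brute-force search over all atom pairs is adequate for a single interface; a spatial index would be preferable when screening many structures.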

4.2.4. Candidate generation with a language model

We consider suggested mutations from the ESM-1b language model and the ESM-1v ensemble of five protein language models,³⁵ following the language-model-guided affinity maturation procedure described previously.³³ Amino acid substitutions along the heavy and light chain sequences are recommended by a consensus of masked language models, and we consider as potential mutations the set of substitutions assigned a higher likelihood than the wild type by at least one model.
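The selection rule can be sketched independently of any particular language model. In the hypothetical example below, each model is assumed to have already produced per-position log-probabilities for the amino acids of interest (for instance by masking each position in turn); the function name and data layout are assumptions for illustration.

```python
def candidate_substitutions(wild_type, model_log_probs, min_models=1):
    """List substitutions that at least min_models models prefer to wild type.

    wild_type: amino acid sequence as a string.
    model_log_probs: one entry per model; each entry is a list with one
        dict per position mapping amino acid -> log-probability, and must
        include the wild-type residue at that position.
    Returns (position, wild-type residue, substitution, votes) tuples.
    """
    substitutions = []
    for pos, wt in enumerate(wild_type):
        alphabet = set().union(*(model[pos].keys() for model in model_log_probs))
        for aa in sorted(alphabet - {wt}):
            votes = sum(
                1 for model in model_log_probs
                if model[pos].get(aa, float("-inf")) > model[pos][wt]
            )
            if votes >= min_models:
                substitutions.append((pos, wt, aa, votes))
    return substitutions
```

With `min_models=1` this reproduces the permissive criterion stated above; raising the threshold would demand agreement between more models.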

We provide a list of all these potential substitutions along with the wild type residue as input to Rosetta, allowing it to pick either a suggested language model mutation or the wild type amino acid. Rosetta then samples rotamers to repack the side-chains of the modified antibody in complex with the antigen. We then use the same selection criteria as described in Section 4.2.1 to select the best scoring sequences from a sample of 1,000 sequences generated from combinations of all possible mutations, though only a single round of Rosetta calculations is performed in this pipeline due to the comparatively lower number of designs.

This leads to relatively high edit distances to the wild type sequence, as Rosetta preferentially picks out some of the mutations consistently, and leads to diversity across samples due to the stochastic nature of the assessment.

4.2.5. Generation of initial structures for design

Though a chimeric structure of XBB.1.5 RBD fused to the SARS-CoV-1 core structure has recently become available,⁶³ no structure was available at the beginning of our study. To enable structure-based assessment of designs against XBB.1.5 RBD, we therefore predicted the structure of XBB.1.5 RBD using ColabFold.⁶⁴ The resulting model structure is predicted with high confidence except in the region proximal to the RBD glycan. This ColabFold structure was then aligned to the RBD models present in the template PDBs noted in Appendix B to generate antibody-XBB.1.5 RBD complexes. The resulting complexes were then energy relaxed with Rosetta as described in Section 4.2.6.

4.2.6. Physics-based minimization, design, and scoring with Rosetta

Rosetta3⁶⁵ was used for relaxation and energetic scoring of antibody Fvs and antibody-antigen complexes. In the efficient evolution design pipelines, Rosetta was also used to introduce mutations into the CDRs of antibody-antigen complexes.

In all Rosetta runs the beta_nov16 score function was used to assess energies and the InterfaceAnalyzerMover used to compute dG_separated, the Rosetta prediction of binding free energy change, at the interface of interest, given in kcal/mol. When the input to Rosetta was an antibody Fv, the interface between the heavy and light chains was analyzed with InterfaceAnalyzerMover. In cases when an antibody-antigen complex was input to Rosetta, interface analysis was applied to the antibody-antigen interface. In all cases, side chains were repacked before computing interface statistics. The RunSimpleMetrics⁶⁶ function was used to compute the structural aggregation propensity (SAP),⁶⁷ RMSD from the initial structure, and dihedral distance from the initial structure. We briefly summarize the key Rosetta protocols used in our study below.

In relaxation simulations, all backbone atom positions were restrained with $weights = 1$ and the structure then minimized to an energy tolerance of 0.001 kcal/mol using MinMover. This minimized structure was then subjected to one iteration of the FastRelax repacking and minimization protocol, and the resulting structure then assessed with RunSimpleMetrics and InterfaceAnalyzerMover.

For design, Rosetta format resfiles were prepared specifying which amino acid positions were mutable and to which new amino acids they could change. First, restraints were applied to all $C_{\alpha}$ atoms in the input structure to maintain the overall backbone configuration. Mutations were introduced and the structure relaxed via three alternating rounds of rotamer repacking and minimization with soft constraints ( $weights = 0.4$ ) using PackRotamersMover and MinMover, respectively, followed by two rounds of minimization with RotamerTrialsMinMover. Each design cycle was concluded by a final minimization with MinMover. This process was repeated three times and the final structure analyzed with InterfaceAnalyzerMover and RunSimpleMetrics as noted above.

Due to the stochastic nature of the Rosetta relaxation and design protocols, repeat runs using the same starting structure give different results. We therefore carried out 10 replicate runs of Rosetta and considered the set of up to four replicate structures with the two lowest total score values and the two lowest free energy change to compute final scoring metrics.

We also used the interface_energy module within Rosetta to predict the degree to which antibodies engage SARS-CoV-2 XBB.1.5 RBD. First, we used PyMOL to list the paratope residues in the antibody as those within 4 Å of the antigen and the epitope residues in the antigen as those within 4 Å of the antibody. We then computed all pairwise contact energies between these two sets of residues to generate a contact energy matrix. Finally, we computed the sum of all pairwise contact energies involving any of the 23 XBB.1.5 RBD mutations, generating a single energy value representing the strength of interactions of the antibody with mutated residues in XBB.1.5 RBD. We exclude any candidates for which this summed energy is positive.
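The final summation step can be sketched as follows, assuming the pairwise contact energies have already been computed and are stored in a dictionary keyed by (paratope residue, epitope residue). The residue labels, function names, and data layout are hypothetical and serve only to illustrate the filter.

```python
def mutated_contact_energy(pair_energies, mutated_epitope_residues):
    """Sum contact energies over pairs whose epitope residue is mutated.

    pair_energies: dict mapping (paratope_residue, epitope_residue) to a
        pairwise contact energy.
    mutated_epitope_residues: set of epitope residue identifiers that carry
        a variant mutation.
    """
    return sum(
        energy
        for (_, epitope_residue), energy in pair_energies.items()
        if epitope_residue in mutated_epitope_residues
    )


def keep_candidate(pair_energies, mutated_epitope_residues):
    """Candidates with a positive summed energy are excluded."""
    return mutated_contact_energy(pair_energies, mutated_epitope_residues) <= 0
```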

4.3. Cryogenic electron microscopy methods

4.3.1. Sample preparation

Purified Fabs were incubated with 6P stabilized SARS-CoV-2 spike protein⁶⁸ at a molar ratio of 1.2:1 for 5 minutes on ice. Prior to application of this solution to the grid, fluorinated octyl maltoside (FOM) was added to a final concentration of 0.01% (w/v) to aid in overcoming preferred orientation. In all cases 3 µl of spike:Fab complex solution was applied to Quantifoil R1.2/1.3 grids (200 mesh) pre-treated by glow discharge in air at 20 mA for 30 seconds, resulting in a negatively charged grid surface. Samples were blotted with 0 blot force for 4 seconds, and plunge frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific). The sample chamber was held at 4 °C and 95% relative humidity throughout.

4.3.2. Data collection

Cryo-EM data were collected at the Eindhoven NanoPort facility on a 200 kV Glacios 2 cryo-TEM (Thermo Fisher Scientific) equipped with a Falcon 4i direct electron detector and Selectris energy filter. Further details on optical settings are listed in Appendix C.

4.3.3. Image processing

Cryo-EM image processing was carried out in cryoSPARC 4.0.⁶⁹ Raw EER movies were preprocessed using patch motion correction and patch CTF estimation without upsampling. Particles were picked with the cryoSPARC blob picker and extracted into 520 pixel boxes, binned to 130 pixels for early processing steps. Iterative 2D classification and manual curation were used to clean the initial particle stack. Ab initio model generation into 4 classes was used to distinguish spike:Fab complexes from apo spike. Particles containing spike:Fab complex were re-extracted into 2× binned (260 px) boxes for further refinements. Non-uniform refinement was used to generate global 3D reconstructions with C1 symmetry. Local refinement masks comprising RBD and Fab were generated with UCSF Chimera⁵⁷ and used for local refinement of spike:Fab interfaces in cryoSPARC. Final maps were locally sharpened with DeepEMhancer.⁷⁰ FSCs were calculated using the gold-standard method at a cutoff of 0.143.

Acknowledgments

We thank John Overington, Andrew Hopkins, Jody Barbeau, Eileen Brandenburger, Anthony Bradley, Sean Robinson, Tjelvar Olsson, Luis Yanes and Iva Hopkins Navratilova for useful discussions and comments. We also thank Berend Jan Bosch and Chunyan Wang from Utrecht University for providing the SARS-CoV-2 spike protein used in this study.

Appendix A.

Class 3 antibody SPR screen

We performed a broad single concentration antigen screen with the SPR method described in Section 4.1.4 to estimate binding affinity of 192 class 3 antibodies against the Wuhan strain, the XBB.1 strain and the XBB.1.5 strain. A summary of these measurements is given in Table A1, with further details available in the full dataset at https://doi.org/10.5281/zenodo.13862717.

Table A1.

Summary of experimental characterization of 192 class 3 antibodies from a single concentration antigen screen against three strains of the SARS-CoV-2 RBD. Binding strength is determined from the $pK_{D}$ value as detailed in Section 4.1.5.

	Wuhan	XBB.1	XBB.1.5
Non-binders	29	191	94
Weak binders	0	0	0
Medium binders	0	0	7
Strong binders	163	1	98

Using the cryogenic electron microscopy methodology described in Section 4.3, we assess the binding pose and antibody structure of two selected sequences, OC220225-SN0108 and OC220302-SN0554. These are from the set of 182 for which no previously known structure exists and exhibit strong binding to the Wuhan strain. The structural characterization for both antibodies is shown in Figure A1.

Figure A1. Cryo-electron microscopy epitope mapping of two selected antibodies in complex with the SARS-CoV-2 spike protein. In both panels, the left-hand side shows central slices of 3D electron microscopy volumes colored by local resolution in angstrom, and the right-hand side displays locally refined interfaces of the fragment antigen-binding region and receptor-binding domain (in blue).

Appendix B.

Starting point antibodies

The starting point antibodies used in this article are shown in Table B1. Of these six starting points, all have potent binding to the Wuhan strain of the virus, but only S309 is an effective binder to all three antigens tested, namely wild-type, XBB.1 and XBB.1.5. S2K146, BD55–5840 and DXP-604 are weak binders to XBB.1.5, but do not bind to XBB.1. REGN10987 and LY-CoV1404 only bind to the Wuhan strain and have no efficacy against XBB.1 and XBB.1.5.

Table B1.

Starting point antibodies used in this study, along with the template PDB structure used, and the corresponding RBD strain. Each starting point was used with the efficient evolution approach, while only starting point antibodies with CDR H3 of length up to 15 residues were used in the inverse folding and OAS scanning method.

Antibody name	Template PDB used	RBD strain in template PDB	Used in inverse folding	Used in OAS scanning	Used in efficient evolution
BD55–5840	7WRZ²⁵	Omicron	✓	✓	✓
DXP-604	7CH4²⁶	Wuhan	✓	✓	✓
LY-CoV1404	7MMO²⁷	Wuhan	✓	✓	✓
REGN10987	6XDG²⁸	Wuhan	✓	✓	✓
S2K146	7TAS²⁹	Wuhan	✓	✓	✓
S309	7X1M³⁰	Omicron	–	–	✓

Appendix C.

Cryo-EM settings

Details on optical settings for the cryo-EM data collection are given in Table C1.

Table C1.

Data collection parameters.

Samples	REGN10987 antibody-antigen complex; S309 antibody-antigen complex	Design derived from the DXP-604 starting point; design derived from the LY-CoV1404 starting point; OC220225-SN0108 antibody-antigen complex; OC220302-SN0554 antibody-antigen complex
Energy source	XFEG	E-CFEG
Voltage (kV)	200	200
Magnification	165,000	165,000
Apertures	C2: 50 µm, no objective aperture	C2: 30 µm, no objective aperture
Spot size	2	2
Beam diameter (µm)	1.1	1.1
Pixel size (Å)	0.694	0.694
Dose rate (e⁻/pix/sec)	10.8	12.1
Electron exposure (e⁻/Å²)	50	50
EER frames	702	657
Exposure time (sec)	2.29	2.15
Selectris energy filter slit (eV)	10	10
Images per hole	2 (Fringe-Free Imaging)	2 (Fringe-Free Imaging)
Hole centering	Aberration-free image shift	Aberration-free image shift
Delay after stage shift (sec)	10	5
Delay after image shift (sec)	0.5	0.5
Autofocus	Once per cluster	Once per cluster
Defocus range (µm)	−0.7 to −1.4	−0.7 to −1.4

Appendix D.

Ranking of designs

In Figure D1, we show the distribution of total score and free energy change for all candidates across each design strategy. For the language model pipeline, which had a smaller number of total designs, 10 replicates were used in a single round to select final top-ranking candidates. An inset is shown for each plot, displaying the left lower quadrant of the distribution, with contour plots of the bi-variate Gaussian fit used for final ranking according to Mahalanobis distance. The fit excludes outliers, defined as further than 1.5 interquartile range above the third quartile or below the first quartile.

Figure D1. Multi-variate Gaussian fit of the designed candidates, with selected designs shown in orange. (a) Two rounds of Rosetta selection for the OAS scanning pipeline, with one and 10 replicates respectively. (b) Two rounds of Rosetta selection for the inverse folding pipeline, with one and 10 replicates respectively. (c) Selection of language model designs from the efficient evolution pipeline.

Appendix E.

CDR design with RFdiffusion

We attempted to use RFdiffusion,⁷¹ a state-of-the-art protein backbone diffusion model, to re-design CDR loops of starting point antibodies. We considered two starting point antibodies, BD55–5840 and LY-CoV1404, and used RFdiffusion to design part of the CDR H3 and CDR L3 loops respectively, using the remainder of the antibody and the XBB.1.5 antigen as motif.

We identify the six contiguous residues closest to the antigen as those to re-design in each selected CDR loop, and sample backbone conformations of six to nine residues with RFdiffusion to replace the original CDR loop. We use four different configurations for both starting points: (1) the default RFdiffusion model, (2) the Complex_beta_ckpt model weights, intended to generate a greater diversity of backbone topologies, (3) partial noising and de-noising of the original backbone with 20 steps using the default model weights, and (4) the default model in conjunction with an auxiliary potential, which consists of an SE(3) transformer trained to distinguish experimental antibody structures from modified ones in which all CDR loops have been re-designed with RFdiffusion. For each configuration and starting point, we generate 200 samples, for a total of 800 structures per starting point. For each structure, we then generate 5 sequences by predicting all modified residues with AbMPNN.

From each pool of 1000 sequences and backbone structures (one pool per configuration and starting point), we select 10 designs for experimental validation using the characterization pipeline described in Section 4.2.1. These 80 designs, 40 for each starting point, are selected to bind against the XBB.1.5 variant. As shown in Table E1, none of these designs bind to XBB.1 or XBB.1.5; however, 32 of them bind to the Wuhan strain, to which the starting points also had measurable binding, resulting in a hit rate of 40%. There is no statistically significant difference in measured binding affinity or hit rate across the different configurations. Of the 32 binders, 11 have modified loop lengths where additional residues were inserted during the CDR backbone redesign.

Table E1.

Summary of experimental characterization of candidate antibodies generated with RFdiffusion.

Starting point	Designs	Hits (Wuhan)	Hits (XBB.1)	Hits (XBB.1.5)	Developable
BD55–5840	40	15	0	0	29
LY-CoV1404	40	17	0	0	29

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Funding Statement

The author(s) reported there is no funding associated with the work featured in this article.

Disclosure statement

F.D., C.S., A.K., D.C., M.B., D.N., N.W., H.K., C.M., D.E., R.G., D.D., P.T., W.B., J.D., D.P., S.S. and C.D. are or have previously been employees of Exscientia. I.D. is an employee of Thermo Fisher Scientific.

Data availability statement

We provide the code and data generated in this study under MIT license at https://github.com/Exscientia/ab-characterisation and https://doi.org/10.5281/zenodo.13862717.

References

1.Crescioli S, Kaplon H, Chenoweth A, Wang L, Visweswaraiah J, Reichert JM.. Antibodies to watch in 2024. Mabs-austin. 2024;16(1):2297450. doi: 10.1080/19420862.2023.2297450. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Walsh G, Walsh E.. Biopharmaceutical benchmarks 2022. Nat Biotechnol. 2022;40(12):1722–18. doi: 10.1038/s41587-022-01582-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Mullard A. Fda approves 100th monoclonal antibody product. Nat Rev Drug Discov. 2021;20(7):491–495. doi: 10.1038/d41573-021-00079-7. [DOI] [PubMed] [Google Scholar]
4.https://www.pharmaceutical-technology.com/analyst-comment/biologic-sales-small-molecule-sales/.
5.Kim J, McFee M, Fang Q, Abdin O, Kim PM. Computational and artificial intelligence-based methods for antibody development. Trends Pharmacological Sci. 2023;44(3):175–189. doi: 10.1016/j.tips.2022.12.005. [DOI] [PubMed] [Google Scholar]
6.Silva BM, Myung Y, Ascher DB, Pires DEV. epitope3D: a machine learning method for conformational B-cell epitope prediction. Briefings Bioinf. 2022;23(1):423. doi: 10.1093/bib/bbab423. [DOI] [PubMed] [Google Scholar]
7.Chinery L, Wahome N, Moal I, Deane CM, Borgwardt K. Paragraph—antibody paratope prediction using graph neural networks with minimal feature vectors. Bioinformatics. 2022;39(1):732. doi: 10.1093/bioinformatics/btac732. [DOI] [PubMed] [Google Scholar]
8.Kurumida Y, Saito Y, Kameda T. Predicting antibody affinity changes upon mutations by combining multiple predictors. Sci Rep. 2020;10(1):19533. doi: 10.1038/s41598-020-76369-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Adolf-Bryfogle J, Kalyuzhniy O, Kubitz M, Weitzner BD, Hu X, Adachi Y, Schief WR, Dunbrack RL Jr, Ben-Tal N. Rosettaantibodydesign (rabd): a general framework for computational antibody design. PLOS Comput Biol. 2018;14(4):1006112. doi: 10.1371/journal.pcbi.1006112. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Zhang W, Wang H, Feng N, Li Y, Gu J, Wang Z. Developability assessment at early-stage discovery to enable development of antibody-derived therapeutics. Antibody Ther. 2023;6(1):13–29. doi: 10.1093/abt/tbac029. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Bachas S, Rakocevic G, Spencer D, Sastry AV, Haile R, Sutton JM, Kasun G, Stachyra A, Gutierrez JM, Yassine E, et al. Antibody optimization enabled by artificial intelligence predictions of binding affinity and naturalness. BioRxiv. 2022: 2022–2028.
12.Gruver N, Stanton S, Frey N, Rudner TG, Hotzel I, Lafrance-Vanasse J, Rajpal A, Cho K, Wilson AG. Protein design with guided discrete diffusion. Adv Neural Inf Process Syst. 2023;36:12489–12517. https://dl.acm.org/doi/10.5555/3666122.3666669 [Google Scholar]
13.Shanker VR, Bruun TU, Hie BL, Kim PS. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science. 2024;385(6704):46–53. doi: 10.1126/science.adk8946. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Hummer AM, Abanades B, Deane CM. Advances in computational structure-based antibody design. Curr Opin Struct Biol. 2022;74:102379. doi: 10.1016/j.sbi.2022.102379. [DOI] [PubMed] [Google Scholar]
15.Shanehsazzadeh A, McPartlon M, Kasun G, Steiger AK, Sutton JM, Yassine E, McCloskey C, Haile R, Shuai R, Alverio J, et al. Unlocking de novo antibody design with generative artificial intelligence. bioRxiv. 2024. doi: 10.1101/2023.01.08.523187. [DOI]
16.Cheng J, Liang T, Xie X-Q, Feng Z, Meng L. A new era of antibody discovery: an in-depth review of ai-driven approaches. Drug Discov Today Today. 2024;29(6):103984. doi: 10.1016/j.drudis.2024.103984. [DOI] [PubMed] [Google Scholar]
17.Kim J, McFee M, Fang Q, Abdin O, Kim P. Computational and artificial intelligence-based methods for antibody development. Trends Pharmacological Sci. 2023;44(3):175–189. doi: 10.1016/j.tips.2022.12.005. [DOI] [PubMed] [Google Scholar]
18.Bauer J, Kube S, Gupta P, Kumar S. Biopharmaceutical informatics: a strategic vision for discovering developable biotherapeutic drug candidates. In: Gadamasetti K, Kolodziej SA. editors. Bioprocessing, bioengineering and process chemistry in the biopharmaceutical industry: using chemistry and bioengineering to improve the performance of Biologics. Cham: Springer; 2024. p. 405–436. doi: 10.1007/978-3-031-62007-214. [DOI] [Google Scholar]
19.Kumar S, Nixon A. Biopharmaceutical informatics: learning to discover developable biotherapeutics. 2025; doi: 10.1201/9781003300311. [DOI] [Google Scholar]
20.Svilenov HL, Arosio P, Menzen T, Tessier P, Sormanni P. Approaches to expand the conventional toolbox for discovery and selection of antibodies with drug-like physicochemical properties. Mabs-austin. 2023;15(1):2164459. doi: 10.1080/19420862.2022.2164459. [DOI] [PMC free article] [PubMed] [Google Scholar]
21.Nimrod G, Fischman S, Austin M, Herman A, Keyes F, Leiderman O, Hargreaves D, Strajbl M, Breed J, Klompus S, et al. Computational design of epitope-specific functional antibodies. Cell Rep. 2018;25(8):2121–21315. doi: 10.1016/j.celrep.2018.10.081. [DOI] [PubMed] [Google Scholar]
22.Jette CA, Cohen AA, Gnanapragasam PNP, Muecksch F, Lee YE, Huey-Tubman KE, Schmidt F, Hatziioannou T, Bieniasz PD, Nussenzweig MC, et al. Broad cross-reactivity across sarbecoviruses exhibited by a subset of COVID-19 donor-derived neutralizing antibodies. Cell Rep. 2021;36(13):109760. doi: 10.1016/j.celrep.2021.109760. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Piccoli L, Park Y-J, Tortorici MA, Czudnochowski N, Walls AC, Beltramello M, SilacciFregni C, Pinto D, Rosen LE, Bowen JE, et al. Mapping neutralizing and immunodominant sites on the sars-cov-2 spike receptor-binding domain by structure-guided high-resolution serologyCell. Cell. 2020;183(4):1024–1042.e21. doi: 10.1016/j.cell.2020.09.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
24. Cao Y, Wang J, Jian F, Xiao T, Song W, Yisimayi A, Huang W, Li Q, Wang P, An R, et al. Omicron escapes the majority of existing SARS-CoV-2 neutralizing antibodies. Nature. 2022;602(7898):657–663. doi: 10.1038/s41586-021-04385-3.
25. Cao Y, Yisimayi A, Jian F, Song W, Xiao T, Wang L, Du S, Wang J, Li Q, Chen X, et al. BA.2.12.1, BA.4 and BA.5 escape antibodies elicited by Omicron infection. Nature. 2022;608(7923):593–602. doi: 10.1038/s41586-022-04980-y.
26. Du S, Cao Y, Zhu Q, Yu P, Qi F, Wang G, Du X, Bao L, Deng W, Zhu H, et al. Structurally resolved SARS-CoV-2 antibody shows high efficacy in severely infected hamsters and provides a potent cocktail pairing strategy. Cell. 2020;183(4):1013–1023.e13. doi: 10.1016/j.cell.2020.09.035.
27. Westendorf K, Žentelis S, Wang L, Foster D, Vaillancourt P, Wiggin M, Lovett E, Lee R, Hendle J, Pustilnik A, et al. LY-CoV1404 (bebtelovimab) potently neutralizes SARS-CoV-2 variants. bioRxiv. 2022. doi: 10.1101/2021.04.30.442182.
28. Hansen J, Baum A, Pascal KE, Russo V, Giordano S, Wloga E, Fulton BO, Yan Y, Koon K, Patel K, et al. Studies in humanized mice and convalescent humans yield a SARS-CoV-2 antibody cocktail. Science. 2020;369(6506):1010–1014. doi: 10.1126/science.abd0827.
29. Park Y-J, Marco AD, Starr TN, Liu Z, Pinto D, Walls AC, Zatta F, Zepeda SK, Bowen JE, Sprouse KR, et al. Antibody-mediated broad sarbecovirus neutralization through ACE2 molecular mimicry. Science. 2022;375(6579):449–454. doi: 10.1126/science.abm8143.
30. Huang M, Wu L, Zheng A, Xie Y, He Q, Rong X, Han P, Du P, Han P, Zhang Z, et al. Atlas of currently available human neutralizing antibodies against SARS-CoV-2 and escape by Omicron sub-variants BA.1/BA.1.1/BA.2/BA.3. Immunity. 2022;55(8):1501–1514.e3. doi: 10.1016/j.immuni.2022.06.005.
31. Kovaltsuk A, Leem J, Kelm S, Snowden J, Deane CM, Krawczyk K. Observed antibody space: a resource for data mining next-generation sequencing of antibody repertoires. J Immunol. 2018;201(8):2502–2509. doi: 10.4049/jimmunol.1800708.
32. Olsen TH, Boyles F, Deane CM. Observed antibody space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences. Protein Sci. 2022;31(1):141–146. doi: 10.1002/pro.4205.
33. Hie BL, Shanker VR, Xu D, Bruun TU, Weidenbacher PA, Tang S, Wu W, Pak JE, Kim PS. Efficient evolution of human antibodies from general protein language models. Nat Biotechnol. 2024;42(2):275–283. doi: 10.1038/s41587-023-01763-2.
34. Bashour H, Smorodina E, Pariset M, Zhong J, Akbar R, Chernigovskaya M, Lê Quý K, Snapkow I, Rawat P, Krawczyk K, et al. Biophysical cartography of the native and human-engineered antibody landscapes quantifies the plasticity of antibody developability. Commun Biol. 2024;7(1):922. doi: 10.1038/s42003-024-06561-3.
35. Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA. 2021;118(15):2016239118. doi: 10.1073/pnas.2016239118.
36. Pham NB, Meng WS. Protein aggregation and immunogenicity of biotherapeutics. Int J Pharmaceutics. 2020;585:119523. doi: 10.1016/j.ijpharm.2020.119523.
37. Greaney AJ, Starr TN, Barnes CO, Weisblum Y, Schmidt F, Caskey M, Gaebler C, Cho A, Agudelo M, Finkin S, et al. Mapping mutations to the SARS-CoV-2 RBD that escape binding by different classes of antibodies. Nat Commun. 2021;12(1):4196. doi: 10.1038/s41467-021-24435-8.
38. Desautels TA, Arrildt KT, Zemla AT, Lau EY, Zhu F, Ricci D, Cronin S, Zost SJ, Binshtein E, Scheaffer SM, et al. Computationally restoring the potency of a clinical antibody against Omicron. Nature. 2024;629(8013):878–885. doi: 10.1038/s41586-024-07385-1.
39. Dreyer FA, Cutting D, Schneider C, Kenlay H, Deane CM. Inverse folding for antibody sequence design using deep learning. arXiv preprint arXiv:2310.19513. 2023. https://arxiv.org/abs/2310.19513.
40. Yang Z, Milas KA, White AD. Now what sequence? Pre-trained ensembles for Bayesian optimization of protein sequences. bioRxiv. 2022. doi: 10.1101/2022.08.05.502972.
41. Stanton S, Maddox W, Gruver N, Maffettone P, Delaney E, Greenside P, Wilson AG. Accelerating Bayesian optimization for biological sequence design with denoising autoencoders. In: Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, editors. International conference on machine learning; 17–23 July. PMLR; 2022. p. 20459–20478. https://proceedings.mlr.press/v162/stanton22a/stanton22a.pdf.
42. Khan A, Cowen-Rivers AI, Grosnit A, Deik D-G-X, Robert PA, Greiff V, Smorodina E, Rawat P, Dreczkowski K, Akbar R, et al. AntBO: towards real-world automated antibody design with combinatorial Bayesian optimisation. arXiv preprint arXiv:2201.12570. 2022.
43. Park JW, Stanton S, Saremi S, Watkins A, Dwyer H, Gligorijevic V, Bonneau R, Ra S, Cho K. PropertyDAG: multi-objective Bayesian optimization of partially ordered, mixed-variable properties for biological sequence design. arXiv preprint arXiv:2210.04096. 2022.
44. Park JW, Tagasovska N, Maser M, Ra S, Cho K. BOtied: multi-objective Bayesian optimization with tied multivariate ranks. arXiv preprint arXiv:2306.00344. 2024.
45. Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w.
46. Ketata MA, Laue C, Mammadov R, Stärk H, Wu M, Corso G, Marquet C, Barzilay R, Jaakkola TS. DiffDock-PP: rigid protein-protein docking with diffusion models. arXiv preprint arXiv:2304.03889. 2023.
47. Martinkus K, Ludwiczak J, Cho K, Liang W-C, Lafrance-Vanasse J, Hotzel I, Rajpal A, Wu Y, Bonneau R, Gligorijevic V, et al. AbDiffuser: full-atom generation of in vitro functioning antibodies. Adv Neural Inf Process Syst. 2024;36:40729–40759.
48. Bennett NR, Watson JL, Ragotte RJ, Borst AJ, See DL, Weidle C, Biswas R, Shrock EL, Leung PJY, Huang B, et al. Atomically accurate de novo design of single-domain antibodies. bioRxiv. 2024. doi: 10.1101/2024.03.14.585103.
49. Cutting D, Dreyer FA, Errington D, Schneider C, Deane CM. De novo antibody design with SE(3) diffusion. J Comput Biol. 2024;32(4):351–361. https://arxiv.org/abs/2405.07622.
50. Kamat V, Rafique A, Huang T, Olsen O, Olson W. The impact of different human IgG capture molecules on the kinetics analysis of antibody-antigen interaction. Analytical Biochem. 2020;593:113580. doi: 10.1016/j.ab.2020.113580.
51. Bailly M, Mieczkowski C, Juan V, Metwally E, Tomazela D, Baker J, Uchida M, Kofman E, Raoufi F, Motlagh S, et al. Predicting antibody developability profiles through early stage discovery screening. Mabs-austin. 2020;12(1):1743053. doi: 10.1080/19420862.2020.174305.
52. Dunbar J, Deane CM. ANARCI: antigen receptor numbering and receptor classification. Bioinformatics. 2016;32(2):298–300. doi: 10.1093/bioinformatics/btv552.
53. Leem J, Dunbar J, Georges G, Shi J, Deane CM. ABodyBuilder: automated antibody structure prediction with data-driven accuracy estimation. Mabs-austin. 2016;8(7):1259–1268. doi: 10.1080/19420862.2016.1205773.
54. Abanades B, Wong WK, Boyles F, Georges G, Bujotzek A, Deane CM. ImmuneBuilder: deep-learning models for predicting the structures of immune proteins. Commun Biol. 2023;6(1):575. doi: 10.1038/s42003-023-04927-7.
55. Kenlay H, Dreyer FA, Cutting D, Nissley D, Deane CM, Cowen L. ABodyBuilder3: improved and scalable antibody structure predictions. Bioinformatics. 2024;40(10). doi: 10.1093/bioinformatics/btae576.
56. Raybould MI, Deane CM. The therapeutic antibody profiler for computational developability assessment. Ther Antibodies: Methods Protocol. 2022;2313:115–125. https://link.springer.com/protocol/10.1007/978-1-0716-1450-1_5
57. Goddard TD, Huang CC, Meng EC, Pettersen EF, Couch GS, Morris JH, Ferrin TE. UCSF ChimeraX: meeting modern challenges in visualization and analysis. Protein Sci. 2018;27(1):14–25. doi: 10.1002/pro.3235.
58. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang LP, Simmonett AC, Harrigan MP, Stern CD, et al. OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLOS Comput Biol. 2017;13(7):e1005659.
59. Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, Wicky BIM, Courbet A, Haas RJ, Bethel N, et al. Robust deep learning–based protein sequence design using ProteinMPNN. Science. 2022;378(6615):49–56. doi: 10.1126/science.add2187.
60. Høie MH, Hummer AM, Olsen TH, Aguilar-Sanjuan B, Nielsen M, Deane CM, Gromiha M. AntiFold: improved structure-based antibody design using inverse folding. Bioinf Adv. 2025;5(1):202. doi: 10.1093/bioadv/vbae202.
61. Wang Z, Ji Y, Tian J, Zheng S. Retrieval augmented diffusion model for structure-informed antibody design and optimization. 2024. https://arxiv.org/abs/2410.15040.
62. Greenig M, Zhao H, Radenkovic V, Ramon A, Sormanni P. IgCraft: a versatile sequence generation framework for antibody discovery and engineering. 2025. https://arxiv.org/abs/2503.19821.
63. Zhang W, Shi K, Geng Q, Herbst M, Wang M, Huang L, Bu F, Liu B, Aihara H, Li F, et al. Structural evolution of SARS-CoV-2 Omicron in human receptor recognition. J Virol. 2023;97(8):00822–23. doi: 10.1128/jvi.00822-23.
64. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022;19(6):679–682. doi: 10.1038/s41592-022-01488-1.
65. Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman KW, Renfrew PD, Smith CA, Sheffler W, et al. Chapter nineteen - Rosetta3: an object-oriented software suite for the simulation and design of macromolecules. In: Johnson ML, Brand L, editors. Computer methods, part C. Methods in Enzymology. Vol. 487. Academic Press; 2011. p. 545–574. doi: 10.1016/B978-0-12-381270-4.00019-6.
66. Adolf-Bryfogle J, Labonte JW, Kraft JC, Shapovalov M, Raemisch S, Lütteke T, DiMaio F, Bahl CD, Pallesen J, King NP, et al. Growing glycans in Rosetta: accurate de novo glycan modeling, density fitting, and rational sequon design. bioRxiv. 2021. doi: 10.1101/2021.09.27.462000.
67. Lauer TM, Agrawal NJ, Chennamsetty N, Egodage K, Helk B, Trout BL. Developability index: a rapid in silico tool for the screening of antibody aggregation propensity. J Pharm Sci. 2012;101(1):102–115. doi: 10.1002/jps.22758.
68. Hsieh C-L, Goldsmith JA, Schaub JM, DiVenere AM, Kuo H-C, Javanmardi K, Le KC, Wrapp D, Lee AG, Liu Y, et al. Structure-based design of prefusion-stabilized SARS-CoV-2 spikes. Science. 2020;369(6510):1501–1505. doi: 10.1126/science.abd0826.
69. Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods. 2017;14(3):290–296. doi: 10.1038/nmeth.4169.
70. Sanchez-Garcia R, Gomez-Blanco J, Cuervo A, Carazo JM, Sorzano COS, Vargas J. DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun Biol. 2021;4(1):874. doi: 10.1038/s42003-021-02399-1.
71. Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, Ahern W, Borst AJ, Ragotte RJ, Milles LF, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620(7976):1089–1100. doi: 10.1038/s41586-023-06415-8.

Data Availability Statement

The code and data generated in this study are available under the MIT license at https://github.com/Exscientia/ab-characterisation and https://doi.org/10.5281/zenodo.13862717.

[cit0001] 1.Crescioli S, Kaplon H, Chenoweth A, Wang L, Visweswaraiah J, Reichert JM.. Antibodies to watch in 2024. Mabs-austin. 2024;16(1):2297450. doi: 10.1080/19420862.2023.2297450. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0002] 2.Walsh G, Walsh E.. Biopharmaceutical benchmarks 2022. Nat Biotechnol. 2022;40(12):1722–18. doi: 10.1038/s41587-022-01582-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0003] 3.Mullard A. Fda approves 100th monoclonal antibody product. Nat Rev Drug Discov. 2021;20(7):491–495. doi: 10.1038/d41573-021-00079-7. [DOI] [PubMed] [Google Scholar]

[cit0004] 4.https://www.pharmaceutical-technology.com/analyst-comment/biologic-sales-small-molecule-sales/.

[cit0005] 5.Kim J, McFee M, Fang Q, Abdin O, Kim PM. Computational and artificial intelligence-based methods for antibody development. Trends Pharmacological Sci. 2023;44(3):175–189. doi: 10.1016/j.tips.2022.12.005. [DOI] [PubMed] [Google Scholar]

[cit0006] 6.Silva BM, Myung Y, Ascher DB, Pires DEV. epitope3D: a machine learning method for conformational B-cell epitope prediction. Briefings Bioinf. 2022;23(1):423. doi: 10.1093/bib/bbab423. [DOI] [PubMed] [Google Scholar]

[cit0007] 7.Chinery L, Wahome N, Moal I, Deane CM, Borgwardt K. Paragraph—antibody paratope prediction using graph neural networks with minimal feature vectors. Bioinformatics. 2022;39(1):732. doi: 10.1093/bioinformatics/btac732. [DOI] [PubMed] [Google Scholar]

[cit0008] 8.Kurumida Y, Saito Y, Kameda T. Predicting antibody affinity changes upon mutations by combining multiple predictors. Sci Rep. 2020;10(1):19533. doi: 10.1038/s41598-020-76369-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0009] 9.Adolf-Bryfogle J, Kalyuzhniy O, Kubitz M, Weitzner BD, Hu X, Adachi Y, Schief WR, Dunbrack RL Jr, Ben-Tal N. Rosettaantibodydesign (rabd): a general framework for computational antibody design. PLOS Comput Biol. 2018;14(4):1006112. doi: 10.1371/journal.pcbi.1006112. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0010] 10.Zhang W, Wang H, Feng N, Li Y, Gu J, Wang Z. Developability assessment at early-stage discovery to enable development of antibody-derived therapeutics. Antibody Ther. 2023;6(1):13–29. doi: 10.1093/abt/tbac029. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0011] 11.Bachas S, Rakocevic G, Spencer D, Sastry AV, Haile R, Sutton JM, Kasun G, Stachyra A, Gutierrez JM, Yassine E, et al. Antibody optimization enabled by artificial intelligence predictions of binding affinity and naturalness. BioRxiv. 2022: 2022–2028.

[cit0012] 12.Gruver N, Stanton S, Frey N, Rudner TG, Hotzel I, Lafrance-Vanasse J, Rajpal A, Cho K, Wilson AG. Protein design with guided discrete diffusion. Adv Neural Inf Process Syst. 2023;36:12489–12517. https://dl.acm.org/doi/10.5555/3666122.3666669 [Google Scholar]

[cit0013] 13.Shanker VR, Bruun TU, Hie BL, Kim PS. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science. 2024;385(6704):46–53. doi: 10.1126/science.adk8946. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0014] 14.Hummer AM, Abanades B, Deane CM. Advances in computational structure-based antibody design. Curr Opin Struct Biol. 2022;74:102379. doi: 10.1016/j.sbi.2022.102379. [DOI] [PubMed] [Google Scholar]

[cit0015] 15.Shanehsazzadeh A, McPartlon M, Kasun G, Steiger AK, Sutton JM, Yassine E, McCloskey C, Haile R, Shuai R, Alverio J, et al. Unlocking de novo antibody design with generative artificial intelligence. bioRxiv. 2024. doi: 10.1101/2023.01.08.523187. [DOI]

[cit0016] 16.Cheng J, Liang T, Xie X-Q, Feng Z, Meng L. A new era of antibody discovery: an in-depth review of ai-driven approaches. Drug Discov Today Today. 2024;29(6):103984. doi: 10.1016/j.drudis.2024.103984. [DOI] [PubMed] [Google Scholar]

[cit0017] 17.Kim J, McFee M, Fang Q, Abdin O, Kim P. Computational and artificial intelligence-based methods for antibody development. Trends Pharmacological Sci. 2023;44(3):175–189. doi: 10.1016/j.tips.2022.12.005. [DOI] [PubMed] [Google Scholar]

[cit0018] 18.Bauer J, Kube S, Gupta P, Kumar S. Biopharmaceutical informatics: a strategic vision for discovering developable biotherapeutic drug candidates. In: Gadamasetti K, Kolodziej SA. editors. Bioprocessing, bioengineering and process chemistry in the biopharmaceutical industry: using chemistry and bioengineering to improve the performance of Biologics. Cham: Springer; 2024. p. 405–436. doi: 10.1007/978-3-031-62007-214. [DOI] [Google Scholar]

[cit0019] 19.Kumar S, Nixon A. Biopharmaceutical informatics: learning to discover developable biotherapeutics. 2025; doi: 10.1201/9781003300311. [DOI] [Google Scholar]

[cit0020] 20.Svilenov HL, Arosio P, Menzen T, Tessier P, Sormanni P. Approaches to expand the conventional toolbox for discovery and selection of antibodies with drug-like physicochemical properties. Mabs-austin. 2023;15(1):2164459. doi: 10.1080/19420862.2022.2164459. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0021] 21.Nimrod G, Fischman S, Austin M, Herman A, Keyes F, Leiderman O, Hargreaves D, Strajbl M, Breed J, Klompus S, et al. Computational design of epitope-specific functional antibodies. Cell Rep. 2018;25(8):2121–21315. doi: 10.1016/j.celrep.2018.10.081. [DOI] [PubMed] [Google Scholar]

[cit0022] 22.Jette CA, Cohen AA, Gnanapragasam PNP, Muecksch F, Lee YE, Huey-Tubman KE, Schmidt F, Hatziioannou T, Bieniasz PD, Nussenzweig MC, et al. Broad cross-reactivity across sarbecoviruses exhibited by a subset of COVID-19 donor-derived neutralizing antibodies. Cell Rep. 2021;36(13):109760. doi: 10.1016/j.celrep.2021.109760. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0023] 23.Piccoli L, Park Y-J, Tortorici MA, Czudnochowski N, Walls AC, Beltramello M, SilacciFregni C, Pinto D, Rosen LE, Bowen JE, et al. Mapping neutralizing and immunodominant sites on the sars-cov-2 spike receptor-binding domain by structure-guided high-resolution serologyCell. Cell. 2020;183(4):1024–1042.e21. doi: 10.1016/j.cell.2020.09.037. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0024] 24.Cao Y, Wang J, Jian F, Xiao T, Song W, Yisimayi A, Huang W, Li Q, Wang P, An R, et al. Omicron escapes the majority of existing sars-cov-2 neutralizing antibodies. Nature. 2022;602(7898):657–663. doi: 10.1038/s41586-021-04385-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0025] 25.Cao Y, Yisimayi A, Jian F, Song W, Xiao T, Wang L, Du S, Wang J, Li Q, Chen X, et al. Ba.2.12.1, ba.4 and ba.5 escape antibodies elicited by omicron infection. Nature. 2022;608(7923):593–602. doi: 10.1038/s41586-022-04980-y. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0026] 26.Du S, Cao Y, Zhu Q, Yu P, Qi F, Wang G, Du X, Bao L, Deng W, Zhu H, et al. Structurally resolved sars-cov-2 antibody shows high efficacy in severely infected hamsters and provides a potent cocktail pairing strategy. Cell. 2020;183(4):1013–102313. doi: 10.1016/j.cell.2020.09.035. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0027] 27.Westendorf K, Žentelis S, Wang L, Foster D, Vaillancourt P, Wiggin M, Lovett E, Lee R, Hendle J, Pustilnik A, et al. Ly-cov1404 (bebtelovimab) potently neutralizes sars-cov-2 variants. bioRxiv. 2022. doi: 10.1101/2021.04.30.442182. [DOI] [PMC free article] [PubMed]

[cit0028] 28.Hansen J, Baum A, Pascal KE, Russo V, Giordano S, Wloga E, Fulton BO, Yan Y, Koon K, Patel K, et al. Studies in humanized mice and convalescent humans yield a sars-cov-2 antibody cocktail. Science. 2020;369(6506):1010–1014. doi: 10.1126/science.abd0827. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0029] 29.Park Y-J, Marco AD, Starr TN, Liu Z, Pinto D, Walls AC, Zatta F, Zepeda SK, Bowen JE, Sprouse KR, et al. Antibody-mediated broad sarbecovirus neutralization through ace2 molecular mimicry. Science. 2022;375(6579):449–454. doi: 10.1126/science.abm8143. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0030] 30.Huang M, Wu L, Zheng A, Xie Y, He Q, Rong X, Han P, Du P, Han P, Zhang Z, et al. Atlas of currently available human neutralizing antibodies against sars-cov-2 and escape by omicron sub-variants ba.1/ba.1.1/ba.2/ba.3. Immunity. 2022;55(8):1501–15143. doi: 10.1016/j.immuni.2022.06.005. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0031] 31.Kovaltsuk A, Leem J, Kelm S, Snowden J, Deane CM, Krawczyk K. Observed antibody space: a resource for data mining next-generation sequencing of antibody repertoires. J Immunol. 2018;201(8):2502–2509. doi: 10.4049/jimmunol.1800708. [DOI] [PubMed] [Google Scholar]

[cit0032] 32.Olsen TH, Boyles F, Deane CM. Observed antibody space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences. Protein Sci. 2022;31(1):141–146. doi: 10.1002/pro.4205. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0033] 33.Hie BL, Shanker VR, Xu D, Bruun TU, Weidenbacher PA, Tang S, Wu W, Pak JE, Kim PS. Efficient evolution of human antibodies from general protein language models. Nat Biotechnol. 2024;42(2):275–283. doi: 10.1038/s41587-023-01763-2. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0034] 34.Bashour H, Smorodina E, Pariset M, Zhong J, Akbar R, Chernigovskaya M, LêQuý K, Snapkow I, Rawat P, Krawczyk K, et al. Biophysical cartography of the native and human-engineered antibody landscapes quantifies the plasticity of antibody developability. Commun Biol. 2024;7(1):922. doi: 10.1038/s42003-024-06561-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0035] 35.Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA. 2021;118(15):2016239118. doi: 10.1073/pnas.2016239118. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0036] 36.Pham NB, Meng WS. Protein aggregation and immunogenicity of biotherapeutics. Int J Multiling Pharmaceutics. 2020;585:119523. doi: 10.1016/j.ijpharm.2020.119523. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0037] 37.Greaney AJ, Starr TN, Barnes CO, Weisblum Y, Schmidt F, Caskey M, Gaebler C, Cho A, Agudelo M, Finkin S, et al. Mapping mutations to the sarscov-2 rbd that escape binding by different classes of antibodies. Nat Commun. 2021;12(1):4196. doi: 10.1038/s41467-021-24435-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0038] 38.Desautels TA, Arrildt KT, Zemla AT, Lau EY, Zhu F, Ricci D, Cronin S, Zost SJ, Binshtein E, Scheaffer SM, et al. Computationally restoring the potency of a clinical antibody against omicron. Nature. 2024;629(8013):878–885. doi: 10.1038/s41586-024-07385-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0039] 39.Dreyer FA, Cutting D, Schneider C, Kenlay H, Deane CM. Inverse folding for antibody sequence design using deep learning. arXiv preprint arXiv:2310.19513. (2023) https://arxiv.org/abs/2310.19513.

[cit0040] 40.Yang Z, Milas KA, White AD. Now what sequence? pre-trained ensembles for bayesian optimization of protein sequences. bioRxiv. (2022) 10.1101/2022.08.05.502972. [DOI]

[cit0041] 41.Stanton S, Maddox W, Gruver N, Maffettone P, Delaney E, Greenside P, Wilson AG. Accelerating bayesian optimization for biological sequence design with denoising autoencoders. In: Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, editors. International conference on machine learning;17–23 July. PMLR; 2022. p. 20459–20478 https://proceedings.mlr.press/v162/stanton22a/stanton22a.pdf. [Google Scholar]

[cit0042] 42.Khan A, Cowen-Rivers AI, Grosnit A, Deik D-G-X, Robert PA, Greiff V, Smorodina E, Rawat P, Dreczkowski K, Akbar R, et al. AntBO: towards real-world automated antibody design with combinatorial bayesian optimisation. arXiv preprint arXiv:2201.12570. 2022. [DOI] [PMC free article] [PubMed]

[cit0043] 43.Park JW, Stanton S, Saremi S, Watkins A, Dwyer H, Gligorijevic V, Bonneau R, Ra S, Cho K. PropertyDAG: multi-objective bayesian optimization of partially ordered, mixed-variable properties for biological sequence design. arXiv preprint arXiv:2210.04096. 2022.

[cit0044] 44.Park JW, Tagasovska N, Maser M, Ra S, Cho K. Botied: multi-objective Bayesian optimization with tied multivariate ranks. arXiv preprint arXiv:2306.00344. 2024.

[cit0045] 45.Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, et al. Accurate structure prediction of biomolecular interactions with alphafold 3. Nature. 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0046] 46.Ketata MA, Laue C, Mammadov R, Stärk H, Wu M, Corso G, Marquet C, Barzilay R, Jaakkola TS. Diffdock-pp: rigid protein-protein docking with diffusion models. arXiv preprint arXiv:2304.03889. 2023.

[cit0047] 47.Martinkus K, Ludwiczak J, Cho K, Liang W-C, Lafrance-Vanasse J, Hotzel I, Rajpal A, Wu Y, Bonneau R, Gligorijevic V, et al. AbDiffuser: full-atom generation of in vitro functioning antibodies. Adv Neural Inf Process Syst. 2024;36:40729–40759. [Google Scholar]

[cit0048] 48.Bennett NR, Watson JL, Ragotte RJ, Borst AJ, See DL, Weidle C, Biswas R, Shrock EL, Leung PJY, Huang B, et al. Atomically accurate de novo design of single-domain antibodies. bioRxiv. 2024. doi: 10.1101/2024.03.14.585103. [DOI]

[cit0049] 49.Cutting D, Dreyer FA, Errington D, Schneider C, Deane CM. De Novo antibody design with SE(3) diffusion. J Comput Biol. 2024;32(4):351–361. https://arxiv.org/abs/2405.07622. [DOI] [PubMed] [Google Scholar]

[cit0050] 50.Kamat V, Rafique A, Huang T, Olsen O, Olson W. The impact of different human igg capture molecules on the kinetics analysis of antibody-antigen interaction. Analytical Biochem. 2020;593:113580. doi: 10.1016/j.ab.2020.113580. [DOI] [PubMed] [Google Scholar]

[cit0051] 51.Bailly M, Mieczkowski C, Juan V, Metwally E, Tomazela D, Baker J, Uchida M, Kofman E, Raoufi F, Motlagh S, et al. Predicting antibody developability profiles through early stage discovery screening. Mabs-austin. 2020;12(1):1743053. 10.1080/19420862.2020.174305. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0052] 52.Dunbar J, Deane CM. Anarci: antigen receptor numbering and receptor classification. Bioinformatics. 2016;32(2):298–300. doi: 10.1093/bioinformatics/btv552. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0053] 53.Leem J, Dunbar J, Georges G, Shi J, Deane CM. ABodyBuilder: automated antibody structure prediction with data–driven accuracy estimation. Mabs-austin. 2016;8(7):1259–1268. doi: 10.1080/19420862.2016.1205773. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0054] 54.Abanades B, Wong WK, Boyles F, Georges G, Bujotzek A, Deane CM. Immunebuilder: deep-learning models for predicting the structures of immune proteins. Commun Biol. 2023;6(1):575. doi: 10.1038/s42003-023-04927-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0055] 55.Kenlay H, Dreyer FA, Cutting D, Nissley D, Deane CM, Cowen L. ABodyBuilder3: improved and scalable antibody structure predictions. Bioinformatics. 2024;40(10). doi: 10.1093/bioinformatics/btae576. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0056] 56.Raybould MI, Deane CM. The therapeutic antibody profiler for computational developability assessment. Ther Antibodies: Methods Protocol. 2022;2313:115–125. https://link.springer.com/protocol/10.1007/978-1-0716-1450-1_5 [DOI] [PubMed] [Google Scholar]

[cit0057] 57.Goddard TD, Huang CC, Meng EC, Pettersen EF, Couch GS, Morris JH, Ferrin TE. UCSF chimerax: meeting modern challenges in visualization and analysis. Protein Sci. 2018;27(1):14–25. doi: 10.1002/pro.3235. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0058] 58.Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang LP, Simmonett AC, Harrigan MP, Stern CD, et al. OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLOS Comput Biol. 2017;13(7):e1005659. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0059] 59.Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, Wicky BIM, Courbet A, Haas RJ, Bethel N, et al. Robust deep learning–based protein sequence design using proteinmpnn. Science. 2022;378(6615):49–56. doi: 10.1126/science.add2187. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0060] 60.Høie MH, Hummer AM, Olsen TH, Aguilar-Sanjuan B, Nielsen M, Deane CM, Gromiha M. Antifold: improved structure-based antibody design using inverse folding. Bioinf Adv. 2025;5(1):202. doi: 10.1093/bioadv/vbae202. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0061] 61.Wang Z, Ji Y, Tian J, Zheng S. Retrieval augmented diffusion Model for structure-informed antibody design and optimization. 2024;https://arxiv.org/abs/2410.15040. [Google Scholar]

[cit0062] 62.Greenig M, Zhao H, Radenkovic V, Ramon A, Sormanni P. IgCraft: a versatile sequence generation framework for antibody discovery and engineering. 2025;https://arxiv.org/abs/2503.19821. [Google Scholar]

[cit0063] 63.Zhang W, Shi K, Geng Q, Herbst M, Wang M, Huang L, Bu F, Liu B, Aihara H, Li F, et al. Structural evolution of sars-cov-2 omicron in human receptor recognition. J Virol. 2023;97(8):00822–23. doi: 10.1128/jvi.00822-23. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0064] 64.Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. Colabfold: making protein folding accessible to all. Nat Methods. 2022;19(6):679–682. doi: 10.1038/s41592-022-01488-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0065] 65.Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman KW, Renfrew PD, Smith CA, Sheffler W, et al. Chapter nineteen rosetta3: an object-oriented sofware suite for the simulation and design of macromolecules. In: Johnson ML, Brand L, editors. Computer methods, part C. Methods in Enzymology. Vol. 487. Academic Press; 2011. p. 545–574. doi: 10.1016/B978-0-12-381270-4.00019-6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0066] 66.Adolf-Bryfogle J, Labonte JW, Kraft JC, Shapovalov M, Raemisch S, Lütteke T, DiMaio F, Bahl CD, Pallesen J, King NP, et al. Growing glycans in Rosetta: accurate de novo glycan modeling, density fitting, and rational sequon design. bioRxiv. 2021. doi: 10.1101/2021.09.27.462000. [DOI] [PMC free article] [PubMed]

[cit0067] 67.Lauer TM, Agrawal NJ, Chennamsetty N, Egodage K, Helk B, Trout BL. Developability index: a rapid in silico tool for the screening of antibody aggregation propensity. J Pharm Sci. 2012;101(1):102–115. doi: 10.1002/jps.22758. [DOI] [PubMed] [Google Scholar]

[cit0068] 68.Hsieh C-L, Goldsmith JA, Schaub JM, DiVenere AM, Kuo H-C, Javanmardi K, Le KC, Wrapp D, Lee AG, Liu Y, et al. Structure-based design of prefusion-stabilized sars-cov-2 spikes. Science. 2020;369(6510):1501–1505. doi: 10.1126/science.abd0826. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit0069] 69.Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. CryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods. 2017;14(3):290–296. doi: 10.1038/nmeth.4169. [DOI] [PubMed] [Google Scholar]

[cit0070] 70. Sanchez-Garcia R, Gomez-Blanco J, Cuervo A, Carazo JM, Sorzano COS, Vargas J. DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun Biol. 2021;4(1):874. doi: 10.1038/s42003-021-02399-1.

[cit0071] 71. Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, Ahern W, Borst AJ, Ragotte RJ, Milles LF, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620(7976):1089–1100. doi: 10.1038/s41586-023-06415-8.
