Summary
Phenotypic drug discovery (PDD) enables the target-agnostic generation of therapeutic drugs with novel mechanisms of action. However, realizing its full potential for biologics discovery requires new technologies to produce antibodies to all, a priori unknown, disease-associated biomolecules. We present a methodology that helps achieve this by integrating computational modeling, differential antibody display selection, and massively parallel sequencing. The method uses law of mass action-based computational modeling to optimize antibody display selection and, by matching computationally modeled and experimentally selected sequence enrichment profiles, to predict which antibody sequences encode specificity for disease-associated biomolecules. Applied to a phage display antibody library and cell-based antibody selection, ∼10⁵ antibody sequences encoding specificity for tumor cell surface receptors expressed at 10³–10⁶ receptors/cell were discovered. We anticipate that this approach will be broadly applicable to molecular libraries coupling genotype to phenotype and to the screening of complex antigen populations for identification of antibodies to unknown disease-associated targets.
Keywords: phenotypic antibody discovery, computational biology, therapeutic antibodies, biomarkers, specificity predictions, mathematical modeling, phage display
Highlights
• Generation of antibodies based on in silico-predicted antibody enrichment signatures
• Integrates computational modeling, differential antibody display selection, and NGS
• Optimizes antibody display selection by in silico modeling
• Generates a diverse antibody pool targeting a broad range of surface antigens
Motivation
Phenotypic antibody discovery enables the identification of novel antibodies with the most promising functional in vitro or in vivo activity, without prior knowledge of the targeted antigen. For efficient phenotypic discovery, a large panel of antibodies against a broad repertoire of potential targets should be included in the functional testing. Current methods generate limited numbers of antibodies (10²–10³), targeting a few highly expressed antigens. This is a concern because antibodies to low-expressed antigens may have functional activity and may be relevant to biomarker discovery and therapeutic antibody development. We present a methodology that significantly enhances the number of antibodies generated (10⁵) and expands the antibody-targeted receptor expression range (antigens differentially expressed at 10³–10⁶ copies/cell) to include low-expressed tumor-selective antigens, enabling functional testing of a large pool of antibodies targeting a broad range of surface antigens.
Phenotypic antibody discovery can identify antibodies with novel mechanisms of action, but it suffers from shortcomings in generating diverse antibody pools for functional testing. Mattsson et al. integrate computational modeling, differential antibody display selection, and massively parallel sequencing to generate diverse antibody pools to the cell surfaceome.
Introduction
Immunotherapy with antibodies has significantly improved the survival of cancer patients1,2,3 and enhanced the quality of life for those with autoimmune disorders.4,5 Nevertheless, the lack of response and drug resistance in many patients warrant the identification of novel antibodies and therapeutically relevant targets. Phenotypic drug discovery (PDD) is a validated approach to discovering first-in-class targets and drugs.6 In PDD, candidate drugs from large molecular libraries are screened directly for functional activity (e.g., tumor cell death induction) without prior knowledge of targeted receptors’ identities, signaling pathways, or drugs’ mechanisms of action. Many small-molecule drugs approved by the US Food and Drug Administration (FDA) were discovered using PDD.7 We and others have used PDD to identify several first-in-class antibodies being trialed in clinical studies.8,9,10,11,12,13,14
Realizing the full potential of antibody PDD (i.e., functional screening of antibodies to all a priori unknown disease-associated biomolecules) will require significant improvement over existing methods. These have generated limited numbers (10¹–10²) of antibodies, often specific to the most highly expressed biomolecules.11,12,15,16,17,18,19,20 This is a concern since antibodies to low-expressed biomolecules may have functional activity and may be relevant to biomarker discovery and therapeutic antibody development. Consequently, the main bottleneck of antibody PDD is the target-agnostic generation of antibodies to diverse disease-associated biomolecules.
Current approaches to target-agnostic antibody generation rely on a sequential process of antibody display selection followed by screening for binding to identify clones to disease-associated biomolecules (Figure S1). First, antibody pools enriched for relevant binders are generated from a genotype-linked antibody display library by applying positive selection pressure for binding to a disease-associated sample and negative selection pressure for binding to a healthy sample21,22,23,24,25,26,27 (Figure S1A). Individual clones from the enriched antibody pools are then screened in binding assays, either directly (Figure S1B) or following sequencing revealing the most abundant or enriched clones (Figure S1C), to identify disease-associated biomolecule-specific antibodies.11,12,15,16,17,18,19,20 Since target biomolecules have unknown identities and are present at unknown concentrations in disease-associated and healthy samples, the antibody selection reaction cannot be optimized to retrieve antibodies to specific target biomolecules (to be discovered). Perhaps as a result, antibodies to low-expressed biomolecules may be present at a very low frequency (<1/10⁶) in enriched antibody pools,28 too low to enable their discovery by existing target-agnostic approaches.
However, antibody binding to biomolecules is known to occur in an affinity (Kd)-driven and target and antibody concentration-dependent manner, which can be described by the universal law of mass action (LaMA). We hypothesized that a computational biology approach where enrichment of antibodies is modeled using the LaMA, according to antibody specificity for biomolecules expressed at different levels throughout disease-associated and healthy sample biomolecule expression ranges, could help optimize the selection and identification of medically relevant antibodies. Here, we describe a rational discovery approach that overcomes the quantitative and qualitative limitations of the current state-of-the-art target-agnostic discovery methods. Integrating the LaMA-based computational modeling, experimental antibody selection, and massively parallel sequencing, iLaMA optimizes antibody display selection and predicts which selected antibody sequences encode specificity to disease-associated biomolecules (Figure 1). Our approach enables the rational identification of tens of thousands of unique antibody sequences encoding specificity for diverse a priori unknown disease-associated biomolecules.
Figure 1.
Schematic workflow of the computational modeling-based antibody discovery process iLaMA
(1) Categories of (hypothetical) biomolecules, e.g., surface receptors, are defined by their absolute and relative expression levels on target and nontarget cells. Biomolecule expression levels ranging from no expression to highest estimated expression on target and nontarget samples are included and used to model selection.
(2) The law of mass action-based computational modeling is performed to optimize selection reaction parameters (e.g., numbers of target and nontarget cells) which are then (3) used in experimental selection to enrich displayed antibodies to sought categories of differentially expressed biomolecules. Enriched antibody pools are analyzed by massively parallel next generation sequencing (NGS) to provide experimental antibody enrichment signatures. The fraction of antibodies that has been enriched in a target biomolecule-dependent manner (the hit rate) is determined. (4 and 5) The law of mass action-based in silico modeling is used to generate predicted antibody enrichment signatures for antibodies to the different categories of biomolecules. (6) Finally, experimental and predicted enrichment signatures are matched to identify antibody sequences encoding specificity to sought categories of differentially expressed biomolecules. See also Figure S1.
Results
iLaMA accurately predicts antibody enrichment according to targeted receptor specificity
To overcome the quantitative and qualitative limitations of current approaches to generate antibodies for biologic PDD, we developed iLaMA, a computational biology approach that integrates the law of mass action-based mathematical modeling and antibody display selection to identify medically relevant antibodies. The key steps of this methodology include optimizing and performing antibody display selection, modeling biomolecule-specific antibody enrichment, and matching computationally predicted and experimentally obtained antibody sequence enrichment profiles to identify antibody clones encoding specificity to sought types of disease-associated biomolecules, as summarized in Figure 1.
We applied this methodology to screen the large (>10¹⁰ members) naive human phage display library n-CoDeR26 for antibodies to cell surface receptors differentially expressed between two cell types: DU145 prostate cancer (target) cells and Jurkat (nontarget) T cells. In the first step, we defined categories of differentially expressed surface receptors on these cells spanning an estimated expression range of 10³–10⁶ receptors per cell (STAR Methods). We used iLaMA to identify the number of input cells needed for the enrichment of high-affinity (Kd ≤ 10 nM) antibodies to target cell receptors >5-fold upregulated on target cells compared with nontarget cells and depletion of antibodies to receptors <5-fold upregulated on target cells, expressed throughout this range (10³–10⁶ receptors/cell). In our model, the number of antibodies retrieved following selection is a function of the number of biomolecules (receptors) on the target and nontarget cells and the number of target and nontarget cells used in the selection reaction, with the formula
where rAT is the number of retrieved antibodies on the target cells, A is the total number of antibodies, B is the total number of biomolecules on target and nontarget cells, Kd is the antibody affinity (M), NA is Avogadro’s constant (6.022 × 10²³ molecules/mole), V is the selection reaction volume (dm³), CT and CN are the numbers of target and nontarget cells, BT and BN are the number of biomolecules on CT and CN, Y is the fraction of recovered target cells following selection, and E is the fraction of antibodies eluted from CT.
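As a sketch under stated assumptions (not necessarily the authors' exact expression), a relation consistent with the symbol definitions above is the standard equilibrium solution of the law of mass action for 1:1 binding in a well-mixed volume, with the bound antibodies partitioned between target and nontarget cells in proportion to the receptors each contributes. The helper symbol X (number of antibody-receptor complexes at equilibrium) is introduced here for illustration only.

```latex
\begin{aligned}
B    &= C_T B_T + C_N B_N \\
X    &= \frac{(A + B + K_d N_A V) - \sqrt{(A + B + K_d N_A V)^2 - 4AB}}{2} \\
rA_T &= X \cdot \frac{C_T B_T}{C_T B_T + C_N B_N} \cdot Y \cdot E
\end{aligned}
```

The expression for X follows from writing Kd = (A − X)(B − X) / (X · NA · V) and solving the resulting quadratic for the physically meaningful (smaller) root.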
iLaMA indicated that 1 × 10⁷ target cells were needed in the selections to allow enrichment of antibodies to low-expressed receptors and that a 1,000-fold excess of nontarget cells was needed to efficiently remove antibodies to receptors <5-fold upregulated on target cells compared with nontarget cells while enriching antibodies to receptors >5-fold upregulated. iLaMA also predicted that the enrichment of antibodies to >5-fold upregulated receptors would markedly differ between selections with and without competition (Figure S2). The latter indicated that comparative antibody enrichment profiles generated from the two selection types could be used to discriminate antibody sequences encoding specificity for (target cell) restricted and upregulated receptors.
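The effect of nontarget cell competition can be illustrated with a minimal Python sketch. It assumes 1:1 equilibrium binding and that bound antibodies distribute between target and nontarget cells in proportion to receptor numbers; the function names, the clone copy number (10⁴ phage), the 1 mL reaction volume, and the receptor profiles are hypothetical choices for illustration, whereas the Kd of 10 nM, the 10⁷ target cells, and the 1,000-fold nontarget excess are taken from the text.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def bound_complexes(a, b, kd_molar, volume_l):
    """Equilibrium number of 1:1 antibody-receptor complexes (law of mass action)."""
    s = a + b + kd_molar * AVOGADRO * volume_l
    disc = s * s - 4.0 * a * b
    # Algebraically equal to (s - sqrt(disc)) / 2, written to avoid cancellation.
    return 2.0 * a * b / (s + math.sqrt(disc))


def retrieved_on_target(a, kd_molar, volume_l, c_t, b_t, c_n, b_n, y=1.0, e=1.0):
    """Antibodies retrieved on target cells (assumed proportional partitioning)."""
    b_target = c_t * b_t
    b_total = b_target + c_n * b_n
    if b_total == 0:
        return 0.0
    x = bound_complexes(a, b_total, kd_molar, volume_l)
    return x * b_target / b_total * y * e


# Hypothetical inputs: 1e4 phage per clone in a 1 mL (1e-3 dm^3) reaction.
A, KD, V, C_T = 1e4, 10e-9, 1e-3, 1e7
receptors = {  # (receptors per target cell, receptors per nontarget cell)
    "restricted, 1e4/cell": (1e4, 0.0),
    "10-fold upregulated": (2e5, 2e4),
    "2-fold upregulated": (2e5, 1e5),
}
for name, (b_t, b_n) in receptors.items():
    without = retrieved_on_target(A, KD, V, C_T, b_t, 0.0, b_n)
    with_comp = retrieved_on_target(A, KD, V, C_T, b_t, 1000 * C_T, b_n)
    print(f"{name:22s} without: {without:8.1f}  with competition: {with_comp:8.1f}")
```

Under these assumptions, competition leaves retrieval of antibodies to the restricted receptor unchanged while strongly reducing retrieval of antibodies to receptors that are also present on nontarget cells, and the reduction is larger the smaller the fold upregulation.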
We performed experimental selection with and without competition using the calculated cell numbers and generated sequence enrichment profiles of individual antibody clones by massively parallel sequencing of the selected antibody pools. We used four rounds of selection aiming to generate robust and different enrichment signatures for antibody sequences encoding specificity for different types of receptors. We used iLaMA to generate predicted enrichment signatures for antibodies to biomolecules with defined expression levels on target and nontarget cells using a Kd of 10 nM (corresponding to the median affinity of antibodies to diverse cell surface receptors isolated using the n-CoDeR antibody library10,29,30) and the formula
where FAT and rAT are the frequencies and numbers of recovered antibodies specific for a given type of target cell biomolecule and HR is the hit rate (i.e., the fraction of phage antibodies that has been enriched in an antibody and target cell receptor-dependent manner in each selection round as determined by flow cytometry). The HR was incorporated since the enrichment of non-specific (antibody-target cell receptor non-specific) phage antibodies, which typically dominate during early selection rounds18,31 (Table S1), appears stochastic and cannot be modeled using the LaMA. By experimentally determining the fraction of phage antibodies that have been enriched in an antibody-antigen-specific manner and then selectively modeling this, iLaMA considers the presence of the non-specific antibody fraction yet enables the focused discovery of antibody sequences encoding specificity for differentially expressed target cell receptors, according to their predicted enrichment profiles.
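One plausible reading of how the hit rate enters the prediction is that only the HR fraction of the selected pool is modeled, and that fraction is divided among receptor categories in proportion to the modeled numbers of retrieved antibodies. The following Python sketch illustrates that reading; the function name, the category labels, and all numbers are hypothetical, and the normalization is an assumption rather than the authors' stated formula.

```python
def predicted_frequencies(retrieved, hit_rate):
    """Scale modeled retrieval numbers to pool frequencies (assumed normalization).

    retrieved: dict mapping receptor category -> modeled number of retrieved antibodies
    hit_rate:  experimentally determined fraction of specifically enriched phage (0..1)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    total = sum(retrieved.values())
    if total <= 0:
        raise ValueError("at least one category must have retrieved antibodies")
    return {category: hit_rate * n / total for category, n in retrieved.items()}


# Hypothetical round in which 20% of the pool is specifically enriched.
example = predicted_frequencies(
    {"restricted, high": 9000.0, "restricted, low": 900.0, "upregulated": 100.0},
    hit_rate=0.2,
)
for category, frequency in example.items():
    print(f"{category:18s} {frequency:.4f}")
```

With this normalization the modeled frequencies sum to the hit rate, leaving the remainder of the pool to the non-specific fraction that is not modeled.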
To evaluate how well the predicted enrichment signatures matched experimental signatures of antibody clones selected from n-CoDeR, we used reference antibody sequences encoding specificity to known receptors with a determined expression on target and nontarget cells and comprising both target-cell-restricted and upregulated receptors (Figure 2A and Table S2). The numbers of molecules per target cell of restricted receptors spanned 10³ (CD130), 10⁴ (CD40, ROR1, HER2), 10⁵ (CD44, EGFR), and up to 10⁶ (ICAM-1) receptors per target cell, with no detectable expression on nontarget cells. Upregulated receptors CD55, CD59, and CD71 were similarly expressed on target cells (2–4 × 10⁵ receptors/cell) but were >5-fold (CD55, 10-fold) or <5-fold (CD59, 4-fold, and CD71, 2-fold) upregulated on target cells compared with nontarget cells (Figure 2A; Table S2). We used iLaMA to predict enrichment signatures of model antibodies (Kd = 10 nM) to these receptors and compared these signatures with experimental enrichment signatures of reference antibodies to the same receptors. A strong correlation between in silico-predicted and experimentally determined frequencies was observed for antibodies to all reference receptors in both selections with and without nontarget cell competition. Notably, as predicted by the computational modeling, the signatures for antibodies to low-expressed target-cell-restricted receptors (CD40 and CD130) and upregulated receptors (CD55) were very similar in selections with nontarget cell competition but differed markedly in selections without competition (Figure 2B).
Figure 2.
iLaMA models antibody enrichment according to targeted receptor’s absolute and relative expression levels
(A) Overview of 10 reference receptors used to evaluate prediction-based discovery, showing the number of receptors on target and nontarget cells. The selection conditions used in this study were optimized for the discovery of antibodies targeting receptors in the gray area.
(B) Predicted signatures and outcomes. Black lines show in silico-modeled signatures for antibodies targeting the reference receptors. Violin plots show experimental enrichment of antibodies targeting these receptors in selections with nontarget cell competition (blue) and without nontarget cell competition (red). Gray areas indicate the range of diagnostically and therapeutically relevant receptors and the therapeutic applicability of antibodies targeting these receptors. Blue areas indicate whether antibodies targeting the receptors can be found with conventional screening technology and/or prediction-based discovery. See also Figures S2, S3, and Table S2.
Since selection of antibodies from highly diversified libraries will generate receptor-specific antibodies of varying affinity, we additionally modeled antibody enrichment according to 10-fold higher (1 nM) or 10-fold lower (100 nM) affinities. While enrichment signatures were overall similar, the modeled antibody frequencies increased with higher affinity and decreased with lower affinity, mainly affecting restricted receptors (Figure S3A). To assess how possible errors in receptor quantification might affect antibody enrichment signatures, we additionally modeled enrichment according to 2-fold higher or 2-fold lower numbers than experimentally determined. This mainly affected signatures of antibodies to upregulated receptors (CD71, CD59, CD55; Figure S3B).
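The direction of the affinity effect described above can be checked with a simplified calculation. The Python sketch below assumes that receptors far outnumber copies of any single antibody clone, in which case the bound fraction at equilibrium reduces to B / (B + Kd · NA · V). The function name, the 1 mL volume, and the receptor total (10⁷ cells with 10⁴ receptors each) are hypothetical illustration values; the three Kd values are those named in the text.

```python
AVOGADRO = 6.022e23  # molecules per mole


def bound_fraction(receptors_total, kd_molar, volume_l):
    """Bound fraction of a rare antibody clone when receptors are in large excess."""
    return receptors_total / (receptors_total + kd_molar * AVOGADRO * volume_l)


receptors_total = 1e7 * 1e4  # hypothetical: 1e7 target cells x 1e4 receptors/cell
for kd_nm in (1, 10, 100):
    fraction = bound_fraction(receptors_total, kd_nm * 1e-9, 1e-3)
    print(f"Kd = {kd_nm:>3} nM -> bound fraction {fraction:.4f}")
```

In this simplified setting, a 10-fold change in Kd shifts the bound fraction by close to 10-fold whenever the receptor total is small relative to Kd · NA · V, which is the regime of low-expressed receptors.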
In summary, in silico-modeled enrichment of target receptor-specific antibodies closely mirrored experimental antibody enrichment, which was driven by the absolute and relative expression levels of targeted receptors on target and nontarget cells and by the presence or absence of competition selection.
iLaMA indicates tens of thousands of antibodies to differentially expressed receptors in selected antibody pools
Our results demonstrated that iLaMA accurately modeled the enrichment of antibodies to receptors expressed over a wide dynamic range. This indicated its potential utility as a discovery tool to identify antibodies of distinct therapeutic and diagnostic value without a priori knowledge of the antibody target identities. The method’s importance for such discovery would, however, be determined by the number of antibodies the technology could generate. To assess this, we queried the selected antibody pools for antibody sequences with enrichment profiles matching in silico-predicted signatures of antibodies specific to receptors with expression profiles indicating therapeutic or diagnostic relevance. Diagnostic antibodies and therapeutic antibodies that rely purely on blockade of ligand-receptor signaling (e.g., anti-interleukin [IL]-6R32) could be specific to receptors expressed over a wide dynamic range. In contrast, Fc-dependent or empowered therapeutic antibodies, mediating, e.g., antibody-dependent cellular cytotoxicity (ADCC) or chimeric antigen receptor (CAR)-T cell specificity, to low-expressed tumor-restricted antigens can have significant therapeutic potential but require low or no receptor expression on critical normal cells (Figure 2B). Thus, individual antibody sequences with enrichment signatures matching those predicted to represent specificity for restricted receptors expressed at (1) >10⁶ molecules/target cell, (2) 10⁵–10⁶ molecules/target cell, or (3) <10⁵ molecules/target cell or being >5-fold upregulated on target compared with nontarget cells were quantified. Analysis of the complete sequence dataset indicated the presence of tens of thousands of unique antibody clones to the above three receptor categories (Figure 3A). The signature-guided analyses further indicated that sequences encoding antibodies specific to the most highly expressed receptors dominated binder pools (10²–10⁵ ppm/sequence) following two or more selection rounds. Conversely, antibodies to lower-expressed restricted receptors or upregulated receptors were rare (10⁻²–10 ppm/sequence). Interestingly, and in contrast to their low frequency, the number of antibody clones with indicated specificity for upregulated or target-restricted receptors with low expression exceeded 10⁴ by the same analyses (Figures 3A and 3B).
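The matching of experimental to predicted enrichment signatures can be sketched as a nearest-signature assignment on a logarithmic scale. The Python example below is a minimal illustration, not the authors' matching procedure: the distance measure, the frequency floor, the cutoff, and the predicted signatures (ppm after selections 1 to 4) are all hypothetical.

```python
import math


def log_distance(observed, predicted, floor=1e-3):
    """Root-mean-square difference between two signatures on a log10 scale."""
    diffs = [
        math.log10(max(o, floor)) - math.log10(max(p, floor))
        for o, p in zip(observed, predicted)
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def assign_category(observed, predicted_signatures, max_distance=1.0):
    """Return the closest predicted category, or None if nothing is close enough."""
    best = min(
        predicted_signatures,
        key=lambda name: log_distance(observed, predicted_signatures[name]),
    )
    if log_distance(observed, predicted_signatures[best]) > max_distance:
        return None
    return best


# Hypothetical predicted signatures: frequency (ppm) after selections 1-4.
PREDICTED = {
    "restricted >1e6": [10.0, 1e3, 1e4, 1e5],
    "restricted 1e5-1e6": [1.0, 50.0, 500.0, 5e3],
    "restricted <1e5 or upregulated": [0.1, 1.0, 5.0, 10.0],
}

clone_signature = [0.2, 0.8, 4.0, 15.0]  # hypothetical experimental clone
print(assign_category(clone_signature, PREDICTED))
```

Working in log space keeps rare clones (fractions of a ppm) from being swamped by the dominant ones, and the cutoff leaves clones that resemble no predicted signature unassigned.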
Figure 3.
iLaMA-predicted sequence enrichment profiles inform antibody specificity, enabling focused discovery of antibodies to differently expressed a priori unknown tumor cell-associated receptors
(A) Enrichment signatures showing antibody sequence frequencies after selections 1 to 4. Black lines show in silico-predicted signatures for antibodies targeting receptors expressed at 1 million and 100,000 copies/target cell with no expression on nontarget cells. By matching experimental signatures to predicted signatures, experimentally enriched antibodies were classified into three groups predicted to bind target receptors expressed at >1 million copies/cell (green, n = 179), 100,000–1 million copies/cell (purple, n = 2,698), or <100,000 copies/cell or upregulated receptors (blue, n = 94,429).
(B) The frequency of experimentally enriched antibodies to different receptor categories after selections 1–4. Gray shows antibodies predicted to not bind target cells or to bind receptors expressed at similar levels on target and nontarget cells.
(C) A subset of antibodies from each enrichment signature group (top panel) was produced and tested for binding to target and nontarget cells (bottom panel). The cross represents the geometric mean of receptor expression on target and nontarget cells. Inset: antibodies in the <100,000 or upregulated group further classified as binding upregulated or restricted low-expressed receptors based on comparative analyses of selections with nontarget cell competition (solid lines) and selections without nontarget cell competition (dotted lines). See also Figure S4.
As discussed above, upregulated and target-restricted receptors exhibit therapeutic potential as Fc-dependent and empowered antibodies, respectively. Since enrichment signatures of antibodies to these two types of receptors differed between selections with or without competition (Figures 2B and S4A), we performed comparative analysis of the total 94,429 sequences indicated by enrichment signatures to encode antibodies to low-expressed or upregulated receptors. This allowed a fraction of the antibodies to be classified into either category. Accordingly, 13,709 antibodies were indicated to be specific for low-expressed restricted receptors (increasing frequency in selections with competition compared with without), and 753 antibodies were indicated to be specific for upregulated receptors (decreasing frequency in the presence of competition) (Figure S4B). The remaining unclassified antibodies were less clearly affected by nontarget cell competition.
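The comparative rule described above (increasing frequency with competition indicating restricted low-expressed receptors, decreasing frequency indicating upregulated receptors) can be expressed as a short Python sketch. The function name, the use of a log2 ratio, and the 2-fold threshold are hypothetical choices for illustration; the text does not state which cutoff was applied.

```python
import math


def classify_by_competition(freq_with, freq_without, min_log2_change=1.0):
    """Classify a clone from its frequency with vs. without nontarget competition."""
    if freq_with <= 0 or freq_without <= 0:
        return "unclassified"  # a ratio cannot be formed without both observations
    change = math.log2(freq_with / freq_without)
    if change >= min_log2_change:
        return "restricted, low-expressed"
    if change <= -min_log2_change:
        return "upregulated"
    return "unclassified"


# Hypothetical clone frequencies (ppm) in matched selections.
for freq_with, freq_without in [(8.0, 1.0), (1.0, 8.0), (1.2, 1.0)]:
    label = classify_by_competition(freq_with, freq_without)
    print(f"with: {freq_with:4.1f}  without: {freq_without:4.1f}  -> {label}")
```

Clones whose frequency changes by less than the threshold stay unclassified, mirroring the observation that many antibodies were less clearly affected by nontarget cell competition.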
Predicted sequence enrichment profiles inform antibody specificity and enable the discovery of antibodies to a priori unknown tumor-cell-associated receptors
Our collective data thus indicated that prediction-guided selections generated a highly diversified pool of antibodies to a priori unknown differentially expressed surface receptors covering a broad expression range. Furthermore, enrichment signatures identified unparalleled numbers (∼10⁵ compared with <10³) of antibodies to therapeutically and diagnostically relevant receptors. To understand its relevance for such antibody discovery, a subset of antibody sequences (n = 102) indicated by enrichment signatures to encode antibodies specific for restricted receptors expressed at (1) >10⁶ molecules/cell (n = 8), (2) 10⁵–10⁶ molecules/cell (n = 16), (3) <10⁵ molecules/cell or target cell upregulated receptors (n = 67), or (4) antibodies to similarly expressed receptors (<5-fold difference) or nonreceptor specific antibodies (n = 11) were selected for production in full-length immunoglobulin (Ig) G format and quantitation of antibody-targeted receptor expression (Figure 3C). This analysis showed that 93% (85 out of 91) of the antibodies in groups 1–3 were specific to differentially expressed receptors on target cells. Equally essential and demonstrating the power of this prediction-guided discovery approach, 9 out of 11 (82%) of the antibodies in group 4 did not bind target cells. These antibody clones showed apparent enrichment between selection rounds 1 and 2 yet were indicated by enrichment signatures to lack specificity for medically relevant receptors. With existing target-agnostic methods, which do not use iLaMA predictive enrichment signatures but rather pick clones randomly or based on the greatest enrichment during selection, these medically trivial antibodies would have been selected for time- and resource-consuming production and downstream screening at the cost of medically relevant clones (e.g., antibodies specific to lower expressed biomolecules).
The above results demonstrated that iLaMA-predicted enrichment signatures could be used with high accuracy to identify which antibody sequences encode specificity to medically relevant or trivial receptors. Consistent with enrichment signatures informing antibody specificity, predicted and experimentally determined receptor expression correlated well (Figure 3C). Antibodies in group 1 were predicted to bind receptors with >10⁶ copies/cell, and the experimentally determined receptor expression was 1.4 × 10⁶ (7.3 × 10⁵ to 2.6 × 10⁶) (geometric mean [95% confidence interval]). Group 2 was predicted to bind receptors with expression levels between 10⁵ and 10⁶, and the experimentally determined number was 3.6 × 10⁵ (1.0 × 10⁵ to 1.3 × 10⁶). Finally, group 3 was predicted to bind restricted receptors with expression levels <10⁵ or upregulated receptors. To help separate these different categories of antibodies, we performed comparative analyses of group 3 enrichment signatures from selections with or without nontarget cell competition. The studies revealed sequences that showed either decreasing (7 out of 67) or increasing (28 out of 67) frequency with nontarget cell competition compared with without nontarget cell competition (Figure 3C, inset). Consistent with decreasing frequency indicating antibody specificity for upregulated receptors, the experimentally determined receptor number on target cells and nontarget cells for antibodies with decreasing frequency was 5.4 × 10⁵ (1.4 × 10⁵ to 2.2 × 10⁶) and 1.3 × 10⁴ (2.3 × 10³ to 7.9 × 10⁴), respectively. Conversely, and consistent with an increasing frequency indicating antibody specificity for restricted low-expressed receptors, the experimentally determined receptor number on target and nontarget cells for antibodies with increasing frequency was 3.0 × 10⁴ (1.3 × 10⁴ to 7.1 × 10⁴) and undeterminable, the latter since 22 out of 28 antibodies bound receptors with nontarget cell expression below the detection limit (Figure 3C, inset).
While, overall, antibody specificity could be predicted by the enrichment signatures, the specificity of some individual antibodies deviated from predictions (Figure 3C). For example, only group 1 (green, left panel) was predicted to contain antibodies specific for receptors expressed at >1 million copies/target cell with no expression on nontarget cells. However, all groups contained a fraction of antibodies with this expression profile. Since our modeling data indicated differential enrichment of antibodies with varying affinity to the same receptor (or receptors with the same expression level) and naive antibody libraries contain antibodies of varying affinity, we next analyzed how antibody affinity affected antibody enrichment signatures. Antibodies with similar determined epitopes (i.e., similar receptor expression levels on target cells) were divided into two groups according to their >3 nM or <3 nM half-maximal effective concentration (EC50) values for binding to endogenously expressed receptors, and their enrichment during selection was compared. Consistent with our modeling (Figure S3A), for both monitored target expression levels (1–4 million copies per cell, n = 22, and 300,000–1 million copies per cell, n = 10), we found that the frequencies of antibodies with an EC50 value <3 nM were higher than the frequencies of antibodies with an EC50 value >3 nM. For antibodies to receptors with 1–4 million copies per cell, this difference was statistically significant (p < 0.05) after selections 2, 3, and 4 (Figure S4C).
In summary, prediction-guided discovery enabled the identification of unprecedented numbers of antibodies binding to differentially expressed surface receptors of distinct potential therapeutic or diagnostic value (Figure 4).
Figure 4.
iLaMA target-agnostic discovery of antibodies to diverse disease-associated biomolecules
iLaMA enhances the quantitative output of antibodies from target-agnostic discovery by orders of magnitude (10⁵ compared with <10³). This computational prediction-based approach further allows focused discovery of antibodies by intended therapeutic or diagnostic application(s). For example, preferential identification of antibody clones with potential in therapeutic development as naked blocking IgGs, or empowered antibody-drug conjugates (ADCs), is enabled through comparative signature analysis and picking of clones with indicated specificity for any disease-associated biomolecules, or low-expressed biomolecules expressed only in disease-associated samples, tissues, or cells, respectively. iLaMA enables rational PDD of antibodies according to their indicated therapeutic potential.
Discussion
Here, we describe a target-agnostic antibody generation methodology that enables the rational identification of antibodies to a priori unknown differentially expressed cell surface receptors. Through computational prediction of antibody enrichment, this methodology enabled the discovery of antibodies to biomolecules expressed throughout therapeutically and diagnostically relevant ranges and identified orders of magnitude greater numbers of antibodies than existing technologies (Figure 4). As such, our computational science-based discovery overcomes several limitations of the current state-of-the-art selection and screening methodology and has the potential to transform biological drug, target, and biomarker discovery.
Predictions will be beneficial to drug developers interested in biologics PDD (for a recent review on PDD, see Moffat et al.33). In PDD, small molecules or antibodies from large molecular libraries are screened for functional activity (e.g., inhibition of proinflammatory cell cytokine release or induction of tumor cell death) without prior knowledge of their molecular targets. Consequently, PDD enables the discovery of the most functional molecules and antibodies across multiple receptors, epitopes, and disease-associated pathways. PDD is a well-validated strategy for first-in-class small-molecule drug discovery,6,34,35 and we and others have used PDD to identify first-in-class antibodies to, e.g., CD52,8 ICAM-1,9 CD32b,10,11 and TNFR2,12 several of which are currently in clinical development.36,37,38,39,40 By combining prediction-based antibody discovery with appropriate functional screening11,41 and our recently described high-throughput CRISPR-based method for target deconvolution,42 biologic PDD can be taken to the next level, directly on par with small-molecule PDD.
A key feature of prediction-based discovery relevant to diagnostic and therapeutic applications is its ability to identify antibodies to poorly expressed disease-associated molecules. Antibodies to low-expressed tumor-restricted antigens and rare disease-associated configurational epitopes may have significant therapeutic potential when developed as empowered biologics such as CAR-T cells or ADCs. Our observation that antibodies to low-expressed and upregulated receptors are present at a very low frequency (one or fewer clones per million) in selected pools is consistent with the observed shortcoming of existing methods to generate such antibodies. In the absence of integrated computational modeling and informed enrichment signatures in data generated by massively parallel sequencing, robust identification of these specificities would require production and labor-intensive cell-based screening of millions of antibody clones (Figures 3 and 4).
Antibodies to low-expressed receptors are also of value in diagnostics. While liquid biopsies can be easily sampled, they typically contain biomarkers in very low concentrations. Technologies that help identify and quantitate rare disease-associated biomarkers are therefore instrumental to pursuing an earlier diagnosis and personalized medicine.
In this study, we used our computational modeling approach to antibody discovery to identify antibodies that discriminate one cell type from another. Accordingly, the methodology can be used to isolate antibodies to overexpressed targets on, e.g., primary tumor cells and tumor-infiltrating lymphocytes compared with healthy samples, or resistant cancer cells compared with drug-sensitive cancer cells. However, the method is also broadly applicable to other complex antigen systems (e.g., blood, urine, cerebrospinal fluid,43 tissue,44 bacteria,45 or viruses46) relevant to diverse inflammatory, immunological, neurological, and infectious diseases, as well as cancer. A currently highly relevant application is identifying antibodies to crucial virulence factors (e.g., adhesive glycoproteins of pandemic microorganisms47) and their receptors on host cells. Identifying such antibodies and their associated molecular targets can generate both passive (antibody-based) and active (vaccine) immunotherapies to help treat and prevent drug-resistant infections.
A final key advantage of our prediction-based discovery approach relative to other discovery methodologies (e.g., gene expression-based or proteomics-based approaches) is the parallel discovery of target molecules and of candidate therapeutic or diagnostic human antibodies against these targets, which may be of a composite or processed nature (e.g., oxidized low-density lipoprotein30,48,49) and may include disease-associated epitopes and configurations.
Regarding limitations, although antibody enrichment signatures correlate well with receptor expression levels on target and nontarget cells, this is not the case for all individual antibody clones (Figure 3C). Antibody enrichment is affected by several factors that are incompletely considered or not considered by the model. These include varying affinities (individual antibody affinities may deviate significantly from the Kd value used to model antibody enrichment according to targeted molecules’ expression levels, as shown in Figure S3), accessibility of the targeted epitope, and phage amplification in bacteria. The library used in this study was generated using a single framework. Hence, only the complementarity-determining regions vary between the antibodies. This minimizes sequence composition differences and associated growth bias during phage amplification. However, such differences may have a bigger impact when the method is applied to libraries with different frameworks. Despite these limitations, predictions enabled the preferential discovery of antibodies with the desired specificity.
Another limitation is the cost of antibody production. Synthesizing thousands of antibody genes discovered by the method is still associated with a considerable cost. By focusing on a particular class of antibodies, as informed by enrichment signatures, the number of antibodies to test can be dramatically reduced (Figure 4). Additionally, as new, more efficient, cost-effective gene synthesis methods are developed, more antibodies can be tested.
In this work, we used the Illumina platform to sequence the antibody pools. While this allows high-accuracy deep sequencing of CDRH3 (here, a median of 25 million usable reads/sample), the read length is too short to cover the entire antibody sequence. Recently, the accuracy and yield of long-read sequencing have approached those of short-read sequencing.50 Implementation of long-read sequencing would make it possible to distinguish different antibodies with the same CDRH3 in the enrichment signatures and eliminate the need for a second sequencing strategy to generate the full antibody sequence for gene synthesis of the selected antibody clones.
In conclusion, and with reference to PDD, we expect prediction-based discovery to shift the current bottleneck from identifying many antibodies to all differentially expressed, disease-associated biomolecules to developing efficient antibody production and high-throughput clinically predictive functional assays, which allow for the screening of thousands of clones. We recently described functional assays that allow for high-throughput screening of antibody-mediated apoptosis and ADCC using primary patient cancer cells and immune effector cell lines.11 With current instrumentation, these assays could easily be adapted for orders-of-magnitude-greater throughput and use with lower numbers of primary human cells. Encouraging advances in high-throughput antibody characterization will help prioritize clones with unique or highly desired characteristics (e.g., high affinity18 or different MoA41) for testing with primary patient material in the most clinically relevant assays. Such advancements will hopefully lead the way toward a broader range of even more effective therapeutics for patients.
Limitations of the study
The method described here enables the target-agnostic generation of a large panel of antibodies to a broad range of differentially expressed antigens. In this study, we used the phage display library n-CoDeR to generate antibodies targeting the cancer cell line DU145. While the method should be applicable to other libraries, some parameters, such as mean affinity and the number of copies of individual antibodies in the unselected library, will have to be adjusted. Further, while additional sources of complex targets can be screened, it is important that the target and nontarget populations stay consistent throughout the selections. If liquid samples such as blood are used, the method used to capture soluble components must be reproducible. Reference targets with known expression levels in target and nontarget samples can be included to ensure proper modeling. While the method helps design display selections and guide the discovery of antibodies with the desired specificity, enrichment signatures of individual antibody clones may deviate from those modeled due to, e.g., varying affinity, target epitope immunogenicity, and phage amplification in bacteria.
STAR★Methods
Key resources table
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | | |
| APC Mouse anti-Human CD54 | BD Biosciences | Cat# 559771; RRID:AB_398667 |
| APC Mouse anti-Human CD44 | BD Biosciences | Cat# 559942; RRID:AB_398683 |
| PE Mouse anti-Human EGF Receptor | BD Biosciences | Cat# 555997; RRID:AB_396281 |
| PE Mouse anti-Human HER-2/Neu | BD Biosciences | Cat# 340552; RRID:AB_400055 |
| PE Mouse anti-Human ROR1 | BD Biosciences | Cat# 564474; RRID:AB_2738822 |
| PE Mouse anti-Human CD40 | BD Biosciences | Cat# 555589; RRID:AB_395964 |
| PE Mouse anti-Human CD130 | BD Biosciences | Cat# 555757; RRID:AB_396098 |
| PE Mouse anti-Human CD55 | BD Biosciences | Cat# 561901; RRID:AB_10893598 |
| PE CD59 Antibody, anti-human, REAfinity™ | Miltenyi | Cat# 130-120-048; RRID:AB_2751973 |
| APC Mouse anti-Human CD71 | BD Biosciences | Cat# 551374; RRID:AB_398500 |
| His Tag Alexa Fluor® 647-conjugated Antibody | R&D Systems | Cat# IC0501R |
| Mouse anti-FLAG M2, AP-conjugated | Sigma-Aldrich | Cat# A9469; RRID:AB_439699 |
| Goat anti-human-Fc, APC conjugated | Jackson Immunoresearch | Cat# 109-136-098; RRID:AB_2337693 |
| Bacterial and virus strains | | |
| E. coli HB101F′ | In house produced | N/A |
| E. coli Top10 | Thermo Fisher Scientific | Cat# C404010 |
| R408 Helper phage | Agilent Technologies | Cat# 200252 |
| Chemicals, peptides, and recombinant proteins | | |
| ICAM-1 | R&D Systems | Cat# 720-IC |
| CD44 | Sino Biological | Cat# 12211-H08H |
| EGFR | Sino Biological | Cat# 10001-H08H |
| HER2 | Sino Biological | Cat# 10004-H08H |
| ROR1 | Sino Biological | Cat# 13968-H08H |
| CD40 | In house produced | N/A |
| CD130 | Sino Biological | Cat# 10974-HCCH |
| CD55 | Sino Biological | Cat# 10101-H08H |
| CD59 | Sino Biological | Cat# 12474-H08H |
| CD71 | Sino Biological | Cat# 11020-H07H |
| Experimental models: Cell lines | | |
| Human: DU 145 | ATCC | Cat# HTB-81 |
| Human: Jurkat, Clone E6-1 | ATCC | Cat# TIB-152 |
| Human: HEK293 EBNA | ATCC | Cat# CRL-10852 |
| Oligonucleotides | | |
| Forward MiSeq1: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTttccctgagactctcctgtgcagcctctggattcacctt-3′ (upper case: sequencing adaptors; lower case: scFv primers) | This paper | N/A |
| Forward MiSeq2, NextSeq: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtagagccgaggacactgccgtgtattactgt-3′ | This paper | N/A |
| Reverse MiSeq1, NextSeq: 5′-CAAGCAGAAGACGGCATACGAGAT – 10nt Index – GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcgctgctcacggtgaccagtgtaccttggcccca-3′ | This paper | N/A |
| Reverse MiSeq2: 5′-CAAGCAGAAGACGGCATACGAGAT – 10nt Index – GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgtcagcttggttcctccgccgaa-3′ | This paper | N/A |
| Software and algorithms | | |
| FlowJo v10.7.2 | FlowJo, LLC | https://www.flowjo.com/solutions/flowjo |
| GraphPad Prism 9.5.1 | GraphPad | https://www.graphpad.com |
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Björn Frendéus (bjorn.frendeus@bioinvent.com).
Materials availability
Antibodies generated in this study will be made available on request under a completed materials transfer agreement.
Experimental model and subject details
Tissue culture
Human prostate carcinoma cell line DU145 (ATCC) was cultured in MEM (Gibco) containing 10% fetal bovine serum (FBS, Gibco), 1 mM sodium pyruvate (Gibco), and 1× MEM Non-Essential Amino Acids Solution (Gibco). Acute T cell leukemia cell line Jurkat (clone E6-1, ATCC) was cultured in RPMI-1640 with GlutaMAX (Gibco) containing 10% FBS and 1 mM sodium pyruvate. The cells were grown at 37°C in a humidified atmosphere with 5% CO2.
In-house suspension adapted HEK 293 EBNA (ATCC) was cultured in Freestyle293 medium (Thermo Fisher Scientific) supplemented with 10% Pluronic F-68 (Thermo Fisher Scientific) at +37°C, 8% CO2, 300 rpm in a humidified atmosphere.
Method details
Derivation of iLaMA equations
To optimize the selection conditions, the recovery of antibodies to different categories of target biomolecules was modeled in silico. Following experimental selections, the actual experimental parameters were inserted into the equations, and predicted enrichment signatures were modeled.
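According to the law of mass action, the interaction between an antibody (A), its target biomolecule (B), and their complex (AB) is given by the equilibrium interaction
$$A + B \rightleftharpoons AB$$
with the equilibrium dissociation constant or affinity
$$K_d = \frac{[A][B]}{[AB]}$$
The equilibrium interaction between (A) and (B) may be described as
$$\mathrm{fA} + \mathrm{fB} \rightleftharpoons \mathrm{bA}$$
with
$$K_d = \frac{[\mathrm{fA}][\mathrm{fB}]}{[\mathrm{bA}]}$$
(Equation 1)
The total A or B ([A] or [B]) is the sum of free and bound A or B, i.e.,
$$[A] = [\mathrm{fA}] + [\mathrm{bA}], \qquad [B] = [\mathrm{fB}] + [\mathrm{bA}]$$
Therefore, in Equation 1, replacing [fA] by [A]-[bA] and [fB] by [B]-[bA] gives
$$K_d = \frac{([A] - [\mathrm{bA}])([B] - [\mathrm{bA}])}{[\mathrm{bA}]}$$
(Equation 2)
which is rearranged to form
$$[\mathrm{bA}]^2 - ([A] + [B] + K_d)[\mathrm{bA}] + [A][B] = 0$$
This equation has the solution
$$[\mathrm{bA}] = \frac{[A] + [B] + K_d}{2} \pm \sqrt{\frac{([A] + [B] + K_d)^2}{4} - [A][B]}$$
where the negative root is the relevant one
$$[\mathrm{bA}] = \frac{[A] + [B] + K_d}{2} - \sqrt{\frac{([A] + [B] + K_d)^2}{4} - [A][B]}$$
Substituting concentrations for number of antibodies/the number of molecules per mole (NA)/volume (V) yields
$$\frac{\mathrm{bA}}{N_A V} = \frac{\frac{A}{N_A V} + \frac{B}{N_A V} + K_d}{2} - \sqrt{\frac{\left(\frac{A}{N_A V} + \frac{B}{N_A V} + K_d\right)^2}{4} - \frac{A B}{(N_A V)^2}}$$
or simplified
$$\mathrm{bA} = \frac{A + B + K_d N_A V}{2} - \sqrt{\frac{(A + B + K_d N_A V)^2}{4} - A B}$$
(Equation 3)
where
bA = number of biomolecule-bound antibodies
A = total number of antibodies A
B = total number of target biomolecules B
Kd = antibody affinity (M)
V = the reaction volume (dm³)
NA = Avogadro's constant (6.022 × 10²³ molecules mol⁻¹)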
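As a minimal sketch of Equation 3, the number of bound antibodies can be computed directly from A, B, Kd, and V. The function name `bound_antibodies` and the example inputs (in particular the 1 mL reaction volume) are illustrative assumptions, not values taken from this study. The negative root is evaluated in the algebraically equivalent form 2AB/(s + √(s² − 4AB)), with s = A + B + Kd·NA·V, an implementation choice made here to limit floating-point cancellation when A·B is much smaller than s².

```python
import math

AVOGADRO = 6.022e23  # molecules per mole (NA in Equation 3)


def bound_antibodies(a_total, b_total, kd_molar, volume_dm3):
    """Number of biomolecule-bound antibodies bA at equilibrium (Equation 3).

    a_total: total number of antibodies A
    b_total: total number of target biomolecules B
    kd_molar: antibody affinity Kd (M)
    volume_dm3: reaction volume V (dm3)
    """
    if a_total == 0 or b_total == 0:
        return 0.0
    s = a_total + b_total + kd_molar * AVOGADRO * volume_dm3
    # s^2 - 4AB is non-negative in exact arithmetic; clamp against rounding.
    disc = max(0.0, s * s - 4.0 * a_total * b_total)
    # Negative root of bA^2 - s*bA + A*B = 0, identical to
    # s/2 - sqrt(s^2/4 - A*B) but without subtracting near-equal numbers.
    return 2.0 * a_total * b_total / (s + math.sqrt(disc))


# Hypothetical inputs: 200 antibody copies, 5e10 biomolecules, Kd = 10 nM, V = 1 mL.
example = bound_antibodies(200, 5e10, 10e-9, 1e-3)
```

With these assumed inputs, Kd·NA·V dominates A + B, so only a small fraction of the 200 antibody copies is bound at equilibrium.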
If the target and nontarget cells are mixed, the total number of biomolecules (B) will be:
$$B = C_T B_T + C_N B_N$$
where
CT = the number of target cells
CN = the number of nontarget cells
BT = the number of biomolecules per target cell
BN = the number of biomolecules per nontarget cell
The number of antibodies A bound to biomolecules B on target cells at equilibrium (bAT) will be equal to the total number of bound antibodies on target and nontarget cells multiplied by the ratio between biomolecules on target cells and the total number of biomolecules (biomolecules on both target and nontarget cells):
$$\mathrm{bA_T} = \mathrm{bA} \times \frac{C_T B_T}{C_T B_T + C_N B_N}$$
(Equation 4)
Furthermore, the combination of Equations 3 and 4 yields
$$\mathrm{bA_T} = \left( \frac{A + C_T B_T + C_N B_N + K_d N_A V}{2} - \sqrt{\frac{(A + C_T B_T + C_N B_N + K_d N_A V)^2}{4} - A\,(C_T B_T + C_N B_N)} \right) \times \frac{C_T B_T}{C_T B_T + C_N B_N}$$
(Equation 5)
Since not all antibodies are recovered after selection, the number of recovered antibodies (rAT) is given by:
$$\mathrm{rA_T} = \mathrm{bA_T} \times E \times Y$$
(Equation 6)
where
E = the fraction of antibodies eluted from target cells
Y = the fraction of recovered target cells
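The chain from Equation 3 to Equation 6 can be sketched as a single function. This is an illustration under stated assumptions: BT and BN are read as copies per target and per nontarget cell, Equation 6 is read as rAT = bAT × E × Y, and the function name and all example inputs are hypothetical.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def recovered_on_target(a_total, kd_molar, volume_dm3,
                        c_t, b_t, c_n, b_n, e=0.5, y=0.5):
    """Recovered antibodies rAT for one selection (Equations 3 to 6).

    c_t, c_n: numbers of target and nontarget cells
    b_t, b_n: biomolecule copies per target and per nontarget cell (assumed reading)
    e, y: fraction of antibodies eluted and fraction of target cells recovered
    """
    b_target = c_t * b_t
    b_total = b_target + c_n * b_n            # total number of biomolecules B
    if a_total == 0 or b_total == 0:
        return 0.0
    s = a_total + b_total + kd_molar * AVOGADRO * volume_dm3
    disc = max(0.0, s * s - 4.0 * a_total * b_total)
    ba = 2.0 * a_total * b_total / (s + math.sqrt(disc))   # Equation 3, stable form
    ba_t = ba * b_target / b_total            # Equation 4: share bound on target cells
    return ba_t * e * y                       # Equation 6


# Hypothetical comparison: a target-restricted receptor versus the same receptor
# also expressed on a 1,000-fold excess of nontarget cells.
restricted = recovered_on_target(200, 10e-9, 1e-3, 1e7, 5000, 1e10, 0)
competed = recovered_on_target(200, 10e-9, 1e-3, 1e7, 5000, 1e10, 5000)
```

In this toy comparison, expression on the nontarget cells lowers the number of antibodies recovered from target cells, which is the effect the nontarget cell competition is meant to exploit.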
In the n-CoDeR library, the average copy number of each antibody is 2,000, with a 10% display level. Hence, A in selection 1 was estimated to be 200. In subsequent selections, A was calculated as the number of recovered antibodies (rAT) in the previous selection multiplied by the amplification factor. The amplification factor was experimentally determined from titrated phage numbers (or estimated to be 10,000, 100,000, and 10,000 between selections 1–2, 2–3, and 3–4, respectively, during selection optimization). E was estimated to be 0.5, and Y was experimentally determined (or estimated to be 0.5 during selection optimization).
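The bookkeeping of A across selections described above can be sketched as a short loop. The per-round recovery is passed in as a caller-supplied function (for example, an implementation of Equation 6), so the sketch only illustrates the recurrence A(next) = rAT × amplification factor. The names `simulate_rounds` and `recover`, and the toy recovery rule, are assumptions made for illustration.

```python
def simulate_rounds(recover, amplification_factors,
                    copies_per_antibody=2000, display_level=0.10):
    """Track A and rAT over consecutive selections.

    recover: callable (round_index, a_total) -> rAT for that selection
    amplification_factors: factors applied between consecutive selections
    Returns a list of (A, rAT) tuples, one per selection.
    """
    a_total = copies_per_antibody * display_level   # 2,000 copies x 10% display = 200
    history = []
    for round_index in range(len(amplification_factors) + 1):
        recovered = recover(round_index, a_total)
        history.append((a_total, recovered))
        if round_index < len(amplification_factors):
            # A in the next selection = rAT of this selection x amplification factor
            a_total = recovered * amplification_factors[round_index]
    return history


# Toy recovery rule (hypothetical): 0.1% of the input antibodies are recovered per round.
# Amplification factors are the estimates used during selection optimization.
toy = simulate_rounds(lambda i, a: 0.001 * a, [10_000, 100_000, 10_000])
```

Passing the recovery rule as a function keeps the recurrence separate from the binding model, so experimentally determined amplification factors and recoveries can be substituted for the estimates.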
In this study, we optimized the selection conditions to allow the identification of antibodies against 1) biomolecules with ≥5,000 copies per target cell and no expression on nontarget cells and 2) biomolecules upregulated at least five times (compared to nontarget cells) and expressed at ≥200,000 copies on target cells.
The experimentally observed antibody enrichment profiles were matched to in silico-generated enrichment profiles to group antibodies according to predicted specificities for defined categories of biomolecules. Each enrichment profile consisted of the experimentally determined, or in silico calculated, antibody frequencies in pools from selections 1 to 4.
For in silico modeled enrichment profiles, the experimental selection parameters were implemented in Equation 6 above to simulate the recovery of antibodies (rAT) to different biomolecule categories. The estimated frequency of a phage antibody that was selected in an antibody- and biomolecule-dependent manner in the phage pool (FAT) was then calculated as
$$F_{A_T} = \frac{\mathrm{rA_T}}{\sum \mathrm{rA_T}} \times \mathrm{HR}$$
(Equation 7)
where
FAT = frequency of recovered antibodies specific for a given type of target cell biomolecule
rAT = number of recovered antibodies specific for a given type of target cell biomolecule
ΣrAT = total number of recovered antibodies specific for all types of target cell biomolecules
HR = hit rate, the fraction of phage antibodies that have been enriched in an antibody-dependent, target cell biomolecule-dependent manner. The HR was experimentally determined as described below.
ΣrAT equals the total number of antibodies binding to either category of target cell differentially expressed biomolecules. In this study, phage-antibody binding to biomolecules expressed throughout the experimentally determined expression range (5×10³ to 1×10⁶ copies/target cell) was modeled, assuming the same median affinities (Kd = 10 nM) and the same number of antibodies specific for different categories of biomolecules, being present in the unselected naive antibody library.
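A minimal sketch of the frequency calculation follows, taking Equation 7 to be FAT = rAT / ΣrAT × HR, as the symbol definitions suggest. The function name, the category labels, and the rAT values are hypothetical. Only the 0.09% hit rate is taken from the text, where it is reported for selection 1.

```python
def pool_frequencies(recovered_by_category, hit_rate):
    """Equation 7 (as read here): F_AT = rAT / sum(rAT) * HR per biomolecule category."""
    total = sum(recovered_by_category.values())
    if total == 0:
        return {name: 0.0 for name in recovered_by_category}
    return {name: r_at / total * hit_rate
            for name, r_at in recovered_by_category.items()}


# Hypothetical rAT values for two biomolecule categories, HR = 0.09%.
example = pool_frequencies({"restricted": 1.0, "upregulated": 3.0}, 0.0009)
```

By construction, the modeled frequencies sum to the hit rate, so the non-specific remainder of the phage pool is accounted for by 1 − HR.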
Definition of hypothetical biomolecules
To enable computational modeling, hypothetical biomolecules were defined by their distinct (theoretical) expression levels in target and nontarget samples, ranging from the lowest to the highest estimated expression. The upper limit of biomolecule expression to model was estimated to be 1,000,000 copies per target cell. This upper limit was determined by measuring the number of receptors per cell targeted by the five most highly enriched antibodies during selection, using fluorochrome-conjugated antibodies and quantification beads (Bang Laboratories, 815B) according to the manufacturer’s instructions. The lower limit of biomolecule expression to model was estimated to be 5,000 receptors per target cell. The selections were optimized so that antibodies to receptors with lower expression levels would not be retrieved and hence would not affect the enrichment profiles of other antibodies. Hypothetical biomolecules in this range (5,000–1,000,000), with expression levels increasing in approximately 5-fold steps and comprising target sample-restricted biomolecules, upregulated biomolecules, and biomolecules similarly expressed in target and nontarget samples, were defined. The numbers of (hypothesized) biomolecules in target and nontarget samples were kept constant during modeling.
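One way such a grid of hypothetical biomolecules could be laid out is sketched below. The exact expression levels used in the study are not listed in the text, so the 5-fold geometric steps capped at the upper limit, the 5-fold difference used for "upregulated" biomolecules, and all names are assumptions for illustration only.

```python
def hypothetical_biomolecules(low=5_000, high=1_000_000, step=5, fold_up=5):
    """Return (category, copies per target cell, copies per nontarget cell) tuples."""
    levels = []
    level = low
    while level < high:
        levels.append(level)
        level *= step
    levels.append(high)                      # cap the grid at the upper limit
    grid = []
    for copies in levels:
        grid.append(("restricted", copies, 0))                    # no nontarget expression
        grid.append(("upregulated", copies, copies // fold_up))   # assumed 5-fold lower on nontarget
        grid.append(("similar", copies, copies))                  # same on both cell types
    return grid


grid = hypothetical_biomolecules()
```

Each tuple can then be fed to the recovery model as fixed BT and BN values, in line with the statement that biomolecule numbers were kept constant during modeling.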
iLaMA optimization of selection conditions
The numbers of target and nontarget cells needed to recover and enrich 10 nM antibodies against 1) receptors with ≥5,000 copies per target cell and no expression on nontarget cells and 2) receptors upregulated at least five times on target cells compared to nontarget cells and with ≥200,000 copies on target cells were calculated using Equation 6. The total number of antibodies A in selection 1 was estimated to be 200 since the average copy number of each antibody in the n-CoDeR library is 2,000, with a 10% display level. In subsequent selections, A was calculated as the number of recovered antibodies (rAT) in the previous selection multiplied by the amplification factor. Amplification factors of 10,000, 100,000, and 10,000 between selections 1–2, 2–3, and 3–4, respectively, were used in calculations. The fraction of antibodies eluted from target cells E and the fraction of recovered target cells Y were estimated to be 0.5.
iLaMA-guided cell selections
For selections with nontarget cell competition, DU145 target cells were harvested, washed, biotinylated with EZ-Link Sulfo-NHS-SS-Biotin (Thermo Fisher Scientific), and labeled with anti-biotin microbeads (Miltenyi Biotec) according to the manufacturer’s instructions. Labeled target cells (10, 2.5, 5, or 5 million cells in selections 1, 2, 3, and 4, respectively) were mixed with an approximately 1,000-fold excess of Jurkat nontarget cells and incubated with the n-CoDeR scFv phage display library26 (BioInvent) at +4°C overnight on a rocking platform. The cell-phage mixture was loaded on a MACS column followed by extensive washing. After washing, bound phages were eluted with 1 mg/mL trypsin (Sigma-Aldrich) for 30 min at room temperature before inactivation with 0.2 mg/mL aprotinin (Sigma-Aldrich). Exponentially growing E. coli HB101F′ (in-house constructed from E. coli HB101, Thermo Fisher Scientific) were infected with the eluted phages, spread on selective agar plates, and incubated overnight at 30°C. Colonies were pooled and cultivated using R408 (Agilent Technologies) as a helper phage to produce amplified phage pools for use in consecutive selections. Eluted and amplified phage pools were titrated by infection of exponentially growing E. coli HB101F′ with serial dilutions of the collections. The bacteria were spread on selective agar plates and incubated overnight at 37°C before counting the resulting colonies. The amplification factor was then calculated as the number of phages used for selection divided by the number of phages eluted in the previous selection.
In selections without nontarget cell competition, phages were incubated with 10, 2.5, 5, or 5 million DU145 cells in selections 1, 2, 3, and 4, respectively, for 4 h at +4°C on a rocking platform. Cells were washed with PBS four times before phages were recovered and processed as described above.
Hit rate determination
Phagemid DNA was purified (Miniprep kit, Qiagen), and genes encoding scFv were ligated into a protein expression vector (BioInvent) used to transform chemically competent E. coli Top10 (Thermo Fisher Scientific). Transformed bacteria were spread on selective agar plates, and single colonies were picked and used for production of soluble scFv in 96-well microtiter plates. Between 300 and 1,100 scFv-containing supernatants from each selection were filtered (0.45 μm Millipore), and 25 μL/well was added to a 1:1 mixture of DU145 cells and CellTrace CFSE-labeled (Thermo Fisher Scientific) Jurkat cells (50,000 cells in 25 μL PBS + 0.5% BSA/well) and left to bind for 1 h at +4°C. After washing, scFv binding to live cells (eBioscience Fixable Viability Dye eFluor 780, Thermo Fisher Scientific) was detected using anti-His-AF647 (R&D Systems) and analyzed by flow cytometry (iQue, Intellicyt Sartorius FortCyt v 8.0). The hit rate was determined as the fraction of analyzed clones with a mean fluorescence intensity on target cells at least three times higher than an isotype control. The analysis was performed for selections 1–4 with nontarget cell competition and selections 2–4 without nontarget cell competition. The hit rate for selection 1 without competition was estimated to be 0.09%, the same as for selection 1 with competition.
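The hit rate criterion above can be sketched as a simple threshold count. The function name and the example fluorescence values are hypothetical. The three-fold cutoff relative to the isotype control is the only parameter taken from the text.

```python
def hit_rate(target_mfi_by_clone, isotype_mfi, fold=3.0):
    """Fraction of analyzed clones whose target-cell MFI is >= fold x isotype control MFI."""
    if not target_mfi_by_clone:
        raise ValueError("no clones analyzed")
    hits = sum(1 for mfi in target_mfi_by_clone.values()
               if mfi >= fold * isotype_mfi)
    return hits / len(target_mfi_by_clone)


# Hypothetical MFI values for four clones and an isotype control MFI of 100.
example = hit_rate({"c1": 100, "c2": 350, "c3": 300, "c4": 20}, 100)
```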
iLaMA calculation of predictive signatures
In silico predicted enrichment signatures, defined as the expected antibody frequencies observed over four consecutive selection rounds of 10 nM antibodies targeting receptors with varying expression profiles, were calculated using Equations 6 and 7. The total number of antibodies A in selection 1 was set to 200 since the average copy number of each antibody in the n-CoDeR library is 2,000 with a 10% display level. In subsequent selections, A was calculated as the number of recovered antibodies (rAT) in the previous selection multiplied by the amplification factor (experimentally determined). The fraction of antibodies eluted from target cells E was estimated to be 0.5, and the fraction of recovered target cells Y was experimentally determined. Phage-antibody binding to biomolecules expressed throughout the experimentally determined expression range (5×10³ to 4×10⁶ copies/target cell) was modeled, assuming the same median affinities (Kd = 10 nM) and the same number of antibodies specific for different categories of biomolecules, being present in the unselected naive antibody library.
Sequence library preparation and Illumina sequencing
Phagemid DNA was purified from enriched phage pools using the QIAprep Spin Miniprep Kit (Qiagen). One-step PCR using PfuUltra II Fusion HS DNA Polymerase (Agilent) was performed to amplify scFv encoding genes from phagemid DNA and attach Illumina adaptors and indexes to the samples. The reaction volume was 50 μL/sample, with 50 ng template and 0.2 μM of each primer. Samples for MiSeq sequencing were amplified using two primer pairs: the first covering CDR-H1, CDR-H2, and CDR-H3, and the second covering CDR-L1, CDR-L2, CDR-L3, and CDR-H3. Samples for NextSeq sequencing were amplified using a primer pair that covers CDR-H3. Reverse primers include a 10-bp index sequence to facilitate multiplexing. Primer sequences are listed in the key resources table. PCR amplification was carried out with the following conditions: 95°C/2 min; 12 cycles of 95°C/20 s, 62°C/30 s, 72°C/30 s; followed by 72°C/3 min. PCR products were purified from a 2% agarose gel (MinElute Gel Extraction Kit, Qiagen), quantified (Qubit dsDNA HS Assay Kit, Thermo Fisher Scientific), and analyzed for purity and size on an Agilent 2100 Bioanalyzer using the Agilent 1000 DNA kit and Agilent 2100 Expert software (version B.02.08.SI648(SRI)). The concentration of pooled sequence libraries was measured using the KAPA Library Quant Kit Universal qPCR Mix (Roche, KK4824).
Samples for MiSeq were combined on a flow cell (Illumina MiSeq Reagent Kit v3 (600 cycles)) with 10% PhiX added and sequenced on MiSeq using paired-end reads to a median depth of 8.5 million usable reads/sample. Samples for NextSeq were combined on four flow cells (Illumina NextSeq 500/550 High Output Kit v2.5 (300 cycles)) with 25% PhiX added and sequenced on an Illumina NextSeq 500 using single reads to a median depth of 25 million usable reads/sample. Base-calling and demultiplexing were performed using bcl2fastq version 2.20.0.422, and the quality of the resulting fastq files was then inspected using fastQC version 0.11.8 with default settings. For each sequence, the number of reads was normalized to the total number of reads in the corresponding pooled library. Sequences with fewer than two reads were omitted from further analysis.
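The read normalization and filtering step can be sketched as follows. This is an illustration, not the pipeline used in the study: the function name and the toy sequences are hypothetical, frequencies are expressed in ppm for convenience, and normalization to the total read count before filtering is an assumption, since the text does not state the order.

```python
from collections import Counter


def normalized_frequencies(cdrh3_reads, min_reads=2):
    """Per-sequence frequency (ppm) in one pooled library, omitting rare sequences."""
    counts = Counter(cdrh3_reads)
    total = sum(counts.values())   # all reads in the library (assumed: before filtering)
    # Sequences with fewer than min_reads reads are omitted from further analysis.
    return {seq: n / total * 1e6 for seq, n in counts.items() if n >= min_reads}


# Toy library: one sequence seen three times and one singleton.
example = normalized_frequencies(["ARDY", "ARDY", "ARDY", "ARGG"])
```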
Identification of reference receptors covering the expression range of diagnostically and therapeutically relevant targets
Ten cell surface receptors with varying gene expression profiles on target cells (DU145) and nontarget cells (Jurkat) were identified through searches in the Cancer Cell Line Encyclopedia51 and literature52 (Table S2). Their cell surface expression was measured by flow cytometry using fluorochrome-labeled antibodies (key resources table) and quantification beads (Bang Laboratories, 815B) according to the manufacturer’s instructions.
Generation of antibodies targeting reference receptors
The unselected n-CoDeR library or amplified phages from cell selection 2 with nontarget cell competition were used for selections against the recombinant extracellular domain of the reference receptors (key resources table) using polystyrene beads (Polysciences, 17175). Four beads per receptor were coated with 25 pmol receptor/bead at +4°C overnight. The beads were washed and incubated with the phage stock at +4°C overnight. After washing, binding phages were eluted and amplified as described above for cell selections. For selections starting with the n-CoDeR library, a second selection on the recombinant protein was performed, followed by a third selection on DU145 cells. Phage-bound antibodies were converted to soluble scFvs, expressed, and analyzed by flow cytometry as described under hit rate determination. ScFvs binding to DU145 cells were analyzed for binding to the respective receptor by ELISA. Reference receptors were coated onto plates overnight at +4°C. The next day, scFv supernatant diluted 1:4 in PBS with 0.05% Tween 20 and 0.45% fish gelatin (both from Sigma-Aldrich) was left to bind the washed ELISA plate for 1 h at room temperature. Unbound material was removed, bound scFv was detected using an AP-conjugated anti-FLAG M2 antibody (Sigma-Aldrich), followed by the addition of a luminescent substrate (CDP Star Emerald II, Thermo Fisher Scientific), and plates were read in a plate reader (Tecan Ultra with Tecan Magellan v.3.0). All receptor-binding clones were cherry-picked, grown overnight in 96-well microtiter plates, and Sanger sequenced.
Matching experimentally observed and in silico generated antibody enrichment signatures
The frequency of individual antibody clones throughout the consecutive selections (FS1, FS2, FS3, FS4) was obtained from the NextSeq data. The antibodies were classified in three steps.
First, guided by in silico signatures for antibodies to all categories of target biomolecules in selections with nontarget cell competition, the antibodies were classified as relevant binders if FS2 > FS1 while FS2 and FS3 > 0 (Data S1). Antibodies not meeting these criteria were classified as nonenriched.
Second, relevant binders were classified into three binder types by matching experimentally observed and in silico generated antibody enrichment signatures from selections with nontarget cell competition. Experimentally determined FS4 was compared to the predicted frequency for binders with 10 nM affinities for receptors expressed at 10⁵ and 10⁶ copies/target cell with no nontarget cell expression (1.3 ppm and 750 ppm, respectively) to guide the classification. The antibodies were classified as binding receptors with target cell expression
1) >1,000,000 (antibodies with a frequency signature higher than the 1,000,000 predicted signature) - inclusion criteria: FS4 > 750 ppm
2) 100,000–1,000,000 (antibodies with a frequency signature between 100,000–1,000,000 predicted signatures) - inclusion criteria: 1.32 ppm < FS4 < 750 ppm
3) <100,000, or > 5-fold upregulated receptors (antibodies with a frequency signature lower than the 100,000 predicted signature) - inclusion criteria: FS4 < 1.32 ppm
Third, antibodies classified as binding receptors with <100,000 copies on target cells or upregulated receptors were further classified by comparing signatures from selections with and without nontarget cell competition. The antibodies were classified as binding to
1) Upregulated receptors - inclusion criteria > 100 or, if FS4 with competition = 0; FS4 without competition >0 and >10
2) Low expressed, target cell-restricted receptors - inclusion criteria FS1 ≠ 0 and FS2, FS3, FS4 = 0 (all without nontarget cell competition)
3) Unclassified – not meeting any of the criteria above.
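The first two classification steps above can be sketched as a small decision function operating on frequencies in ppm. The function name and return labels are hypothetical. Two limitations are deliberate: the text gives only strict inequalities, so values falling exactly on a threshold are reported separately, and the third step is not sketched because its ratio criteria are not fully recoverable from the text.

```python
def classify_signature(fs1, fs2, fs3, fs4, low_ppm=1.32, high_ppm=750.0):
    """Steps 1 and 2 of the classification (selections with nontarget cell competition)."""
    # Step 1: relevant binders have FS2 > FS1 while FS2 and FS3 > 0.
    if not (fs2 > fs1 and fs2 > 0 and fs3 > 0):
        return "nonenriched"
    # Step 2: compare FS4 with the predicted signatures for receptors at
    # 10^5 and 10^6 copies/target cell.
    if fs4 > high_ppm:
        return ">1,000,000"
    if low_ppm < fs4 < high_ppm:
        return "100,000-1,000,000"
    if fs4 < low_ppm:
        return "<100,000 or upregulated"
    return "on threshold"   # FS4 equals a cutoff; not assigned by the stated criteria


# Hypothetical enrichment signature (ppm) over selections 1 to 4.
example = classify_signature(0.0, 0.5, 10.0, 100.0)
```

Antibodies returned in the lowest class would then pass to the third step, which compares selections with and without nontarget cell competition.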
Production of IgG identified with predictive signatures
Complete scFv sequences needed for clone synthesis were obtained from the MiSeq data. VH and VL + CDR-H3 sequences were combined based on the CDR-H3 sequence. In cases where one CDR-H3 was associated with more than one set of VH and VL sequences, the frequencies in the two libraries were used to join the correct VH/VL pair.
Antibody genes were synthesized (Twist Bioscience) and ligated into a vector containing genes encoding the heavy and light chain constant regions of a human IgG1 antibody (BioInvent). Suspension-adapted HEK 293 EBNA cells (ATCC, suspension adapted in-house) in Freestyle293 medium (Thermo Fisher Scientific) supplemented with 10% Pluronic F-68 (Thermo Fisher Scientific) were transiently transfected using PEI (Polyscience Inc) in 24-well plates. The cells were incubated at +37°C, 8% CO2, 300 rpm for 4 h before addition of UltraPep™ Soy (Sheffield Bio-Science). After another 6 days of incubation, cell supernatants were harvested and incubated with MabSelect resin (GE Healthcare) to allow antibody binding, followed by transfer of the resin to a 96-well filter plate (Thermo Fisher Scientific) for washing and elution of purified antibodies using 100 mM glycine, pH 2.8. Purified hIgG1 antibodies were dialyzed (DispoDialyser, Harvard Apparatus) against PBS buffer.
Confirmatory binding analysis of antibodies identified by iLaMA
Purified IgG was diluted to 100 μg/mL, titrated in 25 μL PBS +0.5% BSA, added to 50,000 DU145 (target) and Jurkat (nontarget) cells/well and left to bind for 1 h at +4°C. After washing, IgG bound to live cells was detected using an APC-conjugated anti-human-Fc antibody (Jackson Immunoresearch, 109-136-098) together with a live/dead cell marker (SYTOX green, Thermo Fisher Scientific) and analyzed by flow cytometry (iQue, Intellicyt Sartorius using FortCyt v 8.0). To generate a calibration curve to transform the MFI at saturated binding to the receptor number, a subset of IgGs with different signal intensities at saturated cell binding concentrations was selected for receptor number determination. Purified IgGs were labeled with AF647 using an Alexa Fluor 647 carboxylic acid succinimidyl ester (ThermoFisher Scientific) according to the manufacturer’s instructions. Labeled antibodies were used for receptor number determination using calibration beads (Bang Laboratories, 816) according to the manufacturer’s instructions. The assay detection limit (receptor number for isotype control) was 1,000 receptors/cell.
Quantification and statistical analysis
Statistical analysis was performed using GraphPad Prism 9 and conducted by a Mann-Whitney test with ∗p ≤ 0.05 (Figure S4C).
Acknowledgments
The work was sponsored by research grants from the Swedish Foundation for Strategic Research (B.N. and M.O.). We thank Professor Mark Cragg, Antibody and Vaccine Group, Centre for Cancer Immunology, University of Southampton, UK, and Dr. Niyaz Yoosuf (BioInvent) for the critical review of the manuscript. The Graphical Abstract; Figures 4 and S1; and parts of Figures 1, 2, S2, and S4 were created with BioRender.com.
Author contributions
J.M., A.L., and B.F. designed the research. J.M., A.L., and C.S. performed the analysis. J.M., A.L., and A.C. analyzed the data. B.N. and M.O. advised on research. J.M., A.L., and B.F. wrote the paper.
Declaration of interests
J.M., C.S., and B.F. are BioInvent employees, and A.L. was employed by BioInvent. J.M., A.L., and B.F. are shareholders of BioInvent International. J.M., A.L., and B.F. are inventors on BioInvent patent applications relevant to the prediction-guided methodology for antibody discovery.
Published: May 9, 2023
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.crmeth.2023.100475.
Data and code availability
• Anonymized antibody frequencies following each selection can be found in Data S1.
• This paper does not report original code.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.