Abstract
Computational methods have traditionally struggled to predict the effect of mutations in antibody–antigen complexes on binding affinity. This has limited their usefulness during antibody engineering and development, and their ability to predict biologically relevant escape mutations. Here we present mCSM-AB, a user-friendly web server that uses graph-based signatures to accurately predict antibody–antigen affinity changes upon mutation. We show that mCSM-AB performs better than comparable methods that have been previously used for antibody engineering. The mCSM-AB web server is available at http://structure.bioc.cam.ac.uk/mcsm_ab.
INTRODUCTION
Antibodies (Abs) are a central component of our immune system, generally recognizing a given antigen through variable loops of β-strands known as the complementarity determining regions (CDRs). This allows Abs to bind to a wide range of targets with high specificity, including those typically viewed as undruggable, which has been widely exploited experimentally, diagnostically and therapeutically. Mutations within the CDRs are important for determining an Ab's specificity and affinity. However, like any therapeutic treatment, Abs can exert a selective pressure leading to the development of escape mutations: mutations, typically located in the antigen, that reduce Ab-binding affinity. Engineering Abs often requires optimization of not just selectivity and affinity for the given antigen, but also stability, solubility and immunogenicity of the Ab (1). Optimization of these properties is essential not only for therapeutic use (2,3), but also to improve quality and reproducibility in experimental settings (4).
A number of computational methods have been used to design and optimize Abs (5–8), normally using an available crystal structure. However, accurate prediction of the effect of a mutation on the free energy of protein binding is non-trivial. This was highlighted recently by Sirin et al. (17), who compiled an experimental dataset to benchmark these available methods, showing that there is still significant room for improvement, with the best methods only able to identify a third of the mutations that improved binding. We have previously shown that, by using graph-based signatures to represent the 3D wild-type physicochemical and geometrical environment of a residue, we could accurately predict the effects of mutations on protein stability, protein–protein affinity, protein–nucleic acid affinity and most recently protein–small molecule affinity (9–12). These have provided valuable insights into the effects of mutations in a variety of biological scenarios (13–16).
An accurate, robust and scalable computational approach would have enormous implications not only for directing Ab development, but also for understanding the evolution and treatment of escape mutations, including through optimized vaccine design. We have therefore benchmarked our existing general methodologies against computational approaches for Ab engineering, and trained a novel Ab-specific predictor, mCSM-AB, using the mCSM graph-based signatures concept in order to account for the unique and highly flexible recognition of Abs.
MATERIALS AND METHODS
Datasets
To assess the applicability of mCSM signatures in predicting the impact of mutations on Ab–antigen affinity, a dataset derived from the AB-Bind Database was considered (17). The AB-Bind Database is a collection of experimental thermodynamic parameters for wild-type and mutant Abs and antigens, including the change in Gibbs free energy of binding (ΔΔG), linked to published crystal structures of the complexes. A total of 645 single-point mutations on 29 different Ab–antigen complexes were considered, five of which were homology models, kindly provided by the AB-Bind database authors. Supplementary Figure S1 of Supplementary Data shows the experimental ΔΔG distributions for the mutations in this dataset, which is skewed towards mutations that reduce Ab-binding affinity. This imbalance is a limitation for the development of machine learning methods. To avoid the resulting bias, we included models of the mutant structures (obtained using Modeller (18)) in the training and test sets, so that the hypothetical reverse mutation (mutant to wild-type) could also be considered. This approach was initially proposed by Thiltgen and Goldstein (19) to balance datasets in which the distribution of experimental observations is naturally biased, avoiding a corresponding bias in the computational models.
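The balancing step can be illustrated with a minimal sketch. It assumes, in line with the self-consistency argument of (19), that a hypothetical reverse mutation carries the negated ΔΔG of the forward mutation. The `Mutation` record, its field names and the toy values are illustrative only and are not taken from the mCSM-AB code or from AB-Bind. Building the mutant structural models (done with Modeller in this work) is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    complex_id: str   # illustrative identifier of the Ab-antigen complex
    chain: str
    position: int
    wild_type: str    # one-letter residue code
    mutant: str       # one-letter residue code
    ddg: float        # experimental ddG in kcal/mol
    reverse: bool = False


def add_reverse_mutations(mutations):
    """Return the forward mutations plus one hypothetical reverse per entry.

    Assumption: ddG(mutant -> wild-type) = -ddG(wild-type -> mutant).
    """
    augmented = list(mutations)
    for m in mutations:
        augmented.append(
            Mutation(m.complex_id, m.chain, m.position,
                     wild_type=m.mutant, mutant=m.wild_type,
                     ddg=-m.ddg, reverse=True)
        )
    return augmented


# Toy values, skewed towards affinity-reducing mutations.
forward = [
    Mutation("cplx1", "H", 52, "Y", "A", -1.2),
    Mutation("cplx1", "L", 91, "S", "R", 0.4),
    Mutation("cplx2", "A", 30, "D", "N", -0.7),
]
balanced = add_reverse_mutations(forward)
n_negative = sum(1 for m in balanced if m.ddg < 0)
n_positive = sum(1 for m in balanced if m.ddg > 0)
print(len(balanced), n_negative, n_positive)
```

Under this assumption the augmented set always contains as many affinity-increasing as affinity-reducing entries, whatever the skew of the forward data.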
Low-redundancy datasets
In order to reduce the chance of overfitting while training the predictive models and to enhance their generalization, a procedure for reducing redundancy between cross-validation folds was employed. Training and test sets for each fold were divided in such a way that all mutations at a given residue position would be present in either the training set or the test set, but never in both. The resulting low-redundancy sets are available at http://structure.bioc.cam.ac.uk/mcsm_ab/data.
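One way to obtain such folds is sketched below. This is a hedged illustration, not the procedure used to build the published sets: it assumes that a residue position is identified by the triple (complex, chain, residue number), and the function name, the greedy balancing and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def position_grouped_folds(mutations, n_folds=10, seed=0):
    """Split mutation indices into folds, keeping each residue position whole.

    mutations: list of (complex_id, chain, position, mutant) tuples.
    All mutations sharing (complex_id, chain, position) land in the same fold.
    """
    groups = defaultdict(list)
    for idx, (complex_id, chain, position, _mutant) in enumerate(mutations):
        groups[(complex_id, chain, position)].append(idx)

    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    folds = [[] for _ in range(n_folds)]
    for key in keys:
        # Greedy balancing: add the whole group to the currently smallest fold.
        min(folds, key=len).extend(groups[key])
    return folds


toy = [("cplx1", "H", 52, "A"), ("cplx1", "H", 52, "F"), ("cplx1", "L", 91, "R"),
       ("cplx2", "A", 30, "N"), ("cplx2", "A", 30, "K"), ("cplx2", "A", 31, "G")]
folds = position_grouped_folds(toy, n_folds=3)

# Collect any residue position that appears on both sides of a split.
leaks = []
for test_idx in folds:
    test_sites = {toy[i][:3] for i in test_idx}
    train_sites = {toy[i][:3] for i in range(len(toy)) if i not in test_idx}
    leaks.extend(test_sites & train_sites)
print(folds, leaks)
```

Grouping before shuffling is what prevents two substitutions at the same site from informing both training and evaluation.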
Graph-based structural signatures
Our approach uses a graph-based structural signature for Ab–antigen complexes that models both the geometry and physicochemical properties of the interactions and architecture of the wild-type Ab–antigen complex by representing atoms as nodes and interactions between them as edges. From this representation, distance patterns between atoms categorized by their properties are summarized in concise signatures as cumulative distributions and used as evidence for machine learning methods. Figure 1A depicts the mCSM-AB prediction workflow. Machine learning methods, evaluation procedures and performance metrics used are described in Supplementary Data.
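The idea of a cumulative distance-pattern signature can be sketched as follows. This is a simplified illustration under stated assumptions: the atom class labels, the distance cutoffs and the function name are hypothetical, and the sketch counts all atom pairs rather than reproducing the exact atom classes, cutoffs or environment definition used by mCSM-AB.

```python
import math
from itertools import combinations, combinations_with_replacement


def distance_signature(atoms, classes, cutoffs):
    """Cumulative counts of atom pairs, per class pair, within each cutoff.

    atoms:   list of (class_label, (x, y, z)) tuples (graph nodes).
    classes: iterable of all class labels to consider.
    cutoffs: increasing list of distance cutoffs (in angstroms).
    Returns a flat feature vector ordered by class pair, then by cutoff.
    """
    pair_types = list(combinations_with_replacement(sorted(classes), 2))
    counts = {pair: [0] * len(cutoffs) for pair in pair_types}
    for (c1, p1), (c2, p2) in combinations(atoms, 2):
        d = math.dist(p1, p2)
        key = tuple(sorted((c1, c2)))
        for k, cutoff in enumerate(cutoffs):
            if d <= cutoff:  # cumulative: a pair counts for every cutoff >= d
                counts[key][k] += 1
    return [n for pair in pair_types for n in counts[pair]]


# Toy environment: two hydrophobic atoms and one polar atom.
atoms = [("hydrophobic", (0.0, 0.0, 0.0)),
         ("hydrophobic", (3.0, 0.0, 0.0)),
         ("polar", (0.0, 4.0, 0.0))]
signature = distance_signature(atoms, {"hydrophobic", "polar"},
                               cutoffs=[2.0, 4.0, 6.0])
print(signature)  # [0, 1, 1, 0, 1, 2, 0, 0, 0]
```

The resulting fixed-length vector is the kind of evidence that can be passed to a supervised learning method, independently of the size of the complex.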
WEB SERVER
We have implemented mCSM-AB via a user-friendly web server freely available at http://structure.bioc.cam.ac.uk/mcsm_ab. The server front-end was built using Bootstrap framework version 2.0, while the back-end was built in Python via the Flask framework (version 0.10.1), running on a Linux server.
Input
As shown in the job submission interface (Supplementary Figure S1), mCSM-AB allows users to upload Ab–antigen complexes (in PDB format) and specify mutations (a single mutation or a list), on either the Ab or the antigen, for which the impact on antigen affinity can be predicted. The mutation information is given as the residue position, the wild-type and mutant residue codes in one-letter format, and the chain identifier. The predictions are performed as a regression task (numerical prediction of the change in Gibbs free energy, ΔΔG). To help users submit jobs to mCSM-AB and interpret its predictions, a help page has been implemented and is accessible via the top navigation bar.
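Before submitting a list of mutations, it can be useful to check that each entry carries the four pieces of information described above. The sketch below is only an illustration: the line layout `<chain> <wild-type><position><mutant>` (for example `H Y52A`) is an assumption made for this example, and the help page of the server defines the format that is actually accepted.

```python
import re

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Hypothetical line layout: chain identifier, whitespace, then e.g. "Y52A".
_LINE = re.compile(r"^(?P<chain>\S)\s+(?P<wt>[A-Z])(?P<pos>-?\d+)(?P<mut>[A-Z])$")


def parse_mutation(line):
    """Parse one mutation line into chain, position, wild-type and mutant."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation line: {line!r}")
    wt, mut = match["wt"], match["mut"]
    if wt not in AMINO_ACIDS or mut not in AMINO_ACIDS:
        raise ValueError(f"unknown one-letter residue code in: {line!r}")
    if wt == mut:
        raise ValueError(f"wild-type and mutant residues are identical: {line!r}")
    return {"chain": match["chain"], "position": int(match["pos"]),
            "wild_type": wt, "mutant": mut}


parsed = [parse_mutation(line) for line in ["H Y52A", "L S91R"]]
print(parsed)
```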
Output
Figure 2 shows the result page for predicting the effect of a single mutation as a regression task. The predicted change in Gibbs free energy upon mutation (ΔΔG in kcal/mol) is given (i), as well as the identification of the provided mutation (ii) and a GLMol-based visualization of the mutated residue in the Ab–antigen structural environment (iii). A negative value (shown in red) corresponds to a mutation predicted to reduce affinity, while a positive value (shown in blue) corresponds to a mutation predicted to increase affinity. For classification tasks, the server will predict the mutation as either reducing or increasing affinity.
The result page for predicting the effects of a list of mutations (Supplementary Figure S2) is given in a tabular format, containing the identification of each mutation, its residue relative solvent accessibility, and the predicted ΔΔG or direction of change (increased/decreased affinity). An option to download the results as a tab-separated file is also available.
VALIDATION
Comparison with well-established methods
As highlighted in Sirin et al. (17), the leading computational approaches for Ab engineering struggled to identify a majority of mutations leading to improved antigen affinity. We have previously published and characterized a machine learning method using graph-based signatures to predict the effects of mutations on protein–protein binding affinities, mCSM-PPI (20). The benchmarking database used by Sirin et al. comprised 29 Ab–antigen structures in which the antigens were composed of protein and peptide chains. We therefore evaluated the ability of mCSM-PPI to accurately predict the effects of mutations in these Ab–antigen complexes. mCSM-PPI performed as well as the leading methodologies analysed by Sirin et al., achieving a Pearson's correlation of r = 0.36. This highlights the ability of graph-based signatures to model the effects of mutations without the need to explicitly consider the effects of solvation, unlike methods relying on free energy perturbation, thermodynamic integration and empirical models, where solvation needs to be considered for accuracy. mCSM-PPI is therefore significantly less computationally demanding, and can be run rapidly without an associated loss in accuracy.
Based on these promising results, we hypothesized that it might be more accurate and appropriate to build a graph-based methodology tailored specifically to consider the unique interaction interfaces presented by Abs (21), along with the more specialized cases where the antigen is a peptide (22–24), than a generalized PPI methodology where these types of interactions are underrepresented. We therefore built and trained a novel predictor: mCSM-AB.
mCSM-AB performed significantly better than the methods evaluated by Sirin et al., achieving a Pearson's correlation of r = 0.53 on regression tasks under 10-fold cross validation on a low-redundancy dataset, with a standard deviation of 1.981. Figure 3 shows the regression plots between experimental and predicted affinity changes obtained for the original AB-Bind dataset (left-hand side) and for the new method that also includes the models of the hypothetical reverse mutations (right-hand side). Table 1 presents a comparison of the performances of the considered methods. mCSM-AB achieved a higher correlation than all of the compared methods listed in Table 1, showcasing the prediction capabilities of the proposed platform.
Table 1. Performance comparison of available methods and mCSM-AB in classifying the direction of the change in Ab-antigen affinity caused by a mutation.
Method | Pearson's coefficient |
---|---|
bASA | 0.22 |
dDFIRE | 0.19 |
DFIRE | 0.31 |
STATIUM | 0.32 |
Rosetta | 0.16 |
FoldX | 0.34 |
Discovery Studio | 0.45 |
mCSM-PPI | 0.35 |
mCSM-AB | 0.53/0.56a |
The mCSM-AB predictive model built using the hypothetical reverse mutations was more robust and accurate than the model built using just the experimental observations. While the Pearson's correlations were not statistically different from the model trained with the original dataset, its ability to differentiate mutations that enhance or reduce binding affinity was significantly improved.
During cross validation, mCSM-AB achieved correlations of 0.32 and 0.30 for mutations increasing and decreasing affinity, respectively. It is important to point out that, even though these correlations were lower than that for the complete dataset, mCSM-AB still correctly assigned the direction of the change for 77% and 76% of the affinity-increasing and affinity-decreasing mutations, respectively.
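The two kinds of figures reported here, Pearson's correlation and the fraction of mutations with a correctly assigned direction, can be computed as in the following minimal sketch. The function names and the toy values are illustrative and are not taken from the evaluation scripts used in this work. Treating a predicted ΔΔG of exactly zero as "not increasing" and skipping experimental values of zero are assumptions of the example.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def direction_accuracy(experimental, predicted):
    """Fraction of mutations whose predicted ddG has the experimental sign.

    Assumption: experimental values of exactly zero are skipped.
    """
    pairs = [(e, p) for e, p in zip(experimental, predicted) if e != 0]
    hits = sum(1 for e, p in pairs if (e > 0) == (p > 0))
    return hits / len(pairs)


# Toy ddG values in kcal/mol (negative = reduced affinity).
experimental = [-1.0, -0.5, 0.5, 1.0]
predicted = [-0.8, 0.2, 0.1, 0.9]
r = pearson_r(experimental, predicted)
accuracy = direction_accuracy(experimental, predicted)
print(round(r, 3), accuracy)
```

Applying `direction_accuracy` separately to the affinity-increasing and affinity-decreasing subsets gives per-class figures of the kind quoted above.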
Blind test validation on homology models
In the original dataset proposed by Sirin et al. (17), there were five Ab–antigen homology model complexes, containing 87 mutations in total. Of these mutations, ∼63% (55 mutations) led to a decrease in experimental binding affinity, whilst ∼37% (32 mutations) had been shown experimentally to increase binding affinity. The hypothetical reverse mutations were also included, generating a balanced dataset. These complexes were removed from the mCSM-AB training set, so that the model was trained on experimental structures only, and were used to evaluate the final models.
The model achieved a Pearson's correlation of 0.45 with the experimentally measured affinities of the homology modelled complexes. mCSM-AB presented a slightly lower performance than in cross validation, but with a lower standard deviation (1.30), which provides confidence in the applicability of this approach beyond experimental structures to those that are computationally modelled. Supplementary Table S1 shows the performance of mCSM-AB per homology model.
Predicting HIV-1 escape mutations
We also wanted to evaluate the applicability of these models outside of experimentally measured affinities to biological systems, in particular the development of escape mutations that hinder the usefulness of therapeutic Abs. For this evaluation we used the anti-HIV therapeutic Ab VRC01, which recognizes the HIV-1 envelope glycoprotein gp120. The effect of 78 distinct mutations upon the effectiveness of VRC01 had been studied by Li et al. (25). Of these mutations, 33 increased and 45 decreased HIV-1 sensitivity to this neutralizing Ab. Using the experimental crystal structure of the VRC01–gp120 complex (PDB ID: 3NGB), we tested whether mCSM-AB could correctly classify those mutations which resulted in reduced effectiveness of VRC01. mCSM-AB predictions correlated strongly with the biological measurements, with a Pearson's correlation of 0.51, consistent with performance on the AB-Bind database. This highlighted the potential predictive power of this approach for exploring the consequences of biologically and clinically relevant mutations.
We therefore expanded our analysis by performing computational saturation mutagenesis of the entire gp120 structure and mapping the average predicted changes in binding affinity onto the structure (Figure 1B). This could be of significant help in identifying regions more likely to develop escape mutations against Ab therapy, and help guide design and development strategies.
mCSM-AB has also been further validated on a dataset of 114 mutations on four Ab–antigen complexes (228, including hypothetical reverse mutations) from Benedix et al. (26), achieving a correlation of 0.67, as described in Supplementary Data.
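A computational saturation mutagenesis of this kind can be sketched as follows. This is a hedged illustration: `predict` is a placeholder for any callable returning a predicted ΔΔG (it does not call the mCSM-AB server), and the residue list and toy predictor are invented for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutations(residues):
    """Yield all 19 possible substitutions for each residue.

    residues: iterable of (chain, position, wild_type) tuples.
    """
    for chain, position, wild_type in residues:
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield (chain, position, wild_type, mutant)


def mean_ddg_per_position(residues, predict):
    """Average predicted ddG over all substitutions at each position."""
    totals = {}
    for chain, position, wild_type, mutant in saturation_mutations(residues):
        key = (chain, position)
        running_sum, count = totals.get(key, (0.0, 0))
        totals[key] = (running_sum + predict(chain, position, wild_type, mutant),
                       count + 1)
    return {key: s / n for key, (s, n) in totals.items()}


def toy_predict(chain, position, wild_type, mutant):
    # Placeholder predictor: only substitutions to proline reduce affinity.
    return -1.0 if mutant == "P" else 0.0


residues = [("A", 10, "D"), ("A", 11, "N")]  # invented residues
profile = mean_ddg_per_position(residues, toy_predict)
print(profile)
```

The per-position averages can then be mapped onto the structure, for instance by colouring residues according to their mean predicted ΔΔG.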
CONCLUSIONS
We present here a new approach, mCSM-AB, which relies upon graph-based signatures to predict the impact of missense mutations upon the binding affinity of an Ab for a given antigen. The results achieved by mCSM-AB support the idea that molecular shape and physicochemical complementarity, the driving forces guiding molecular recognition in Ab–antigen complexes, can be modelled and their determinant features mined with structural signatures. This allowed us to tailor a new method to accurately and robustly predict the effects of mutations. We believe that mCSM-AB will be a useful tool in the design and development of therapeutic and diagnostic Abs, and could provide useful insight into the development of escape mutations. A web server implementing mCSM-AB functionality is freely available at http://structure.bioc.cam.ac.uk/mcsm_ab.
Acknowledgments
We wish to thank Prof Amy Keating (Department of Biology and Biological Engineering, Massachusetts Institute of Technology) for kindly providing the PDB files used in their study and Amanda Albanaz for aiding the VRC01 analysis.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
Newton Fund RCUK-CONFAP Grant awarded by The Medical Research Council (MRC) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) (MR/M026302/1 to D.E.V.P., D.B.A.); René Rachou Research Center (CPqRR/FIOCRUZ Minas), Brazil (to D.E.V.P.); NHMRC CJ Martin Fellowship [APP1072476 to D.B.A.]. Funding for open access charge: MRC.
Conflict of interest statement. None declared.
REFERENCES
- 1. Chan L.J., Bulitta J.B., Ascher D.B., Haynes J.M., McLeod V.M., Porter C.J.H., Williams C.C., Kaminskas L.M. PEGylation does not significantly change the initial intravenous or subcutaneous pharmacokinetics or lymphatic exposure of trastuzumab in rats but increases plasma clearance after subcutaneous administration. Mol. Pharm. 2015;12:794–809. doi: 10.1021/mp5006189.
- 2. Watt A.D., Crespi G.A.N., Down R.A., Ascher D.B., Gunn A., Perez K.A., McLean C.A., Villemagne V.L., Parker M.W., Barnham K.J., et al. Do current therapeutic anti-Aβ antibodies for Alzheimer's disease engage the target? Acta Neuropathol. 2014;127:803–810. doi: 10.1007/s00401-014-1290-2.
- 3. Watt A.D., Crespi G.A.N., Down R.A., Ascher D.B., Gunn A., Perez K.A., McLean C.A., Villemagne V.L., Parker M.W., Barnham K.J., et al. Anti-Aβ antibody target engagement: a response to Siemers et al. Acta Neuropathol. 2014;128:611–614. doi: 10.1007/s00401-014-1333-8.
- 4. Baker M. Reproducibility crisis: blame it on the antibodies. Nature. 2015;521:274–276. doi: 10.1038/521274a.
- 5. Schymkowitz J., Borg J., Stricher F., Nys R., Rousseau F., Serrano L. The FoldX web server: an online force field. Nucleic Acids Res. 2005;33(Suppl. 2):W382–W388. doi: 10.1093/nar/gki387.
- 6. Spassov V.Z., Yan L. pH-selective mutagenesis of protein–protein interfaces: in silico design of therapeutic antibodies with prolonged half-life. Proteins. 2013;81:704–714. doi: 10.1002/prot.24230.
- 7. Kortemme T., Morozov A.V., Baker D. An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein–protein complexes. J. Mol. Biol. 2003;326:1239–1259. doi: 10.1016/s0022-2836(03)00021-4.
- 8. Yang Y., Zhou Y. Specific interactions for ab initio folding of protein terminal regions with secondary structures. Proteins. 2008;72:793–803. doi: 10.1002/prot.21968.
- 9. Pires D.E.V., Ascher D.B., Blundell T.L. mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics. 2014;30:335–342. doi: 10.1093/bioinformatics/btt691.
- 10. Pires D.E.V., Blundell T.L., Ascher D.B. pkCSM: predicting small-molecule pharmacokinetic and toxicity properties using graph-based signatures. J. Med. Chem. 2015;58:4066–4072. doi: 10.1021/acs.jmedchem.5b00104.
- 11. Pires D.E.V., Blundell T.L., Ascher D.B. Platinum: a database of experimentally measured effects of mutations on structurally defined protein–ligand complexes. Nucleic Acids Res. 2015;43:D387–D391. doi: 10.1093/nar/gku966.
- 12. Pires D.E.V., Ascher D.B., Blundell T.L. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 2014;42:W314–W319. doi: 10.1093/nar/gku411.
- 13. Jafri M., Wake N.C., Ascher D.B., Pires D.E.V., Gentle D., Morris M.R., Rattenberry E., Simpson M.A., Trembath R.C., Weber A., et al. Germline mutations in the CDKN2B tumor suppressor gene predispose to renal cell carcinoma. Cancer Discov. 2015;5:723–729. doi: 10.1158/2159-8290.CD-14-1096.
- 14. Nemethova M., Radvanszky J., Kadasi L., Ascher D.B., Pires D.E.V., Blundell T.L., Porfirio B., Mannoni A., Santucci A., Milucci L., et al. Twelve novel HGD gene variants identified in 99 alkaptonuria patients: focus on ‘black bone disease’ in Italy. Eur. J. Hum. Genet. 2016;24:66–72. doi: 10.1038/ejhg.2015.60.
- 15. Usher J.L., Ascher D.B., Pires D.E., Milan A.M., Blundell T.L., Ranganath L.R. Analysis of HGD gene mutations in patients with alkaptonuria from the United Kingdom: identification of novel mutations. JIMD Reports. 2015;24:3–11. doi: 10.1007/8904_2014_380.
- 16. Pires D.E.V., Chen J., Blundell T.L., Ascher D.B. In silico functional dissection of saturation mutagenesis: Interpreting the relationship between phenotypes and changes in protein stability, interactions and activity. Sci. Rep. 2016;6:19848. doi: 10.1038/srep19848.
- 17. Sirin S., Apgar J.R., Bennett E.M., Keating A.E. AB-Bind: antibody binding mutational database for computational affinity predictions. Protein Sci. 2015;25:393–409. doi: 10.1002/pro.2829.
- 18. Šali A., Blundell T.L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
- 19. Thiltgen G., Goldstein R.A. Assessing predictors of changes in protein stability upon mutation using self-consistency. PLoS One. 2012;7:1–6. doi: 10.1371/journal.pone.0046084.
- 20. Ascher D.B., Jubb H.C., Pires D.E.V., Ochi T., Higueruelo A., Blundell T.L. Protein-protein interactions: structures and druggability. In: Scapin G, Patel D, Arnold E, editors. Multifaceted Roles of Crystallography in Modern Drug Discovery. Netherlands: Springer; 2015. pp. 141–163.
- 21. Jubb H., Blundell T.L., Ascher D.B. Flexibility and small pockets at protein–protein interfaces: New insights into druggability. Prog. Biophys. Mol. Biol. 2015;119:2–9. doi: 10.1016/j.pbiomolbio.2015.01.009.
- 22. David B.A., Gabriela A. N.C., Hooi L.N., Craig J.M., Michael W., et al. Novel therapeutic approaches to treat Alzheimer's disease and memory disorders. J. Proteomics Bioinform. 2008;1:464–476.
- 23. Wun K.S., Miles L.A., Crespi G.A.N., Wycherley K., Ascher D.B., Barnham K.J., Cappai R., Beyreuther K., Masters C.L., Parker M.W., et al. Crystallization and preliminary X-ray diffraction analysis of the Fab fragment of WO2, an antibody specific for the Aβ peptides associated with Alzheimer's disease. Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 2008;64:438–441. doi: 10.1107/S1744309108011718.
- 24. Crespi G.A.N., Ascher D.B., Parker M.W., Miles L.A. Crystallization and preliminary X-ray diffraction analysis of the Fab portion of the Alzheimer's disease immunotherapy candidate bapineuzumab complexed with amyloid. Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 2014;70:374–377. doi: 10.1107/S2053230X14001642.
- 25. Li Y., O'Dell S., Walker L.M., Wu X., Guenaga J., Feng Y., Schmidt S.D., McKee K., Louder M.K., Ledgerwood J.E., et al. Mechanism of neutralization by the broadly neutralizing HIV-1 monoclonal antibody VRC01. J. Virol. 2011;85:8954–8967. doi: 10.1128/JVI.00754-11.
- 26. Benedix A., Becker C.M., de Groot B.L., Caflisch A., Böckmann R.A. Predicting free energy changes using structural ensembles. Nat. Methods. 2009;6:3–4. doi: 10.1038/nmeth0109-3.