PLOS Computational Biology. 2020 Aug 31;16(8):e1008150. doi: 10.1371/journal.pcbi.1008150

Computer-guided binding mode identification and affinity improvement of an LRR protein binder without structure determination

Yoonjoo Choi 1,#, Sukyo Jeong 1,#, Jung-Min Choi 1, Christian Ndong 2, Karl E Griswold 2,3,4, Chris Bailey-Kellogg 5, Hak-Sung Kim 1,*
Editor: Marco Punta6
PMCID: PMC7485979  PMID: 32866140

Abstract

Precise binding mode identification and subsequent affinity improvement without structure determination remain a challenge in the development of therapeutic proteins. However, relevant experimental techniques are generally quite costly, and purely computational methods have been unreliable. Here, we show that integrated computational and experimental epitope localization followed by full-atom energy minimization can yield an accurate complex model structure which ultimately enables effective affinity improvement and redesign of binding specificity. As proof-of-concept, we used a leucine-rich repeat (LRR) protein binder, called a repebody (Rb), that specifically recognizes human IgG1 (hIgG1). We performed computationally-guided identification of the Rb:hIgG1 binding mode and leveraged the resulting model to reengineer the Rb so as to significantly increase its binding affinity for hIgG1 as well as redesign its specificity toward multiple IgGs from other species. Experimental structure determination verified that our Rb:hIgG1 model closely matched the co-crystal structure. Using a benchmark of other LRR protein complexes, we further demonstrated that the present approach may be broadly applicable to proteins undergoing relatively small conformational changes upon target binding.

Author summary

It is quite challenging for computational methods to determine how proteins interact and to design mutations to alter their binding affinity and specificity. Despite recent advances in computational methods, however, in silico evaluation of binding energies has proven to be extremely difficult. We show that, in the case of protein-protein interactions where only small structural changes occur upon target binding, an integrated computational and experimental approach can identify a binding mode and drive reengineering efforts to improve binding affinity or specificity. Using as a model system a leucine-rich repeat (LRR) protein binder that recognizes human IgG1, our approach yielded a model of the protein complex that was very similar to the subsequently experimentally determined co-crystal structure, and enabled design of variants with significantly improved IgG1 binding affinity and with the ability to recognize IgG1 from other species.

Introduction

In the development of therapeutic proteins and vaccines, efficacy and effectiveness are largely determined by binding modes and affinities [1–6]. Binding mode identification and affinity improvement have generally relied on labor-intensive and time-consuming experimental approaches [7], such as determination of complex structures by X-ray crystallography, and generation and screening of large libraries. To overcome these bottlenecks, considerable effort has been made to develop alternative computational methods [8, 9]. Despite some notable advances, however, computational determination of binding mode and in silico improvement of binding affinity remain challenging in general, and purely computational approaches have been insufficiently reliable for predicting binding modes [10–13] and association energies [14–17]. Recent rounds of the Critical Assessment of Predicted Interactions (CAPRI) have also shown that current computational methods are not successful in identifying the actual binding mode [18]. Thus, the problem of predicting protein-protein interactions is often regarded as a “Holy Grail” in computer-aided protein engineering [14].

Leucine-rich repeat (LRR) proteins have a rigid horseshoe-like structural feature and play key roles in many biological processes [19], including the immune system [20–23] and cellular processes [24–26]. LRR proteins constitute one of the most common protein families found in a wide range of species, and more than 2,000 LRR proteins have been identified [27, 28]. Typical examples include toll-like receptors (TLR) of the mammalian innate immune system [21], and variable lymphocyte receptors (VLR) of the jawless vertebrate adaptive immune system [23]. Considering the importance and abundance of LRR proteins in nature, a broadly enabling strategy for modeling and controlling LRR binding can help in understanding their functions as well as in leveraging their recognition abilities for therapeutic applications.

We previously developed a computationally-driven epitope localization method, EpiScope, through which a target antibody’s binding is evaluated against a small, optimized panel of antigenic variants to test hypothesized epitope locations [11]. EpiScope was shown to successfully predict a general epitope region, but it did not provide detailed information about the binding mode. Here, as an extension of EpiScope, we demonstrate an integrated computational and experimental approach to identifying the binding mode that further enables affinity improvement and redesign of binding specificity. As proof-of-concept, we employed an LRR protein binder that specifically recognizes an immunoglobulin G (IgG), for which the binding mode was completely unknown. This work represents a challenging model system for affinity improvement and redesign of specificity due to high structural conservation combined with sequence diversity across species. We show that our approach effectively narrowed down the location of the binding interface, and that full-atom energy minimization identified a native-like complex model closely matching an experimentally determined X-ray crystal structure. Further computational analyses of the identified model complex allowed the design of LRR protein binders with significantly increased affinity and altered binding specificity.

Results

Binding mode identification of an LRR protein binder

In an effort to exploit the structural and functional features of LRR proteins for biotechnological and medical applications, we previously developed an LRR protein binder, called a “repebody (Rb)” [29]. Here, a human IgG1 (hIgG1)-binding repebody, named RbF4 [30], was targeted for computationally-driven binding mode identification and subsequent affinity improvement as well as redesign of binding specificity. There is indirect evidence that RbF4 recognizes the constant region of hIgG1 (hFc) [30, 31], but the actual epitope residues and the binding mode were unknown. RbF4 has a typical LRR protein sequence motif, whose structural scaffold consists of three major parts: an N terminal cap (LRRNT), LRR modules (LRRVs), and a C terminal cap (LRRCT) with an additional loop (S1 Fig). In contrast to the complementarity determining regions of antibodies, the target-binding sites of a repebody (the LRRVs) comprise parallel beta strands which are assumed to remain unchanged upon target binding. During the development of RbF4, three variable residues on each of modules LRRV2, LRRV3, and LRRV5 were randomized and subjected to a phage display selection [30].

To identify the binding mode of RbF4-hFc, we first localized the RbF4 epitope on hFc using EpiScope. This computational method first predicts mutations that, according to the docking models, would both disrupt target binding and maintain antigen stability; it then optimizes targeted sets of antigenic variants that combine these mutations so as to efficiently confirm or reject the various epitope hypotheses [11]. For antibody-antigen pairs, an average of three variants, each with three mutations designed to disrupt binding and maintain stability, were shown to be sufficient to test all of the docking models over a benchmark set of targets, in each case yielding at least one variant expected to include the true epitope region. It should be noted that this form of epitope localization indicates the general region where the protein is likely to bind, but does not provide a detailed binding mode.
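
The variant-panel idea described above can be pictured as a covering problem: each candidate variant is predicted to disrupt some subset of docking models, and a small panel should leave no model untested. The sketch below uses a plain greedy set cover. This is a stand-in chosen for brevity, not EpiScope's actual optimization, and the variant names and model indices are hypothetical.

```python
def greedy_variant_panel(disrupts, n_models):
    """Pick variants until every docking model is probed by at least one.

    disrupts: dict mapping a variant name to the set of docking-model
              indices that variant is predicted to disrupt (hypothetical input).
    Returns the chosen panel and any models that no variant can probe.
    """
    uncovered = set(range(n_models))
    panel = []
    while uncovered:
        # Choose the variant that probes the most still-untested models;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(disrupts), key=lambda v: len(disrupts[v] & uncovered))
        gained = disrupts[best] & uncovered
        if not gained:
            break  # the remaining models cannot be probed by any variant
        panel.append(best)
        uncovered -= gained
    return panel, uncovered


# Toy example: 6 docking models, 4 candidate triple-mutant variants.
toy = {
    "VarA": {0, 1, 2},
    "VarB": {2, 3},
    "VarC": {3, 4, 5},
    "VarD": {1, 4},
}
panel, untested = greedy_variant_panel(toy, 6)
print(panel, untested)
```

In this toy case two variants are enough to probe all six models, which mirrors the observation that only a handful of variants are needed in practice.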

In this study, the ClusPro webserver was employed to dock the RbF4-hFc pair as described previously [11]. We used a crystal structure of the unbound form of hFc (PDB code: 3AVE) and a homology model of RbF4 for the docking (Table 1), assigning attractions at the residues of the binding site (LRRV modules from 2 to 5) in order to focus docking on this region. The affinity improvement of repebodies is similar to that of antibodies, in that diversity is generated in the binding region of the repebody/antibody followed by selection of variants with improved affinity. This leads to an asymmetric representation of amino acids, and thus we used the antibody docking mode and its associated asymmetric scoring (Antibody Mode). As a result, a total of 29 target-bound complex models were generated (Table 1).

Table 1. Test sets.

The numbers in parentheses indicate results from ClusPro without the antibody mode option. For C5a, the numbers with asterisks (*) are results using the crystal structure for docking with the precise definition of paratopes.

| Target | Human IgG Fc | Interleukin 6 (IL-6) | Epidermal Growth Factor Receptor (EGFR) | Complement Component 5a (C5a) |
|---|---|---|---|---|
| Group | | Crystal structure complex | Crystal structure complex | Crystal structure complex |
| Complex (PDB) | 6KA7 | 4J4L | 4UIP | 5B4P |
| Rb homology template | 3RFS | 5B4P | 3RFS | 4J4L |
| Unbound target | 3AVE | 1ALU | 1NQL | 1KJS |
| Cα RMSD (Å), target | 1.72 | 0.92 | 2.62 | 1.76 |
| Cα RMSD (Å), repebody | 1.39 | 0.63 | 1.62 | 1.17 |
| Number of docking models | 29 | 30 (106) | 30 (88) | 23 (65); *28 |
| Number of EpiScope designs | 3 | 4 | 2 | 2 |
| Number of localized docking models | 7 | 5 (11) | 12 (20) | 6 (34); *7 |
| Best I-RMSD (Å) | 2.32 | 1.45 (1.78) | 2.23 (6.61) | 3.61 (5.53); *0.62 |
| Best fnat | 0.43 | 0.62 (0.49) | 0.38 (0.09) | 0.27 (0.14); *0.76 |

We note that each mutation is duplicated due to the dimeric nature of Fc. The selected variants include Var 1 (Q362E/N389E/N390K), Var 2 (H268K/E269K/R292L), and Var 3 (H310A/N315K/H435K). The binding affinities of RbF4 against the variants were experimentally evaluated by isothermal titration calorimetry (ITC), and binding of Var 3 was shown to be disrupted approximately 3-fold compared to wild-type hFc, with an increase in Kd from 128 nM to 427 nM (Fig 1B and S2 Fig). This result confirms that RbF4 indeed binds to hFc. Since, as previously observed [11], it is likely that not all positions mutated in a variant are equally important for binding, we also measured the binding affinities of the single mutations comprising Var 3. Two of the single mutations (H310A and N315K) led to meaningful reductions in binding affinity of roughly 1.5-fold (Kd of 194 and 187 nM, respectively), whereas the binding affinity of H435K remained similar to (or even a little stronger than) that of wild-type hFc. From these results, we conclude that RbF4 likely contacts H310 and N315, but not H435.
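
As a quick check on the ITC numbers above, the fold change in Kd and the corresponding change in binding free energy can be computed from the standard relation ΔΔG = RT ln(Kd,variant / Kd,wild-type). The sketch below assumes a temperature of 25 °C, which the text does not state; the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # ASSUMED temperature (25 degrees C); not given in the text


def fold_change(kd_variant_nm, kd_wt_nm):
    """Ratio of dissociation constants; values above 1 mean weaker binding."""
    return kd_variant_nm / kd_wt_nm


def ddg_binding(kd_variant_nm, kd_wt_nm, temperature=T_KELVIN):
    """ddG = RT ln(Kd_variant / Kd_wt) in kcal/mol; positive means weaker binding."""
    return R_KCAL * temperature * math.log(kd_variant_nm / kd_wt_nm)


KD_WT = 128.0  # nM, wild-type hFc as reported above
kd_values = {"Var 3": 427.0, "H310A": 194.0, "N315K": 187.0}  # nM, from the text
for name, kd in kd_values.items():
    print(f"{name}: {fold_change(kd, KD_WT):.2f}-fold, "
          f"ddG = {ddg_binding(kd, KD_WT):+.2f} kcal/mol")
```

Under this assumption all three effects are well below 1 kcal/mol, which is why testing the single mutations separately is useful for deciding which positions actually contribute.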

Fig 1. Epitope localization and binding mode identification of human IgG1 Fc (hFc)-binding repebody (RbF4).

(A) EpiScope designs triple mutants considering the symmetry of the Fc structure. (B) The set of the three mutations of Var 3 (H310A, N315K, and H435K) clearly disrupts the binding, and the single mutations comprising Var 3 were individually tested. The ITC results indicate that H435K may not be involved in binding. Error bars represent variation over ITC triplicates. Details are provided in S2 Fig. (C) There are seven docking models in contact with H310 and N315; the one colored blue had the lowest molecular modeling energy. (D) Closer inspection of the model suggests that the repebody loop (highlighted as red sticks; see also S1 Fig) may be responsible for the binding specificity of RbF4 to hFc. IgGs from three species are considered (human, hFc; mouse, mFc; and rabbit, rFc). The residues that all three share in common are colored cyan; those common to two are gray, and those unique to one species are black.

Seven RbF4-hFc docking models were consistent with the binding results; RbF4 makes contacts with H310 and N315, but not H435 or the six other positions mutated in Vars 1 and 2 (Fig 1C). We hypothesized that the all-atom force field energy could largely capture the binding free energy landscape. The seven docking models were ranked by total energy according to the AMBER99sb force field [32] after full-atom minimization using the Tinker molecular dynamics package [33]. We determined the complex structure by X-ray crystallography (deposited as PDB ID: 6KA7), and comparison of the crystal structure with the docking models confirms that the binding mode of the docking model with the lowest energy (Fig 1C and 1D, blue model) and that of the crystal structure are indeed extremely similar (fnat: 0.43 and I-RMSD: 2.89 Å, see Fig 2 and S2 Table). The full atom structure minimization changed the overall Fc structure, and consequently one complex model had a slightly better I-RMSD than the lowest-energy model, but the lowest-energy model better maintained interactions across the interface and thus had a much better fnat (S3A Fig). These results demonstrate the utility of our integrated computational and experimental approach to identifying a native-like complex model for an LRR protein: first a computational method designs sets of mutational variants to probe docking models; then experimental binding assays effectively filter the docking model candidates; finally full-atom minimization ranks the filtered docking models. It is noteworthy that epitope localization is an essential step for precise binding mode identification, and ranking docking models using the force field energy alone may be insufficient for finding a native-like model (Fig 2B and 2C). 
Furthermore, testing the individual mutations comprising a selected variant was important in this case; if all of the three positions in Var 3 were assumed to be important, there would be only two possible docking models (S3B Fig), and while the interface region of the lower energy model is largely correct, its binding orientation is completely reversed.
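
The filter-then-rank logic described above can be sketched in a few lines. Models are kept only if they contact the residues that the binding assays implicate (H310, N315) and avoid those they rule out (H435 and the positions mutated in Vars 1 and 2); the survivors are then sorted by minimized energy. The `DockingModel` class, the model contents, and the energies below are invented placeholders, not data from this study.

```python
from dataclasses import dataclass


@dataclass
class DockingModel:
    name: str
    contacts: frozenset  # target residues in contact with the binder
    energy: float        # total energy after minimization (arbitrary units here)


def filter_and_rank(models, required, excluded):
    """Keep models touching all `required` residues and none of `excluded`,
    then sort the survivors by energy, lowest first."""
    kept = [m for m in models
            if required <= m.contacts and not (excluded & m.contacts)]
    return sorted(kept, key=lambda m: m.energy)


required = frozenset({"H310", "N315"})
excluded = frozenset({"H435", "Q362", "N389", "N390", "H268", "E269", "R292"})

# Placeholder models: contacts "res_a"/"res_b" and all energies are made up.
models = [
    DockingModel("m1", frozenset({"H310", "N315", "res_a"}), -10.0),
    DockingModel("m2", frozenset({"H310", "N315", "H435"}), -15.0),
    DockingModel("m3", frozenset({"H310", "N315", "res_b"}), -12.0),
    DockingModel("m4", frozenset({"Q362", "N389"}), -20.0),
]
ranked = filter_and_rank(models, required, excluded)
print([m.name for m in ranked])
```

Note that the lowest-energy placeholder overall (m4) never reaches the ranking step, which is the point made in the text: energy alone would have selected a model that the experiments exclude.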

Fig 2. Computationally-driven identification of the RbF4-hFc complex.

(A) The lowest-energy model is in blue and the crystal structure (6KA7) is in gold. H310A and N315K are highlighted as spheres. (B, C) Comparison of model energy vs. (B) I-RMSD and (C) fnat. Docking models that are in contact with the epitope residues (correctly localized docking models) are shown with solid circles. The crossed circles are models in contact with all three residues in Var 3. The blue circle is the model with the lowest force field energy (AMBER99sb) score (illustrated in panel A).
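
For readers unfamiliar with fnat, it is the fraction of residue-residue contacts in the reference (crystal) interface that are also present in a model. A minimal sketch follows, assuming contacts have already been extracted as residue pairs; the distance cutoff used to define a contact (commonly 5 Å in CAPRI-style assessments) is not stated in the text, and the residue labels are hypothetical.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native residue-residue contacts reproduced by a model.

    Both arguments are iterables of (binder_residue, target_residue) pairs.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(model_contacts)) / len(native)


# Hypothetical contact lists: chain A is the binder, chain B the target.
native = {("A:10", "B:310"), ("A:12", "B:315"), ("A:15", "B:311"), ("A:20", "B:254")}
model = {("A:10", "B:310"), ("A:12", "B:315"), ("A:33", "B:400")}
print(fnat(native, model))  # 2 of 4 native contacts recovered
```

Extra contacts in the model that are absent from the reference do not change fnat, which is why it is usually reported alongside I-RMSD.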

Redesign of binding specificity based on the modeled complex

Based on the model complex structure, we redesigned RbF4 to alter its binding specificity. RbF4 was previously determined to be highly specific for human IgG1, showing weak and negligible cross-reactivities against mouse IgG1 and rabbit IgG [30, 31]. The confirmed complex model here reveals that the loop (S1 Fig) may be largely responsible for the binding specificity of RbF4 toward hFc (Fig 1D). To obtain further insight into this possible source of specificity, we investigated the Fc sequences of IgGs from three species: human (hFc), mouse (mFc), and rabbit (rFc). The modeled RbF4:hFc complex shows that the RbF4 loop forms a tight contact with the positions where amino acids differ among hFc, mFc, and rFc (Fig 1D), strongly supporting a crucial role for this loop in the observed binding specificity of RbF4. We thus reasoned that engineering the loop could yield a variant of RbF4 showing cross-reactivities for Fc from other species. To prove our hypothesis, we replaced the loop sequence of RbF4 starting at position 239, RNSAGSVA, with the truncated but flexible amino acid pair GG. We measured with ITC assays the binding affinities of the resulting loop-truncated RbF4 variant (RbF4-LT) against the IgGs from the three species: hIgG1 (Trastuzumab), mouse IgG1 (mIgG1), and rabbit IgG (rIgG).

As shown in our previous work [31], the original RbF4 binds strongly to hIgG1 with a binding affinity of 128.7 nM, whereas it binds weakly to mIgG1, with an approximately 8-fold lower affinity (1 μM), and has negligible binding affinity for rIgG (Fig 3 and S4 Fig). The loop-truncated variant (RbF4-LT) displayed improved binding affinity for rIgG (1.2 μM), indicating that the loop is indeed involved in the binding specificity of RbF4 for hIgG1. Its binding affinity for mIgG1 was also improved (775 nM), whereas that for hIgG1 decreased to a level similar to that for mIgG1 (598 nM). It bears noting that the variant was designed based only on the modeled binding mode, prior to the determination of the X-ray co-crystal structure. The results thus demonstrate that our integrated approach can provide sufficiently accurate complex models for the redesign of binding specificity.

Fig 3. Redesigning binding specificity and affinity based on modeled complex structure.

The model suggests that the binding specificity of RbF4 comes from the loop (Fig 1D). The loop-truncated RbF4 (RbF4-LT) was tested for binding against the IgGs from the three species. Further computational design on the model identified mutations (S241M and S244R: RbF4-MR) that could significantly improve the binding affinity for mIgG1 (1 μM to 168.1 nM) while maintaining that for hIgG1 (see S3 Table for details). ITC-based affinity measurements were performed in triplicate.

Improvement of binding affinity based on the modeled complex

Our final goal is to use computational design to improve the binding affinity of RbF4 against different IgGs. As we saw in determining the best complex model, while the full-atom force field energy alone did not identify the near-native complex, it did largely capture the binding energy landscape. We thus again used the AMBER99sb total energy to select loop mutations predicted to simultaneously improve binding affinities of RbF4 against multiple targets. To reduce the search space, FoldX [34] was employed to fast-scan possible mutations in the loop (S5 Fig), since we observed that FoldX is particularly accurate in predicting disruptive mutations (PPV > 0.9 for antibody-antigen pairs) [35]. The predicted binding energy values at G243 indicate that the inclusion of the loop may not enhance a binding affinity for rIgG. Thus we aimed to design an RbF4 variant which can bind to both hIgG1 and mIgG1 with high binding affinities. The FoldX scan suggests that S241M may substantially enhance the binding affinities for hIgG1 and mIgG1. The mutation was also observed during the phage display affinity maturation for the mIgG1-specific repebody [31]. We then fixed S241M and introduced all other amino acids in silico at S244. The AMBER99sb force field energy was used to minimize the variants. The binding energy prediction indicated that S241M with S244R (RbF4-MR) may significantly improve the binding affinities for both IgGs (S6 Fig). We tested this variant, and the ITC binding assay indeed showed that RbF4-MR strongly binds to the two IgGs as predicted, and the binding affinity for mIgG1 was markedly increased (1 μM to 168.1 nM, Fig 3).
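
The in silico scan at S244 with S241M held fixed amounts to enumerating 19 double mutants and keeping those predicted to improve binding to every target of interest. The sketch below shows only that bookkeeping; the helper names are illustrative, and the energy values passed to `improves_all` would have to come from an actual minimization, which is not reproduced here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def scan_position(wild_type, position, fixed=()):
    """List every substitution at `position`, each combined with the `fixed`
    mutations (e.g. fixed=("S241M",))."""
    variants = []
    for aa in AMINO_ACIDS:
        if aa == wild_type:
            continue  # skip the identity "mutation"
        variants.append("/".join(list(fixed) + [f"{wild_type}{position}{aa}"]))
    return variants


def improves_all(scores, reference):
    """True if the predicted binding energy is lower (better) than the
    reference for every target. Both arguments map target name -> energy."""
    return all(scores[target] < reference[target] for target in reference)


variants = scan_position("S", 244, fixed=("S241M",))
print(len(variants), variants[:3])
```

Requiring improvement against all targets at once is one simple way to express the multi-target goal; a weighted sum or a Pareto front would be alternative design choices.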

General applicability of the binding mode identification method

To assess the general applicability of our integrated approach to identifying a binding mode, we investigated three known Rb targets whose co-crystal structures are available (Table 1): Interleukin-6 (IL-6: 4J4L) [29], epidermal growth factor receptor (EGFR: 4UIP) [36] and complement component 5a (C5a: 5B4P) [37]. We first investigated the importance of using ClusPro’s “antibody mode” for Rb docking. Docking without the antibody mode option resulted in a larger number of docking models, but the overall accuracy of the docking models proved to be worse than that of those generated with the option enabled (Table 1). The antibody mode puts a lower weight on the DARS [38] energy term than other docking modes [32]. In order to improve binding affinities of antibodies and other protein binders, mutations are extensively made on only one of the interacting partners (e.g., complementarity determining regions in antibodies) [39]. Thus, the statistics of observed amino acid frequencies for the binding interface regions are different. DARS assumes a symmetric interaction, which is beneficial for general protein-protein docking, whereas it is worse for antibody-antigen pairs [39]. We likewise hypothesize that interaction asymmetry, captured in ClusPro’s antibody mode, leads to improved prediction accuracy for repebody binding and other affinity-matured protein binders.

Consistent with the results above for RbF4-hFc, only a small number of variants were sufficient to localize the epitopes (two for C5a and EGFR, and four for IL-6). The filtering process resulted in a small set of docking models including native-like ones (5 to 12 models; solid circles in Fig 4). In order to identify the most native-like docking model, the ClusPro score was initially considered to rank the filtered docking models, but it was observed to be unreliable (S7 Fig). In contrast, but consistent with the RbF4-hFc results, Fig 4 shows that ranking based on the AMBER99sb force field energy successfully discriminated high-quality docking models for IL-6 and EGFR. It bears noting that the prior epitope localization was again critical; ranking by the force field energy alone was not sufficient to find native-like docking models. For example, in the case of IL-6, there are two incorrect docking models with lower energies than the most native-like model, but neither of them is in contact with true epitope residues and thus both were filtered out.

Fig 4. Retrospective tests of binding mode identification for additional targets.

(A, B, C) IL-6; (D, E, F) EGFR; (G, H, I) C5a. (A, D, G) I-RMSD vs. energy; (B, E, H) fnat vs. energy. Points represent docking models, with those that are in contact with epitope residues (correctly localized docking models) shown as solid circles. The blue circle is the model with the minimum force field energy (AMBER99sb), and is native-like for IL-6 and EGFR. Wild-type (star) energy levels are depicted as dotted lines. For IL-6, some models have lower energy values than the most native-like docking model, indicating that ranking only by the force-field energy is not sufficient for binding mode prediction. (C, F, I) Crystal structures are in gold and the docking models with the lowest energies are in blue.

The C5a case provides an additional insight into precise binding mode prediction (Fig 4, bottom row). While the binding interfaces were mostly correct (15 out of 18 correct interface residues: 83%), the predicted binding mode was completely inverted (N terminal to C, and vice versa). During the affinity improvement of the C5a-specific Rb, it was observed that some LRRV modules gave rise to a negligible increase in the binding affinity [37], suggesting that only LRRV1 and 2 were responsible for interacting with C5a. We thus hypothesized that the accuracy of structural modeling and an incorporation of the paratope information may also enhance docking quality. The four possible combinations of hypotheses (incorporation of the phage display information versus no incorporation, and a homology-model target versus a crystal structure target) revealed that the use of a high-quality structure (here the crystal structure) was not sufficient for accurate binding mode identification (Fig 5). No paratope information with the crystal structure resulted in worse correlations than the model with precisely annotated paratopes (Spearman ρ for fnat and I-RMSD of crystal structures: -0.1 and 0.1, and those of models with precise paratope definition: -0.7 and 0.19, respectively). The ideal case (crystal structures with precise paratopes) led to extremely accurate results (I-RMSD: 0.62Å and fnat: 0.76) with high correlations to the crystal structure (Spearman ρ for fnat and I-RMSD: -0.79 and 0.71).
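
The Spearman correlations quoted above measure how well the energy ranking tracks model quality: a useful energy function should give a positive ρ against I-RMSD (lower energy, lower deviation) and a negative ρ against fnat (lower energy, more native contacts). The following is a small self-contained sketch of the rank correlation, using toy numbers and not the study's data.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only: five docking models with energies and I-RMSDs.
energies = [-5.0, -4.0, -3.0, -2.0, -1.0]
irmsd = [1.0, 2.0, 4.0, 3.0, 8.0]
print(round(spearman(energies, irmsd), 3))
```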

Fig 5. Impacts of high-quality structure and paratope definition.

The incorporation of phage display results and the use of high quality structures (here crystal structures) lead to an extremely accurate identification of the binding mode. LRRV in blue is assigned for attraction, in red for repulsion and green for neutral (A and H). Results in the right two columns (I-N) are from docking models with precisely assigned paratopes (Results with no paratope information are in the left (B-G)). Docking models that are in contact with epitope residues (localized docking models) are in solid circle. Models with the minimum force field energy values are in blue circles. Wild-type (star) energy levels are depicted as dotted lines. Crystal structures are in gold and the docking models with the lowest energies are in blue.

This retrospective study demonstrated critical criteria for accurate binding mode identification. The full-atom force field energy can effectively discriminate the most native-like docking model when combined with an initial epitope localization using experimental data to filter the models; on its own it may not be sufficient. As also observed in the previous study with antibodies [11], high-quality antigen structures are not required for epitope localization; homology models generally suffice. However, they are necessary to generate and predict native-like binding modes. Finally, while, like antibodies, repebodies have well-defined target-binding sites, not all of them are actually involved in target binding. The inclusion of paratope information (e.g., residues contributing to phage display selection) was shown to improve the quality of docking models and binding mode prediction.

Discussion

Precise binding mode identification of protein binders is crucial for the development of therapeutic proteins, but it has relied heavily on labor-intensive experimental approaches. Computational methods that do not require structure determination offer a way to accelerate development processes and understanding of mechanisms of action, advancing the potential utility of relevant proteins in translational and basic research. We chose LRR proteins as a model system to evaluate the utility of an integrated computational and experimental approach to identifying binding modes and reengineering binding affinity and specificity, since these proteins not only represent a promising therapeutic scaffold, but also have the important advantage of undergoing small structural changes upon target binding. We extensively studied one LRR protein binder targeting the constant domain of human IgG1 (hFc). Computational and experimental epitope localization followed by full-atom energy minimization with the AMBER99sb force field enabled the successful selection of a docking model which was confirmed to be most native-like according to the independently solved X-ray crystal structure. The further utility and potential of our computationally-guided binding mode identification were demonstrated by successfully using the resulting model to design variants with increased binding affinity and altered specificity.

Unlike epitope localization or binding site identification, which do not require extremely native-like binding modes in the sampling step [11], the inclusion of native-like complex models is absolutely necessary for further affinity improvement from models. In general, the sampling quality of antibody-antigen docking often depends on the targets; however, in our case, native-like models were nearly always included among the samples, likely due to the rigid nature of LRR domains. Therefore, we anticipate that this approach may in principle be applicable not only to LRR domains, but also to any other proteins with rigid binding sites. As also demonstrated by the success of interface-guided docking methods [40–42], the exact definition of binding interfaces including paratope mutational data from phage display selection is also important.

As repeatedly shown here, selection of a docking model based only on the molecular modeling energy may be misleading, perhaps due to the inaccuracy of the energy functions [43]. Mutagenesis-based verification and filtering of docking models generally focused on the correct epitope region, and they were shown to be necessary here in order to complement imperfect energy scores. The epitope localization method used in this study, EpiScope [11], provided an optimal set of mutational variants; here requiring only six sets of in vitro experiments to effectively localize the epitope. With the epitope thereby localized, full-atom minimization enabled ranking of filtered docking models, resulting in identification of a native-like docking model.

The successful modeling of the binding mode directly enabled the design of binding specificities and affinities. From the model, we were able to identify key contributors to the binding specificity of RbF4 for the IgGs and engineer a simple manipulation to entirely change its binding specificity. Furthermore, since the force field energy is indicative of binding energy, we were able to select mutations based on the energy and dramatically improve binding affinities as predicted. However, as discussed above, this holds only with a good model of binding, as otherwise design based only on the energy may result in wrong predictions. For example, the force field energy prediction suggests that RbF4-MR would also have a good binding affinity to hIgG1, though the actual binding affinity was not improved (S7 Fig).

Methods

Computational methods

To design mutations for epitope localization, we used EpiScope with default settings [11]: three mutations per binding patch, and one Pareto-optimal curve and two suboptimal ones. The docking models were generated using the ClusPro webserver [44], initially using “Antibody mode” [39], and later a separate set was generated without that option. The non-CDR masking option was disabled, but the binding sites were masked for attraction and the convex side for repulsion (S1A Fig). As with the models of the targets, the complex models were minimized using the Tinker molecular dynamics package [33] with the AMBER99sb parameter set [32] and the GB/SA implicit solvent model [45]. The total energy was used to rank complex models, and FoldX (ver. 4) was employed to rapidly filter binding-disruptive mutations [34]. A complex structure was first repaired and optimized using ‘RepairPDB’. Given a mutation or set of mutations, the effect on binding was then calculated using the ‘BuildModel’ command.
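
To make the FoldX step concrete, the sketch below prepares the inputs for a RepairPDB run followed by a BuildModel run. The command-line flags, the `_Repair.pdb` output name, and the mutant-file line format (wild-type residue, chain, position, new residue, terminated by a semicolon) follow the FoldX command-line conventions as we understand them and should be checked against the manual for the installed version. The chain identifier and file names are hypothetical, and nothing is executed here.

```python
def foldx_mutation_line(mutations, chain):
    """Format e.g. ["S241M", "S244R"] on chain "A" as 'SA241M,SA244R;'
    (one line of a FoldX mutant file; format assumed, see lead-in)."""
    parts = []
    for m in mutations:
        wt, pos, new = m[0], m[1:-1], m[-1]
        if not pos.isdigit():
            raise ValueError(f"unexpected mutation label: {m}")
        parts.append(f"{wt}{chain}{pos}{new}")
    return ",".join(parts) + ";"


def foldx_commands(pdb_name, mutant_file="individual_list.txt"):
    """Return argument lists for the repair and mutation steps.
    Flag spellings are assumptions; the lists are not run by this sketch."""
    stem = pdb_name.rsplit(".", 1)[0]
    repair = ["foldx", "--command=RepairPDB", f"--pdb={pdb_name}"]
    build = ["foldx", "--command=BuildModel", f"--pdb={stem}_Repair.pdb",
             f"--mutant-file={mutant_file}"]
    return repair, build


line = foldx_mutation_line(["S241M", "S244R"], chain="A")  # chain "A" is hypothetical
repair_cmd, build_cmd = foldx_commands("complex.pdb")       # file name is hypothetical
print(line)
print(" ".join(repair_cmd))
print(" ".join(build_cmd))
```

Building the commands as lists keeps them ready for `subprocess.run` once the flags have been verified locally.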

There are currently three Rb-target complex structures in the PDB: binders to IL-6 (PDB code 4J4L), EGFR (4UIP), and C5a (5B4P). Unbound forms of the target structures were used for docking (Table 1). There is a missing loop in the structure of IL-6 (1ALU chain A: 52 SSKEALAEN). The loop was filled using MODELLER [46] and all the backbone atoms of the loop were minimized using Tinker as described above. While repebodies have a very rigid predefined structure, a single mutation at the 11th LRRV position (S1B Fig) to proline significantly changes the conformation. Two of the Rbs (4J4L and 5B4P) have such proline residues (at LRRV1 and LRRV2 respectively). These two Rbs were modeled using each other as templates. One LRRV unit of the EGFR Rb (4UIP) was omitted. The trimmed Rb structure was reconstructed by splitting the free Rb at LRRV3 and superimposing the LRRV3 on LRRV2 of the complete Rb using PyMol.

Preparation of Fc variants

The sequence of the human Fc binder repebody (RbF4) was obtained from a previously published study [30]. The Rb structure was modeled using MODELLER with a free form (3RFS:A) as a template structure. A free form of the hFc domain (3AVE) was used. The Fc sequence of trastuzumab (trade name, Herceptin) available from the literature (wild-type) and all subsequent variants were reverse translated, codon optimized for expression in mammalian cells, and synthesized by Integrated DNA Technologies (IDT), Inc. (Redwood City, CA). The CMVR VRC01 expression vector (NIH AIDS Reagent Program, Germantown, MD) harboring the wild-type Fc or the Fc variant sequences was transfected into suspension HEK 293 cells using polyethylenimine (PEI) (Polysciences, Warrington, PA). Briefly, 500 μg of the wild-type Fc or Fc variant DNA was combined with 1 ml of PEI and incubated at room temperature for 10 minutes. The mixture was then added to HEK cells in suspension and incubated in a humidified chamber at 37°C with 8% CO2 for at least 5 to 6 days. The medium containing the secreted wild-type Fc or Fc variants was clarified through centrifugation at 8000 rpm at 4°C for 15 minutes on a Beckman Avanti-J25 centrifuge (Brea, CA). The resulting supernatant was filtered through a 0.45 μm filter to remove any residual cell debris and other large particles before loading onto an FPLC column.

Affinity purification was conducted on a pre-packed 5 ml Protein A column (for wild-type Fc, and Fc variants 1 and 2) or a pre-packed 5 ml Protein G column (for Fc variant 3, and single mutations of the variant) from GE Healthcare (Pittsburgh, PA) as suggested by the manufacturer. The final sample was eluted with 100 mM glycine at pH 3 into 2 ml Eppendorf tubes prefilled with 50 μl of 1 M Tris and 5 mM EDTA. The purification process was automated on an AKTA FPLC system (GE Healthcare, Pittsburgh, PA). The purified protein was subjected to a second buffer exchange step using a HiTrap desalting column (GE Healthcare, Pittsburgh, PA). The final product was eluted in phosphate-buffered saline and stored at -20°C until further use. The purified proteins were analyzed by SDS-PAGE under reducing conditions and stained with Coomassie blue.

Expression and purification of repebodies

The genes encoding the repebodies were inserted into the NdeI and XhoI restriction sites of a pET21a vector (Invitrogen, Carlsbad, CA). Plasmids were cloned into competent E. coli DH5α cells using a heat shock method (at 42°C for 90 seconds). The recombinant plasmids were transformed into E. coli Origami-B cells (Merck, Kenilworth, NJ). Single colonies were inoculated into 5 mL of Luria-Bertani (LB) medium containing 50 μg/mL carbenicillin and grown overnight at 37°C in a shaking incubator (200 rpm). A total of 250 mL of LB containing 50 μg/mL carbenicillin was inoculated with the overnight-saturated culture to an OD600 of 0.05 and grown at 37°C with shaking at 200 rpm until the OD600 reached 0.5–0.8. The cells were induced using 0.5 mM IPTG and incubated at 18°C with shaking at 200 rpm for 16 hours. The cells were harvested through centrifugation at 8000 rpm for 20 minutes and suspended in a lysis buffer (50 mM NaH2PO4, 300 mM NaCl, and 10 mM imidazole, at pH 8.0). After cell lysis by sonication, the cell debris was removed through centrifugation at 16,000 rpm for 1 hour at 4°C. Cell lysates were loaded onto a Ni-NTA column (Qiagen, Hilden, Germany) and washed using a wash buffer solution (50 mM NaH2PO4, 300 mM NaCl, 20 mM imidazole, at pH 8.0). The repebodies were eluted using an elution buffer (50 mM NaH2PO4, 300 mM NaCl, 250 mM imidazole, at pH 8.0), and further purified using gel permeation chromatography (Superdex 75, GE Healthcare). The buffer of the purified repebodies was exchanged into PBS, and the proteins were concentrated using an Amicon Ultra centrifugal filter (10 kDa cutoff, Millipore).

Determination of crystal structure

The Fc domain of human IgG (hFc) was purified after digestion of the purchased human IgG (Sigma, St. Louis, MO) with papain, as described elsewhere [47]. The RbF4-hFc complex was reconstituted by incubating RbF4 and hFc at a 2:1 molar ratio on ice and then purified through gel filtration with a buffer containing 5 mM Tris·HCl and 0.1 M NaCl (pH 7.4). Crystals of the complex were grown using a hanging drop vapor diffusion method against a crystallization buffer containing 0.1 M sodium acetate, 12% (w/v) polyethylene glycol 6000, and 0.1 M magnesium chloride (pH 4.6) at 20°C. The crystals formed in the space group P212121 with a = 59.9 Å, b = 107.4 Å, and c = 171.4 Å, and contained one complex molecule in the asymmetric unit. The diffraction data were collected at 100 K, with crystals flash-frozen in a crystallization buffer containing 30% glycerol. A single-wavelength (1 Å) dataset was collected using a native crystal on beamline 5C (Pohang Accelerator Laboratory, Korea). Integration, scaling, and merging of the diffraction data were conducted using the HKL2000 program suite [48]. The initial phases were determined through molecular replacement using the Phenix AutoMR program [49] with human Fc of IgG (PDB accession 1H3X) and a repebody (PDB accession 3RFJ) as search models. Successive rounds of model building using Coot [50], refinement using the Phenix program [51], and phase combination allowed the building of the complete structure (S1 Table).
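The reported cell dimensions allow a quick consistency check of crystal packing. The sketch below computes the orthorhombic cell volume from the stated a, b, and c, and shows how a Matthews coefficient and approximate solvent fraction would follow from it. The molecular mass of the complex is not given in the text, so the mass used here is a placeholder, and the function names are hypothetical.

```python
def orthorhombic_cell_volume(a, b, c):
    """Volume (cubic angstroms) of an orthorhombic cell; all angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, n_asu, mass_da):
    """Matthews coefficient V_M: cell volume per dalton of protein in the cell."""
    return cell_volume / (n_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, using the conventional 1.23 A^3/Da for protein."""
    return 1.0 - 1.23 / vm


# P212121 has four asymmetric units per unit cell.
ASU_PER_CELL_P212121 = 4

# Cell edges as reported in the text (angstroms).
volume = orthorhombic_cell_volume(59.9, 107.4, 171.4)

# Placeholder only: NOT the mass of the RbF4-hFc complex, which the text does not state.
HYPOTHETICAL_MASS_DA = 100_000.0

vm = matthews_coefficient(volume, ASU_PER_CELL_P212121, HYPOTHETICAL_MASS_DA)
print(f"cell volume: {volume:.0f} A^3")
print(f"V_M for the placeholder mass: {vm:.2f} A^3/Da")
print(f"solvent fraction for the placeholder mass: {solvent_fraction(vm):.2f}")
```

Only the cell volume is tied to the reported data; the other two values change with whatever mass is supplied for the contents of the asymmetric unit.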

Isothermal Titration Calorimetry (ITC)

Binding affinities were measured using a MicroCal-iTC200 (Malvern Instruments, Malvern, UK). Fc variants, mouse IgG1, and rabbit IgG (Sigma) were diluted in a PBS buffer to a final concentration of 0.02 mM. RbF4 or RbF4-LT (RbF4 with the loop truncation) was diluted using the same buffer to a final concentration of 0.2 mM. The ITC experiments were performed with a total of 20 injections and stirring at 1000 rpm. The initial injection of 1 μL was excluded from data analysis. Titration curves were fitted with a one-site binding model. The value of Kd was determined using Origin (OriginLab).
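The one-site model assumes simple 1:1 binding, for which the amount of complex at each titration point has a closed-form solution. The sketch below illustrates that relationship and the conversion of Kd into a binding free energy. It is not the Origin fitting routine: the function names, the example Kd of 100 nM, and the temperature of 25°C are assumptions for illustration (the text does not state the ITC temperature), and dilution of the cell contents during injection is ignored.

```python
import math


def bound_complex(total_cell, total_titrant, kd):
    """Equilibrium concentration of a 1:1 complex (same units as the inputs).

    Solves x^2 - (P + L + Kd) x + P L = 0 for the physically meaningful root.
    """
    s = total_cell + total_titrant + kd
    return (s - math.sqrt(s * s - 4.0 * total_cell * total_titrant)) / 2.0


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol, dG = RT ln(Kd), 1 M reference."""
    gas_constant = 1.987204e-3  # kcal/(mol*K)
    return gas_constant * temperature_k * math.log(kd_molar)


CELL_CONC_M = 0.02e-3        # Fc or IgG in the cell, as stated in the text
HYPOTHETICAL_KD_M = 100e-9   # illustrative value, not a measured Kd

# Fraction of the cell protein bound at a few titrant-to-cell molar ratios.
for ratio in (0.5, 1.0, 2.0):
    titrant = ratio * CELL_CONC_M
    bound = bound_complex(CELL_CONC_M, titrant, HYPOTHETICAL_KD_M)
    print(f"molar ratio {ratio:.1f}: fraction bound = {bound / CELL_CONC_M:.3f}")

print(f"dG at the illustrative Kd: {binding_free_energy(HYPOTHETICAL_KD_M):.2f} kcal/mol")
```

A fitting routine would adjust Kd, the binding enthalpy, and the stoichiometry until the heats predicted from successive changes in bound complex match the integrated injection peaks.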

Supporting information

S1 Fig. The repebody structure.

(A) A repebody (Rb) largely consists of three parts: the N-terminal cap (LRRNT), variable regions (LRRV), and the C-terminal cap (LRRCT). Binding occurs at the concave region of LRRV (in darker blue). (B) Structure of a single LRRV motif, with side chains of conserved residues rendered as stick figures. Each LRR is composed of six conserved leucine residues, a central conserved asparagine residue, and a conserved phenylalanine residue on the C-terminal side.

(TIF)

S2 Fig. Detailed information about the mutations for hFc-F4 epitope localization and titration curves.

Based on the Kd values, H310 and N315 overlap with the epitope.

(TIF)

S3 Fig. Docking models of RbF4.

The crystal structure is in gold. (A) The full-atom energy minimization step may change the overall structure, but the binding interactions are likely to be maintained. The I-RMSD value of the lowest energy model (Model 1, blue) is slightly higher than that of the model with the second lowest energy (Model 2, pink). However, its fnat is twice as high. (B) There are two docking models which are in contact with the three mutations in Var 3 (Model 9: cyan and Model 10: pink). While their binding interface regions are largely correct, the binding orientation of the lower energy model (pink) is completely inverted.

(TIF)
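The fnat measure mentioned in the caption is the fraction of residue-residue contacts in the reference complex that are reproduced by a docking model. A minimal sketch follows, assuming the commonly used CAPRI-style definition in which two residues are in contact if any pair of their atoms lies within 5 Å. The cutoff, the function names, and the toy coordinates are illustrative assumptions rather than the settings used in the study.

```python
import math


def interface_contacts(receptor, ligand, cutoff=5.0):
    """Return the set of (receptor_residue, ligand_residue) pairs in contact.

    Each argument maps a residue label to a list of (x, y, z) atom coordinates.
    """
    contacts = set()
    for rec_res, rec_atoms in receptor.items():
        for lig_res, lig_atoms in ligand.items():
            if any(math.dist(a, b) <= cutoff for a in rec_atoms for b in lig_atoms):
                contacts.add((rec_res, lig_res))
    return contacts


def fnat(native_contacts, model_contacts):
    """Fraction of native contacts that are also present in the model."""
    if not native_contacts:
        raise ValueError("the reference complex has no contacts")
    return len(native_contacts & model_contacts) / len(native_contacts)


# Toy one-atom residues (hypothetical labels and coordinates, in angstroms).
receptor = {"R1": [(0.0, 0.0, 0.0)], "R2": [(10.0, 0.0, 0.0)]}
native_ligand = {"L1": [(3.0, 0.0, 0.0)], "L2": [(13.0, 0.0, 0.0)]}
model_ligand = {"L1": [(4.0, 0.0, 0.0)], "L2": [(20.0, 0.0, 0.0)]}

native = interface_contacts(receptor, native_ligand)
model = interface_contacts(receptor, model_ligand)
print(fnat(native, model))  # 0.5
```

In the toy case the model keeps one of the two reference contacts, so fnat is 0.5. A model can therefore have a slightly worse I-RMSD than another and still recover more of the true interface.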

S4 Fig. Details of binding affinities of RbF4 variants (RbF4, RbF4-LT and RbF4-MR) for hIgG1, mIgG1 and rIgG.

RbF4 binds strongly to hIgG1 and weakly to mIgG1. However, no binding affinity is measured for RbF4-rIgG. The truncation of the loop (RbF4-LT) enabled the variant to bind to all IgGs with similar binding affinities. RbF4-MR gains strong binding affinities for hIgG1 and mIgG1. See S2 and S3 Tables for details.

(TIF)

S5 Fig. FoldX scan of the RbF4 loop.

The residue scan using FoldX suggests that the inclusion of the loop may not enhance the binding affinity of RbF4 for rIgG.

(TIF)

S6 Fig. Predicted ΔΔG values of the RbF4 variants.

The variant with S241M and S244R mutations (RbF4-MR) is predicted to strongly bind to both hIgG1 and mIgG1. S244C and S244P were not considered.

(TIF)

S7 Fig. Binding mode prediction with the ClusPro score.

The ClusPro score was tested on the retrospective test set (A-C: IL-6, D-F: EGFR, and G-I: C5a binders). Docking models that are in contact with epitope-overlapping residues (localized docking models) are shown as solid circles. The blue circle is the model with the lowest ClusPro score. On the right-hand side, crystal structures are in yellow and the docking models with the lowest energies are in blue. Score assessment using the ClusPro score is not predictive.

(TIF)

S1 Table. Data collection and refinement statistics.

(DOCX)

S2 Table. Quality measures and contact information of RbF4 docking models.

Each black block indicates that the Fc position is in contact with the docked repebody.

(DOCX)

S3 Table. In silico binding specificity control and affinity improvements.

RbF4 strongly binds to hIgG1 (weakly to mIgG1), but no binding to rIgG is observed. Removal of the loop (RbF4-LT) causes a slight reduction in the binding affinity toward hIgG1, but yields a marginal affinity improvement for mIgG1. RbF4-LT produces a significant improvement in the affinity for rIgG. RbF4 with two mutations at 241 and 244 (S241M with S244R, RbF4-MR) binds to both hIgG1 and mIgG1 with high binding affinities.

(DOCX)

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work was supported by the Global Research Laboratory (NRF-2015K1A1A2033346) and Mid-Career Researcher Program (NRF-2017R1A2A1A05001091) through the National Research Foundation (NRF) of Korea, funded by the Ministry of Science and ICT. Y.C. was supported by the Korea Research Fellowship Program (NRF-2016H1D3A1938246) through the National Research Foundation of Korea, funded by the Ministry of Science and ICT. J.M. was supported by Basic Science Research Program (NRF-2017R1A6A3A04012313) of the National Research Foundation funded by the Ministry of Science and ICT, and the Ministry of Education. The production of the Fc and Fc variant was supported by the National Institute of General Medical Sciences of the U.S. National Institutes of Health under award numbers P20-GM113132 and 2R01GM098977. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Gan HK, Burgess AW, Clayton AH, Scott AM. Targeting of a conformationally exposed, tumor-specific epitope of EGFR as a strategy for cancer therapy. Cancer Res. 2012;72(12):2924–30. doi: 10.1158/0008-5472.CAN-11-3898
  • 2. Garrett TP, Burgess AW, Gan HK, Luwor RB, Cartwright G, Walker F, et al. Antibodies specifically targeting a locally misfolded region of tumor associated EGFR. Proc Natl Acad Sci USA. 2009;106(13):5082–7. doi: 10.1073/pnas.0811559106
  • 3. Khurana S, Fuentes S, Coyle EM, Ravichandran S, Davey RT Jr, Beigel JH. Human antibody repertoire after VSV-Ebola vaccination identifies novel targets and virus-neutralizing IgM antibodies. Nat Med. 2016;22(12):1439. doi: 10.1038/nm.4201
  • 4. Kim KM, Shin EY, Moon JH, Heo TH, Lee JY, Chung Y, et al. Both the epitope specificity and isotype are important in the antitumor effect of monoclonal antibodies against Her‐2/neu antigen. Int J Cancer. 2002;102(4):428–34. doi: 10.1002/ijc.10732
  • 5. Lewis GK. Challenges of antibody-mediated protection against HIV-1. Expert Rev Vaccines. 2010;9(7):683–7. doi: 10.1586/erv.10.70
  • 6. Zolla-Pazner S. Identifying epitopes of HIV-1 that induce protective antibodies. Nat Rev Immunol. 2004;4(3):199. doi: 10.1038/nri1307
  • 7. Abbott WM, Damschroder MM, Lowe DC. Current approaches to fine mapping of antigen–antibody interactions. Immunology. 2014;142(4):526–35. doi: 10.1111/imm.12284
  • 8. Gao J, Kurgan L. Computational prediction of B cell epitopes from antigen sequences. Immunoinformatics: Springer; 2014. p. 197–215.
  • 9. Zhang W, Xiong Y, Zhao M, Zou H, Ye X, Liu J. Prediction of conformational B-cell epitopes from 3D structures by random forests with a distance-based feature. BMC Bioinform. 2011;12(1):341.
  • 10. Hogues H, Gaudreault F, Corbeil CR, Deprez C, Sulea T, Purisima EO. ProPOSE: Direct Exhaustive Protein–Protein Docking with Side Chain Flexibility. J Chem Theory Comput. 2018;14(9):4938–47. doi: 10.1021/acs.jctc.8b00225
  • 11. Hua CK, Gacerez AT, Sentman CL, Ackerman ME, Choi Y, Bailey-Kellogg C. Computationally-driven identification of antibody epitopes. eLife. 2017;6:e29023. doi: 10.7554/eLife.29023
  • 12. Huang S-Y. Exploring the potential of global protein–protein docking: an overview and critical assessment of current programs for automatic ab initio docking. Drug Discov Today. 2015;20(8):969–77. doi: 10.1016/j.drudis.2015.03.007
  • 13. Sela-Culang I, Ofran Y, Peters B. Antibody specific epitope prediction—emergence of a new paradigm. Curr Opin Virol. 2015;11:98–102.
  • 14. Houk K, Liu F. Holy grails for computational organic chemistry and biochemistry. Acc Chem Res. 2017;50(3):539–43. doi: 10.1021/acs.accounts.6b00532
  • 15. Kastritis PL, Moal IH, Hwang H, Weng Z, Bates PA, Bonvin AM, et al. A structure‐based benchmark for protein–protein binding affinity. Protein Sci. 2011;20(3):482–91. doi: 10.1002/pro.580
  • 16. Siebenmorgen T, Zacharias M. Computational prediction of protein–protein binding affinities. Wiley Interdiscip Rev Comput Mol Sci. 2020;10(3):e1448.
  • 17. Sirin S, Apgar JR, Bennett EM, Keating AE. AB‐Bind: antibody binding mutational database for computational affinity predictions. Protein Sci. 2016;25(2):393–409. doi: 10.1002/pro.2829
  • 18. Lensink MF, Wodak SJ. Blind predictions of protein interfaces by docking calculations in CAPRI. Proteins. 2010;78(15):3085–95. doi: 10.1002/prot.22850
  • 19. Grove TZ, Cortajarena AL, Regan L. Ligand binding by repeat proteins: natural and designed. Curr Opin Struct Biol. 2008;18(4):507–15. doi: 10.1016/j.sbi.2008.05.008
  • 20. Akira S, Takeda K, Kaisho T. Toll-like receptors: critical proteins linking innate and acquired immunity. Nat Immunol. 2001;2(8):675. doi: 10.1038/90609
  • 21. Medzhitov R. Toll-like receptors and innate immunity. Nat Rev Immunol. 2001;1(2):135. doi: 10.1038/35100529
  • 22. Pancer Z, Saha NR, Kasamatsu J, Suzuki T, Amemiya CT, Kasahara M, et al. Variable lymphocyte receptors in hagfish. Proc Natl Acad Sci USA. 2005;102(26):9224–9. doi: 10.1073/pnas.0503792102
  • 23. Sutoh Y, Kasahara M. Lymphocyte Populations in Jawless Vertebrates: Insights Into the Origin and Evolution of Adaptive Immunity. The Evolution of the Immune System: Elsevier; 2016. p. 51–67.
  • 24. De Wit J, Hong W, Luo L, Ghosh A. Role of leucine-rich repeat proteins in the development and function of neural circuits. Annu Rev Cell Dev Biol. 2011;27:697–729. doi: 10.1146/annurev-cellbio-092910-154111
  • 25. van der Biezen EA, Jones JD. The NB-ARC domain: a novel signalling motif shared by plant resistance gene products and regulators of cell death in animals. Curr Biol. 1998;8(7):R226–R8. doi: 10.1016/s0960-9822(98)70145-9
  • 26. Deretic V, Saitoh T, Akira S. Autophagy in infection, inflammation and immunity. Nat Rev Immunol. 2013;13(10):722. doi: 10.1038/nri3532
  • 27. Björklund ÅK, Ekman D, Elofsson A. Expansion of protein domain repeats. PLOS Comput Biol. 2006;2(8):e114. doi: 10.1371/journal.pcbi.0020114
  • 28. Enkhbayar P, Kamiya M, Osaki M, Matsumoto T, Matsushima N. Structural principles of leucine‐rich repeat (LRR) proteins. Proteins. 2004;54(3):394–403. doi: 10.1002/prot.10605
  • 29. Lee S-C, Park K, Han J, Lee J-j, Kim HJ, Hong S, et al. Design of a binding scaffold based on variable lymphocyte receptors of jawless vertebrates by module engineering. Proc Natl Acad Sci USA. 2012;109(9):3299–304. doi: 10.1073/pnas.1113193109
  • 30. Heu W, Choi J-M, Lee J-J, Jeong S, Kim H-S. Protein binder for affinity purification of human immunoglobulin antibodies. Anal Chem. 2014;86(12):6019–25. doi: 10.1021/ac501158t
  • 31. Jeong S, Heu W, Kim J-w, Kim H-S. Protein binders specific for immunoglobulin g from different species for immunoassays and multiplex imaging. Anal Chem. 2016;88(23):11938–45. doi: 10.1021/acs.analchem.6b03851
  • 32. Hornak V, Abel R, Okur A, Strockbine B, Roitberg A, Simmerling C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins. 2006;65(3):712–25. doi: 10.1002/prot.21123
  • 33. Ponder JW. TINKER: Software tools for molecular design. Washington University School of Medicine, Saint Louis, MO: 2004;3.
  • 34. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic Acids Res. 2005;33(suppl_2):W382–W8.
  • 35. Choi Y, Furlon JM, Amos RB, Griswold KE, Bailey-Kellogg C. DisruPPI: structure-based computational redesign algorithm for protein binding disruption. Bioinformatics. 2018;34(13):i245–i53. doi: 10.1093/bioinformatics/bty274
  • 36. Lee J-j, Choi HJ, Yun M, Kang Y, Jung JE, Ryu Y, et al. Enzymatic prenylation and oxime ligation for the synthesis of stable and homogeneous protein–drug conjugates for targeted therapy. Angew Chem Int Ed Engl. 2015;54(41):12020–4. doi: 10.1002/anie.201505964
  • 37. Hwang D-E, Choi J-M, Yang C-S, Lee J-j, Heu W, Jo E-K, et al. Effective suppression of C5a-induced proinflammatory response using anti-human C5a repebody. Biochem Biophys Res Commun. 2016;477(4):1072–7. doi: 10.1016/j.bbrc.2016.07.041
  • 38. Chuang G-Y, Kozakov D, Brenke R, Comeau SR, Vajda S. DARS (Decoys As the Reference State) potentials for protein-protein docking. Biophys J. 2008;95(9):4217–27. doi: 10.1529/biophysj.108.135814
  • 39. Brenke R, Hall DR, Chuang G-Y, Comeau SR, Bohnuud T, Beglov D, et al. Application of asymmetric statistical potentials to antibody–protein docking. Bioinformatics. 2012;28(20):2608–14. doi: 10.1093/bioinformatics/bts493
  • 40. De Vries SJ, Van Dijk M, Bonvin AM. The HADDOCK web server for data-driven biomolecular docking. Nat Protoc. 2010;5(5):883. doi: 10.1038/nprot.2010.32
  • 41. Dominguez C, Boelens R, Bonvin AM. HADDOCK: a protein–protein docking approach based on biochemical or biophysical information. J Am Chem Soc. 2003;125(7):1731–7. doi: 10.1021/ja026939x
  • 42. Vajda S, Kozakov D. Convergence and combination of methods in protein–protein docking. Curr Opin Struct Biol. 2009;19(2):164–70. doi: 10.1016/j.sbi.2009.02.008
  • 43. Rosenfeld L, Heyne M, Shifman JM, Papo N. Protein engineering by combined computational and in vitro evolution approaches. Trends Biochem Sci. 2016;41(5):421–33. doi: 10.1016/j.tibs.2016.03.002
  • 44. Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, et al. The ClusPro web server for protein–protein docking. Nat Protoc. 2017;12(2):255. doi: 10.1038/nprot.2016.169
  • 45. Still WC, Tempczyk A, Hawley RC, Hendrickson T. Semianalytical treatment of solvation for molecular mechanics and dynamics. J Am Chem Soc. 1990;112(16):6127–9.
  • 46. Eswar N, Webb B, Marti‐Renom MA, Madhusudhan M, Eramian D, Shen My, et al. Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics. 2006;15(1):5.6.1–5.6.30.
  • 47. Coleman L, Mahler SM. Purification of Fab fragments from a monoclonal antibody papain digest by Gradiflow electrophoresis. Protein Expr Purif. 2003;32(2):246–51. doi: 10.1016/j.pep.2003.07.005
  • 48. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Meth Enzymol. 1997;276:307–26.
  • 49. Adams PD, Afonine PV, Bunkóczi G, Chen VB, Davis IW, Echols N, et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D. 2010;66(2):213–21.
  • 50. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D. 2004;60(12):2126–32.
  • 51. Adams PD, Grosse-Kunstleve RW, Hung L-W, Ioerger TR, McCoy AJ, Moriarty NW, et al. PHENIX: building new software for automated crystallographic structure determination. Acta Crystallogr D. 2002;58(11):1948–54.
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008150.r001

Decision Letter 0

Arne Elofsson, Marco Punta

18 May 2020

Dear Dr. Kim,

Thank you very much for submitting your manuscript "Computer-aided Binding Mode Prediction and Affinity Maturation of an LRR Protein Binder without Structural Determination" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Take particular care in addressing comments related to your previously published paper ref. 10 and explain in more detail the differences and advancements produced in the current submission. Also, please have the English double-checked.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Marco Punta

Associate Editor

PLOS Computational Biology

Arne Elofsson

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Recommendation: Publish with minor corrections

Comments:

This paper describes the application of protein:protein docking in conjunction with epitope prediction and experimental mutation data to predict the complex of an LRR protein and IgG1. This article is extremely well written and is a beautiful example of the power of computational techniques when paired with experimental data to guide the design of protein binders. I therefore recommend publication of the article with minor corrections.

My first concern is that the authors seem to make generalizations in the introduction but cite single articles which make the point. While the generalizations are correct there are better references the authors could cite.

Page 4 line 64 the authors cite references 10, 11 for the current status of prediction protein:protein binding poses. The authors should cite a comparative study such as:

1. Huang S-Y. Exploring the potential of global protein–protein docking: an overview and critical assessment of current programs for automatic ab initio docking. Drug Discov Today. Elsevier Ltd; 2015;20: 969–977. doi:10.1016/j.drudis.2015.03.007

2. Hogues H, Gaudreault F, Corbeil CR, Deprez C, Sulea T, Purisima EO. ProPOSE: Direct exhaustive protein-protein docking with side chain flexibility. J Chem Theory Comput. 2018;14: 4938–4947. doi:10.1021/acs.jctc.8b00225

Page 4 line 64 the authors cite reference 12 with respect to the challenge of prediction protein:protein affinity. The authors should cite a comparative study or a more relevant review such as:

1. Sirin S, Apgar JR, Bennett EM, Keating AE. AB-Bind: Antibody binding mutational database for computational affinity predictions. Protein Sci. 2016;25: 393–409. doi:10.1002/pro.2829

2. Kastritis PL, Moal IH, Hwang H, Weng Z, Bates PA, Bonvin AMJJ, et al. A structure-based benchmark for protein-protein binding affinity. Protein Sci. 2011;20: 482–491. doi:10.1002/pro.580

3. Siebenmorgen T, Zacharias M. Computational prediction of protein–protein binding affinities. Wiley Interdiscip Rev Comput Mol Sci. 2019; 1–18. doi:10.1002/wcms.1448

While not a concern, on Page 6 line 113-115 the authors state that they “… assumed that the major driving force of RbF4 for target binding should be no different from antibodies or high affinity protein binders (Antibody Mode)”. While the authors clearly show the benefit of using the Antibody mode scoring function in ClusPro in table 1, I would like the authors to expand on this statement and explain which features they feel are similar between their binding event and that of antibody CDR interacting with an antigen. In the reference cited by the authors (reference 28) the ClusPro authors state that antibody recognition is typically flatter and less hydrophobic than enzyme pockets, is this similar for your binding event. Additional discussion and insight would be beneficial to the reader.

Page 7 line 136 does the author mean AMBER99sb forcefield energy? Is this the total energy including the internal energy (bond, angle, torsions)?

On Page 9 Line 172-173 the authors state the following “The results indicate that the predicted binding mode is sufficiently accurate for further engineering of the binding specificity.” While the authors results are great, we should be careful about people misreading this sentence. The authors should clearly state that the predicted binding mode is the result of not only using a docking tool and scoring function but results were filtered using experimental information.

On page 9 line 179 the authors mention selection of LRR designs using AMBER forcefield energy yet this is not described in the experimental protocol on page 14. Did you use AMBER99sb again? Did you use total energy or MM-GB/SA? Clarification is needed.

On page 9 line 180 the authors mention a FoldX scan, but no mention of this in the computational methods section. Could the authors please include this. Also, to follow up on the above point did the authors solely use FoldX to create the mutants then use AMBER to score them or did they use the FoldX binding affinity scoring function? Again, further clarification is needed.

On Page 14 line 280-283 the authors state “From the two studies of the binding mode prediction and affinity improvement, the force field energy prediction becomes informative only if actual binding is known, i.e. supportive data from experiments are critical in practice.” I believe the authors are referring to their work here and therefore they should explicitly state that. Also, the statement is partially incorrect, since the authors clearly show that the forcefield energy can discriminate docking poses, but not how it correlates with binding affinity. This should be removed.

Overall this is an excellent paper that I can highly recommend for publication after minor corrections.

Reviewer #2: The manuscript by Choi et al. entitled "Computer-aided Binding Mode Prediction and Affinity Maturation of an LRR Protein Binder without Structural Determination" describes a project in computational protein engineering in which epitope prediction, docking and computational energy minimization were combined with intermediate experimental validation steps to determine the binding mode of the RbF4-hFc interaction to facilitate affinity maturation of this interaction. This work closely follows the approach taken in a previous publication (ref [10]), extending to the context of a leucine-rich repeat protein binder.

The ultimate goals of the work were three-fold:

i. To test the ability of computational energy minimization to accurately predict the binding mode of the interaction between hFc and a previously developed LRR protein binder or repebody RbF4, and to validate the accuracy of the prediction.

ii. To evaluate the utility of these predictions for affinity maturation of this interaction.

iii. To demonstrate that this approach can be applied to other LRR protein binders.

Overall, this is an interesting study that represents a contribution to the growing literature surrounding the application of computational approaches to the engineering of protein binders.

Major comments:

The authors use existing tools (ClusPro in conjunction with EpiScope) to predict the epitope of the RbF4-hFc interaction, using a crystal structure of the unbound hFc, and a homology model of the repebody RbF4. The strategy used in this study is initially entirely consistent with that in ref. 10, and the novelty here is that it is applied in the context of the RbF4-hFc interaction. Verifying that the strategy presented in ref. 10 can be applied in this additional context is useful.

1) The authors need to make clear in the introduction to the study the relationship of the work carried out in this manuscript to the previously published protocol from ref [10]. It is not immediately clear to the reader that the approach described in this manuscript is not wholly original, and I found that to be somewhat misleading. Placing the work in the broader context of the field does not detract from it.

2) How did the authors confirm that the measured reduction in binding affinity was not caused by changes in the stability or structure of Var3, H75A or N80K, but rather resulted from specific disruption of the Ab-Ag binding interface?

3) What positions were contained in Var 1 and Var 2? Please provide position numbering that is concordant with the PDB structure 3AVE referred to in the text.

4) The decision to evaluate each of the three mutations in Var 3 as single mutants appears to deviate from the approach proposed in ref 10, requiring additional experimental work. The authors comment:

Lines 125-126 'Although the hFc interface region in contact with RbF4 was roughly identified through the triplet, it is probably that not all of them are involved in the binding.'

What is the effect on the subsequent analysis if this step to interrogate each of the three mutations in Var 3 is not taken? What do the authors recommend to future users of this technology as the standard approach?

5) Figure 2B and C show that of the docking models in contact with the epitope overlapping residues, the model with the lowest energy has close to the lowest I-RMSD and the highest f_nat. How different is the binding mode of the model with higher energy and lower I-RMSD?

6) Similarly, how different were the models with lower energy in Figs 2B and C that did not fulfill the conditions of being in contact with H75 and N80 but not with H200 or the six other positions in Var 1 and 2? Please provide a table in the supplement with details of the positions in contact in each model, and the force field energy and structural parameters (I-RMSD, f_nat) for each of the 29 models.

Minor comments:

1. While overall the writing and presentation are fully adequate and relatively easy to understand, the paper would nonetheless benefit greatly from careful reading to fix a large number of awkward and/or unclear statements and phrases. For example:

(i) Lines 111-113 are quite unclear, in particular the phrase 'assigning attractions at the concave residues of LRRV modules from 2 to 4, as known during the phage display selection' does not make sense.

2. What is shown in red in Figure 1d? I'm guessing it is the repebody loop?

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008150.r003

Decision Letter 1

Arne Elofsson, Marco Punta

14 Jul 2020

Dear Dr. Kim,

We are pleased to inform you that your manuscript 'Computer-guided Binding Mode Identification and Affinity Improvement of an LRR Protein Binder without Structure Determination' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Marco Punta

Associate Editor

PLOS Computational Biology

Arne Elofsson

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008150.r004

Acceptance letter

Arne Elofsson, Marco Punta

24 Aug 2020

PCOMPBIOL-D-20-00449R1

Computer-guided Binding Mode Identification and Affinity Improvement of an LRR Protein Binder without Structure Determination

Dear Dr Kim,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. The repebody structure.

    (A) A repebody (Rb) largely consists of three parts: an N-terminal cap (LRRNT), variable regions (LRRV) and a C-terminal cap (LRRCT). Binding occurs at the concave region of LRRV (in darker blue). (B) Structure of a single LRRV motif, with side chains of conserved residues rendered as stick figures. Each LRR is composed of six conserved leucine residues, a central conserved asparagine residue, and a conserved phenylalanine residue on the C-terminal side.

    (TIF)

    S2 Fig. Detailed information about the mutations for hFc-F4 epitope localization and titration curves.

    Based on the Kd values, H310 and N315 are epitope-overlapping residues.

    (TIF)

    S3 Fig. Docking models of RbF4.

    The crystal structure is in gold. (A) The full-atom energy minimization step may change the overall structure, but the binding interactions are likely to be maintained. The I-RMSD value of the lowest energy model (Model 1, blue) is slightly higher than that of the model with the second lowest energy (Model 2, pink). However, its fnat is twice as high. (B) There are two docking models which are in contact with the three mutations in Var 3 (Model 9: cyan and Model 10: pink). While their binding interface regions are largely correct, the binding orientation of the lower energy model (pink) is completely inverted.

    (TIF)

    S4 Fig. Details of binding affinities of RbF4 variants (RbF4, RbF4-LT and RbF4-MR) for hIgG1, mIgG1 and rIgG.

    RbF4 binds strongly to hIgG1 and weakly to mIgG1. However, no binding affinity is measured for RbF4-rIgG. The truncation of the loop (RbF4-LT) enabled the variant to bind to all IgGs with similar binding affinities. RbF4-MR gains strong binding affinities for hIgG1 and mIgG1. See S2 and S3 Tables for details.

    (TIF)

    S5 Fig. FoldX scan of the RbF4 loop.

    The residue scan using FoldX suggests that the inclusion of the loop may not enhance the binding affinity of RbF4 for rIgG.

    (TIF)

    S6 Fig. Predicted ΔΔG values of the RbF4 variants.

    The variant with S241M and S244R mutations (RbF4-MR) is predicted to strongly bind to both hIgG1 and mIgG1. S244C and S244P were not considered.

    (TIF)

    S7 Fig. Binding mode prediction with the ClusPro score.

    The ClusPro score was tested on the retrospective test set (A-C: IL-6, D-F: EGFR, and G-I: C5a binders). Docking models that are in contact with epitope overlapping residues (localized docking models) are shown as solid circles. The blue circle is the model with the lowest ClusPro score. Crystal structures are in yellow and the docking models with the lowest energies are in blue on the right-hand side. Score assessment using the ClusPro score is not predictive.

    (TIF)

    S1 Table. Data collection and refinement statistics.

    (DOCX)

    S2 Table. Quality measures and contact information of RbF4 docking models.

    Each black block indicates that the Fc position is in contact with the docked repebody.

    (DOCX)

    S3 Table. In silico binding specificity control and affinity improvements.

    RbF4 strongly binds to hIgG1 (weakly to mIgG1), but no binding to rIgG is observed. Removal of the loop (RbF4-LT) causes a slight reduction in the binding affinity toward hIgG1, but yields a marginal affinity improvement for mIgG1. RbF4-LT produces a significant improvement in the affinity for rIgG. RbF4 with two mutations at 241 and 244 (S241M with S244R, RbF4-MR) binds to both hIgG1 and mIgG1 with high binding affinities.

    (DOCX)

    Attachment

    Submitted filename: response_final.pdf

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
