Skip to main content
Wiley Open Access Collection logoLink to Wiley Open Access Collection
. 2014 Apr 23;82(8):1599–1610. doi: 10.1002/prot.24576

Assessment of fully automated antibody homology modeling protocols in molecular operating environment

Johannes K X Maier 1,*, Paul Labute 1
PMCID: PMC4284016  PMID: 24715627

Abstract

The success of antibody-based drugs has led to an increased demand for predictive computational tools to assist antibody engineering efforts surrounding the six hypervariable loop regions making up the antigen binding site. Accurate computational modeling of isolated protein loop regions can be quite difficult; consequently, modeling an antigen binding site that includes six loops is particularly challenging. In this work, we present a method for automatic modeling of the FV region of an immunoglobulin based upon the use of a precompiled antibody x-ray structure database, which serves as a source of framework and hypervariable region structural templates that are grafted together. We applied this method (on common desktop hardware) to the Second Antibody Modeling Assessment (AMA-II) target structures as well as an experimental specialized CDR-H3 loop modeling method. The results of the computational structure predictions will be presented and discussed.

Keywords: homology modeling, comparative modeling, antibody structure, antibody engineering, CDR-H3 loop modeling, loop enrichment, loop refinement

INTRODUCTION

The hallmark feature of antibodies is their ability to bind an almost unlimited collection of target structures with remarkably high affinity. This high target specificity combined with modular composition into distinct functional and structural domains makes immunoglobulins particularly attractive for use as drugs. Antibodies have been successfully applied in therapeutic contexts for the last 20 years and there are currently hundreds of antibody-based drugs in the late stage of clinical trials.1 The focus of engineering efforts revolves around the FV region of the immunoglobulin, the smallest fragment of the antibody that retains antigen binding ability. One particular challenge in modeling an antigen-binding site is that the regions of interest—the six hypervariable loop regions—are designed by nature to be diverse, whereas the region outside the antigen binding loops, the framework region (FR), is structurally characterized by a conserved beta barrel fold.

Accurate computational modeling of isolated protein loop regions can be quite difficult; consequently, modeling an antigen binding site that includes six loops is particularly challenging. Fortunately, five of the six hypervariable “CDR” loops, three of the light chain (CDR-L1, CDR-L2, and CDR-L3) and two of the heavy chain (CDR-H1 and CDR-H2) assume a rather small repertoire of main chain conformations and are therefore called “canonical” loops24 leaving the central challenge of modeling the CDR-H3 loop. A further complication is the fact that the antigen-binding site has to be modeled as a dimer composed of the variable domains of the light VL and heavy VH chains. Given that the CDR loops bind to the antigen at the interface of the VL and VH chains, their relative orientation also affects binding and can create an additional source of diversity.

A computational assessment of various antibody modeling programs was held in 2009,5 the First Antibody Modeling Assessment (AMA-I). Our various submissions in AMA-I were characterized by the amount of user intervention necessary to model the target as well as a control protocol that used no antibody-specific knowledge. Our primary fully automated protocol used in AMA-I, referred to as “autoFv,” has undergone several modifications since its first implementation and is the standard for automated antibody homology modeling in the Molecular Operating Environment6 (MOE) software.

In this work, we report on our contributions to the Second Antibody Modeling Assessment (AMA-II) that consisted of two stages: Stage 1 involved full structure modeling (as in the AMA-I) while Stage 2 focused on H3 modeling. The autoFv protocol (also referred to as “ccg3” in this work) was used along with other prototype protocols to be described below. Our efforts for AMA-II were directed at rapid automated modeling of antibodies; that is, running on common desktop hardware without user intervention. We describe our methods in the Materials and Methods section. In the next section, the results of AMA-II computational experiments are presented and discussed. Conclusions are drawn in the final section.

MATERIALS AND METHODS

The MOE antibody modeler is based upon the use of a precompiled antibody structure database, the “Fab Database” (see below). This database is used as a source of framework and hypervariable region structural templates, which are grafted together (see below). Typically, the Fab Database will contain Protein Data Bank7,8 (PDB) structures augmented with additional proprietary structures, if available. In this work, only PDB structures available as of December 2012 were used, a total of 1,969 structures. Given the VL and VH amino acid sequences as input, the automated antibody modeling protocol (autoFv) proceeds as follows:

  1. Search the Fab Database sequences for collections of candidate templates for each of VL, VH, and the FR FV = (VL,VH) as well as individual CDR loops.

  2. Build 10 crude models from combinations of the template candidates. Each model is built by grafting the loop templates onto the template VL and VH framework chains. In case of hetero template composition, the VL and VH grafted templates are transposed via backbone superposition to FV dimer coordinates of the highest ranking (see below) FV dimer of the VL + VH search results. Side chains are modeled with a rotamer library and coordinates are refined using forcefield energy minimization to relieve strained geometries and clashes.

  3. Build a consensus model from the pool of crude models. The model with the lowest FV binding energy is used as the consensus structural core.9 The binding energy of the FV complex is calculated as E (VL:VH) – E (VL) – E (VH). Cluster the CDR conformations in the pool and select the best representative (see below) for each CDR loop. Graft the selected loops onto the consensus structural core using the protocol for building the crude models.

This autoFv procedure is termed “ccg3” for AMA-II; the “ccg1” and “ccg2” prototype protocols are derivatives of ccg3 that will be described after the details of the ccg3/autoFv protocol are described. All three protocols are summarized in the flow chart of Figure 1.

Figure 1.

Figure 1

Workflow diagram of the automated antibody homology modeling protocol. The FV consensus model is built based on a collection of either 10 (ccg3) or 5 (ccg2) individual models. The steps that are specific to the ccg2 protocol are outlined in dark blue color; ccg1 is an extension step to ccg2 or ccg3 in generating additional loop conformation in CDR-H3.

Fab database compilation

The Fab Database is a collection of antibody structures determined by x-ray crystallography. It was assembled using an automated procedure that identifies FV domains from the PDB (or collection of proprietary structures). The antibody detection and classification system is based on a previously published10 collection of VL and VH reference sequences enriched with antibody sequences submitted to the PDB with no restrictions concerning type or species. The reference sequences were aligned using MOE, followed by single-linkage clustering to extract antibody clusters of maximum diversity in the framework sequence. The VL reference set is composed of 101 antibody structures with a maximum framework sequence identity of 85% and the VH set of 85 structures at a maximum framework identity of 75%.

The entire PDB was then screened for immunoglobulin structures. The sequence of each protein chain is aligned against the VL and VH reference sets and the sequence identity is calculated as a percentage of identical residues within the aligned FV domain as well as within the respective FRs FR1, FR2, and FR3. Protein chains are classified as antibody chains if the overall sequence identity with one chain of the reference collections exceeds 40%. Classifying an FV chain as either VL or VH requires a framework sequence identity of at least 60% with one class and <50% with the opposite class. The cutoffs in the antibody detection as described above ensured no contamination of other immunoglobulin-like structures such as T-cell receptor sequences. If detected as a VL chain, further subclassification into either κ or λ subclasses is achieved by aligning to reference germline sequences originating from the Immunogenetics database.11,12 For structural clustering, the Structural Composition of Protein database SCOP13,14 was used as seed clusters followed augmented with single linkage clustering.

Template search

Given the VL and VH amino acid sequences of the antibody to be modeled, the Fab database is searched for suitable templates. Separate results are collected for VL, VH, and VL + VH as well as the hypervariable loop regions. A sequence alignment with the BLOSUM6215 substitution matrix is performed. Two different scores are used for framework and hypervariable loop regions. The scores for the FRs are calculated as the identity with prealigned framework clusters in the Fab Database, factoring gaps into the calculation. The hypervariable loop scores are calculated as similarity of the query loop compared to the corresponding hypervariable loop sequences of the same length in the Fab Database.16 Since the template search is confined to the content of the Fab Database, the time required to find suitable templates takes on average <0.5 s per query. Only the top 10 scoring framework candidates from each of respective result set are retained. If the sequence identity of a VL + VH result is within 10% of the target sequence then a homo template (VL and VH belong to the same structure) is used, otherwise hetero templates (VL and VH from different structures) are considered. For the CDRs, only the top 10 ranking candidates at a minimum sequence similarity of 50% are retained. In case the similarity cutoff in a CDR region is not met, and no templates are found as a result, the loop will be modeled using the general loop building procedure of the MOE Homology Modeler instead of the particular antibody techniques described here.

Template and model construction

Antibody template structures are built by grafting the framework and hypervariable backbone geometries together. The selected CDR template chains are superimposed in the backbone atoms (N, Cα, C, and O) onto the target framework template within three framework residues on either side of the CDR region followed by tethered energy minimization to relax strained geometries in the transition. In case of a hetero template composition, the templates are transposed via backbone superposition to FV dimer coordinates of the highest-ranking FV dimer of the VL + VH results of the template search. The MOE general Homology Modeler was used to perform sidechain repacking of CDR loop sidechain rotamers as well as framework residues with van der Waals clashes (>1.5 kcal/mol) with the grafted loop residues. For this modeling exercise, all CDR residues were repacked using the conformations from the MOE rotamer library exclusively.

For the consensus model, single-linkage clustering is applied to determine the conformational spread within the respective CDR loops of the models in the pool built above. The near neighbor cutoff for the CDR conformation clusters is set to 0.4 Å for the canonical loops L1, L2, L3, and H1, 0.6 Å for the H2 and 1.5 Å for H3. The loop candidate with the highest structure score within the most frequent CDR conformation was selected as input for the consensus core structure. The structure score, S, is a geometric mean with S = (t3 g2ob)1/7 where t, g, o, and b are values between 0 and 1 as follows; t assesses backbone topology, bond lengths, angles etc.; g is the Ramachandran phi/psi probability; o assesses crystallographic occupancies; b assesses temperature factors. Each of t, g, o, and b were calibrated by setting 0 to correspond to the lower range of outliers and 1 the upper range of outliers as measured from PDB statistics. Values of S are interpreted as 0 meaning completely unsuitable for modeling purposes and 1 meaning ideal. Only structures with S > 0.65 were used in this work.

For the canonical CDR loops, the loop conformations of the crude models are used as input for conformational analysis in each CDR loop category, recruiting the most common conformation as input for the consensus model. The conformational analysis helps filter CDR loop conformations that fit particularly well within a distinct modeling environment, shifting the selection process away from general sequence criteria to structurally more specific ones. Consequently, CDR loops with lower sequence identity (but higher structure scores) may be selected for assembly in the consensus model. In addition, the crude models serve as a pool to address complex structural issues such as the VL:VH dimer orientation and the variability in CDR loop conformations that are challenging to predict based on sequence alone.

Protocol ccg2

The ccg2 protocol is a variation of the autoFv/ccg3 protocol described above. First, only five crude models are built (instead of the usual 10), which reduces the overall modeling time significantly. Second, the canonical CDR loops templates are selected based on position-based profile scores of the respective CDR loop clusters in a special canonical loop database (see below) that match the length of the CDR query; only the top five most sequence-similar candidates are retained. For H3, only the top five template candidates based on sequence similarity calculations are retained.

The canonical CDR loop templates of the ccg2 protocol are selected from a dedicated loop database. All canonical CDR loops extracted from the Fab Database were grouped by CDR type and loop length. Templates with structural issues in the protein backbone and/or low phi/psi scores were discarded. The remainder were clustered based on backbone RMSDs at a resolution of 0.3 Å employing a single-linkage clustering algorithm followed by a sequence profile calculation that determined the frequencies of unique residues in each cluster column. The resulting values were normalized and used as weights for compiling a position score for each residue in the cluster and used when matching the template queries to the sequence profiles of each cluster. A similar approach using Hidden Markov models has been previously reported.17 In this way, residues that are more preserved (and thus more relevant for determining the actual conformation) within a cluster have a stronger impact on the selection.

The selection of framework candidates is based on position-based profile scores with residues shown to be involved in FV dimer interactions. The orientation between VL and VH is a particular challenge in modeling an antigen-binding site.18,19 Narayanan et al. (2009) used the Cα RMSD in the FR of VL when superposed on the VH framework as a metric to quantify the relative changes in the VL:VH orientation and also demonstrated that the native VL:VH orientations correspond with the energy minima measured as FV complex interaction energies. We make use of this observation along with the residues identified by Abhinandan and Martin20 in the framework (L38, L40, L41, L44, L46, L87, H33, H42, H45, H60, H62, H91, H105, and Chothia numbering) for the profile scores. With this distinct and reduced set of residues in place that are (intended to be) determinants of the FV dimer orientation, the framework template candidates were identified based on the number of identical residues within this distinct set between the templates. The overall framework sequence identity was used as a secondary metric to determine the final FR template selection.

Protocol ccg1

The ccg1 is a derivative of the ccg2 protocol in which the CDR-H3 loop diversification step is applied using the Fab Database described above. The database was first searched for H3 loops identical in length with the query CDR-H3. Single-linkage clustering was applied based on backbone RMSD at a resolution of 0.5 Å to determine distinct sets of CDR-H3 loop conformations. The candidate with the best structure score within each conformation cluster was used as input for template assembly into the VH chain of the consensus model or native structure. A model based on each cluster was built in the manner described above. For each resulting model, VL:VH binding energies and conformational energies were calculated. Models with more than 10 clashes > 0.5 kcal/mol or positive conformation energies were discarded. The three models with the lowest binding energies were output.

Manual modeling of target MA2-1 in AMA-II Stage 1

The lack of suitable structure templates for CDR L1, L3, and H3 combined with low sequence identity in the FR across species precluded a fully automated modeling approach for target MA2-1. Instead, this target was modeled using the interactive MOE Antibody Modeler application using the default settings. This application suggested PDB:2VUO—one of the few rabbit immunoglobulins in the PDB—as the only suitable structure template. The top 3 ranking models based on geometry assessments in the protein backbone and overall energy scores of 100 coarse models were used for CDR conformation diversification in CDR-L1, CDR-L3, and CDR-H3 using LowModeMD.21 The backbone atoms N, Cα, C, and O of the CDR loops L2, H1, and H2 as well as the framework atoms were kept fixed, allowing only the sidechain atoms to move and thus accommodating novel conformations. The procedure was stopped after 300 distinct conformations were generated. The candidates with the lowest energies were selected and subjected to a final structure refinement procedure as described above.

CDR-H3 modeling in AMA-II Stage 2

Stage 2 focused on modeling CDR-H3 when given x-ray crystal structures (with H3 removed) provided by the organizers of the modeling assessment. Our efforts involved the use of a dedicated CDR-H3 loop database. This database was searched to find H3 loops (of matching length) that superposed to an RMSD of 0.25 Å to the surrounding three framework backbone atoms. Sidechains were repacked as described above and the coordinates were refined using a tethered protocol. During refinement, all heavy atoms of the input x-ray structure were held fixed allowing only the H3 loop atoms to move within the antigen-binding complex. The minimization procedure was repeated five times, gradually releasing the tethers on the backbone atoms of H3. Finally, models with unresolvable clashing energies >2.5 kcal/mol were discarded and the best scoring (see below) conformation was selected. On average, the processing time of one conformation consumed ∼1 CPU minute.

For the compilation of the CDR-H3 loop database, all H3 loops were extracted from the Fab Database and clustered according to length and backbone RMSD. Single-linkage clustering applied based on a backbone RMSD cutoff of 0.4 Å to determine distinct sets of CDR-H3 loop conformations. Structures with positive forcefield energies, poor (<85%) Ramachandran map probability, or a clash count of >5 were discarded. The cluster member with the highest score (see below) was used as the representative. The antibody dimer structures of these CDR-H3 representatives were then extracted from the Fab Database for further diversification of the loop conformation space by subjecting them to conformational search using LowModeMD. All residue atoms beyond a perimeter of 5.5 Å of CDR-H3 loop atoms were kept fixed, as well as the backbone atoms of non-H3 residues within the perimeter. This left the H3 loop atoms, including three flanking framework residues and all sidechain atoms within the perimeter, free. LowModeMD was terminated after 100 attempts to generate a novel conformation up to a maximum or when 50 unique structures were generated. The structural scores of the newly generated H3 conformations were calculated discarding conformations with inappropriate structure/clashing scores. The remainders were clustered at a resolution of 0.4 Å and the conformation representatives were merged with the corresponding conformation pool extracted from the Fab Database.

As a control alternative to energy-based scores used for ranking output models in Stage 1, a scoring function was developed factoring in H3-specific properties like accessible surface area, psi/phi values and interaction energies of CDR-H3 to the FV dimer complex. The prepared representative FV dimer structures extracted above from the Fab Database were used as raw data. For each loop length category (ranging from 4 to 14 residues), the respective components were calculated, normalized, and the frequency data was fitted to a Gaussian distribution. The final CDR-H3 loop score was calculated as the geometric mean of the three individual loop scores.

Hardware and software

All calculations (and validation analyses) in this work were conducted using the MOE version 2012.10 on two computers, a Mac Pro, Dual Core Intel Xeon at 2.66 GHz, OS X 10.5 (Leopard) and a Dell Inspiron, Intel 2 Dual Core on OS Linux Mint 13. Energy calculations and coordinate refinement were performed using the AMBER 1222 forcefield parameters augmented with the GB/VI implicit solvent model23 with a nonbonded cutoff of 8 Å. PDB structures were prepared for simulation work using the MOE Structure Preparation application followed by Protonate3D.24

RESULTS AND DISCUSSION

AMA-II Stage 1

The Stage 1 models (ccg1, ccg2, and ccg3) were built using the methods described in the previous section with the exception of target MA2-1, a rabbit antibody, which was modeled manually due to a lack of templates in the PDB. Each final ccg3 model required 20–50 CPU minutes to compute and 10–25 CPU minutes per final model for ccg2 and ccg3 protocols with most of the time spend in coordinate refinement. All of the protocols presented here produced models of good structural quality evidenced by an overall MolProbity score average of 1.5, lower than the 1.8 average we have calculated for 1,969 x-ray structures from the PDB (our Fab Database). A detailed listing of the various MolProbity measurement components and comparisons to the other participants are provided in the Supporting Information Tables of Topalev et al. in this issue.

Table 1 presents a comparison of the ccg1–3 structures with the (previously unknown) x-ray coordinates of the targets of the second antibody modeling assessment. The RMSD values measured over the entire framework of all 10 models (FR) reveals a more robust and overall slightly better performance of the established ccg3/autoFv protocol (0.80 ± 0.19 Å RMSD), compared with ccg2 (0.87 ± 0.24 Å RMSD). However, the discrepancy is largely caused by a poor ccg2 template choice for modeling target MA2-3 with a RMSD of 1.35 Å compared with 0.57 Å of the corresponding ccg3 model. The low agreement in dimer orientation is further evidenced by large tilt angles (14.3°) and large discrepancies in the HL torsion and HC2 bend angles (7.8° and 7.9° difference to the x-ray structure). Our analysis revealed that the failure in selecting an appropriate framework template for this target was not solely rooted in a failure of the FR scoring function that was applied in selecting the five template candidates to build the crude models, but rather in the unusual, “open” conformation of CDR-H3 in one of the models. Unfortunately, this unusual conformation also measured the lowest FV complex binding energy among the five crude models that is the driving parameter for the framework selection of the consensus model. Further analysis of the MolProbity scores (Topalev et al. in this issue) and the all-atom energies of the ccg2 model are reasonable for all of the measured components suggest that lower interaction energies were not obtained at the expense of unrealistic structures. It is possible that a larger collection of crude models as used by ccg3/autoFv (ten instead of five) offers better protection against selecting a particularly poor framework template. However, given our findings that CDR-H3 residues are strongly influential in the interaction between VL and VH chains, the issues might be more complex and point potentially to a chicken and egg problem by which the orientation of the VL and VH chains may only be predicted accurately if the correct conformation of CDR-H3 is known. Even with inclusion of the poor modeling result of target MA2-3, the measurements in the FR are promising and comparable with the other modeling approaches presented in this contest (see Almagro overview in this issue). As the ccg2 modeling protocol was still under development at the time of AMA-II, we are optimistic that further fine-tuning of the scoring parameters will produce more stable results that are in keeping with those demonstrated with ccg3/autoFv over the last few years. The even more experimental ccg1 protocol fared worst of the three. Since ccg1 used the output of ccg2 it would appear that the additional steps were detrimental and further investigation is required to diagnose the cause. An interesting possibility is ccg1's additional coordinate refinement with CDR-H3 backbone atoms fixed (letting the other CDR loops adapt to the backbone of CDR-H3), which may introduce negative effects on the other CDR loops (e.g., CDR-L1 in MA2-2, MA2-3 and MA2-10 and to a lesser degree in CDR-L3 in MA2-3).

Table I.

RMSD Values in Å Between Three CCG Protocols (See Text) and the Stage 1 Target X-Ray Crystal Coordinates; “All” is the RMSD on the Entire FV Region; “FR” Is the RMSD on the Framework (Excluding the CDR Loops); “Canonical” Is the RMSD on CDR L1L3 and H1H2; “L1L3” and “H1H3” Are the RMSDs of Each Individual CDR Loop Region

graphic file with name prot0082-1599-t1.jpg

All RMSD measurements (in Å) are based on backbone atoms N, Cα, C, and O.

Since the distinct arrangement of CDR loops is a critical factor in the formation of the antigen/antibody complex, CDR measurements must be assessed on two levels–the first with respect to their orientation within the dimer and the second with respect to the accuracy of the individual loop conformations. Assessment is compromised, however, when the entire antigen-binding surface (all Loops) is considered due to the influence of the CDR-H3 loop component. Accordingly, the canonical loops are assessed first (Table 1, Canonical) and, in this measurement component, these show competitive results with an average ccg3/autoFv of 1.00 ± 0.29 Å RMSD for all 10 models. The RMSDs are almost identical between the protocols (∼1.0 ± 0.25 Å) with some interesting variations in specific cases. For example, larger RMSD outliers of single models are responsible for the variations in the light chain averages (CDR-L1 in MA2-7, 1.25 vs. 0.37 Å and 0.58 vs. 0.27 Å in MA2-8 for CDR-L2), whereas overall improvement in modeling CDR-H1 and CDR-H2 is observed with ccg2 over ccg3. As expected, the less canonical and particularly challenging CDR loops (CDR-L1 and CDR-L3 of target MA2-1, CDR-H1 of MA2-3 and CDR-L3 of MA2-5) show lower precision but within the range of the results in the modeling competition (see Table 1 and Teplyakov and Almagro this issue). Promising results are seen in CDR loops with low agreements to the North CDR clustering scheme such as the CDR-L3 loops of MA2-2 and MA2-3, CDR-L1 of MA2-4 and MA2-5 and CDR-H2 of MA2-6 and MA2-8 where the RMSD values are within the modeling precision of less challenging cases. An exception is CDR-H2 of MA2-8 that has only been well modeled with the ccg2 protocol (0.35 vs. 0.90 Å RMSD). As mentioned in Teplyakov and Almagro (this issue), improvements could be made in addressing the cis/trans configuration of certain proline residues in the CDRs in more detail.

A quality assessment of the antigen-binding surface is not complete without the analysis of the key CDR-H3. The unusually broad range in loop lengths, the conformational flexibility as well as its influence on other areas of the antigen-binding complex warrants a separate evaluation. Since it is not yet possible to employ any type of loop conformation dictionary in modeling this loop accurately, it is probably best to offer a set of plausible conformations as a representation of CDR-H3 rather than a single conformation providing only the sequence as input. To this end, the main objective of the ccg1 protocol is to provide a diverse, structurally sound, and energetically plausible collection of CDR-H3 conformations. On average, 55 alternate and diverse conformations were generated in approximately one CPU hour of computation on a common desktop computer. While this approach demonstrates that the primary objective was satisfied, the H3 modeling results in Table 2 reveal that it was not possible to improve the modeling predictions of CDR-H3 compared with the output conformations of ccg2 and ccg3 (which performed best). The ccg1 conformation diversification step does not fail in producing reasonably accurate conformations (average global RMSD of 1.54 ± 0.61 Å) but rather in selecting the best, or close to the best, solution within the generated conformation pool. The H3 scoring function, trained on loop-specific properties like psi/phi torsion values, FV interaction energies and surface area was implemented as a relative metric for ranking CDR-H3 loop conformations but failed in predicting CDR-H3 conformations accurately (3.31 ± 0.94 Å). By default, the ccg1 protocol uses energies as a ranking criterion for the models in addition to the loop score. Selecting the lowest energy model for predicting H3 would have improved the average RMSD value to 2.89 ± 0.57 Å (and thus within the assessment) but would have also failed to improve the prediction accuracy of CDR-H3 of the input models ccg2 or ccg3/autoFv. Figure 2(A) lists the AMBER12 energies of the target x-ray structures and the global RMSDs in CDR H3 of the top 3 ranking ccg1 models (Models 1, 2, and 3) followed by the lowest RMSD model. The suffix indicates the position of the low RMSD conformer within the energy-ranked conformation set. The comparison of AMBER12 all-atom energies shows that the x-ray structures would have been discarded in all cases if present in the pool of ccg1 conformations. The correlation analysis [global RMSD (y-Axis) vs. AMBER12 all-atom energies (x-Axis)] as depicted in the plots of Figure 2(A) does not support the notion that prediction accuracy is linked to low energies of the models further substantiating our problems in discriminating the best or close to the best possible solutions. Taken together, these results suggest that modeling of CDR-H3 still represents a serious challenge both for us as well as for the participants that use more computationally expensive methods for modeling H3.

Table II.

RMSD Values in Å (Local and Global) of H3 Between Three CCG Protocols (See Text) and the Stage 1 Target X-Ray Crystal Coordinates; “Lowest” Is the Smallest RMSD Value of Intermediate Models Generated in the Protocols; “ccg1–3” Are the RSMD Values of the Three Protocols; “Lowest E” Is the RMSD of the Intermediate Structure with the Lowest Forcefield Energy for ccg1

graphic file with name prot0082-1599-t2.jpg

Global RMSDs denote loop measurements after superposition of the framework first.

Figure 2.

Figure 2

Comparison of AMBER12 all-atom energies (kcal/mol) of the target x-ray crystal structures with the Stage-1 (A) and Stage-2 models (B). Values referring to the X-ray structures are shown with gray background, followed by the lowest three energy models (Model-1, Model-2, and Model-3) and the model with the lowest global RMSD (LowRMSD_x). The suffix indicates the rank in the energy-sorted conformation pool. The RMSD values (Å) are based on measurements in the backbone (N, Cα, C, and O) of CDR H3 and represent the global RMSD to the target H3. The Plots show the results of a correlation analysis (global RMSD against AMBER12 all-atom energies), “n” denotes the number of conformations left in the conformation pool after removal of models with bad energies or unresolvable clashes, “R” is the Pearson's correlation coefficient of global RMSD (y-axis) and Amber12 all-atom energies (x-axis).

In summary, the fully automated modeling protocols ccg3/autoFv and ccg2 show good prediction accuracy in the structural core and the canonical CDR loop regions. In AMA-II, the more seasoned and established ccg3/autoFv modeling method displayed a more robust and slightly improved performance compared with ccg2 (overall RMSD average of 1.05 vs. 1.11 Å for ccg2 and 1.14 Å for ccg1 Table 1) in contrast to an internal benchmark (data not shown). The variations we observe among these modeling procedures are more evident in the framework/dimer orientation category (see Table 1 FR section as well as Topalev et al. and the Supporting Information Section of Almagro et al. in this issue) and are signs of a not yet fully fine-tuned and matured ccg2 modeling protocol.

Comparison to the first antibody modeling assessment (AMA-I)

The second assessment (AMA-II) provides an opportunity to measure the improvements made since the first assessment (AMA-I). The comparison is best done with the autoFv/ccg3 protocol since this protocol was also used in the first modeling assessment. As depicted in Figure 3, the current version of the autoFv protocol suggests enhancements in the important framework (FR), canonical loops (Canonical) measurements evidenced by the average RMSD values of 1.34 vs. 1.05Å over the entire FV, 0.93 vs. 0.79 Å in the framework and 1.17 vs. 1.0 Å in the canonical loop categories. A similar degree of progress is seen with the ccg2 protocol. As discussed above, the differences observed in modeling the H3 loop are not rooted in a systematic effort to improve modeling in this category but rather a result of refinements made in other areas of the binding site that result in better predictions of CDR-H3 overall.

Figure 3.

Figure 3

Comparison of the RMSDs of the automated protocols in the first and second antibody modeling assessment in the FV region (All), the dimer framework (FR), Canonical Loops (Canonical), and CDR-H3 (H3).

AMA-II Stage 2

Stage 2 focused on modeling of CDR-H3 when given the coordinates of the remaining antibody dimer. Minor deviations in modeling the framework and canonical CDR loops will influence the architecture of the antigen binding surface and will in turn affect the modeling of CDR-H3. In this respect, Stage 2 can be viewed as a reduction of the modeling complexity by eliminating the uncertainty of modeling the remaining FV region; in other words, Stage 2 is a CDR-H3 modeling exercise under ideal conditions. Using the dedicated CDR-H3 loop database and techniques described in the previous section (still under development at the time of AMA-II), we generated five models, CCG1–CCG5, selected on the basis of three different scoring regimes. The CCG1 and CCG2 models were the two lowest forcefield energy structures; CCG3 and CCG4 models represent the top two ranking models based on specialized H3 loop scores; CCG5 models were selected manually (based on personal experience, atom clashes and protein geometry) from the pool of the top ten candidates of each group.

Table 3 presents the RMSD comparison (local and global) of each of the generated models with the experimental crystal structure. The number of CDR-H3 conformations generated was ∼800 conformations for CDR-H3 lengths of 8, ∼1,200 for lengths 10 and 11, and ∼300 conformations for the 14 residue loop of target MA2-6. Conformation coverage was good with 7 of 10 targets having a conformation generated at <1.0 Å local RMSD to the experimental structure, two between 1.0 and 1.25 Å and target MA2-6 at 1.6 Å. It appears the conformation space of target MA2-6 was not adequately sampled suggesting that the iteration limits of LowModeMD should increase as a function of loop length. The energy-based selections, CCG1 and CCG2, were better than the scoring function selections, CCG3 and CCG4; this is true on average and in the individual cases with the exception of target MA2-3. In three cases, the manual selection, CCG5, outperformed the CCG1 energy based selections; however, in two out of the three the RMSD to the crystal structure was >2.0 Å and these cases (MA2-6 and MA2-8) also had no generated conformations under 1.0 Å RMSD. This suggests that in the absence of a close-to-experiment conformation, aesthetic qualities of the loop conformation, rather than energy, may lead to closer predictions. The average CCG1 RMSD to experiment on all 10 targets was 1.5 local and 2.5 Å global. Leaving out the targets for which no conformation <1 Å RMSD (local) was generated (MA2-2, MA2-6, and MA2-8) the averages were 1.1 Å RMSD local and 1.7 Å global. This supports the hypothesis that if there is a close conformation to experiment in the pool then the energy based selection can do well.

Table III.

RMSD Values in Å (Local and Global) Between Three Distinct Selection Methods (See Text) Employed in Stage 2 and the Target X-Ray Crystal Coordinates; “H3” Gives the Length of the CDR-H3 Loop; “nConfs” Is the Total Number of Conformations Generated in the Respective Pool; “CCG1 and 2” Are the RMSDs of the Lowest AMBER12 All-Atom Energy Models, “CCG3 and 4” Are the RMSDs of the Models With the Highest Loop Scores and CCG5 Is Based on a Manual Selection, “Lowest” Is the Lowest RMSD Conformer in the Pool

graphic file with name prot0082-1599-t3.jpg

All RMSD measurements are based the backbone atoms N, Cα, C, and O.

Some discussion is warranted surrounding the issues related to energy based model selection. First, atomic forcefields are only approximations and, as such, they have parameterization errors; for example, the bonded parameters, van der Waals and electrostatics/solvation treatment. The lowest energy structures can be quite different from each other. For example, Figure 4 shows the three lowest energy conformers of CDR-H3 sampled near the x-ray structure of MA2-8; there is a large conformational spread but with a relatively small energy range of 2.6 kcal/mol (well within forcefield parameterization error). This presents a problem when ranking conformations and has been observed in small molecule docking calculations.25 Second, forcefields are quite sensitive to small perturbations of the coordinates; experimental coordinates (which serve as the RMSD reference) are typically high in energy when compared to refined modeled structures, which, in some sense, is a built-in bias towards higher RMSD values. Given the premise of Stage 2 modeling with >90% of the heavy atom coordinates known, the energy differences among the models are caused exclusively by the conformational variability in CDR H3. Comparing the top energy models with the native x-ray structures as shown in Figure 2(B) reveals a systematic failure in discriminating the low RMSD conformer from the conformation pool. The results of a correlation analysis of AMBER12 all-atom energy against global RMSD do not support the notion of a linear relationship between the two variables [Fig. 2(B)]. This suggests that other—maybe more CDR H3-specific—terms are required to increase the loop prediction accuracy.

Figure 4.

Figure 4

Conformational range in CDR-H3 within a small energy window of 2.6 kcal/mol. The energies of Models 1, 2, and 3 are in order from lowest to highest and the deviation of the models from the x-ray coordinates were 6.37, 3.63, and 2.56 Å global RMSD.

In retrospect, keeping all of the non-H3 heavy atoms fixed in our protocol may have contributed to the higher RMSD values; for example, the join residues in the FV (being fixed) may have contributed to increased strain energy that could have been relaxed if a tethering protocol was used. Such strain would confound the ranking and perhaps the Ramachandran component of the scoring function used in CCG3 and CCG4 (and implicitly in CCG5) may help in such cases if some means were developed to combine it with the energy value.

CONCLUSION

We have described a method for automatic modeling of the FV region of an immunoglobulin. It is based upon the use of a precompiled antibody x-ray structure database. This database is used as a source of framework and hypervariable region structural templates, which are grafted together. Multiple initial models from multiple templates are constructed. A final consensus model is built by selecting the templates used in one or more of the initial selected by considering forcefield binding energies between FV components along with cluster analyses of the CDRs of the initial models. In addition, we have described a method for modeling the CDR-H3 loop in the context of known coordinates for the remainder of an FV. The method is based on using a computationally enriched database of H3 conformations as a source for loop grafting and coordinate refinement.

The described methods were used in the Second Antibody Modeling Assessment (see Almagro, this issue) on commonly available desktop computer hardware; ∼30 min of CPU time were required to build each model without user intervention (except for one manually modeled rabbit FV). Stage 1 of AMA-II involved full FV modeling and our models resulted in an average of 1.05 Å RMSD to the (previously withheld) x-ray crystal coordinates. Our previously established protocol (used in the First Antibody Modeling Assessment) proved the best when compared with two experimental new protocols still under development. Stage 2 of AMA-II focused on CDR-H3 modeling with known remaining FV coordinates. Our specialized CDR-H3 modeling method (which was still under development at the time of the assessment) produced conformation predictions that were on average 1.5 Å local RMSD to the x-ray coordinates and 2.4 Å global RMSD. Leaving out a few problem structures resulted in a remaining average of 1.1 Å local RMSD local and 1.7 Å global.

The accuracy of CDR-H3 predictions (whether under Stages 1 or 2 conditions) is lower than that of the other CDR loops and the remainder of the FV region. Our methods are capable of generating diverse numbers of CDR-H3 conformations and in the majority of cases considered in AMA-II there was a generated conformation <1.0 Å RMSD from the x-structure. The challenge appears to be the accurate ranking of the conformations. We present some evidence that the forcefield energies are a viable method for ranking keeping in mind that parameterization and coordinate perturbations can have a significant impact on such ranking.

REFERENCES

  • 1.Maggon K. Monoclonal antibody “gold rush”. Curr Med Chem. 2007;14:1978–1987. doi: 10.2174/092986707781368504. [DOI] [PubMed] [Google Scholar]
  • 2.Chothia C, Lesk AM. Canonical structures for the hypervariable regions of immunoglobulins. J Mol Biol. 1987;196:901–917. doi: 10.1016/0022-2836(87)90412-8. [DOI] [PubMed] [Google Scholar]
  • 3.Tramontano A, Chothia C, Lesk AM. Structural determinants of the conformations of medium-sized loops in proteins. Proteins. 1989;6:382–394. doi: 10.1002/prot.340060405. [DOI] [PubMed] [Google Scholar]
  • 4.Martin AC, Thornton JM. Structural families in loops of homologous proteins: automatic classification, modelling and application to antibodies. J Mol Biol. 1996;263:800–815. doi: 10.1006/jmbi.1996.0617. [DOI] [PubMed] [Google Scholar]
  • 5.Almagro JC, Beavers MP, Hernandez-Guzman F, Shaulsky J, Butenhof K, Maier J, Kelly K, Labute P, Thorsteinson N, Teplyakov A, Luo J, Sweet R, Gilliand G. Antibody Modeling Assessment. Proteins. 2011;79:3050–3066. doi: 10.1002/prot.23130. [DOI] [PubMed] [Google Scholar]
  • 6.Molecular Operating Environment (MOE) software. Chemical Computing Group Inc. 2014. http://www.chemcomp.com (last accessed April )
  • 7.Bernstein FC, Koetzle TF, Williams GJB, Meyer EF, Jr, Brice MD, Rogers JR, Kennard O, Shimanouchi T, Tasumi M. The protein data bank: a computer-based archival file for macromolecular structures. J Mol Biol. 1977;112:535–542. doi: 10.1016/s0022-2836(77)80200-3. [DOI] [PubMed] [Google Scholar]
  • 8.Berman HM, Bhat TN, Bourne PE, Feng Z, Gilliland G, Weissig H, Westbrook J. The Protein Data Bank and the challenge of structural genomics. Nat Struct Biol. 2000;7:957–959. doi: 10.1038/80734. [DOI] [PubMed] [Google Scholar]
  • 9.Narayanan A, Sellers BD, Jacobson MP. Energy-based analysis and prediction of the orientation between light- and heavy-chain antibody variable domains. J Mol Biol. 2009;388:941–953. doi: 10.1016/j.jmb.2009.03.043. [DOI] [PubMed] [Google Scholar]
  • 10.Allcorn LC, Martin ACR. SACS – self-maintaining database of antibody crystal structure information. Bioinformatics. 2002;18:175–181. doi: 10.1093/bioinformatics/18.1.175. http://antibody.bath.ac.uk. [DOI] [PubMed] [Google Scholar]
  • 11.Giudicelli V, Chaume D, Bodmer J, Müller W, Busin C, Marsh S, Bontrop R, Marc L, Malik A, Lefranc MP. IMGT, the international ImMunoGeneTics database. Nucleic Acids Res. 1997;25:206–211. doi: 10.1093/nar/25.1.206. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Lefranc M-P, Giudicelli V, Ginestoux C, Jabado-Michaloud J, Folch G, Bellahcene F, Wu Y, Gemrot E, Brochet X, Lane J, Regnier L, Ehrenmann F, Lefranc G, Duroux P. IMGT, the international ImMunoGeneTics information system. Nucleic Acids Res. 2009;37(Database issue):D1006–D1012. doi: 10.1093/nar/gkn838. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Hubbard TJ, Murzin AG, Brenner SE, Chothia C. SCOP: a structural classification of proteins database. Nucleic Acids Res. 1997;25:236–239. doi: 10.1093/nar/25.1.236. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Andreeva A, Howorth D, Brenner SE, Hubbard TJP, Chothia C, Murzin AG. SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 2004;32(Database issue):D226–D229. doi: 10.1093/nar/gkh039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA. 1992;89:10915–10919. doi: 10.1073/pnas.89.22.10915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Sonnhammer ELL, Hollich V. Scoredist: a simple and robust protein sequence distance estimator. BMC Bioinformatics. 2005;6:108. doi: 10.1186/1471-2105-6-108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.North B, Lehmann A, Dunbrack RL., Jr A New Clustering of Antibody CDR Loop Conformations. J Mol Biol. 2011;406:228–256. doi: 10.1016/j.jmb.2010.10.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Chailyan A, Marcatili P, Tramontano A. The association of heavy and light chain variable domains in antibodies: implications for antigen specificity. FEBS. 2011;278:2848–2866. doi: 10.1111/j.1742-4658.2011.08207.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Teplyakov A, Obmolova G, Malia T, Gilliland G. Antigen recognition by antibody C836 through adjustment of VL/VH packing. Acta Cryst. 2011;F67:1165–1167. doi: 10.1107/S1744309111027746. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Abhinandan KR, Martin ACR. Analysis and improvements to Kabat and structurally correct numbering of antibody variable domains. Mol Immunol. 2008;45:3832–3839. doi: 10.1016/j.molimm.2008.05.022. [DOI] [PubMed] [Google Scholar]
  • 21.Labute P. LowModeMD – implicit low-mode velocity filtering applied to conformational search of macrocycles and protein loops. J Chem Inf Model. 2010;50:792–800. doi: 10.1021/ci900508k. [DOI] [PubMed] [Google Scholar]
  • 22.Case DA, et al. AMBER 12. San Francisco: University of California; 2012. [Google Scholar]
  • 23.Labute P. The generalized Born/volume integral implicit solvent model: estimation of the free energy of hydration using London dispersion instead of atomic surface area. J Comput Chem. 2008;29:1693–1698. doi: 10.1002/jcc.20933. [DOI] [PubMed] [Google Scholar]
  • 24.Labute P. Protonate3D: assignment of ionization states and hydrogen coordinates to macromolecular structures. Proteins. 2008;75:187–205. doi: 10.1002/prot.22234. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Corbeil C, Williams C, Labute P. Variability in docking success rates due to dataset preparation. JCAMD. 2012;26:775–786. doi: 10.1007/s10822-012-9570-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Proteins are provided here courtesy of Wiley

RESOURCES