Author manuscript; available in PMC: 2012 Aug 10.
Published in final edited form as: Structure. 2011 Aug 10;19(8):1108–1126. doi: 10.1016/j.str.2011.05.012

Status of GPCR modeling and docking as reflected by community-wide GPCR Dock 2010 assessment

Irina Kufareva 1, Manuel Rueda 1, Vsevolod Katritch 1,2; participants of GPCR Dock 2010, Raymond C Stevens 3,*, Ruben Abagyan 1,2,*
PMCID: PMC3154726  NIHMSID: NIHMS304227  PMID: 21827947

Summary

The community-wide GPCR Dock assessment is conducted to evaluate the status of molecular modeling and ligand docking for human G-protein coupled receptors. The present round of the assessment was based on the recent structures of dopamine D3 and CXCR4 chemokine receptors bound to small molecule antagonists and CXCR4 with a synthetic cyclopeptide. Thirty-five groups submitted their receptor-ligand complex structure predictions prior to the release of the crystallographic coordinates. With closely related experimentally determined homology templates, as for dopamine D3 receptor, and with incorporation of biochemical and QSAR data, modern computational techniques predicted complex details with accuracy approaching experimental. In contrast, CXCR4 complexes that had less characterized interactions and only distant homology to the known GPCR structures still remained very challenging. The assessment results provide guidance for modeling and crystallographic communities in method development and target selection for further expansion of the structural coverage of the GPCR universe.

Keywords: G-protein coupled receptor, chemokine receptor CXCR4, dopamine D3 receptor, homology modeling, docking, experimental restraints, structure prediction, atomic contacts, site-directed mutagenesis

Introduction

The G protein-coupled receptor (GPCR) superfamily has more than 800 members in the human genome (Fredriksson et al., 2003; Ono et al., 2005), detecting a variety of extracellular chemical, biological or physical signals which are critical for human biology and disease (Lagerstrom and Schioth, 2008). Understanding their three-dimensional (3D) structures will help in understanding their function and will enable the development of new therapeutic molecules. However, the dynamic nature of these membrane proteins makes them notoriously difficult crystallization targets (Cherezov et al., 2010). Until recently, only four vertebrate GPCRs yielded to crystallization efforts: bovine rhodopsin (bRho) in liganded (Palczewski et al., 2000) and ligand-free (opsin) (Park et al., 2008; Scheerer et al., 2008) forms, human β2 (Cherezov et al., 2007) and turkey β1 (Warne et al., 2008) adrenergic receptors, and human A2A adenosine receptor (Jaakola et al., 2008). Comparative analysis of these structures demonstrated that, despite the conserved seven transmembrane (7TM) topology, structural determinants of ligand interaction are strikingly diverse between distantly related GPCRs, even within class A, to which all the structures solved so far belong, and that factors contributing to reshaping of the ligand binding pockets include helical shifts, turns, tilts, and kinks, as well as conformations of highly variable extracellular loops. Due to such structural diversity, it is impossible to expand the atomic details of ligand binding elucidated by crystallography to cover all other members of the GPCR family.

Continuous improvement of molecular modeling and docking algorithms and methodology (Abagyan et al., 1997; Abagyan and Totrov, 1994; Abel et al., 2010; Barth et al., 2009; Bottegoni et al., 2008; Bottegoni et al., 2009; Brylinski and Skolnick, 2008; Case et al., 2005; Cavasotto et al., 2005; Cavasotto et al., 2008; Chen et al., 2003; Davis and Baker, 2009; Eswar et al., 2006; Katritch et al., 2010; Lang et al., 2009; Morris et al., 2009; Vaidehi et al., 2002; Verdonk et al., 2003; Yarov-Yarovoy et al., 2006; Zhou and Gilson, 2009) is complemented by great advances in computing resources and technologies. The recent decades were marked by an increase of approximately four orders of magnitude in individual computing power and storage, as well as the development of large supercomputers and specialized hardware (e.g. (Shaw et al., 2010)). The amount of protein structural data in the Protein Data Bank (PDB) has been growing exponentially providing invaluable information about molecular interactions and, with increasing structural coverage of the mammalian proteome, relevant templates for modeling by homology. Despite all these advances, modern theoretical methods still fail to produce models of experimental accuracy and/or “dockable” quality in many real-life cases, e.g. in the absence of close structural homology templates.

The community-wide GPCR Dock assessment has the goal of monitoring the progress of molecular modeling and ligand docking for GPCR targets. The first round of assessment was performed in 2008 when the structure of the A2A adenosine receptor (A2AAR) was solved; twenty-nine groups attempted to predict atomic details of its interaction with a small molecule antagonist. The most accurate models were built by homology with the β2 adrenergic receptor (β2AR) structure that shares ~35% sequence identity with A2AAR in the TM domain. These models were able to correctly predict about half of the intermolecular contacts (Michino et al., 2009). The present round of the assessment, GPCR Dock 2010, is particularly exciting because the modeled complexes represent three distinct classes and three levels of difficulty: (i) dopamine D3 receptor in complex with eticlopride: a small molecule in a small molecule pocket with two close homology modeling templates; (ii) chemokine receptor CXCR4 bound to isothiourea IT1t: a small molecule in a large peptide binding pocket with more distant templates; and (iii) CXCR4/CVX15: the first GPCR complex with a peptide analogue. Prediction of the ligand binding pose and its interactions within the receptor binding pocket constituted the main focus of the assessment. As a secondary target, the prediction of the overall structure of the TM bundle was also evaluated.

Comprehensive analysis of the 275 GPCR complex models submitted for the assessment helped elucidate the current trends in GPCR modeling and docking. In particular, we found that reliable homology modeling requires 35-40% sequence identity between target and template, and that in such close homology cases, the combination of modern modeling techniques with biochemical and QSAR studies allows complex details prediction with accuracy approaching experimental. The results of this experiment outline the boundaries for computational expansion of the sparse GPCR structural information onto the new GPCR family members and their interactions with small molecules or peptides. They also define the “white spots” on the GPCR map that are in biggest need of being addressed by crystallography.

Results

GPCR Dock 2010: description and submission statistics

The GPCR modeling and docking assessment 2010 was performed for three separate ligand-receptor structures: human dopamine D3 receptor bound to eticlopride (D3/eticlopride), human chemokine receptor CXCR4 bound to an isothiourea derivative IT1t (CXCR4/IT1t) (Thoma et al., 2008), and CXCR4 bound to the CVX15 peptide (Arg-Arg-Nal-Cys-Tyr-Gln-Lys-dPro-Pro-Tyr-Arg-Cit-Cys-Arg-Gly-dPro) (CXCR4/CVX15). For these three targets, 117, 103, and 55 unique interpretable models were submitted by 32, 25, and 19 groups, respectively. The list of participating groups, names, and affiliations is given in Box 1.

The models were assessed by several independent criteria evaluating: (i) the TM bundle structure and (ii) structure of the second extracellular loop (ECL2), (iii) definition of the binding pocket, (iv) geometry of the binding site residues, (v) ligand position, and (vi) the atomic contacts between the ligand and the receptor. The last two criteria, ligand position and ligand/pocket atomic contacts, constituted the primary focus of the present assessment; therefore, they were converted into a single Z-score and used for final model ranking.

One characteristic of GPCR Dock 2010 was the availability of multiple experimental 3D structures for two out of three target categories. Wu et al. have obtained as many as four structures of the CXCR4/IT1t complex, with a total of eight chains related by non-crystallographic symmetry (Wu et al., 2010), all slightly different in the position and atomic contacts of the compound. Similarly, the single structure of the D3/eticlopride complex (Chien et al., 2010) contains two chains. The models submitted for the assessment were compared to all relevant target structures, and for each model the structure resulting in the most favorable values of the assessment criteria (i.e. lowest TM bundle RMSD, lowest ECL2 RMSD, or highest Z-score) was chosen.
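The best-of-all-references selection described above can be sketched as follows. This is only an illustration, not the assessors' code: the function names, the toy coordinates, and the assumption that model and reference atoms are already superimposed and listed in the same order are all hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered coordinate lists (no fitting is done)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def best_reference(model, references, metric=rmsd, lower_is_better=True):
    """Score a model against every reference chain and keep the most favorable."""
    scores = {name: metric(model, ref) for name, ref in references.items()}
    pick = min if lower_is_better else max
    best = pick(scores, key=scores.get)
    return best, scores[best]

# Toy data (hypothetical): one model compared with two NCS-related chains.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
references = {
    "chain_A": [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    "chain_B": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
}
name, value = best_reference(model, references)
print(name, value)
```

For criteria where larger is better, such as a Z-score, the same helper can be called with `lower_is_better=False`.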

TM bundle and ECL2 prediction

Given the conserved 7TM topology and the availability of several GPCR structures that can be used as templates for homology modeling, structure prediction for TM helices is a less challenging task than for loops or ligand/receptor complexes, and is usually performed with reasonable accuracy. Figure 1A illustrates the levels of TM region sequence identity of the two receptors in this assessment to the available structural templates, and the corresponding TM backbone RMSD values. Dopamine D3 receptor belongs to the aminergic family of GPCRs; two other aminergic receptors, turkey β1AR or human β2AR, could be used as reliable homology modeling templates, with the level of TM sequence identity of D3 to β1AR as high as 41% and the backbone atom RMSD as low as 1.41 Å. In contrast, for CXCR4 the highest sequence identity template (β1AR) is only 25.5% identical with 2.84 Å TM backbone RMSD, and even the lowest TM backbone RMSD (observed between CXCR4 and bRho) is still 2.3 Å. Consequently, CXCR4 appears to be a more difficult modeling target than D3.

Figure 1. Increasing modeling difficulty of targets in GPCR Dock 2010 correlated with the decrease in prediction accuracy.

(A) Sequence and structural similarity of the assessment targets to the available homology modeling templates: crystal structures of bovine rhodopsin (Palczewski et al., 2000), opsin (Park et al., 2008; Scheerer et al., 2008), human β2AR (Cherezov et al., 2007), turkey β1AR (Warne et al., 2008), and human A2AAR (Jaakola et al., 2008). For CXCR4, the highest sequence similarity template (β1AR) is only ~25% identical and structurally quite dissimilar from the target. For D3, β1AR and β2AR represent relatively high sequence and structural similarity templates. The values were obtained by comparison with PDB entries 1u19 (bRho), 2vt4 (β1AR), 2rh1 (β2AR), 3eml (A2AAR) and 3cap (opsin).

(B) Scatter plots of the TM domain RMSD and ECL2 RMSD values of the models in comparison with the target structures for the three assessment targets. Source data for these plots can be found in Supp. Table S1.

Target difficulty was evaluated using standard CASP measures (Cozzetto et al., 2009) and the LGA (Local-Global Alignment) package (Zemla, 2003) kindly provided by Dr. Adam Zemla. At least 94% of target-template Cα atom pairs fell within 5 Å following the LGA superposition in the case of D3 (data for best templates, i.e. β1 and β2 adrenergic receptor structures), with the sequence identity of the superimposed regions exceeding 35% and LGA score of 86.8%. The corresponding numbers for CXCR4 were 84% Cα atom pairs, 20% sequence identity, and the LGA score of 61.3%. This analysis places both targets in the easy region on the CASP difficulty scale with CXCR4 being more challenging.

The estimated GPCR Dock 2010 target difficulty correlates with the assessment results: the lowest TM backbone RMSD achieved by several groups is 1.26 Å in the case of the D3/eticlopride complex (UMich-Zhang #3), while for the CXCR4/IT1t and CXCR4/CVX15 complexes, the best RMSD was 2.05 Å (UMich-Pogozheva #1, UMich-Zhang #2), and 2.53 Å (UMich-Pogozheva #1), respectively. The corresponding median values are 1.70 Å, 2.75 Å, and 3.28 Å (Figure 1B, source data in Supp. Table S1).

One challenge in modeling the TM domain of CXCR4 was represented by the fact that this receptor, as well as many other chemokine receptors, has a conserved proline-induced kink in helix II, the so-called TXP motif (Devillé et al., 2009; Govaerts et al., 2001; Kellenberger et al., 2007; Rey et al., 2010). A proline-related distortion observed in the corresponding regions of the available homology modeling templates, β-adrenergic and adenosine receptors (but not in bovine rhodopsin), represents a bulge rather than a typical kink (Rey et al., 2010). Sequence alignment in the region is not straightforward and requires introduction of a one-residue gap (Figure 2A), which allows for ~100° rotation of the top portion of helix II and places W94 and D97 (Figure 2B), the critical residues known to interact with CXCR4 antagonists (Wong et al., 2008), inside the binding pocket. We found that in almost 50% of the CXCR4 models submitted for the assessment, the rotation of the top part of helix II is within 20 degrees from the target structure (Figure 2C, source data in Supp. Table S1). Therefore, the modeling community appears well aware of the possible helical shifts between distantly related GPCRs and is capable of modeling such a shift with a relatively high degree of accuracy.

Figure 2. Modeling the proline-induced kink in helix II of CXCR4.

(A) Helix II sequence alignment between CXCR4 and the four homology modeling templates available in PDB at the time of the assessment. The two adrenergic receptors and the adenosine receptor, but not bRho, have a proline-induced bulge in helix II, whereas CXCR4 has a proline kink. Consequently, a one-residue alignment gap is necessary to correctly model this kink and orient W94 and D97 towards the binding pocket.

(B) Comparison of the top part of helix II (viewed along the helix from the extracellular side) between the CXCR4/IT1t target structure and a representative model.

(C) Scatter plot of the rotation angles at the top of helix II with respect to the target structure for all models submitted to CXCR4/IT1t and CXCR4/CVX15 assessments. Source data is given in Supp. Table S1.

In addition to the TM helical bundle, we assessed the correctness of prediction of ECL2. Though none of the submitted models had its ECL2 within 2 Å RMSD from the crystal structure, D3 models were closer to this goal with the best ECL2 backbone RMSD of 2.69 Å (WUStL #2-3, Monash-Hall #4, RHUL all). The closest ECL2 predictions for CXCR4/IT1t and CXCR4/CVX15 were at 4.32 Å (Baylor-Barth #4) and 6.61 Å (PharmaDesign #1, UMich-Zhang #2) from their respective target structures. The median ECL2 RMSD values were 4.11 Å, 9.19 Å, and 9.70 Å for D3/eticlopride, CXCR4/IT1t, and CXCR4/CVX15, respectively. Eight groups predicted the β-hairpin fold of the second extracellular loop in CXCR4 (45 models of both CXCR4 complexes), but in most of these models, the loop was placed deep in the pocket, similarly to bRho ECL2, and as in bRho, the hydrogen bond pattern was shifted compared to the CXCR4 structures. Quite surprisingly, the precise CXCR4 ECL2 β-strand hydrogen bond connectivity was only observed in the least accurate set of models (LenServer #1-5, TM RMSD of > 11 Å, ECL2 RMSD of > 24 Å).

The protein prediction Z-score calculated by averaging TM and ECL2 prediction Z-scores is given in Supp. Table S1. The positive side of the Z-score distribution was relatively compact with no models scoring above two SD from the average in any of the three assessments: the highest protein prediction Z-scores were 1.11 for D3/eticlopride (WUStL #3), 1.72 for CXCR4/IT1t (Baylor-Barth #4), and 1.59 for CXCR4/CVX15 (UMich-Zhang #2). Models generated ab initio were noticeably less accurate than homology models. For most CXCR4 models, authors had to intervene in the modeling procedure by manually adjusting either the target-template sequence alignment, or the helix I and II conformation of the obtained 3D model.

To evaluate the results in context of CASP achievements, we assessed the models by the main CASP measure, GDT (Global Distance Test) total score (TS). The average/best GDT-TS values were 73%/80% for D3 and 55%/60% for CXCR4 which is within the range of prediction accuracy observed in CASP for targets of similar difficulty (Kryshtafovych et al., 2009; Kryshtafovych et al., 2005). GDT-TS improvement over the naïve model obtained by copying the coordinates of a single best template was observed in more than 30 D3 models (maximal GDT-TS improvement of 2.7%), but only in 2-4 CXCR4 models (number of models varies depending on the target structure, maximal GDT-TS improvement of 3.4%). Therefore, by the CASP measures, the progress in protein modeling achieved in GPCR Dock 2010 is rather modest. However, it is important to realize that CASP measures primarily focus on the backbone prediction, are equally influenced by the binding pocket residues and distant regions irrelevant for interactions with the ligand, and evaluate model accuracy at a relatively low resolution; for example, GDT-TS calculates the fraction of Cα atoms that fit under distance cutoffs of 2 Å, 4 Å, etc. It is well known that energy-based ligand docking requires higher accuracy: even 1 Å deviation of a single side-chain in the binding pocket can lead to incorrect ligand positioning and scoring (Erickson et al., 2003). Therefore, in this assessment we directly evaluated the ligand binding prediction accuracy as described below.
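To make the resolution argument concrete, a simplified GDT-TS can be sketched as below. The conventional definition averages the fractions of Cα atoms within 1, 2, 4, and 8 Å of their target positions. The actual LGA procedure searches for the best superposition at each cutoff; this sketch assumes (as a simplification) that per-residue Cα distances from a single fixed superposition are already available, and the function name is hypothetical.

```python
def gdt_ts(ca_distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS (in percent) from precomputed C-alpha distances in Angstrom.

    Assumption: one fixed superposition; real GDT maximizes each fraction
    over many superpositions, so this value is a lower bound.
    """
    if not ca_distances:
        raise ValueError("at least one distance is required")
    n = len(ca_distances)
    fractions = [sum(d <= c for d in ca_distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy example: four residues deviating by 0.5, 1.5, 3.0 and 9.0 Angstrom.
score = gdt_ts([0.5, 1.5, 3.0, 9.0])
print(score)
```

Because the tightest cutoff is 1 Å and the score is averaged over the whole chain, a model can score well here while a single pocket side-chain is still misplaced enough to derail docking.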

Pocket definition

Selection of binding site residues represents a critical step in prediction of protein/ligand complex structures. Unguided, “blind” ligand docking to a distant homology model of the target often results in unrealistic ligand positions outside of the TM bundle, in the middle of the tentative lipid bilayer or even on the intracellular side of the protein. Luckily, residues forming the orthosteric binding pocket in class A GPCRs can, to some extent, be inferred by distant homology with the available structural templates. Site-directed mutagenesis studies on both CXCR4 and D3 provide additional guidance in binding pocket residue selection. In some cases, energy-based sampling and refinement of the complex models can further improve the initial prediction.

The best models in terms of pocket definition predicted as much as 81% of the pocket surface area for D3/eticlopride (Caltech #2 in comparison with chain B in PDB 3pbl), 49% for CXCR4/IT1t (PharmaDesign #4 in comparison with chain A in PDB 3odu), and 49% for CXCR4/CVX15 (PharmaDesign #1). The median values were 52%, 30%, and 12%, respectively (Figure 3, source data in Supp. Table S2). The lower prediction accuracy for CXCR4 correlates not only with more distant homology, but also with the larger size and less defined composition of its binding pocket. The pocket is lined by multiple polar residues, many of which were shown to play critical roles in ligand binding and/or signaling (however, no direct mutagenesis data was available for IT1t). In contrast to CXCR4, evolution of D3 as a receptor for endogenous dopamine resulted in a small well-defined pocket with a single acidic residue, D110 (position 3.32 in Ballesteros-Weinstein numbering), known to make the critical interaction with positively charged amines in the ligands.

Figure 3. Prediction of binding pocket area by the models of the three assessment targets.

Source data for these plots can be found in Supp. Table S2.

Ligand RMSD and ligand-pocket contacts

Correct prediction of the ligand binding pose and its atomic contacts with the pocket side-chains was the primary goal of the GPCR Dock 2010 participants. The eticlopride molecule is engaged in 62 to 65 atomic contacts with the 15 residues in the two D3 pocket structures (here a contact is defined as a pair of atoms at a distance of ≤ 4 Å). The mean and SD of ligand contact strength with each of the pocket residues are shown in Figures 4A and 4D. Similarly, IT1t makes from 46 to 64 contacts with the neighboring residues in different CXCR4/IT1t complex structures; the list of interacting residues includes W94, D97, A98, W102, V112, H113, Y116, R183, I185, C186, D187, R188, and E288; in some chains, the ligand also has non-zero contact strength with E32, K38, and Y255 (Figures 4B, 4E). Finally, the CVX15 peptide makes 181 atomic contacts (≤ 4 Å) with 26 CXCR4 residues (Figures 4C, 4F). The submitted models were assessed for their ability to reproduce some or all of these contacts, while maintaining a reasonably accurate placement of the ligand in the binding site.
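The 4 Å contact definition can be illustrated with a short sketch. It counts binary atom-pair contacts only; the continuous "contact strength" used in the assessment is not defined in this section and is not reproduced here. The function names, atom labels, and coordinates are hypothetical toy values.

```python
import math

def atomic_contacts(ligand_atoms, pocket_atoms, cutoff=4.0):
    """Return the set of (ligand atom, residue, pocket atom) pairs within cutoff (Angstrom)."""
    found = set()
    for lig_name, lig_xyz in ligand_atoms:
        for residue, atom_name, xyz in pocket_atoms:
            if math.dist(lig_xyz, xyz) <= cutoff:
                found.add((lig_name, residue, atom_name))
    return found

def fraction_reproduced(model_contacts, reference_contacts):
    """Fraction of reference contacts that are also present in the model."""
    if not reference_contacts:
        raise ValueError("reference has no contacts")
    return len(model_contacts & reference_contacts) / len(reference_contacts)

# Toy pocket with two atoms; residue labels are for illustration only.
pocket = [("D110", "OD1", (3.0, 0.0, 0.0)), ("res2", "CZ", (0.0, 3.5, 0.0))]
ligand_reference = [("N1", (0.0, 0.0, 0.0)), ("C2", (1.5, 0.0, 0.0))]
ligand_model = [("N1", (0.0, -2.0, 0.0)), ("C2", (1.5, -2.0, 0.0))]  # pose shifted by 2 Angstrom

reference = atomic_contacts(ligand_reference, pocket)
model = atomic_contacts(ligand_model, pocket)
print(len(reference), len(model), fraction_reproduced(model, reference))
```

In this toy case the shifted pose keeps both contacts with the acidic residue but loses the two contacts with the second residue, which shows how a modest displacement can halve the reproduced contacts.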

Figure 4. Ligand-pocket atomic contacts for the three targets of GPCR Dock 2010.

(A-C) Per-residue contact strengths (average and standard deviation for the multiple target structures where available).

(D-F) Target structures: ligands are shown as yellow sticks; interacting pocket residues are in black.

To assist evaluation of the accuracy of a model with a given number of correct contacts and/or ligand RMSD, it is useful to know the experimentally observed distribution of the corresponding parameters, for example, between multiple structures of a single complex that are related by non-crystallographic symmetry (NCS). The shaded plot background in Figure 5A represents the result of comparison of NCS-related molecules in a large subset of PDB structures. It shows that due to natural protein flexibility and experimental resolution limits, multiple structures of the same complex are expected to differ on average by 0.2-0.6 Å in ligand RMSD and share 80-90% of ligand/pocket atomic contacts, with the marginal cases reaching 1.5 Å RMSD and 65% of shared contacts. On the same plot, we provide the scatter of cross-comparison of the target structures: the two chains in the structure of D3/eticlopride differ by 0.37 Å in ligand RMSD and share 80% of contacts, while between some CXCR4/IT1t complexes, ligand RMSD reaches 0.8 Å and common contacts are only 53% (this low number is in part due to disordered parts of the pocket in PDB entries 3oe6, 3oe8, and 3oe9).

Figure 5. Prediction of ligand pose and its atomic contacts with the pocket residues for the three targets in GPCR Dock 2010 assessment.

(A) Scatter plots of ligand RMSD and correctly reproduced contact strength ratio values for the models of the three assessment targets (source data is provided in Supp. Table S2). The shaded plot background represents the distribution of the corresponding parameters for pairs of NCS-related molecules in a large subset of PDB structures.

(B-C) Superimposition of the top models that correctly reproduced the critical hydrogen bonding interaction of the ligand with D110 in D3/eticlopride complex (B) and with E288 in CXCR4/IT1t complex (C). Capturing these interactions does not correlate with the correctness of the ligand pose in case of CXCR4, and only weakly correlates in case of D3.

As Figure 5A illustrates, ligand RMSD weakly correlates with the number of correct atomic contacts for a small fraction of high accuracy models (source data in Supp. Table S2). For the majority of the set, the two measures represent complementary criteria of the model accuracy. For D3, as many as 23 submitted models correctly predicted the ligand position with RMSD <2.5 Å; however, these models demonstrate a wide range of atomic contact predictions. The best model, PompeuFabra #3, has the strikingly low RMSD of 0.96 Å from the ligand in chain B of the D3 crystal structure, and correctly reproduces as much as 58% of the contacts, thus scoring first by both criteria and approaching the level of similarity to the target structures that can be expected between NCS-related molecules in a single X-ray structure. The lowest ligand RMSD achieved for the CXCR4/IT1t complex is 2.47 Å; but the corresponding model (COH-Vaidehi #1) reproduces only 18% of contacts. The small number of correct contacts in this model is explained by the large deviations in the pocket side-chains: 6.01 Å for all 13 pocket residues, and 3.2 Å for the 7 pocket residues in the TM region. The model that ranks first in terms of ligand-pocket contacts (VU-MedChem #5, 36% of correctly predicted contacts), demonstrates ligand RMSD of 4.88 Å. Pocket residue RMSD for this model is slightly lower, 4.94 Å for the entire pocket, and 3.04 Å for its TM part. Prediction of CXCR4/CVX15 complex apparently represented the biggest challenge as none of the models is closer than 8 Å RMSD to the target structure, with the highest ratio of correctly predicted contacts being as low as 7%. Median ligand RMSD values are 3.85 Å for D3/eticlopride, 8.00 Å for CXCR4/IT1t, and 15.03 Å for CXCR4/CVX15. 
The challenge of peptide docking is partially associated with the larger size and greater number of rotatable bonds which not only increases the computational complexity of the problem (Erickson et al., 2003), but also makes it less amenable for site-directed biochemical characterization: unlike for eticlopride, or even for IT1t, no definite spatial restraints could be identified for CVX15.

Figure 6 shows the distribution of pocket residue RMSD values for all competing models (source data in Supp. Table S2). We found no correlation between the pocket prediction accuracy and the ligand docking accuracy (data not shown). In some cases, the assessment participants were able to model the pocket relatively closely (e.g. ~1.9-2 Å from the crystal structure in the case of CXCR4/IT1t models GaTech #1 and Soochow #1-3), but failed to dock the ligand with sufficient accuracy. Conversely, biochemistry-driven model generation and selection helped generate approximately correct ligand poses even in some partially incorrect pockets.

Figure 6. Scatter plots of the model binding pocket residue RMSD from the target structures for the three assessment targets.

Source data for these plots can be found in Supp. Table S2.

Despite the fact that only one model of the CXCR4/IT1t complex predicted more than 20% of the native contacts, 22 of the 103 models captured, at least partially, the critical contacts that the ligand makes with the side-chains of E288 or D97. These predictions, however, did not correlate with the correctness of the ligand position or with the total number of predicted contacts, with the exception of the two models described above (Figure 5C). This was also the case for the predictions of D3/eticlopride complex, though variability of the ligand poses capturing the critical charged amine interaction with D110 was not as striking (Figure 5B). This can be explained by the fact that in most cases (see Supp. Info), the key contacts were deduced from mutagenesis or other experimental studies and imposed as restraints or selection criteria in the modeling procedures, rather than reproduced as a result of ab initio ligand docking.

Model ranking by groups

The participants of GPCR Dock 2010 were allowed to submit at most five models for each of the targets, and were asked to rank their models according to their confidence in the correctness of the prediction. We could therefore evaluate the reliability of complex scoring functions used by the assessment participants and their ability to rank the correct models first.

While the distribution of Z-scores for models ranked first does appear somewhat skewed towards higher values for the D3/eticlopride and CXCR4/IT1t models (Figure 7, source data in Supp. Table S2), in most cases the best models were not ranked first by their authors. Groups from COH and CDD-CMBI are notable exceptions: the most accurate COH-Vaidehi models in both CXCR4/IT1t and D3/eticlopride assessments were ranked first; their second, third, and fifth models of the CXCR4/IT1t complex also scored high, and so did their third model in the D3/eticlopride assessment. Accuracy of all five CDD-CMBI D3/eticlopride models was above one standard deviation (SD) from the mean, and they ranked their best model first. Finally, the Helsinki-Xhaard group submitted a single model to the D3/eticlopride competition that appeared to be among the most accurate models in the assessment. From the methods descriptions, however, it is clear that Helsinki-Xhaard did not use any formal scoring functions to rank their potential solutions; selection and ranking of the ligand binding poses were performed manually, taking into account the existing mutagenesis data. In the case of the COH-Vaidehi group, the ligand poses were ranked by their all-atom force-field binding energies (see Supp. Methods); the five solutions submitted for the competition were selected from the ten top-scoring poses manually by agreement with mutagenesis and biochemistry data (Rosenkilde et al., 2007; Wong et al., 2008). The CDD-CMBI group ranked their solutions with a consensus scoring function incorporating FlexX and PLP docking scores, geometrical quality indicators, as well as molecular dynamics force field interaction energies (Nabuurs et al., 2007).

Figure 7. Distribution of Z-scores for models that were ranked 1st, 2nd, etc. by their authors.

Though on average, models ranked first by the authors tend to be marginally more accurate, the best models in D3/eticlopride, CXCR4/IT1t, and CXCR4/CVX15 assessments were assigned ranks 3, 5, and 5, by their authors, respectively. Source data can be found in Supp. Table S2.
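The self-ranking analysis above amounts to asking, for each group, which of its own ranks received the highest Z-score. A minimal sketch follows; the group names and Z-scores are invented for illustration and are not taken from Supp. Table S2.

```python
def rank_of_best_model(submissions):
    """Map each group to the author-assigned rank (1-based) of its highest-scoring model.

    submissions: {group: [Z-score of the model ranked 1st, 2nd, ...]}
    """
    result = {}
    for group, z_by_rank in submissions.items():
        if not z_by_rank:
            continue  # group submitted nothing for this target
        best_index = max(range(len(z_by_rank)), key=z_by_rank.__getitem__)
        result[group] = best_index + 1
    return result

# Hypothetical data: group_A ranked its best model first, group_B ranked it third.
toy = {
    "group_A": [2.0, 1.1, 0.3],
    "group_B": [0.2, 0.9, 1.5, -0.1, 0.4],
}
ranks = rank_of_best_model(toy)
fraction_first = sum(r == 1 for r in ranks.values()) / len(ranks)
print(ranks, fraction_first)
```

A scoring function that ranks reliably would push `fraction_first` towards 1; the assessment found that for most groups the best model was not the one ranked first.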

Analysis of best models

Participants of GPCR Dock 2010 employed a variety of methods for model generation. While most models were built by homology, several groups used ab initio approaches for modeling of the TM domain and/or loops. Prior to ligand docking, the models were optimized by molecular dynamics or Monte Carlo sampling. Selection of a few representative models from generated ensembles was performed with either force-field interaction energies or knowledge-based potentials. The ab initio models generally tend to be less accurate than the homology models (Supp. Table S1); however, ligand pose prediction accuracy does not significantly correlate with the protein prediction accuracy. In most cases human intervention was used in the modeling process, at the stages of building the sequence alignment, protein modeling, ligand docking, or answer selection. Consequently, expert knowledge of the subject area, chemical intuition, and a degree of ingenuity seem to be at least as important in successful GPCR/ligand complex modeling as advanced modeling algorithms and tools. Among the automated procedures, selecting the model by its ability to correctly dock and score the active ligand(s), a.k.a. ligand-guided modeling, appeared to be a common and successful approach. Another fruitful ligand-guided technique employed by several groups was based on pharmacophore representation of multiple existing ligands.

Cumulative Z-scores obtained from ligand RMSD and atomic contacts measurements were used for overall model ranking and identification of most accurate models. Models with Z-scores above 1 are listed in Table 1; the complete model list can be found in Supplementary Materials. The detailed characterization of the top models from each assessment is presented below.
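The exact formula for the cumulative Z-score is not given in this section. As an illustration only, the sketch below assumes that it is the mean of two per-criterion Z-scores, with the sign of the ligand RMSD term flipped so that higher is always better, and that the population standard deviation is used. All names and numbers are hypothetical.

```python
import statistics

def z_scores(values, higher_is_better=True):
    """Standardize values; flip the sign when lower raw values are better."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD over all models
    if sd == 0:
        raise ValueError("all values are identical; Z-score is undefined")
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (v - mean) / sd for v in values]

def cumulative_z(ligand_rmsds, contact_fractions):
    """Assumed combination: average of the RMSD-based and contact-based Z-scores."""
    z_rmsd = z_scores(ligand_rmsds, higher_is_better=False)
    z_contacts = z_scores(contact_fractions, higher_is_better=True)
    return [(a + b) / 2 for a, b in zip(z_rmsd, z_contacts)]

# Three toy models: ligand RMSD in Angstrom and fraction of correct contacts.
scores = cumulative_z([1.0, 3.0, 5.0], [0.6, 0.3, 0.0])
print(scores)
```

Under this assumed scheme a model is rewarded only when it does well on both criteria, which matters here because ligand RMSD and contact recovery are largely complementary measures for most models.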

Table 1. GPCR Dock 2010 models that score one or more SD above the average.

The results of analysis of these and all other models can be found in Supp. Tables S1 and S2.

Group | Model | TM RMSD, Å (# residues) | Fraction TM superimposed (RMSD, Å) | ECL2 RMSD, Å | Pocket RMSD, Å (# residues) | TM pocket RMSD, Å (# residues) | W94, D97 rotation (degrees) | Fraction of pocket predicted | Ligand RMSD (Å) | Atomic contacts (# residues) / reference | Correct contact strength (% reference) | Critical contacts | Z-score
D3/eticlopride
PompeuFabra | 3 | 1.38(205) | 63%(0.52) | 2.87 | 1.50(15) | 1.16(14) | n/a | 63% | 0.96 | 36(9)/65(15) | 39.95(58%) | D110 | 2.39
CDD-CMBI | 1 | 1.95(205) | 66%(0.66) | 4.38 | 2.25(15) | 1.67(14) | n/a | 50% | 2.13 | 36(12)/65(15) | 38.75(57%) | D110 | 2.05
COH-Vaidehi | 1 | 1.58(205) | 69%(0.79) | 4.11 | 2.04(15) | 1.36(14) | n/a | 59% | 1.22 | 31(10)/65(15) | 34.78(51%) | D110 | 2.00
UCSF-Shoichet-1 | 4 | 1.45(205) | 65%(0.60) | 3.53 | 1.53(15) | 1.41(14) | n/a | 74% | 1.23 | 29(10)/65(15) | 32.46(47%) | D110 | 1.85
Schrödinger | 5 | 1.41(205) | 60%(0.53) | 12.57 | 1.57(15) | 1.48(14) | n/a | 38% | 1.77 | 23(11)/65(15) | 30.94(45%) | D110¹ | 1.63
CDD-CMBI | 2 | 1.95(205) | 67%(0.66) | 4.38 | 2.24(15) | 1.67(14) | n/a | 46% | 2.27 | 26(11)/65(15) | 30.5(44%) | D110¹ | 1.49
CDD-CMBI | 3 | 1.95(205) | 66%(0.65) | 4.35 | 2.17(15) | 1.59(14) | n/a | 43% | 2.38 | 30(11)/65(15) | 30.1(44%) | D110 | 1.44
Warsaw | 4 | 2.13(205) | 62%(0.64) | 4.71 | 1.68(15) | 1.64(14) | n/a | 52% | 2.34 | 27(9)/65(15) | 26.47(39%) |  | 1.22
CDD-CMBI | 4 | 1.94(205) | 67%(0.66) | 4.17 | 1.99(15) | 1.62(14) | n/a | 35% | 2.13 | 19(8)/65(15) | 24.84(36%) |  | 1.16
CDD-CMBI | 5 | 1.94(205) | 67%(0.67) | 4.00 | 1.87(15) | 1.67(14) | n/a | 41% | 2.44 | 21(8)/65(15) | 25.57(37%) | D110¹ | 1.14
Helsinki-Xhaard | 1 | 1.55(205) | 67%(0.67) | 3.74 | 1.43(15) | 1.29(14) | n/a | 47% | 3.42 | 21(7)/65(15) | 28.95(42%) | D110¹ | 1.13
COH-Vaidehi | 2 | 1.57(205) | 70%(0.80) | 4.01 | 2.16(15) | 1.50(14) | n/a | 57% | 2.96 | 25(7)/65(15) | 26.75(39%) | D110 | 1.10
Monash-Sexton-2 | 3 | 1.61(205) | 63%(0.62) | 4.81 | 2.47(15) | 2.30(14) | n/a | 52% | 1.94 | 17(6)/65(15) | 22.51(33%) |  | 1.05
Schrödinger | 1 | 1.43(205) | 63%(0.55) | n/a² | 1.50(15) | 1.42(14) | n/a | 61% | 2.24 | 19(10)/65(15) | 22.81(33%) | D110 | 1.01
CXCR4/IT1t
VU-MedChem | 5 | 2.21(204) | 64%(1.14) | 7.42 | 4.94(13) | 3.04(7) | 22.9, 43.9 | 47% | 4.88 | 19(5)/64(13) | 21.16(36%) | D97, E288 | 3.49
COH-Vaidehi | 1 | 4.16(204) | 44%(1.09) | 9.16 | 6.01(13) | 3.20(7) | 0., −1.4 | 47% | 2.47 | 9(2)/64(13) | 10.56(18%) | D97 | 2.19
COH-Vaidehi | 2 | 4.15(204) | 44%(1.08) | 9.19 | 5.87(13) | 3.07(7) | −0.2, 2. | 43% | 2.88 | 8(4)/64(13) | 9.93(17%) | D97 | 2.02
COH-Vaidehi | 3 | 5.28(204) | 18%(1.56) | 10.42 | 7.58(13) | 6.00(7) | 92., 0.1 | 36% | 6.86 | 10(2)/64(13) | 10.28(17%) | D97 | 1.47
UCSF-Shoichet-2 | 5 | 2.79(204) | 67%(1.15) | n/a² | 2.15(8) | 2.18(7) | 31.5, −0.4 | 40% | 6.14 | 7(4)/64(13) | 9.22(15%) | E288 | 1.41
COH-Vaidehi | 5 | 4.16(204) | 44%(1.09) | 9.16 | 5.98(13) | 3.21(7) | 0., −1.4 | 38% | 5.17 | 7(4)/64(13) | 7.7(13%) | D97 | 1.32
GaTech | 3 | 2.98(204) | 57%(1.30) | 12.05 | 8.05(13) | 1.90(7) | 3.7, 7.4 | 30% | 7.49 | 6(2)/64(13) | 8.58(14%) | E288¹ | 1.10
GaTech | 2 | 2.98(204) | 57%(1.30) | 12.05 | 8.05(13) | 1.90(7) | 3.7, 7.4 | 29% | 6.41 | 5(2)/64(13) | 7.41(12%) |  | 1.08
Caltech | 2 | 3.44(204) | 35%(1.33) | 11.19 | 6.93(13) | 3.47(7) | −2.9, 13.8 | 30% | 9.46 | 9(2)/64(13) | 9.91(17%) | E288 | 1.01
CXCR4/CVX15
UMich-Zhang | 5 | 2.88(204) | 53%(1.08) | 8.19 | 6.73(26) | 4.11(12) | 58.8, 57.5 | 26% | 8.88 | 7(4)/181(26) | 11.64(6%) | n/a | 2.40
COH-Vaidehi | 4 | 4.99(204) | 49%(1.03) | 9.30 | 9.05(26) | 5.19(12) | 0., 4.4 | 37% | 15.61 | 11(1)/181(26) | 12.96(7%) | n/a | 1.92
COH-Vaidehi | 3 | 4.99(204) | 49%(1.03) | 9.30 | 9.20(26) | 5.37(12) | 0., 4.4 | 34% | 13.90 | 6(5)/181(26) | 9.51(5%) | n/a | 1.52
UMich-Zhang | 4 | 3.22(204) | 51%(1.01) | 8.73 | 7.31(26) | 5.25(12) | 66., 65.1 | 35% | 17.06 | 8(5)/181(26) | 10.43(6%) | n/a | 1.34
UMich-Zhang | 1 | 2.82(204) | 60%(1.18) | 6.78 | 6.40(26) | 3.66(12) | 73.1, 67.3 | 41% | 10.53 | 6(1)/181(26) | 5.49(3%) | n/a | 1.21
MolLife | 2 | 3.88(149) | 32%(1.56) | 9.96 | 5.78(25) | 3.84(12) | 60.6, 92.3 | 43% | 11.85³ | 4(1)/181(26) | 6.14(3%) | n/a | 1.18

¹ Contact geometry incompatible with hydrogen bond formation.
² ECL2 not modeled or incomplete.
³ Ligand structure errors.

The highest degree of accuracy achieved in the D3/eticlopride assessment is represented by model #3 by PompeuFabra (Figure 8A). It ranked first both in terms of ligand pose (0.96 Å RMSD to the crystal structure) and atomic contacts (58% correct contacts). The ligand is in a correct conformation (0.34 Å RMSD after ligand structure superimposition) though translated by ~0.8 Å with respect to its position in the target structure. The correctly predicted residue contacts include the charged amine interaction with D110, as well as contacts with V111, V189, S192, S193, F345, F346, H349, and Y373 (9 out of 15 target contact residues). The contact with I183 was not reproduced due to a ~3.5 Å shift in the predicted position of ECL2, while other contacts were probably missed because of the ligand shift. Sixty-three percent of the TM bundle is superimposable onto the target structure with backbone RMSD of 0.52 Å, but the ECL2 RMSD is about 2.87 Å. The errors in ECL2 placement lead to overall pocket residue RMSD of 1.5 Å in this model, while the TM part of the pocket is only 1.16 Å from the target structure. In the docking procedure, eticlopride was restrained to form a salt bridge between the positively charged nitrogen and the carboxylate of Asp110^3.32. The poses were relaxed and refined using MD simulations of ECL2, binding site, and the ligand, alone and in combinations. Final re-scoring was performed using a consensus of several available scoring functions, taking into account the formation of two intramolecular hydrogen bonds in eticlopride.

Figure 8. Top scoring models of the three GPCR Dock 2010 assessment targets.

These and other models can also be viewed in interactive mode on the assessment results page at http://ablab.ucsd.edu/GPCRDock2010/.

(A) Most accurate models of D3/eticlopride complex: up to 60% of ligand-pocket contacts are reproduced correctly, and the ligand pose and orientation closely resemble those observed in the X-ray structure.

(B) Top scoring models of the CXCR4/IT1t complex. Although the models reproduce the critical hydrogen bonding interaction of the ligand with Asp97, and in one case even with Glu288, very few other contacts are captured, and the ligand orientation is only approximately correct.

(C) Top-scoring model of the CXCR4/CVX15 complex. Relative orientation of the peptide and its β-hairpin fold are reproduced with some degree of similarity to the experimental structure. Only a few residue contacts are captured.

Model #1 by CDD-CMBI (Figure 8A) is just as good in terms of ligand/pocket contacts (57% of correct contacts, 12/15 residues), but has errors in ligand conformation, including the incorrect orientation of the 5-ethyl attachment on the benzene ring as well as a ~60° rotation of the ethylpyrrolidine ring (ligand heavy atom RMSD of 2.13 Å). It therefore illustrates that ligand heavy atom RMSD is a measure dominated by the most deviating fragments of the molecule; as such, RMSD may not be the best choice for evaluating the correctness of docking poses and must be complemented by other measures such as atomic contacts. For this model, structure-based pharmacophores, complementary to known active compounds, were used to generate initial low-resolution poses, which were subsequently redocked and optimized to generate the final models. Model #1 by COH-Vaidehi (Figure 8A) also features a very accurate ligand placement (1.22 Å RMSD), but has a slightly lower number of correct contacts (51% contacts, involving 10/15 residues). Pose selection here was performed using existing mutagenesis data.

In all cases, correct ligand placement and contacts were achieved despite errors in the prediction of ECL2: none of the top-scoring models had an ECL2 backbone RMSD lower than 2.8 Å, and none predicted the ligand interaction with I183 in that loop.

For the CXCR4/IT1t complex, VU-MedChem model #5 (Figure 8B) predicted the largest number of correct ligand-pocket contacts (36%). Though only 5 of 13 residues interacting with the ligand were identified, the critical hydrogen bonding interactions with D97 and E288 were captured correctly. In the process of model building, the authors formulated several hypotheses about interactions of the basic centers in IT1t with acidic residues in the CXCR4 TM domain (D97^2.63, D171^4.60, D262^6.58 and E288^7.39) whose importance was also confirmed by mutagenesis (Wong et al., 2008); they consequently chose ligand poses that satisfied some of these hypotheses (de Graaf and Rognan, 2009). Despite the presence of a proline-imposed kink (TXP motif) in helix II, its top, including D97, is bent outwards by about 5 Å and slightly rotated in comparison with the crystal structure. As a result, the ligand is shifted in the direction of helices II and III by approximately 5.5 Å. Its orientation, however, is similar to the crystallographic position. Overall topology of the TM domain (modeled by homology from β2AR) is correct, with 64% of the helices closely superimposable with the target structure (partial backbone RMSD of 1.14 Å). The extracellular loops are not predicted correctly, e.g. ECL2 backbone RMSD is 7.42 Å.

COH-Vaidehi models #1 and #2 (Figure 8B) were most accurate in terms of ligand RMSD. The dicyclohexylthiourea portion of the ligand was docked correctly, while the imidazothiazole system was flipped leading to overall ligand heavy atom RMSD of 2.47 Å and 2.88 Å for models #1 and #2, respectively. The location of the binding pocket was selected based on mutation data for cyclam and non-cyclam compounds (Rosenkilde et al., 2007; Wong et al., 2008). The models were selected by optimal protein-ligand contacts from the top docked poses ranked by all-atom binding energies (see Supp. Methods). Quite surprisingly, model #1 reproduced only 18% of correct contacts which involved 2 residues, D97 and V112. In model #2, some of these contacts were lost, but one contact with W94 and two contacts with Y116 were captured, leading to the total of 17% contacts with 4 residues. Both models rank relatively low in terms of protein structure prediction: they have the Z-score of −1.08, with about 44% of the TM domain superimposable with the crystal structure (RMSD of 1.09 Å and 1.08 Å, respectively); however, the most significant deviations are found on the cytoplasmic side and on the extracellular sides of helices I, IV and V, i.e. in the regions not involved in ligand binding. Though extracellular loops are far from crystallographic conformation, the kink in helix II is modeled very accurately (about 2° rotation for W94 and D97) placing the ligand-binding part of helix II within 1.3 Å of the target position. These authors generated the TM domain model by homology using A2AAR and β2AR crystal structures as templates, and adjusted the rotational orientation of helices II and IV using a previously published LITICON protocol (Bhattacharya and Vaidehi, 2010) so that D97 and D171 are completely inside the binding pocket.

Modeling the CXCR4/CVX15 peptide complex represented the biggest challenge of GPCR Dock 2010. The top model of this complex (#5 by UMich-Zhang, Figure 8C) has a Z-score of 2.4, exceeding the other models in accuracy; however, this accuracy is far from either experimental or even modeling accuracy in the other two assessments. Only the overall orientation and topology of the peptide are predicted correctly, including the β-hairpin fold, solvent exposed crown and buried termini, and N to C direction. The peptide RMSD from the target structure is almost 9 Å, and only 6% of the contacts are captured in this model: Arg1 of the peptide with D187, Nal3 with Q200, and Arg14 with D262. In the process of model generation, the peptide β-hairpin conformation was imposed and maintained using distance restraints; docking poses were generated by random translation and rotation in the binding site and assessed on the basis of shape and chemical feature complementarity and ability to satisfy the experimental restraints. The β-hairpin fold of ECL2 is predicted, though the two β-strands are shifted with respect to each other (e.g. the backbone carbonyl of D187 makes a hydrogen bond with S178 instead of N176). The position of ECL2 sinking in the pocket is incorrect, rather resembling its bRho template. Fifty-three percent of the TM domain is superimposable with the structure (partial RMSD of 1.08 Å), though the top parts of all helices except helix II are shifted by 2-3 Å.

Discussion: current status of GPCR modeling and docking

Along with GPCR Dock 2008, the present assessment helps to define the boundaries for reliable structural modeling of GPCRs and their complexes with ligands. It illustrates that the availability of an experimentally solved homology modeling template with ~35-40% sequence identity to the target (that preferably belongs to the same GPCR family) enables the computational groups to produce complex models that approach the level of accuracy observed in the experiment. By projecting this result onto the GPCR phylogenetic tree, we can pinpoint the targets that may be modeled sufficiently accurately from the existing structural templates (Figure 9). In contrast, distant homology modeling (~25-30% sequence identity between target and template, with template and target belonging to different families) still needs much improvement to reach docking application accuracy comparable to the crystal structure. The modeling accuracy for the three GPCR targets evaluated so far decreases in the order D3 > A2AAR > CXCR4, which follows the order of corresponding target-template phylogenetic distances. Though prediction accuracy depends on a variety of additional factors, e.g. shape and hydrophobicity of the pocket, this circumstantial correlation reflects the general tendency in homology modeling and should be taken into account for selection of structural templates for modeling and targets for crystallization efforts.

Figure 9. GPCR phylogenetic tree highlighting the recently solved GPCR structures and their homologs.

Flags indicate high-resolution structures of β1 and β2 adrenergic (red), adenosine A2A (yellow), dopamine D3 (cyan) and chemokine CXCR4 (green) receptors, used as templates. Homologs with more than 35% TM domain sequence identity to a template are circled, while those with more than 50% identity in the ligand binding pocket only are also highlighted with the color of the corresponding template. For reference, TM domain sequence identity between the dopamine D3 receptor and β2AR is about 38%, while pocket-only identity is 54%.

As expected, modeling variable and flexible regions such as ECL2 represents the most difficult task. In many cases (CXCR4/IT1t and D3/eticlopride), the ligand binding pocket is mostly formed by the TM residues, and therefore ligand position and contacts can be roughly predicted based on the TM domain only. However, even for this region, ambiguities in target-template sequence alignment (as in the case of the CXCR4 helix II kink) and significant structural deviations make distant homology modeling challenging.

The assessment illustrated that the importance of using the biochemical, biophysical, QSAR, and other experimental data cannot be overestimated. This type of data can often be misinterpreted (e.g. allosteric effects of a mutated residue are mistakenly reported as its direct contact with the ligand (Jaakola et al., 2010; Kim et al., 2003)); however, the most accurate predictions in the present assessment were generated using the mutagenesis data. Information about compound binding and activity is another valuable piece in the puzzle that can be used for model selection either in the form of structure-based pharmacophores, or as a VLS enrichment benchmark; both of these proved to be successful strategies in GPCR Dock 2010 (Supp. Table S2).

The results of this prediction exercise suggest that fully automatic solutions for docking to modeled GPCRs remain out of reach. Though most predictions involved extensive computation, expert knowledge of the subject area and chemical intuition remained the main determinants of success in the distant homology cases. That said, the accuracy of the top predictions of the D3/eticlopride complex was sufficient to capture the key aspects of ligand recognition and binding and, possibly, to support chemical discovery and design efforts for this target. In the past, few if any efforts of de novo protein-ligand complex predictions led to models that were subsequently confirmed at atomic resolution, far less for targets as challenging as membrane receptors. The partial successes observed in GPCR Dock are a testament to the confidence that the field has gained, and show that some previously inaccessible targets, with human input, now fall within its remit.

Conclusion

Three novel crystal structures in GPCR Dock 2010 dramatically expanded the range of assessed modeling problems, which now includes relatively close homologues within a GPCR family, distant homology with a different branch of the Class A GPCR tree, as well as a peptide-receptor complex. For the close homology case, represented by the dopamine D3 receptor with ~40% sequence identity to adrenergic structural templates, many groups show reasonably accurate ligand docking results, with the top models approaching the accuracy of ligand placement in the crystal structures and potentially providing a basis for structure-based chemical discovery efforts. For more distant homology, represented by the CXCR4 target with ~25% sequence identity to the 3D templates, only three groups captured the overall position of the small molecule ligand (RMSD < 5 Å) and only one identified more than 20% of ligand-receptor atomic contacts, pointing to a spectrum of challenges for such modeling. Thus, even when ligand binding is largely defined by structurally conserved TM regions of GPCRs, a modeler has to deal with substantial variations in kinks and helical structure in the binding pocket region, which impact the quality of docking. While some aspects of these variations (e.g. CXCR4 helix II kink and rotation) can be modeled with the help of experimentally derived restraints, others are likely to go undefined. Finally, for the most challenging case of the CXCR4-peptide complex, the modeling is further complicated by the fact that peptide interactions are defined by highly variable extracellular loops and the N-terminus of the receptor, which are not amenable to accurate predictions so far.

Like the previous assessment, the results of GPCR Dock 2010 demonstrate the advantage of hypothesis-driven approaches which take maximum advantage of available experimental information about the target and its ligands. Improving coverage of the GPCR family with experimentally determined structures, continued efforts for biophysical characterization of GPCRs and their complexes with ligands, and of course improved conformational modeling that makes good use of these data will help to advance towards comprehensive understanding of GPCR structural diversity.

Materials and Methods

Data collection and filtering

The GPCR Dock 2010 registration and model submission system was implemented and made available online at http://gpcr.scripps.edu/GPCRDock2010/. Authors were requested to submit at most five models for each target they chose to model, in the PDB format, using the provided PDB file templates. At the analysis stage, the models were checked for correctness of the protein and ligand covalent geometry and bond connectivity. In an attempt to be inclusive, we did not discard any models with errors in ligand covalent geometry; most of these errors could be unambiguously fixed, and the remaining cases (two D3/eticlopride, five CXCR4/IT1t, and eighteen CXCR4/CVX15 models) were compared to the target structures by the maximal common ligand substructure.

Structure prediction of the TM helix bundle and the second extracellular loop

For all spatial comparison criteria, the protein molecule of each model was first superimposed onto the backbone Cα, C, and N atoms of the TM helices of the target structure. TM regions in both target receptors were defined by residue stretches 1.30-1.60, 2.37-2.66, 3.22-3.54, 4.38-4.61, 5.37-5.64, 6.28-6.60, and 7.31-7.55 in Ballesteros-Weinstein notation (Ballesteros and Weinstein, 1995). In this notation, a single most conserved residue among the class A GPCRs is designated x.50, where x is the TM helix number; all other residues on that helix are numbered relative to this conserved position. Superimposition was performed using an adaptive algorithm that iteratively finds the region of higher similarity by assigning distance-dependent Gaussian weights to deviating fragments of the structure (Abagyan and Kufareva, 2009; Bottegoni et al., 2009). Application of this algorithm ensured that the superimposition quality was not dominated by a single flexible and/or poorly predicted part, e.g. one deviating part of a helix.
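As an illustration of the numbering scheme just described, the following minimal Python sketch converts sequence numbers to Ballesteros-Weinstein labels. The function names are hypothetical, and the x.50 anchor positions are not taken from an alignment: they are back-calculated from the CXCR4 labels quoted in this paper (D97 as 2.63, E288 as 7.39). The sketch also ignores helical bulges and alignment gaps, which a real implementation would need to handle.

```python
def bw_label(helix: int, resnum: int, anchor_resnum: int) -> str:
    """Return the Ballesteros-Weinstein label of a residue on a TM helix.

    anchor_resnum is the sequence number of the most conserved residue on
    that helix, which by definition carries the index x.50.
    """
    index = 50 + (resnum - anchor_resnum)
    return f"{helix}.{index}"


def anchor_from_label(resnum: int, label: str) -> int:
    """Recover the x.50 sequence number from one known (residue, label) pair."""
    index = int(label.split(".")[1])
    return resnum - (index - 50)


# CXCR4 labels quoted in the text: D97 is 2.63 and E288 is 7.39.
anchor_tm2 = anchor_from_label(97, "2.63")   # sequence number of 2.50
anchor_tm7 = anchor_from_label(288, "7.39")  # sequence number of 7.50

# Label of W94, the other critical residue at the top of helix II.
w94_label = bw_label(2, 94, anchor_tm2)
print(anchor_tm2, anchor_tm7, w94_label)
```

Because every label is an offset from a single conserved position per helix, residues at equivalent structural positions in different receptors share the same label, which is what makes pocket comparisons across D3 and CXCR4 possible.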

For the superimposed TM bundles, RMSD of Cα, C, and N atoms of the model from their respective counterparts in the target structure was calculated. The fraction of TM bundle for which high-quality superimposition was found (< 2 Å RMSD) and the corresponding partial RMSD were also reported. With the same superimposition of TM helices, we calculated RMSD of backbone atoms of the model’s ECL2 from that of the target structure. We chose to focus on ECL2 rather than on all extracellular parts of the protein because of its size and the critical role it plays in ligand binding for many GPCRs. ECL2 was defined by residues F171-N185 in D3 and by residues A174-E179, R183-N192 in CXCR4. The tip of ECL2 β-hairpin (residues A180, D181, and D182) was omitted from ECL2 comparison for CXCR4 because this region was disordered in the majority of target structures, and was the most flexible in others as demonstrated by its structural variability and high B-factor values.
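The superimposition and RMSD steps above can be sketched in Python with NumPy. This is not the ICM implementation used in the assessment: it is a generic weighted Kabsch fit that is re-run with Gaussian weights of the current per-atom deviations, and the width `sigma`, the iteration count, and all function names are assumptions made for illustration only.

```python
import numpy as np


def weighted_kabsch(mobile, target, weights):
    """Rigid transform (R, t) minimizing the weighted squared deviation."""
    w = weights / weights.sum()
    mob_center = w @ mobile
    tgt_center = w @ target
    X = mobile - mob_center
    Y = target - tgt_center
    H = (X * w[:, None]).T @ Y                # weighted covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ mob_center
    return R, t


def adaptive_superpose(mobile, target, sigma=2.0, n_iter=20):
    """Refit repeatedly, down-weighting atoms that deviate strongly."""
    weights = np.ones(len(mobile))
    for _ in range(n_iter):
        R, t = weighted_kabsch(mobile, target, weights)
        moved = mobile @ R.T + t
        dev = np.linalg.norm(moved - target, axis=1)
        weights = np.exp(-(dev / sigma) ** 2)  # distance-dependent Gaussian weights
    return moved, dev


def rmsd(dev):
    """RMSD from a vector of per-atom deviations."""
    return float(np.sqrt(np.mean(dev ** 2)))


# Toy check: a rotated, shifted copy of 30 points with 3 badly displaced atoms.
rng = np.random.default_rng(0)
target = rng.normal(scale=10.0, size=(30, 3))
c, s = np.cos(0.7), np.sin(0.7)
rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
mobile = target @ rot_z.T + np.array([5.0, -3.0, 2.0])
mobile[:3] += 12.0                             # one poorly predicted fragment
moved, dev = adaptive_superpose(mobile, target)
well_fit = dev < 2.0
print(f"full RMSD {rmsd(dev):.2f}, fraction under 2 A {well_fit.mean():.2f}")
```

In the toy case the three displaced atoms end up with negligible weight, so the remaining atoms superimpose almost exactly instead of sharing the error, which is the behavior the adaptive algorithm is meant to provide.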

For CXCR4, additional attention was paid to the rotation of the top part of helix II that carries two critical ligand-interacting residues, W94 and D97. The accurate modeling of this region by homology with the available structural templates (bRho, β1 and β2 adrenergic receptors, and A2AAR) required introduction of a one-residue gap in the alignment that results in ~100° rotation in the top of helix II (residues 91-100) and orients W94 and D97 towards the binding pocket (Figure 2B). To assess the extent of rotation in helix II, the TM domain of each model was superimposed onto the target structure as described above; the model was then translated in space to ensure the optimal overlay of the helical axis of the top part of its helix II with the corresponding axis in the target. The two angles were measured: one angle between the projections of W94 Cβ atoms onto the plane perpendicular to the helical axes, and another angle between the projections of D97 Cβ atoms.
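The angle measurement described above can be sketched as follows, assuming the helical axes of model and target have already been overlaid so that a single axis (a point and a direction) describes both. The function names and the sign convention (positive for counter-clockwise rotation when looking down the axis direction vector) are assumptions; the paper does not state its convention.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def project_onto_plane(point, axis_point, axis_dir):
    """Project a point onto the plane perpendicular to the helical axis."""
    norm = math.sqrt(_dot(axis_dir, axis_dir))
    unit = [x / norm for x in axis_dir]
    rel = _sub(point, axis_point)
    along = _dot(rel, unit)
    return [r - along * u for r, u in zip(rel, unit)], unit


def rotation_about_axis(cb_target, cb_model, axis_point, axis_dir):
    """Signed angle (degrees) between projected C-beta positions."""
    v_t, unit = project_onto_plane(cb_target, axis_point, axis_dir)
    v_m, _ = project_onto_plane(cb_model, axis_point, axis_dir)
    angle = math.atan2(_dot(unit, _cross(v_t, v_m)), _dot(v_t, v_m))
    return math.degrees(angle)


# Toy example: axis along z; the model C-beta is a quarter turn away from
# the target C-beta and sits at a different height along the helix.
angle = rotation_about_axis(cb_target=[5.0, 0.0, 3.0],
                            cb_model=[0.0, 5.0, 4.0],
                            axis_point=[0.0, 0.0, 0.0],
                            axis_dir=[0.0, 0.0, 2.0])
print(round(angle, 1))
```

Projecting onto the perpendicular plane removes any difference in height along the helix, so the reported angle reflects rotation only.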

Binding pocket prediction

Similarity of the predicted to the experimental pocket residue content was assessed by calculating and comparing the residue backbone and side-chain surface areas that become solvent-inaccessible in the presence of the ligand in the target structures and in the models. Accessible surface area calculations were performed using the Shrake and Rupley algorithm implemented in ICM (Abagyan et al., 2009; Abagyan et al., 1994). A binding pocket was formalized as a vector P of length 2n, where n is the number of residues in the protein, with components P[2i − 1] and P[2i] equal to the decrease in accessible backbone and side-chain areas of the i-th residue upon ligand binding. For each target structure/model pair with pockets P_R and P_M, a pocket similarity vector P_RM was constructed using P_RM[i] = Min(P_R[i], P_M[i]). The weight of this similarity vector was calculated as |P_RM| = Σ_i P_RM[i] and compared to the weight of the target pocket, |P_R|. The result was reported as a real number continuously distributed on the interval [0,1], or as a percentage. This number has the meaning of recall (or coverage) in statistical classification terminology (TP/(TP+FN)); the value corresponding to precision (TP/(TP+FP)) can be obtained by comparing |P_RM| to the weight of the model pocket vector, |P_M|. However, we did not use the pocket precision value in model evaluation and do not report it in the present publication.
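A minimal Python sketch of the pocket overlap measure defined above, assuming the per-residue buried areas have already been computed (for example with any Shrake-Rupley implementation). The function name and the toy numbers are illustrative and are not data from the assessment.

```python
def pocket_overlap(p_ref, p_model):
    """Return (recall, precision) for two buried-area vectors.

    Each vector has length 2n and stores, for every residue, the backbone
    and side-chain surface area (in square angstroms) buried by the ligand.
    """
    if len(p_ref) != len(p_model):
        raise ValueError("pocket vectors must cover the same residues")
    shared = sum(min(r, m) for r, m in zip(p_ref, p_model))
    recall = shared / sum(p_ref)        # fraction of the target pocket found
    precision = shared / sum(p_model)   # fraction of the model pocket that is real
    return recall, precision


# Toy pocket with three residues: [bb1, sc1, bb2, sc2, bb3, sc3].
p_ref = [10.0, 30.0, 0.0, 20.0, 5.0, 35.0]
p_model = [5.0, 30.0, 10.0, 20.0, 0.0, 15.0]
recall, precision = pocket_overlap(p_ref, p_model)
print(f"recall {recall:.3f}, precision {precision:.3f}")
```

Taking the element-wise minimum means a residue contributes only as much buried area as both structures agree on, so over-burying a residue in the model does not inflate recall.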

Similarity of the pocket residue conformations was evaluated by measuring RMSD between the heavy atoms of the residues that constituted the binding pockets in the target structures. For D3 and CXCR4/IT1t complexes, binding pockets were defined as the sets of residues for which the average contact strength with the ligand exceeded the value corresponding to the distance of 4 Å. For the CXCR4/CVX15 complex, where only a single structure was available, the binding pocket was defined as the set of residues with non-hydrogen atoms at a distance of ≤4 Å from the ligand. Specifically, the sets of target pocket residues included:

  • D3/eticlopride complex: F106, D110, V111, C114, I183, V189, S192, S193, W342, F345, F346, H349, Y365, T369, and Y373 (15 residues: 14 in TM domain and one in ECL2)

  • CXCR4/IT1t complex: W94, D97, A98, W102, V112, H113, Y116, R183, I185, C186, D187, R188, and E288 (13 residues: 7 in TM domain and 6 in extracellular loops)

  • CXCR4/CVX15 complex: P27, H113, Y116, T117, D171, S178, C186, D187, R188, F189, Y190, P191, N192, D193, V196, F199, Q200, Y255, D262, I265, L266, E277, H281, I284, S285, and E288 (12 TM domain residues and 14 extracellular loop residues)

The optimal superimposition of TM domains was performed prior to the binding pocket comparison as described above. Residue symmetry was taken into account when calculating pocket RMSD.

Ligand RMSD

RMSD of the ligand non-hydrogen atoms from their respective counterparts in the crystallographic structure was determined after superimposition of the model onto the target structure as described above. Internal ligand symmetry was taken into account for RMSD definition as well as other calculations. For example, for the isothiourea IT1t molecule co-crystallized with CXCR4, as many as 16 atom permutations are possible that result in exactly the same ligand covalent geometry and bond topology; all of these were tested and the one with the smallest RMSD to the model was chosen.
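The symmetry-corrected RMSD described above amounts to taking the minimum over all topologically equivalent atom orderings. The sketch below assumes the list of equivalent permutations is already available (in practice it would come from a graph automorphism search on the ligand); the function names and the three-atom toy ligand are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered lists of (x, y, z) tuples."""
    sq = sum((a - b) ** 2
             for pa, pb in zip(coords_a, coords_b)
             for a, b in zip(pa, pb))
    return math.sqrt(sq / len(coords_a))


def symmetric_rmsd(ref_coords, model_coords, permutations):
    """Smallest RMSD over all equivalent reorderings of the reference atoms."""
    return min(rmsd([ref_coords[i] for i in perm], model_coords)
               for perm in permutations)


# Toy ligand: atom 0 is unique, atoms 1 and 2 are chemically equivalent
# (like the two oxygens of a carboxylate) and are labeled in swapped order.
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]
perms = [(0, 1, 2), (0, 2, 1)]

naive = rmsd(ref, model)
corrected = symmetric_rmsd(ref, model, perms)
print(f"naive {naive:.3f}, symmetry-corrected {corrected:.3f}")
```

Without the correction, a perfectly placed ligand would be penalized merely for an arbitrary difference in atom labeling.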

Atomic contacts

In the traditional definition, an atomic contact is a pair of heavy ligand and protein atoms located closer to each other than a specified cutoff distance (usually 4 Å, (Rueda et al., 2010)). The number of contacts that are the same between the target structure and the model is calculated and compared to the total number of ligand-protein contacts in the target structure (recall) or in the model (precision). As with ligand RMSD, calculation of atomic contacts requires enumeration of topologically equivalent atom permutations in the ligand; moreover, some amino acids also possess internal symmetry that should be taken into account. Treating side-chain symmetry in the same way as ligand symmetry is possible, but it quickly leads to combinatorial explosion of the total number of permutations in the system. For this reason, and because the “wingspan” of symmetric groups in the protein side-chains is limited by three heavy atoms (e.g. Cζ, Nη1 and Nη2 in arginine), we accounted for side-chain symmetry by considering symmetric atoms indistinguishable instead of explicitly enumerating them.

We refined the definition of an atomic contact in an attempt to make it more robust and continuous. Instead of using a “hard” distance cutoff and counting a contact as present (1) for interatomic distances below this cutoff, and as absent (0) for the distances above this cutoff, we designed a continuous contact strength function that gradually decreased from 1 to 0 within a specified distance margin. The traditional “hard” definition of contact cutoff therefore corresponds to the margin of 0 in our scheme (Figure 10). Zero margin (a.k.a. hard cutoff) contact definition leads to unstable behavior of the function and its intolerance to even the smallest (< 0.1 Å) changes in side-chain and ligand conformations. The continuous decrease margin approach is devoid of this instability. Using the continuous decrease margin affected the relative ranking of intermediate and low quality models (data not shown), but not the best scoring models in all three assessments. At the same time, it yielded contact similarity values that were more stable and robust than atom contact number calculated in a traditional way, and better reflected the intuitive human perception of contact similarity.
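The difference between the hard cutoff and a continuous decrease margin can be illustrated with a short Python sketch. The exact functional form used in the assessment is given in Figure 10 and is not reproduced here; the linear ramp centered on the cutoff, the default margin of 1 Å, and the function name are all assumptions chosen only to show the idea.

```python
def contact_strength(distance, cutoff=4.0, margin=1.0):
    """Contact strength between two heavy atoms, between 0 and 1.

    With margin == 0 this is the traditional hard cutoff. Otherwise the
    strength falls linearly from 1 to 0 over a window of width `margin`
    centered on the cutoff (an assumed form, for illustration only).
    """
    if margin <= 0.0:
        return 1.0 if distance <= cutoff else 0.0
    lower = cutoff - margin / 2.0
    upper = cutoff + margin / 2.0
    if distance <= lower:
        return 1.0
    if distance >= upper:
        return 0.0
    return (upper - distance) / margin


# Two nearly identical geometries, 0.1 A apart, straddling the cutoff.
hard_jump = contact_strength(3.95, margin=0.0) - contact_strength(4.05, margin=0.0)
soft_jump = contact_strength(3.95, margin=1.0) - contact_strength(4.05, margin=1.0)
print(f"hard cutoff change {hard_jump:.2f}, continuous change {soft_jump:.2f}")
```

With the hard cutoff, a 0.1 Å displacement flips the contact from fully present to fully absent, whereas with the margin the strength changes only in proportion to the displacement.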

Figure 10. Definition and properties of atomic contact strength function with and without a continuous decrease margin.

(A) Atomic contact strength function definition for two atoms.

(B) Zero margin (a.k.a. hard cutoff, black curve) contact definition leads to unstable behavior of the function and its intolerance to even the smallest changes in side-chain and ligand conformations. The continuous decrease margin approach is devoid of this instability (colored curves).

In model evaluation, we constructed the vectors of atomic contact strengths for all ligand-protein atom pairs in the target structure (C_R) and in the model (C_M). The contact similarity vector C_RM was constructed using C_RM[i] = Min(C_R[i], C_M[i]); its weight was found as |C_RM| = Σ_i C_RM[i] and compared to the weight of the target contact strength vector, |C_R|. All possible topologically equivalent permutations of ligand atoms were tested and the one resulting in the highest similarity value was chosen. Similar to pocket definition evaluation, we only report recall (coverage) of correct contacts by the model; precision is disregarded in this evaluation. Protein and ligand covalent geometry and van der Waals interactions impose natural constraints onto precision values because they limit the number of contacts that a ligand can make with the neighboring side-chains in the model.
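The contact recall computation, including the search over equivalent ligand atom labelings, might look like the sketch below. Contact strengths are stored sparsely as dictionaries keyed by (ligand atom, protein atom) pairs, which is a representation chosen here for convenience; the atom names, strengths, and function name are made up for illustration.

```python
def contact_recall(c_ref, c_model, permutations):
    """Best recall of target contact strength over ligand relabelings.

    c_ref, c_model: dict mapping (ligand_atom, protein_atom) -> strength.
    permutations: list of dicts mapping each model ligand atom name to a
    topologically equivalent name (the identity mapping included).
    """
    ref_weight = sum(c_ref.values())
    best = 0.0
    for perm in permutations:
        relabeled = {(perm[lig], prot): s for (lig, prot), s in c_model.items()}
        shared = sum(min(s, relabeled.get(pair, 0.0)) for pair, s in c_ref.items())
        best = max(best, shared / ref_weight)
    return best


# Toy system: ligand atoms O1 and O2 are equivalent, N1 is unique.
c_ref = {("N1", "resA:OD1"): 1.0, ("O1", "resB:OG"): 0.8, ("O2", "resC:NE2"): 0.2}
c_model = {("N1", "resA:OD1"): 0.9, ("O2", "resB:OG"): 0.6}
identity = {"N1": "N1", "O1": "O1", "O2": "O2"}
swapped = {"N1": "N1", "O1": "O2", "O2": "O1"}

recall_identity = contact_recall(c_ref, c_model, [identity])
recall_best = contact_recall(c_ref, c_model, [identity, swapped])
print(f"identity only {recall_identity:.2f}, best labeling {recall_best:.2f}")
```

Pairs absent from a dictionary have zero strength, so a contact missing from the model contributes nothing to the shared weight but still counts in the target weight.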

Z-scores

Model Z-scores were calculated in the spirit of the previous assessment, GPCR Dock 2008 (Michino et al., 2009). Ligand RMSD values and fractions of correctly predicted ligand-protein contacts were independently converted into Z-scores (for RMSD, the negative of the Z-score was taken so that higher values correspond to better models in all cases); the two Z-scores were averaged. The new mean and standard deviation were calculated excluding the low-scoring models that deviated from the old mean by more than two standard deviations (SD), and new Z-scores were found using this corrected mean and SD. In the cases of CXCR4/IT1t and D3/eticlopride, for which multiple target structures were available, the structure resulting in the best Z-score was chosen for each model. A similar algorithm was used for assessment of protein prediction accuracy based on TM and ECL2 backbone RMSD.
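One possible reading of this procedure is sketched below in Python. Two points are assumptions rather than facts from the paper: the outlier-corrected mean and SD are applied to each measure separately before averaging, and the population standard deviation is used. The function names and the toy values are illustrative only.

```python
import statistics


def corrected_z(values, higher_is_better=True, sd_cut=2.0):
    """Z-scores recomputed after dropping models more than sd_cut SD below the mean."""
    sign = 1.0 if higher_is_better else -1.0
    scored = [sign * v for v in values]          # flip sign so higher is better
    mean = statistics.mean(scored)
    sd = statistics.pstdev(scored)
    kept = [s for s in scored if (s - mean) / sd > -sd_cut]
    mean_kept = statistics.mean(kept)
    sd_kept = statistics.pstdev(kept)
    return [(s - mean_kept) / sd_kept for s in scored]


def cumulative_z(ligand_rmsd, contact_fraction):
    """Average of the ligand RMSD and atomic contact Z-scores per model."""
    z_rmsd = corrected_z(ligand_rmsd, higher_is_better=False)
    z_contacts = corrected_z(contact_fraction, higher_is_better=True)
    return [(a + b) / 2.0 for a, b in zip(z_rmsd, z_contacts)]


# Toy set of seven models; the last one is a gross outlier in ligand RMSD.
ligand_rmsd = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 30.0]
contact_fraction = [0.58, 0.40, 0.35, 0.30, 0.25, 0.20, 0.0]
scores = cumulative_z(ligand_rmsd, contact_fraction)
print([round(s, 2) for s in scores])
```

Excluding gross outliers before recomputing the mean and SD keeps a few badly failed models from compressing the scores of all the others, which would otherwise make reasonable models indistinguishable.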

Modeling methods

Methods, techniques, and approaches used by the participants of GPCR Dock 2010 for complex model generation are described in Supplementary Materials.

Supplementary Material

Box 1.

Participants of GPCR Dock 2010

| Group Name (ID #) | Names | Department | Institution | E-mail |
| --- | --- | --- | --- | --- |
| PharmaDesign (0400) | Yasushi Yoshikawa, Toshio Furuya | Research & Development Division | PharmaDesign Inc., Tokyo, Japan | yoshikawa@pharmadesign.co.jp |
| UMich-Zhang (0460) | Huisun Lee, Ambrish Roy, John Grime, Joseph Rebehmed, Yang Zhang | Center for Computational Medicine and Bioinformatics | University of Michigan, Ann Arbor, MI | yangzhanglab@umich.edu |
| VU-MedChem (1006) | Luc Roumen, Iwan J.P. de Esch, Rob Leurs, Chris de Graaf | Department of Medicinal Chemistry | VU University, Amsterdam, The Netherlands | lroumen@few.vu.nl, c.de.graaf@few.vu.nl |
| Soochow (1135/4416) | Youyong Li, Tingjun Hou | Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices | Soochow University, Suzhou, China | yyli@suda.edu.cn |
| UCSF-Shoichet-2 (1178) | Michael M. Mysinger*, Dahlia R. Weiss*, John J. Irwin, Brian K. Shoichet | Department of Pharmaceutical Chemistry | University of California, San Francisco, CA | shoichet@cgl.ucsf.edu |
| Monash-Yuriev (1180) | Fiona M. McRobb, Ben Capuano, Ian T. Crosby, David K. Chalmers*, Elizabeth Yuriev* | Medicinal Chemistry and Drug Action, Monash Institute of Pharmaceutical Sciences | Monash University, Melbourne, Australia | David.Chalmers@monash.edu, Elizabeth.Yuriev@monash.edu |
| WUStL (1285) | Qi Wang, Robert H. Mach, David E. Reichert | Division of Radiological Sciences, Mallinckrodt Institute of Radiology | Washington University, St. Louis, MO | reichertd@wustl.edu |
| Strasbourg (1576) | Gwo-Yu Chuang, Didier Rognan | Structural Chemogenomics | University of Strasbourg, Illkirch, France | chuang@unistra.fr |
| Monash-Sexton-1 (1487) | John Simms, Patrick Sexton | Pharmacology | Monash University, Melbourne, Australia | john.simms@med.monash.edu.au |
| Monash-Sexton-2 (1813) | Denise Wootten, John Simms, Patrick Sexton | Pharmacology | Monash University, Melbourne, Australia | denise.wootten@med.monash.edu.au |
| Warsaw (2211) | Dorota Latek, Umesh Ghoshdastider, Slawomir Filipek | Faculty of Chemistry | University of Warsaw, Warsaw, Poland | sfilipek@chem.uw.edu.pl |
| LenServer (2364) | LenServer | School of Computer Science and Technology | Soochow University, Suzhou, China | LenServer.SU@gmail.com |
| Caltech (2556) | Andrea Kirkpatrick, Bartosz Trzaskowski, Adam Griffith, Soo-Kyung Kim, Ravinder Abrol, William A. Goddard III | Chemistry | Caltech, Pasadena, CA | abrol@wag.caltech.edu, wag@wag.caltech.edu |
| COH-Vaidehi (2560) | Nagarajan Vaidehi, Alfonso Lam, Supriyo Bhattacharya, Hubert Li, Gouthaman Balaraman, Michiel Niesen | Division of Immunology | Beckman Research Institute of the City of Hope, Duarte, CA | NVaidehi@coh.org |
| Evotec (2632) | Sandeep Pal | Computational chemistry and Molecular Modelling | Evotec Ltd, Abingdon, UK | Sandeep.pal@evotec.com |
| RHUL (2866) | Yrii Vorobjev², Natalia Bakulina², Victor Solovyev¹ | ¹Department of Computer Science, Royal Holloway | ¹Royal Holloway, University of London, Egham, UK and ²Softberry Inc., Mount Kisco, NY | victor@cs.rhul.ac.uk |
| Schrödinger (3041) | Thijs Beuming¹, Stefano Costanzi², Lei Shi³, Chris Higgs¹, Noeris Salam¹, Dmitry Lupyan¹, Woody Sherman¹ | ²Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases; ³Weill Cornell Medical College | ¹Schrödinger Inc., New York, NY; ²NIH, Bethesda, MD; ³Cornell University, NY | thijs.beuming@schrodinger.com |
| UNC (3532) | Feng Ding, Pradeep Kota, Srinivas Ramachandran, Nikolay V. Dokholyan | Biochemistry and Biophysics | University of North Carolina, Chapel Hill, NC | dokh@med.unc.edu |
| UCSF-Shoichet-1 (3646) | Jens Carlsson*, Ryan G. Coleman*, Hao Fan, Avner Schlessinger, John J. Irwin, Andrej Sali, Brian K. Shoichet | Department of Pharmaceutical Chemistry | University of California San Francisco, San Francisco, CA | jens7904@gmail.com, shoichet@cgl.ucsf.edu |
| QUB (3682) | Irina Tikhonova | School of Pharmacy, Medical Biology Centre | Queen's University, Belfast, UK | i.tikhonova@qub.ac.uk |
| UMich-Pogozheva / UMich-Lomize (3713/7425) | Irina Pogozheva, Andrei Lomize | Department of Medicinal Chemistry, College of Pharmacy | University of Michigan, Ann Arbor, MI | irinap@umich.edu, almz@umich.edu |
| Monash-Hall (3801) | Nathan E. Hall | Drug Discovery Biology, Monash Institute for Pharmaceutical Sciences | Monash University, Parkville, Australia | Nathan.Hall@monash.edu |
| KIAS (4374) | Muhammad Muddassar¹,²,³, Yang Zhang², Ae Nim Pae³, Jooyoung Lee¹ | ¹School of Computational Sciences; ²Center for Computational Medicine and Bioinformatics; ³Center for Neuromedicine | ¹Korea Institute for Advanced Study, Seoul, Korea; ²University of Michigan, Ann Arbor, MI; ³Korea Institute of Science and Technology, Seoul, Korea | mmuddassar@gmail.com |
| PompeuFabra (5084) | Laura Lopez, Cristian Obiol-Pardo, Jana Selent | Computer Assisted Drug Design | Pompeu Fabra University, Barcelona, Spain | jana.selent@upf.edu |
| Sydney (5207) | Sadia Mahboob, Tim Werner, W. Bret Church | Biomolecular Structure and Informatics, Faculty of Pharmacy | University of Sydney, Australia | bret.church@sydney.edu.au |
| GaTech (5334) | Michal Brylinski, Tadashi Ando, Aysam Guerler, Hongyi Zhou, Jeffrey Skolnick | Center for the Study of Systems Biology | Georgia Institute of Technology, Atlanta, GA | michal@gatech.edu |
| Helsinki-Xhaard (5508) | Henri Xhaard | Centre for Drug Research, Faculty of Pharmacy | University of Helsinki, Helsinki, Finland | henri.xhaard@helsinki.fi |
| Stockholm (6006) | Wiktor Jurkowski, Arne Elofsson | Center of Biomembrane Research | Stockholm University, Stockholm, Sweden | wiktor.jurkowski@cbr.su.se |
| UNSW (7141) | Ahsan K. Murad, Malgorzata Drwal, Tom B. Dupree, Renate Griffith | School of Medical Sciences | University of New South Wales, Sydney, Australia | r.griffith@unsw.edu.au |
| UNM (7334) | Liliana Ostopovici-Halip¹, Cristian Bologa² | ¹Institute of Chemistry; ²Division of Biocomputing | ¹Romanian Academy, Timisoara, Romania; ²University of New Mexico, Albuquerque, NM | lili.ostopovici@gmail.com, cbologa@salud.unm.edu |
| Baylor-Barth (7533) | K.M. Chen¹, J. Sun², Patrick Barth¹,² | ¹Verna and Marrs McLean Department of Biochemistry and Molecular Biology; ²Department of Pharmacology | Baylor College of Medicine, Houston, TX, USA | patrickb@bcm.edu |
| UWash (7571) | Vladimir Yarov-Yarovoy, David Baker | Pharmacology | University of Washington, Seattle, WA | yarovoy@uw.edu, dabaker@uw.edu |
| CDD-CMBI (8004) | Bas Vroling, Marijn P.A. Sanders, Sander B. Nabuurs | Computational Drug Discovery Group, Centre for Molecular and Biomolecular Informatics | Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands | s.nabuurs@cmbi.ru.nl |
| MolLife (8241) | Gregory V. Nikiforovich | | MolLife Design LLC, St. Louis, MO | gnikiforovich@gmail.com |
* Equal contributors

Highlights.

  • The GPCR Dock 2010 assessment featured three targets of varying modeling difficulty

  • 35 groups submitted 275 GPCR complex models prior to release of X-ray coordinates

  • Best predictions capture GPCR-ligand interaction details at atomic resolution level

  • Reliable homology modeling requires 35-40% sequence identity between target and template

Acknowledgements

The authors thank Joshua Kunken, Angela Walker, and Katya Kadyshevskaya for their help with data processing, manuscript preparation, and graphic design. The work was supported by NIH grants R01 GM071872 and U01 GM094612 to RA and U54 GM094618 to RCS.

Footnotes

Supplementary materials

Summaries of protein and ligand prediction accuracies for all GPCR Dock 2010 models, together with descriptions of the computational methods used by the participating groups, are given in the Supplementary materials. All GPCR Dock 2010 models can be interactively viewed or downloaded from the assessment results website at http://ablab.ucsd.edu/GPCRDock2010/.

