Proceedings of the National Academy of Sciences of the United States of America. 2012 Mar 19;109(14):5517–5522. doi: 10.1073/pnas.1120431109

Structure-based ligand discovery for the protein–protein interface of chemokine receptor CXCR4

Michael M Mysinger a,1, Dahlia R Weiss a,1, Joshua J Ziarek b,1, Stéphanie Gravel c,d, Allison K Doak a, Joel Karpiak a, Nikolaus Heveker c,d, Brian K Shoichet a,2, Brian F Volkman b,2
PMCID: PMC3325704  PMID: 22431600

Abstract

G-protein–coupled receptors (GPCRs) are key signaling molecules and are intensely studied. Whereas GPCRs recognizing small molecules have been successfully targeted for drug discovery, protein-recognizing GPCRs, such as the chemokine receptors, claim few drugs or even useful small molecule reagents. This reflects both the difficulties that attend protein–protein interface inhibitor discovery, and the lack of structures for these targets. Imminent structure determination of chemokine receptor CXCR4 motivated docking screens for new ligands against a homology model and subsequently the crystal structure. More than 3 million molecules were docked against the model and then against the crystal structure; 24 and 23 high-scoring compounds from the respective screens were tested experimentally. Docking against the model yielded only one antagonist, which resembled known ligands and lacked specificity, whereas the crystal structure docking yielded four that were dissimilar to previously known scaffolds and apparently specific. Intriguingly, several were potent and relatively small, with IC50 values as low as 306 nM, ligand efficiencies as high as 0.36, and with efficacy in cellular chemotaxis. The potency and efficiency of these molecules have few precedents among protein–protein interface inhibitors, and support structure-based efforts to discover leads for chemokine GPCRs.

Keywords: drug design, virtual screening, promiscuous aggregation


G-protein–coupled receptors (GPCRs) play a central role in many normal physiological pathways and altered diseased states, and are the targets of approximately 30% of marketed drugs (1). Ligand discovery against small-molecule GPCRs such as the bioamine receptors has been particularly productive, as have structure-based screens against their crystal structures (2–5). Targeting larger-molecule–recognizing GPCRs has been more difficult. Although multiple reagents are available for lipid and peptidergic GPCRs, their molecular weights are substantially higher than those typical for bioamine receptors, and they are less ligand efficient. This reflects the challenges faced in ligand discovery against peptide–protein and lipid–protein interfaces. These difficulties are still more acute against chemokine GPCRs, which recognize folded proteins of ∼100 amino acids in length and are thus protein–protein interface (PPI) targets (6). Although there are several example drugs in this class, such as maraviroc, plerixafor, and vorapaxar, finding organic molecules with good affinity and the physical properties of oral drugs is notoriously difficult for PPI targets, as reflected in the high molecular weight and hydrophobicity of the few PPI drugs (7).

A public competition to predict ligand complexes with the structure of C-X-C chemokine receptor 4 (CXCR4) inspired us to bring structure-based discovery to bear against a key member of the chemokine family (8). CXCR4 natively recognizes the CXCL12 chemokine, an 8-kDa protein. Like many other PPI targets, CXCR4 plays a key signaling role: it is constitutively expressed in many organs and is implicated in chemotactic roles as diverse as lymphopoiesis, myelopoiesis, embryogenesis, angiogenesis, cardiogenesis, neuron migration, and cerebral development (9, 10). The receptor is involved in disease states such as myocardial infarction/reperfusion injury (11), myelokathexis (12), human immunodeficiency virus (HIV) infection (13), and the growth and development of more than 20 different types of cancer (14). Despite intense interest, only a few potent and selective small molecule antagonists have been discovered for CXCR4 (15–17). New ligand chemotypes, the identification of which a structure-based approach enables, might provide leads to perturb the critical biology for which CXCR4 is responsible.

Notwithstanding intense effort, experimental structures of GPCRs remain scarce, and so homology models are often used for GPCR ligand discovery (18–21). Such models potentially enable structure-based discovery against many more targets than have been experimentally determined, but their reliability has rarely been tested prospectively. From a technical perspective, the ability to compare a discovery campaign against a homology model to one against the subsequently released crystal structure might illuminate model viability in an unbiased and wholly prospective way.

Thus, we had two broad questions that we hoped to address in this study. First, can we discover biologically useful ligands for CXCR4 using a structure-based approach? Second, how does a prospective docking screen against a homology model of the receptor compare with that against the crystal structure? The first question reflects the intense biological interest in this target and its problematic status as a PPI, with all of the challenges that those factors present for ligand discovery. The second question might inform which parts of the class A GPCR family are good candidates for homology-based drug discovery. A virtual screen of a dopamine receptor D3 (DRD3) homology model was recently shown to be as effective as a virtual screen against its crystal structure (22), but transmembrane sequence identity between D3 receptor and its nearest structural template is 42%. CXCR4 has, at best, 25% transmembrane sequence identity to the nearest template structure. If we can rely only on models with sequence identities as high as D3 receptor, then only about 10% of GPCRs might be modeled, given the current structural coverage; however, if 25% sequence identity suffices, then more than 70% are viable for structure-based ligand discovery. Experimental tests of docking hits against both model and crystal structure will potentially uncover new ligands for CXCR4, and may also inform the general usefulness of distant GPCR homology models.

Results

Homology Model Construction.

The effort began with calculating homology models for CXCR4. The GPCR Dock 2010 Assessment (8) challenged us to predict the orientation of the small molecule IT1t before release of the first CXCR4 structures (23). We followed a strategy (SI Appendix, Fig. S1) that used enrichment of known ligands to guide model selection, as pioneered in earlier studies (19). Initial homology models were refined for sequence alignment of CXCR4 to the four crystallographic templates then available, β1 and β2 adrenergic receptors, adenosine A2A receptor, and rhodopsin (SI Appendix, Fig. S2). To expand backbone diversity, we used low-frequency elastic normal modes to perturb template backbones (24). We calculated 576 and 510 homology models from the crystallographic templates and the perturbed structures, respectively. We docked known ChEMBL04 ligands (25) and property-matched decoys to each model and then measured the retrospective enrichment using adjusted LogAUC (26). Enrichment is a widely used metric in docking, reflecting the ranking of known ligands selected from a database of decoy molecules, compared with what would be expected at random. This can be expressed as either overall enrichment over random, or, as we do here, a log-weighted enrichment to emphasize the highest ranking molecules, which are the most likely to be selected for testing. In the adjusted LogAUC metric, a value of 0% represents completely random selection. To select models, we also used the rank of the cocrystal ligand IT1t (although its bound structure was still unknown), the number of ligands interacting with the critical residue E7.39 (27) (Ballesteros-Weinstein numbering), and the complementarity between docked ligands and modeled binding site. Five top models and their corresponding ligand IT1t binding positions were ultimately submitted to the competition.
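The log-weighted enrichment described above can be illustrated with a short sketch. It assumes one common formulation of adjusted LogAUC: the fraction of ligands found is integrated over log10 of the fraction of decoys found, from 0.1% to 100%, normalized by the width of that interval, and the value expected for random ranking (about 14.5%) is subtracted. The lower bound of 0.001, the step-wise integration, and the function name are assumptions for illustration; the exact procedure of ref. 26 may differ in detail.

```python
import math

def adjusted_logauc(labels, lam=0.001):
    """Adjusted LogAUC (in percent) for a docking-ranked list.

    labels: iterable ordered best score first; True = known ligand, False = decoy.
    lam:    lower bound of the decoy fraction for the log-scale integral (assumed 0.1%).
    """
    labels = list(labels)
    n_lig = sum(1 for is_ligand in labels if is_ligand)
    n_dec = len(labels) - n_lig
    if n_lig == 0 or n_dec == 0:
        raise ValueError("need at least one ligand and one decoy")

    area = 0.0
    ligands_found = 0
    decoys_seen = 0
    for is_ligand in labels:
        if is_ligand:
            ligands_found += 1
            continue
        # Each decoy advances the x-axis (decoy fraction) by one step; the
        # ligand fraction is constant across that step.
        lo = max(decoys_seen / n_dec, lam)
        decoys_seen += 1
        hi = decoys_seen / n_dec
        if hi > lo:
            area += (ligands_found / n_lig) * (math.log10(hi) - math.log10(lo))

    logauc = area / -math.log10(lam)
    random_logauc = (1.0 - lam) / math.log(1.0 / lam)  # expectation for random ranking
    return 100.0 * (logauc - random_logauc)

# Toy rankings (hypothetical data): all ligands first, then all ligands last.
best = adjusted_logauc([True] * 3 + [False] * 10)
worst = adjusted_logauc([False] * 10 + [True] * 3)
```

Under these assumptions a perfect ranking scores about 85.5% and random ranking scores 0%, which is why values in the 20–30% range count as strong enrichment.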

Before release of the crystal structure, we continued to develop homology models for prospective ligand discovery. In the final iteration, we built 2044 homology models without extracellular loop 2, docking each to 60 known ligands and 2456 property-matched decoys. Overall, 36 billion ligand orientations and 55 billion conformations were sampled, so more than 64 trillion complexes completed within 176 cpu-d or 9 h of wall clock time on our cluster. Based on the criteria mentioned above, we selected one model upon which we generated 1000 extracellular loop 2 variants. We selected a single-loop model with ligand enrichment of 22% LogAUC for prospective screening.
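As a rough consistency check on the throughput quoted above, the number of concurrently busy cores can be estimated by dividing total CPU time by wall-clock time. The sketch assumes perfect parallel efficiency, which the text does not state, and the function name is illustrative.

```python
def implied_cores(cpu_days: float, wall_hours: float) -> float:
    """Average number of busy cores implied by CPU time and wall-clock time."""
    return cpu_days * 24.0 / wall_hours

# Final homology-model iteration: 176 cpu-d completed in 9 h of wall-clock time.
cores = implied_cores(176, 9)
print(f"about {cores:.0f} cores busy on average")
```

On these figures the run kept a few hundred cores occupied, consistent with a mid-sized departmental cluster.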

Homology Model Virtual Screen.

To predict previously undiscovered CXCR4 ligands from the prospective homology model, we used DOCK 3.6 to virtually screen the lead-like subset of ZINC (28), i.e., molecules with molecular weights less than 350, logP less than 3.5, and 7 or fewer rotatable bonds. Each of the 3.3 million molecules in ZINC was sampled in an average of 11,000 orientations and 2,700 conformations, or 41 trillion complexes sampled overall. Each complex was scored for complementarity based on van der Waals (using a modified AMBER potential function) and electrostatic interaction energies (using potentials calculated with DelPhi), corrected for ligand desolvation (26). The full screen took 372 cpu-d, or 18 h of wall clock time on our cluster.
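The lead-like criteria quoted above amount to a simple property filter. The following sketch applies them to precomputed descriptors; the `Molecule` record and the three example entries are hypothetical, and in practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Molecule:
    name: str
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated logP
    rotatable_bonds: int

def is_lead_like(m: Molecule) -> bool:
    """Lead-like cutoffs as stated in the text: MW < 350, logP < 3.5, <= 7 rotatable bonds."""
    return m.mol_weight < 350 and m.logp < 3.5 and m.rotatable_bonds <= 7

# Hypothetical library entries, for illustration only.
library = [
    Molecule("mol_a", 320.4, 2.1, 5),
    Molecule("mol_b", 412.5, 4.2, 9),
    Molecule("mol_c", 349.9, 3.4, 7),
]
lead_like = [m.name for m in library if is_lead_like(m)]
```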

We then selected molecules for experimental testing. Commonly, this takes the form of visually inspecting the top ranking 500 molecules or up to 1% of the docking-ranked database. It is well-known that docking scoring functions are approximate and incomplete, but less discussed are problems with compound representation in database libraries (e.g., incorrect ionization states, overly strained conformations, and simple lack of availability from vendors). More generally, what makes a good lead molecule reflects a plurality of not only orthogonal but sometimes opposed criteria. For instance, larger, hydrophobic molecules will often bind tighter and score better, but biological efficacy and solubility often favor smaller, less hydrophobic molecules. Whereas we prefer molecules that engage all of their functional groups with the protein, we also are looking for the formation of key, “warhead” interactions with tightly defined geometric criteria. These and other extrathermodynamic criteria have not been reduced to a single function—and, given their opposed nature, this might be difficult to do—but may be rapidly evaluated by the eye of the trained investigator. In CXCR4, molecules were rejected, in rough order of importance, due to the following: (i) wrong ionization state, (ii) unavailability, (iii) high internal energy, (iv) unsatisfied polar interactions, and (v) low hit diversity. Molecules were prioritized for key salt bridges to E7.39 and at least one other anionic residue, plus a complementary fit to the binding site.

Before release of the crystal structure, we purchased 24 high-ranking molecules for testing, all in the top 1800 (0.05%) of 3.3 million molecules docked. One of these inhibited CXCL12 induced calcium flux in cell culture, with an IC50 of 107 μM, a hit rate of 4% (compound 1; Table 1 and Fig. 1A). Compound 1 ranked 1725th and fit deep within the modeled binding site, forming salt bridges to E7.39 and D6.58 in the putative docked complex (Fig. 2A). To measure chemical similarity to known CXCR4 ligands in ChEMBL09 (25), we represented compound 1 by a 2D topological fingerprint, ECFP4, and compared the bits (features) using the Tanimoto coefficient (Tc), as is widely done in the field (29). Despite a Tc of 0.36, indicating marginal chemical dissimilarity, the molecule was a combination of two previously observed chemotypes, so we did not consider it particularly dissimilar. Moreover, specificity counterscreens suggested that compound 1 also inhibited the related chemokine receptor CCR2.
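The Tanimoto comparison used above can be sketched on fingerprints represented as sets of on-bit indices. This is a minimal illustration: the bit sets are made up, and generating real ECFP4 fingerprints would require a cheminformatics toolkit that is not shown here.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)

def nearest_neighbor_tc(query: set, known_ligands: list) -> float:
    """Highest Tc between a query fingerprint and any known-ligand fingerprint."""
    return max(tanimoto(query, ligand) for ligand in known_ligands)

# Hypothetical fingerprints (bit indices), for illustration only.
query_fp = {1, 2, 3, 8}
known_fps = [{2, 3, 4}, {1, 2, 3, 8, 9, 10}, {20, 21}]
best_tc = nearest_neighbor_tc(query_fp, known_fps)
```

Reporting the maximum Tc over all known ligands, as in Table 1, gives the most conservative view of novelty, because a single close analog is enough to raise the value.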

Table 1.

Compound 1 identified from homology model docking screen, and compounds 2–5 identified from crystal structure docking screen

[Table 1 appears as an image in the source (pnas.1120431109t01.jpg); its contents are not reproduced here.]

*Ligand efficiency.

†Tanimoto similarity to the most similar CXCR4 small molecule ligand in ChEMBL09 database.

‡Closest CXCR4 small molecule ligand in ChEMBL09 database.

§Ranks over 5000 not filtered for broken molecules.

Fig. 1.

Dose–response curves shown for inhibitor compounds 1–5. (A–E) Percent calcium flux calculated as maximum minus minimum fluorescence as a percent of baseline (n = 6). (F) [125I]-CXCL12 radioligand displacement by compound 3.

Fig. 2.

Docking modes of our inhibitors. (A) The homology model with compound 1 (blue) as docked to the homology model. (B) The crystal structure with compound 1 (yellow) as docked to it. (C) Compound 2 (yellow) docked to crystal structure and the cocrystal ligand pose of small molecule IT1t (blue lines). (D–F) Compounds 3–5 docked to crystal structure.

Crystal Structure Virtual Screen.

With the crystal structure released, we again screened the lead-like subset of ZINC, now composed of 4.2 million molecules (28). Docking statistics were similar, with each molecule sampled in an average of 10,200 orientations and 2,100 conformations, or 87 trillion complexes sampled overall (SI Appendix, SI Methods). From among the top 0.03% of the docking hit list, we purchased 23 molecules for testing. Compounds 2–5 (17% hit rate) substantially inhibited CXCL12-induced calcium flux in cell culture, with IC50 values ranging from 55 to 77 μM (Table 1 and Fig. 1 B–E). In the docked poses, all four inhibitors formed salt bridges to E7.39 and D2.63 (Fig. 2 C–F). Three formed salt bridges through an unprecedented imidazole functional group. Again comparing chemical similarity to known CXCR4 ligands, Tc values of 0.23–0.32 support their uniqueness (Table 1). Although all four inhibitors are also topologically distinct from one another, the similar docked poses suggest that compounds 3 and 4 fall into the same structural class. All of the compounds have molecular weights of 300–350 and calculated logP values of 0.5–3.5, placing them within the lead-like (30) range.
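The hit rates quoted for the two campaigns follow directly from the counts of active and tested compounds. The short sketch below reproduces that arithmetic; the function name is illustrative.

```python
def hit_rate_percent(actives: int, tested: int) -> float:
    """Percentage of experimentally tested compounds that were confirmed active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * actives / tested

model_rate = hit_rate_percent(1, 24)     # homology-model screen: 1 active of 24 tested
crystal_rate = hit_rate_percent(4, 23)   # crystal-structure screen: 4 actives of 23 tested
print(f"model: {model_rate:.0f}%, crystal: {crystal_rate:.0f}%")
```

With so few compounds tested per campaign, each additional active shifts the rate by roughly four percentage points, so the comparison is best read as qualitative.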

Biological Activity.

For biological relevance, compounds must not only inhibit calcium flux, but must also inhibit lymphocyte migration. All five ligands inhibited human THP-1 monocyte migration induced by CXCL12 in cell culture (Fig. 3 and SI Appendix, Table S1), with compounds 1 and 2 almost completely inhibiting chemotaxis at 100 μM.

Fig. 3.

Inhibition of CXCL12-induced chemotaxis in THP-1 cells. (A) Following incubation with compounds 1–5, the number of cells that migrated into the lower chamber was counted. (B) Schematic representation of the Transwell chemotaxis chamber.

Small molecules may perturb chemokine signaling without competitive displacement of the large chemokine protein, for instance binding under it in the transmembrane part of the site (31, 32). Four of the five ligands modulated the binding of radiolabeled CXCL12 (Fig. 1F and SI Appendix, Fig. S3). Compounds 3 and 5 disrupted CXCL12 binding with IC50 of 306 nM and 14 μM, respectively. Compound 1 reduced binding with an IC50 of 224 μM, but had a steep dose–response curve (aggregation counter screening, below). Compound 4, although efficacious as a signaling antagonist, actually increased CXCL12 binding, whereas compound 2 did not modulate binding at all. These observations are consistent with the often allosteric binding of small molecules to chemokine receptors.

Ligand efficiency (LE) corrects binding energies for size: the free energy of binding, approximated as −RT ln(IC50) (in kcal/mol), is divided by the heavy-atom count. Although LE is notoriously low (poor) for PPI inhibitors, compound 3 has an LE of 0.36, placing it in the range of that for oral drugs. The four other compounds have LE values of 0.24–0.28 (Table 1), more typical of PPI inhibitors.
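The ligand efficiency calculation can be sketched as follows, taking LE = −RT ln(IC50) / (heavy-atom count) with IC50 in molar units. The temperature of 298.15 K and the heavy-atom count of 25 used in the example are assumptions chosen for illustration; neither value is given in the text.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)

def ligand_efficiency(ic50_molar: float, heavy_atoms: int,
                      temperature_k: float = 298.15) -> float:
    """Ligand efficiency in kcal/mol per heavy atom, using IC50 as a proxy for Kd."""
    if ic50_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("IC50 and heavy-atom count must be positive")
    # RT ln(IC50) is negative for IC50 below 1 M, so negate to report LE as positive.
    delta_g = R_KCAL_PER_MOL_K * temperature_k * math.log(ic50_molar)
    return -delta_g / heavy_atoms

# IC50 of 306 nM from the text; 25 heavy atoms is a hypothetical count.
le_example = ligand_efficiency(306e-9, 25)
print(f"LE = {le_example:.2f} kcal/mol per heavy atom")
```

Because LE divides by size, a sub-micromolar ligand of about 25 heavy atoms reaches the mid-0.3 range, whereas the same potency spread over 40 heavy atoms would fall near 0.22.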

Model Analysis.

The release of the CXCR4 crystal structure allowed us to compare the prospective model against the experimental structure. In the prospectively screened model, ligand IT1t docks deep in the transmembrane bundle, similar to ligand binding in the structural templates. In the CXCR4 crystal structure, IT1t binds higher in the site (SI Appendix, Fig. S4), resulting in a poor root mean square deviation (RMSD) of 9.5 Å between observed and predicted ligand position (SI Appendix, Table S2). Overall binding site agreement was better, with a transmembrane binding site heavy-atom RMSD of 2.3 Å. Retrospective enrichment of known ligands was also high at 21% LogAUC (30% LogAUC before modeling extracellular loop 2), comparing favorably with the crystal structure at 28% LogAUC (SI Appendix, Fig. S5A). This placed it at the top of the enrichment distribution among all of the models we built, despite having an average binding site RMSD to the crystal structure (SI Appendix, Fig. S6).
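The RMSD values above compare atom positions between predicted and observed structures. A minimal sketch of the calculation follows; it assumes that the two coordinate sets list the same atoms in the same order and already share a common receptor frame, so no superposition step is performed. The coordinates in the example are made up.

```python
import math

def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation (same units as input) between matched atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical ligand atoms: the predicted pose is the observed pose shifted by (3, 4, 0) Å.
observed = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(x + 3.0, y + 4.0, z) for x, y, z in observed]
pose_rmsd = rmsd(observed, predicted)
```

A rigid shift of a whole ligand gives an RMSD equal to the shift distance, so a value near 9.5 Å indicates a pose displaced by most of the depth of the binding site rather than a small local error.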

We were interested to compare our predicted model with those with higher fidelity to the ultimate crystal structure that had been submitted to the GPCR Dock 2010 Assessment (8), with a view to evaluating their usefulness for docking screens. We investigated the two top-scoring models in the assessment, both of which predicted the IT1t ligand pose better than our model had, and computed their retrospective enrichment of ChEMBL04 ligands. These two models, VU-5 (33) and COH-1 (34), led to ligand enrichments of 5% and 6% LogAUC, respectively, at least using our docking method (SI Appendix, Table S2 and Fig. S5A). We also docked the five previously undiscovered ligands and more than 3 million lead-like molecules in ZINC (SI Appendix, Table S3 and Fig. S5B); consistent with their modest retrospective enrichments of ligands in ChEMBL, neither model ranked any of the new ligands well. Docked against the VU-5 model, the best scoring previously undiscovered ligand was compound 5 with a rank of 3282, whereas none of the other four ligands ranked better than 312,000. Similarly, against the COH-1, the best-ranked ligand was compound 1, which ranked 19,977, whereas no other ligand ranked above 63,000. These docking results take nothing away from the success of these models in the competition, but support the idea that even the field's best models, at this level of sequence identity, may struggle to achieve a structural fidelity that is high enough to support new ligand discovery.

Posing this question another way, we wondered if we ourselves had explored a model closer to the crystal structure, from among the several thousand calculated, that might have performed better in the docking. We retrospectively used ligand RMSD, known ligand enrichment, and binding site RMSD to select the most accurate model from among 2044 loopless models we originally sampled. The selected model had a much improved ligand RMSD of 2.9 Å (compared with the 9.5 Å that we had originally predicted; SI Appendix, Table S2). Despite enriching known ligands well (LogAUC of 22%, SI Appendix, Fig. S5A), the performance of the second model was substantially below the 30% LogAUC found for the model that was ultimately used in the prospective docking. Indeed, when we docked the five previously undiscovered ligands against what was structurally the best of our sampled models, their rankings were mediocre: the top scoring of these molecules was compound 2, which ranked 15,344 of more than 3 million ZINC molecules docked, whereas three other previously undiscovered ligands ranked below 22,200. Meanwhile, for the truly prospective homology model, despite its poor prediction of the geometry of the crystallographic ligand IT1t, our top scoring ligand in the docking was compound 1, which ranked 2803 of the 3.3 million ZINC molecules docked, whereas compounds 2, 3, 4, and 5 ranked 50121, 30898, 5380, and 5800, respectively. This suggests that a combination of geometric and docking ranking criteria is appropriate in selecting models to be used for docking prediction, as has been suggested by others. Overall, these observations support the idea that a certain minimum of sequence identity is required to be able to calculate a high-fidelity model that can reliably select dissimilar ligands, a point to which we will return.

Aggregation Counter Screen.

Colloidal aggregation constitutes perhaps the greatest source of false-positive results in screens against soluble proteins, but has not previously been observed for membrane-bound receptors. All of our ligands had Hill slopes in the calcium flux assay of 1.5 or higher, which is often associated with a colloidal mechanism of inhibition. Although such a slope could be accounted for classically, by binding to a dimeric form of the receptor, as adopted in the crystal structure, all were counter screened for nonspecific inhibition due to aggregation (35, 36). To test for aggregation under our exact assay conditions, spin-down precipitation was used to remove putative colloids (37). After spin-down, the supernatant activity of compounds 1 to 5 was unaffected, whereas the activity of compound 6 (SI Appendix, Table S4) was sharply attenuated. Due to the high Hill slope of 5 and inhibition of the counter screen enzyme cruzain at 200 μM (SI), we cannot completely discount the aggregation of compound 1 at high concentration, although, at its IC50 value of 107 μM, it seems well behaved. Conversely, compound 6 (SI Appendix, Fig. S7) seems to be inhibiting CXCR4 via a colloidal aggregation mechanism, which constitutes a previously undocumented description of aggregation-based activity against membrane-bound receptors; this mechanism may merit future vigilance in GPCR screening efforts. The behavior of all four other antagonists was consistent with well-behaved, classical binding to CXCR4.
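The link between Hill slope and curve steepness can be made concrete with the standard Hill equation for inhibition. The sketch below is a generic illustration rather than the fitting procedure used in the study; the function names are illustrative.

```python
def fraction_inhibited(conc: float, ic50: float, hill: float) -> float:
    """Hill equation: fractional inhibition at a given inhibitor concentration."""
    return conc ** hill / (ic50 ** hill + conc ** hill)

def fold_range_10_to_90(hill: float) -> float:
    """Fold change in concentration needed to go from 10% to 90% inhibition.

    Setting the Hill equation to 0.1 and 0.9 gives (c/IC50)^n = 1/9 and 9,
    so the ratio of the two concentrations is 81^(1/n).
    """
    return 81.0 ** (1.0 / hill)

for n in (1.0, 1.5, 5.0):
    print(f"Hill slope {n}: 10% to 90% inhibition spans {fold_range_10_to_90(n):.1f}-fold")
```

A classical one-site inhibitor needs an 81-fold concentration range to go from 10% to 90% inhibition, whereas a slope of 5 compresses that to under 3-fold, which is the kind of abrupt transition that prompts an aggregation counter screen.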

Discussion

Two key results emerge from this study. First, five CXCR4 inhibitors, representing three scaffolds dissimilar to those previously known, were identified; they are all substantially smaller than most known CXCR4 ligands, giving them relatively favorable ligand efficiencies. Indeed the best of them, compound 3, a 306 nM antagonist of the receptor, has a ligand efficiency (LE) of 0.36; its good physical properties put it well within the lead-like range of compounds that might be optimized as tools and bioactive molecules. All five ligands inhibit CXCR4-mediated chemotaxis in cell culture. The four inhibitors derived from the X-ray screen are specific for CXCR4 versus CCR2, a close homolog, and so may hold potential as reagents to modulate HIV infection, metastasis and inflammation. Second, we compare a blind prospective virtual screen against a GPCR homology model to both a subsequent screen against the crystal structure, and to a twin study against dopamine receptor D3 (DRD3) (22). The CXCR4 homology model had a hit rate of 4%, with a single antagonist of modest dissimilarity and specificity; the crystal structure screen had a hit rate of 17%, with at least three of the four ligands being dissimilar to previously known scaffolds and all four being specific. Conversely, docking against a DRD3 homology model discovered as many ligands as docking against the crystal structure. Contrasting these two targets and four campaigns illuminates the areas of the GPCR landscape that may be amenable to structure-based ligand discovery.

PPI targets are notoriously difficult to modulate with “drug-like” organic molecules. Few PPI inhibitors possess an LE greater than 0.23 (7); to achieve a reasonable affinity, they are large and often hydrophobic, requiring extensive optimization to deliver them into biological milieus. In some ways, CXCR4 is typical of PPI sites: at 20 Å across and 20 Å deep, its orthosteric site is much larger and more solvent exposed than those in biogenic amine GPCRs, for example, and it bears a high net charge with at least five anionic residues. Still, despite its large orthosteric site, CXCR4 has well-defined subsites where mixtures of charge and hydrophobic complementarity might be exploited by small molecules. Indeed the docking poses of the previously undiscovered inhibitors exploit one such subsite, also occupied by the cocrystal ligand, defined by E7.39 and D2.63. This may explain the unusually high LE (0.36) of compound 3, which is far above that expected for most PPI inhibitors and indeed above that for the one approved CXCR4 drug, plerixafor (LE 0.25). Intriguingly, whereas most other chemokine antagonists, all of them PPI inhibitors, have low LE values typical of the field, several have unusually high values. Thus, repertaxin (38), a CXCR1 antagonist, has an LE of 0.43, whereas the CCR5 antagonist, maraviroc, one of the few PPI drugs on the market (7), has an LE of 0.33. It may be that chemokine receptors, like other GPCRs, are privileged when it comes to discovering ligands with favorable physical properties. This observation, the conserved nature of the chemokine receptor interface (32), and the high LE of compound 3, together suggest that structure-based campaigns against additional chemokine receptors may result in reagents with pharmaceutically relevant properties.

From a technical perspective, it is interesting to ask why the CXCR4 homology model performed so much worse than either the dopamine D3 homology model or the crystal structure of CXCR4. One important contribution was ligand bias in the database. A great advantage of docking against the dopamine receptor was the bias toward biogenic amine mimetics in even an “unbiased” library such as ZINC, which simply catalogs commercially available molecules. There were not only many dopaminergic-like molecules in ZINC to find, but also many analogues of these hits in ZINC, allowing an SAR by catalog campaign that drove affinity from 1.6 μM for an original DRD3 docking hit to 81 nM for an optimized lead. This bias was also observed in docking screens against the adenosine A2a receptor (4, 39) and the β2 adrenergic receptor (3). Several lines of evidence suggest that the bias toward CXCR4-like ligands was much reduced in ZINC: there are relatively few molecules that share the same size and charge properties as known ligands, and, in contrast to DRD3 we found very few analogs in the database even for our lead-like ligands.

A larger contribution to the weakness of the prospective CXCR4 model screen was clearly the accuracy of the model itself. The DRD3 model, with 42% sequence identity to its template, closely resembles the crystal structure (binding site RMSD, 1.65 Å), and the large number of known ligands and ample mutational data helped to correctly predict the cocrystal ligand pose (also 1.65 Å RMSD). This level of accuracy was sufficient to attain a 23% hit rate in a virtual screen of the DRD3 model. Although the relatively poor 4% hit rate of the prospective CXCR4 homology model may simply reflect our ineptness, our models were competitive with the field, predicting the CXCR4-IT1t conformation better than all but a few models submitted to the public competition (8). To further assess the overall state of CXCR4 homology modeling, we performed docking screens against the two most accurate competition CXCR4 models, assessing how well they ranked the five ligands identified here. Against these external models, retrospective enrichment of previously known CXCR4 ligands was poor, as was ranking of our five hits. The relatively poor outcome of screening CXCR4 homology models vs. screening a DRD3 homology model likely reflects the reduced accuracy that can be achieved at 25% vs. 42% sequence identity to structural templates, even with the best homology models that the field now offers.

A lesson we draw is that for GPCRs sharing 42% or better sequence identity with a structurally determined template, and with sufficient mutant studies to predict ligand binding, which was the case for the D3 receptor, accurate models may be within the reach of general approaches. For those targets with much lower sequence identities, certainly in the 18–25% range that characterized the templates available to us for CXCR4, homology models accurate enough for predictive ligand discovery may be out of reach, even with domain expertise; exactly where the boundary lies between these two regimes is uncertain at this time. Putting aside issues of disease-relevance and experimental pragmatism, which will naturally dominate, one might imagine an additional prioritization axis for future GPCR crystal structures that considers the number of new targets that these structures enable to be reliably modeled.

Those technical points should not obscure the key, biological result from this study: the ability to discover new chemical matter, with favorable physical properties, for this critical PPI. Given the difficulties for which these PPI targets are notorious, and a lack of favorable bias in the docking library, we were uncertain as to whether even the CXCR4 crystal structure would lead to new ligands. Instead, the 17% hit rate observed was substantial, certainly much higher than we have experienced with soluble enzymes (40, 41), and three previously undiscovered chemotypes emerged. The ligand with the highest affinity by radioligand displacement, compound 3, had an IC50 of 306 nM and a ligand efficiency of 0.36. This is within the range of favorable leads for drug discovery, well above the 0.23 ligand efficiency expected for most PPI inhibitors (7). All of our inhibitors were active in cell culture, inhibiting CXCR4-mediated chemotaxis, the primary cellular endpoint for a chemokine receptor ligand, consistent with their promise as leads. More generally, the study findings suggest that structures of chemokine receptors will provide pragmatic templates for probe and drug discovery; such molecules are much needed for biological understanding and for treating devastating diseases in the areas of cancer, virology, and inflammation.

Methods

Homology Modeling and Docking.

Homology modeling and docking proceeded as described (Results and SI Appendix, SI Methods). The large solvent exposed CXCR4 binding site presents a challenge for docking: to compensate, we used a unique procedure to balance electrostatics with rapid context-dependent ligand desolvation (26). We filled the CXCR4 pocket with a single layer of low-dielectric spheres, excluding any spheres displaced from the surface to perturb the bulk dielectric minimally, while allowing the ligand to interact strongly with charged groups throughout the binding site. The large binding cavity also presented a challenge for exhaustive ligand sampling. To compensate, we divided the binding pocket into three partially overlapping subsites for sampling and docked separately against each (this reduces number of orientations by ∼34) (42). A single scoring grid was used to represent the entire site.

Calcium Flux–Based Assays.

THP-1 monocytes were resuspended in assay buffer containing FLIPR Calcium4 dye. Compounds were added at 100 μM (single point) or the indicated concentrations (dose–response). After a 20 s baseline measurement, CXCL12 was added at 30 nM, and the resulting calcium response was measured for an additional 50 s. CCL2, a chemokine that targets CCR2 (a distinct receptor of THP-1 cells) was added as a control for compound specificity during the single-point compound screening. Approximately 60 s after CXCL12 addition, 6 nM CCL2 was added to each well and calcium mobilization was measured for an additional 40 s. Percent calcium flux for each agonist was calculated from the maximum fluorescence minus the minimum fluorescence as a percentage of baseline. A two-tailed Student t test between either the 30 nM CXCL12 control or the 6 nM CCL2 control and the compound of interest was used to identify statistically significant inhibitory compounds (SI Appendix, Fig. S8 A and B). For significant CXCR4 inhibitors, the assay was repeated in a dose–response format.
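The percent calcium flux calculation described above can be sketched as follows. The sketch assumes that "baseline" is the mean fluorescence over the pre-agonist window and that the maximum and minimum are taken over the post-agonist window; the text does not spell out either detail, and the fluorescence values are made up.

```python
from statistics import mean

def percent_calcium_flux(baseline_reads, response_reads) -> float:
    """(max - min) fluorescence after agonist addition, as a percentage of baseline."""
    baseline = mean(baseline_reads)
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return 100.0 * (max(response_reads) - min(response_reads)) / baseline

def percent_of_control(compound_flux: float, control_flux: float) -> float:
    """Flux in the presence of compound relative to the agonist-only control."""
    return 100.0 * compound_flux / control_flux

# Hypothetical traces in arbitrary fluorescence units.
control = percent_calcium_flux([100, 100, 100], [100, 150, 120])
treated = percent_calcium_flux([100, 100, 100], [100, 110, 105])
remaining = percent_of_control(treated, control)
```

Normalizing each well to its own baseline reduces the effect of well-to-well differences in dye loading before compounds are compared with the agonist-only control.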

Chemotaxis and Viability.

Chemotaxis experiments were performed in THP-1 cells as described (SI Appendix, SI Methods). CXCL12 ligand (30 nM) and respective compounds (100 μM) were added to the lower chamber. Percent maximal migration was calculated as the number of cells that migrated in the presence of compound divided by the number that migrated to CXCL12 alone.
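As a small worked example of the ratio above, the sketch below expresses it as a percentage. The function name and the cell counts are hypothetical and chosen only to show the arithmetic.

```python
def percent_maximal_migration(migrated_with_compound, migrated_cxcl12_alone):
    """Cells migrating with compound as a percentage of the CXCL12-only count."""
    if migrated_cxcl12_alone <= 0:
        raise ValueError("CXCL12-only control count must be positive")
    return 100.0 * migrated_with_compound / migrated_cxcl12_alone


# Made-up counts: 150 cells migrate with compound, 600 with CXCL12 alone.
pct = percent_maximal_migration(150, 600)
```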

Radioligand Binding.

Binding studies were performed on pre-B cell leukemia REH cells as described (SI Appendix, SI Methods). The competition binding assays were carried out using 50 pM [125I]-CXCL12 as a tracer.

Counter Screens for Aggregation.

In spin-down counter screens for aggregation, compounds were centrifuged at 16,000 × g for 20 min. Supernatant was removed and used for calcium flux experiments as above. Cruzain inhibition assays were performed as reported elsewhere (36).

Compound Sources.

Compounds were obtained from the National Cancer Institute and commercial suppliers. All active compounds were tested for purity by liquid chromatography/mass spectrometry (LC/MS) at the University of California San Francisco and were judged to be pure by peak height and identity (SI Appendix, SI Methods).

Supplementary Material

Supporting Information

Acknowledgments

We thank Jens Carlsson and Ryan Coleman for discussions and scripts used in the parallel DRD3 experiment, Qingyi Yang for help with 3K-ENM, and Oliv Eidam and Magdalena Korczynska for reading of this manuscript. This work was supported by National Institutes of Health (NIH) Grants GM59957 and GM71630 (to B.K.S.), GM072970 (to principal investigator, R. Altman), and R01GM58072 (to B.F.V.), and by Canadian Institutes of Health Research Grant HOP-93431 (to N.H.). M.M.M. was supported in part by NIH Training Grant T32 GM007175. D.R.W. was supported in part by F32GM093580 from the National Institute of General Medical Sciences. J.J.Z. is supported by a grant from the Cancer Center of the Medical College of Wisconsin. S.G. is a scholar of the Fonds de la Recherche en Santé du Québec.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1120431109/-/DCSupplemental.

References

  • 1. Overington JP, Al-Lazikani B, Hopkins AL. How many drug targets are there? Nat Rev Drug Discov. 2006;5:993–996. doi: 10.1038/nrd2199.
  • 2. Katritch V, et al. Structure-based discovery of novel chemotypes for adenosine A(2A) receptor antagonists. J Med Chem. 2010;53:1799–1809. doi: 10.1021/jm901647p.
  • 3. Kolb P, et al. Structure-based discovery of beta2-adrenergic receptor ligands. Proc Natl Acad Sci USA. 2009;106:6843–6848. doi: 10.1073/pnas.0812657106.
  • 4. Carlsson J, et al. Structure-based discovery of A2A adenosine receptor ligands. J Med Chem. 2010;53:3748–3755. doi: 10.1021/jm100240h.
  • 5. Sabio M, Jones K, Topiol S. Use of the X-ray structure of the beta2-adrenergic receptor for drug discovery. Part 2: Identification of active compounds. Bioorg Med Chem Lett. 2008;18:5391–5395. doi: 10.1016/j.bmcl.2008.09.046.
  • 6. Veldkamp CT, Ziarek JJ, Peterson FC, Chen Y, Volkman BF. Targeting SDF-1/CXCL12 with a ligand that prevents activation of CXCR4 through structure-based drug design. J Am Chem Soc. 2010;132:7242–7243. doi: 10.1021/ja1002263.
  • 7. Wells JA, McClendon CL. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature. 2007;450:1001–1009. doi: 10.1038/nature06526.
  • 8. Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan R; GPCR Dock 2010 participants. Status of GPCR modeling and docking as reflected by community-wide GPCR Dock 2010 assessment. Structure. 2011;19:1108–1126. doi: 10.1016/j.str.2011.05.012.
  • 9. Zou YR, Kottmann AH, Kuroda M, Taniuchi I, Littman DR. Function of the chemokine receptor CXCR4 in haematopoiesis and in cerebellar development. Nature. 1998;393:595–599. doi: 10.1038/31269.
  • 10. Ma Q, et al. Impaired B-lymphopoiesis, myelopoiesis, and derailed cerebellar neuron migration in CXCR4- and SDF-1-deficient mice. Proc Natl Acad Sci USA. 1998;95:9448–9453. doi: 10.1073/pnas.95.16.9448.
  • 11. Hu X, et al. Stromal cell derived factor-1 alpha confers protection against myocardial ischemia/reperfusion injury: Role of the cardiac stromal cell derived factor-1 alpha CXCR4 axis. Circulation. 2007;116:654–663. doi: 10.1161/CIRCULATIONAHA.106.672451.
  • 12. Hernandez PA, et al. Mutations in the chemokine receptor gene CXCR4 are associated with WHIM syndrome, a combined immunodeficiency disease. Nat Genet. 2003;34:70–74. doi: 10.1038/ng1149.
  • 13. Endres MJ, et al. CD4-independent infection by HIV-2 is mediated by fusin/CXCR4. Cell. 1996;87:745–756. doi: 10.1016/s0092-8674(00)81393-8.
  • 14. Burger JA, Kipps TJ. CXCR4: A key receptor in the crosstalk between tumor cells and their microenvironment. Blood. 2006;107:1761–1767. doi: 10.1182/blood-2005-08-3182.
  • 15. Thoma G, et al. Orally bioavailable isothioureas block function of the chemokine receptor CXCR4 in vitro and in vivo. J Med Chem. 2008;51:7915–7920. doi: 10.1021/jm801065q.
  • 16. Zhu A, et al. Dipyrimidine amines: A novel class of chemokine receptor type 4 antagonists with high specificity. J Med Chem. 2010;53:8556–8568. doi: 10.1021/jm100786g.
  • 17. De Clercq E. Recent advances on the use of the CXCR4 antagonist plerixafor (AMD3100, Mozobil™) and potential of other CXCR4 antagonists as stem cell mobilizers. Pharmacol Ther. 2010;128:509–518. doi: 10.1016/j.pharmthera.2010.08.009.
  • 18. Kellenberger E, et al. Identification of nonpeptide CCR5 receptor agonists by structure-based virtual screening. J Med Chem. 2007;50:1294–1303. doi: 10.1021/jm061389p.
  • 19. Evers A, Klebe G. Successful virtual screening for a submicromolar antagonist of the neurokinin-1 receptor based on a ligand-supported homology model. J Med Chem. 2004;47:5381–5392. doi: 10.1021/jm0311487.
  • 20. Cavasotto CN, et al. Discovery of novel chemotypes to a G-protein-coupled receptor through ligand-steered homology modeling and structure-based virtual screening. J Med Chem. 2008;51:581–588. doi: 10.1021/jm070759m.
  • 21. Tikhonova IG, et al. Discovery of novel agonists and antagonists of the free fatty acid receptor 1 (FFAR1) using virtual screening. J Med Chem. 2008;51:625–633. doi: 10.1021/jm7012425.
  • 22. Carlsson J, et al. Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nat Chem Biol. 2011;7:769–778. doi: 10.1038/nchembio.662.
  • 23. Wu B, et al. Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science. 2011;330:1066–1071. doi: 10.1126/science.1194396.
  • 24. Yang Q, Sharp KA. Building alternate protein structures using the elastic network model. Proteins. 2009;74:682–700. doi: 10.1002/prot.22184.
  • 25. Gaulton A, et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–D1107. doi: 10.1093/nar/gkr777.
  • 26. Mysinger MM, Shoichet BK. Rapid context-dependent ligand desolvation in molecular docking. J Chem Inf Model. 2010;50:1561–1573. doi: 10.1021/ci100214a.
  • 27. Wong RS, et al. Comparison of the potential multiple binding modes of bicyclam, monocylam, and noncyclam small-molecule CXC chemokine receptor 4 inhibitors. Mol Pharmacol. 2008;74:1485–1495. doi: 10.1124/mol.108.049775.
  • 28. Irwin JJ, Shoichet BK. ZINC—a free database of commercially available compounds for virtual screening. J Chem Inf Model. 2005;45:177–182. doi: 10.1021/ci049714.
  • 29. Hert J, et al. Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org Biomol Chem. 2004;2:3256–3266. doi: 10.1039/B409865J.
  • 30. Ursu O, Rayan A, Goldblum A, Oprea TI. Understanding drug-likeness. Wires Comput Mol Sci. 2011;1:760–781.
  • 31. Vaidehi N, et al. Predictions of CCR1 chemokine receptor structure and BX 471 antagonist binding followed by experimental validation. J Biol Chem. 2006;281:27613–27620. doi: 10.1074/jbc.M601389200.
  • 32. Allegretti M, et al. Allosteric inhibitors of chemoattractant receptors: Opportunities and pitfalls. Trends Pharmacol Sci. 2008;29:280–286. doi: 10.1016/j.tips.2008.03.005.
  • 33. Roumen L, et al. In silico veritas: The pitfalls and challenges of predicting GPCR-ligand interactions. Pharmaceuticals. 2011;4
  • 34. Lam AR, et al. Importance of receptor flexibility in binding of cyclam compounds to the chemokine receptor CXCR4. J Chem Inf Model. 2011;51:139–147. doi: 10.1021/ci1003027.
  • 35. McGovern SL, Helfand BT, Feng B, Shoichet BK. A specific mechanism of nonspecific inhibition. J Med Chem. 2003;46:4265–4272. doi: 10.1021/jm030266r.
  • 36. Doak AK, Wille H, Prusiner SB, Shoichet BK. Colloid formation by drugs in simulated intestinal fluid. J Med Chem. 2010;53:4259–4265. doi: 10.1021/jm100254w.
  • 37. Coan KE, Shoichet BK. Stoichiometry and physical chemistry of promiscuous aggregate-based inhibitors. J Am Chem Soc. 2008;130:9606–9612. doi: 10.1021/ja802977h.
  • 38. Allegretti M, et al. 2-Arylpropionic CXC chemokine receptor 1 (CXCR1) ligands as novel noncompetitive CXCL8 inhibitors. J Med Chem. 2005;48:4312–4331. doi: 10.1021/jm049082i.
  • 39. Katritch V, Rueda M, Lam PC, Yeager M, Abagyan R. GPCR 3D homology models for ligand screening: Lessons learned from blind predictions of adenosine A2a receptor complex. Proteins. 2010;78:197–211. doi: 10.1002/prot.22507.
  • 40. Powers RA, Morandi F, Shoichet BK. Structure-based discovery of a novel, noncovalent inhibitor of AmpC beta-lactamase. Structure. 2002;10:1013–1023. doi: 10.1016/s0969-2126(02)00799-2.
  • 41. Chen Y, Shoichet BK. Molecular docking and ligand specificity in fragment-based inhibitor discovery. Nat Chem Biol. 2009;5:358–364. doi: 10.1038/nchembio.155.
  • 42. Shoichet BK, Bodian DL, Kuntz ID. Molecular Docking Using Shape Descriptors. J Comput Chem. 1992;13:380–397.

