Proceedings of the National Academy of Sciences of the United States of America. 2011 Aug 1;108(33):13528-13533. doi: 10.1073/pnas.1101835108

Structural conservation of druggable hot spots in protein–protein interfaces

Dima Kozakov a,1, David R Hall a,1, Gwo-Yu Chuang a,1, Regina Cencic b,c,d, Ryan Brenke a, Laurie E Grove e, Dmitri Beglov a, Jerry Pelletier b,c,d, Adrian Whitty f,2, Sandor Vajda a,f,2
PMCID: PMC3158149  PMID: 21808046

Abstract

Despite the growing number of examples of small-molecule inhibitors that disrupt protein–protein interactions (PPIs), the origin of druggability of such targets is poorly understood. To identify druggable sites in protein–protein interfaces we combine computational solvent mapping, which explores the protein surface using a variety of small “probe” molecules, with a conformer generator to account for side-chain flexibility. Applications to unliganded structures of 15 PPI target proteins show that the druggable sites comprise a cluster of binding hot spots, distinguishable from other regions of the protein due to their concave topology combined with a pattern of hydrophobic and polar functionality. This combination of properties confers on the hot spots a tendency to bind organic species possessing some polar groups decorating largely hydrophobic scaffolds. Thus, druggable sites at PPI are not simply sites that are complementary to particular organic functionality, but rather possess a general tendency to bind organic compounds with a variety of structures, including key side chains of the partner protein. Results also highlight the importance of conformational adaptivity at the binding site to allow the hot spots to expand to accommodate a ligand of drug-like dimensions. The critical components of this adaptivity are largely local, involving primarily low energy side-chain motions within 6 Å of a hot spot. The structural and physicochemical signature of druggable sites at PPI interfaces is sufficiently robust to be detectable from the structure of the unliganded protein, even when substantial conformational adaptation is required for optimal ligand binding.

Keywords: fragment-based drug discovery, ligand binding site, inhibitor design, side-chain adjustment


Understanding the druggability of protein–protein interaction (PPI) interfaces is a major current problem in chemical biology, with substantial practical implications for the discovery of previously undescribed drugs and biological probe compounds. Reversible protein–protein interactions are involved at multiple points in virtually all biological pathways, including disease pathways where therapeutic intervention could bring widespread benefit (1, 2). Although many PPI interfaces are biologically compelling targets for drug discovery, and a number of systems are known for which small molecules inhibit the interactions of two proteins with moderate to high potency (1–3), identifying druggable sites remains largely an unsolved problem. The surface cavities available at protein–protein interfaces to bind small-molecule inhibitors substantially differ from those seen in traditional drug target proteins. The latter have one or two disproportionately large pockets with an average volume of 260 Å³, which form the binding site for their endogenous ligands in over 90% of proteins (4–6). The accuracy of binding site identification can be further improved by accounting for additional properties such as shape, rigidity, and amino acid composition (7). In contrast, the average volume of pockets seen at protein–protein interfaces is only 54 Å³, the same as the mean for all protein surface pockets (4). The interface, on the average, includes six such small pockets (4), and it is difficult to determine which one, if any, will be able to bind an inhibitor. In addition, the binding of a drug-sized ligand generally depends on the ability of a pocket to expand, and thus it is necessary to account for potential conformational changes (8, 9). Molecular dynamics simulations demonstrate the substantial plasticity of the interface regions, but rarely provide the information required for the identification of specific sites and for determining their druggability (9).

In this paper we show that the sites capable of binding drug-sized ligands can be identified computationally using a fragment-based approach, even when only unbound protein structures are available. The method is based on the observation that the binding sites of proteins generally include smaller regions called hot spots that are major contributors to the binding free energy (10). In drug design applications, such hot spots can be identified by screening for the binding of fragment-sized organic molecules (11, 12). Because the binding of small compounds is weak, experimental screening of fragment libraries requires special techniques. Examples include nuclear magnetic resonance methods such as SAR (structure-activity relationship) by NMR (11), and the MSCS (multiple solvent crystal structures) approach (12) based on X-ray crystallography. The results of experimental fragment screens confirm that the hot spots of proteins are characterized by their ability to bind a variety of small molecules and that the number of different “probe” molecules observed to bind to a particular site predicts the potential importance of the site and is predictive of overall druggability (11–14).

Computational solvent mapping was developed as a virtual analogue of the SAR by NMR and MSCS methods (13). Mapping places molecular probes (small organic molecules that vary in size and shape) on a dense grid around the protein, finds favorable positions using empirical free energy functions, clusters the conformations, and ranks the clusters on the basis of the average free energy. The regions that bind several probe clusters are called consensus sites, and the one binding the largest number of probe clusters is considered the main hot spot (13–15). Based on a variety of proteins using 16 different types of probes, we have shown that the main hot spot in all druggable targets binds at least 16 probe clusters and, together with nearby hot spots, predicts the site that can potentially bind drug-size ligands (13–15). Although solvent mapping shows formal similarity to earlier computational methods (16, 17), it correctly shows that small organic molecules bind and cluster only at a few sites on a protein (12).

Here we describe a four-step algorithm, which extends computational solvent mapping to PPI targets. The basic idea of the method is to identify small pockets by an initial mapping and then identify energetically accessible conformational adaptations of nearby side chains to accommodate drug-size molecules. Accordingly, the method finds the main hot spot, uses a set of rules to select the potentially important side chains nearby, generates their energetically accessible conformers, maps all alternative structures (18), and selects the one with the highest number of probe clusters in the binding site.

Results

Table 1 lists the rank of the consensus sites, and in parentheses the number of probe clusters, from the mapping of 15 PPI targets. The analysis of the first six targets, discussed by Wells and McClendon (1), will be described in more detail (see also Table S1). The next six proteins (XIAP Bir3 through survivin) are PPI targets from the landmark paper on druggability indices by Fesik and co-workers (11), whereas druggability information on HIV integrase (19), CD40L (20), and B-cell activating factor (BAFF) (21) is based on original reports. The mapping of the last nine proteins is described in SI Appendix, which also provides results for 28 additional targets. Both ligand-free and (if available) ligand-bound structures were mapped, the latter after removing the bound ligand. We denote the largest, second largest, and further consensus sites of each protein as CS1, CS2, CS3, and so on, where the size of each site is defined in terms of the number of probe clusters it binds (14, 15). In the figures the inhibitor is always superimposed on the mapping results for reference, but we emphasize that all ligands are removed prior to mapping, and we rely only on the protein structure. After validating the method, we studied the feasibility of disrupting the eIF4E/eIF4G complex, a promising PPI target for anticancer therapy (22–24).

Table 1.

Location, ranking, and size of hot spots of target proteins in the protein–protein interface

| Target | Binding site | Unbound structure | Bound structure | Initial consensus sites at the binding site* | Druggable† | Movable side chains‡ | Final consensus sites*§ |
|---|---|---|---|---|---|---|---|
| IL-2 | IL-2Rα | 1m47 | 1pw6 | 1(20), 4(10) | yes | none | |
| Bcl-xL | BAK peptide | 1r2d | 2yxj | 1(18), 2(17) | yes | 100, 101 | 1(22), 2(18), 3(16), 4(14) |
| MDM2 | p53 | 1z1m | 1rv1 | 1(21), 2(21) | yes | NMR model 4 | |
| HPV-11 E2 | HPV-11 E1 | 1r6k | 1r6n | 1(18), 4(15), 5(11), 6(10) | yes | 19, 32, 98, 100 | 1(20), 2(18), 5(10) |
| ZipA | FtsZ | 1f46 | 1s1s | 8(6), 9(5), 11(3) | no | none | |
| TNF-α | TNFR1 | 1tnf (trimer) | 2az5 | 7(5) | no | none | |
| XIAP Bir3 | Smac/diablo peptide | 1f9x | 1g3f | 2(18), 6(6), 7(2) | yes | 319 | 2(19), 5(13), 7(4) |
| PDZ-PSD95 | peptide | 1iu2 | 1rgr | 1(22), 5(11) | yes | NMR model 29 | |
| Pin1 | peptide | 1i6c | 1i8g | 1(28), 4(13) | yes | NMR model 6 | |
| Urokinase | peptide | 2o8t | 1fv9 | 1(29), 2(26), 3(13) | yes | none | |
| Stromelysin catalytic domain | substrate | 1cqr | 1g4k | 1(22), 7(3) | yes | 224 | 1(26), 5(7) |
| Survivin | Bir3 | 1e31 | none | 7(5) | no | none | |
| HIV integrase | LEDGF/p75 | 1bi4 | 3lpu | 1(18), 5(9) | yes | none | |
| CD40L | CD40 | 1aly (trimer) | none | Cleft 1: 3(14); Cleft 2: 4(12); Cleft 3: 5(12), 8(2) | no | none | |
| BAFF | BAFFR | 1kd7 (trimer, chains klm) | none | Cleft 1: 6(8); Cleft 2: 8(4) | no | none | |

*The number of probe clusters in each consensus site is given in parentheses.

†Druggability results based on our criteria.

‡Side chains that were adjusted to increase the total number of probe clusters at the binding site. For NMR structures, results are shown for the model resulting in the highest number of probe clusters.

§Not shown if initial consensus sites are unchanged.

Hot Spots and Druggability Assessment for Six Well-Studied PPI Targets.

Interleukin-2.

Interleukin 2 (IL-2) is an immunoregulatory cytokine that stimulates normal and pathogenic T cells and contributes to rejection of tissue grafts (25). A number of small molecules (e.g., compound 1 in Fig. S1) inhibit the interactions of IL-2 with IL-2Rα (26–28). The binding site for these compounds includes a largely polar and rigid pocket, and a highly adaptive hydrophobic region (27). We mapped both the unliganded and ligand-bound IL-2 structures and in each case found the hot spots that were also identified experimentally at the two ends of the inhibitor (Fig. 1A and Table S1). Throughout the paper, the results of mapping two different structures of the same protein will be compared in terms of mapping fingerprints, i.e., the percentages of nonbonded interactions between the probes and each amino acid residue of the protein. Fig. 1A also shows the mapping fingerprints for the two IL-2 structures, with the stars indicating the residues that interact with the inhibitor (27). Although there are differences in the distributions of probe-residue contacts, apart from residue T41 mapping finds the same residues in both unliganded and ligand-bound structures, in contrast to the prevailing view that the ligand binding site in IL-2 is not predictable based on the unbound structure (1, 9, 27). We note that no low energy alternative side-chain conformers were found for the unbound IL-2 (Table 1).

Fig. 1.


Mapping results for IL-2 and Bcl-xL. (A) Mapping of IL-2. (Top) Unliganded IL-2 (PDB ID code 1m47). CS1 (cyan, 20) is in the rigid hydrophilic pocket close to the site that binds the guanido group of compound 1, CS4 (salmon, 10) is at the adaptive hydrophobic pocket overlapping with the dichlorophenyl moiety. The number in parentheses following the color code indicates the number of probe clusters. Only the protein is used in the mapping; the inhibitor is shown for reference. CS1 is in the IL-2/IL-2Rα interface, and CS4 is close to it. We note that CS2 (17 probe clusters) and the small hot spot CS8 (4 clusters) are in the IL-2/IL-2Rβ interface, which makes this second interface also druggable. (Middle) IL-2 structure from the cocrystal with compound 1 (PDB ID code 1pw6). CS1 (cyan, 16) is now in the adaptive hydrophobic pocket, and CS3 (salmon, 13) identifies the rigid polar pocket. (Lower) Mapping fingerprints for IL-2, i.e., the percentages of nonbonded interactions between the probes and each amino acid residue. Green, unbound; blue, bound. Stars indicate residues interacting with compound 1 (Fig. S1). (B) Mapping of Bcl-xL. (Top) Unliganded Bcl-xL (PDB ID code 1r2d) with the modified conformations of the R100 and Y101 side chains. Both CS1 (cyan, 22) and CS3 (yellow, 16) are in the pocket which binds the distal 4-chlorophenyl ring of ABT-737 (compound 2 in Fig. S1). CS2 (magenta, 18) overlaps with the thiophenyl group of ABT-737. CS4 (salmon, 14) is in a pocket binding the piperazine and acylsulfonamide groups of ABT-737. (Middle) Bcl-xL from the cocrystal with ABT-737. CS1 (cyan, 28) overlaps with the thiophenyl, CS2 (magenta, 25) and CS4 (salmon, 9) are in the pocket binding the chlorophenyl, and CS5 (gray, 8) is in the middle. (Lower) Mapping fingerprints for Bcl-xL; green, unbound; blue, bound. Stars indicate residues interacting with ABT-737.

B-cell lymphoma-extra large (Bcl-xL).

Bcl-xL is overexpressed in many cancers and consequently has been actively pursued as a target for small-molecule drug discovery (1, 2, 29, 30). Abbott Laboratories has developed a set of Bcl-xL inhibitors, among them ABT-737 (compound 2 in Fig. S1) (31). Mapping the unliganded Bcl-xL structure yielded only one large hot spot (Fig. S2A). Assessing the energetically accessible conformations of the side chains within the 6-Å neighborhood of this initial hot spot identifies several mobile residues (Table S3). The conformer that gave the strongest mapping result involved switching to the second lowest energy conformers for the side chains of R100 and Y101 (Table S4). These side-chain motions opened up a pocket that bound the same number of probe clusters as the inhibitor-bound structure (70 in both cases). The side-chain rearrangements that optimized the computational mapping results agree well with the conformational changes that were observed experimentally to occur upon ligand binding (Fig. S2B). Mapping of this conformer yielded four key hot spots (Fig. 1B and Table S1), in good agreement with the results of mapping the ligand-bound Bcl-xL structure and overlapping well with the binding site for ABT-737 as established from the cocrystal structure of this compound with Bcl-xL. Fig. 1B also shows the mapping fingerprints for the modified unbound and the bound structures, confirming the very good agreement achieved by adjusting the two side chains (see the SI Appendix for additional discussion).

Mouse double minute protein 2 (MDM2).

The human version of MDM2 influences transcription by binding to the tumor suppressor p53 (32, 33). Roche reported a series of cis-imidazoline analog inhibitors termed Nutlins (e.g., compound 3 in Fig. S1) (34). We mapped the 24 NMR structures of unliganded MDM2 (Table S7). Results for the structure binding the largest number of probe clusters show that the two main hot spots overlap the binding location of compound 3 (Fig. 2A). The same sites are identified by mapping the structure of MDM2 cocrystallized with compound 3 (Fig. 2A and Table S1). The results show that mapping an ensemble of conformations and selecting a structure with the highest number of probe clusters correctly identify the druggable site. Fig. 2A also shows the mapping fingerprints for the unbound and bound MDM2 structures, as well as the interactions with compound 3 (indicated by stars).

Fig. 2.


Mapping results for MDM2 and HPV-11 E2. (A) Mapping of MDM2. (Top) Unliganded MDM2 (PDB ID code 1z1m). CS1 (cyan, 21) is in the pocket that binds the two bromophenyl groups of compound 3 (Fig. S1), shown in green. CS2 (magenta, 21) superimposes the ethoxyphenyl group. (Middle) MDM2 from the cocrystal with compound 3 (PDB ID code 1rv1). CS1 (cyan, 21) and CS2 (magenta, 20) identify the subsites that bind the bromophenyl and ethoxyphenyl groups of compound 3. CS3 (yellow, 19) defines a distinct subsite, which binds the second bromophenyl moiety of compound 3. CS5 (gray, 8) does not directly interact with the inhibitor. (Lower) Mapping fingerprints for MDM2; green, unbound; blue, bound. Stars indicate residues interacting with compound 3. (B) Mapping of HPV-11 E2. (Top) Unliganded HPV-11 E2 (PDB ID code 1r6k) after adjusting the “movable” side chains. CS1 (cyan, 20) is in a pocket that is occupied by an isobutyric acid molecule in the inhibitor-bound structure. CS2 (magenta, 18) overlaps with the indandione moiety of inhibitor 1 (shown in green), and CS5 (gray, 10) with the dichlorophenyl group of inhibitor 2 (shown in magenta). (Middle) HPV-11 E2 from the cocrystal with compound 4 (PDB ID code 1r6n). CS1 (cyan, 20) is in the pocket binding the indandione moiety of inhibitor 1 (green). CS3 (yellow, 16) overlaps with the dichlorophenyl of inhibitor 2 (magenta). CS4 (salmon, 16) is in the pocket that is occupied by an isobutyric acid. (Lower) Mapping fingerprints for HPV-11 E2; green, unbound; blue, bound. Stars and diamonds indicate residues interacting with inhibitors 1 and 2, respectively.

Human papilloma virus (HPV)-11 E2.

The interaction between the HPV transcription factor E2 and viral helicase E1 is an important PPI target (1, 35, 36). A series of small molecules have been reported that bind to E2 and inhibit its interaction with E1 (36). Although two inhibitor molecules were observed in the binding pocket, it was suggested that the interaction with the second molecule is probably a crystallization artifact (36). Results of mapping the unliganded HPV-11 E2 are shown in Fig. S3. Application of the alternative side-chain algorithm in the neighborhood of the main hot spot indicates a number of potentially mobile side chains (Table S5). The conformer that gave the strongest mapping results involved alternative conformers for four side chains, expanding the pocket in this region and improving the correlation with the mapping results for the ligand-bound structure (Table S6). Fig. 2B and Table S1 show results for both the adjusted unbound structure and the structure bound to compound 4. In both cases a large consensus site (CS1 or CS2) identifies the pocket that binds the indandione moiety of the higher affinity inhibitor. Mapping also finds two additional hot spots, one overlapping with the second inhibitor molecule, and the other at a site that in the X-ray structure binds isobutyric acid, a component of the crystallization medium. As shown in Fig. 2B, each inhibitor molecule binds only to a single hot spot. The parts of the ligands that do not interact with hot spot residues are unlikely to substantially contribute to the binding free energy, suggesting that better lead compounds might be found that bridge the two main hot spots.

ZipA.

The ZipA/FtsZ interaction has been considered as a potential target for antibacterial agents (1, 37). Mapping of both unliganded and ligand-bound ZipA structures yielded a large hot spot (CS1), but it is not in the ZipA/FtsZ interface (Fig. 3 A and B). In the interface region we find only three weak hot spots, each binding fewer than 10 probe clusters. All side chains in the ligand-free X-ray structure within 6 Å of the initial hot spots turned out to be stationary. Hot spots that bind fewer than 16 probe clusters do not contribute significantly to the binding free energy, and hence we predict that the ZipA/FtsZ interface is not a druggable target. In fact, efforts to identify small molecular PPI inhibitors for this system, including high throughput screening of 250,000 compounds, resulted only in weak inhibitors, with the best inhibitor (compound 5 in Fig. S1) having an IC50 value of approximately 1 mM (38–40).

Fig. 3.


Mapping results for ZipA and TNF-α. (A) Mapping of ZipA. (Top) Unliganded ZipA (PDB ID code 1f46). CS1 (cyan, 17) is not in the ZipA/FtsZ peptide interface and does not overlap with the inhibitor. The consensus sites in the interface are CS8 (green, 6), CS9 (gray, 5), and CS11 (magenta, 3). (Lower) ZipA cocrystallized with compound 5 (Fig. S1) (PDB ID code 1s1s). The consensus sites in the interface are CS4 (gray, 10), CS8 (green, 4), and CS10 (magenta, 2). (B) Mapping of TNFα. (Top) Intact TNFα trimer. The only consensus site in the TNFα-TNFR1 interface is CS7 (orange, 5). The hot spots CS1 (cyan, 22), CS2 (magenta, 20), CS3 (yellow, 19), CS4 (salmon, 7), and CS6 (blue, 6) are in the interior of the protein. (Lower) Results for the A and B chains of TNFα obtained by removing chain C from the trimeric structure (PDB ID code 1tnf). CS2 (cyan, 19) overlaps with the trifluoromethylphenyl indole moiety of compound 6 (shown in green), CS3 (yellow, 19) is close to K98 and Y119, CS4 (salmon, 10) is near Y59 and Y151, and CS6 (blue, 9) overlaps with the dimethyl chromone group.

Tumor necrosis factor α (TNFα).

Monoclonal antibodies have validated the trimeric cytokine TNFα as a high-value drug target in multiple inflammatory and immune disorders. Despite extensive efforts, no synthetic small-molecule inhibitors have been reported that bind TNFα and disrupt the interactions between the homotrimer and its receptor TNFR1 (2). Mapping the unliganded TNFα trimer gave the unexpected result that all major consensus sites occurred in the interior of the protein among the three subunits (Fig. 3B). The only consensus site in the TNFα-TNFR1 interface is CS7 with five probe clusters between the Y87 and Q125 side chains (Table S1). This finding agrees with the fact that the regions near Y87 have been identified as the principal surface binding site in an experimental fragment screen of TNFα (41). However, the small number of probe clusters leads to categorization of the TNFα-TNFR1 interaction as not druggable, which is consistent with the failure of extensive efforts by multiple groups to identify small-molecule antagonists of TNFR binding. At the same time, the mapping results show that the interior of the TNFα trimer has high binding affinity for the small molecules used as probes, which suggests that a suitable larger molecule might disrupt the constitutive trimer interface. Indeed, He and co-workers (41) developed SP307 (compound 6 in Fig. S1), which inhibited TNFα activity by displacing one of the TNFα subunits. To determine whether mapping could correctly characterize this unusual binding site, we mapped the A and B chains of TNFα, obtained by removing chain C from the trimeric structure. Once the structure becomes open with two subunits only, the largest consensus site CS1 shifts to the outside pocket near Y87, but the other consensus sites remain in the interface and coincide closely with the observed binding site of inhibitor 6 (Fig. 3B). The agreement is further improved when mapping the inhibitor-bound TNFα dimer (Table S1 and Fig. S4).

Disrupting the eIF4E/eIF4G Complex.

eIF4E is a eukaryotic translation initiation factor involved in directing ribosomes to the cap structure of mRNAs and is frequently overexpressed in human cancers (22). Assembly of the eIF4E/eIF4G complex has a central role in translation initiation, and inhibition of this interaction has tumor-suppressor activity (22). The small molecule 4EGI-1 (compound 7 in Fig. 4A) and an analogue are the only known small inhibitors of the eIF4E/eIF4G interaction (KD ∼ 25 and 16 μM, respectively). Although no eIF4E structure is available with bound inhibitor, line broadening observed by NMR indicates that residues H37, V69, L131, and I138 on eIF4E interact with 4EGI-1 (23). Because the inhibitor 4EGI-1 binds to eIF4E but does not compete with mRNA binding, we blocked the mRNA cap binding site prior to mapping. Mapping both the unbound eIF4E and the structure from the complex eIF4E/4E-BP1 revealed that the main hot spots CS1 through CS4 form an elongated site, which could support small-molecule interactions (Fig. 4). We used this information to place the inhibitor 4EGI-1 using an algorithm that scores docked ligand poses based on the degree of overlap between the compound and the consensus sites (see Methods). The results suggest that the bound inhibitor occupies only CS1 and part of CS3 (Fig. 4C). This agrees with the published NMR line broadening (23), showing that the inhibitor interacts with residues V69, L131, and I138 within CS1 and CS3. The proximity of the hot spots to the residues with line broadening provides strong support that mapping has found the correct location for inhibitor binding.

Fig. 4.


Mapping results for eIF4E. (A) eIF4E inhibitors: 4EGI-1 (compound 7) and 4E1RCat (compound 8). (B) Mapping fingerprint for eIF4E. Stars indicate the eIF4E residues (H37, V69, L131, and I138) that, based on NMR line broadening (23), interact with 4EGI-1. (C) Mapping of the eIF4E structure from the eIF4E/4E-BP1 complex (PDB ID code 1wkw). The hot spots shown are CS1 (cyan, 24), CS2 (magenta, 22), CS3 (yellow, 19), and CS4 (salmon, 10). The inhibitor 4EGI-1 (KD ∼ 25 μM) is docked to best superimpose the hot spots. Residues V69, L131, and I138 are colored blue. H37 is on the flexible amino terminal portion of the protein and is not shown. (D) The inhibitor 4E1RCat (KD ∼ 4 μM) is docked to superimpose the hot spots.

To identify eIF4E/eIF4G inhibitors, a collection of approximately 218,000 compounds was screened using a time-resolved fluorescence resonance energy transfer assay (24). Among a number of hits was the compound 4E1RCat (compound 8 in Fig. 4A; PubChem ID code 16195554). Additional tests showed that 4E1RCat inhibited the eIF4E:eIF4G interaction with IC50 = 4 μM (24). The probable binding mode of 4E1RCat was determined by the same computational method we used to identify the binding mode of 4EGI-1. Fig. 4D shows that 4E1RCat occupies CS1 and part of CS3, but unlike 4EGI-1 additionally reaches into CS4, which may explain its somewhat higher affinity compared to 4EGI-1.

Discussion

The results of this paper show that based on mapping a target protein using 16 different types of probe molecules, a druggable site comprises a main hot spot binding at least 16 probe clusters in the protein–protein interface, and one or two additional hot spots nearby, within reach of a drug-sized molecule. The hot spots are distinguishable from other regions of the protein surface due to their concave topology combined with a mosaic-like pattern of hydrophobic and polar functionality (13). This combination of properties confers on the hot spots a tendency to bind drug-like organic species possessing some polar functionality decorating a largely hydrophobic scaffold. Thus, druggable sites at PPI are not simply sites that are complementary to particular organic functionality, but rather possess a general tendency to bind organic compounds with a variety of structures. This property of hot spots accounts for their identification as consensus sites by computational solvent mapping, as well as for the experimental observation that fragment hit rate is significantly predictive of the overall prospects for identifying a high affinity drug-like ligand (11). Our results additionally highlight the importance of conformational adaptivity at the binding site to allow hot spots to expand to accommodate a ligand of drug-like dimensions. Moreover, we show that the critical components of this adaptivity are largely local, involving primarily low energy side-chain motions within 6 Å of a hot spot. Most importantly, we show that the structural and physicochemical signature of druggable sites at PPI interfaces is sufficiently robust to be detectable from the structure of the unliganded protein, even when substantial conformational adaptation is required for optimal ligand binding. This information could potentially allow those PPI targets most likely to be druggable to be identified from structural data alone, without expending resources on exploratory lead finding efforts against intractable targets.
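The druggability criterion stated above can be written as a small rule. The following is a minimal sketch, not the authors' implementation: the function name `is_druggable`, the tuple layout of a consensus site, and the 8 Å `reach` value used to decide whether a second hot spot is "nearby" are all assumptions made for illustration; only the threshold of 16 probe clusters comes from the text.

```python
from math import dist

def is_druggable(interface_sites, min_main=16, reach=8.0):
    """Apply the rule: a main hot spot with >= min_main probe clusters
    plus at least one additional hot spot within `reach` angstroms.

    interface_sites: list of (center_xyz, n_probe_clusters) for the
    consensus sites located in the protein-protein interface.
    `reach` is a hypothetical stand-in for "within reach of a drug-sized
    molecule"; the paper does not give a numeric value.
    """
    if not interface_sites:
        return False
    # Index of the largest consensus site in the interface
    main = max(range(len(interface_sites)), key=lambda i: interface_sites[i][1])
    center, size = interface_sites[main]
    if size < min_main:
        return False
    # Require at least one other hot spot close to the main one
    return any(i != main and dist(c, center) <= reach
               for i, (c, _) in enumerate(interface_sites))

# Cluster counts from Table 1; the coordinates are made up for the example.
zipa = [((0.0, 0.0, 0.0), 6), ((3.0, 0.0, 0.0), 5), ((0.0, 4.0, 0.0), 3)]
il2 = [((0.0, 0.0, 0.0), 20), ((6.0, 0.0, 0.0), 10)]
print(is_druggable(zipa), is_druggable(il2))
```

With these inputs the ZipA interface fails the test because its largest interface site has only 6 probe clusters, whereas the IL-2 interface passes.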

The tendency of hot spots to bind many different compounds suggests that in the protein–protein complex such sites strongly interact with some residues of the partner protein. Indeed, it has been observed that the hot spot residues of protein–protein interactions generally either protrude or are located in tightly complemented pockets that become filled upon binding (42). Finding small molecular inhibitors has a better chance if the hot spots cluster; e.g., the protruding residues are on a short peptide fragment of the partner protein. Among the proteins discussed, Bcl-xL, MDM2, ZipA, and eIF4E have been cocrystallized with such peptide fragments. The peptide-bound structures are very useful for ligand design, because the peptides and the small molecular inhibitors bind to the same sites, in good agreement with the observed generic binding tendencies. We mapped the peptide-bound structures (obviously after removing the peptide) and found that the relative ranking of the hot spots between peptide-bound and ligand-bound structures is completely conserved (Fig. S5 A–C). However, in this paper we focused on the conservation of druggable sites between unbound and ligand-bound proteins and hence did not utilize information from the peptide-bound structures.

Methods

Computational Solvent Mapping.

Protein structures were downloaded from the Protein Data Bank (43). All ligand and bound water molecules were removed. Mapping was performed using the FTMAP algorithm (15) through its online server (http://ftmap.bu.edu). FTMAP scans the entire surface of the protein with a library of 16 small organic probe molecules, with varying hydrophobicity and hydrogen bonding capability (see SI Methods). For each probe, six bound probe clusters with the lowest mean interaction energies are retained. The clusters from the different probe types are then clustered into CSs, which define hot spots where multiple probes congregate with high affinity. The CSs are ranked on the basis of the number of probe clusters they incorporate, with the largest CSs representing the most important sites.
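The retain-then-cluster-then-rank logic described above can be illustrated with a short sketch. This is not the FTMAP code: the `ProbeCluster` type, the greedy grouping scheme, and the 4 Å grouping radius are assumptions chosen to keep the example small; only the "six lowest-energy clusters per probe" rule and the ranking by number of probe clusters come from the text.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class ProbeCluster:
    probe: str                          # probe type, e.g. "ethanol"
    center: tuple[float, float, float]  # cluster center, in angstroms
    energy: float                       # mean interaction energy (lower is better)

def keep_lowest(clusters, per_probe=6):
    """Retain the `per_probe` lowest-energy clusters of each probe type."""
    by_probe = {}
    for c in sorted(clusters, key=lambda c: c.energy):
        kept = by_probe.setdefault(c.probe, [])
        if len(kept) < per_probe:
            kept.append(c)
    return [c for kept in by_probe.values() for c in kept]

def consensus_sites(clusters, radius=4.0):
    """Greedy grouping (an assumed scheme): each probe cluster joins the
    first site whose seed cluster lies within `radius`, else starts a new site."""
    sites = []
    for c in sorted(clusters, key=lambda c: c.energy):
        for site in sites:
            if dist(site[0].center, c.center) <= radius:
                site.append(c)
                break
        else:
            sites.append([c])
    # Rank by the number of probe clusters: sites[0] is CS1, sites[1] is CS2, ...
    sites.sort(key=len, reverse=True)
    return sites

clusters = [
    ProbeCluster("ethanol", (0.0, 0.0, 0.0), -5.0),
    ProbeCluster("acetone", (1.0, 0.0, 0.0), -4.0),
    ProbeCluster("urea", (0.0, 2.0, 0.0), -3.0),
    ProbeCluster("ethanol", (20.0, 0.0, 0.0), -2.0),
]
for rank, site in enumerate(consensus_sites(keep_lowest(clusters)), start=1):
    print(f"CS{rank}: {len(site)} probe clusters")
```

The number printed for each site corresponds to the value given in parentheses after each consensus site rank in Table 1.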

Accounting for Side-Chain Flexibility when Mapping PPI Targets.

The algorithm consists of four main steps as follows:

Step 1: Initial mapping. The unbound protein is mapped to find the largest consensus site.

Step 2: Residue selection. We consider the solvent accessible residues within 6 Å from the center of any cluster in the consensus site and select those that (i) have at least 75% of the maximum hydrophobicity value calculated for all surface residues and (ii) are located in a cavity. For (ii), we calculate a cavity measure, defined in ref. 15, for all residues, and retain only those that exhibit 60% or more of the maximum value.

Step 3: Generating low-energy conformers for selected side chains. The conformers of the selected side chains are explored one side chain at a time by multistart energy minimization, performed in the absence of the partner protein. Only the atoms of the selected side chain are allowed to move during the minimization. The conformers from a backbone-dependent rotamer library (44) are used as initial states in the minimizations. The final frames of minimization trajectories are clustered based on the positions of their end groups using a 1-Å clustering radius. For each cluster a probability is defined by summing the probabilities of the original rotamers that converge to the particular cluster. For each cluster we also calculate the Boltzmann average energy of its members. A side chain is considered movable if it has multiple low energy and/or highly populated clusters, and the centers of the low-energy populated clusters plus the unbound state define its potential conformers. Otherwise a side chain is considered stationary, and its conformation seen in the unbound structure is retained.

Step 4: Selecting the structure with the largest pocket. Alternative protein structures are generated by combining all conformers of movable side chains. The alternative structures are mapped. For assessing druggability we select the structure that has the highest number of probe clusters within the 6-Å vicinity of the main consensus site.
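Steps 2 through 4 can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this work: each residue is reduced to one representative coordinate, the hydrophobicity scale is taken to be positive, the thermal energy `KT` is set to a room-temperature value because the text gives no temperature, and `count_probe_clusters` is a hypothetical callable standing in for remapping a structure and counting the probe clusters within 6 Å of the main consensus site.

```python
import itertools
import math

KT = 0.593  # kcal/mol near 298 K; assumed, not stated in the text


def select_residues(residues, site_centers, cutoff=6.0,
                    hyd_frac=0.75, cav_frac=0.60):
    """Step 2: residues near the site that are hydrophobic and in a cavity.

    residues: list of dicts with hypothetical keys "name", "xyz",
    "accessible", "hydrophobicity", and "cavity".
    """
    # Maximum hydrophobicity over surface residues, maximum cavity over all.
    max_hyd = max(r["hydrophobicity"] for r in residues if r["accessible"])
    max_cav = max(r["cavity"] for r in residues)
    chosen = []
    for r in residues:
        near = any(math.dist(r["xyz"], c) <= cutoff for c in site_centers)
        if (r["accessible"] and near
                and r["hydrophobicity"] >= hyd_frac * max_hyd
                and r["cavity"] >= cav_frac * max_cav):
            chosen.append(r["name"])
    return chosen


def boltzmann_average(energies, kt=KT):
    """Boltzmann-weighted mean energy; shifted by the minimum for stability."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def summarize_cluster(members):
    """Step 3: members is a list of (rotamer_probability, minimized_energy).

    Returns the summed rotamer probability and the Boltzmann average energy.
    """
    probability = sum(p for p, _ in members)
    return probability, boltzmann_average([e for _, e in members])


def enumerate_structures(conformers):
    """Step 4: all combinations of conformers of the movable side chains.

    conformers: dict mapping residue name to its list of conformer labels,
    with the unbound state included in each list.
    """
    names = sorted(conformers)
    for combo in itertools.product(*(conformers[n] for n in names)):
        yield dict(zip(names, combo))


def pick_structure(structures, count_probe_clusters):
    """Step 4: keep the structure with the most probe clusters near the site."""
    return max(structures, key=count_probe_clusters)
```

Note that the number of alternative structures grows as the product of the conformer counts, so a full enumeration of this kind is practical only because few side chains are classified as movable.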

The results of mapping two different structures of a protein are compared in terms of mapping fingerprints, i.e., the percentages of the nonbonded interactions between the probes and each amino acid residue of the protein. A high correlation coefficient between two fingerprints indicates similar mapping results and thus similar hot spots and binding sites. Table S2 shows the results of applying the algorithm to the unbound structures of 15 well-studied drug targets (45). We note that one can account for protein flexibility without generating alternative conformations if multiple protein structures are available as in the case of NMR structures (see Table S7). The algorithm was also applied to 22 additional targets (see Tables S8 and S9), including 9 additional PPI targets also shown in Table 1.
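The fingerprint comparison can be illustrated with a short sketch. The text does not state which correlation coefficient is used, so the Pearson coefficient below is an assumption, as are the function names and the representation of a mapping result as per-residue counts of nonbonded probe contacts.

```python
import math


def fingerprint(contact_counts):
    """Convert per-residue probe contact counts into percentages."""
    total = sum(contact_counts.values())
    return {res: 100.0 * n / total for res, n in contact_counts.items()}


def pearson(fp_a, fp_b):
    """Pearson correlation between two fingerprints.

    Residues missing from one fingerprint are treated as 0%.
    Undefined (division by zero) if either fingerprint is constant.
    """
    residues = sorted(set(fp_a) | set(fp_b))
    a = [fp_a.get(r, 0.0) for r in residues]
    b = [fp_b.get(r, 0.0) for r in residues]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Because the fingerprints are percentages, two mappings with different total numbers of probe contacts but the same distribution over residues give a coefficient of 1.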

Docking of eIF4E Inhibitors.

Based on the mapping results, a box with 4-Å padding was created around the putative binding site. The docking was carried out using the standard settings of AutoDock Vina 1.1.0 (46), and the 10 lowest energy binding modes were retained for each ligand. The selection of the most likely pose was based on the atom densities calculated from the mapping results. We considered each retained pose separately, for each atom summed the atomic densities on the grid points within a 1-Å radius, and then added these values for all atoms. The poses were ranked on the basis of this overlap measure, and the pose with the best overlap was selected.
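The box construction and the pose-ranking step can be sketched as below. This is an illustrative reading of the procedure, not the code used in this work: the box is assumed to enclose a set of coordinates representing the putative site, the density grid is represented as a dictionary from grid-point coordinates to density values, and all function names are hypothetical.

```python
import math


def padded_box(points, padding=4.0):
    """Axis-aligned box around the site coordinates, padded on every side.

    Returns (center, size), the form in which docking programs usually
    take a search box.
    """
    lo = [min(p[i] for p in points) - padding for i in range(3)]
    hi = [max(p[i] for p in points) + padding for i in range(3)]
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    size = [h - l for l, h in zip(lo, hi)]
    return center, size


def overlap_score(pose_atoms, grid, radius=1.0):
    """Sum, over all atoms, of the densities at grid points within radius.

    pose_atoms: list of (x, y, z) atom coordinates of one docked pose.
    grid: dict mapping (x, y, z) grid points to mapping-derived density.
    """
    score = 0.0
    for atom in pose_atoms:
        score += sum(density for point, density in grid.items()
                     if math.dist(atom, point) <= radius)
    return score


def best_pose(poses, grid, radius=1.0):
    """Return the retained pose with the largest density overlap."""
    return max(poses, key=lambda atoms: overlap_score(atoms, grid, radius))
```

In this scheme the docking energy only determines which poses are retained; the final choice among them rests on agreement with the probe density.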

Acknowledgments.

This investigation was supported by Grants GM064700 and GM094551 from the National Institute of General Medical Sciences and Grant ES07381 from the National Institute of Environmental Health Sciences.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1101835108/-/DCSupplemental.

References

  • 1. Wells J, McClendon C. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature. 2007;450:1001–1009. doi: 10.1038/nature06526.
  • 2. Whitty A, Kumaravel G. Between a rock and a hard place? Nat Chem Biol. 2006;2:112–118. doi: 10.1038/nchembio0306-112.
  • 3. Berg T. Small-molecule inhibitors of protein-protein interactions. Curr Opin Drug Discov Devel. 2008;11:666–674.
  • 4. Fuller JC, Burgoyne NJ, Jackson RM. Predicting druggable binding sites at the protein-protein interface. Drug Disc Today. 2009;14:155–161. doi: 10.1016/j.drudis.2008.10.009.
  • 5. Laskowski RA, Luscombe NM, Swindells MB, Thornton JM. Protein clefts in molecular recognition and function. Protein Sci. 1996;5:2438–2452. doi: 10.1002/pro.5560051206.
  • 6. An J, Totrov M, Abagyan R. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol Cell Proteomics. 2005;4:752–761. doi: 10.1074/mcp.M400159-MCP200.
  • 7. Nayal M, Honig B. On the nature of cavities on protein surfaces: Application to the identification of drug-binding sites. Proteins. 2006;63:892–906. doi: 10.1002/prot.20897.
  • 8. Brown SP, Hajduk PJ. Effects of conformational dynamics on predicted protein druggability. ChemMedChem. 2006;1:70–72. doi: 10.1002/cmdc.200500013.
  • 9. Eyrisch S, Helms V. Transient pockets on protein surfaces involved in protein-protein interaction. J Med Chem. 2007;50:3457–3464. doi: 10.1021/jm070095g.
  • 10. Clackson T, Wells JA. A hot spot of binding energy in a hormone-receptor interface. Science. 1995;267:383–386. doi: 10.1126/science.7529940.
  • 11. Hajduk PJ, Huth JR, Fesik SW. Druggability indices for protein targets derived from NMR-based screening data. J Med Chem. 2005;48:2518–2525. doi: 10.1021/jm049131r.
  • 12. Mattos C, Ringe D. Locating and characterizing binding sites on proteins. Nat Biotechnol. 1996;14:595–599. doi: 10.1038/nbt0596-595.
  • 13. Dennis S, Kortvelyesi T, Vajda S. Computational mapping identifies the binding sites of organic solvents on proteins. Proc Natl Acad Sci USA. 2002;99:4290–4295. doi: 10.1073/pnas.062398499.
  • 14. Landon MR, Lancia DR, Jr, Yu J, Thiel SC, Vajda S. Identification of hot spots within druggable binding sites of proteins by computational solvent mapping. J Med Chem. 2007;50:1231–1240. doi: 10.1021/jm061134b.
  • 15. Brenke R, et al. Fragment-based identification of druggable “hot spots” of proteins using Fourier domain correlation techniques. Bioinformatics. 2009;25:621–627. doi: 10.1093/bioinformatics/btp036.
  • 16. Goodford PJ. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem. 1985;28:849–875. doi: 10.1021/jm00145a002.
  • 17. Miranker A, Karplus M. Functionality maps of binding sites: A multiple copy simultaneous search method. Proteins. 1991;11:29–34. doi: 10.1002/prot.340110104.
  • 18. Landon MR, et al. Novel druggable hot spots in avian influenza neuraminidase H5N1 revealed by computational solvent mapping of a reduced and representative receptor ensemble. Chem Biol Drug Des. 2008;71:106–116. doi: 10.1111/j.1747-0285.2007.00614.x.
  • 19. Christ F, et al. Rational design of small-molecule inhibitors of the LEDGF/p75-integrase interaction and HIV replication. Nat Chem Biol. 2010;6:442–448. doi: 10.1038/nchembio.370.
  • 20. Silvian LF, et al. Small molecule inhibition of the TNF family cytokine CD40 ligand through a subunit fracture mechanism. ACS Chem Biol. 2011;6:636–647. doi: 10.1021/cb2000346.
  • 21. Cachero TG, et al. Formation of virus-like clusters is an intrinsic property of the tumor necrosis factor family member BAFF (B cell activating factor). Biochemistry. 2006;45:2006–2013. doi: 10.1021/bi051685o.
  • 22. Graff JR, Konicek BW, Carter JH, Marcusson EG. Targeting the eukaryotic translation initiation factor 4E for cancer therapy. Cancer Res. 2008;68:631–634. doi: 10.1158/0008-5472.CAN-07-5635.
  • 23. Moerke NJ, et al. Small-molecule inhibition of the interaction between the translation initiation factors eIF4E and eIF4G. Cell. 2007;128:257–267. doi: 10.1016/j.cell.2006.11.046.
  • 24. Cencic R, et al. Reversing chemoresistance by small molecule inhibition of the translation initiation complex eIF4F. Proc Natl Acad Sci USA. 2011;108:1046–1051. doi: 10.1073/pnas.1011477108.
  • 25. Rickert M, Wang X, Boulanger MJ, Goriatcheva N, Garcia KC. The structure of interleukin-2 complexed with its alpha receptor. Science. 2005;308:1477–1480. doi: 10.1126/science.1109745.
  • 26. Tilley JW, et al. Identification of a small molecule inhibitor of the IL-2/IL-2R alpha receptor interaction which binds to IL-2. J Am Chem Soc. 1997;119:7589–7590.
  • 27. Arkin M, et al. Binding of small molecules to an adaptive protein-protein interface. Proc Natl Acad Sci USA. 2003;100:1603–1608. doi: 10.1073/pnas.252756299.
  • 28. Braisted A, et al. Discovery of a potent small molecule IL-2 inhibitor through fragment assembly. J Am Chem Soc. 2003;125:3714–3715. doi: 10.1021/ja034247i.
  • 29. Sattler M, et al. Structure of Bcl-xL-Bak peptide complex: Recognition between regulators of apoptosis. Science. 1997;275:983–986. doi: 10.1126/science.275.5302.983.
  • 30. Oltersdorf T, et al. An inhibitor of Bcl-2 family proteins induces regression of solid tumours. Nature. 2005;435:677–681. doi: 10.1038/nature03579.
  • 31. Lee EF, et al. Crystal structure of ABT-737 complexed with Bcl-xL: Implications for selectivity of antagonists of the Bcl-2 family. Cell Death Differ. 2007;14:1711–1713. doi: 10.1038/sj.cdd.4402178.
  • 32. Momand J, Wu H, Dasgupta G. MDM2-master regulator of the p53 tumor suppressor protein. Gene. 2000;242:15–29. doi: 10.1016/s0378-1119(99)00487-4.
  • 33. Kussie PH, et al. Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science. 1996;274:948–953. doi: 10.1126/science.274.5289.948.
  • 34. Vassilev L, et al. In vivo activation of the p53 pathway by small-molecule antagonists of MDM2. Science. 2004;303:844–848. doi: 10.1126/science.1092472.
  • 35. Abbate EA, Berger JM, Botchan MR. The X-ray structure of the papillomavirus helicase in complex with its molecular matchmaker E2. Genes Dev. 2004;18:1981–1996. doi: 10.1101/gad.1220104.
  • 36. Wang Y, et al. Crystal structure of the E2 transactivation domain of human papillomavirus type 11 bound to a protein interaction inhibitor. J Biol Chem. 2004;279:6976–6985. doi: 10.1074/jbc.M311376200.
  • 37. Mosyak L, et al. The bacterial cell-division protein ZipA and its interaction with an FtsZ fragment revealed by X-ray crystallography. EMBO J. 2000;19:3179–3191. doi: 10.1093/emboj/19.13.3179.
  • 38. Jennings LD, et al. Design and synthesis of indolo[2,3-a]quinolizin-7-one inhibitors of the ZipA-FtsZ interaction. Bioorg Med Chem Lett. 2004;14:1427–1431. doi: 10.1016/j.bmcl.2004.01.028.
  • 39. Tsao DH, et al. Discovery of novel inhibitors of the ZipA/FtsZ complex by NMR fragment screening coupled with structure-based design. Bioorg Med Chem. 2006;14:7953–7961. doi: 10.1016/j.bmc.2006.07.050.
  • 40. Rush TS, 3rd, Grant JA, Mosyak L, Nicholls A. A shape-based 3-D scaffold hopping method and its application to a bacterial protein-protein interaction. J Med Chem. 2005;48:1489–1495. doi: 10.1021/jm040163o.
  • 41. He MM, et al. Small-molecule inhibition of TNF-alpha. Science. 2005;310:1022–1025. doi: 10.1126/science.1116304.
  • 42. Li X, Keskin O, Ma B, Nussinov R, Liang J. Protein-protein interactions: Hot spots and structurally conserved residues often locate in completed pockets that preorganized in the unbound states: Implications for docking. J Mol Biol. 2004;344:781–785. doi: 10.1016/j.jmb.2004.09.051.
  • 43. Berman H, et al. The Protein Data Bank. Nucl Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
  • 44. Dunbrack RL, Karplus M. Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J Mol Biol. 1993;230:543–574. doi: 10.1006/jmbi.1993.1170.
  • 45. Verdonk ML, Mortenson PN, Hall RJ, Hartshorn MJ, Murray CW. Protein-ligand docking against non-native protein conformers. J Chem Inf Model. 2008;48:2214–2225. doi: 10.1021/ci8002254.
  • 46. Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31:455–461. doi: 10.1002/jcc.21334.
