Abstract

Compound promiscuity is often attributed to nonspecific binding or assay artifacts. On the other hand, it is well-known that many pharmaceutically relevant compounds are capable of engaging multiple targets in vivo, giving rise to polypharmacology. To explore and better understand promiscuous binding characteristics of small molecules, we have searched X-ray structures (and very few qualifying solution structures) for ligands that bind to multiple distantly related or unrelated target proteins. Experimental structures of a given ligand bound to different targets represent high-confidence data for exploring promiscuous binding events. A total of 192 ligands were identified that formed crystallographic complexes with proteins from different families and for which activity data were available. These “multifamily” compounds included endogenous ligands and were often more polar than other bound compounds and active in the submicromolar range. Unexpectedly, many promiscuous ligands displayed conserved or similar binding conformations in different active sites. Others were found to conformationally adjust to binding sites of different architectures. A comprehensive analysis of ligand–target interactions revealed that multifamily ligands frequently formed different interaction hotspots in binding sites, even if their bound conformations were similar, thus providing a rationale for promiscuous binding events at the molecular level of detail. As a part of this work, all multifamily ligands we have identified and associated activity data are made freely available.
1. Introduction
Compound optimization efforts in medicinal chemistry traditionally aim to develop drug candidates that are highly selective and potent toward a specific biological target. This principle is based upon the assumption that therapeutic effects following drug administration solely result from interactions with a single target. However, this paradigm was called into question and revised when it became evident that the efficacy of drugs, but also side effects, frequently depended on multitarget activities and associated functional consequences, a concept referred to as “polypharmacology”.1−6
Despite the relevance of polypharmacology for drug efficacy, compounds with promiscuous binding behavior are often viewed controversially.7,8 This is the case because high hit rates of small molecules in biological assays are frequently not the result of multiple binding events.9 Rather, aggregation effects and potential chemical reactivities under assay conditions can lead to false positive assay signals.9−12 In light of concerns about such artifacts, studying multitarget activities of ligands and differentiating between false positive and true positive interactions have become important tasks in medicinal chemistry and biological screening.13−17
In addition to their relevance for drug development, the study of promiscuous small molecules is also of high interest in basic research. Importantly, physiological effects of endogenous chemical entities such as coenzymes, substrates, or transmitters are often elicited because of their ability to interact with distantly related or unrelated proteins having diverse functions.18,19 Hence, “true” promiscuity represents an evolutionary principle for physiologically relevant ligands. However, the molecular basis of promiscuous binding events remains to be further explored.
Although the ligand specificity paradigm will continue to play an important role in drug discovery, there are many opportunities to utilize polypharmacology.3 For example, multitarget compounds used for the treatment of a given pathology might be repositioned for other therapeutic applications that require engagement of different targets.20 A textbook example of such repurposing efforts is methotrexate, a drug used for many years in cancer treatment, which has recently found alternative low-dose applications in the treatment of inflammatory disorders like psoriasis and rheumatoid arthritis.21 Notably, polypharmacology has high potential for treatment of diseases that result from perturbation of target networks and associated signaling pathways. Promiscuous kinase inhibitors successfully used in oncology are prime examples of compounds that interfere with target networks and their signaling cascades.22
Given the complex nature of polypharmacology, rational design of multitarget ligands is an equally challenging and attractive area of research.3,7,23−25 To this end, several studies have attempted to determine structure–activity relationship profiles of multitarget compounds. For example, on the basis of publicly available activity data, compounds with multitarget activity were identified and similarity relationships between them were explored.25−27 Furthermore, X-ray structures were used to associate multitarget drugs with proteins having similar functions,28 relate multitarget activities of ligands to protein binding site similarity,29 or identify compounds bound to targets from different families (multifamily ligands).30 Although structural data are limited, studying multitarget and multifamily ligands on the basis of complex X-ray structures, rather than assay data, has the intrinsic advantage that these binding events are confirmed at the molecular level of detail and can be investigated as such.
Herein, we have carried out a systematic search for experimental structures [X-ray or nuclear magnetic resonance (NMR)] of small molecules bound to multiple targets from different protein families to better understand origins of ligand promiscuity across target families. A set of structure-based multifamily compounds was identified that included endogenous ligands as well as approved drugs. Molecular properties and bound conformations of these multifamily ligands were systematically analyzed and interaction hotspots in different protein binding sites were identified. Taken together, the results of our analysis shed light on the ability of small molecules to interact with distantly related or unrelated targets.
2. Results and Discussion
2.1. Identification and Characterization of Multifamily Ligands
From 112 212 structures (entries) available in the Protein Data Bank (PDB),31 26 073 bound ligands were extracted. These ligands included 6496 organic compounds with a molecular weight of at least 300 Da and one or more reported activity values (in original references) of 10 μM or better (pIC50, pKi, or pKd ≥ 5). This set of PDB ligands provided the basis of our study.
The preselected ligands were subjected to a two-stage analysis. First, target family assignments were computationally carried out in a consistent manner (without subjective intervention) to identify ligands that were active against different target families and ensure reproducibility of the analysis (see Materials and Methods). Second, for each designated multifamily ligand, assigned targets and binding domains were carefully compared to examine similarities between targets from different families and prioritize multifamily ligands for promiscuity analysis, as further discussed below.
Computational analysis of the preselected PDB ligands identified 192 compounds that formed complexes with a variety of target proteins from 2 to 16 different families. These 192 compounds were designated multifamily ligands and further analyzed. Figure 1 shows exemplary compounds and Figure 2 shows the distribution of multifamily ligands over protein families. Kinases and other transferases formed the largest number of complexes with multifamily ligands (with 42 and 36 ligands, respectively). The majority of complex structures involved cytosolic enzymes, which are overrepresented in the PDB because of ease of crystallization.
Figure 1.
Exemplary multifamily ligands. For each ligand, the PDB Ligand Expo identifier (PDB-ID) is given and the number of qualifying complex X-ray structures (entries), targets, and target families is reported. Shown are exemplary ligands from (a) subset IV and (b) subset III.
Figure 2.
Distribution of multifamily ligands over target families. The bar plot reports the distribution of multifamily ligands over families of crystallographic targets according to the ChEMBL protein family classification scheme.
Multifamily ligands were available in complexes with 2 to 131 crystallographic targets, with a median value of 3 unique targets per ligand. The 192 ligands were represented by a total of 3398 complex structures. These structures only included 20 solution (NMR) structures of ligand–protein complexes and 34 NMR structures of ligand–DNA complexes (for completeness, DNA was included as a biological target). The small number of solution structures only entered the initial statistical analysis of multifamily ligands. Subsequent analysis was focused on X-ray structures. Distributions of potency (pIC50, pKi, or pKd) values of the 192 multifamily ligands for X-ray targets are shown in Figure 3. The distributions were broad and interquartile ranges spanned several orders of magnitude, with median values in the low micromolar to submicromolar range.
Figure 3.
Potency values. For multifamily ligands, the distributions of different logarithmic potency values (pIC50, pKi, and pKd) are reported in box plots. The yellow horizontal line indicates the median value of each distribution (reported next to the line).
In stage two of our analysis, targets of all multifamily ligands were compared individually and the ligands were assigned to 4 different subsets:
Subset I: ligands whose multifamily assignment depended on complexes with metabolizing enzymes or serum proteins (10 ligands); II: endogenous ligands (40); III: ligands binding to similar proteins from different families or to similar binding domains (51); and IV: multifamily ligands interacting with distinct targets (91).
The 10 ligands from subset I were omitted from further consideration because binding to serum proteins or metabolizing enzymes such as cytochromes is not relevant for polypharmacology (for all remaining ligands, complexes with such proteins were not included in subsequent analysis steps). Endogenous ligands such as adenosine 5′-triphosphate (ATP) or nucleoside derivatives have evolved to interact with different proteins. As such, these naturally occurring ligands are set apart from synthetic compounds and should best be separately considered. Furthermore, proteins from different families distinguished by established classification schemes might partly be structurally related and have similar biological functions. Therefore, subset III captured multifamily ligands for which at least some of the participating proteins had similar enzymatic functions or similar binding domains. By contrast, subset IV contained ligands that interacted exclusively with unrelated or distantly related targets (both in terms of structure and function). Figure 1a shows representative examples of subset IV ligands such as QUE that interacts with numerous distinct targets. Figure 1b shows subset III ligands. For example, NGH inhibits metalloproteases from 2 different families and BMF binds to bromodomains in proteins from 4 different families.
On the basis of our analysis, the 91 multifamily ligands belonging to subset IV had highest priority for promiscuity analysis, given that they interacted with unrelated targets. Therefore, specific examples discussed below were taken from subset IV.
For the initial set of 192 multifamily ligands and all other preselected PDB ligands, different molecular properties were calculated and compared, revealing some interesting differences in the topological polar surface area (TPSA) and S log P values. Multifamily ligands had overall large TPSA (with a median of 145.5 Å²) and low S log P values (median 1.7), indicating that multifamily ligands were generally polar. The apparent increase in hydrophilicity among multifamily ligands was further investigated by calculating TPSA (Figure 4a) and S log P values (Figure 4b) for individual ligand subsets. With a median TPSA of 255.7 Å² and S log P value of −2.2, endogenous ligands were partly, but not exclusively, responsible for the relative increase in hydrophilicity because they included a variety of nucleosides with phosphate groups. However, even after removal of all subset II ligands, the remaining multifamily ligands had detectably higher hydrophilic character than other PDB compounds, with a median TPSA of 120.5 Å² versus 101.7 Å² and median S log P value of 2.4 versus 3.4, respectively. Thus, the multifamily activity of ligands was not attributable to hydrophobic “stickiness”. Rather, they were more hydrophilic in nature than many other PDB compounds, even when endogenous ligands were excluded. Furthermore, only 17 multifamily ligands (<10%) were found to contain substructures implicated in assay interference effects. We also searched for structural analogues and analogue series among multifamily ligands. Only a single series containing 3 analogues was identified. Thus, multifamily ligands were not dominated by individual compound classes but were structurally diverse.
Figure 4.
Molecular properties. The distributions of (a) TPSA and (b) SlogP values for endogenous multifamily ligands (subset II, blue), remaining multifamily ligands (white), and all preselected PDB ligands (PDB CPDs, gray) are reported in box plots. The yellow horizontal line indicates the median value of each distribution.
2.2. Binding Conformations
Next, bound conformations of each multifamily ligand were systematically superposed and compared. Figure 5 shows exemplary pairwise superpositions of target-dependent conformations. Figure 6 shows the distribution of root-mean-square deviation (RMSD) values resulting from exhaustive comparison of binding conformations of all 192 multifamily ligands. Pairwise RMSD values ranged from close to 0 to 6.0 Å, with a median RMSD of 1.0 Å. Thus, approximately half of the comparisons identified similar binding conformations in different structural environments. The third quartile reached a value of 1.8 Å. At this level, conformations of typical ligands become dissimilar. Therefore, approximately a quarter of the comparisons indicated target-dependent conformational differences. However, overall most bound conformations of multifamily ligands were similar, regardless of the conformational space available to ligands and differences in the geometry and shape of binding sites. Figure 6 also reports the corresponding distribution of RMSD values for the 91 high-priority ligands from subset IV. In this case, the median RMSD value was only 0.8 Å, thus even lower, despite interactions with unrelated targets.
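The distribution statistics quoted above (median and third quartile of pairwise RMSD values) can be illustrated with a short Python sketch. The RMSD values below are invented for illustration and are not data from this study; the function name `summarize_rmsd` and the choice of the "inclusive" quantile method are assumptions, since the text does not state how quartiles were computed.

```python
import statistics


def summarize_rmsd(values):
    """Return a five-number summary of pairwise RMSD values (in Angstrom)."""
    # quantiles(n=4) returns the three cut points separating the quartiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Hypothetical pairwise RMSD values for one ligand (not taken from the paper).
example_rmsd = [0.2, 0.5, 0.8, 1.0, 1.2, 1.9, 2.5]
summary = summarize_rmsd(example_rmsd)

# Fraction of comparisons above the 1.8 Angstrom level discussed in the text.
fraction_dissimilar = sum(v > 1.8 for v in example_rmsd) / len(example_rmsd)
```

A summary of this kind corresponds to what a box plot such as Figure 6 displays, although the plotting software may use a different quantile convention.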
Figure 5.

Binding conformations of a multifamily ligand. Shown are exemplary pairwise superpositions of crystallographic conformations of doxorubicin. As a reference conformation, doxorubicin bound to DNA (Ligand Expo ID: DM2, PDB entry ID: 1DA9) is used (gray carbon atoms) onto which bound conformations of doxorubicin extracted from complex structures with diverse targets are superposed. For each superposition, the RMSD value is reported.
Figure 6.

Comparison of binding conformations. For all 192 multifamily ligands (MF PDB CPDs, blue) and the subset of prioritized multifamily ligands (subset IV, white), the distribution of RMSD values for all pairwise superpositions of bound conformations is reported in a box plot. The yellow horizontal line indicates the median value of each distribution.
Hence, it remained to be determined how similar ligand conformations were accommodated in different structural environments.
2.3. Target–Ligand Interactions
Therefore, a systematic analysis of intermolecular interactions was carried out (details are provided in Materials and Methods). Directed polar interactions including hydrogen bonds, ligand–metal contacts, ionic interactions, and π-interactions were accounted for, in addition to van der Waals (vdW) contacts between ligand atoms and nonpolar amino acids. These vdW contacts were quantified as an indicator of hydrophobic interactions and shape complementarity. Figure 7 illustrates target–ligand interaction analysis using indomethacin as an exemplary ligand bound to human peroxisome proliferator-activated receptor γ (PPARγ).32 Atoms involved in directed and/or vdW interactions were uniquely indexed and individual atomic contacts were counted. Then, contacts were mapped onto ligand atoms and color-coded according to their frequency. Accordingly, dark green and dark orange atoms, or groups of atoms, indicated centers of polar and vdW interactions, respectively, as illustrated in Figure 7. For each multifamily ligand, interaction patterns were then monitored separately for targets belonging to different families and compared. The analysis revealed that multifamily ligands mostly formed different “interaction hotspots” with targets belonging to different families, even if bound conformations were similar, as discussed in the following.
Figure 7.

Identification of target–ligand interaction hotspots. For an exemplary bound ligand (green carbon atoms, Ligand Expo ID: IMN, PDB entry 3ADS), the atom-based number of directed interactions (hydrogen bonds, ligand–metal contacts, ionic, and π-interactions) and vdW interactions with hydrophobic protein residues are reported. Atoms involved in interactions are indexed. In the X-ray structure, directed interactions are shown as dotted lines and vdW contacts are presented as magenta spheres. In the corresponding 2D representations, ligand atoms are color-coded according to the number of their interactions (directed: light to dark green, vdW: light to dark orange).
For the examples presented, binding site similarity between participating protein families was also calculated (see Materials and Methods) and binding sites reaching a threshold for detectable similarity were identified.
2.4. Interaction Hotspots of Multifamily Ligands
Interaction hotspots were defined as ligand atoms most frequently involved in specific ligand–target interactions. They were calculated by mapping detectable ligand–target interactions on participating ligand atoms and determining their frequency on a per-atom basis (see Materials and Methods). Thus, so-defined hotspots revealed centers of interactions in ligands and other regions that did not participate in such interactions. Combining the analysis of binding conformations and target–ligand interactions made it possible to rationalize different multitarget binding events. For example, indomethacin represents a well-characterized polypharmacological drug33 that is known to interact with unrelated targets including cyclooxygenases,34 phospholipase A2,35 and PPARγ,32 and also serum albumin.36 As revealed by its RMSD matrix in Figure 8a, indomethacin belongs to the subset of multifamily ligands that display target-dependent differences in binding conformations with largest RMSD values exceeding 2.0 Å. Largest conformational variations were observed for transcription factor binding compared to hydrolases, reductases, and secreted proteins. Hence, indomethacin conformationally adapted to different structural environments. Figure 8b compares the interactions between indomethacin and targets from different families. The aliphatic carboxylic acid group of indomethacin was a conserved hotspot for polar interactions across all 3 protein families. On the other hand, aromatic interactions of the central indole ring moiety were only observed in binding sites of reductases. However, vdW interactions involving this moiety were mostly found in reductases and the transcription factor. By contrast, in the active site of hydrolases, no interactions with the central part of indomethacin were detectable. Thus, binding of this drug across different target families involved both conserved and distinct interaction patterns, which was a recurrent theme among multifamily ligands.
Figure 8.

Multifamily binding of indomethacin. (a) Shows an RMSD value matrix for comparison of indomethacin conformations bound to targets from different families. Each cell represents an RMSD value for a pairwise superposition. Cells are color-coded according to RMSD values [from light blue (0.0 Å) to purple (maximum RMSD)]. (b) Shows interaction hotspots of indomethacin for different protein families. The number of targets per family is given in parentheses. The representation of interaction hotspots is according to Figure 7. A low binding site similarity was detected for transcription factors, secreted proteins, and hydrolases.
The HIV protease inhibitor ritonavir37 is an example of a multifamily ligand with different binding conformations, yielding largest RMSD values exceeding 3.0 Å (Figure 9a). Among other targets, ritonavir is known to bind to cytochrome P450 enzymes, which causes undesirable side effects and drug interactions.38 The peptidomimetic nature of ritonavir with a large number of rotatable bonds supports flexibility of binding conformations. Accordingly, as shown in Figure 9b, this ligand displayed different polar interaction patterns when bound to proteases and hydrolases that were closely related and had significant binding site similarity. Polar interactions in proteases and hydrolases were centered on a hydroxyl group, while distinct polar hotspots were identified for the thiazole ring. Moreover, this ligand displayed overlapping yet distinct vdW interactions in different binding sites with three hotspots, only one of which (the central phenyl moiety) was shared by the two target families. Hence, ritonavir provided an intuitive example of a multifamily ligand where conformational adaptability was accompanied by the formation of different interaction hotspots.
Figure 9.

Multifamily binding of ritonavir. (a) RMSD matrix. (b) Interaction hotspots for different protein families. The presentation is according to Figure 8. A high binding site similarity was detected for hydrolases and proteases.
Because ligand binding across different protein families was not only attributable to conformational variability and resulting differences in interaction hotspots, we reasoned that differences in interaction patterns should also be present for multifamily ligands that bound with similar conformations to different targets. For example, quercetin is a relatively small and rigid compound that belongs to the large subset of multifamily ligands with essentially conserved binding conformations across different target families (with only one exception), as illustrated by its RMSD matrix in Figure 10a. Quercetin contains a polyphenolic flavonoid scaffold. Notably, flavonoids were considered privileged substructures in drug discovery39 capable of forming interactions with kinases,40 DNA,41 or hydrolases.42 In addition, there is crystallographic evidence for the oxidative cleavage of quercetin by quercetinase.43 However, polyphenols such as quercetin were also implicated in reactivity under assay conditions and other potential liabilities such as membrane perturbation, adding them to the spectrum of interference compounds.12,13 X-ray structures of quercetin in complex with DNA and targets from three protein families also revealed both conserved and family-dependent interaction hotspots, as shown in Figure 10b. The 3-hydroxy group of quercetin was consistently involved in target–ligand interactions, whereas the carbonyl oxygen formed a hydrolase-specific interaction hotspot. The C-ring of quercetin was involved in π-interactions in all complexes except when bound to hydrolases. Especially in kinases, all three rings were involved in aromatic interactions. In addition, extensive vdW contacts were formed when quercetin was bound to reductases and kinases, which were largely absent in hydrolases (and DNA).
Figure 10.

Multifamily binding of quercetin. (a) RMSD matrix. (b) Interaction hotspots for different protein families. No binding site similarity was detected.
Comparable conformational invariance was also observed for the chemotherapeutic agent doxorubicin, given its rigid structure. The presumed mechanism of action of doxorubicin involves the intercalation of the planar anthracycline core with the DNA double helix.44 Similar to quercetin, doxorubicin contains structural elements that contribute to ligand–target interactions but also cause assay liabilities and potentially adverse pharmacological effects.45−47 In light of the complex pharmacokinetics of anthracyclines, interactions of doxorubicin with different target proteins were analyzed in a number of crystallographic investigations including complexes with efflux pumps48 and cytosolic reductases.49 Because of the rigidity of the anthracycline core, limited conformational flexibility was due to bond rotation in the terminal carboxylic acid and aminoglycoside moiety, respectively (Figure 11a). Rather unexpectedly, interaction analysis of doxorubicin revealed that π-interactions of the aromatic core were only dominant when binding to DNA but that vdW contacts involving this moiety were preferentially observed in complexes with reductases and a transporter (Figure 11b). By contrast, the primary amine of the aminoglycoside was found to be a conserved interaction hotspot across 3 protein families. On the other hand, carbonyl and hydroxyl oxygens of the central anthracycline core only formed polar contacts when binding to reductases. Thus, polar and vdW interactions distinguished binding of doxorubicin in different structural environments.
Figure 11.

Multifamily binding of doxorubicin. (a) RMSD matrix. (b) Interaction hotspots for different protein families. No binding site similarity was detected.
The ATP-competitive kinase inhibitor dinaciclib has much more conformational freedom than quercetin and doxorubicin. However, it also bound with similar conformations to cyclin-dependent kinases50 and epigenetic regulators,51 as shown in Figure 12a, with a maximum RMSD value of 1.6 Å. Cyclin-dependent kinases and epigenetic regulators had detectable binding site similarity. Figure 12b compares interaction hotspots of dinaciclib with these 2 protein families. In both cases, extensive vdW interactions with essentially all parts of the inhibitor were observed, reflecting a high degree of shape complementarity in these binding sites. By contrast, distinct hotspots for polar interactions emerged. For epigenetic regulators, directed interactions with the central pyrazolo[1,5-a]pyrimidine scaffold were detected. On the other hand, charge-assisted interactions and π-interactions involving the pyridine-N-oxide moiety were prevalent in complexes with kinases. Accordingly, dinaciclib was also representative of many multifamily ligands that had largely conserved binding conformations across different targets but formed different interaction hotspots in changing protein environments.
Figure 12.

Multifamily binding of dinaciclib. (a) RMSD matrix. (b) Interaction hotspots for different protein families. Binding site similarity between epigenetic regulators and kinases was detected.
3. Conclusions
In this work, we have systematically identified ligands with available experimental structures of complexes with targets from different families. These structures of multifamily ligands provided firm evidence for the presence of true binding events. Properties and binding characteristics of multifamily ligands were analyzed in detail to better understand the molecular basis of their promiscuous binding behavior. Multifamily ligands also included drugs with known polypharmacology. Surprisingly, multifamily ligands were overall slightly more hydrophilic than other PDB compounds. Moreover, many, but not all, multifamily ligands had similar binding conformations when interacting with targets from different families. In some instances, conformational variability in different binding sites was, as expected, accompanied by the formation of different interaction hotspots. In other cases, conserved binding conformations of rigid or flexible ligands revealed overlapping yet distinct interaction hotspots across different target families. The formation of target family-dependent interaction hotspots in the presence of variable or conserved binding conformations emerged as a recurrent theme across multifamily ligands. The ligands interacted similarly with targets from the same family, leading to family-dependent hotspots, but interaction hotspots clearly differed between families. These observations provided a rationale for the promiscuous binding capacity of ligands at the molecular level of detail.
4. Materials and Methods
All calculations were carried out using in-house Python scripts with the aid of RDKit52 and OpenEye’s chemistry toolkit,53 KNIME protocols,54 and the Molecular Operating Environment (MOE).55
4.1. Ligands from X-ray Structures
X-ray structures and associated compound data were extracted from the Ligand Expo section56 of the PDB and complemented with experimental binding affinity data from the PDBbind database.57 Ligands were considered for further analysis if they had a minimum molecular weight of 300 Da and if at least one activity value of 10 μM (pIC50, pKi, or pKd) or better was available. Application of the molecular weight and activity cutoff ensured that salts, small organic components, and molecular fragments were excluded. Molecular descriptors of PDB ligands were calculated using RDKit. PDB ligands were screened in silico for structures containing Pan Assay Interference Compounds (PAINS)11 utilizing SMARTS58 strings obtained from three publicly available filters (ZINC,59 RDKit, and ChEMBL60).
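The preselection criteria described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the in-house script used in the study: the record layout, the ligand labels, and the function names `to_p_value` and `passes_preselection` are hypothetical, and activities are assumed to be given in mol/L so that 10 μM corresponds to a negative decadic logarithm of 5.

```python
import math

MIN_MW_DA = 300.0   # molecular weight cutoff stated in the text
MIN_P_VALUE = 5.0   # 10 uM on the negative log10 molar scale


def to_p_value(molar_concentration):
    """Convert an IC50, Ki, or Kd in mol/L to pIC50, pKi, or pKd."""
    return -math.log10(molar_concentration)


def passes_preselection(mol_weight_da, activities_molar):
    """True if the ligand meets the weight cutoff and has at least one
    activity value of 10 uM or better."""
    if mol_weight_da < MIN_MW_DA:
        return False
    return any(to_p_value(a) >= MIN_P_VALUE for a in activities_molar)


# Hypothetical records: (label, molecular weight in Da, activities in mol/L).
ligands = [
    ("LIG_A", 357.8, [2.0e-7]),          # heavy enough, 200 nM: kept
    ("LIG_B", 180.2, [5.0e-8]),          # potent but too small: dropped
    ("LIG_C", 420.5, [3.0e-4, 5.0e-5]),  # heavy enough, too weak: dropped
]
selected = [label for label, mw, acts in ligands
            if passes_preselection(mw, acts)]
```

In this sketch, a ligand is retained as soon as a single measurement qualifies, which mirrors the "at least one activity value" wording of the criterion.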
4.2. Target Family Distribution
For crystallographic targets, family assignments were obtained by matching UniProt61 target identifiers to ChEMBL identifiers and applying the ChEMBL protein family classification scheme. In addition, the number of targets per compound was determined on the basis of unique UniProt identifiers.
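The counting logic behind the family assignment can be sketched as follows. The accession labels, family names, and input layout are placeholders chosen for illustration; in practice the lookup table would be derived from the UniProt-to-ChEMBL mapping described above, which is not reproduced here. A ligand is treated as a multifamily ligand if its targets map to two or more families.

```python
from collections import defaultdict

# Placeholder lookup: target accession -> protein family label.
FAMILY_OF = {
    "ACC_0001": "Kinase",
    "ACC_0002": "Kinase",
    "ACC_0003": "Hydrolase",
    "ACC_0004": "Reductase",
}

# Placeholder (ligand label, target accession) pairs, one per complex.
complexes = [
    ("LIG_A", "ACC_0001"), ("LIG_A", "ACC_0001"), ("LIG_A", "ACC_0003"),
    ("LIG_B", "ACC_0001"), ("LIG_B", "ACC_0002"),
    ("LIG_C", "ACC_0003"), ("LIG_C", "ACC_0004"), ("LIG_C", "ACC_0002"),
]

targets = defaultdict(set)   # ligand -> unique target accessions
families = defaultdict(set)  # ligand -> unique family labels
for ligand, accession in complexes:
    targets[ligand].add(accession)
    family = FAMILY_OF.get(accession)
    if family is not None:   # skip targets without a family assignment
        families[ligand].add(family)

# Ligands whose targets span at least two families.
multifamily = sorted(lig for lig, fams in families.items() if len(fams) >= 2)
```

Using sets ensures that repeated structures of the same ligand–target pair are counted once, consistent with counting targets by unique identifiers.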
4.3. Searching for Structural Analogues
A systematic search for analogues among multifamily ligands was carried out using a matched molecular pair-based computational method.62
4.4. Analysis of Binding Conformations
For each multifamily ligand, bound conformations were extracted from the corresponding X-ray complexes and superposed. From these superpositions, pairwise RMSD values of ligand conformations were calculated using MOE.
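For orientation, the RMSD between two bound conformations can be computed as sketched below. This is not the MOE procedure used in the study; it assumes that the conformations have already been superposed and that atoms are listed in the same order in both coordinate sets. The function names and the toy coordinates are illustrative only.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(conformations):
    """Map each pair of conformation labels to its RMSD value."""
    return {(i, j): rmsd(conformations[i], conformations[j])
            for i, j in combinations(sorted(conformations), 2)}


# Toy two-atom conformations with hypothetical labels (coordinates in Angstrom).
conformations = {
    "conf_1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "conf_2": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "conf_3": [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
}
matrix = pairwise_rmsd(conformations)
```

The resulting dictionary holds the upper triangle of an RMSD matrix of the kind shown in panel (a) of Figures 8 to 12.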
4.5. Analysis of Target–Ligand Interactions
For multifamily ligands, crystallographic target–ligand interactions were systematically analyzed using a KNIME implementation of MOE if two or more complexes representing a protein family were available. Crystallographic water molecules were removed from X-ray structures to avoid overestimation of water contacts in complexes.63 Nonbonded interactions involving ligand atoms were determined within a radius of 4.5 Å. Hydrogen bonds, ligand–metal contacts, ionic, and π-interactions were identified with the aid of an empirical geometry-based scoring function.64 In addition, vdW contacts between ligand atoms and hydrophobic protein residues were determined by applying a maximal interaction energy of −0.5 kcal/mol. The sum of all polar and vdW interactions was calculated for each multifamily ligand atom over all binding sites in X-ray structures of a given protein family. For each family, the atom-based number of interactions was mapped onto a 2D representation of the ligand64 using the chemistry toolkit of RDKit. Atom positions were color-coded according to the number of mapped interactions.
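The per-atom summation over all binding sites of a family can be sketched as follows. The interaction detection itself (MOE scoring function and energy cutoff) is not reproduced; the sketch starts from per-complex interaction counts, which are invented here. Atom labels, family names, and the function names are hypothetical, and taking the atoms with the highest total count as hotspots is an assumption, since the text maps frequencies onto a color scale rather than applying an explicit cutoff.

```python
from collections import Counter


def tally_interactions(per_complex_counts):
    """Sum per-atom interaction counts over all complexes of one family."""
    totals = Counter()
    for counts in per_complex_counts:
        totals.update(counts)  # adds the counts of each ligand atom
    return totals


def hotspots(totals):
    """Return the ligand atoms with the highest summed interaction count."""
    if not totals:
        return []
    top = max(totals.values())
    return sorted(atom for atom, n in totals.items() if n == top)


# Invented per-complex counts: {ligand atom label: number of interactions}.
by_family = {
    "Hydrolase": [{"O1": 2, "C5": 1}, {"O1": 3}],
    "Reductase": [{"O1": 1, "C5": 4}, {"C5": 2, "N2": 1}],
}
family_totals = {fam: tally_interactions(cs) for fam, cs in by_family.items()}
family_hotspots = {fam: hotspots(t) for fam, t in family_totals.items()}
```

In this toy case, the same ligand has a different hotspot atom in each family, which is the pattern the analysis was designed to detect. Polar and vdW interactions could be tallied separately by running the same functions on two sets of counts.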
4.6. Binding Site Similarity
Similarity of binding sites from different protein families was analyzed using ProBiS.65 For pairwise comparison of nonredundant targets from different families, the lowest recommended similarity z-score of 1.0 was applied as a threshold for detectable binding site similarity.65 If a pairwise comparison yielded a score of 1.0 or greater, the binding sites were classified as similar.
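The threshold-based classification can be illustrated with a small Python sketch. ProBiS itself is not invoked; the z-scores below are invented, and the input layout and function name are hypothetical. Two families are labeled as having detectable binding site similarity if at least one pairwise comparison between their targets reaches the threshold of 1.0.

```python
Z_SCORE_THRESHOLD = 1.0  # lowest recommended similarity z-score from the text


def similar_family_pairs(scores, threshold=Z_SCORE_THRESHOLD):
    """Return family pairs with at least one comparison at or above threshold.

    scores maps a (family_a, family_b) tuple to the list of z-scores obtained
    for pairwise comparisons of nonredundant targets from the two families.
    """
    return sorted(pair for pair, z_values in scores.items()
                  if any(z >= threshold for z in z_values))


# Invented z-scores for illustration only.
example_scores = {
    ("Hydrolase", "Protease"): [0.4, 1.7],
    ("Kinase", "Reductase"): [0.2, 0.9],
    ("Kinase", "Epigenetic regulator"): [1.0],
}
similar = similar_family_pairs(example_scores)
```

Because the comparison uses "greater than or equal", a score of exactly 1.0 counts as similar, in line with the stated criterion.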
4.7. Data Availability
All multifamily ligands, family assignments, available affinity data, and the ligand subset classification are made available as Table S1 of the Supporting Information.
Acknowledgments
The use of OpenEye’s toolkits was made possible by their free academic licensing program.
Supporting Information Available
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acsomega.8b03481.
Multifamily ligands, family assignments, available affinity data, and the ligand subset classification (PDF)
Author Contributions
The study was carried out and the manuscript written with contributions of all authors. All authors have approved the final version of the manuscript.
The authors declare no competing financial interest.
References
- Zimmermann G. R.; Lehár J.; Keith C. T. Multi-Target Therapeutics: When the Whole is greater than the Sum of the Parts. Drug Discovery Today 2007, 12, 34–42. 10.1016/j.drudis.2006.11.008.
- Hopkins A. L. Network Pharmacology: The Next Paradigm in Drug Discovery. Nat. Chem. Biol. 2008, 4, 682–690. 10.1038/nchembio.118.
- Anighoro A.; Bajorath J.; Rastelli G. Polypharmacology: Challenges and Opportunities in Drug Discovery. J. Med. Chem. 2014, 57, 7874–7887. 10.1021/jm5006463.
- Bolognesi M. L. Polypharmacology in a Single Drug: Multitarget Drugs. Curr. Med. Chem. 2013, 20, 1639–1645. 10.2174/0929867311320130004.
- Bolognesi M. L.; Cavalli A. Multitarget Drug Discovery and Polypharmacology. ChemMedChem 2016, 11, 1190–1192. 10.1002/cmdc.201600161.
- Rosini M. The Rise of Multitarget Drugs over Combination Therapies. Future Med. Chem. 2014, 6, 485–487. 10.4155/fmc.14.25.
- Hu Y.; Bajorath J. Compound Promiscuity - What Can We Learn From Current Data. Drug Discovery Today 2013, 18, 644–650. 10.1016/j.drudis.2013.03.002.
- Kuhn M.; Banchaabouchi M. A.; Campillos M.; Jensen L. J.; Gross C.; Gavin A.-C.; Bork P. Systematic Identification of Proteins that Elicit Drug Side Effects. Mol. Syst. Biol. 2013, 9, 663. 10.1038/msb.2013.10.
- Shoichet B. K. Screening in a Spirit Haunted World. Drug Discovery Today 2006, 11, 607–615. 10.1016/j.drudis.2006.05.014.
- McGovern S. L.; Caselli E.; Grigorieff N.; Shoichet B. K. A Common Mechanism Underlying Promiscuous Inhibitors from Virtual and High-Throughput Screening. J. Med. Chem. 2002, 45, 1712–1722. 10.1021/jm010533y.
- Baell J. B.; Holloway G. A. New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays. J. Med. Chem. 2010, 53, 2719–2740. 10.1021/jm901137j.
- Baell J.; Walters M. A. Chemistry: Chemical Con Artists Foil Drug Discovery. Nature 2014, 513, 481–483. 10.1038/513481a.
- Baell J. B.; Nissink J. W. M. Seven Year Itch: Pan-Assay Interference Compounds (PAINS) in 2017 – Utility and Limitations. ACS Chem. Biol. 2017, 13, 36–44. 10.1021/acschembio.7b00903.
- Aldrich C.; Bertozzi C.; Georg G. I.; Kiessling L.; Lindsley C.; Liotta D.; Merz K. M. Jr.; Schepartz A.; Wang S. The Ecstasy and Agony of Assay Interference Compounds. ACS Cent. Sci. 2017, 3, 143–147. 10.1021/acscentsci.7b00069.
- Capuzzi S. J.; Muratov E. N.; Tropsha A. Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS. J. Chem. Inf. Model. 2017, 57, 417–427. 10.1021/acs.jcim.6b00465.
- Jasial S.; Hu Y.; Bajorath J. How Frequently Are Pan Assay Interference Compounds Active? Large-Scale Analysis of Screening Data Reveals Diverse Activity Profiles, Low Global Hit Frequency, and Many Consistently Inactive Compounds. J. Med. Chem. 2017, 60, 3879–3886. 10.1021/acs.jmedchem.7b00154.
- Gilberg E.; Jasial S.; Stumpfe D.; Dimova D.; Bajorath J. Highly Promiscuous Small Molecules from Biological Screening Assays Include Many Pan-Assay Interference Compounds but Also Candidates for Polypharmacology. J. Med. Chem. 2016, 59, 10285–10290. 10.1021/acs.jmedchem.6b01314.
- Nath A.; Atkins W. M. A Quantitative Index of Substrate Promiscuity. Biochemistry 2008, 47, 157–166. 10.1021/bi701448p.
- Srinivasan B.; Marks H.; Mitra S.; Smalley D. M.; Skolnick J. Catalytic and Substrate Promiscuity: Distinct Multiple Chemistries Catalysed by the Phosphatase Domain of Receptor Protein Tyrosine Phosphatase. Biochem. J. 2016, 473, 2165–2177. 10.1042/bcj20160289.
- Keiser M. J.; Setola V.; Irwin J. J.; Laggner C.; Abbas A. I.; Hufeisen S. J.; Jensen N. H.; Kuijer M. B.; Matos R. C.; Tran T. B.; Whaley R.; Glennon R. A.; Hert J.; Thomas K. L. H.; Edwards D. D.; Shoichet B. K.; Roth B. L. Predicting New Molecular Targets for Known Drugs. Nature 2009, 462, 175–181. 10.1038/nature08506.
- Cronstein B. N. Low-Dose Methotrexate: A Mainstay in the Treatment of Rheumatoid Arthritis. Pharmacol. Rev. 2005, 57, 163–172. 10.1124/pr.57.2.3.
- Knight Z. A.; Lin H.; Shokat K. M. Targeting the Cancer Kinome through Polypharmacology. Nat. Rev. Cancer 2010, 10, 130–137. 10.1038/nrc2787.
- Hopkins A.; Mason J.; Overington J. Can We Rationally Design Promiscuous Drugs? Curr. Opin. Struct. Biol. 2006, 16, 127–136. 10.1016/j.sbi.2006.01.013.
- Morphy R.; Rankovic Z. Designed Multiple Ligands. An Emerging Drug Discovery Paradigm. J. Med. Chem. 2005, 48, 6523–6543. 10.1021/jm058225d.
- Gupta-Ostermann D.; Hu Y.; Bajorath J. Systematic Mining of Analog Series with Related Core Structures in Multi-target Activity Space. J. Comput.-Aided Mol. Des. 2013, 27, 665–674. 10.1007/s10822-013-9671-5.
- Hu Y.; Bajorath J. SAR Matrix Method for Large-scale Analysis of Compound Structure-Activity Relationships and Exploration of Multi-Target Activity Spaces. Methods Mol. Biol. 2018, 1825, 339–352. 10.1007/978-1-4939-8639-2_11.
- de la Vega de León A.; Bajorath J. Design of a Three-Dimensional Multi-Target Activity Landscape. J. Chem. Inf. Model. 2012, 52, 2876–2883. 10.1021/ci300444p.
- Moya-García A.; Adeyelu T.; Kruger F. A.; Dawson N. L.; Lees J. G.; Overington J. P.; Orengo C.; Ranea J. A. G. Structural and Functional View of Polypharmacology. Sci. Rep. 2017, 7, 10102. 10.1038/s41598-017-10012-x.
- Haupt V. J.; Daminelli S.; Schroeder M. Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key. PLoS One 2013, 8, e65894. 10.1371/journal.pone.0065894.
- Gilberg E.; Stumpfe D.; Bajorath J. X-Ray Structure Based Identification of Compounds with Activity against Targets from Different Families and Generation of Templates for Multitarget Ligand Design. ACS Omega 2018, 3, 106–111. 10.1021/acsomega.7b01849.
- Berman H. M.; Westbrook J.; Feng Z.; Gilliland G.; Bhat T. N.; Weissig H.; Shindyalov I. N.; Bourne P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. 10.1093/nar/28.1.235.
- Waku T.; Shiraki T.; Oyama T.; Maebara K.; Nakamori R.; Morikawa K. The Nuclear Receptor PPARγ Individually Responds to Serotonin- and Fatty Acid-Metabolites. EMBO J. 2010, 29, 3395–3407. 10.1038/emboj.2010.197.
- Song J.; Liu X.; Rao T. S.; Chang L.; Meehan M. J.; Blevitt J. M.; Wu J.; Dorrestein P. C.; Milla M. E. Phenotyping Drug Polypharmacology via Eicosanoid Profiling of Blood. J. Lipid Res. 2015, 56, 1492–1500. 10.1194/jlr.m058677.
- Blanco F. J.; Guitian R.; Moreno J.; de Toro F. J.; Galdo F. Effect of Antiinflammatory Drugs on COX-1 and COX-2 Activity in Human Articular Chondrocytes. J. Rheumatol. 1999, 26, 1366–1373.
- Singh N.; Kumar R. P.; Kumar S.; Sharma S.; Mir R.; Kaur P.; Srinivasan A.; Singh T. P. Simultaneous Inhibition of Anti-Coagulation and Inflammation: Crystal Structure of Phospholipase A2 Complexed with Indomethacin at 1.4 Å Resolution Reveals the Presence of the New Common Ligand-Binding Site. J. Mol. Recognit. 2009, 22, 437–445. 10.1002/jmr.960.
- Bogdan M.; Pirnau A.; Floare C.; Bugeac C. Binding Interaction of Indomethacin with Human Serum Albumin. J. Pharm. Biomed. Anal. 2008, 47, 981–984. 10.1016/j.jpba.2008.04.003.
- Rock B. M.; Hengel S. M.; Rock D. A.; Wienkers L. C.; Kunze K. L. Characterization of Ritonavir-Mediated Inactivation of Cytochrome P450 3A4. Mol. Pharmacol. 2014, 86, 665–674. 10.1124/mol.114.094862.
- Foy M.; Sperati C. J.; Lucas G. M.; Estrella M. M. Drug Interactions and Antiretroviral Drug Monitoring. Curr. HIV/AIDS Rep. 2014, 11, 212–222. 10.1007/s11904-014-0212-1.
- Reis J.; Gaspar A.; Milhazes N.; Borges F. Chromone as a Privileged Scaffold in Drug Discovery: Recent Advances. J. Med. Chem. 2017, 60, 7941–7957. 10.1021/acs.jmedchem.6b01720.
- Yokoyama T.; Kosaka Y.; Mizuguchi M. Structural Insight into the Interactions between Death-Associated Protein Kinase 1 and Natural Flavonoids. J. Med. Chem. 2015, 58, 7400–7408. 10.1021/acs.jmedchem.5b00893.
- Srivastava S.; Somasagara R. R.; Hegde M.; Nishana M.; Tadi S. K.; Srivastava M.; Choudhary B.; Raghavan S. C. Quercetin, a Natural Flavonoid Interacts with DNA, Arrests Cell Cycle and Causes Tumor Regression by Activating Mitochondrial Pathway of Apoptosis. Sci. Rep. 2016, 6, 24049. 10.1038/srep24049.
- Xue G.; Gong L.; Yuan C.; Xu M.; Wang X.; Jiang L.; Huang M. A Structural Mechanism of Flavonoids in Inhibiting Serine Proteases. Food Funct. 2017, 8, 2437–2443. 10.1039/c6fo01825d.
- Jeoung J.-H.; Nianios D.; Fetzner S.; Dobbek H. Quercetin 2,4-Dioxygenase Activates Dioxygen in a Side-on O2-Ni Complex. Angew. Chem., Int. Ed. Engl. 2016, 55, 3281–3284. 10.1002/anie.201510741.
- Howerton S. B.; Nagpal A.; Dean Williams L. Surprising Roles of Electrostatic Interactions in DNA-Ligand Complexes. Biopolymers 2003, 69, 87–99. 10.1002/bip.10319.
- Gilberg E.; Gütschow M.; Bajorath J. X-ray Structures of Target-Ligand Complexes Containing Compounds with Assay Interference Potential. J. Med. Chem. 2018, 61, 1276–1284. 10.1021/acs.jmedchem.7b01780.
- Mordente A.; Meucci E.; Silvestrini A.; Martorana G.; Giardina B. New Developments in Anthracycline-Induced Cardiotoxicity. Curr. Med. Chem. 2009, 16, 1656–1672. 10.2174/092986709788186228.
- Motlagh N. S. H.; Parvin P.; Ghasemi F.; Atyabi F. Fluorescence Properties of Several Chemotherapy Drugs: Doxorubicin, Paclitaxel and Bleomycin. Biomed. Opt. Express 2016, 7, 2400–2406. 10.1364/boe.7.002400.
- Eicher T.; Cha H.-j.; Seeger M. A.; Brandstatter L.; El-Delik J.; Bohnert J. A.; Kern W. V.; Verrey F.; Grutter M. G.; Diederichs K.; Pos K. M. Transport of Drugs by the Multidrug Transporter AcrB Involves an Access and a Deep Binding Pocket that are Separated by a Switch-Loop. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 5687–5692. 10.1073/pnas.1114944109.
- Leung K. K. K.; Shilton B. H. Binding of DNA-Intercalating Agents to Oxidized and Reduced Quinone Reductase 2. Biochemistry 2015, 54, 7438–7448. 10.1021/acs.biochem.5b00884.
- Chen P.; Lee N. V.; Hu W.; Xu M.; Ferre R. A.; Lam H.; Bergqvist S.; Solowiej J.; Diehl W.; He Y.-A.; Yu X.; Nagata A.; VanArsdale T.; Murray B. W. Spectrum and Degree of CDK Drug Interactions Predicts Clinical Performance. Mol. Cancer Ther. 2016, 15, 2273–2281. 10.1158/1535-7163.mct-16-0300.
- Ember S. W. J.; Zhu J.-Y.; Olesen S. H.; Martin M. P.; Becker A.; Berndt N.; Georg G. I.; Schönbrunn E. Acetyl-lysine Binding Site of Bromodomain-Containing Protein 4 (BRD4) Interacts with Diverse Kinase Inhibitors. ACS Chem. Biol. 2014, 9, 1160–1171. 10.1021/cb500072z.
- RDKit: Cheminformatics and Machine Learning Software, 2013. http://www.rdkit.org (accessed Aug 01, 2018).
- OEChem TK, version 2.0.0; OpenEye Scientific Software: Santa Fe, NM, 2015.
- Berthold M. R.; Cebron N.; Dill F.; Gabriel T. R.; Kötter T.; Meinl T.; Ohl P.; Sieb C.; Thiel K.; Wiswedel B. KNIME: The Konstanz Information Miner. In Studies in Classification, Data Analysis, and Knowledge Organization; Preisach C., Burkhart H., Schmidt-Thieme L., Decker R., Eds.; Springer: Berlin, 2008; pp 319–326.
- Molecular Operating Environment (MOE), 2018.01; Chemical Computing Group ULC: 1010 Sherbrooke St. West, Suite #910, Montreal, QC, Canada, H3A 2R7, 2018.
- Feng Z.; Chen L.; Maddula H.; Akcan O.; Oughtred R.; Berman H. M.; Westbrook J. Ligand Depot: A Data Warehouse for Ligands Bound to Macromolecules. Bioinformatics 2004, 20, 2153–2155. 10.1093/bioinformatics/bth214. http://ligand-expo.rcsb.org/ (accessed September 16, 2018).
- Wang R.; Fang X.; Lu Y.; Wang S. The PDBbind Database: Collection of Binding Affinities for Protein-Ligand Complexes with Known Three-Dimensional Structures. J. Med. Chem. 2004, 47, 2977–2980. 10.1021/jm030580l.
- James C. A.; Weininger D.; Delany J. SMARTS Theory. Daylight Theory Manual; Daylight Chemical Information Systems: Laguna Niguel, CA, 2000.
- Sterling T.; Irwin J. J. ZINC 15–Ligand Discovery for Everyone. J. Chem. Inf. Model. 2015, 55, 2324–2337. 10.1021/acs.jcim.5b00559.
- Gaulton A.; Bellis L. J.; Bento A. P.; Chambers J.; Davies M.; Hersey A.; Light Y.; McGlinchey S.; Michalovich D.; Al-Lazikani B.; Overington J. P. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2011, 40, D1100. 10.1093/nar/gkr777.
- The UniProt Consortium. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 2017, 45, D158–D169. 10.1093/nar/gkw1099.
- Stumpfe D.; Dimova D.; Bajorath J. Computational Method for the Systematic Identification of Analog Series and Key Compounds Representing Series and Their Biological Activity Profiles. J. Med. Chem. 2016, 59, 7667–7676. 10.1021/acs.jmedchem.6b00906.
- Lu Y.; Wang R.; Yang C.-Y.; Wang S. Analysis of Ligand-Bound Water Molecules in High-Resolution Crystal Structures of Protein-Ligand Complexes. J. Chem. Inf. Model. 2007, 47, 668–675. 10.1021/ci6003527.
- Clark A. M.; Labute P. 2D Depiction of Protein-Ligand Complexes. J. Chem. Inf. Model. 2007, 47, 1933–1944. 10.1021/ci7001473.
- Konc J.; Česnik T.; Konc J. T.; Penca M.; Janežič D. ProBiS-Database: Precalculated Binding Site Similarities and Local Pairwise Alignments of PDB Structures. J. Chem. Inf. Model. 2012, 52, 604–612. 10.1021/ci2005687.