F1000Research. 2013 Jul 26;2:144. Originally published 2013 Jun 27. [Version 2] doi: 10.12688/f1000research.2-144.v2

High-resolution view of compound promiscuity

Ye Hu 1, Jürgen Bajorath 1,a
PMCID: PMC3799544  PMID: 24358872

Version Changes

Updated. Changes from Version 1

In version 2, three references (9, 15, and 16) have been updated. In response to the reviewer comments of Dr. Hans Matter, we now also report the results of compound promiscuity analysis for five well-known target families: G protein-coupled receptor (GPCR) class A, protein kinases, ion channels, proteases, and nuclear hormone receptors. In addition, we have determined promiscuity levels for compounds in different molecular weight ranges, as also suggested by Dr. Matter. Four new tables (3–6) have been added. Furthermore, in response to the reviewer comments of Dr. Jeremy Jenkins, we report median promiscuity rates in comparison to average rates and briefly discuss a potential relationship between privileged structures and compounds displaying intra-family promiscuity.

Abstract

Compound promiscuity is defined as the ability of a small molecule to specifically interact with multiple biological targets. So-defined promiscuity is relevant for drug discovery because it provides the molecular basis of polypharmacology, which is increasingly implicated in the therapeutic efficacy of drugs. Recent studies have analyzed different aspects of compound promiscuity on the basis of currently available activity data. In this commentary, we present take-home messages from these studies augmented with new results to generate a detailed picture of compound promiscuity that might serve as a reference for further discussions and research activities.

Introduction

Polypharmacology is an emerging theme in drug discovery [1, 2]. It is generally accepted that drugs often elicit their therapeutic effects through interactions with different targets and the ensuing modulation of multiple signaling pathways. In some therapeutic areas such as oncology, polypharmacology is heavily exploited, for example, through the use of promiscuous ATP site-directed protein kinase inhibitors [3]. In other areas, such as the treatment of infectious or chronic inflammatory diseases, achieving a high degree of target selectivity of drug candidates plays a major role.

The study of drug polypharmacology has become an important topic in pharmaceutical research [4, 5], especially focusing on combined computational and experimental analysis [5]. On the basis of drug-target networks, it was estimated early on that a drug interacts on average with approximately two targets [4]. More recent estimates from computational data analysis suggest that drugs might bind on average to two to seven targets, depending on the primary target families, and that more than 50% of current drugs might interact with more than five targets [6].

Compound promiscuity as defined herein is the origin of polypharmacology. Promiscuity analysis can be extended from drugs to bioactive compounds through computational mining of currently available activity data. The results of activity data analysis are generally affected by data incompleteness [7]. This potential influence can only be eliminated by reaching the ultimate (and probably elusive) goal of chemogenomics [8], i.e., testing all compounds against all targets. In the presence of data incompleteness, compound promiscuity rates are likely underestimated. However, it is not certain that further increasing amounts of assay data will indeed significantly alter the currently emerging view of compound promiscuity (vide infra).

Recent studies have generated a differentiated picture of compound promiscuity. The interested reader is also referred to comprehensive reviews of compound promiscuity analysis [9] and polypharmacology [6]. In this commentary, we summarize key messages from recent promiscuity analysis in a compact format. It is hoped that this summary might be helpful as a reference for further studies.

Key results of compound promiscuity analysis

The public data sources for the compound promiscuity analyses discussed herein are ChEMBL [10], the major repository of compound activity data from medicinal chemistry (as of May 2013 containing 1,295,510 compounds with a total of 11,420,351 activity annotations); the PubChem BioAssay database [11], the major repository of screening data (with more than 3300 confirmatory assays); and DrugBank [12], which currently contains 1518 approved and 5080 experimental drugs.

It is important to note that collecting all activity annotations reported for a compound in the literature, including, for example, reporter gene or other cell-based assays, provides at best a measure of assay promiscuity, not of specific interactions with different targets [9]. Therefore, it is generally required to apply data confidence criteria such as the presence of well-defined activity measurements or evidence for direct ligand-target interactions [9] (as provided in ChEMBL as activity data filters).
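The confidence criteria described above can be illustrated with a minimal sketch. The record layout, the field names (`type`, `relation`, `value_nM`, `confidence`), and the threshold of 9 as a stand-in for "highest confidence level" are assumptions made for illustration; they are not the actual database schema or the filtering code used in the cited studies.

```python
from collections import defaultdict

# Toy activity records; field names are hypothetical, not a real database schema.
RECORDS = [
    {"compound": "C1", "target": "T1", "type": "Ki", "relation": "=", "value_nM": 12.0, "confidence": 9},
    {"compound": "C1", "target": "T2", "type": "Ki", "relation": "=", "value_nM": 340.0, "confidence": 9},
    {"compound": "C1", "target": "T3", "type": "IC50", "relation": "=", "value_nM": 95.0, "confidence": 9},
    {"compound": "C2", "target": "T1", "type": "Ki", "relation": ">", "value_nM": 10000.0, "confidence": 9},
    {"compound": "C2", "target": "T4", "type": "Ki", "relation": "=", "value_nM": 8.0, "confidence": 9},
    {"compound": "C3", "target": "T5", "type": "Ki", "relation": "=", "value_nM": 50.0, "confidence": 4},
]


def is_high_confidence(record, measurement, min_confidence=9):
    """Keep explicit measurements of one type with a direct-interaction confidence level."""
    return (
        record["type"] == measurement
        and record["relation"] == "="          # explicit value, not a ">" or "<" bound
        and record["value_nM"] is not None
        and record["confidence"] >= min_confidence  # assumed placeholder threshold
    )


def targets_per_compound(records, measurement):
    """Map each compound to the set of targets supported by qualifying records."""
    targets = defaultdict(set)
    for record in records:
        if is_high_confidence(record, measurement):
            targets[record["compound"]].add(record["target"])
    return dict(targets)


# Ki- and IC50-based subsets are kept separate, as in the analyses summarized here.
ki_targets = targets_per_compound(RECORDS, "Ki")
ic50_targets = targets_per_compound(RECORDS, "IC50")
```

In this toy data set, compound C3 drops out of the Ki subset because of its low confidence value, and the ">" record for C2 is not counted as a target annotation.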

Activity measurement dependence

When monitoring the growth of compound activity data in ChEMBL over a period of more than two years from its original release (January 2010) to release 13 (May 2012), a significant increase in the number of promiscuous compounds was detected [13]. However, by quantifying compound-based target relationships, it was determined that the increase in compounds with activity against targets from different families was largely due to (assay-dependent) IC50 measurements, rather than (assay-independent) equilibrium constants (Ki values) [13]. IC50 values are easier to determine than Ki values and provide the readout of most primary biochemical assays (except single-point screening assays), which might at least in part rationalize greater target coverage and the IC50-dependent increase in compound promiscuity across different families. However, it also cannot be excluded that apparent promiscuity in different assays is higher on the basis of IC50 measurements, given their assay dependence (and often limited accuracy). Regardless, the type of activity measurements taken into account influences the outcome of promiscuity analysis. Thus, clear specification of activity measurements and data selection criteria is required.

The subset of compounds with available Ki measurements from ChEMBL release 13 was further investigated. On the basis of Ki measurements, approximately 62% of all compounds were annotated with only a single target, ~36% with two or more targets from the same family, and only ~2% of all active compounds with multiple targets from different families [14]. A promiscuous bioactive compound was found to interact on average with two to three targets.

Accordingly, compounds that display intra-family promiscuity might also be considered as candidates for privileged structures/compounds that are preferentially active against targets from a particular family. Therefore, these compounds can be distinguished from those that are promiscuous across different target families.
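The distinction between single-target compounds, intra-family promiscuity, and promiscuity across families can be sketched as follows. The function name, the class labels, and the toy target-to-family assignments are hypothetical and serve only to make the three categories explicit.

```python
from collections import Counter


def promiscuity_class(target_to_family):
    """Classify one compound from a dict mapping each of its targets to a target family."""
    n_targets = len(target_to_family)
    n_families = len(set(target_to_family.values()))
    if n_targets < 2:
        return "single-target"
    if n_families == 1:
        return "intra-family"   # candidate for a privileged structure
    return "cross-family"       # promiscuous across different target families


# Toy annotations (illustrative only).
compounds = {
    "C1": {"T1": "kinase"},
    "C2": {"T1": "kinase", "T2": "kinase", "T3": "kinase"},
    "C3": {"T1": "kinase", "T9": "protease"},
    "C4": {"T7": "GPCR class A"},
}

class_counts = Counter(promiscuity_class(t) for t in compounds.values())
```

Dividing each entry of `class_counts` by the number of compounds gives proportions of the kind quoted above (~62%, ~36%, and ~2% for the Ki-based subset).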

Activity data from different sources

One might anticipate that the degree of compound promiscuity would be particularly high in screening assays (even if frequent hitters and other non-specific compounds are excluded). Therefore, 1085 confirmatory bioassays from PubChem were systematically analyzed. It was found that ~77% of all confirmed active compounds were tested in more than 50 different assays [15]. Thus, these active PubChem compounds provided a sound basis for promiscuity assessment. The results were in part surprising. An active PubChem compound had a ~50% probability of interacting with two or more targets. The probability of interacting with more than five targets was only ~8%. On average, a PubChem screening hit was active against 2.5 targets. For comparison, compounds from the IC50- and Ki-based subsets of ChEMBL release 14 (August 2012) interacted on average with 1.4 and 1.7 targets, respectively [15]. The comparably low rates observed for both compound subsets indicated that IC50 measurements did not systematically increase promiscuity rates (vide supra). The analysis of active compounds from PubChem confirmatory assays provided an upper-level estimate of promiscuity, which was not significantly higher than that for ChEMBL compounds.

Prevalent promiscuity profile

Detailed analysis of compound activity data from ChEMBL release 14 (August 2012) has made it possible to derive a promiscuity profile that is most characteristic of bioactive compounds from medicinal chemistry sources. The majority of currently available promiscuous compounds is active in the sub-µM range against two to five targets from the same family and displays potency differences against these targets within one or two orders of magnitude [16]. An important aspect of this representative profile is that promiscuity does not imply low potency. Furthermore, compounds that are highly potent against a (primary) target and weakly potent against others are not frequently found [16].
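To make the prevalent profile concrete, the following sketch checks whether a compound matches it. The thresholds are read directly from the description above (two to five targets, one family, potency below 1 µM, potency span of at most two orders of magnitude); the function names and the choice to express potency in nM are assumptions for illustration, not the procedure of the cited study.

```python
import math


def potency_span_log_units(potencies_nM):
    """Difference between weakest and strongest potency in orders of magnitude."""
    values = list(potencies_nM)
    return math.log10(max(values) / min(values))


def matches_prevalent_profile(targets):
    """targets: list of (family, potency_nM) tuples for one compound."""
    families = {family for family, _ in targets}
    potencies = [potency for _, potency in targets]
    return (
        2 <= len(targets) <= 5                         # two to five targets
        and len(families) == 1                         # all from the same family
        and all(p < 1000.0 for p in potencies)         # sub-µM potency
        and potency_span_log_units(potencies) <= 2.0   # within two orders of magnitude
    )


# Toy compound: potencies of 5, 50, and 400 nM span log10(400/5), about 1.9 log units.
typical = [("kinase", 5.0), ("kinase", 50.0), ("kinase", 400.0)]
# Toy compound with one highly potent and one weak interaction (about 2.95 log units).
uneven = [("kinase", 1.0), ("kinase", 900.0)]
```

The second toy compound corresponds to the "highly potent against a primary target, weakly potent against others" pattern that, according to the text, is not frequently found.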

Up-to-date promiscuity rates

In Table 1, current average promiscuity rates are summarized for compounds from ChEMBL, PubChem, and DrugBank. For promiscuity assessment of drugs, all targets reported in DrugBank were considered.

Table 1. Average promiscuity of different compound categories.

Compound categories                     Subset         Avg. # targets/compound
ChEMBL 14/all bioactive compounds       Ki             1.7
ChEMBL 14/all bioactive compounds       IC50           1.4
DrugBank/drugs                          Approved       5.9
DrugBank/drugs                          Experimental   1.8
PubChem/active compounds                -              2.5
ChEMBL 14/promiscuous compounds         Ki             2.9
ChEMBL 14/promiscuous compounds         IC50           2.7
DrugBank/promiscuous drugs              Approved       6.9
DrugBank/promiscuous drugs              Experimental   4.7
PubChem/promiscuous active compounds    -              3.7

The average number of targets is reported for compounds from ChEMBL release 14 (divided into Ki and IC50 value-based subsets), approved or experimental drugs from DrugBank 3.0, and active compounds from PubChem confirmatory bioassays. Corresponding statistics are provided in italics for promiscuous compounds (having two or more target annotations). For compounds from ChEMBL, only high-confidence activity annotations were taken into account (i.e., explicit activity measurements with the highest confidence level of direct ligand-target interactions). For calculations on drugs, all DrugBank target categories were taken into account.

If all compounds with single or multiple target annotations are analyzed, ChEMBL compounds interact on average with one to two targets and PubChem compounds with two to three. However, approved drugs have on average close to six targets. In contrast, the degree of promiscuity of experimental drugs is considerably lower, with fewer than two targets per drug candidate. If only promiscuous compounds or drugs are taken into account (i.e., if compounds with single target annotations are excluded), promiscuity rates increase only slightly, by about one target per compound, the exception being experimental drugs, whose average number of targets increases from 1.8 to 4.7. Furthermore, median promiscuity rates were also calculated for promiscuous compounds from different sources, i.e., ChEMBL compounds with activity against at least two targets (Ki and IC50), approved and experimental drugs annotated with more than four or at least two targets, respectively, and PubChem compounds active against at least three targets. Compared to the average promiscuity rates reported in Table 1, the median rates were consistently lower. However, the differences between the average and median rates were small, i.e., less than one for ChEMBL and PubChem compounds. By contrast, differences were larger than one for approved and experimental drugs, i.e., on the basis of median rates, drug target numbers were reduced by 1.9 and 2.7, respectively. Hence, average promiscuity rates for drugs were likely biased by highly promiscuous drugs.
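The effect of a few highly promiscuous entries on the average, but not on the median, can be illustrated with a small sketch. The target counts below are invented toy values chosen only to show the skew; they are not taken from ChEMBL, PubChem, or DrugBank.

```python
from statistics import mean, median


def mean_median_gap(target_counts):
    """Average minus median number of targets; large values indicate a skewed distribution."""
    return mean(target_counts) - median(target_counts)


# Toy target counts per promiscuous compound (illustrative values only).
evenly_spread = [2, 2, 2, 3, 3]    # mean 2.4, median 2
with_outlier = [2, 2, 2, 3, 31]    # one highly promiscuous entry: mean 8, median 2

gap_even = mean_median_gap(evenly_spread)
gap_outlier = mean_median_gap(with_outlier)
```

A single entry with many target annotations raises the gap from 0.4 to 6 in this toy example, which is the kind of bias the comparison of average and median rates is meant to reveal.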

In Table 2, the probability of promiscuity is reported for compounds from different sources (calculated from target distributions of compounds). For a ChEMBL compound with available IC50 and Ki measurements, the current probability of activity against two or more targets is ~25% and ~38%, respectively (if both IC50 and Ki measurements were available for a compound, they were separately considered). However, for activity against more than five targets, the probabilities are reduced to only ~1%. Similar observations are made for confirmed PubChem screening hits (providing an upper-limit promiscuity assessment for bioactive compounds, vide supra). In this case, the probability of activity against two or more, or against more than five targets is ~51% and ~8%, respectively. Furthermore, the probability of promiscuity of approved drugs from DrugBank is ~84% and the probability of interacting with more than five targets is still ~37%. For experimental drugs, the corresponding probabilities are much lower, with only ~24% and ~3%, respectively.

Table 2. Probability of promiscuity.

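Compound categories                  Subset         # Targets   Probability (%)
ChEMBL 14/all bioactive compounds    Ki             ≥ 2         37.9
ChEMBL 14/all bioactive compounds    Ki             > 5         1.2
ChEMBL 14/all bioactive compounds    IC50           ≥ 2         24.7
ChEMBL 14/all bioactive compounds    IC50           > 5         0.8
DrugBank/drugs                       Approved       ≥ 2         84.1
DrugBank/drugs                       Approved       > 5         37.4
DrugBank/drugs                       Experimental   ≥ 2         23.6
DrugBank/drugs                       Experimental   > 5         3.4
PubChem/active compounds             -              ≥ 2         50.9
PubChem/active compounds             -              > 5         7.6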

For different compound categories and activity measurements, the probability of a compound to be active against two or more targets or more than five targets is reported.
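The quantities in Tables 1 and 2 follow from the distribution of target counts per compound. A minimal sketch, assuming each compound is represented only by its number of annotated targets (at least one), is given below; the function names and the toy counts are hypothetical.

```python
def promiscuity_summary(target_counts):
    """Average promiscuity and probability of promiscuity from per-compound target counts."""
    counts = list(target_counts)
    n = len(counts)
    promiscuous = [c for c in counts if c >= 2]   # two or more target annotations
    return {
        "avg_all": sum(counts) / n,
        "avg_promiscuous": sum(promiscuous) / len(promiscuous) if promiscuous else None,
        "p_ge_2": 100.0 * len(promiscuous) / n,                    # percent with >= 2 targets
        "p_gt_5": 100.0 * sum(1 for c in counts if c > 5) / n,     # percent with > 5 targets
    }


def average_from_promiscuous(p_ge_2_percent, avg_promiscuous):
    """Overall average implied when every non-promiscuous compound has exactly one target."""
    p = p_ge_2_percent / 100.0
    return (1.0 - p) * 1.0 + p * avg_promiscuous


# Toy distribution: six single-target compounds and four promiscuous ones.
summary = promiscuity_summary([1, 1, 1, 1, 1, 1, 2, 2, 3, 7])
```

As a consistency check on the reported numbers, `average_from_promiscuous(37.9, 2.9)` gives about 1.72 for the Ki-based ChEMBL subset, in line with the average of 1.7 in Table 1.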

Compound promiscuity for different target families

All available compounds active against targets belonging to five target families, namely G protein-coupled receptor (GPCR) class A, protein kinases, ion channels, proteases, and nuclear hormone receptors, were assembled from ChEMBL release 14 and separated into Ki and IC50 value-based subsets, as described above. Average promiscuity rates were calculated for all compounds active against a given family as well as for compounds active against multiple targets within the family, as reported in Table 3. With the exception of the Ki subset of the ion channel family, promiscuity degrees for compounds active against these target families were similar to those reported in Table 1. In Table 4, the probability of promiscuity (i.e., activity against at least two or more than five targets) is reported for compounds active against these families (in analogy to Table 2). Similar observations were made. A significant relative increase (~10%) in the probability of promiscuity was only observed for compounds active against two or more targets from the nuclear receptor family on the basis of the IC50 subset. Thus, for prominent target families, no above-average compound promiscuity rates were detected.

Table 3. Average promiscuity of compounds active against prominent target families.

                                                            Avg. # targets/compound
Measurement   Family              # Targets   # Compounds   All    Promiscuous
Ki            GPCR class A        121         21,754        1.7    2.8
Ki            Kinases             74          1151          1.4    2.4
Ki            Ion channels        22          1086          1.2    3.9
Ki            Proteases           90          4488          1.5    2.8
Ki            Nuclear receptors   13          901           1.4    2.6
IC50          GPCR class A        135         16,968        1.3    3.0
IC50          Kinases             192         13,316        1.3    2.7
IC50          Ion channels        52          4150          1.1    2.2
IC50          Proteases           108         11,833        1.6    3.0
IC50          Nuclear receptors   26          3782          1.4    2.1

From ChEMBL release 14, Ki and IC50 value-based compound subsets active against targets belonging to five prominent families were collected. The average number of targets is reported for compounds from individual target families. In addition, corresponding statistics are provided (in italics) for promiscuous compounds only (i.e., compounds having two or more target annotations within the family).

Table 4. Probability of promiscuity for compounds active against prominent target families.

                    Probability (%)
                    Ki                          IC50
Family              ≥ 2 Targets   > 5 Targets   ≥ 2 Targets   > 5 Targets
GPCR class A        39.1          0.8           16.0          0.8
Kinases             26.3          0.4           20.1          1.0
Ion channels        8.4           0.1           5.3           0.02
Proteases           27.5          0.5           27.2          1.0
Nuclear receptors   24.9          1.2           34.8          0.03

For compounds active against different target families, the probability of promiscuity (activity against two or more targets and activity against more than five targets) is reported.

Promiscuity of compounds with increasing molecular weight

The degree of promiscuity was also determined for compounds of different size, i.e., molecular weight (MW). Seven subsets of compounds with increasing MW were collected from ChEMBL release 14 and organized into Ki and IC50 value-based subsets, as reported in Table 5. Average promiscuity rates of compounds with increasing MW were found to be comparable to the global rates. However, a significant relative increase in promiscuity was observed for the smallest compounds with MW ≤ 200 in the Ki subset. Furthermore, the probability of activity against two or more targets also increased by more than 10% for the smallest compounds in both subsets, as reported in Table 6. For larger compounds across all MW ranges, no significant increases in promiscuity were observed compared to the global degree and probability of compound promiscuity reported in Table 1 and Table 2, respectively.

Table 5. Average promiscuity of compounds with different molecular weight.

                                          Avg. # targets/compound
Measurement   MW range     # Compounds   All    Promiscuous
Ki            ≤ 200        786           2.7    4.1
Ki            (200, 300)   3949          2.0    3.2
Ki            (300, 400)   10,913        1.8    2.9
Ki            (400, 500)   11,501        1.6    2.7
Ki            (500, 600)   6015          1.6    2.8
Ki            (600, 700)   1878          1.6    3.0
Ki            > 700        1497          1.7    2.8
IC50          ≤ 200        1022          1.6    2.8
IC50          (200, 300)   9627          1.6    2.8
IC50          (300, 400)   25,190        1.4    2.7
IC50          (400, 500)   26,358        1.4    2.6
IC50          (500, 600)   12,534        1.4    2.6
IC50          (600, 700)   3247          1.3    2.8
IC50          > 700        2495          1.4    2.9

From ChEMBL release 14, compounds were selected and divided into seven subsets with increasing MW. The average number of targets is reported for compounds in all MW ranges. In addition, corresponding statistics are provided (in italics) for promiscuous compounds only (i.e., compounds having two or more target annotations).

Table 6. Probability of promiscuity of compounds with different molecular weight.

             Probability (%)
             Ki                          IC50
MW range     ≥ 2 Targets   > 5 Targets   ≥ 2 Targets   > 5 Targets
≤ 200        53.8          9.9           36.0          2.4
(200, 300)   44.9          2.4           32.7          0.9
(300, 400)   39.4          1.2           26.0          0.9
(400, 500)   37.3          0.7           22.8          0.7
(500, 600)   31.5          0.6           22.1          0.7
(600, 700)   30.7          0.9           17.6          0.7
> 700        38.3          1.2           18.9          0.3

For compounds with different molecular weight, the probability of promiscuity (activity against two or more targets and activity against more than five targets) is reported.
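The MW-dependent statistics in Tables 5 and 6 amount to grouping compounds into the seven MW ranges and computing the same rates per group. A minimal sketch follows; it assumes right-closed intervals (e.g., a compound with MW exactly 300 falls into the 200 to 300 range), which the tables do not state explicitly, and all names and toy values are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

# Upper bounds of the first six MW ranges; everything above 700 forms the last range.
MW_EDGES = [200, 300, 400, 500, 600, 700]
MW_LABELS = ["<= 200", "(200, 300]", "(300, 400]", "(400, 500]",
             "(500, 600]", "(600, 700]", "> 700"]


def mw_bin(mw):
    """Return the label of the MW range, treating intervals as right-closed (an assumption)."""
    return MW_LABELS[bisect_left(MW_EDGES, mw)]


def promiscuity_by_mw(compounds):
    """compounds: iterable of (molecular_weight, number_of_targets) pairs."""
    grouped = defaultdict(list)
    for mw, n_targets in compounds:
        grouped[mw_bin(mw)].append(n_targets)
    return {
        label: {
            "n_compounds": len(counts),
            "avg_targets": sum(counts) / len(counts),
            "p_ge_2": 100.0 * sum(1 for c in counts if c >= 2) / len(counts),
        }
        for label, counts in grouped.items()
    }


# Toy input (illustrative values only).
stats = promiscuity_by_mw([(150.0, 1), (180.0, 3), (450.0, 1)])
```

Applying the same function separately to Ki- and IC50-based target counts would yield one block of rows per measurement type, mirroring the layout of Tables 5 and 6.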

Conclusions

Herein, we have provided a detailed and up-to-date view of compound promiscuity, the molecular basis of polypharmacology. For active compounds from medicinal chemistry and biological screening sources, the degree of promiscuity is lower than for drugs. There is a notable increase in promiscuity from bioactive compounds over drug candidates to approved drugs. The exploration of possible reasons for this apparent "promiscuity enrichment" along the drug discovery pathway should provide interesting opportunities for future research. On the basis of currently available high-confidence activity data, promiscuity of bioactive compounds is limited (and very low across different target families). However, if compounds are promiscuous, they typically bind to their targets with relatively high potency. Given the overall low degree of promiscuity of bioactive compounds, including screening hits, in the presence of nearly exponential data growth in recent years, it remains an open question whether future chemogenomics efforts might substantially change the current picture of compound promiscuity (vide supra). The majority of available bioactive compounds have single target annotations, and we believe it is unlikely that most of them will display a high degree of currently undiscovered promiscuity. Hence, we would also conclude that the target specificity paradigm that has long dominated small molecule discovery efforts should continue to play a major role, despite emerging "anti-reductionism" and the increasing focus on phenotypic readouts.

Funding Statement

The author(s) declared that no grants were involved in supporting this work.

v2; ref status: indexed

References

  • 1. Paolini GV, Shapland RH, van Hoorn WP, et al.: Global mapping of pharmacological space. Nat Biotechnol. 2006;24(7):805–815. doi: 10.1038/nbt1228
  • 2. Boran AD, Iyengar R: Systems approaches to polypharmacology and drug discovery. Curr Opin Drug Discov Devel. 2010;13(3):297–309.
  • 3. Knight ZA, Lin H, Shokat KM: Targeting the cancer kinome through polypharmacology. Nat Rev Cancer. 2010;10(2):130–137. doi: 10.1038/nrc2787
  • 4. Yildirim MA, Goh KI, Cusick ME, et al.: Drug-target network. Nat Biotechnol. 2007;25(10):1119–1126. doi: 10.1038/nbt1338
  • 5. Lounkine E, Keiser MJ, Whitebread S, et al.: Large-scale prediction and testing of drug activity on side-effect targets. Nature. 2012;486(7403):361–367. doi: 10.1038/nature11159
  • 6. Jalencas X, Mestres J: On the origins of drug polypharmacology. Med Chem Commun. 2013;4(1):80–87. doi: 10.1039/C2MD20242E
  • 7. Mestres J, Gregori-Puigjané E, Valverde S, et al.: Data completeness--the Achilles heel of drug-target networks. Nat Biotechnol. 2008;26(9):983–984. doi: 10.1038/nbt0908-983
  • 8. Rognan D: Chemogenomic approaches to rational drug design. Br J Pharmacol. 2007;152(1):38–52. doi: 10.1038/sj.bjp.0707307
  • 9. Hu Y, Bajorath J: Compound promiscuity: what can we learn from current data? Drug Discov Today. 2013;18(13–14):644–650. doi: 10.1016/j.drudis.2013.03.002
  • 10. Gaulton A, Bellis LJ, Bento AP, et al.: ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40(Database issue):D1100–D1107. doi: 10.1093/nar/gkr777
  • 11. Wang Y, Xiao J, Suzek TO, et al.: PubChem’s bioassay database. Nucleic Acids Res. 2012;40(Database issue):D400–D412. doi: 10.1093/nar/gkr1132
  • 12. Knox C, Law V, Jewison T, et al.: DrugBank 3.0: a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res. 2011;39(Database issue):D1035–41. doi: 10.1093/nar/gkq1126
  • 13. Hu Y, Bajorath J: Growth of ligand-target interaction data in ChEMBL is associated with increasing and activity measurement-dependent compound promiscuity. J Chem Inf Model. 2012;52(10):2550–2558. doi: 10.1021/ci3003304
  • 14. Hu Y, Bajorath J: How promiscuous are pharmaceutically relevant compounds? A data-driven assessment. AAPS J. 2013;15(1):104–111. doi: 10.1208/s12248-012-9421-y
  • 15. Hu Y, Bajorath J: What is the likelihood of an active compound to be promiscuous? Systematic assessment of compound promiscuity on the basis of PubChem confirmatory bioassay data. AAPS J. 2013;15(3):808–15. doi: 10.1208/s12248-013-9488-0
  • 16. Hu Y, Bajorath J: Promiscuity profiles of bioactive compounds: potency range and difference distributions and the relation to target numbers and families. Med Chem Commun. 2013;4:1196–1201, in press. doi: 10.1039/C3MD00105A
F1000Res. 2013 Aug 1.

Referee response for version 2

Stefan Laufer 1

As already said, this is a very valid approach for my (kinase) community to deal with multi-target compounds.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2013 Jul 29.

Referee response for version 2

Jeremy Jenkins 1

My questions from v1 were sufficiently answered.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2013 Jul 16.

Referee response for version 1

Jeremy Jenkins 1

Hu and Bajorath’s update on compound promiscuity in public compound bioactivity databases is timely with the increasing cognizance of polypharmacology and its role in the efficacy and safety of drugs. The article aims to raise reader awareness and present new questions to be answered, and so the title, abstract, and content are appropriate. All data are freely available for download for primary sources mentioned.

The conclusions are fair and unbiased.

Additional questions do arise from this survey. First, does the average number of targets per compound differ from the median (or do highly promiscuous compounds skew the average)?

Second, is it reasonable to begin distinguishing promiscuous from privileged compounds? For example, by incorporating target class information, staurosporine might be viewed differently from quercetin, where the former represents a highly privileged scaffold among kinases and the latter displays IC50 values against an abundance of target types.

Third, the drug discovery field needs to understand whether the "promiscuity enrichment" that occurs between the screening hit phase and the marketed drug phase largely reflects the depth of bioactivity data coverage for drugs, as drugs are highly profiled globally. The hit rates of drugs and medchem compounds across the same set of assays and targets would be needed to definitively conclude that drugs are more promiscuous.

However, the apparent increased promiscuity of drugs supports the growing resurgence of phenotypic screening, the impetus for exploring compound combinations in the context of multiple genotypes, and begs the question of how medchem optimization of multiple targets should be attacked.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2013 Jul 9.

Referee response for version 1

Hans Matter 1

This interesting manuscript presents a view on compound promiscuity based on in-vitro data and the number of potential targets per compound in public databases such as ChEMBL, PubChem and DrugBank. In particular, the authors investigate and challenge the notion that most compounds in lead finding today are active against a large multitude of biological targets. The title is appropriate for this contribution and the abstract sufficiently summarizes this study. The conclusions are balanced and justified on the basis of the data analysis; this is therefore an essential view on the number of targets.

It is an interesting observation from this study that DrugBank annotated drugs appear to interact with a higher number of molecular targets compared to early phase compounds in ChEMBL or PubChem. Any interpretation of this finding should be treated with caution, but it is tempting to discuss it from a partially historical view, as DrugBank may be enriched with older drugs that would have been subjected to less strict requirements for in-vitro selectivity than in today’s drug discovery. In addition, during and after approval, drugs may have been tested in more profiling assays than is the case with earlier screening-type substances.

Following the authors, this interesting argument also supports target-specific drug discovery paradigms used in past years. However, working with public databases leads to many caveats, all of which have been pointed out earlier, e.g. incompleteness of the data matrix and differences of data from different sources. It might be interesting for future investigations to cross-check this conclusion for compounds targeting families like kinases or GPCRs. Due to the challenges of inherent selectivity in those families one could expect a larger percentage of promiscuous compounds. The same discrimination might possibly be true for smaller versus larger compounds.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2013 Jul 1.

Referee response for version 1

Stefan Laufer 1

Excellent work. My field (“the kinase community”) will benefit a lot from this commentary as compound promiscuity is an issue.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


