Abstract
In drug discovery, polypharmacology encompasses the use of small molecules with defined multi‐target activity and in vivo effects resulting from multi‐target engagement. Multi‐target compounds are often efficacious in the treatment of complex diseases involving target and pathway networks, but might also elicit unwanted side effects. Computational approaches such as target prediction or multi‐target ligand design have been used to support polypharmacological drug discovery. In addition to efforts directed at the identification or design of new multi‐target compounds, other computational investigations have aimed to differentiate such compounds from potential false‐positives or explore the molecular basis of multi‐target activities. Herein, a concise overview of the field is provided and recent advances in computational polypharmacology through machine learning are discussed.
Keywords: Polypharmacology, multi-target compounds, medicinal chemistry, computational methods, molecular design, explainable machine learning
Explaining multi‐target activity predictions. Shown is a multi‐target (promiscuous) compound correctly predicted by explainable machine learning. Structural features driving the prediction are determined and mapped onto the compound.
1. Introduction
For the treatment of multi‐factorial diseases, the polypharmacology concept has come of age in drug discovery. [1, 2, 3, 4, 5] Polypharmacology refers to the use of compounds that elicit a therapeutic effect through the engagement of multiple targets and the ensuing pharmacological consequences. [1, 2] The success of multi‐target (promiscuous) drugs in therapeutic areas such as oncology, [3] neurodegenerative diseases, [4] or metabolic diseases [5] is attributable to the presence of pharmacological networks, in which multiple signaling and/or metabolic pathways provide functional links between different targets. [6, 7] In such networks, out‐of‐control pathways might give rise to multi‐factorial diseases, the treatment of which requires therapeutic intervention at more than one target, preferably with individual drugs instead of combination therapies. For the use of such drugs, promiscuous kinase inhibitors in oncology have become a paradigm, [3] partly catalyzing polypharmacological drug discovery efforts in other areas. Furthermore, analyzing single‐ versus multi‐target activity of drugs and searching for drug targets other than those primarily intended provides the basis for drug repurposing, [8] another approach falling into the polypharmacology spectrum that has also gained in popularity over the past two decades. Notably, while the extent of polypharmacology among current drugs is yet to be fully determined, single‐target drugs continue to be highly desirable for many therapeutic applications such as the treatment of chronic diseases. Polypharmacology is complementing but not replacing target‐specific drug discovery and design.
Exploring and exploiting polypharmacology rigorously depends on the availability of multi‐target compounds (MT‐CPDs). In biological screening and medicinal chemistry, such compounds might be discovered serendipitously, for example, as hits in multiple screens, target profiling campaigns, or phenotypic assays. Moreover, finding compounds that are active against pre‐defined target combinations is of prime interest in polypharmacological drug discovery, naturally leading to multi‐target compound design at varying levels of sophistication. For such efforts, understanding the molecular basis of multi‐target activity and investigating how MT‐CPDs might be distinguished from corresponding single‐target compounds (ST‐CPDs) are critical steps – and also scientifically stimulating tasks, for which computational approaches have been increasingly considered. For example, many computational studies in polypharmacology have focused on different models for target prediction, [9] being equally relevant for the discovery of MT‐CPDs and drug repurposing. Early approaches have employed series of single‐target models to computationally screen compounds for potential multi‐target activity. Such models were typically derived using various machine learning (ML) methods to distinguish compounds with activity against a given target from randomly selected compounds. In addition, chemogenomic or proteochemometric ML models were trained that combined compound and protein descriptors to distinguish true ligand‐target pairs from false pairs, providing higher‐level predictions compared to arrays of single‐target classifiers. Furthermore, with the advent of deep learning, multi‐task learning schemes have been applied to predict multi‐target compounds, as further discussed below.
Other types of investigations have concentrated on systematic analysis of growing amounts of compound activity data to identify MT‐CPDs. [10] However, given the need for new compounds with pre‐defined multi‐target activity for polypharmacology, ligand‐based design of MT‐CPDs has been a major focal point.
Notably, compound promiscuity also has negative connotations. The term promiscuity is frequently used to refer to compounds that have assay interference potential and might cause false‐positive assay signals. [11] For the detection of potential assay interference compounds, catalogs of medicinal chemistry rules have been compiled defining structural motifs implicated in assay interference. Rule‐based approaches are typically used as computational filters to flag questionable compounds. [11]
2. Practical Compound Design
In medicinal chemistry, robust and chemically intuitive compound design strategies are highly desirable. In practice, prospective design of MT‐CPDs for polypharmacology is typically confined to two or three targets. MT‐CPD design can be based on scaffolds identified in compounds with activity against different targets, [12] which might include privileged substructures. [13] Such scaffold‐based design strategies, which also include merging of substructures from different ligands, are largely driven by medicinal chemistry knowledge. Furthermore, given the central role of pharmacophore modeling for hypothesis generation in medicinal chemistry, [14] MT‐CPD design has been investigated by merging or fusing different pharmacophores [15] and parallel or sequential screening of test compounds using pharmacophore models of different activities. [16] In addition to prospective scaffold‐ or pharmacophore‐based design strategies for MT‐CPDs, which are readily applicable in the practice of medicinal chemistry, other computational approaches have been reported that are relevant for polypharmacology from different points of view, many of which employ ML. In the following, exemplary studies are discussed that mirror current trends in the field.
3. Trends in Computational Polypharmacology
Compounds with multiple positive assay readouts might be MT‐CPDs or false‐positives, due to the presence of non‐specific interactions or assay interference. There are many different mechanisms potentially causing assay interference. Regardless of the mechanism, false‐positive compounds faking multi‐target activities represent a major caveat for polypharmacology. Therefore, a variety of computational models have been developed to detect interference compounds in biological screens. Among these is Hit Dexter 2.0, a web platform for the identification of frequent hitters irrespective of the underlying mechanism of action, hence including both MT‐CPDs and false‐positives. [17] The models constituting the platform are extra tree classifiers, [18] trained on primary screening and confirmatory assay data to distinguish between compounds with high and low assay hit rates. Cross‐validation revealed generally high predictivity of the classifiers. However, with increasing distance of test compounds to the training set, the rate of misclassifications increased (similar to other ML models). Hit Dexter 2.0 was used to search for frequent hitters among approved drugs. Depending on the model, 4.9–14.4 % of test drugs were classified as frequent hitters. [17] Although some drugs are known to have assay interference potential, [11] a number of drugs classified as frequent hitters might have as of yet undetected multi‐target activities and thus act through polypharmacology.
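The labeling step underlying frequent-hitter classification can be sketched as follows. This is a minimal illustration and not the Hit Dexter 2.0 protocol: the function names, the minimum number of assays per compound, and the hit-rate cutoffs are hypothetical placeholders chosen for this example. Compounds labeled in this way could then serve as training data for a classifier.

```python
from collections import defaultdict


def assay_hit_rates(records, min_assays=3):
    """Compute per-compound hit rates from (compound, assay, is_active) records.

    Compounds tested in fewer than `min_assays` distinct assays are skipped
    because their hit rate would be unreliable (assumed cutoff).
    """
    tested = defaultdict(set)
    active = defaultdict(set)
    for compound, assay, is_active in records:
        tested[compound].add(assay)
        if is_active:
            active[compound].add(assay)
    return {
        cpd: len(active[cpd]) / len(assays)
        for cpd, assays in tested.items()
        if len(assays) >= min_assays
    }


def label_by_hit_rate(hit_rates, low=0.1, high=0.5):
    """Label compounds as 'frequent' or 'infrequent' hitters.

    Compounds with intermediate hit rates get None and would be left out
    of a training set. Both cutoffs are illustrative assumptions.
    """
    labels = {}
    for cpd, rate in hit_rates.items():
        if rate >= high:
            labels[cpd] = "frequent"
        elif rate <= low:
            labels[cpd] = "infrequent"
        else:
            labels[cpd] = None
    return labels


# Toy screening records (invented for illustration only).
toy_records = [
    ("cpd_A", "assay_1", True), ("cpd_A", "assay_2", True),
    ("cpd_A", "assay_3", False), ("cpd_A", "assay_4", True),
    ("cpd_B", "assay_1", False), ("cpd_B", "assay_2", False),
    ("cpd_B", "assay_3", False),
    ("cpd_C", "assay_1", True),  # tested only once, so it is skipped
]
toy_labels = label_by_hit_rate(assay_hit_rates(toy_records))
```

Note that a high hit rate alone does not reveal whether a compound is a true MT‐CPD or a false‐positive, which is why such labels are mechanism-agnostic.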
In a complementary approach, a neural network was trained to separate compounds with sub‐micromolar activity against targets from at least three different classes (promiscuous/multi‐target compounds) from others flagged by multiple rule‐based filters for potential assay interference (putative false‐positives). [19] The binary classifier was then used to screen DrugBank, [20] where 7 % of listed compounds were predicted to have multi‐target activity, similar to the results of Hit Dexter 2.0. In addition, a virtual combinatorial compound library was evaluated. From top scoring candidates, an exemplary compound was synthesized and tested against 63 targets, displaying activity against eight targets from two different classes. [19]
Multi‐target predictions can be further extended by computational target profiling. While early studies have virtually screened test compounds against arrays of target‐based activity prediction models, deep learning enables the generation of multi‐task models exploiting potential synergies between targets. For example, Li et al. introduced a multi‐task model for a panel of 391 kinases that was trained on more than 30,000 kinase inhibitors. [21] The model displayed high performance in cross‐validation trials and on external validation sets. It was then applied to predict the kinase profile of five structurally diverse inhibitors, yielding high correlation between predictions and subsequently conducted experiments. [21]
Such studies indicate the potential of ML for polypharmacology predictions. However, novel drug‐target interactions can also be uncovered using simplistic molecular similarity analysis. For example, systematic comparison of ligands from co‐crystal structures with different proteins revealed that pairs of ligands with highest shape similarity of their bound conformations had a high probability of target cross‐activity. [22] Only a small percentage of the postulated target cross‐activities were found to be reported in public databases, leaving many pairs of similar compounds for the assessment of putative cross‐target interactions. An exemplary compound with high shape similarity to another active compound was tested against the cross‐target, confirming sub‐micromolar potency. [22]
Irrespective of predictive modeling, X‐ray structures of ligand‐target complexes provide a wealth of information for polypharmacology. In a systematic analysis of publicly available structures, 1418 X‐ray ligands in complexes with different targets were identified, which included 702 compounds that interacted with targets from different families (multi‐family ligands). [12] Figure 1 shows structures with an exemplary multi‐family ligand identified in this study. An analogue search identified 168 analogue series containing one or more multi‐family ligands, which yielded 133 unique analogue‐series‐based scaffolds as templates for MT‐CPD design. [12]
Figure 1.
X‐ray structures of a multi‐family ligand. The thyronine analogue shown in the center is found in co‐crystal structures with eight targets from five different families, hence providing a prime example for an experimentally confirmed MT‐CPD. On the left, the structure of a complex with human serum albumin is shown (PDB ID: 1HK4) and on the right, a complex with the human thyroid hormone receptor alpha (PDB ID: 4LNX) is illustrated. In these structures, the compound adopted different binding modes.
As stated above, the design of MT‐CPDs for desirable target combinations is of prime interest in medicinal chemistry. Going beyond standard scaffold‐ or pharmacophore‐based design, novel computational strategies for MT‐CPD design are developed. For example, to generate new inhibitors of the TAM (TYRO‐3, AXL, MERTK) family of receptor tyrosine kinases, which are implicated in immunosuppressive activity in tumor cells, kinase inhibitors from X‐ray structures were systematically fragmented, recording rings, functional groups, and interacting amino acids. [23] Starting from a known inhibitor available in an X‐ray complex with MERTK, stored fragments and corresponding amino acids were mapped onto the structure, retaining only those with interacting residues from the TAM family. These fragments included a subset located in a pocket with significant structural differences between the TAM family and other targeted kinases, thus providing an opportunity to selectively inhibit TAM kinases. Fragments mapping to this pocket were computationally recombined into synthetically feasible compounds, some of which were tested. The most promising candidate was active in the nanomolar range against the three TAM kinases, selective over other kinases from which fragments were generated, and active in vivo, [23] hence providing a compelling example for the design of an MT‐CPD for controlled polypharmacology.
4. Rationalizing Multi‐target Activities
Understanding molecular features that distinguish MT‐ and ST‐CPDs is of critical importance for devising generally applicable strategies for multi‐target drug design, which is still in its infancy. The increasing availability of public compounds with reliable activity annotations against two or three targets has enabled systematic assessment of distinguishing features. Therefore, different ML models were derived to classify MT‐ and corresponding ST‐CPDs on the basis of chemical structure. In these studies, support vector machine or random forest (RF) models achieved prediction accuracy at the 80 % level, comparable or superior to deep ML models, [24] hence indicating the presence of structural features differentiating MT‐ and ST‐CPDs. In a large‐scale classification analysis, MT‐ and corresponding ST‐CPDs were assembled for 170 different target pairs involving 137 unique targets, for which at least 100 MT‐ and (50+50) ST‐CPDs with high‐quality activity annotations were available. [25] For each pair, ML models using extended connectivity (atom environment) fingerprints were derived to predict test compounds for this pair (native prediction) as well as for all other 169 target pairs (cross‐pair predictions). Native predictions typically reached an accuracy at the 80 % level or above, whereas cross‐pair predictions mostly failed, [25] showing that structural features distinguishing between MT‐ and ST‐CPDs depended on target pairs and were not generalizable. Essentially the same results were obtained using RF models for inhibitors of triplets of closely related kinases and corresponding single‐kinase compounds. [26]
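The assembly of MT‐ and ST‐CPD sets for a target pair can be sketched as shown below. This is a simplified illustration under stated assumptions: the input format (a mapping from compound identifiers to sets of annotated targets), the function names, and the rule that an ST‐CPD has exactly one annotated target are hypothetical choices for this example. The size thresholds mirror the "at least 100 MT‐ and (50+50) ST‐CPDs" criterion mentioned above.

```python
def split_mt_st(activities, target_a, target_b):
    """Partition compounds for one target pair.

    `activities` maps a compound ID to the set of targets for which the
    compound has a qualifying activity annotation.

    Returns (mt, st_a, st_b):
      mt   - compounds active against both targets of the pair
      st_a - compounds annotated with target_a only
      st_b - compounds annotated with target_b only
    """
    mt, st_a, st_b = set(), set(), set()
    for cpd, targets in activities.items():
        if target_a in targets and target_b in targets:
            mt.add(cpd)
        elif targets == {target_a}:
            st_a.add(cpd)
        elif targets == {target_b}:
            st_b.add(cpd)
    return mt, st_a, st_b


def pair_qualifies(mt, st_a, st_b, min_mt=100, min_st_each=50):
    """Check whether a target pair has enough compounds for model building."""
    return (
        len(mt) >= min_mt
        and len(st_a) >= min_st_each
        and len(st_b) >= min_st_each
    )


# Toy annotations (invented identifiers, for illustration only).
toy_activities = {
    "cpd_1": {"T1", "T2"},
    "cpd_2": {"T1"},
    "cpd_3": {"T2"},
    "cpd_4": {"T1", "T3"},  # neither MT for (T1, T2) nor single-target
}
toy_mt, toy_st_a, toy_st_b = split_mt_st(toy_activities, "T1", "T2")
```

A model trained on the sets of one pair and applied to the sets of another pair corresponds to a cross‐pair prediction in the sense used above.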
From a drug discovery perspective, obtaining evidence for the presence of structural features that might be characteristic of MT‐CPDs is attractive, but not sufficient for practical applications, which require detailed knowledge of such structural motifs. Approaches for explainable ML (XML) [27] such as molecular Shapley Additive exPlanations (SHAP) [28, 29] analysis can be applied to identify structural features determining correct predictions. Identifying features that are most important for the prediction of MT‐CPDs enables chemically intuitive follow‐up analysis, as illustrated in Figure 2. SHAP analysis quantifies the contributions to a prediction of individual features that are present or absent in a test compound. [28, 29] The sum of all positive and negative SHAP feature contributions, added to the expected (baseline) value of the model output, yields the predicted probability for a given compound. Important features can then be mapped on test compounds and the potential relevance of delineated substructures for multi‐target activities can be further explored.
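The additivity of Shapley values can be illustrated with an exact calculation for a very small feature set. The sketch below is not the SHAP implementation used in the cited studies: features outside a coalition are simply set to a fixed baseline value (a simplifying assumption, whereas SHAP methods approximate an expectation over background data), and `toy_model` with its three fingerprint bits is a hypothetical stand-in for a trained classifier.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of all features for a single instance.

    Features not in a coalition take their value from `baseline`.
    The cost grows exponentially with the number of features, so this
    is only feasible for toy examples.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(bits):
    """Hypothetical probability of multi-target activity from three bits."""
    p = 0.2 + 0.4 * bits[0] + 0.1 * bits[1] + 0.2 * bits[0] * bits[2]
    return min(p, 1.0)


instance = [1, 0, 1]   # bits 0 and 2 are set in the test compound
baseline = [0, 0, 0]   # reference compound with no bits set
phis = shapley_values(toy_model, instance, baseline)

# Additivity: baseline output plus all contributions equals the prediction.
reconstructed = toy_model(baseline) + sum(phis)
```

In this toy setting, the interaction term between bits 0 and 2 is shared between the two features, which is the kind of attribution that makes per-feature contributions interpretable.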
Figure 2.
From feature weights to substructures. Shown is a correctly predicted MT‐CPD. In a), exemplary structural features supporting the prediction (positive SHAP values) and opposing the prediction (negative values) are delineated. While only a few negative contributions were detected, many (partly redundant) fingerprint features made comparably small positive contributions, leading to a high cumulative probability of multi‐target activity (94 %). In b), SHAP values of features present in the compound were mapped to corresponding atoms, highlighting a coherent substructure largely responsible for the correct prediction.
Identification of structural features determining correct predictions of MT‐ and ST‐CPDs by molecular SHAP analysis revealed a consistent scenario for inhibitors of closely related kinases as well as compounds with activity against unrelated targets. [30] In both cases, accurate predictions were determined by structural features that were present in MT‐CPDs but absent in corresponding ST‐CPDs. [26, 30] Thus, at the level of predictions, these features were characteristic of MT‐CPDs for selected target combinations. Mapping of characteristic features identified different substructures in test compounds with activity against pairs of unrelated targets that were experimentally confirmed in independent studies to exhibit activity against both targets of a pair, [30] thus providing a rationale for their polypharmacology potential.
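Mapping feature contributions onto atoms, as described for Figure 2b, can be sketched in a few lines. The scheme below is an assumption made for illustration: each feature's SHAP value is split equally among the atoms of its environment, and the function names, feature identifiers, and atom indices are hypothetical. The cited studies may use a different weighting.

```python
from collections import defaultdict


def atom_contributions(feature_shap, feature_atoms):
    """Distribute each feature's SHAP value equally over the atoms it covers.

    feature_shap:  feature ID -> SHAP value
    feature_atoms: feature ID -> list of atom indices forming the feature's
                   atom environment in the test compound
    Features without mapped atoms (absent from the compound) are ignored.
    """
    per_atom = defaultdict(float)
    for feature, shap_value in feature_shap.items():
        atoms = feature_atoms.get(feature, [])
        if not atoms:
            continue
        share = shap_value / len(atoms)
        for atom in atoms:
            per_atom[atom] += share
    return dict(per_atom)


def highlighted_atoms(per_atom, threshold=0.0):
    """Return sorted indices of atoms whose summed contribution exceeds threshold."""
    return sorted(atom for atom, value in per_atom.items() if value > threshold)


# Toy values (invented for illustration only).
toy_shap = {"f1": 0.3, "f2": 0.1, "f3": -0.2}
toy_atoms = {"f1": [0, 1, 2], "f2": [2, 3], "f3": [4]}
toy_per_atom = atom_contributions(toy_shap, toy_atoms)
toy_substructure = highlighted_atoms(toy_per_atom)
```

Atoms covered by several overlapping features accumulate contributions, which is how many small positive feature values can delineate a coherent substructure.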
Evaluating hypotheses obtained from XML requires additional (experimental) data. For example, Figure 3 illustrates the augmentation of XML results with structural information. Here, substructures with highest importance for the correct prediction of multi‐kinase inhibitors were further analyzed on the basis of X‐ray structures of kinase‐inhibitor complexes, revealing the formation of interaction hotspots involving residues conserved in different kinases, thus providing a rationale for multi‐kinase activity. [26]
Figure 3.
Explainable machine learning in polypharmacology. The compound at the top left shows an exemplary inhibitor with multi‐kinase activity that was correctly predicted via ML. Structural features determining the correct prediction were identified using SHAP analysis and mapped to the inhibitor (second representation following the arrow). These features delineated a coherent substructure (third representation) that distinguished multi‐kinase from single‐kinase inhibitors. This substructure was found in an X‐ray structure of another inhibitor in complex with a kinase (the binding site is enlarged on the right). Following this approach, explainable machine learning has systematically differentiated between inhibitors with multi‐ and single‐kinase activity, leading to experimentally testable hypotheses concerning distinguishing structural motifs.
5. Conclusions
Polypharmacology is supported by a variety of computational approaches. Among these, methods for target prediction, MT‐CPD design, and detection of false‐positive activity annotations play a central role. In recent years, ML has been increasingly applied for polypharmacology predictions. While computational target profiling of compounds via ML has become a widely used approach, similar to computational screening for assay interference compounds, the design of new MT‐CPDs is still largely driven by medicinal chemistry knowledge and pharmacophore modeling. However, new design strategies are beginning to emerge and it is expected that deep generative modeling will be increasingly investigated for polypharmacology. Furthermore, predictive modeling via explainable ML will help to bridge between computational and practical medicinal chemistry, in polypharmacology and beyond. In addition to its immediate relevance for practical applications, exploring characteristics of MT‐CPDs is also of high interest from a basic research perspective. Understanding how compounds “pseudo‐specifically” interact with multiple targets is scientifically stimulating and can likely be translated into new compound design strategies. Hence, in computational polypharmacology, basic and applied research go hand‐in‐hand and new developments further advancing the field are anticipated.
Conflict of interest
None declared.
Acknowledgements
Open Access funding enabled and organized by Projekt DEAL.
C. Feldmann, J. Bajorath, Mol. Inf. 2022, 41, 2200190.
Data Availability Statement
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
References
- 1. J. U. Peters (Ed.), Polypharmacology in Drug Discovery. John Wiley & Sons, Hoboken, 2012.
- 2. Anighoro A., Bajorath J., Rastelli G., J. Med. Chem. 2014, 57, 7874–7887.
- 3. Knight Z. A., Lin H., Shokat K. M., Nat. Rev. Cancer 2010, 10, 130–137.
- 4. Benek O., Korabecny J., Soukup O., Trends Pharmacol. Sci. 2020, 41, 434–445.
- 5. Tschöp M. H., Finan B., Clemmensen C., Gelfanov V., Perez-Tilve D., Müller T. D., DiMarchi R. D., Cell Metab. 2016, 24, 51–62.
- 6. Ainsworth C., Nat. Med. 2011, 17, 1166–1168.
- 7. Hopkins A. L., Nat. Biotechnol. 2007, 25, 1110–1111.
- 8. Pushpakom S., Iorio F., Eyers P. A., Escott K. J., Hopper S., Wells A., Doig A., Guilliams T., Latimer J., McNamee C., Norris A., Sanseau P., Cavalla D., Pirmohamed M., Nat. Rev. Drug Discovery 2019, 18, 41–58.
- 9. Sydow D., Burggraaff L., Szengel A., van Vlijmen H. W. T., IJzerman A. P., van Westen G. J. P., Volkamer A., J. Chem. Inf. Model. 2019, 59, 1728–1742.
- 10. Hu Y., Bajorath J., Drug Discovery Today 2013, 18, 644–650.
- 11. Baell J. B., Nissink J. W. M., ACS Chem. Biol. 2018, 13, 36–44.
- 12. Gilberg E., Stumpfe D., Bajorath J., ACS Omega 2018, 3, 106–111.
- 13. Müller G., Drug Discovery Today 2003, 8, 681–691.
- 14. Yang S. Y., Drug Discovery Today 2010, 15, 444–450.
- 15. Morphy R., Rankovic Z., J. Med. Chem. 2005, 48, 6523–6543.
- 16. Moser D., Wisniewska J. M., Hahn S., Achenbach J., la Buscató E., Klingler F.-M., Hofmann B., Steinhilber D., Proschak E., ACS Med. Chem. Lett. 2012, 3, 155–158.
- 17. Stork C., Chen Y., Šícho M., Kirchmair J., J. Chem. Inf. Model. 2019, 59, 1030–1043.
- 18. Geurts P., Ernst D., Wehenkel L., Mach. Learn. 2006, 63, 3–42.
- 19. Schneider P., Röthlisberger M., Reker D., Schneider G., Chem. Commun. 2016, 52, 1135–1138.
- 20. Law V., Knox C., Djoumbou Y., Jewison T., Guo A. C., Liu Y., Maciejewski A., Arndt D., Wilson M., Neveu V., Tang A., Gabriel G., Ly C., Adamjee S., Dame Z. T., Han B., Zhou Y., Wishart D. S., Nucleic Acids Res. 2014, 42, D1091–D1097.
- 21. Li X., Li Z., Wu X., Xiong Z., Yang T., Fu Z., Liu X., Tan X., Zhong F., Wan X., Wang D., Ding X., Yang R., Hou H., Li C., Liu H., Chen K., Jiang H., Zheng M., J. Med. Chem. 2020, 63, 8723–8737.
- 22. Pinzi L., Rastelli G., J. Chem. Inf. Model. 2020, 60, 372–390.
- 23. Da C., Zhang D., Stashko M., Vasileiadi E., Parker R. E., Minson K. A., Huey M. G., Huelse J. M., Hunter D., Gilbert T. S. K., Norris-Drouin J., Miley M., Herring L. E., Graves L. M., DeRyckere D., Earp H. S., Graham D. K., Frye S. V., Wang X., Kireev D., J. Am. Chem. Soc. 2019, 141, 15700–15709.
- 24. Bajorath J., Future Drug Discov. 2021, 3, FDD60.
- 25. Feldmann C., Bajorath J., Sci. Rep. 2021, 11, 7863.
- 26. Feldmann C., Bajorath J., Biomol. Eng. 2022, 12, 557.
- 27. Belle V., Papantonis I., Front. Big Data 2021, 4, e688969.
- 28. Lundberg S. M., Lee S.-I., Advances in Neural Information Processing Systems (NIPS) 2017, 30, 4766–4775.
- 29. Rodríguez-Pérez R., Bajorath J., J. Med. Chem. 2020, 63, 8761–8777.
- 30. Feldmann C., Philipps M., Bajorath J., Sci. Rep. 2021, 11, 21594.