Author manuscript; available in PMC: 2013 May 30.
Published in final edited form as: Future Med Chem. 2011 May;3(7):797–801. doi: 10.4155/fmc.11.44

Growing Pains in Academic Drug Discovery

Adrian Whitty 1
PMCID: PMC3667403  NIHMSID: NIHMS467113  PMID: 21644825

Abstract

In a recent Perspective, Jonathan Baell argues that compounds published as drug leads by academic laboratories commonly contain functionality that identifies them as nonspecific “pan assay interference” compounds (PAINS). Baell’s article raises broad questions about why best practices for hit and lead qualification that are well known in industry are not more widely employed in academia, and about the role of journals in publishing manuscripts that report drug leads of little potential value. Barriers to adoption of best practices by some academic drug discovery researchers include knowledge gaps and infrastructure deficiencies, but also arise from fundamental differences in how academic research is structured and how success is measured. Academic drug discovery should not seek to become identical to commercial pharmaceutical research, but we can do a better job of assessing and communicating the true potential of the drug leads we publish, and reducing the wastage of resources on non-viable compounds.

Keywords: frequent hitters, promiscuous compounds, false positives, druglikeness, ADME


Vast resources are expended worldwide on research aimed at discovering new and better drugs. The efficiency and effectiveness of these efforts are clearly of great consequence both in financial and in human terms. The later stages of the drug discovery and development process become progressively more expensive in time and resources. Consequently, a major lever for maximizing the efficiency of the drug discovery enterprise is to minimize the expenditure of resources on compounds that can be identified as having little or no chance to succeed. In his recent Perspective, “Observations on screening-based research and some concerning trends in the literature” (Future Med. Chem. 2, 1529–1546, 2010), Jonathan Baell argues that, contrary to what might be expected, the increasing access of academic researchers to compound libraries and the means to screen them has in some ways been detrimental to the efficiency of the drug discovery enterprise. Specifically, he argues that, by and large, the academic community and the scientific journals that publish their work frequently ignore well-established criteria for distinguishing the rare screening hit that represents a compound of real promise from the majority of compounds that do not. Consequently, he maintains, the literature is becoming increasingly “polluted” with compounds whose main claim to fame is that they show activity in a biochemical or cellular assay, but that in fact represent false positive results or act by mechanisms that make them unsuitable for advancement to become a drug. In particular, he argues that a significant fraction of the compounds that are reported as being of potential value can be seen, upon closer inspection, to contain chemical functionality or sub-structural elements that qualify them as what he has previously described as Pan Assay Interference Compounds (“PAINS”) [1], and that others have termed “promiscuous inhibitors” [2] or “frequent hitters” [3].
He substantiates this contention by citing a number of recent studies that exemplify this phenomenon [4].

Quite apart from the scientific arguments involved, Baell’s article raises a number of broader issues that are worthy of examination. To what extent are academic investigators who pursue drug discovery projects aware of best practices that are widely known in industry? What are the barriers to dissemination of this knowledge – most of which is generic and nonproprietary – from the industrial to the academic research community? Do academics typically have access to the expertise and infrastructure necessary to implement these best practices, and are the correct motivations in place to encourage them to do so? Is this even a legitimate expectation of the academic research enterprise? Finally, what is the role of journals in adjudicating which studies have identified compounds of real potential value and which have not?

Many academic drug discoverers are highly knowledgeable and sophisticated in their approaches not just to the science but also to the practical aspects of making drugs. However, a significant fraction of drug discovery publications that derive from academic labs contain fundamental flaws that in a company setting would be perceived as egregious. Baell is correct, moreover, that many compounds that are reported as promising starting points for drug discovery would be considered by a pharmaceutical company researcher to have little or no value.

One cause of this problem is that the approaches the pharmaceutical industry uses to assess the value of hit and lead molecules have evolved considerably in recent decades, but for a variety of reasons adoption of these evolving ideas by academic researchers has been spotty. Thus, many drug discovery efforts in academia still adopt a linear approach to drug discovery that is predicated on the twin assumptions (i) that the central challenge is achieving high affinity and selectivity for binding to the target protein, and (ii) that the potential of a compound to be active in vivo, such as its prospects for good oral bioavailability and pharmacokinetics, can be evaluated and if necessary remedied at a later stage. This “affinity first” model was abandoned by the pharmaceutical industry decades ago, when it became clear through hard experience that if a lead molecule does not already possess the basic physicochemical properties required for good pharmaceutical behavior then it is often impossible to fix these shortcomings later on. This change resulted from the experience of myriad projects in which a seemingly promising high affinity lead molecule could not be advanced beyond a certain point due to poor properties of absorption, distribution, metabolism, excretion (ADME) or toxicity. As a result, it became clear that it is important to work right from the start with compounds that possess certain basic drug-like properties [5]. Thus, to the extent possible, screening collections must be constituted so as to eliminate non-druglike compounds, and an assessment of druglikeness should be a prime criterion in choosing which screening hits to prioritize for further work [6]. This assessment requires objective tools; research has shown that relying on the judgment of even very experienced chemists to identify unpromising compound structures by visual inspection is very unreliable [7].
Fortunately, approaches for assessing the druglikeness of compounds based on simple structural considerations are widely available and easy to implement (e.g. as described in reference [6]) using free or low cost computational tools. In addition, for more advanced projects it is critical that lead optimization does not focus solely on improving activity against the target, but also involves parallel assessment and optimization of ADME properties, a philosophy sometimes referred to as “ADME early”.
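To give a sense of how simple such a structural assessment can be, the following is a minimal Python sketch, not the scheme of reference [6]. It assumes the descriptors have already been calculated with whatever tool is at hand, applies the familiar Lipinski rule-of-five cutoffs, and maps the number of violations onto a traffic-light color. The class name, function names and the color mapping are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (supplied by the user)."""
    mol_weight: float      # molecular weight in Da
    clogp: float           # calculated octanol/water logP
    h_bond_donors: int     # count of hydrogen-bond donors
    h_bond_acceptors: int  # count of hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the rule-of-five criteria that the compound violates."""
    checks = {
        "mol_weight > 500": d.mol_weight > 500,
        "clogp > 5": d.clogp > 5,
        "h_bond_donors > 5": d.h_bond_donors > 5,
        "h_bond_acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def traffic_light(d: Descriptors) -> str:
    """Hypothetical mapping: 0 violations green, 1 yellow, 2 or more red."""
    n = len(rule_of_five_violations(d))
    if n == 0:
        return "green"
    if n == 1:
        return "yellow"
    return "red"


if __name__ == "__main__":
    # Illustrative, made-up descriptor values for two screening hits.
    hit_a = Descriptors(mol_weight=350.0, clogp=2.5,
                        h_bond_donors=2, h_bond_acceptors=5)
    hit_b = Descriptors(mol_weight=720.0, clogp=6.8,
                        h_bond_donors=3, h_bond_acceptors=8)
    for name, hit in (("hit_a", hit_a), ("hit_b", hit_b)):
        print(name, traffic_light(hit), rule_of_five_violations(hit))
```

A filter of this kind is only a first pass: it says nothing about mechanism of action, and compounds that fail it may still be perfectly publishable for other reasons.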

A related consideration is that a significant proportion of compounds identified as hits in screening assays can in fact reflect assay artifacts or other false positive results, or have an undesirable mechanism of action that makes them unsuitable for development into a drug. This concern is particularly acute for challenging targets such as protein-protein interactions [8] and intrinsically disordered proteins [9], for which bona fide hits are rare and thus the proportion of false positive hits to true hits is often highly unfavorable. Moreover, certain classes of compounds turn up as hits in a wide variety of different and mechanistically unrelated assays. Many of these “frequent hitters” act through formation of insoluble microaggregates that create a surface that adsorbs and inactivates the target protein [10]. Such compounds are typically characterized by a very steep dose-response relationship in inhibition experiments, a solubility threshold that is roughly comparable to the concentration at which they cause inhibition, and activity that is sensitive to the presence of nonionic detergents [11]. Others are compounds that react covalently with the target protein, or contain contaminants that can undergo such reactions [1,12]. In this regard, it should be noted that very many successful drugs function by a covalent mechanism, and so covalency is not necessarily undesirable and can indeed confer advantageous properties [13]. But the effective development of such compounds requires that their covalent mechanism is known and properly taken into account, so it is important that compounds that function by covalent modification are not erroneously treated as simple reversible binders. Baell makes the point that computational filters to identify PAINS and other “bad actors” are available [1,4]. 
In principle, these could be incorporated into a convenient computational tool or web server that would make them easily usable by academic researchers, journal editors and reviewers. It is therefore a significant flaw in the system that such compounds can enter the literature or can be used as the basis for grant funding as starting points for drug discovery without the likelihood that they are PAINS or otherwise fatally flawed being acknowledged and addressed.
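The three hallmarks of aggregation-based inhibition mentioned above (a steep dose-response curve, inhibition near the solubility limit, and sensitivity to nonionic detergent) lend themselves to a simple triage check. The Python sketch below is one hedged illustration of that idea. The field names and all numerical thresholds are assumptions chosen for the example, not values taken from references [10] or [11], and any real cutoffs would need to be set for the assay in question.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HitProfile:
    """Measured behavior of one screening hit (all concentrations in uM)."""
    ic50_um: float                 # IC50 without detergent
    ic50_with_detergent_um: float  # IC50 with nonionic detergent present
    hill_slope: float              # Hill coefficient of the inhibition curve
    solubility_limit_um: float     # concentration at which the compound precipitates


def aggregator_warnings(hit: HitProfile,
                        max_hill_slope: float = 1.5,
                        min_detergent_shift: float = 3.0,
                        solubility_window: float = 3.0) -> List[str]:
    """Return warning labels for aggregator-like behavior.

    All three thresholds are hypothetical defaults for illustration.
    """
    warnings = []
    # Aggregators typically show an unusually steep dose-response.
    if hit.hill_slope > max_hill_slope:
        warnings.append("steep dose-response")
    # Detergent disrupts aggregates, so potency drops (IC50 rises).
    if hit.ic50_with_detergent_um / hit.ic50_um >= min_detergent_shift:
        warnings.append("detergent-sensitive")
    # Inhibition sets in at roughly the concentration where solubility is exceeded.
    ratio = hit.solubility_limit_um / hit.ic50_um
    if 1.0 / solubility_window <= ratio <= solubility_window:
        warnings.append("IC50 near solubility limit")
    return warnings


if __name__ == "__main__":
    # Illustrative, made-up measurements.
    suspect = HitProfile(ic50_um=5.0, ic50_with_detergent_um=80.0,
                         hill_slope=3.2, solubility_limit_um=8.0)
    well_behaved = HitProfile(ic50_um=0.2, ic50_with_detergent_um=0.25,
                              hill_slope=1.0, solubility_limit_um=150.0)
    print(aggregator_warnings(suspect))
    print(aggregator_warnings(well_behaved))
```

A hit that raises none of these warnings is not thereby validated; the check only flags compounds that deserve a closer look before any claim of value as a drug lead is made.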

The failure of some academic drug discovery researchers to keep pace with these and other improved practices for early stage drug discovery is not simply a matter of lack of knowledge. It also reflects important differences in the structure of the academic research enterprise, and also to some extent its goals and incentives. The approach that industry has evolved to efficiently zero in on promising compounds is highly parallel and interdisciplinary. Even early stage lead discovery requires multiple distinct types of expertise including, minimally, computational chemistry, medicinal chemistry, biochemistry and pharmacology. Academic research groups, in contrast, are typically based on one or two areas of core expertise. Although multi-investigator collaborations are becoming increasingly common in academia and are being actively promoted by some funding agencies, it remains unusual for an academic researcher interested in drug discovery to have access to all of the necessary expertise. Consequently, a significant proportion of publications reporting drug discovery projects from academic laboratories address only a narrow aspect of the problem, most typically the identification of compounds that possess affinity for the target protein. Development of compounds based on optimization of target inhibition only, without establishing basic information about their mechanism of action and without any parallel assessment of basic ADME properties, is very unlikely to find molecules that have potential for development into drugs. Consequently, a significant fraction of the compounds discovered using this narrow approach are being unintentionally misrepresented when it is claimed that they have value as drug leads.

It is also important to note that many academic researchers, no matter how sophisticated their knowledge, simply lack ready access to some of the basic experimental tools required to assess the ADME potential of compounds. There are a growing number of drug discovery core facilities and “centers of excellence” being set up in universities and academic research institutes, but these facilities typically concentrate on assembling compound collections and the means to screen them. It is unusual for an investigator to be able to access even basic in vitro ADME assays such as cell permeability or liver microsome stability in an academic setting, let alone to routinely obtain rat pharmacokinetic data, serum protein binding, hERG activity, Ames tests or CYP450 inhibition data on substantial numbers of compounds. Obtaining such data from commercial contract labs is very expensive, and thus is only practical on one or two late stage compounds. This infrastructure gap enforces a linear, sequential model of lead optimization rather than the parallel approach necessary to routinely identify good drug leads.

Yet another aspect of the problem is that success in academia is measured primarily by papers published, grants obtained, and students and postdocs trained, and not, as is the case in industry, by whether truly useful compounds are discovered. For many academic groups in the basic sciences, publication of the lead structures and their activity against the target is the end-point, and they do not plan to participate directly in the further advancement of the compounds. This modus operandi is perfectly appropriate and is not intrinsically problematic. However, it has the drawback that it allows considerations that are critical to the real value of the resulting compounds, such as their solubility and cell permeability, to be relegated to some future time that may or may not ever come. The result is what Baell calls the “pollution” of the literature with compounds that are presented as having value as drug leads, but that even minimal analysis or characterization of their mechanism of action, druglikeness and ADME properties would show in fact do not. Most importantly, were these kinds of information available to the investigators from an early stage and continuously throughout their efforts, it would undoubtedly lead them to make different and better compounds and greatly increase the translational impact of their research.

Baell proposes some solutions to these problems. In particular, he advocates that investigators and journals should be more rigorous in requiring that claims that a compound has potential as a drug lead be justified by some assessment to rule out any high likelihood that the compound’s activity is due to it functioning as a PAINS compound or via another unsuitable mechanism of action. This step is eminently sensible, and a basic assessment such as Baell proposes can easily be performed by any investigator without requiring special knowledge or expensive software. It is likely that such a basic evaluation of compounds by the authors or the journal would cast a very different light on the value of what has been achieved in some studies. I would add to this that some minimal assessment of the compounds’ druglikeness, using one of many freely available and easy to use web-based analysis tools (e.g. www.molsoft.com, www.molinspiration.com, etc.) or widely available commercial software tools (often free-of-charge to academic researchers), would also be a major step forward. Journals have a major role to play in establishing and enforcing these standards, as many already routinely do, for instance, for the analytical characterization of new chemical compounds [14,15], the presentation of proteomics data [16], and other types of information that are central to the value of a piece of published research. The adoption of standards for the characterization of compounds that are presented as potential drug leads will not only clean up the literature, but will also encourage researchers to think about these issues well before they are considering publication, and so will help decrease the proportion of their effort and resources that they spend on unpromising compounds.

This is not to say that any compounds that fail to pass muster according to these measures should not be published. Indeed, Baell points out that many compounds that have no prospects whatsoever of becoming a drug are nevertheless very valuable in establishing some principle of protein-ligand binding or other fundamental scientific point that is important and eminently publishable. The point is simply that where the authors claim that a compound has value as a drug lead, like any other claim this should be supported by appropriate evidence. Thus, the current situation where inhibitory activity in a biochemical or cellular assay alone is accepted by many journals as prima facie evidence of value as a starting point for drug discovery is detrimental to the field.

In the longer term, any real push towards enabling academic researchers in the basic sciences to routinely make translational contributions to drug discovery will necessarily require much better levels of access to basic in vitro ADME assays and pharmacokinetic analyses, at a cost that allows this information to be obtained early and often. This experimental infrastructure is technically impractical and also prohibitively expensive to establish in an individual medicinal chemistry laboratory. However, it becomes much more feasible and cost effective when implemented in core facilities and regional centers that serve a large number of research groups. Such an investment would dramatically increase the ability of researchers in the basic sciences to have a translational impact by developing molecules that, with minimal additional characterization, can be used to perform meaningful animal experiments and ultimately human clinical trials.

The above discussion is by no means intended to imply that academic drug discovery research should become identical to or redundant with commercial pharmaceutical research. This would be a very undesirable outcome. Rather, the point is simply that where discovery of a conventional drug is the goal, then it is important to adopt well-established practices that eliminate unsuitable compounds as early as possible, and to ensure that when the results of drug discovery efforts are published the true potential of the compounds for further advancement is accurately communicated. It is nonetheless very important for academic researchers to undertake projects that challenge conventional wisdom on what makes a good drug, and aim to identify new approaches and new solutions that violate conventional guidelines. Otherwise, the current, largely empirical guidelines for druglikeness will forever constrain us to work with molecules that resemble those we already know. To advance the range of targets we can address with synthetic drugs will almost certainly require the discovery of new chemotypes that achieve the required combination of activity against the target and good pharmaceutical properties in novel ways. Examples of such potentially ground-breaking research include exploration of novel chemotypes through diversity-oriented synthesis [17] and other creative synthetic approaches, and technologies such as stapled peptides and other approaches to mimicking protein secondary structural features [18]. Although it is not clear today exactly how some of these approaches will achieve utility in vivo, only through continued research will their true potential be established and new vistas for the drug discoverer be opened up.

Finally, it is important to note that even the most careful and stringent application of the principles and practices described above cannot guarantee success. Biology is simply too complicated for our current capabilities to allow us to predict, early on, which chemical structures will possess the properties required to be safe and effective drugs. However, Baell’s article highlights the fact that, as an academic drug discovery community, we can do a much better job of assessing and accurately communicating the value of the drug leads we discover, and of reducing the wastage of increasingly scarce resources on non-viable compounds.

Executive Summary.

  • In a recent Perspective, Jonathan Baell argues that compounds published as drug leads by academic laboratories commonly contain chemical functionality that identifies them as nonspecific “pan assay interference” compounds (PAINS) that have no real prospect for advancement into drugs.

  • Baell’s article raises broad questions about why best practices for hit and lead qualification that are well known in industry are not more widely employed in academia, and about the role of journals in publishing manuscripts that report drug leads of little potential value.

  • Because they are PAINS, or are undruglike in other easily diagnosed ways, many compounds reported by academic laboratories as promising starting points for drug discovery would be considered by a pharmaceutical company to have little or no potential for advancement.

  • One cause of this problem is that many drug discovery efforts in academia still adopt a flawed, highly linear “affinity first” approach to drug discovery, rather than the parallel optimization of affinity and ADME properties that is the accepted approach in industry.

  • The failure of the academic community to keep pace with best practices for early stage drug discovery also reflects important differences in the structure of the academic research enterprise - which is centered around individual research groups with specialized expertise - and also to some extent its goals and incentives.

  • Even academic researchers who are knowledgeable in good drug discovery practices typically lack ready access to some of the basic experimental tools required to assess the ADME potential of compounds.

  • Another issue is that success in academia is measured primarily by papers published, grants obtained, and students and postdocs trained, not in terms of whether medically useful compounds are discovered, discouraging the necessary early consideration of downstream development issues.

  • Methods to identify PAINS and other promiscuous compounds are well established, and should be routinely applied to compounds reported by academic laboratories before claims are made about value as drug leads.

  • In the longer term, enabling academic researchers in the basic sciences to routinely make translational contributions to drug discovery research will require greatly improved access to basic in vitro ADME assays and pharmacokinetic analyses.

  • It is not proposed that academic drug discovery research should become identical to or redundant with commercial pharmaceutical research, but simply that it is important to eschew easily avoidable errors and to accurately communicate the true potential of reported compounds.

  • Even the most stringent application of best practices in drug discovery cannot guarantee success, but as an academic drug discovery community we can do a better job of assessing and communicating the true potential of the compounds we publish, and thus reducing the wastage of attention and resources on non-viable compounds.

Acknowledgments

This work was supported in part by grant 1R01GM094551 from the National Institute of General Medical Sciences of the United States National Institutes of Health.

References

  • 1. Baell JB, Holloway GA. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J Med Chem. 2010;53(7):2719–2740. doi: 10.1021/jm901137j.
  • 2. McGovern SL, Caselli E, Grigorieff N, Shoichet BK. A common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening. J Med Chem. 2002;45(8):1712–1722. doi: 10.1021/jm010533y.
  • 3. Roche O, Schneider P, Zuegge J, et al. Development of a virtual screening method for identification of “frequent hitters” in compound libraries. J Med Chem. 2002;45(1):137–142. doi: 10.1021/jm010934d.
  • 4. Baell JB. Observations on screening-based research and some concerning trends in the literature. Future Medicinal Chemistry. 2010;2(10):1529–1546. doi: 10.4155/fmc.10.237.
  • 5. Di L, Kerns EH, Carter GT. Drug-like property concepts in pharmaceutical design. Curr Pharm Des. 2009;15(19):2184–2194. doi: 10.2174/138161209788682479.
  • 6. Lobell M, Hendrix M, Hinzen B, et al. In silico ADMET traffic lights as a tool for the prioritization of HTS hits. ChemMedChem. 2006;1(11):1229–1236. doi: 10.1002/cmdc.200600168.
  • 7. Lajiness MS, Maggiora GM, Shanmugasundaram V. Assessment of the consistency of medicinal chemists in reviewing sets of compounds. J Med Chem. 2004;47(20):4891–4896. doi: 10.1021/jm049740z.
  • 8. Whitty A, Kumaravel G. Between a rock and a hard place? Nat Chem Biol. 2006;2(3):112–118. doi: 10.1038/nchembio0306-112.
  • 9. Metallo SJ. Intrinsically disordered proteins are potential drug targets. Curr Opin Chem Biol. 2010;14(4):481–488. doi: 10.1016/j.cbpa.2010.06.169.
  • 10. Coan KE, Maltby DA, Burlingame AL, Shoichet BK. Promiscuous aggregate-based inhibitors promote enzyme unfolding. J Med Chem. 2009;52(7):2067–2075. doi: 10.1021/jm801605r.
  • 11. Feng BY, Shoichet BK. A detergent-based assay for the detection of promiscuous inhibitors. Nat Protoc. 2006;1(2):550–553. doi: 10.1038/nprot.2006.77.
  • 12. Huth JR, Mendoza R, Olejniczak ET, et al. ALARM NMR: a rapid and robust experimental method to detect reactive false positives in biochemical screens. J Am Chem Soc. 2005;127(1):217–224. doi: 10.1021/ja0455547.
  • 13. Singh J, Petter R, Baillie T, Whitty A. The Resurgence of Covalent Drugs. Nature Reviews Drug Discovery. 2011. doi: 10.1038/nrd3410. [in press]
  • 14. Journal of the American Chemical Society. Guidelines for Characterization of Organic Compounds. American Chemical Society.
  • 15. Journal of Medicinal Chemistry. Guidelines for Authors. American Chemical Society.
  • 16. Molecular and Cellular Proteomics. Revised Publication Guidelines for Documenting the Identification and Quantification of Peptides, Proteins, and Post-Translational Modifications by Mass Spectrometry. American Society for Biochemistry and Molecular Biology.
  • 17. Dandapani S, Marcaurelle LA. Current strategies for diversity-oriented synthesis. Curr Opin Chem Biol. 2010;14(3):362–370. doi: 10.1016/j.cbpa.2010.03.018.
  • 18. Murray JK, Gellman SH. Targeting protein-protein interactions: lessons from p53/MDM2. Biopolymers. 2007;88(5):657–686. doi: 10.1002/bip.20741.
