Abstract
For a significant number of years, scientists of many persuasions have assayed natural product materials ranging from crude extracts to pure compounds, in a multitude of assays causally related to some biological processes. However, in a very significant number of submitted papers and published articles, what may be considered as canned biological assays were used, and if a positive effect was observed, then the authors would claim that the material assayed was a potential drug lead. This also occurred with pure synthetic compounds and compounds derived from natural products by simple chemical modifications.
However, what has now become quite obvious, with all such classes of materials, is that there are many promiscuous players with multiple bioactivities. These can range from relatively crude extracts and pure compounds from natural products, through synthetic processes that produce natural product derivatives, to compounds that are truly synthetic in origin. There is also a potential problem with data from crude to purified extracts being used to claim some form of beneficial activity for such materials, in order to sell that particular mixture to the lay public, with its possible uses described very carefully because of legal hurdles.
With the advent of artificial intelligence and very large compound databases, some of which may well contain impure materials, scientists from a variety of backgrounds have begun to utilize such listings to obtain compounds for their low to high throughput biological screens. Many do so without realizing that there are very significant numbers of active compounds (eg, pan assay interference compounds and invalid metabolic panaceas) that will hit in many different screens for a variety of reasons, leading to significant wasted effort and to published scientific articles that report incorrect results.
This commentary gives some of the history of such materials but is designed to be used as a warning to researchers and, in particular, to journal editors and reviewers that reports of biological results claimed to be due to the compounds used need to be very carefully screened for results due to such promiscuous compounds, irrespective of their nominal source(s).
All literature searches were made by the author and the background knowledge has come from more than 55 years of research in industry and governmental laboratories in both the United Kingdom and the United States, for enzyme inhibitors/activators as well as antimicrobial and antitumor lead compounds mainly from natural product sources.
The conclusion that I reached as a result is this: caveat emptor. (Curr Ther Res Clin Exp. 2021; 82:XXX–XXX)
© 2021 Elsevier HS Journals, Inc.
Key words: Bioassay Pro, frequent hitters, invalid metabolic panacea compounds, PAINS compounds
Introduction
I was asked by the Editor in Chief to write a Commentary on the types of problems that can and do frequently occur when every type of compound, from simple extracts of biological materials to pure compounds, is assayed in some biological system(s) ranging from an isolated enzyme to a cell-based system. In the past 40 or so years, I have been asked by editors of various journals to referee manuscripts in which the authors have reported the following (frequently as part of a title) “interesting biological results from assays of extract and/or natural product X,” that they consider worth having published in the (bio)chemical literature via that journal. Authors usually submit such work because they consider that the results indicate significant drug potential for the material(s) used, but these data frequently consist of the results from what I will call canned biological assays.
For example, there are many such assays that claim to denote a “redox-active” potential for an extract/compound, whether crude, partially purified, purified, or even (semi)synthetic, but in a fair number of these cases, when investigated, the presence of nonspecific promiscuous compounds is well documented.1 Although Feng et al1 were referring to problem compounds in medium to high throughput drug screens, the same phenomena have been seen repeatedly elsewhere and are seen nowadays when machine learning/artificial intelligence (AI) is used to select compounds from multiple external databases for subsequent assay(s). What can be inferred from the data offered by Feng et al1 (and others) is that many of the datasets/compounds that an AI system could utilize are not suitable for such systems and should be very carefully inspected before time and effort are wasted on their use.
For example, the latest ZINC20 system has more than 1.3 billion entries.2 The major unrecognized problem that can occur if an AI system is used to choose a series of compounds for subsequent testing is not due to the structures selected, but to the linked assay results in/from such databases. If the assay results are not reliable, the exercise is not going to produce reliable data for subsequent analysis.
Other potential sources of natural product libraries (some commercial, others open) are given in the 2020 review by Wilson et al,3 although which of them are available for AI-based searching is not noted. The recent article by Rodrigues4 demonstrates the good, the bad, and the ugly aspects of such data, and shows that scientists who wish to utilize the available bioactivity databases need to thoroughly investigate the sources of the compounds whose data they wish to use.
A Recent Example of Problems Using Natural Product Extracts
An example of what can happen with crude extracts from parts of plants is the recent article by Alhawarri et al5 in the open access journal Molecules. Page 3 of that article has a listing of the “assays” utilized, followed by a discussion of many of the classes of molecules that were “considered to be present.” However, many of the compounds listed were identified using what in basic analytical chemistry are known as spot tests.
To demonstrate such problems, the assay that they considered as relating to the “antioxidant properties” of the crude extracts, which is based on the Folin reagent, has been used for more than 70 years as a basic method for determining phenolic components. It was originally designed to show the presence of tyrosine, although, as discussed in the following paragraphs, the reagent reacts with many more compound types.
What is not usually recognized or even mentioned by users is that there are many compound classes that react with this reagent, including proteins and free thiol groups. A partial listing of other compound types was given by Everette et al6 in an excellent study in 2010 that should be required reading for any investigator before using this assay system.
This reagent system was the basis for the Lowry et al7 protein assay from 1951, which was based on the 1922 article by Wu8 that extended the original 1919 article by Folin and Wu.9 Another point related to quantification is that assays based on the Folin reagent do not give linear results in protein determinations because the color response does not obey Beer's Law,10 and the assay is subject to interference from multiple detergents.
Specific Comments Related to Isolation and Assay Problems
Ill-defined results
Feng et al1 and Rodrigues4 cover purified compounds, but in the cases of natural products, one starts with a plant, microbe, or marine invertebrate and then proceeds basically by 1 of 2 routes.
The first is simply to macerate/extract the dried material using a number of solvents, frequently an aqueous alcohol mixture and/or a number of aliphatic solvents such as hexanes, although if mimicking an ethnobotanical process, the treatments may be different. However, once extracts have been obtained, they are screened using whatever assays the investigators have access to. At times, a low purity compound might be isolated, and it is tested in addition to the extracts in the assay system(s) available to the investigator. Although this may at first glance be similar to the bioactivity-driven method described below, it is not because the investigators are simply using the data to claim a specific activity.
The second method is known as a bioactivity-driven isolation process. In this case, which in general is only used in laboratories that have access to relatively sophisticated biological and chemical isolation/assay systems, the source material is treated in a series of isolation steps, usually utilizing a sequence of solvent and/or chromatographic separations. The difference is that at each step from the first crude product onward, the biological activity of the material in the desired assay(s) is the driving force for the next purification step in the process. If the activity is lost, then one reevaluates the processes leading up to that result.
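The decision logic of the bioactivity-driven route can be sketched in a few lines of Python. This is only an illustration of the idea described above, not a laboratory protocol: the function name `bioactivity_guided_path`, the `fractionate` and `assay` callables, and the activity `threshold` are all hypothetical placeholders for whatever separation steps and assays a given laboratory actually uses.

```python
from typing import Callable, Dict, List


def bioactivity_guided_path(
    crude: str,
    fractionate: Callable[[str], List[str]],
    assay: Callable[[str], float],
    threshold: float,
    max_steps: int = 5,
) -> Dict[str, object]:
    """Follow the most active fraction through successive purification steps.

    `fractionate` returns the sub-fractions of a sample (hypothetical),
    `assay` returns an activity score for a sample (hypothetical), and
    `threshold` is the minimum score still counted as active (assumed).
    """
    current = crude
    path = [crude]
    for _ in range(max_steps):
        fractions = fractionate(current)
        if not fractions:
            break  # nothing further to separate
        activities = {name: assay(name) for name in fractions}
        best = max(activities, key=activities.get)
        if activities[best] < threshold:
            # Activity did not survive this step: stop and revisit earlier steps.
            return {"path": path, "status": "activity lost; re-evaluate previous steps"}
        current = best
        path.append(best)
    return {"path": path, "status": "active fraction followed"}
```

The point of the sketch is that the assay result, not the chemistry alone, selects what is purified next, which is what distinguishes this route from simply screening whatever extracts are at hand.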
Compounds that give spurious results
Over the past few years, there have been a number of excellent articles describing pure compounds that in some cases might have come originally from natural sources, or be semisynthetic variants, and other compounds that consistently give what can be considered spurious results in multiple assays.11,12 Why these reports have not altered the content of submissions to journals when authors repeatedly ignore this information may be due in a number of cases to a lack of access to scientific libraries (particularly in lesser-developed countries), or perhaps is coupled to the publish or perish systems that occur in certain countries, where an advanced degree is not awarded until an article reporting the results is published or presented in an international journal or conference. This latter requirement has led to the proliferation of very significant numbers of journals that will effectively publish any article if the necessary publication fees are paid, irrespective of the validity of the research.
Data from Pure Materials with Unrecognized Interfering Assay Components
Feng et al1 also described the problems with pure compounds irrespective of source, when they reported in 2005 the effects of detergent addition on 2 simple enzyme processes, a β-lactamase assay and a chymotrypsin assay. This report built upon data from 1997, 1999, and 2003 (references in Feng et al1) and came to the conclusion that some organic molecules form large colloidal aggregates that can sequester, and thus inhibit, enzymes when assayed at 30 µM. This concentration is commonly used, even as an approximation for semipurified materials.
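A detergent counter-screen of this kind lends itself to a simple comparison of inhibition measured with and without detergent. The following Python sketch shows one way such a comparison might be coded. The function name and both cutoffs (`min_inhibition` and `min_fold_drop`) are illustrative assumptions and are not values taken from Feng et al; any real cutoff would need to be set for the specific assay.

```python
def is_detergent_sensitive(
    inhibition_no_detergent: float,
    inhibition_with_detergent: float,
    min_inhibition: float = 50.0,  # assumed cutoff (%) for calling a hit
    min_fold_drop: float = 2.0,    # assumed fold loss of inhibition with detergent
) -> bool:
    """Flag a hit whose percent inhibition largely disappears when detergent is added.

    Detergent-sensitive inhibition is consistent with, but does not prove,
    inhibition by colloidal aggregates.
    """
    if inhibition_no_detergent < min_inhibition:
        return False  # not a hit in the first place
    if inhibition_with_detergent <= 0:
        return True  # inhibition fully abolished by detergent
    return inhibition_no_detergent / inhibition_with_detergent >= min_fold_drop
```

For example, under these assumed cutoffs a compound showing 90% inhibition without detergent and 10% with it would be flagged for follow-up, whereas one showing 90% and 85% would not.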
Sub-Comment on Colloids
Although apparently out of order in this discussion, the finding of aggregation with or without colloid formation is relevant to natural product studies. In many labs, a crude extract is assayed at concentrations much in excess of 20 mg/mL, which can give nominal concentrations well over the 30 µM level for individual components in a mixture. Thus, one should consider such problems before any such experiment because extracts and their subsequent fractions can well aggregate and/or form colloids due to other cellular components in the extracts.
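The arithmetic behind this point is easy to check. The short Python sketch below converts an extract concentration into a nominal molar concentration for a single component. The weight fraction (0.1%) and molecular weight (400 g/mol) used in the example are hypothetical values chosen only for illustration and do not come from any particular extract.

```python
def component_micromolar(
    extract_mg_per_ml: float,
    weight_fraction: float,
    mol_weight_g_per_mol: float,
) -> float:
    """Nominal concentration (in µM) of one component of an extract."""
    component_g_per_l = extract_mg_per_ml * weight_fraction  # mg/mL equals g/L
    return component_g_per_l / mol_weight_g_per_mol * 1e6    # mol/L to µmol/L


# Hypothetical component: 0.1% of the extract by weight, 400 g/mol,
# with the extract assayed at 20 mg/mL.
print(round(component_micromolar(20.0, 0.001, 400.0), 1))  # about 50 µM
```

Even a component present at only 0.1% by weight would therefore be at a nominal 50 µM or so in this example, which is above the 30 µM level at which aggregate formation was reported.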
Pure Compounds from All Sources Identified as Pan Assay Interference Compounds
Baell et al11,12,13 have reported extensively on pure compounds that are frequent hitters in biological assays. They use the term pan assay interference compounds (PAINS) to describe these molecules. Although they mainly utilized pure compounds in their reports, the substructures that enabled the multiple responses in different biological assays are found in many natural products, as they demonstrated in an article in the Journal of Natural Products in 2016.13
In 2015, another academic group14 reported what they called “PAINS in the assay” when looking at the results from a high throughput screen (HTS) targeting the histone acetyltransferase Rtt109, where the actives fell into the PAINS category. They also listed the assays that had similar problems, with extensive citations, also commenting that cell-based assays are prone to such effects.
Very recently in 2021, Dahlin et al15 published a review in Cell Chemical Biology covering nuisance compounds in cell-based assays. Although this group of coauthors from academia, the National Institutes of Health, and the pharma industry chose not to publish in an open access journal, it is well worth investigators obtaining a copy if only for the figures because they visually demonstrate the problems seen in a variety of cell-based assays, and the reference list is both extensive and up to date.
A recent report from Yang et al16 indicated that, in addition to PAINS compounds that are already identifiable, there are significant numbers of other agents with varying chemical properties that can interfere with assay components. That report also confirmed a number of the compound types and chemical substructures mentioned in the Wilson et al17 review.
Pure Compounds from All-Natural Product Sources Identified as Invalid Metabolic Panaceas
From the aspect of natural product-based interfering compounds, the analytical review published by Bisson et al18 identified what is effectively the natural products equivalent of PAINS. They called these agents invalid metabolic panaceas, and made the very valid point that the active principles in many crude to partially purified natural product-based therapies are polyfactorial agents. The group used data from the contents of NAPRALERT (www.napralert.org), a database covering more than 80 years of information that is hosted in the Pharmacy College at the University of Illinois at Chicago. Table 1 in the review by Bisson et al18 is an eye-opener as to the promiscuity of many known natural product compounds, including, as numbers 1 through 5 in rank order, quercetin, gossypol, alpha-pinene, rutin, and berberine. This demonstrates the incipient problems of simply assaying even pure compounds, let alone a crude to partially purified extract, and claiming a valid bioactivity.
Current Methodologies in Extract Pretreatment
What is now becoming the major emphasis with natural product extracts before putting them in any assay, particularly an HTS assay, is the production of enriched fractions that are organized by polarity and molecular weight(s).19 A process utilizing adsorption and polarity-based separations was designed at the National Cancer Institute (NCI) to generate 10 fractions per initial extract in a format amenable to high throughput screening (384 well plates), covering the processing of more than 200,000 extracts. This process built on many years of work by the NCI with extracts from microbes, marine organisms, and plant collections. Later work from the same NCI group demonstrated how these separated fractions are processed further. Consideration was also given to the concomitant production and storage of spectral data, usually MSn and at times Fourier transform IR, so that if a particular fraction produces an initial hit, the isolation of any active component(s) is relatively simple, although identification might take somewhat longer. All data are machine-readable and multiple copies of each plate will be produced.20
A later review from basically the same NCI group21 discussed the problems with, and some potential solutions to, the utilization of natural products (extracts to pure compounds) in HTS systems. That review included a discussion of methods of assessing the probability of identifying PAINS compounds in the extracts produced. The techniques proposed require significant preliminary determinations as to the intrinsic problems associated with specific assays. Such determinations will usually need to be performed in sophisticated and well-funded laboratories in order to tag extracts and/or their fractions as containing potential PAINS-associated compounds, well before any significant further work is proposed on such extract fractions.
Part of the reason for subsequently producing the polarity-separated fractions referred to above and published in the last 3 references, in particular Wilson et al,21 was the experience gained by the NCI group of the problems associated with screening a large number of crude (unfractionated) natural product extracts during the 2012-2014 time frame against a variety of cancer cell-related Bcl-2 protein targets, in conjunction with a California group, and their subsequent follow-up of a few cases.22
Comment on the Natural Product Curcumin
As the best relatively current example of how much time and treasure has been wasted due to a multiplicity of problems as a result of publications that claim, usually without proper controls in the assays used, that “Compound X” will be the next lead against major diseases, an article by Nelson et al23 discussing curcumin should be required reading for any investigator studying natural products at any stage of purity.
The structures of the 3 major components of turmeric extract from which curcumin(oids) are extracted are shown in Figure 1. The 3 compounds together range from 1% to 6% by weight of turmeric on extraction from the host plant Curcuma longa, with the majority of the weight of the turmeric extract being due to carbohydrates. Of the 3 bioactive compounds, the major component curcumin is present at between 60% and 70% by weight, followed by demethoxycurcumin at 20% to 27% by weight and then bisdemethoxycurcumin at 10% to 15% by weight, depending upon the growth conditions and the collection area. Note that these proportions of the bioactive principles are “of the 1% to 6% by weight of the original extract.”
Figure 1.
Curcumin components of turmeric extracts.
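To make the nested percentages concrete, the short Python sketch below multiplies the curcuminoid content of turmeric (1% to 6% by weight) by each compound's share of the curcuminoids, using the ranges quoted above. Pairing the low ends together and the high ends together is a simplifying assumption made here for illustration, because the two quantities need not vary in step; the function and variable names are likewise hypothetical.

```python
def fraction_of_turmeric(curcuminoid_range, component_share_range):
    """Return the (low, high) weight fraction of turmeric for one curcuminoid."""
    low = curcuminoid_range[0] * component_share_range[0]
    high = curcuminoid_range[1] * component_share_range[1]
    return low, high


CURCUMINOIDS = (0.01, 0.06)  # 1% to 6% by weight of turmeric
SHARES = {
    "curcumin": (0.60, 0.70),
    "demethoxycurcumin": (0.20, 0.27),
    "bisdemethoxycurcumin": (0.10, 0.15),
}

for name, share in SHARES.items():
    low, high = fraction_of_turmeric(CURCUMINOIDS, share)
    print(f"{name}: {low:.2%} to {high:.2%} of turmeric by weight")
```

On these figures curcumin itself amounts to roughly 0.6% to 4.2% of the weight of turmeric, which is worth keeping in mind whenever an activity measured for a turmeric extract is attributed to curcumin.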
From a chemical perspective, the ketone and hydroxyl groups in the middle of all 3 compounds permit keto-enol tautomerism, and this leads to chemical instability. In addition, only the enol tautomer is soluble between pH 3 and 9. Thus, biological availability is not what users might think.
Since the late 1990s, curcumin has been reported, as a result of assays of various types, to have at least the following activities: anti-inflammatory, anti-HIV, antibacterial, antifungal, nematocidal, antiparasitic, antimutagenic, antidiabetic, and antifibrinogenic, in addition to being radioprotective, anticarcinogenic, and even useful in Alzheimer's disease, plus the old standby, antioxidant activity.
As of August 2021 there had been more than 250 clinical trials of curcuminoids (meaning regimens that included some version of curcumin) against diseases ranging from mucositis to various cancers. Of that number, only 13 studies were listed as “having results” in the National Institutes of Health Clinical Trials Database (www.ClinicalTrials.gov) as of the end of August 2021, with 1 being terminated at the Phase I or II level and only 1 going to Phase III as a potential treatment for radiation dermatitis. However, that trial, which finished in 2016, does not list any subsequent publication. Those 13 reports, as noted, came from more than 250 trials of curcuminoids in 1 form or another.
If one now uses the filters on the ClinicalTrials.gov database for “recruiting/not recruiting,” then there are 32 studies listed, with 5 at Phase III and 3 at Phase IV, 1 of which is not yet recruiting. It should be noted that Phase IV trials are supposedly for drugs that have been approved. However, no curcumin-based drugs had been approved as of the end of September 2019,24 and a check of subsequent data on drug approvals on the Food and Drug Administration website does not show any curcuminoid-based drug from October 2019 through the end of August 2021. Nor is there any published report of a successful Phase III clinical trial using a double-blinded, placebo-controlled design.
By using ClinicalTrials.gov as the resource and adjusting the filters to show diseases, drug treatment(s), methodologies, and so on, readers can see how many more trials of curcuminoids (and derivatives) at different clinical levels are underway. It should also be noted that this database covers not only US-sponsored trials, but also frequently comparable studies from overseas. In addition, similar free databases are available in the European Union, Japan, China, and Australia, which can also be searched as readers desire.
Conclusions
For the review of any article involving natural products, irrespective of source, the experimental details should be the first section that is read in detail. If, for example, there is an aqueous ethanolic extraction of the crude mass, followed by fractionation of that extract using lower polarity solvents, then any subsequent results are at best debatable because the initial extraction will leave the less-polar fractions behind. If the initial work-up procedures have little or no relationship to using assays at each step, be very careful of the quoted results. If assays involve what can be considered to be analytical spot tests (see the earlier discussion on antioxidant assays), be very aware of the promiscuity of such assays. If the assays only use purified compounds, be aware of the PAINS/invalid metabolic panacea compounds and their problems. Finally, the various problems mentioned above can be best expressed by the old Latin tag caveat emptor, which in English is “let the buyer beware.”
Conflicts of Interest
The author has indicated that he has no conflicts of interest regarding the content of this article.
References
1. Feng B.Y., Shelat A., Doman T.N., Guy R.K., Shoichet B.K. High-throughput assays for promiscuous inhibitors. Nat. Chem. Biol. 2005;1:146–148. doi: 10.1038/nchembio718.
2. Irwin J.J., Tang K.G., Young J., Dandarchuluun C., Wong B.R., Khurelbaatar M., Moroz Y.S., Mayfield J., Sayle R.A. ZINC20-A free ultralarge-scale chemical database for ligand discovery. J. Chem. Inf. Model. 2020;60:6065–6073. doi: 10.1021/acs.jcim.0c00675.
3. Wilson B.P., Thornburg C.C., Henrich C.J., Grkovic T., O'Keefe B.R. Creating and screening natural product libraries. Nat. Prod. Rep. 2020;37:893–918. doi: 10.1039/c9np00068b.
4. Rodrigues T. The good, the bad, and the ugly in chemical and biological data for machine learning. Drug Discov. Today Technol. 2019;32-33:3–8. doi: 10.1016/j.ddtec.2020.07.001.
5. Alhawarri M.B., Dianita R., Razak K.N.A., Mohamad S., Nogawa T., Wahab H.A. Antioxidant, anti-inflammatory, and inhibition of acetylcholinesterase potentials of Cassia timoriensis DC flowers. Molecules. 2021;26:2594. doi: 10.3390/molecules26092594.
6. Everette J.D., Bryant Q.M., Green A.M., Abbey Y.A., Wangila G.W., Walker R.B. A thorough study of reactivity of various compound classes towards the Folin-Ciocalteau reagent. J. Agric. Food Chem. 2010;58:8139–8144. doi: 10.1021/jf1005935.
7. Lowry O.H., Rosebrough N.J., Farr A.L., Randall R.J. Protein measurement with the Folin phenol reagent. J. Biol. Chem. 1951;193:265–275.
8. Wu H. A new colorimetric method for the determination of plasma proteins. J. Biol. Chem. 1922;51:33–39.
9. Folin O., Wu H. A system of blood analysis. J. Biol. Chem. 1919;38:81–110.
10. Buijs K., Maurice M.J. Some considerations on apparent deviations from Lambert-Beer's Law. Anal. Chim. Acta. 1969;47:469–474.
11. Baell J.B., Holloway G.A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 2010;53:2719–2740. doi: 10.1021/jm901137j.
12. Baell J.B., Walters M.A. Chemical con artists foil drug discovery. Nature. 2014;513:481–483. doi: 10.1038/513481a.
13. Baell J.B. Feeling Nature's PAINS: Natural products, natural product drugs, and pan assay interference compounds (PAINS). J. Nat. Prod. 2016;79:616–628. doi: 10.1021/acs.jnatprod.5b00947.
14. Dahlin J.L., Nissink J.W.M., Strasser J.M., Francis S., Liggins L., Zhou H., Zhang Z., Walters M.A. PAINS in the Assay: Chemical mechanisms of assay interference and promiscuous enzymatic inhibition observed during a sulfhydryl-scavenging HTS. J. Med. Chem. 2015;58:2091–2113. doi: 10.1021/jm5019093.
15. Dahlin J.L., Auld D.S., Rothenaigner I., Haney S., Sexton J.Z., Nissink J.W.M., Walsh J., Lee J.A., Strelow J.M., Willard F.S., Ferrins L., Baell J.B., Walters M.A., Hua B.K., Hadian K., Wagner B.K. Nuisance compounds in cellular assays. Cell Chem. Biol. 2021;28:356–370. doi: 10.1016/j.chembiol.2021.01.021.
16. Yang Z.-Y., Yang Z.-J., He J.-H., Lu A.-P., Liu S., Hou T.-J., Cao D.-S. Benchmarking the mechanisms of frequent hitters: limitation of PAINS alerts. Drug Discov. Today. 2021;26:1353–1358. doi: 10.1016/j.drudis.2021.02.003.
17. Wilson B.P., Thornburg C.C., Henrich C.J., Grkovic T., O'Keefe B.R. Creating and screening natural product libraries. Nat. Prod. Rep. 2020;37:893–918. doi: 10.1039/c9np00068b.
18. Bisson J., McAlpine J.B., Friesen J.B., Chen S.-N., Graham J., Pauli G.F. Can invalid bioactives undermine natural product-based drug discovery? J. Med. Chem. 2016;59:1671–1690. doi: 10.1021/acs.jmedchem.5b01009.
19. Thornburg C., Britt J., Evans J., Akee R., Whitt J., Trinh S., Harris M., Thompson J., Ewing T., Shipley S., Grothaus P., Newman D., Grkovic T., O'Keefe B. NCI program for natural product discovery: Generation of the largest, publicly accessible library of natural product fractions for high-throughput screening. ACS Chem. Biol. 2018;13:2484–2497. doi: 10.1021/acschembio.8b00389.
20. Grkovic T., Akee R.K., Thornburg C.C., Trinh S.K., Britt J.R., Harris M.J., Evans J.R., Kang U., Ensel S., Henrich C.J., Gustafson K.R., Schneider J.P., O'Keefe B.R. National Cancer Institute (NCI) program for natural products discovery: Rapid isolation and identification of biologically active natural products from the NCI prefractionated library. ACS Chem. Biol. 2020;15:1104–1111. doi: 10.1021/acschembio.0c00139.
21. Wilson B.P., Thornburg C.C., Henrich C.J., Grkovic T., O'Keefe B.R. Creating and screening natural product libraries. Nat. Prod. Rep. 2020;37:893–918. doi: 10.1039/c9np00068b.
22. Hassig C.A., Zeng F.-Y., Kung P., Kiankarimi M., Kim S., Diaz P.W., Zhai D., Welsh K., Morshedian S., Su Y., O'Keefe B., Newman D.J., Rusman Y., Kaur H., Salomon C.E., Brown S.G., Baire B., Michel A.R., Hoye T.R., Francis S., Georg G.I., Walters M.A., Divlianska D.B., Roth G.P., Wright A.E., Reed J.C. Ultra High Throughput Screening of Natural Product Extracts to Identify Pro-apoptotic Inhibitors of Bcl-2 Family Proteins. J. Biomol. Screening. 2014;19:1201–1211. doi: 10.1177/1087057114536227.
23. Nelson K.M., Dahlin J.L., Bisson J., Graham J., Pauli G.F., Walters M.A. The essential medicinal chemistry of curcumin. J. Med. Chem. 2017;60:1620–1637. doi: 10.1021/acs.jmedchem.6b00975.
24. Newman D.J., Cragg G.M. Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. J. Nat. Prod. 2020;83:770–803. doi: 10.1021/acs.jnatprod.9b01285.

