Abstract
Natural products continue to play important roles in biomedical, agricultural and ecological science. Yet despite ongoing advances in “omics” technologies, including genomics, transcriptomics, phenomics and metabolomics, there is still no clear consensus on the scope and scale of chemical diversity in the natural world. The evolution and maturation of chemical databases for natural products offer opportunities to explore this question from a range of different perspectives. This Outlook will use data from the Natural Products Atlas to examine rates of similarity and variation among biosynthetic classes of molecules, to explore how structure can be related to function, and to examine the scope and scale of new scaffold discovery in the current era of natural products science. It presents an examination of known chemical diversity, investigates what this diversity can tell us about potential translational applications, and explores how current knowledge informs what we might expect to discover in future studies.


Introduction
The natural world has long been exploited as a source of societally important small molecules. A central justification for the continued study of natural products is that nature contains a vast array of chemotypes, many of which have evolved to confer a competitive advantage to their producers. These molecules are hypothesized to be important in both ecological and translational settings, including host-microbe interactions and drug discovery.
Understanding the full landscape of chemical diversity in the natural world would influence many aspects of natural products science. For example, what is the potential return on investment of screening different types of natural product libraries (e.g., plants, sponges, fungi) against a given biological target? Which classes of natural products possess the highest potential for novel scaffold discovery? Is scaffold novelty required for identifying molecules with future value to society? If not, how does one assemble the highest diversity of scaffolds from the natural world in the most efficient format? This in turn raises interesting philosophical questions about the state of natural products research. For example, how much of naturally occurring chemical space is captured in the existing scientific literature? What is the untapped biosynthetic capacity left to discover? And how should this coverage influence future natural products investigations?
It is very difficult to characterize and classify the unknown. In the 1980s Staley and Konopka coined the phrase the “great plate count anomaly” to describe their observation of large discrepancies between the number of microorganisms in water samples that could be detected by direct counting and the number of colony-forming units that would grow on Petri plates in a laboratory setting. In a similar vein, natural products science is experiencing a “great biosynthetic gene cluster anomaly” where vastly more biosynthetic gene clusters (BGCs) have been detected in genomic DNA than there are known natural products in the scientific literature. It is not clear whether this is because only a small fraction of available natural products have been identified to date, because current annotation technologies fail to recognize BGCs with low sequence homologies that make similar products, because the compounds from many BGCs are not produced under laboratory conditions, or because they are expressed at such low levels that they are difficult to detect and characterize. Natural products diversity can therefore be viewed through two different lenses. On the one hand, increasing rates of rediscovery suggest that many scaffolds have already been encountered. By contrast, examination of genomic and metabolomic data and heterologous expression of targeted BGCs continue to suggest the presence of a diverse array of novel scaffolds.
Although the chemical environment is constantly evolving, at any given time point the number of chemical structures on the globe is fixed. This chemical diversity is finite and can be organized into molecular classes based on structural similarity or biosynthetic origin. On an evolutionary time scale the entire history of molecular natural products science arguably forms one such “snapshot” for isolable and structurally defined metabolites. This Outlook will use the Natural Products Atlas, a database of published microbial natural product structures, to examine the subject of natural products diversity from several different perspectives. It will consider both the organization of compound scaffolds within the data set and the frequency of discovery of molecules with unique carbon skeletons, with the aim of stimulating discussion in this interesting and complex subject area.
How is Chemical Diversity Defined?
Although “chemical diversity” is often used in medicinal chemistry to describe variation in chemical structure, it lacks an accepted definition or method for quantification. In many cases measures of chemical diversity convert molecular representations of structures (i.e., graph representations in which atoms (nodes) are connected through one or more bonds (edges)) into an array of values (a “fingerprint” or molecular encoding) where each entry describes the presence or absence of a specific structural attribute (e.g., functional groups in a list). These fingerprints can then be compared to one another, with more similar fingerprints indicating more similar structures. Grouping using these approaches is influenced by two factors. First, the fingerprinting method influences which functional groups are detected/prioritized. Second, the method used for scoring fingerprint similarity influences group membership.
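The second factor can be illustrated with a minimal sketch. It assumes fingerprints are stored as sets of "on" feature indices; the toy fingerprints are hypothetical and do not correspond to real molecules. The Tanimoto coefficient is included only as a familiar alternative scoring method for comparison, not because it is used elsewhere in this article.

```python
def dice_similarity(fp_a: set, fp_b: set) -> float:
    """Dice coefficient: twice the shared features over the total feature count."""
    if not fp_a and not fp_b:
        return 0.0
    return 2 * len(fp_a & fp_b) / (len(fp_a) + len(fp_b))


def tanimoto_similarity(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared features over the union of features."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


# Hypothetical fingerprints: each integer stands for one structural attribute.
fp_x = {1, 2, 3, 4}
fp_y = {1, 2, 3, 5}

dice_score = dice_similarity(fp_x, fp_y)          # 2 * 3 / (4 + 4) = 0.75
tanimoto_score = tanimoto_similarity(fp_x, fp_y)  # 3 / 5 = 0.6

# The same pair passes a 0.75 cutoff under one metric and fails under the other.
print(dice_score >= 0.75, tanimoto_score >= 0.75)
```

With an identical cutoff, the same pair of fingerprints is grouped under one scoring method and separated under the other, which is why the choice of metric and threshold has to be stated alongside any diversity claim.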
Without clear definitions of chemical similarity and chemical diversity, assessment of chemical space becomes a subjective exercise. Taken to an extreme, it is possible to select fingerprinting methods and similarity score cutoffs that indicate that all natural products are structurally unrelated, or conversely that all natural products are part of one structural “super family”. However, chemical similarity methods can offer powerful insight into questions surrounding chemical diversity in nature if appropriately optimized. Recent examples include models for automatically assigning biosynthetic class based on structure or determining natural product “likeness”, examination of functional group distributions in natural and synthetic molecules, and several studies examining structural differences between source organism types.
This article will use chemical similarity scoring to explore topics surrounding chemical redundancy and structural novelty in natural products. This approach is subject to several limitations, not least of which is that it is only possible to analyze molecules whose structures have previously been determined. Therefore, while these analyses can inform our thinking on known chemistry from nature, they cannot directly inform our understanding of future discovery potential, or the evolutionary basis of this diversity. By contrast, genomic analysis of biosynthetic potential can sometimes provide information about the evolutionary origins of BGCs but is typically not able to predict precise chemical structures and cannot determine whether BGCs are functional under a particular growth condition.
The Similarity Landscape for Microbial Natural Products
The Natural Products Atlas uses the Morgan method (radius 2) for fingerprinting and the Dice metric (cutoff = 0.75) to score similarity between fingerprints. Applying these methods to the full database (36,454 compounds, version: v2024_09) generates 4,148 clusters containing two or more compounds which together contain 30,094 compounds (82.6%). The median cluster size is 3, and the number of clusters containing at least five members is 1,209. Somewhat surprisingly 1,093 of these clusters are at least 95% fungal or bacterial by origin. This indicates that scaffold diversity is split cleanly along taxonomic lines, with very few examples of compound classes that are made by both source types despite the use of the same primary metabolism building blocks by both kingdoms.
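The grouping step can be sketched as follows. This is an illustration under stated assumptions, not the Natural Products Atlas implementation: it assumes that a cluster is a connected component of the graph whose edges join compound pairs scoring at or above the cutoff, and it uses hypothetical set-based toy fingerprints in place of Morgan fingerprints.

```python
from itertools import combinations


def dice(fp_a, fp_b):
    """Dice coefficient between two fingerprints stored as sets."""
    total = len(fp_a) + len(fp_b)
    return 2 * len(fp_a & fp_b) / total if total else 0.0


def cluster(fingerprints, cutoff=0.75):
    """Return connected components of the similarity graph (assumed cluster definition)."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent links to the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if dice(fingerprints[a], fingerprints[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


# Hypothetical compounds: cpd_3 is linked to cpd_2 but not directly to cpd_1.
toy = {
    "cpd_1": {1, 2, 3, 4},
    "cpd_2": {1, 2, 3, 5},
    "cpd_3": {1, 2, 5, 6},
    "cpd_4": {10, 11, 12},
}

clusters = cluster(toy)
clustered = sum(len(c) for c in clusters if len(c) >= 2)
print(clusters, f"{100 * clustered / len(toy):.1f}% clustered")

# Consistency check on the figures quoted above for the full database.
print(round(100 * 30094 / 36454, 1), 36454 - 30094)
```

Under this definition a compound joins a cluster if it is similar to any one member, so two subgroups bridged by a single compound pair end up in the same cluster.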
This grouping method typically creates clusters whose core scaffolds are closely related, meaning that molecules with similar biosynthetic origins tend to group together. As an example, cluster 70 contains 50 compounds, subdivided into three subclusters (Figure 1A). The central subcluster (box 2) contains the pyridone-containing metabolite piericidin A1 and 19 structurally related variants. Box 1 contains glycosylated piericidin derivatives. There is high interconnectivity both within the glycosylated variants in box 1, and between glycosylated and nonglycosylated members of this family (box 2). Box 3 contains members of the γ-pyrone-containing actinopyrone, yoshinone and kalkipyrone families. Interestingly, although the carbon skeletons of the piericidins and γ-pyrones are very similar, substitution of oxygen for nitrogen in the core heterocycle motif significantly alters the chemical fingerprint, meaning that there is low interconnectivity between these two groups with only a single compound pair linking the two subclusters.
Figure 1. Trends in chemical diversity in the Natural Products Atlas. (A) Network diagram for cluster 70. Blue nodes are individual compounds. Edges indicate chemical similarity at or above 0.75 (Morgan fingerprint radius 2, Dice score). For a labeled version of the panel see Data Availability. (B) Plot of cluster size vs median edge count for clusters containing up to 250 members. For underlying plot data see Data Availability. (C) Plot of compound rank vs similarity score for each of the 245 members of the microcystin cluster (cluster 50) against the full NP Atlas database. Each data point represents a compound from the NP Atlas database, ranked by decreasing similarity to the test molecule (top 500 matches shown). Compounds from cluster 50 are shown in red. For underlying plot data see Data Availability.
This trait of high connectivity within clusters but low similarity to other scaffold classes is common within the data set. One way to visualize this is to plot cluster size vs median number of connections to other compounds in the cluster (edge count). Clusters with high structural similarity (and therefore high interconnectivity) will have high median edge counts, whereas clusters containing a broader range of scaffolds will have lower median edge counts. Figure 1B presents this plot for clusters up to 250 members in size. Among these, three clusters stand out as having very high interconnectivities. These clusters contain microcystins (cluster 50), peptaibols (cluster 263) and anabaenopeptins (cluster 415), all of which are well-known compound classes in natural product science.
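The median edge count can be computed directly from a cluster's edge list. The sketch below uses two hypothetical four-member clusters, one fully connected and one chain-like, to show how the statistic separates tight from loose groupings; the function name is illustrative.

```python
from statistics import median


def median_edge_count(members, edges):
    """Median number of within-cluster connections (degree) per compound."""
    degree = {m: 0 for m in members}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return median(degree.values())


members = ["a", "b", "c", "d"]

# Tight cluster: every compound is connected to every other one.
tight_edges = [("a", "b"), ("a", "c"), ("a", "d"),
               ("b", "c"), ("b", "d"), ("c", "d")]

# Loose cluster: compounds are linked only to their neighbors in a chain.
loose_edges = [("a", "b"), ("b", "c"), ("c", "d")]

tight = median_edge_count(members, tight_edges)  # degrees 3, 3, 3, 3
loose = median_edge_count(members, loose_edges)  # degrees 1, 2, 2, 1
print(tight, loose)
```

For a cluster of n compounds the largest possible edge count per compound is n − 1, so a median close to n − 1 indicates that most members are similar to most other members.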
Microcystins are of particular societal importance due to their role as toxins in harmful algal blooms, and are subject to public health monitoring in freshwater environments. Interestingly, the median edge count (196) for this cluster is very close to the cluster size (245), indicating very high interconnectivity between compounds. This tight grouping could be because these compounds form an “island of chemical diversity” that is distinct from all other scaffolds in the data set. Alternatively, it could be an artifact of setting a similarity cutoff of 0.75 for cluster membership. To examine this question, each member of the cluster was separately compared to all compounds in the NP Atlas and the results ranked by descending similarity score. Figure 1C presents these results for all 245 members of the microcystin cluster, plotting similarity rank (x-axis) against similarity score (y-axis). Compounds are colored red if they are members of the microcystin cluster. This plot indicates that, as expected, there are very high similarity scores between members of the cluster, but that there is a steep decrease in similarity score at the boundary of the cluster, confirming that these molecules are structurally distinct from all other members of the data set.
What Are the Drivers of Chemical Diversification in Nature?
This observation raises the question of why observed chemical diversity is sometimes localized into such structural “hotspots” rather than being more evenly distributed. Although this Outlook is focused on discussing observable trends in chemical space for known molecules, it is worth briefly considering the factors that may contribute to (or define) limitations in chemical diversification. Early Darwinian proposals suggested that natural product evolution is driven by organismal fitness, and that to be retained a molecule must confer an advantage to its producer. Following this logic, it was argued that structures are only retained if they interact with receptors present in competitor organisms, and that molecular recognition and consequent biological function are the principal drivers of molecular diversity. Subsequently, Jones and Firn presented the “screening hypothesis” in the early 1990s, which suggested that at any given time most natural products conferred no competitive advantage to the producing organism, but that by hosting a diverse array of scaffolds organisms were better prepared to respond to emerging threats to their survival. More recently, others have reframed this idea to suggest that natural products play no single defined role in host fitness, but rather provide different advantages under different scenarios, and that it is this collection of potential positive responses that drives selection (the Dynamic Matrix Chemical Evolutionary hypothesis).
Separately from evolutionary drivers, chemical diversification is also influenced by the fundamental tenets of physical organic chemistry. For example, synthetic chemists have long recognized the difficulties associated with synthesizing medium-sized macrocyclic rings (8–11 members), and polyketides with these ring sizes are not commonly encountered as natural products. In a similar vein, cyclic tetrapeptides are known to be challenging to form using standard organic chemistry methods in the absence of one or more turn-inducing residues (proline, D-amino acids, etc.) to relieve ring strain, and are similarly rare in nature.
Chemical diversification may therefore be considered to be limited by two independent factors: physical organic chemistry, which influences what can be made, and ecological factors, which influence what is retained. Of course, these two factors intersect when one adds the issue of biochemical pathway evolution, which places additional constraints on accessible chemical diversity. Returning to the microcystin example discussed above, it is tempting to speculate that environmental function, rather than chemical inaccessibility, is the driving factor that limits scaffold diversity in this case. Within the microcystin class most amino acid residues are highly conserved, with most variation occurring at just two positions in the ring. Split-pool solid phase peptide synthesis and DNA-encoded libraries have demonstrated that it is possible to synthesize enormous compound libraries (∼10^6 compounds) under standard coupling conditions, meaning that there is no intrinsic reason why diversity should be limited by synthetic accessibility. Instead, it is possible that variations at conserved positions in the ring modify the tertiary structure or other physicochemical properties, removing or reducing their environmental functions, and eliminating the selective pressure for their retention. If true, this would help to explain the large drop in structural similarity seen for molecules outside the microcystin cluster.
Variation on a Theme is a Prevalent Trend
The analysis presented in Figure 1 suggests that, while there is a high degree of natural variation around many scaffolds, these variations are often limited to minor changes to the core pharmacophore. This observation is supported by the full cluster network (Data Availability), which contains both highly interconnected clusters and many clusters that include highly interconnected subregions. This reinforces many of the concepts discussed in the previous section, and demonstrates that “variation on a theme” is a common phenomenon in natural products.
Figure 2. Network diagram for platensimycin/platencin cluster (cluster 112). (A) Labeled cluster network, indicating reported biological activities against the Gram-positive bacterial pathogen Staphylococcus aureus. (B–D) Networks indicating presence/absence of three core pharmacophore units for this cluster. Green = present, yellow = absent. MIC data derived from original isolation papers. A Cytoscape file including original citations for bioactivity data is included in the Data Availability section.
The concept of natural pharmacophore optimization is intriguing, as it suggests that cluster interconnectivity could be used as a proxy for determining structure–activity relationships for some compound classes. Considering microbially derived antibiotics, whose therapeutic function likely mirrors a key function in nature, we find that this hypothesis is supported in many cases. For example, platensimycin and platencin, two related terpene-aminobenzoic acid conjugates, are both found in the same cluster along with 46 related analogues (Figure 2A). Platensimycin and platencin both disrupt fatty acid biosynthesis through inhibition of acyl intermediates of the condensing enzymes FabB/F/H. Given their unique mode of action these compounds were the subject of intense investigation in the early 2000s. Fortunately, most derivatives were screened against the same panel of target organisms, making it possible to directly compare their biological activities. Overlaying the MIC values from broth microdilution assays against Staphylococcus aureus onto the cluster (Figure 2A) reveals that the analogues with the most potent antibacterial activities lie at the center of the cluster, with high interconnectivities to one another. By contrast, compounds at the edges of the cluster are uniformly inactive in this assay.
It is possible that this is because the central nodes possess the optimal structures for antibacterial activity in the environment, and that the compounds on the perimeter of the cluster are less active natural variations of the core scaffold due to shunt metabolism, post-translational modification, or natural variation in biosynthetic logic. The 3-amino-2,4-dihydroxybenzoic acid (ADHBA) moiety has been shown to form key contacts in the active site of FabF. Unsurprisingly, structural variants lacking the ADHBA motif (e.g., platencin vs platencin A6 and platensimycin vs platensimide A, Figure 2A) are inactive in this assay and are not closely structurally related to many other members of the cluster. Panels 2B–D indicate the presence/absence of the ADHBA, platencin terpene and platensimycin terpene substructures. Active members of the cluster include both the ADHBA subunit and either the platencin or platensimycin terpene core, providing direct insight into the structure–activity relationship for this compound class. More broadly, this observation suggests that in some cases cluster architecture may be useful in defining pharmacophores for the endogenous roles of natural products, even if those roles are unknown and no biological screening data exist.
Biosynthetic Potential vs Chemical Reality
By even the simplest approximation, the biosynthetic capacity of nature is vast. Even considering only the most common motifs present in natural product structures, the number of possible combinations is enormous. The opportunity offered by this chemical diversity is one of the principal reasons why natural products science remains an important and integral part of modern biomedical research. However, as discussed above it is difficult to estimate the true diversity of chemical space in nature. Selection pressures including the restrictions imposed by physical organic chemistry, and the requirement that molecules must convey a competitive advantage to their producing organism likely have significant influence on this diversity. This raises the question of what fraction of possible chemical space is known to exist in nature.
As an example of biosynthetic potential versus chemical reality we can consider the production of polyketide synthase-derived macrocyclic lactones. Assembly line polyketide biosynthesis involves the sequential addition of monomeric subunits to a growing polyketide chain, followed by optional modification of each subunit prior to further chain extension. The completed chain is often offloaded by intramolecular cyclization, yielding a macrocyclic lactone product. Simplifying this process to consider only the addition of two-carbon malonyl-CoA subunits, there are five possible outcomes from each chain extension step: Addition of a keto subunit, reduction of the ketone to either the R or S alcohol, dehydration of the alcohol to form an olefin, and reduction of the olefin to yield a saturated two carbon unit.
Formation of a 16-member macrocycle requires the addition of 8 monomer units, one of which is retained as a keto group to anchor the growing chain to the ketosynthase and provide the required position for macrolactonization. In principle, the remaining seven units can form 5^7 (78,125) products. In reality, the number of variations is far higher because monomer addition can include a range of alternative substrates (e.g., propionate, methylmalonyl-CoA) and a wide range of additional modifications are possible at each subunit (e.g., oxidation, amination, glycosylation). It is therefore surprising that searching the Natural Products Atlas for polyketides containing a 16-member macrolactone core returns only 312 structures, grouped into 13 compound classes containing three or more members (Figure 3, A–N).
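The count follows from treating each of the seven variable extension units as an independent choice among the five outcomes listed above. A short sketch confirms the arithmetic both by formula and by explicit enumeration; the outcome labels are shorthand introduced here for illustration.

```python
from itertools import product

# Shorthand labels for the five outcomes of each malonyl-CoA extension step.
OUTCOMES = ("keto", "R-alcohol", "S-alcohol", "olefin", "saturated")

# Eight units build the 16-member ring; one is fixed as a keto group.
VARIABLE_UNITS = 7

by_formula = len(OUTCOMES) ** VARIABLE_UNITS
by_enumeration = sum(1 for _ in product(OUTCOMES, repeat=VARIABLE_UNITS))

print(by_formula, by_enumeration)
```

This simplified count ignores alternative extender units and tailoring modifications, so it is a lower bound on the theoretical space rather than an estimate of it.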
Figure 3. Network of all molecules from the Natural Products Atlas containing a 16-membered macrocyclic lactone. Blue nodes are individual compounds. Edges indicate chemical similarity at or above 0.75 (Morgan fingerprint radius 2, Dice score). For full labeled network see Data Availability.
Among these groups, three (A, B and C) contain very similar polyketide cores, with structural differences driven by differing sites of glycosylation around the macrolactone core. A similar situation is also observed in the avermectin class, with groups D and E differing only by the presence (E) or absence (D) of glycosylation on the ring. Lastly, two groups (H and I) are structural variations of the abyssomicin core that differ in the connectivities of intramolecular cyclizations in the central ring structure. Together these similarities reduce the number of discrete PKS classes to 10.
Closer examination of these structures indicates that many are likely constructed by polyketide synthases containing more than eight modules. Several classes contain 16 member macrolactones that are substructures of larger ring systems (VM-44857, D; avermectin A1a, E; spirohexenolide A, M; phocoenamicin, N). In addition, several structures contain long polyketide tails that extend beyond the position of lactonization (bafilomycin A1, F; rhizoxin, K). Finally, one class (abyssomicins, H and I) is constructed using the noncanonical incorporation of glycerol in place of module 8 in the assembly line. In total only four structural classes (leucomycins/cirramycins/mycinamicins, epothilones, brefeldins, berkeleylactones) derive from PKS assembly lines containing eight modules.
This limited diversity is somewhat surprising given the wide range of approaches that have been applied to natural products discovery over the past century. Natural product discovery is biased by many factors, including biological activity, chromatographic properties, chemical stability, and environmental titer among others. Nevertheless, the repeated discovery of a small subset of macrolactones by many different research teams with different research objectives over many decades suggests that these scaffolds are widely distributed in the environment and that other macrocycles are either not prioritized using typical isolation protocols, not constructed because of organic chemistry constraints, or not retained by evolutionary selection.
The Long Tail of Chemical Novelty
While 82.6% of the compounds in the Natural Products Atlas are present in clusters containing two or more compounds, the remaining 17.4% (6,360 compounds) are singletons that are structurally distinct from all other microbial natural products. These include highly selective signaling molecules such as autoinducers-2 and -3 that play critical roles in interspecies and cross-kingdom communication, as well as molecules such as geldanamycin G that are structural variants of well-known natural products (Figure 4A). This highlights one of the central advantages of natural products as a source of structural diversity, namely that unusual structures can derive either from evolution of unique chemical scaffolds (e.g., autoinducers) or from biosynthetic flexibility (e.g., geldanamycin G).
Figure 4. Distribution of structurally unusual natural products in the Natural Products Atlas. (A) Examples of chemical structures that form “singleton” clusters (i.e., are not related to any other structure in the database at a Dice score ≥ 0.75; Morgan fingerprint radius 2). (B) Number of singleton compounds (blue) and total number of compounds (red) in the Natural Products Atlas separated by year of discovery. (C) Plot of the percentage of compounds that were singletons in the year of their discovery (green) and the percentage of compounds in each year that remain singletons when compared to all known molecules in the database (blue). For underlying data for panels B and C see Data Availability.
Arguably, this pool of structurally unique molecules is one of the most valuable and important elements of natural products research. The presence of these molecules in the data set raises several questions about their origin and distribution. For example, is structural novelty an artifact of date of discovery or are there examples of molecules that remain unique decades after their original isolation? Are most of these molecules produced by unique biochemical pathways or are they structural variations of well-established chemical classes? What is the taxonomic distribution among this group? And to what extent does niche-specific adaptation contribute to the evolution of unique/rare scaffolds?
It is well recognized that de novo structure elucidation is more challenging than determining the structures of known compound classes. This is because complex structural motifs can be difficult to identify, and nuclear magnetic resonance spectra and mass spectrometry data alone are sometimes insufficient to unequivocally determine the structures of densely functionalized molecules. By contrast, matching the spectral data of a new analogue to literature data of a related congener is significantly more straightforward, requiring the researcher only to determine the point(s) of difference with the existing structure. Following this logic, one might expect that the compounds that were unique at their time of initial discovery would become members of larger clusters as additional congeners are discovered.
To test this hypothesis, compounds were separated by year of original discovery and compared to all compounds discovered in previous years, to assess degree of novelty at the time of discovery (contemporary singleton percentage, Figure 4C, green line). Separately, compounds in each year were compared to all other compounds in the database (excluding self-comparison) to assess the current singleton percentage (blue line). As expected, rates of contemporary novelty are higher in all cases than rates of current novelty because some proportion of singletons become members of compound families in subsequent years. In the early years of the plot (1940–1970) the rates of contemporary novelty are often high because the total number of known natural products was low, and many discoveries were “first in class”. Over time, as the number of known natural products has grown and the annual rate of discovery has increased, the percentage of contemporary singletons has decreased, reaching a plateau of ∼20% of molecules isolated per year.
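The two percentages can be expressed as a short procedure. The sketch below is illustrative only: it assumes set-based toy fingerprints, a Dice cutoff of 0.75, and four hypothetical compounds with invented discovery years, and it follows the comparison rules described above (previous years only for the contemporary rate; all other compounds for the current rate).

```python
from collections import defaultdict


def dice(fp_a, fp_b):
    """Dice coefficient between two fingerprints stored as sets."""
    total = len(fp_a) + len(fp_b)
    return 2 * len(fp_a & fp_b) / total if total else 0.0


def singleton_rates(compounds, cutoff=0.75):
    """Map each year to (contemporary %, current %) singleton rates.

    compounds: list of (name, year, fingerprint) tuples.
    """
    by_year = defaultdict(list)
    for entry in compounds:
        by_year[entry[1]].append(entry)

    rates = {}
    for year, batch in sorted(by_year.items()):
        contemporary = current = 0
        for name, _, fp in batch:
            earlier = [c for c in compounds if c[1] < year]
            others = [c for c in compounds if c[0] != name]
            if not any(dice(fp, c[2]) >= cutoff for c in earlier):
                contemporary += 1  # novel relative to all previous years
            if not any(dice(fp, c[2]) >= cutoff for c in others):
                current += 1       # still unmatched against the full data set
        rates[year] = (100 * contemporary / len(batch),
                       100 * current / len(batch))
    return rates


# Hypothetical data: compound B (1960) is an analogue of compound A (1950).
toy = [
    ("A", 1950, {1, 2, 3, 4}),
    ("B", 1960, {1, 2, 3, 5}),
    ("C", 1960, {20, 21, 22}),
    ("D", 1970, {30, 31}),
]

rates = singleton_rates(toy)
print(rates)
```

In the toy data compound A is a singleton when discovered but loses that status once its analogue B is reported, which is the mechanism that keeps the current rate at or below the contemporary rate.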
Interestingly, the percentage of singletons has remained remarkably consistent at ∼20% since the 1970s, in the face of a dramatic increase in the number of known scaffolds from microbial sources. Over the same period the percentage of molecules that were singletons in their year of discovery has gradually decreased, suggesting that many of the common classes of microbial natural products are now known. However, this analysis should not be used to suggest that the number of novel scaffolds being discovered today is decreasing. Because many compounds are isolated as families, consideration of singletons tells only part of the story. Instead, every cluster in the data set, from the very large clusters discussed in Figure 1 to the smallest clusters containing just two compounds, represents chemical space that is distinct from all other scaffolds in the data set. Both new clusters and new singletons continue to be discovered at high rates, indicating that the natural world continues to be a diverse and important source of chemical novelty. Together, the data in Figures 1 and 4 suggest that natural product structures occupy a continuum, from highly refined molecular classes with low structural variation that are widely distributed throughout the natural world, to rare scaffolds that remain structurally unique decades after their initial discoveries. This diversity analysis, coupled with the promise offered by large-scale sequencing programs, suggests that there remains a wealth of new chemical matter to discover from natural sources.
Outlook
Natural products science remains an important element of biomedical research and is central to the study of ecological systems and host-microbe interactions (e.g., human microbiome). Over the past 50 years our understanding of how these molecules are made in the environment has matured to the point that most molecules have clear biosynthetic origins. At the same time, both analytical and sequencing technologies have advanced dramatically, enabling the field to explore both the chemical composition and the biosynthetic potential of large organism libraries. What these analyses reveal is that we have an as-yet incomplete understanding of the chemistry found in nature. While many common scaffolds are now well characterized and their biosynthesis known, new class members are being discovered at high rates. For example, ribosomally synthesized and post-translationally modified peptides (RiPPs) have evolved from a biosynthetic curiosity when first discovered to a major and important class of natural products that are now found in all domains of life.
Nature continues to reveal spectacular and unexpected molecules at every turn. It will be interesting to see what the next decade reveals about the chemistry of the natural world.
Beyond natural products discovery, this analysis suggests that there is significant opportunity for research programs involved in the production of “non-natural” products. The assessment of PKS diversity (Figure 3) suggests that only a small proportion of possible biosynthetic combinations are found in the environment. Ex vivo biosynthetic strategies such as transient plant expression offer exciting new opportunities for library-scale production of natural product-like libraries to generate scaffolds outside those likely to be found in nature, but which use naturally occurring biosynthetic enzymes for their production. Going one step further, there clearly remains significant value in the development of modular methods in organic synthesis to create molecules that capture the structural features of natural products without being limited by the availability and/or interoperability of biosynthetic enzymes to construct these molecules.
Another important consideration for the field is the role that informatics and information science will play in the coming decades. As previously noted, supporting the development of new technologies to better systematize the extraction of accurate structural information from genomes and metabolomes will be critical to future success. Tied to this is the importance of accurate extraction of published data into open-source databases. To be of highest value, data must be curated accurately at the outset, and existing entries must be reviewed regularly for updates and corrections. If experimental data and structures are not correctly aligned, data errors can significantly reduce the accuracy of tools that use these databases for model training (e.g., predicting MS2 or NMR spectra). Such errors can further complicate analysis by inflating the number of candidate structures that must be considered and by including functional groups that are not actually present in nature.
As an example of the difficulties raised by this issue, in preparing this article the structure of the dichloromethyl ether-containing metabolite acrodontiolamide was revisited. This structure is a singleton in the current version of the Natural Products Atlas and is the only molecule containing the unusual dichloromethyl ether functional group. Reinspection of the published 13C NMR data suggested that the compound might be a misassignment of the well-known microbial antibiotic chloramphenicol, which contains a dichloromethyl ketone in place of the dichloromethyl ether. Comparison against calculated data for both candidate structures supported this reassignment, which was confirmed by comparing the published data to experimental NMR data for chloramphenicol available in the Natural Products Magnetic Resonance Database. Unfortunately, the incorrect structure can now be found in other chemical structure databases and is likely to persist for the foreseeable future, even once the entry is updated in the Natural Products Atlas, highlighting the challenge of accurate data management in this area.
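As a rough illustration of how published shifts might be compared against calculated data for competing structures, the sketch below ranks candidates by the mean absolute error (MAE) between experimental and calculated 13C chemical shifts. It is not the procedure used for the acrodontiolamide reassignment. The function names are hypothetical, the shift values are placeholders rather than published or calculated data, and pairing shifts by sorted order is a simplifying assumption for cases where atom-by-atom assignments are unavailable.

```python
from statistics import mean


def mean_abs_error(experimental, calculated):
    """Mean absolute error (ppm) between two 13C shift lists.

    Assumption: shifts are paired by sorted order, a simplification
    used when atom-by-atom assignments are not available.
    """
    if len(experimental) != len(calculated):
        raise ValueError("shift lists must have the same length")
    pairs = zip(sorted(experimental), sorted(calculated))
    return mean(abs(e - c) for e, c in pairs)


def rank_candidates(experimental, candidates):
    """Return (name, MAE) tuples ordered from best to worst match."""
    scores = [
        (name, mean_abs_error(experimental, shifts))
        for name, shifts in candidates.items()
    ]
    return sorted(scores, key=lambda item: item[1])


# Placeholder values for illustration only; these are NOT the
# published or calculated shifts for any real compound.
experimental = [170.0, 130.0, 75.0, 66.0, 60.0]
candidates = {
    "candidate_A": [168.0, 131.0, 74.0, 67.0, 61.0],
    "candidate_B": [150.0, 128.0, 95.0, 70.0, 58.0],
}

ranking = rank_candidates(experimental, candidates)
for name, mae in ranking:
    print(f"{name}: MAE = {mae:.1f} ppm")
```

A lower MAE indicates closer agreement with the experimental data, but a score of this kind only supports a reassignment. As in the case described above, confirmation still depends on comparison with experimental reference data.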
Natural products science is enjoying a period of renewed prominence in both basic science and translational settings. Following over a decade of effort focused on the development of new technologies to systematize and automate information extraction from genomic, metabolomic, and screening data, natural products research is once again the central focus of large-scale discovery programs for a range of human health and agricultural applications. This includes the formation of new companies dedicated to natural products science as well as the expansion of large-scale collaborative projects with academic/industrial natural products libraries. Many of these efforts include both known and novel chemical scaffolds in therapeutic development pipelines, meaning that the full range of chemical space available from nature is relevant to these programs, rather than just the much smaller set of novel scaffolds found each year. Lastly, expansion of investigation into new areas of taxonomic space continues to yield an exciting and unusual array of new chemotypes, many of which have no precedent in natural products science. From metagenomic sequencing and the identification of obligate symbionts to the investigation of taxonomic groups that have been historically understudied (e.g., Burkholderiales and Acidobacter), Nature continues to reveal spectacular and unexpected molecules at every turn. It will be interesting to see what the next decade reveals about the chemistry of the natural world.
Supplementary Material
Spreadsheets and Cytoscape files containing the underlying data for Figures – are available on Zenodo (10.5281/zenodo.15346913).
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acscentsci.5c00804.
Transparent Peer Review report available (PDF)
Funding for this work was provided by the Natural Sciences and Engineering Research Council of Canada Discovery program and the National Institutes of Health (Center for Complementary and Integrative Health; Office of Dietary Supplements; U41-AT008718).
The author declares no competing financial interest.