2025 Sep 25;88(10):2546–2563. doi: 10.1021/acs.jnatprod.5c00876

The Role of Nucleotide Sequencing in Natural Product Drug Discovery

Samantha C Waterworth 1,*
PMCID: PMC12560086  PMID: 40998407

Abstract

For millennia, humans have drawn upon natural sources for medicinal remedies. However, the discovery process was often a serendipitous endeavor, and the true biosynthetic potential of the planet’s biodiversity remained largely hidden. The advent of nucleotide sequencing offered a pivotal shift, providing tools to decipher the cryptic codes within biosynthetic machinery. In this Perspective, I describe the arc of the sequencing era, from its foundational impact to the current landscape of sophisticated bioinformatic tools. Furthermore, I will address the international legal frameworks being implemented to protect the very biodiversity upon which the field of natural product discovery relies.

Keywords: drug discovery, biodiversity, natural products, biosynthesis





Introduction

Imagine you knew that incredible treasure lay somewhere over the horizon, and now imagine someone gave you a map to that treasure, and every year, it gets increasingly detailed. You are a creative scientist; how would you use it? Nucleotide sequencing, coupled with advanced computational algorithms, now provides just such a treasure map to the untapped reservoir of natural products hidden in the genetic code of organisms. It offers a direct view into the biosynthetic machinery of life, from microbial players like bacteria and fungi to larger organisms like plants and nematodes. By rapidly identifying biosynthetic gene clusters (BGCs) and revealing their evolutionary context and regulatory triggers, genomics has become an indispensable engine for accelerating the discovery, characterization, and understanding of natural product compounds.

My own entry into natural product research, only 12 years ago in 2013, coincided with a field in the midst of a seismic technological shift. My timing gave me a front row seat to the rapid transition away from the methods at the time. My very first projects involved assessing microbial diversity with techniques like denaturing gradient gel electrophoresis (DGGE), a labor-intensive approach of temperamental gels and toxic chemicals. Yet, in what felt like a blink, the entire field pivoted to (relatively) high-throughput amplicon sequencing for assessment of the same bacterial diversity. I vividly recall the steep learning curve: from barely knowing where to find the command-line terminal, I was suddenly navigating it to analyze growing 16S rRNA gene amplicon sequence data sets. This acceleration did not stop. By 2014, I was attempting to assemble my first bacterial genome in search of genes encoding an elusive antimicrobial compound. By 2016, I was evaluating entire metagenomes: hunting for bacteria with a talent for bioactive compound production. The subsequent decade has witnessed an explosion in both sequencing power and sophisticated algorithms that unlock meaning from within that data, expanding the frontiers of natural product discovery. Here, I aim to provide an overview of these tools and technologies, illustrate their applications through case studies, and explore what the future may hold for sequencing in natural product discovery.

A (Very) Brief History of Sequencing Technologies

The history of sequencing has been reviewed several times, but the pivotal and transformational moment really came in 1977 when Frederick Sanger and colleagues published their method for determining nucleic acid sequences via chain termination with dideoxynucleotides. A different approach was proposed by Maxam and Gilbert the same year, which was based on the cleavage of DNA, but owing to its complexity, reliance on hazardous chemicals, and challenges with large-scale implementations, this approach ultimately lost its appeal. A decade later, the Sanger method was improved upon with the addition of different colored fluorophores for each of the four bases, allowing a new level of automation, efficiency, and speed, which was initially commercialized by Applied Biosystems.

The second generation, or “next generation”, took advantage of advances in pyrosequencing made by Ronaghi and colleagues, which enabled millions of “short reads” of nucleotides to be sequenced simultaneously. The first commercialization of this approach was made by 454 Life Sciences, later acquired by Roche, in 2006. For this approach, the nucleic material requiring sequencing needed to be in the form of short, adapter-ligated fragments (generated by PCR amplification or other fragmentation methods) that initially produced reads around 100 nucleotides (nts); this would improve to around 400 nts as the technology advanced. IonTorrent and Illumina would soon follow with their own technological offerings and would ultimately outcompete the 454 technology, causing its discontinuation in 2013. Both IonTorrent and Illumina have gone on to extensively improve and expand their platforms and continue to be of significant use today. Indeed, a brief survey of the NCBI Sequence Read Archive (SRA) database showed that as of August 19th, 2025, there are 31.2 million Illumina SRA data sets (19.5 million DNA, 11.7 million RNA), and approximately 510,000 IonTorrent SRA data sets (291,104 DNA, 209,171 RNA). Short read data can be either a) “targeted”, where a particular target gene (or gene fragment), such as the bacterial 16S rRNA gene, is amplified and sequenced, or b) “shotgun”, where total DNA or RNA from a given sample is fragmented and sequenced. These approaches will be discussed in greater detail later.

The third generation of sequencing arrived with the advent of “long-read” sequencing, the development of which received the Nature Methods “2022 Method of the Year” award. Pacific Biosciences (PacBio) was the first to bring the sequencing of much larger fragments to the commercial market in 2011, via Single Molecule, Real-Time (SMRT) sequencing. Soon after, in 2014, Oxford Nanopore Technologies (ONT) introduced Nanopore sequencing, which was based on nucleic acids being fed through membrane nanopores and nucleotides being identified via electric current perturbations. When these long-read technologies initially hit the market, their accuracy could not match that of the short-read options. To overcome the accuracy limitations of early long-read technologies while leveraging their length, a hybrid approach emerged where accurate short reads were aligned to long-read scaffolds to achieve more accurate sequences. More recently reported accuracy is 99.99% with PacBio HiFi sequencing and 99.75% with R10.4.1 Nanopore sequencing.

As sequencing technologies have advanced through different generations, they have not only offered a wider range of capabilities but also generated unprecedented volumes and complexity of data. Navigating this wealth of information, from raw reads to meaningful biological insights, necessitates a sophisticated set of computational algorithms. In the next section, I will explore examples of bioinformatic tools and algorithms essential for assembly, quality assessment, and annotation, among other analyses that allow us to turn a seemingly random collection of A’s, T’s, C’s, and G’s into meaningful results. Please note that throughout this review, I will be focusing on the role of sequencing in the discovery of natural product compounds, not the biological screening or reported utility of the resultant compounds.

Bioinformatic Programs and Algorithms for Biological Insight

Trimming and Assembly

So, here you are, armed with a veritable treasure trove of sequence data. So, now what? Whether you sequenced the genome of a single isolated bacterium or the entire genomic cornucopia of an environmental sample, many of the fundamental steps will remain the same (Figure 1). For each step, I will endeavor to provide examples of commonly used or popular bioinformatic tools (Tables 1–3). However, given that the field is constantly evolving, with numerous tools available and new ones continually emerging, it is important to note that this list is not exhaustive, and many other valuable tools exist beyond the ones listed here. Indeed, if the reader is interested in bioinformatics in the context of complex metagenomic shotgun sequence analysis, I would like to direct them to the Critical Assessment of Metagenome Interpretation (CAMI) challenges, in which a wide variety of tools commonly used in metagenome analysis pipelines are compared and discussed. Additionally, while I do outline a possible sequence of tools used for bioinformatic investigation here (Figure 1), the choice and order of bioinformatic analyses should be dictated by the research question(s) at hand and can vary substantially.

Figure 1. A schematic representation of a possible bioinformatic workflow for using nucleotide sequence data in natural product compound discovery.

Table 1. Bioinformatic Tools for Read Preparation and Assembly

Tool name       Read type

Quality assessment and trimming
Trimmomatic     Short
Cutadapt        Short
FastP           Short
BBDuk           Short
LongQC          Long
NanoPack        Long
LongReadSum     Long

Read assembly
SPAdes          Hybrid
OPERA-MS        Hybrid
MEGAHIT         Short
NanoPhase       Long
Flye            Long
Canu            Long

Assembly polishing
Unicycler       Hybrid
Racon           Hybrid
Pilon           Hybrid
PolyPolish      Hybrid
Medaka          Long (Nanopore only)

Table 3. Bioinformatic Tools for Binning and Taxonomic Classification

Tool            Input data type                    Organism

Taxonomic classifiers
Kraken2         Reads                              All
RAT             Reads                              All
Metabuli        Reads                              All
CAT             Contigs                            All
MMSeqs2         Contigs                            All
Tiara           Contigs                            Eukaryotes
EukRep          Contigs                            Eukaryotes
GTDB-Tk         MAGs                               Bacteria
CAMITAX         MAGs                               Bacteria
MetaShot        MAGs                               Bacteria
EUKulele        MAGs                               Eukaryotes

Binning tools
MetaBat2        Contigs                            Bacteria
MaxBin2         Contigs                            Bacteria
Autometa2       Contigs                            Bacteria
MetaBinner      Contigs                            Bacteria
CONCOCT         Contigs                            Bacteria
AAMB            Contigs                            Bacteria
SemiBin2        Contigs                            Bacteria
COMEBin         Contigs                            Bacteria
McDevol         Contigs                            Bacteria
GenomeFace      Contigs                            Bacteria
EukFinder       Contigs                            Eukaryotes
EukHeist        Contigs                            Eukaryotes

MAG polishing tools
MAGScoT         Contigs and MAGs                   Bacteria
BinSPreader     Assembly graph, contigs, MAGs      Bacteria
uBin            MAGs                               Bacteria and archaea
RefineM         Contigs, MAGs and mapped reads     Bacteria

MAG quality assessment
CheckM2         MAGs                               Bacteria
EukCC           MAGs                               Eukaryotes
BUSCO           MAGs                               Eukaryotes

Gene calling and annotation
Prodigal        MAGs or contigs                    Bacteria
Prokka          MAGs                               Bacteria
Balrog          MAGs                               Bacteria
Bakta           MAGs                               Bacteria
MetaEuk         MAGs or contigs                    Eukaryotes
AUGUSTUS        MAGs or contigs                    Eukaryotes

The first step in sequence analysis is to ensure that you have quality sequence data, and a great example tool for this step, for short read data, is FastQC. This tool will determine if any sequence adapters are present in your raw sequence data (reads) and report the quality of the reads via the Phred score, also known as the quality (Q) score. In short, the higher a Phred score, the better the quality of data you have. The error probability (P) of a given Q-score, and hence the accuracy (1 − P), can be calculated by using the following equation:

$$P = 10^{-Q/10}$$

Generally, a Q-score of 20 is acceptable as this represents an accuracy level of 99%, but others may opt for a minimum score of 30, which reflects 99.9% accuracy. You can choose between a variety of tools (Table 1) to trim adapters and remove low-quality reads from your data set. The next step is to get the quality reads assembled into contiguous sequences, colloquially referred to as “contigs”. There are a wide variety of assemblers available, some designed specifically for short- or long-read data, while others can generate hybrid assemblies from both data types (Table 1). Additionally, other tools can be used to “polish” assemblies. This can be achieved through mapping of reads onto the assembly (most common) or via kmer-frequency-based approaches. Independent studies comparing the performance of these tools are available for deciding which tools are best suited to individual research needs. Finally, tools like QUAST can be used to assess the quality of your assembly.
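To make the relationship between Q-scores and accuracy concrete, here is a minimal Python sketch of the Phred calculation described above. The function names are my own for illustration, and the decoder assumes the common Phred+33 ASCII encoding of FASTQ quality strings; it is not how FastQC or any of the trimming tools are implemented.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Per-base error probability P for a Phred quality score Q (P = 10^(-Q/10))."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Inverse relationship: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_phred33(quality_string: str) -> list[int]:
    """Decode a FASTQ quality string, ASSUMING Phred+33 ASCII encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def mean_error_probability(quality_string: str) -> float:
    """Average the per-base error probabilities (not the Q-scores) of one read."""
    scores = decode_phred33(quality_string)
    return sum(phred_to_error_probability(q) for q in scores) / len(scores)


if __name__ == "__main__":
    for q in (20, 30):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.1f}%")
```

Averaging error probabilities rather than raw Q-scores is a deliberate choice here, because the Q-scale is logarithmic and a simple mean of Q-scores would overstate read quality.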

Genome Mining for Biosynthetic Gene Clusters

In the context of natural product discovery, after obtaining genomic data from a single isolated bacterium or an entire holobiont metagenome, one could opt to move directly to identifying genes involved in potentially bioactive compound production. This process can be broadly categorized into two approaches: a top-down method, where a known compound has been isolated and the goal is to find the responsible genes; or a bottom-up method, which involves surveying the genes, particularly biosynthetic gene clusters (BGCs), to discover potentially novel chemistry or compounds. One of the most popular tools to achieve this BGC detection is antiSMASH. antiSMASH was first introduced in 2011 and is now in its eighth iteration. The algorithm was initially designed for the detection of bacterial BGCs, but the authors have gone on to include alternative offerings for the detection of fungal and plant BGCs. In addition to antiSMASH, several excellent additional tools for BGC detection exist (Table 2).

Table 2. Bioinformatic Tools for Identification and Comparison of BGCs

Tool            Host organism            BGC type(s)

BGC identification
plantiSMASH     Plants                   All
antiSMASH       Bacteria                 All
DeepBGC         Bacteria                 All
PRISM           Bacteria                 All
EvoMining       Bacteria                 All
GECCO           Bacteria                 All
BGCProphet      Bacteria and Archaea     All
SMURF           Fungi                    All
TOUCAN          Fungi                    All
fungiSMASH      Fungi                    All
RiPPMiner       Bacteria and Fungi       Ribosomally Synthesized and Post-translationally Modified Peptides (RiPPs)
NaPDoS2         Bacteria and Fungi       Polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs)
BAGEL4          Bacteria                 RiPPs and bacteriocins
RODEO2          Bacteria                 RiPPs

BGC comparison
BiG-SCAPE       N/A                      All
BiG-SLICE       N/A                      All
CAGECAT         N/A                      All
BGCFlow         N/A                      All
SocialGene      N/A                      All

Along with these detection tools, robust and reliable databases have been developed to catalogue the growing number of characterized and experimentally validated BGCs, as well as tools to compare newly discovered BGCs with both each other and known reference BGCs. The mainstay of BGC databases has been the Minimum Information about a Biosynthetic Gene cluster (MIBiG) database, with new and expansive databases, such as the Secondary Metabolism Collaboratory (SMC), BiG-FAM, The Natural Product Atlas, and BGC Atlas, as alternative comprehensive and useful resources. Comparative tools for BGCs (Table 2) have been developed to work alongside these databases, enabling the rapid, large-scale comparison of hundreds to thousands of BGCs. This capability allows researchers to efficiently determine the novelty of their BGCs and predict whether they might produce compounds similar to potent, known bioactive molecules. Finally, while not an analysis tool per se, I would be remiss not to mention clinker, a personal favorite tool for rapid generation of BGC comparison figures that are publication-worthy straight out of the box; it is now available both online and for local command-line terminal usage.
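As a rough intuition for how large-scale BGC comparison can work, the toy Python sketch below scores pairs of BGCs by the Jaccard index of their protein-domain sets and groups them by single linkage. This is an illustration under my own assumptions, not the scoring scheme of any tool in Table 2: the BGC names, domain labels, and the 0.5 threshold are all hypothetical, and real tools use considerably richer metrics.

```python
from itertools import combinations


def jaccard_index(domains_a: set[str], domains_b: set[str]) -> float:
    """Shared domains divided by all domains seen in either BGC."""
    if not domains_a and not domains_b:
        return 0.0
    return len(domains_a & domains_b) / len(domains_a | domains_b)


def group_into_families(bgcs: dict[str, set[str]], threshold: float = 0.5) -> list[set[str]]:
    """Single-linkage grouping: BGCs joined by any pair scoring >= threshold."""
    parent = {name: name for name in bgcs}

    def find(name: str) -> str:
        # Follow parent links to the representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(bgcs, 2):
        if jaccard_index(bgcs[a], bgcs[b]) >= threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in bgcs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Hypothetical domain content for three made-up BGCs.
example_bgcs = {
    "bgc_A": {"PKS_KS", "PKS_AT", "ACP", "KR"},
    "bgc_B": {"PKS_KS", "PKS_AT", "ACP", "DH"},
    "bgc_C": {"Condensation", "AMP-binding", "PCP"},
}

if __name__ == "__main__":
    for family in group_into_families(example_bgcs):
        print(sorted(family))
```

In this toy example the two PKS-like clusters share three of five domains and fall into one family, while the NRPS-like cluster stands alone.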

Binning and Taxonomic Classification

When BGCs are sourced from a metagenomic sample, one may want to discover which organisms host them and potentially produce valuable compounds. To do this, you can explore binning contigs into metagenome-assembled genomes (MAGs) or taxonomically classifying the specific contigs or MAGs that contain the BGCs you are interested in. The first option is to taxonomically classify the reads or assembled contigs from a metagenomic sample using a variety of tools (Table 3) and helper tools like Taxometer. The second option is to bin the metagenome into MAGs, followed by their taxonomic classification and further characterization. A wide variety of bioinformatic tools for these processes are currently available (Table 3). Additionally, MAGs can be refined using polishing tools, their quality assessed, and taxonomically classified (Table 3). If there is interest in examining surrounding genes for clues into the ecological role or triggers for BGC expression and compound production, a researcher could use tools for gene calling and annotation (Table 3), and other tools like ARTS or FunARTS for the identification of self-resistance mechanisms, or perhaps autoMLST or PhyloPhlAn for MAG phylogeny.

The development of all these tools has been a critical key to bringing sequencing into the world of natural product discovery, where we have now been able to assess the biosynthetic potential of the unculturable (aka mining microbial “dark matter”), activate silent or cryptic gene clusters, and uncover true biosynthetic origins for improved production of compounds.

RNA-Seq in Natural Product Regulation and Expression

Along with the detection and annotation of the nucleotide sequence of BGCs, it is often advantageous to understand the regulation and expression patterns of BGCs through analysis of transcriptomic data, otherwise known as (bulk) RNA-Seq data. The transcriptomic data can be used on its own or mapped back to a (meta)genome of interest. The tools and approaches to investigate this will vary extensively depending on the research question(s) to be answered. By and large, the data can be quality controlled and otherwise preprocessed using many of the tools listed for the DNA, but additional, more specialized tools include RNA-SeQC, RNA-QC-Chain, and FastqPuri. If a researcher would like to map their transcripts back to a reference (meta)genome, options for alignment tools include Bowtie, BBMap, and STAR. Genes can be predicted using tools like Prodigal (bacterial), FINDER (eukaryotes), Augustus (eukaryotes), and several others. The RNA transcripts mapped to these genes can be counted using tools like FeatureCounts or HTSeq. If the experimental approach included different conditions, treatments, or time points, one could use tools like DESeq2, edgeR, or many others to assess the differential expression of genes across samples. Additionally, new tools like SeMa-Trap are making their way into the field and aim to facilitate the exploration of RNA-Seq data for optimal culture conditions for the production of bioactive molecules from cultured microbes.
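For readers new to expression analysis, the following Python sketch illustrates the basic idea behind comparing counts between two conditions: normalize raw counts for sequencing depth (here as counts per million) and then take a log2 ratio. It is a deliberately naive illustration with made-up gene names and counts, and assumes a single sample per condition; dedicated tools such as DESeq2 and edgeR instead fit statistical models across replicates, which this sketch does not attempt.

```python
import math


def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw per-gene counts by library size so samples are comparable."""
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


def log2_fold_change(
    control: dict[str, int],
    treated: dict[str, int],
    pseudocount: float = 1.0,
) -> dict[str, float]:
    """log2(treated / control) on CPM values; the pseudocount avoids division by zero."""
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2((cpm_treated[gene] + pseudocount) / (cpm_control[gene] + pseudocount))
        for gene in control
    }


# Hypothetical raw counts for two genes under two culture conditions.
control_counts = {"bgc_core_gene": 100, "housekeeping_gene": 900}
treated_counts = {"bgc_core_gene": 400, "housekeeping_gene": 600}

if __name__ == "__main__":
    for gene, lfc in log2_fold_change(control_counts, treated_counts).items():
        print(f"{gene}: log2 fold change = {lfc:.2f}")
```

A positive value for a BGC gene under one condition would be a hint, not proof, that the condition favors expression of that cluster; replicates and proper statistics are needed before drawing conclusions.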

Case Studies of Sequencing in Natural Product Discovery

In this section, I will cover a variety of both seminal and more unusual examples of where nucleotide sequencing has been used to uncover novel BGCs, bioactive compounds, or reveal the true producers of previously isolated natural product compounds. First, we will explore how genomic data from cultured microbes have been used to facilitate the discovery of natural products. Then, we will examine how this same goal is accomplished using more complex metagenomic data to investigate microbes that are not willing or able to grow under laboratory conditions. Finally, we explore a few examples of how transcriptomic data can be used to access novel chemistries.

Using Genomic Data to Guide Microbial Culture Conditions

An oft-touted axiom is that the vast majority of microbes are recalcitrant to traditional culture methods in the lab. However, sophisticated and imaginative solutions to access the microbial dark matter, such as the iChip, the SlipChip, the use of source nutrients, coculture, and other approaches, are gradually knocking that barrier down. An interesting study was performed in 2024, wherein researchers leveraged genome sequence data from cultivable and uncultivable MAGs from the order Burkholderiales and found, through comparison of genetic features, that genes involved in vitamin B12 biosynthesis and oxidative stress were enriched in the cultivable group. Armed with this, the researchers tested this observation by adding vitamin B12 to their standard culture media and found that new Burkholderiales isolates grew. This genomic-guided isolation should come as exciting news for microbiologists and natural product researchers alike, as comparison of recalcitrant genomes to their readily cultured cousins may ease the way to exploring intractable microbes and the bioactive treasures they may hold.

Although not specifically focused on bioactive compounds, a comparative transcriptomic study on toxin production in cultured Prorocentrum lima dinoflagellates demonstrated that phosphate limitation correlated with increased expression of hypothesized toxin BGCs. This transcriptional finding was confirmed by concurrent toxin content measurements and indicates that comparative transcriptomics can be readily used to optimize the culture conditions to produce bioactive molecules.

Using Genomics Data for Targeted BGC Silencing to Uncover Minor Metabolites

An intriguing case of difficult bacteria and sequencing coming together is that of clovibactin. Here, a rare Eleftheria bacterium, the same genus that produces teixobactin, was isolated in vitro for the first time following prolonged incubation. Initial inspections of the extract fraction from E. terrae ssp. carolina yielded a known compound, kalimantacin, and a less abundant, unidentified compound. To increase the yield of the less abundant molecule, researchers sequenced the genome of the bacterium via the generation of long-read PacBio sequencing and subsequent assembly with the Hierarchical Genome Assembly Process (HGAP4) assembler, part of the SMRT Analysis software suite. They identified the kalimantacin-like BGC using antiSMASH in conjunction with the MIBiG database and used this information to silence the BGC through interruption of the first gene in the operon. Silencing of the BGC encoding the abundant, known compound led to the discovery of clovibactin, a distant structural relative of teixobactin that is a more potent antibacterial agent with a different mechanism of action.

Using BGC Homology to Identify, Target, and Isolate Novel Natural Product Compounds

A striking example of the innovation that is possible with sequencing in natural product discovery is that of WDB002 from Streptomyces malaysiensis. The researchers, Verdine and colleagues, had observed that the two structurally related FDA-approved immunosuppressive molecules, rapamycin and FK506, had “constant” and “variable” regions in their structure (Figure 2), not unlike the constant and variable regions of an antibody. The constant region of these molecules would bind peptidyl prolyl isomerase FK506-binding protein (FKBP12), while the variable region would determine the target selectivity. Acting together, FKBP12 and the molecule can then allosterically bind the target, stimulating a conformational change which results in inhibition. They reasoned that, given the BGCs encoding these two molecules were similar, they could mine other actinomycete genomes for related BGCs of modular compounds (i.e., compounds with constant and variable regions). After screening the sequenced genomes of approximately 135,000 actinomycete isolates, the authors found the related “X1” BGC in Streptomyces malaysiensis DSM41697. Through cloning of the X1 BGC, a series of analogs were produced that, like rapamycin and FK506, formed a complex with FKBP12. Among the analogs was WDB002, which targeted the human centrosomal protein CEP250 via a coiled-coil domain previously deemed to be “undruggable”. The WDB002 molecule and synthetic analogs have since been licensed to Ginkgo Bioworks for finding the properties of WDB002 that enabled the drugging of the “undruggable” coiled-coil and subsequently for development to treat infectious diseases.

Figure 2. “Modular” compounds with “constant” and “variable” regions are products of related BGCs. Constant regions enable consistent binding to “presenter” proteins (e.g., FKBP12) while “variable” regions dictate target selectivity. The BGC comparison was generated with clinker. This figure was adapted, with permission from the authors, from Shigdel et al., 2020, published under a Creative Commons Attribution License 4.0 (CC BY). Copyright 2020 the Author(s). Published by the National Academy of Sciences of the United States of America (PNAS).

Heterologous Expression of BGCs

Identifying a BGC through sequencing does not always guarantee production in the native host, often due to challenges with standard culture conditions or uncharacterized regulatory pathways. Consequently, researchers may leverage their knowledge of the BGC nucleotide sequence to pursue heterologous expression in a more amenable host organism. Recent examples of this include the discovery of atranorin and novel bilothiazoles from complex metagenomes.

Atranorin is a depside produced in a wide variety of lichenized fungi with an equally diverse array of reported bioactivities. A survey of PKS BGCs detected by antiSMASH in Cladonia lichen and related species, clustered using BiG-SCAPE, revealed the presence of a BGC gene cluster family (GCF) exclusively conserved in lichenized fungal species that produce atranorin. However, at this point in 2021, no one had yet definitively linked a BGC to any known lichen-produced compound. Knowing that prior expression attempts in established fungal heterologous hosts had been unsuccessful, the authors, Kim and colleagues, introduced the putative atranorin BGC genes (atr1–atr4) into an unconventional host: the plant pathogen Ascochyta rabiei, leading to the successful synthesis of atranorin. The same year, Kealey and colleagues used the sequence data of a BGC from a Pseudevernia furfuracea lichen to isolate lecanoric acid, heterologously expressed in an engineered Saccharomyces cerevisiae.

The story of the bilothiazoles is particularly intriguing, as the BGC was identified from a metagenomic data set of a human microbiome sample, showcasing the potential for natural product synthesis in our own gut microbiota. The authors identified thiazol(in)e structural motifs from genomes of gut-associated bacteria as a promising lead for bioactivity. Therefore, they opted to focus on NRPS BGCs from two human gut shotgun metagenomic data sets, identified using antiSMASH, that included modules predicted to be involved in heterocyclization. Next, they clustered their prioritized BGCs into gene cluster families (GCFs) with BiG-SCAPE, which ultimately led them to focus on the GCF including the novel bilothiazoles (bil) BGC. The BGC is predicted to originate from an uncultured Bilophila sp. strain, relatives of which are known pathobionts of the human gut, linked to the incidence of disease and cancer. In this case, the BGC was not cloned from the source organism. Instead, it was produced via de novo synthesis, reorganized and codon-optimized, and cloned into an E. coli host. Subsequent induced expression of the BGC resulted in the production of a suite of novel bilothiazoles, which exhibited weak to nonexistent anticancer and antibacterial activity, but did exhibit notable indicators for inhibition of DNA-repair and transpeptidases.

Using Metagenomics to Reveal the Evolutionary History of Bacterial Symbionts

For many years, the origin of the potent cytotoxic cyclic peptides known as patellamides, isolated from the sea squirt (ascidian) Lissoclinum patella, was uncertain. In 2005, the Schmidt group provided a definitive answer via metagenomics, showing that the true producer was the symbiotic cyanobacterium Prochloron didemni. Using the tblastn algorithm (using an amino acid sequence query against a nucleotide reference database), the authors concluded that the putative BGC for patellamide comprised seven genes (patA–patG). The authors then generated fosmid libraries and used the BGC sequence information to determine which of their clones contained the complete predicted pathway. Heterologous expression of the fosmids resulted in the production of both patellamides A and C, confirming unequivocally that the identified BGC in P. didemni was responsible for patellamide biosynthesis. An additional suite of compounds, the patellazoles, had similarly been isolated from the L. patella tunicate. However, there was no evidence of their biosynthesis in the symbiont P. didemni genome. The patellazoles were subsequently hypothesized to be produced by other members of the complex L. patella microbiome. The Schmidt group confirmed this hypothesis by constructing a complete genome of a symbiotic α-proteobacterium, Candidatus Endolissoclinum faulkneri, from assembled Illumina shotgun metagenomic sequence data. Within this MAG, a single hybrid NRPS-trans-AT PKS BGC (ptz) was detected that matched the predicted biosynthesis of patellazoles. A remarkable finding was that, unlike the sympatric P. didemni symbiont, which included all genes necessary for a free-living lifestyle, the Ca. E. faulkneri symbiont had undergone significant genome reduction. The maintenance of the ptz BGC despite the genome streamlining was concluded as indicative of its importance to the symbiotic interaction. Finally, in the absence of fossil records, the genomic sequence data extracted from the metagenomic assembly were used to estimate the evolutionary divergence of the symbiont and showed that the Ca. E. faulkneri symbiont had likely been associated with the Lissoclinum tunicate for 6–30 million years.
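To illustrate what it means to search an amino acid query against a nucleotide database, as tblastn does, the Python sketch below translates a DNA sequence in all six reading frames and looks for a peptide in each. This is only a conceptual sketch under simplifying assumptions: it uses the standard genetic code and exact substring matching, whereas tblastn itself performs scored local alignments; the function names and example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate three frames on the forward strand and three on the reverse."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(dna: str, peptide: str) -> list[str]:
    """Names of the frames whose translation contains the exact peptide."""
    return [name for name, protein in six_frame_translations(dna).items() if peptide in protein]


if __name__ == "__main__":
    # "GGCCAT" is the reverse complement of "ATGGCC", which encodes Met-Ala.
    print(find_peptide("GGCCAT", "MA"))
```

Searching all six frames is what lets a protein query find its gene regardless of strand or reading frame, which is useful when the nucleotide data are unannotated metagenomic contigs.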

Using Metagenomics to Find the True Producers of Pederin-like Compounds

The pederin-like family of compounds has long fascinated me from both biosynthetic and evolutionary perspectives. While largely unified in their potent biological activity, this family of compounds has been isolated from a remarkably diverse array of organisms, and sequencing efforts have gone on to show that the BGCs responsible for their production stem from an equally wide array of bacterial taxa (Figure 3).

Figure 3. Comparison of the family of pederin-like compounds, their associated BGCs, and the organisms from which they were isolated. Homologous genes are indicated by color, and the level of homology is indicated with a color scale. Comparative BGC visualization was generated using clinker.

Pederin was initially isolated from the Paederus fuscipes beetle, and its structure was determined in the early 1900s, but the BGC (ped) would not be described until 2002. The ped cluster was elucidated by the Piel group through the construction of a cosmid library and screening of the library with PCR primers to identify clones with PKS regions. Sequencing and subsequent assembly of the cloned regions resulted in the recovery of a ∼110 kb region including genes pedA–pedH. Analysis of genes beyond the ped BGC showed that they were most homologous with genes from Pseudomonas aeruginosa, providing the first evidence that the pederin producer was likely a Pseudomonas bacterium. A subsequent study, also using cosmid clones, would reveal additional ped genes elsewhere on the symbiont’s genome.

Following this, in 2004, the BGC encoding onnamides, theopederins, and pseudoonnamides was recovered from the yellow chemotype of the marine sponge Theonella swinhoei using a similar cosmid library approach. The producing bacterium was elusive until 2014, when single-cell sorting, genomic amplification, and subsequent sequencing revealed that Candidatus Entotheonella factor TSY1 was the true producing organism. In 2009, the BGC of psymberin (also known as irciniastatin A) was recovered from an unknown bacterium harbored in the marine sponge Psammocinia aff. bulbosa, through a combination of targeted sequencing of PKS genes using the specificity of the ketosynthase domains as a guide and fosmid libraries. In 2023, guided by 16S phylogeny analysis, Candidatus Entotheonella consociata was proposed as the psymberin-producing bacterium, but a full genome has not yet been sequenced to validate this hypothesis.

In 2013, the BGC of another pederin-like compound, nosperin, was elucidated. This time, the compound would be recovered from a terrestrial Peltigera membranacea lichen, and the producing organism found to be its resident Nostoc sp. photobiont. The researchers, Andrésson and colleagues, achieved this by extracting the total genomic DNA from the lichen thallus, sequencing the DNA via both Roche 454 and Illumina sequencing before assembling with an older assembler, MIRA. PKS genes were searched for using tblastn, and 18 candidate BGCs were recovered, one of which bore striking similarity to the known pederin BGC. Unfortunately, attempts to recover a pederin-like compound from the lichen tissue were unsuccessful. However, spurred on by their sequencing evidence, the researchers were not only able to show that the BGC was expressed in situ but also were able to culture the Nostoc symbiont, confirm the presence of the pederin-like BGC via PCR, and then recover the novel nosperin compound from a 25 L culture.

The same year, the BGC for another pederin-like compound, diaphorin, was uncovered. The Asian citrus psyllid, Diaphorina citri, was known to harbor two distinct intracellular symbionts: Carsonella_DC (a predicted nutritional symbiont), and a second, more enigmatic betaproteobacterial symbiont, which would be later named Candidatus Profftella armatura. In an effort to better understand the latter, Fukatsu and colleagues extracted DNA from bacteriomes dissected out from a female D. citri specimen. Sanger sequencing of the shotgun plasmid library and subsequent assembly with an older assembler, Phred-Phrap, allowed the researchers to recover complete genomes for both intracellular symbionts. Much like the Ca. E. faulkneri symbiont from L. patella (patellazole producer), the Ca. P. armatura symbiont genome was drastically reduced but devoted 15% of its 0.46 Mbp genome to two disjointed loci of a PKS BGC (dipA–dipT). Given the striking similarity between their recovered BGC and that of pederin, Fukatsu and colleagues collected and extracted over 1000 D. citri psyllid specimens to isolate and structurally elucidate diaphorin. Similar to the case of nosperin, without sequence data to tip them off, diaphorin and its potential defensive role in the symbiosis may never have been discovered!

In 2018, Hrouzek and colleagues successfully cultured Cuspidothrix issatschenkoi, a cyanobacterium from a freshwater bloom. An antiSMASH analysis of a draft genome from this bacterium, which was binned from shotgun metagenomic sequencing data from the same bloom, indicated the presence of a nosperin-like BGC. The researchers subsequently isolated a novel pederin-like compound, cusperin, from the C. issatschenkoi culture and confirmed the presence of the cusperin BGC from whole genome sequencing. Furthermore, by analyzing the cusperin BGC modules, the researchers inferred the absolute configuration of several chiral carbons within the cusperin molecule. Thus, the sequencing data not only revealed that a new compound was to be found but also proved to be useful in its structural elucidation.

In 2020, Piel and colleagues generated shotgun metagenomic data from the marine sponge Mycale hentscheli, from which mycalamide, a pederin family compound, had been previously isolated. Illumina sequence data were assembled with both MEGAHIT and metaSPAdes, and the resultant contigs were binned into MAGs with MetaBAT2. Following quality assessment and taxonomic classification of the MAGs with CheckM and GTDB-Tk, respectively, the MAGs were searched for BGCs using antiSMASH. They discovered a BGC predicted to encode mycalamide within a MAG classified within the marine gammaproteobacterium group UBA10353, later designated as Candidatus Entomycale ignis.

Finally, a free-living and culturable marine bacterium, Labrenzia sp. PHM005, was isolated from marine sediments. Following long-read sequencing and subsequent assembly into a single contig, the genome was assessed with antiSMASH and found to include a pederin-like BGC. Subsequent genetic manipulation of the BGC allowed the researchers to prove unequivocally that the BGC was responsible for the production of the novel pederin-family compound, labrenzin, and further provided insights into the functions of genes common to pederin-family BGCs.

Using Transcriptomic Data for Drug Discovery

Transcriptomic analysis of Aspergillus flavus fungal strains revealed that the global regulatory gene veA influenced the transcription of at least half (28 out of 56) of all predicted BGCs in the strains. The veA gene, which is involved in regulating a variety of processes in fungi, particularly Aspergillus species, encodes a protein that forms the core of the velvet complex, a multiprotein assembly crucial for development and secondary metabolism, primarily in response to light. Using these transcriptional data, Cary and colleagues found that a novel BGC was expressed most intensely in the fungal sclerotia, and subsequent isolation efforts resulted in the identification of a known compound, aflavarin. The researchers additionally showed that the presence of aflavarin increased the production of sclerotia structures and likely served an additional defensive anti-insectan role.

Another illustrative example of this comes in the form of a study of four cultured Salinispora bacterial strains, wherein comparative transcriptomics were used to uncover regulatory mechanisms silencing orphan gene clusters. An unexpected result of this study was that most BGCs (75%) present in the cultured bacteria were expressed in both exponential and stationary phases of growth, an observation largely at odds with the long-held belief that bioactive compounds are produced largely in the stationary phase and that most BGCs are not expressed under standard culture conditions. More recently, the finding that secondary metabolites can be produced en masse during exponential growth was validated in the myxobacterium Sorangium sp. So ce836. In this work, researchers found, through comparative transcriptomics, that 80% of the detected BGCs were expressed during exponential growth and could be linked to increased bioactive compound production.
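The kind of tally reported above (the share of BGCs expressed in both growth phases) can be sketched in a few lines. This is a minimal illustration only, not the method used in either study: the expression values, the BGC identifiers, and the 1.0 TPM cutoff for calling a cluster "expressed" are all hypothetical assumptions.

```python
from typing import Dict, List

# Hypothetical table: BGC identifier -> expression (TPM) per growth phase.
# All values are invented for illustration.
toy_tpm: Dict[str, Dict[str, float]] = {
    "bgc01": {"exponential": 12.4, "stationary": 30.1},
    "bgc02": {"exponential": 0.2, "stationary": 8.7},
    "bgc03": {"exponential": 5.0, "stationary": 4.1},
    "bgc04": {"exponential": 2.2, "stationary": 0.0},
}


def expressed_in_both(tpm: Dict[str, Dict[str, float]],
                      threshold: float = 1.0) -> List[str]:
    """Return BGCs whose expression meets the (assumed) cutoff in both phases."""
    return sorted(
        bgc
        for bgc, phases in tpm.items()
        if phases["exponential"] >= threshold
        and phases["stationary"] >= threshold
    )


def fraction_expressed_in_both(tpm: Dict[str, Dict[str, float]],
                               threshold: float = 1.0) -> float:
    """Fraction of all BGCs in the table that are expressed in both phases."""
    if not tpm:
        return 0.0
    return len(expressed_in_both(tpm, threshold)) / len(tpm)


print(expressed_in_both(toy_tpm))
print(fraction_expressed_in_both(toy_tpm))
```

In practice the cutoff matters: a per-BGC value would usually be summarized from the core biosynthetic genes, and the choice of threshold can shift the reported percentage noticeably.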

The application of transcriptomics for the discovery of plant natural products has helped researchers narrow down which tissues are best targeted for maximum compound recovery. One such example is the study of the tissues of Panax quinquefolius (ginseng), which led researchers to the discovery that the greatest concentration of characteristic flavonoids occurred in aerial plant components, such as leaves and flowers. Conversely, a recent study on the close relative Panax japonicus revealed that genes predicted to be involved in triterpene saponin biosynthesis were most transcribed in root tissues.

Another example is a study of Ferula assafoetida, a plant with ties to the mysterious Ferula species, the resin of which was the core of some of the earliest medicines for cancer prized by physicians of the ancient world. Here, by contrasting transcriptional data from the various plant tissues, the researchers concluded that the flowers likely had the greatest concentrations of potentially bioactive terpenoids and coumarins. Subsequently, a second Ferula species, F. feruloides, was tracked via transcriptomics to determine the optimal time for harvest for the recovery of bioactive compounds, informing future choices for efficient recovery of the compounds of interest.

Sequencing for Taxonomic Resolution, Dereplication, and Chemotaxonomy

A critical but often overlooked role of sequencing in natural product discovery is the accurate taxonomic classification of organisms and the dereplication advantage associated with it. It is a well-known adage that in the majority of cases, the chances of rediscovery of compounds increase when exploring organisms that are closely related. While not without exceptions, there is often a correlation between the relatedness of secondary metabolites and host phylogeny as broad as kingdom level, as well as within lower taxonomic ranks, with examples in myxobacteria, Antarctic slugs, plant-associated fungal endosymbionts, bacteria, marine sponges and corals, and various plant species.

In addition to reducing the odds of compound rediscovery in closely related organisms, sequence-based phylogeny can also be used directly for bioprospecting natural products. For example, in one report, the researchers assessed the distribution of polyyne BGCs across Pseudomonas species and found that a previously unreported, conserved BGC produced a novel polyyne that they named protegencin. The same compound was independently recovered from a different Pseudomonas by Murata and colleagues the same year, and it was later found that this compound could act as an algicide.

In a different study, researchers combined sequence-based phylogeny data with bioactivity for over 16,500 plant-derived compounds from more than 7,500 different plant species from the island of Java to ultimately propose “hot” and “cold” clades that would be more likely to yield antibiotic compounds or compounds with novel structures. The “hot” clades include native Javan plants within genera Blumea, Carpesium, Pluchea, Pterocaulon, and Sphaeranthus, where over half the included species produced compounds with antiinfective activities. In a similar phylogenetic investigation, the rbcL plastid marker was amplified from 2,621 plant species from South Africa, New Zealand, and Nepal, and used to construct phylogenetic trees. The researchers then appended additional metadata about which specimens were used in traditional medicine practices and found that plants utilized in traditional medicine systems from these three distinct countries converged in specific phylogenetic “hot nodes”. The authors suggest that because related plants often share similar chemistry, a “hot node” (a lineage with significant medicinal use in one country) can predict the medicinal applications of related species elsewhere. Consequently, they argue that these genera are prime candidates for bioprospecting.
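The "over half the included species" criterion for a hot clade can be illustrated with a simple tally. This is a deliberately reduced sketch that groups species by genus rather than by position in a phylogenetic tree, so it is a stand-in for, not a reproduction of, the published analyses. The genus labels, species labels, activity flags, and the `min_species` cutoff are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical records: (genus, species, has_antiinfective_compound).
# Each species is assumed to appear exactly once.
toy_records = [
    ("GenusA", "sp1", True),
    ("GenusA", "sp2", True),
    ("GenusA", "sp3", False),
    ("GenusA", "sp4", True),
    ("GenusB", "sp1", False),
    ("GenusB", "sp2", True),
    ("GenusB", "sp3", False),
    ("GenusB", "sp4", False),
    ("GenusC", "sp1", True),
    ("GenusC", "sp2", True),
]


def hot_genera(records: Iterable[Tuple[str, str, bool]],
               min_species: int = 3,
               min_fraction: float = 0.5) -> Dict[str, float]:
    """Return genera where more than min_fraction of species are active.

    Genera with fewer than min_species sampled species are skipped, because
    a high fraction from one or two species is weak evidence.
    """
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for genus, _species, active in records:
        totals[genus] += 1
        if active:
            hits[genus] += 1
    return {
        genus: hits[genus] / totals[genus]
        for genus in totals
        if totals[genus] >= min_species
        and hits[genus] / totals[genus] > min_fraction
    }


print(hot_genera(toy_records))
```

A tree-based analysis would instead test whether active species cluster at particular internal nodes more than expected by chance, which a flat per-genus fraction cannot capture.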

Natural Product Discovery through BGC Phylogeny

While not dissimilar to pure taxonomic phylogeny-based approaches, assessment of BGC phylogeny has yielded some exciting compounds from a variety of sources. The approach centers around finding clusters of related BGCs, exploring their evolutionary history or relatedness, and using this information to guide or otherwise facilitate production of the encoded molecules (Figure 4). Some of the most popular tools for exploring BGC similarity are BiG-SCAPE and its sister algorithm for massive data sets, BiG-SLiCE.

Figure 4. A simplified schematic of the conceptual frameworks guiding the use of biosynthetic gene cluster (BGC) phylogeny for novel compound discovery in the selected case studies.

In 2020, researchers used this approach to isolate a novel glycopeptide antibiotic, named corbomycin, from a Streptomyces strain. In this study, the researchers reasoned that by searching for BGCs with no known resistance genes they could uncover molecules with novel mechanisms of action. They built a phylogenetic tree of glycopeptide antibiotic BGCs they had previously identified and discovered a clade of BGCs that had no known self-resistance genes but did include a BGC for the known compound, complestatin. Corbomycin and complestatin were isolated from Streptomyces strains carrying BGCs from this clade and were subsequently found to have a shared, novel mechanism of action, whereby they inhibit autolysins, essential peptidoglycan hydrolases, and subsequently inhibit peptidoglycan biosynthesis.
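The prioritization logic described here (keep the BGCs that carry no known self-resistance gene) amounts to a set-membership filter. The sketch below is an assumption-laden illustration: the cluster names and gene contents are invented, and the vanH/vanA/vanX cassette is used only as one well-known example of glycopeptide resistance genes, since the text does not list which genes the study screened for.

```python
from typing import Dict, FrozenSet, List, Set

# Example resistance genes (assumption): the vanHAX cassette is a widely known
# glycopeptide resistance determinant; the study's actual list is not given here.
KNOWN_RESISTANCE: FrozenSet[str] = frozenset({"vanH", "vanA", "vanX"})

# Hypothetical BGCs and their annotated gene names.
toy_bgcs: Dict[str, Set[str]] = {
    "cluster_1": {"nrps1", "vanH", "vanA", "vanX"},
    "cluster_2": {"nrps1", "p450_a"},
    "cluster_3": {"nrps2", "vanA"},
}


def lacks_known_resistance(genes: Set[str],
                           known: FrozenSet[str] = KNOWN_RESISTANCE) -> bool:
    """True if the BGC shares no gene with the known resistance set."""
    return known.isdisjoint(genes)


def prioritize(bgcs: Dict[str, Set[str]]) -> List[str]:
    """Return the names of BGCs with no known resistance gene, sorted."""
    return sorted(
        name for name, genes in bgcs.items() if lacks_known_resistance(genes)
    )


print(prioritize(toy_bgcs))
```

Matching on gene names is the weakest link in a filter like this; a real screen would more plausibly rely on sequence or profile similarity, so that unannotated or renamed resistance genes are not missed.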

More recently, the BGC-phylogeny guided approach has similarly resulted in the isolation of the novel polyene macrolide antifungal compound, mandimycin. The authors of this work undertook the gargantuan task of recovering almost 2 million BGCs from ∼316,000 sequenced bacterial genomes and then parsed them all in search of putative mycosamine-transferring glycosyltransferases, which narrowed the list down to a little under 300 BGCs of interest. Phylogeny of these glycosyltransferase genes revealed a clade of genes distinct from those from the known polyene macrolide antibiotic BGCs. A BGC originating from Streptomyces netropsis DSM 40259 within this clade was selected, ultimately resulting in the isolation of the novel, bioactive mandimycin compound. This case study highlights how sequencing data can facilitate and streamline the discovery of novel molecules while simultaneously mitigating the chances of rediscovery.
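The final selection step, choosing candidates whose marker gene sits apart from those of known BGCs, can be approximated without a tree by asking how far each candidate is from its nearest known relative. The sketch below swaps in a plain pairwise p-distance on pre-aligned sequences for the phylogeny used in the study; the toy sequences, names, and the 0.3 cutoff are hypothetical.

```python
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Proportion of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distant_candidates(candidates: Dict[str, str],
                       known: Dict[str, str],
                       min_distance: float = 0.3) -> Dict[str, float]:
    """Return candidates whose nearest known sequence is at least min_distance away."""
    selected: Dict[str, float] = {}
    for name, seq in candidates.items():
        nearest = min(p_distance(seq, ref) for ref in known.values())
        if nearest >= min_distance:
            selected[name] = nearest
    return selected


# Toy, pre-aligned protein fragments (invented for illustration).
toy_known = {"known_1": "MKTAYIAKQR", "known_2": "MKTAYLAKQR"}
toy_candidates = {"cand_a": "MKTAYIAKQK", "cand_b": "MRSGWLPEQR"}

print(distant_candidates(toy_candidates, toy_known))
```

A nearest-neighbor distance flags outliers but says nothing about whether several outliers form a coherent clade, which is the extra information a phylogeny provides.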

Finally, a sizable effort from Robey and colleagues to recover and cluster 36,399 BGCs from 1,037 fungal genomes did not result in the identification of any novel compounds, but did enable the prediction of metabolite scaffolds based on their clustering with known, curated BGCs from the MIBiG database. Additionally, they noted which BGCs tended to be found across different taxonomic lineages, which would aid in limiting rediscovery. They also went on to show that, contrasting the fungal BGCs they recovered against a set of 24,024 bacterial BGCs, fungi and bacteria occupy different biosynthetic spaces, a trend that was mirrored when they performed a similar clustering of known fungal and bacterial secondary metabolites. The authors also noted that when assessing the chemical space in which fungal and bacterial compounds are found, the fungal metabolites tend to cluster more closely to FDA-approved compounds and proposed that fungal metabolites may be “more druglike” than their bacterial counterparts. Together, this illustrates that novel compounds have yet to be discovered in fungi as there is enormous orphaned biosynthetic potential.

BGCs and Natural Products from Unexpected Sources

While the discovery of novel natural products and their associated BGCs is largely considered within the microbial world and plants, more recent discoveries have shown that BGCs can be found across all kingdoms of life. For example, a study of 208 algal genomes used antiSMASH to uncover 2,476 candidate BGCs. The authors utilized domain organization homology to compare the recovered BGCs to the reference BGCs. They found that algal BGCs had distinct characteristics, with NRPSs favoring LCL condensation domains (unlike those favored by bacteria, which are typically DCL/Dual domains), and PKSs were structured iteratively, similar to those in fungi.

Even animals prove to be unexpected sources. The nematode Caenorhabditis elegans produces nemamides, which are hybrid polyketide-nonribosomal peptides encoded by a BGC within the worm’s own genome, not a symbiont. In 2003, researchers used transcriptomic sequencing data from purple sea urchin (Strongylocentrotus purpuratus) embryos to discover that an endogenous polyketide synthase (PKS) BGC is responsible for the production of its characteristic pigment, echinochrome. More recently, scientists sequenced five octocoral genomes, identifying a BGC that is conserved across diverse taxonomic families and is essential for producing a cembrene-B-γ-lactone, a core structure central to the biosynthesis of briarane diterpenoids. Perhaps most surprisingly, a study of common budgerigars (budgies) demonstrated that a single amino acid substitution in a polyketide synthase was responsible for the phenotypic switch from yellow to blue feathers. Researchers confirmed this by expressing the PKS in yeast, which resulted in the accumulation of a yellow pigment.

These discoveries illustrate the enormous biosynthetic potential that lies beyond our conventional comfort zones. Through sequencing and subsequent identification and heterologous expression of BGCs from diverse and complex organisms, we can begin to uncover an even wider array of chemical scaffolds than previously imagined.

Peering into the Future: Technologies and Algorithms to Look Out For

A brand new tool on the block, aimed at high-throughput detection of BGCs, is BGCProphet. It utilizes a transformer-based language model that treats genes as “words” within a genomic context, allowing it to “learn” the syntax and vocabulary of BGCs from known examples. The developers posit that this method can identify novel BGC types missed by rule-based finders like antiSMASH and demonstrated its scalability by screening over 90,000 genomes and MAGs to identify hundreds of thousands of putative BGCs. This approach is undoubtedly promising for uncovering novel BGCs, but some aspects of the initial report warrant careful consideration. A key concern relates to the model’s training and benchmarking. Because antiSMASH was used to define the negative training set (genomes with antiSMASH-defined BGCs removed), that set may have inadvertently contained novel BGCs that antiSMASH cannot detect, potentially biasing the model against identifying the very targets that it aims to find. Furthermore, although BGCProphet identifies a greater number of BGCs than established tools, the report lacks the experimental validation needed to confirm that these additional predictions are genuine. For me, this raises questions about the model’s specificity and its ability to distinguish true BGCs from other colocated gene clusters with similar organizational features, such as operons involved in primary metabolism. Ultimately, BGCProphet represents an intriguing and powerful application of AI for BGC discovery, but the true test of its utility will be the experimental validation of its predictions by the natural products community. For now, BGCProphet could be viewed as a promising complementary tool whose novel predictions, when used in tandem with established methods, could reveal previously hidden biosynthetic potential.
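The concern about the negative training set can be made concrete with a toy example. The sketch below is not BGCProphet code and makes no claim about its internals: the gene-family labels, the "true BGC" flags (knowable only in a toy setting), and the detected region are all hypothetical. It shows a genome turned into a "sentence" of gene tokens, and how masking only the regions a rule-based finder detects can leave genes from an undetected BGC inside the negatives.

```python
from typing import Dict, List, Tuple, Union

Gene = Dict[str, Union[str, bool]]

# Hypothetical genome: ordered genes with a family token and a ground-truth flag.
toy_genome: List[Gene] = [
    {"family": "ribosomal", "true_bgc": False},
    {"family": "transporter", "true_bgc": False},
    {"family": "pks_ks", "true_bgc": True},
    {"family": "pks_at", "true_bgc": True},
    {"family": "tailoring", "true_bgc": True},
    {"family": "kinase", "true_bgc": False},
    {"family": "novel_core", "true_bgc": True},
    {"family": "novel_tailor", "true_bgc": True},
    {"family": "housekeeping", "true_bgc": False},
]

# Half-open gene-index ranges that a rule-based finder is assumed to detect.
toy_detected: List[Tuple[int, int]] = [(2, 5)]


def genome_to_sentence(genes: List[Gene]) -> List[str]:
    """Represent a genome as an ordered list of gene-family tokens ("words")."""
    return [str(g["family"]) for g in genes]


def build_negative_set(genes: List[Gene],
                       detected: List[Tuple[int, int]]) -> List[Gene]:
    """Drop every gene that falls inside a detected region; keep the rest."""
    def masked(index: int) -> bool:
        return any(start <= index < end for start, end in detected)
    return [g for i, g in enumerate(genes) if not masked(i)]


def hidden_bgc_genes(negative: List[Gene]) -> List[str]:
    """List tokens in the negative set that actually belong to a BGC."""
    return [str(g["family"]) for g in negative if g["true_bgc"]]


negatives = build_negative_set(toy_genome, toy_detected)
print(genome_to_sentence(negatives))
print(hidden_bgc_genes(negatives))
```

Here the two "novel" genes survive the masking step and are labeled as non-BGC, which is exactly the kind of label noise that could teach a model to ignore the clusters it is meant to find.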

One of the most compelling technologies currently being developed is spatially resolved metagenomics and metatranscriptomics. As the biosynthesis of bioactive natural products often occurs in very specific niches or microenvironments (e.g., soil aggregates, particular host tissues, and biofilms), spatial investigation could provide critical insights into the ecological underpinnings that trigger the production of these compounds. This could lead to significant advancement in cultivation strategies or activation approaches in vitro. In a recent elegant series of experiments, a team of researchers investigated how a plant host (Arabidopsis thaliana leaves) and its harbored microorganisms interact through the sequencing of short-read data captured on arrays with a variety of probes (i.e., a mix of probes for 16S, 18S, and ITS rRNA sequences, and poly-d(T) probes for untargeted host mRNA sequences). At a spatial resolution of 55 μm, the researchers could survey the A. thaliana host response to the density and diversity of microbial “hotspots” on leaf samples. For example, when the researchers artificially infected a leaf with a Pseudomonas pathogen, they observed an increase in the plant host immune response that colocalized with the pathogen. Further, they used this approach to map the distribution of bacterial and fungal taxa on leaves of outdoor-grown A. thaliana plants and found that the native bacteria and fungi occupied distinct sections of the leaves (with some overlapping areas) and that the plant host expressed a higher level of defense-related genes in these regions. Although this study did not explore BGCs, the approach paves the way for investigating their spatial expression. Incorporating probes for conserved regions of PKS/NRPS genes, for example, could reveal how BGCs are expressed across host tissues and whether host factors, microbial density, or microbial activity influence their activation.

Finally, an exciting new platform for natural product discovery, published by researchers from the University of Illinois Urbana–Champaign in March 2025, is FAST-NPS. The FAST-NPS pipeline covers all stages, from genome mining and BGC prediction to heterologous expression and product isolation. The key innovation of the pipeline is utilizing self-resistance genes for BGC prioritization (as used in previous studies, one of which was described earlier in this review). As a proof-of-concept, the developers used their pipeline to screen 11 Streptomyces species and clone 105 BGCs into heterologous Streptomyces lividans hosts, leading to the recovery of 23 compounds, of which eight had either antibacterial or antineoplastic activity. The pipeline utilizes two existing tools: antiSMASH for BGC detection and ARTS to detect self-resistance genes. However, the platform’s general applicability is limited by its use of the iBioFAB robotics system, which automates the CAPTURE process for capturing, cloning, and heterologous expression of BGCs, and this aspect of the platform may not be readily accessible to all researchers. Nonetheless, researchers with limited resources may still be able to apply many of the strategic approaches employed by the FAST-NPS developers to create an effective, albeit lower-throughput, version of the platform.

Emerging Regulatory Framework for Access to Digital Sequence Information

In an effort to protect the rich biodiversity often stewarded by developing countries, the Convention on Biological Diversity (CBD) was supplemented by the Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization (generally referred to as the Nagoya Protocol), which entered into force in 2014 and established guidelines for access to these materials and resources. Simply put, if researchers or companies aim to use a genetic resource, they need to obtain permission from the country of origin and/or community from which that resource originated as well as agree to share any benefits gained from the resource with the originating country or community. At the time of writing this review, 142 parties have ratified the protocol (i.e., formally agreed to be bound by the protocol). While the protocol was initially drafted with physical, biological samples in mind, there have been subsequent efforts to include digital sequence information (DSI), or genomic sequence data (GSD), within the scope of the protocol. While these two terms are not entirely synonymous, there is significant overlap between the two concepts, and I will use the acronym DSI with the intent of discussing the topic with both DSI and GSD in mind.

Given that a significant percentage of approved drugs are either natural products or a derivative thereof, and that natural products have been reported to lead to an increased likelihood of clinical success, the Nagoya Protocol should be considered in the context of ethical natural product drug discovery. Additionally, given the increasing role that sequencing data may play in the future of natural product drug discovery, the topic of the use of DSI requires a nuanced consideration. The subject has been comprehensively reviewed by several experts, but I will briefly summarize the key points here.

While open access to DSI for research purposes is lauded as a boon for the scientific community, aiding global efforts to rapidly address challenges and facilitating innovation, the advent and subsequent advancement of both DNA sequencing and synthesis technologies have led to fears of a potential digital version of biopiracy. In short, access to nucleotide sequences can negate the need to travel to a country, acquire necessary permits, set up sharing agreements, and physically collect a sample. Instead, nucleotide sequences can be harvested online and synthesized before being cloned into an appropriate host. This can bypass agreement mechanisms and, by extension, the legal requirements for benefit-sharing. Currently, the actions of delegates tasked with finding a solution to this problem are lagging behind the rapid pace at which sequencing, data transfer, hosting, and distribution technologies are evolving. Compounding the challenges of a rapidly shifting technological environment, the exact definition of DSI has not yet been agreed upon. Even if a definition of DSI were agreed upon, ensuring compliance in the digital age would remain a significant hurdle.

While different solutions have been proposed, the Conference of the Parties (COP) meeting 15 (COP15) resulted in the proposal for the establishment of a global fund, among other strategies, to address these issues. The goal was for countries to have negotiated terms to achieve this by October 2024, with guidelines stating that the multilateral mechanism by which benefits are shared must not hinder research or innovation, must allow open access data sharing, and must at the same time provide reliable legal provisions and generate more benefit than cost. In response, following COP16, the Cali Fund for the Fair and Equitable Sharing of Benefits from the Use of Digital Sequence Information (the “Cali Fund”) was launched in February 2025. Significant hurdles and details remain to be worked out, but natural product researchers should be aware of these and other ongoing initiatives and their potential effects on research, and should contribute their expertise where possible. Moving forward, continued international cooperation and innovative mechanisms will be crucial to both encourage research and discovery and enable equitable benefit-sharing. A future without robust conservation is a future with dwindling biodiversity and, in turn, a significantly impoverished landscape for natural product discovery.

Conclusion

From the foundational Sanger method to revolutionary next-gen technologies, sequencing has steadily revealed the hidden chemical potential encoded within diverse organisms. Marching in lockstep with the ever-advancing technologies are the algorithms for quality control, contig assembly, and crucially, for the targeted mining of biosynthetic gene clusters (BGCs), which have become important technologies for a natural product researcher. These tools not only enable the efficient identification of BGCs but also facilitate large-scale comparisons, allowing researchers to quickly assess the novelty and potential bioactivity of newly discovered gene clusters. In this way, sequencing has been instrumental in unlocking the “microbial dark matter”, revealing the vast biosynthetic capabilities of previously unknown and unculturable organisms. Beyond the identification of BGCs, transcriptomic analyses have provided a critical window into the expression and regulation of BGCs and their associated natural products. This allows researchers to optimize culture conditions for maximal compound production, prioritize promising microbial strains, and even challenge long-held assumptions about the timing of secondary metabolite production. As sequencing technologies continue to advance in accuracy, speed, and affordability and as bioinformatic tools become even more sophisticated, the future of natural product compound discovery promises an exciting era of exploration. The integration of “omics” data, artificial intelligence, and synthetic biology, all underpinned by the power of nucleotide sequencing, will undoubtedly lead to a cascade of discoveries that will invigorate and advance the field of natural product discovery for years to come.

Acknowledgments

This research was supported in whole with Federal Funds from the National Cancer Institute, Center for Cancer Research, National Institutes of Health, Department of Health and Human Services, Intramural Program, under project numbers ZIA 011471 (B.R.O.) and ZIA 011469 (J.A.B.). I would also like to extend my deep appreciation to Dr. Barry O’Keefe, Dr. John Beutler, and Prof. Rosemary Dorrington, who provided thoughtful, helpful, and constructive feedback during the preparation of this report.

The author declares no competing financial interest.

References

  1. Suring W., Hoogduin D., Le Ngoc G., Brouwer A., van Straalen N. M., Roelofs D. Nonribosomal Peptide Synthetases in Animals. Genes (Basel) 2023;14(9):1741. doi: 10.3390/genes14091741.
  2. Robey M. T., Caesar L. K., Drott M. T., Keller N. P., Kelleher N. L. An Interpreted Atlas of Biosynthetic Gene Clusters from 1,000 Fungal Genomes. Proc. Natl. Acad. Sci. U. S. A. 2021;118(19):e2020230118. doi: 10.1073/pnas.2020230118.
  3. Zhgun A. A. Fungal BGCs for Production of Secondary Metabolites: Main Types, Central Roles in Strain Improvement, and Regulation According to the Piano Principle. Int. J. Mol. Sci. 2023;24(13):11184. doi: 10.3390/ijms241311184.
  4. Kautsar S. A., van der Hooft J. J. J., de Ridder D., Medema M. H. BiG-SLiCE: A Highly Scalable Tool Maps the Diversity of 1.2 Million Biosynthetic Gene Clusters. Gigascience. 2021;10:giaa154. doi: 10.1093/gigascience/giaa154.
  5. Feng L., Gordon M. T., Liu Y., Basso K. B., Butcher R. A. Mapping the Biosynthetic Pathway of a Hybrid Polyketide-Nonribosomal Peptide in a Metazoan. Nat. Commun. 2021;12(1):4912. doi: 10.1038/s41467-021-24682-9.
  6. Heather J. M., Chain B. The Sequence of Sequencers: The History of Sequencing DNA. Genomics. 2016;107(1):1–8. doi: 10.1016/j.ygeno.2015.11.003.
  7. Shendure J., Balasubramanian S., Church G. M., Gilbert W., Rogers J., Schloss J. A., Waterston R. H. DNA Sequencing at 40: Past, Present and Future. Nature. 2017;550(7676):345–353. doi: 10.1038/nature24286.
  8. Tyagi P., Bhide M. History of DNA Sequencing. Folia Vet. 2020;64(2):66–73. doi: 10.2478/fv-2020-0019.
  9. Giani A. M., Gallo G. R., Gianfranceschi L., Formenti G. Long Walk to Genomics: History and Current Approaches to Genome Sequencing and Assembly. Comput. Struct. Biotechnol. J. 2020;18:9–19. doi: 10.1016/j.csbj.2019.11.002.
  10. Satam H., Joshi K., Mangrolia U., Waghoo S., Zaidi G., Rawool S., Thakare R. P., Banday S., Mishra A. K., Das G., Malonia S. K. Next-Generation Sequencing Technology: Current Trends and Advancements. Biology (Basel) 2023;12(7):997. doi: 10.3390/biology12070997.
  11. Levy S. E., Boone B. E. Next-Generation Sequencing Strategies. Cold Spring Harb. Perspect. Med. 2019;9(7):a025791. doi: 10.1101/cshperspect.a025791.
  12. Sanger F., Nicklen S., Coulson A. R. DNA Sequencing with Chain-Terminating Inhibitors. Proc. Natl. Acad. Sci. U. S. A. 1977;74(12):5463–5467. doi: 10.1073/pnas.74.12.5463.
  13. Maxam A. M., Gilbert W. A New Method for Sequencing DNA. Proc. Natl. Acad. Sci. U. S. A. 1977;74(2):560–564. doi: 10.1073/pnas.74.2.560.
  14. Smith L. M., Sanders J. Z., Kaiser R. J., Hughes P., Dodd C., Connell C. R., Heiner C., Kent S. B., Hood L. E. Fluorescence Detection in Automated DNA Sequence Analysis. Nature. 1986;321(6071):674–679. doi: 10.1038/321674a0.
  15. Chait E., Page G., Hunkapiller M. Battle of the DNA Sequencers. Nature. 1988;333(6172):477–478. doi: 10.1038/333477a0.
  16. Ronaghi M., Uhlén M., Nyrén P. A Sequencing Method Based on Real-Time Pyrophosphate. Science. 1998;281(5375):363. doi: 10.1126/science.281.5375.363.
  17. Ronaghi M., Karamohamed S., Pettersson B., Uhlén M., Nyrén P. Real-Time DNA Sequencing Using Detection of Pyrophosphate Release. Anal. Biochem. 1996;242(1):84–89. doi: 10.1006/abio.1996.0432.
  18. Margulies M., Egholm M., Altman W. E., Attiya S., Bader J. S., Bemben L. A., Berka J., Braverman M. S., Chen Y.-J., Chen Z., Dewell S. B., Du L., Fierro J. M., Gomes X. V., Godwin B. C., He W., Helgesen S., Ho C. H., Irzyk G. P., Jando S. C., Alenquer M. L. I., Jarvie T. P., Jirage K. B., Kim J.-B., Knight J. R., Lanza J. R., Leamon J. H., Lefkowitz S. M., Lei M., Li J., Lohman K. L., Lu H., Makhijani V. B., McDade K. E., McKenna M. P., Myers E. W., Nickerson E., Nobile J. R., Plant R., Puc B. P., Ronan M. T., Roth G. T., Sarkis G. J., Simons J. F., Simpson J. W., Srinivasan M., Tartaro K. R., Tomasz A., Vogt K. A., Volkmer G. A., Wang S. H., Wang Y., Weiner M. P., Yu P., Begley R. F., Rothberg J. M. Genome Sequencing in Microfabricated High-Density Picolitre Reactors. Nature. 2005;437(7057):376–380. doi: 10.1038/nature03959.
  19. Ronaghi M., Shokralla S., Gharizadeh B. Pyrosequencing for Discovery and Analysis of DNA Sequence Variations. Pharmacogenomics. 2007;8(10):1437–1441. doi: 10.2217/14622416.8.10.1437.
  20. Rothberg J. M., Hinz W., Rearick T. M., Schultz J., Mileski W., Davey M., Leamon J. H., Johnson K., Milgrew M. J., Edwards M., Hoon J., Simons J. F., Marran D., Myers J. W., Davidson J. F., Branting A., Nobile J. R., Puc B. P., Light D., Clark T. A., Huber M., Branciforte J. T., Stoner I. B., Cawley S. E., Lyons M., Fu Y., Homer N., Sedova M., Miao X., Reed B., Sabina J., Feierstein E., Schorn M., Alanjary M., Dimalanta E., Dressman D., Kasinskas R., Sokolsky T., Fidanza J. A., Namsaraev E., McKernan K. J., Williams A., Roth G. T., Bustillo J. An Integrated Semiconductor Device Enabling Non-Optical Genome Sequencing. Nature. 2011;475(7356):348–352. doi: 10.1038/nature10242.
  21. History of Illumina Sequencing. https://www.illumina.com/science/technology/next-generation-sequencing/illumina-sequencing-history.html (accessed 2025-05-05).
  22. Slatko B. E., Gardner A. F., Ausubel F. M.. Overview of Next-Generation Sequencing Technologies. Curr. Protoc. Mol. Biol. 2018;122(1):e59. doi: 10.1002/cpmb.59. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Marx V.. Method of the Year: Long-Read Sequencing. Nat. Methods. 2023;20(1):6–11. doi: 10.1038/s41592-022-01730-w. [DOI] [PubMed] [Google Scholar]
  24. https://www.pacb.com/blog/long-read-sequencing/ (accessed 2025-05-05).
  25. Athanasopoulou K., Boti M. A., Adamopoulos P. G., Skourou P. C., Scorilas A.. Third-Generation Sequencing: The Spearhead towards the Radical Transformation of Modern Genomics. Life (Basel). 2022;12(1):30. doi: 10.3390/life12010030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Jain M., Fiddes I. T., Miga K. H., Olsen H. E., Paten B., Akeson M.. Improved Data Analysis for the MinION Nanopore Sequencer. Nat. Methods. 2015;12(4):351–356. doi: 10.1038/nmeth.3290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Warburton P. E., Sebra R. P.. Long-Read DNA Sequencing: Recent Advances and Remaining Challenges. Annu. Rev. Genomics Hum. Genet. 2023;24:109–132. doi: 10.1146/annurev-genom-101722-103045. [DOI] [PubMed] [Google Scholar]
  28. Stevens B. M., Creed T. B., Reardon C. L., Manter D. K.. Comparison of Oxford Nanopore Technologies and Illumina MiSeq Sequencing with Mock Communities and Agricultural Soil. Sci. Rep. 2023;13(1):9323. doi: 10.1038/s41598-023-36101-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Espinosa E., Bautista R., Larrosa R., Plata O.. Advancements in Long-Read Genome Sequencing Technologies and Algorithms. Genomics. 2024;116(3):110842. doi: 10.1016/j.ygeno.2024.110842. [DOI] [PubMed] [Google Scholar]
  30. https://www.pacb.com/technology/hifi-sequencing/ (accessed 2025-05-06).
  31. Nanopore sequencing accuracy. Oxford Nanopore Technologies. https://nanoporetech.com/platform/accuracy/ (accessed 2025-05-06).
  32. Zhao W., Zeng W., Pang B., Luo M., Peng Y., Xu J., Kan B., Li Z., Lu X.. Oxford Nanopore Long-Read Sequencing Enables the Generation of Complete Bacterial and Plasmid Genomes without Short-Read Sequencing. Front. Microbiol. 2023;14:1179966. doi: 10.3389/fmicb.2023.1179966. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Sereika M., Kirkegaard R. H., Karst S. M., Michaelsen T. Y., Sørensen E. A., Wollenberg R. D., Albertsen M.. Oxford Nanopore R10.4 Long-Read Sequencing Enables the Generation of near-Finished Bacterial Genomes from Pure Cultures and Metagenomes without Short-Read or Reference Polishing. Nat. Methods. 2022;19(7):823–826. doi: 10.1038/s41592-022-01539-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Meyer F., Fritz A., Deng Z.-L., Koslicki D., Lesker T. R., Gurevich A., Robertson G., Alser M., Antipov D., Beghini F., Bertrand D., Brito J. J., Brown C. T., Buchmann J., Buluç A., Chen B., Chikhi R., Clausen P. T. L. C., Cristian A., Dabrowski P. W., Darling A. E., Egan R., Eskin E., Georganas E., Goltsman E., Gray M. A., Hansen L. H., Hofmeyr S., Huang P., Irber L., Jia H., Jørgensen T. S., Kieser S. D., Klemetsen T., Kola A., Kolmogorov M., Korobeynikov A., Kwan J., LaPierre N., Lemaitre C., Li C., Limasset A., Malcher-Miranda F., Mangul S., Marcelino V. R., Marchet C., Marijon P., Meleshko D., Mende D. R., Milanese A., Nagarajan N., Nissen J., Nurk S., Oliker L., Paoli L., Peterlongo P., Piro V. C., Porter J. S., Rasmussen S., Rees E. R., Reinert K., Renard B., Robertsen E. M., Rosen G. L., Ruscheweyh H.-J., Sarwal V., Segata N., Seiler E., Shi L., Sun F., Sunagawa S., Sørensen S. J., Thomas A., Tong C., Trajkovski M., Tremblay J., Uritskiy G., Vicedomini R., Wang Z., Wang Z., Wang Z., Warren A., Willassen N. P., Yelick K., You R., Zeller G., Zhao Z., Zhu S., Zhu J., Garrido-Oter R., Gastmeier P., Hacquard S., Häußler S., Khaledi A., Maechler F., Mesny F., Radutoiu S., Schulze-Lefert P., Smit N., Strowig T., Bremges A., Sczyrba A., McHardy A. C.. Critical Assessment of Metagenome Interpretation: The Second Round of Challenges. Nat. Methods. 2022;19(4):429–440. doi: 10.1038/s41592-022-01431-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Sczyrba A., Hofmann P., Belmann P., Koslicki D., Janssen S., Dröge J., Gregor I., Majda S., Fiedler J., Dahms E., Bremges A., Fritz A., Garrido-Oter R., Jørgensen T. S., Shapiro N., Blood P. D., Gurevich A., Bai Y., Turaev D., DeMaere M. Z., Chikhi R., Nagarajan N., Quince C., Meyer F., Balvočiu̅tė M., Hansen L. H., Sørensen S. J., Chia B. K. H., Denis B., Froula J. L., Wang Z., Egan R., Don Kang D., Cook J. J., Deltel C., Beckstette M., Lemaitre C., Peterlongo P., Rizk G., Lavenier D., Wu Y.-W., Singer S. W., Jain C., Strous M., Klingenberg H., Meinicke P., Barton M. D., Lingner T., Lin H.-H., Liao Y.-C., Silva G. G. Z., Cuevas D. A., Edwards R. A., Saha S., Piro V. C., Renard B. Y., Pop M., Klenk H.-P., Göker M., Kyrpides N. C., Woyke T., Vorholt J. A., Schulze-Lefert P., Rubin E. M., Darling A. E., Rattei T., McHardy A. C.. Critical Assessment of Metagenome Interpretation-a Benchmark of Metagenomics Software. Nat. Methods. 2017;14(11):1063–1071. doi: 10.1038/nmeth.4458. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Babraham bioinformatics - FastQC A quality control tool for high throughput sequence data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed 2025-05-07).
  37. Guo A., Salzberg S. L., Zimin A. V.. JASPER: A Fast Genome Polishing Tool That Improves Accuracy of Genome Assemblies. PLoS Comput. Biol. 2023;19(3):e1011032. doi: 10.1371/journal.pcbi.1011032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Luan T., Commichaux S., Hoffmann M., Jayeola V., Jang J. H., Pop M., Rand H., Luo Y.. Benchmarking Short and Long Read Polishing Tools for Nanopore Assemblies: Achieving near-Perfect Genomes for Outbreak Isolates. BMC Genomics. 2024;25(1):679. doi: 10.1186/s12864-024-10582-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Chen Z., Erickson D. L., Meng J.. Benchmarking Hybrid Assembly Approaches for Genomic Analyses of Bacterial Pathogens Using Illumina and Oxford Nanopore Sequencing. BMC Genomics. 2020;21(1):631. doi: 10.1186/s12864-020-07041-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Mc Cartney A. M., Shafin K., Alonge M., Bzikadze A. V., Formenti G., Fungtammasan A., Howe K., Jain C., Koren S., Logsdon G. A., Miga K. H., Mikheenko A., Paten B., Shumate A., Soto D. C., Sović I., Wood J. M. D., Zook J. M., Phillippy A. M., Rhie A.. Chasing Perfection: Validation and Polishing Strategies for Telomere-to-Telomere Genome Assemblies. Nat. Methods. 2022;19(6):687–695. doi: 10.1038/s41592-022-01440-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Gurevich A., Saveliev V., Vyahhi N., Tesler G.. QUAST: Quality Assessment Tool for Genome Assemblies. Bioinformatics. 2013;29(8):1072–1075. doi: 10.1093/bioinformatics/btt086. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Bolger A. M., Lohse M., Usadel B.. Trimmomatic: A Flexible Trimmer for Illumina Sequence Data. Bioinformatics. 2014;30(15):2114–2120. doi: 10.1093/bioinformatics/btu170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Martin M.. Cutadapt Removes Adapter Sequences from High-Throughput Sequencing Reads. EMBnet J. 2011;17(1):10. doi: 10.14806/ej.17.1.200. [DOI] [Google Scholar]
  44. Chen S.. Ultrafast One-Pass FASTQ Data Preprocessing, Quality Control, and Deduplication Using Fastp. Imeta. 2023;2(2):e107. doi: 10.1002/imt2.107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. BBDuk guide. Archive. https://archive.jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/ (accessed 2025-05-07).
  46. Fukasawa Y., Ermini L., Wang H., Carty K., Cheung M.-S.. LongQC: A Quality Control Tool for Third Generation Sequencing Long Read Data. G3 (Bethesda). 2020;10(4):1193–1196. doi: 10.1534/g3.119.400864. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. De Coster W., D’Hert S., Schultz D. T., Cruts M., Van Broeckhoven C.. NanoPack: Visualizing and Processing Long-Read Sequencing Data. Bioinformatics. 2018;34(15):2666–2669. doi: 10.1093/bioinformatics/bty149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Perdomo J. E., Ahsan M. U., Liu Q., Fang L., Wang K.. LongReadSum: A Fast and Flexible Quality Control and Signal Summarization Tool for Long-Read Sequencing Data. Comput. Struct. Biotechnol. J. 2025;27:556–563. doi: 10.1016/j.csbj.2025.01.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Prjibelski A., Antipov D., Meleshko D., Lapidus A., Korobeynikov A.. Using SPAdes De Novo Assembler. Curr. Protoc. Bioinformatics. 2020;70(1):e102. doi: 10.1002/cpbi.102. [DOI] [PubMed] [Google Scholar]
  50. Bertrand D., Shaw J., Kalathiyappan M., Ng A. H. Q., Kumar M. S., Li C., Dvornicic M., Soldo J. P., Koh J. Y., Tong C., Ng O. T., Barkham T., Young B., Marimuthu K., Chng K. R., Sikic M., Nagarajan N.. Hybrid Metagenomic Assembly Enables High-Resolution Analysis of Resistance Determinants and Mobile Elements in Human Microbiomes. Nat. Biotechnol. 2019;37(8):937–944. doi: 10.1038/s41587-019-0191-2. [DOI] [PubMed] [Google Scholar]
  51. Li D., Liu C.-M., Luo R., Sadakane K., Lam T.-W.. MEGAHIT: An Ultra-Fast Single-Node Solution for Large and Complex Metagenomics Assembly via Succinct de Bruijn Graph. Bioinformatics. 2015;31(10):1674–1676. doi: 10.1093/bioinformatics/btv033. [DOI] [PubMed] [Google Scholar]
  52. Liu L., Yang Y., Deng Y., Zhang T.. Nanopore Long-Read-Only Metagenomics Enables Complete and High-Quality Genome Reconstruction from Mock and Complex Metagenomes. Microbiome. 2022;10(1):209. doi: 10.1186/s40168-022-01415-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Kolmogorov M., Yuan J., Lin Y., Pevzner P. A.. Assembly of Long, Error-Prone Reads Using Repeat Graphs. Nat. Biotechnol. 2019;37(5):540–546. doi: 10.1038/s41587-019-0072-8. [DOI] [PubMed] [Google Scholar]
  54. Koren S., Walenz B. P., Berlin K., Miller J. R., Bergman N. H., Phillippy A. M.. Canu: Scalable and Accurate Long-Read Assembly via Adaptive k-Mer Weighting and Repeat Separation. Genome Res. 2017;27(5):722–736. doi: 10.1101/gr.215087.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Wick R. R., Judd L. M., Gorrie C. L., Holt K. E.. Unicycler: Resolving Bacterial Genome Assemblies from Short and Long Sequencing Reads. PLoS Comput. Biol. 2017;13(6):e1005595. doi: 10.1371/journal.pcbi.1005595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Vaser R., Sović I., Nagarajan N., Šikić M.. Fast and Accurate de Novo Genome Assembly from Long Uncorrected Reads. Genome Res. 2017;27(5):737–746. doi: 10.1101/gr.214270.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Walker B. J., Abeel T., Shea T., Priest M., Abouelliel A., Sakthikumar S., Cuomo C. A., Zeng Q., Wortman J., Young S. K., Earl A. M.. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLoS One. 2014;9(11):e112963. doi: 10.1371/journal.pone.0112963. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Wick R. R., Holt K. E.. Polypolish: Short-Read Polishing of Long-Read Bacterial Genome Assemblies. PLoS Comput. Biol. 2022;18(1):e1009802. doi: 10.1371/journal.pcbi.1009802. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Medaka: Sequence Correction Provided by ONT Research. Github. https://github.com/nanoporetech/medaka.
  60. Medema M. H., Blin K., Cimermancic P., de Jager V., Zakrzewski P., Fischbach M. A., Weber T., Takano E., Breitling R.. AntiSMASH: Rapid Identification, Annotation and Analysis of Secondary Metabolite Biosynthesis Gene Clusters in Bacterial and Fungal Genome Sequences. Nucleic Acids Res. 2011;39:W339–46. doi: 10.1093/nar/gkr466. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Blin K., Medema M. H., Kazempour D., Fischbach M. A., Breitling R., Takano E., Weber T.. AntiSMASH 2.0-a Versatile Platform for Genome Mining of Secondary Metabolite Producers. Nucleic Acids Res. 2013;41:W204–12. doi: 10.1093/nar/gkt449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Weber T., Blin K., Duddela S., Krug D., Kim H. U., Bruccoleri R., Lee S. Y., Fischbach M. A., Müller R., Wohlleben W., Breitling R., Takano E., Medema M. H.. AntiSMASH 3.0-a Comprehensive Resource for the Genome Mining of Biosynthetic Gene Clusters. Nucleic Acids Res. 2015;43(W1):W237–43. doi: 10.1093/nar/gkv437. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Blin K., Wolf T., Chevrette M. G., Lu X., Schwalen C. J., Kautsar S. A., Suarez Duran H. G., de Los Santos E. L. C., Kim H. U., Nave M., Dickschat J. S., Mitchell D. A., Shelest E., Breitling R., Takano E., Lee S. Y., Weber T., Medema M. H.. AntiSMASH 4.0-Improvements in Chemistry Prediction and Gene Cluster Boundary Identification. Nucleic Acids Res. 2017;45(W1):W36–W41. doi: 10.1093/nar/gkx319. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Blin K., Shaw S., Steinke K., Villebro R., Ziemert N., Lee S. Y., Medema M. H., Weber T.. AntiSMASH 5.0: Updates to the Secondary Metabolite Genome Mining Pipeline. Nucleic Acids Res. 2019;47(W1):W81–W87. doi: 10.1093/nar/gkz310. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Blin K., Shaw S., Kloosterman A. M., Charlop-Powers Z., van Wezel G. P., Medema M. H., Weber T.. AntiSMASH 6.0: Improving Cluster Detection and Comparison Capabilities. Nucleic Acids Res. 2021;49(W1):W29–W35. doi: 10.1093/nar/gkab335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Blin K., Shaw S., Augustijn H. E., Reitz Z. L., Biermann F., Alanjary M., Fetter A., Terlouw B. R., Metcalf W. W., Helfrich E. J. N., van Wezel G. P., Medema M. H., Weber T.. AntiSMASH 7.0: New and Improved Predictions for Detection, Regulation, Chemical Structures and Visualisation. Nucleic Acids Res. 2023;51(W1):W46–W50. doi: 10.1093/nar/gkad344. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Blin K., Shaw S., Vader L., Szenei J., Reitz Z. L., Augustijn H. E., Cediel-Becerra J. D. D., de Crécy-Lagard V., Koetsier R. A., Williams S. E., Cruz-Morales P., Wongwas S., Segurado Luchsinger A. E., Biermann F., Korenskaia A., Zdouc M. M., Meijer D., Terlouw B. R., van der Hooft J. J. J., Ziemert N., Helfrich E. J. N., Masschelein J., Corre C., Chevrette M. G., van Wezel G. P., Medema M. H., Weber T.. AntiSMASH 8.0: Extended Gene Cluster Detection Capabilities and Analyses of Chemistry, Enzymology, and Regulation. Nucleic Acids Res. 2025 doi: 10.1093/nar/gkaf334. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Kautsar S. A., Suarez Duran H. G., Blin K., Osbourn A., Medema M. H.. PlantiSMASH: Automated Identification, Annotation and Expression Analysis of Plant Biosynthetic Gene Clusters. Nucleic Acids Res. 2017;45(W1):W55–W63. doi: 10.1093/nar/gkx305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Hannigan G. D., Prihoda D., Palicka A., Soukup J., Klempir O., Rampula L., Durcak J., Wurst M., Kotowski J., Chang D., Wang R., Piizzi G., Temesi G., Hazuda D. J., Woelk C. H., Bitton D. A.. A Deep Learning Genome-Mining Strategy for Biosynthetic Gene Cluster Prediction. Nucleic Acids Res. 2019;47(18):e110. doi: 10.1093/nar/gkz654. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Skinnider M. A., Johnston C. W., Gunabalasingam M., Merwin N. J., Kieliszek A. M., MacLellan R. J., Li H., Ranieri M. R. M., Webster A. L. H., Cao M. P. T., Pfeifle A., Spencer N., To Q. H., Wallace D. P., Dejong C. A., Magarvey N. A.. Comprehensive Prediction of Secondary Metabolite Structure and Biological Activity from Microbial Genome Sequences. Nat. Commun. 2020;11(1):6058. doi: 10.1038/s41467-020-19986-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Skinnider M. A., Dejong C. A., Rees P. N., Johnston C. W., Li H., Webster A. L. H., Wyatt M. A., Magarvey N. A.. Genomes to Natural Products PRediction Informatics for Secondary Metabolomes (PRISM). Nucleic Acids Res. 2015;43(20):9645–9662. doi: 10.1093/nar/gkv1012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Cruz-Morales P., Kopp J. F., Martínez-Guerrero C., Yáñez-Guerra L. A., Selem-Mojica N., Ramos-Aboites H., Feldmann J., Barona-Gómez F.. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes. Genome Biol. Evol. 2016;8(6):1906–1916. doi: 10.1093/gbe/evw125. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Carroll L. M., Larralde M., Fleck J. S., Ponnudurai R., Milanese A., Cappio E., Zeller G.. Accurate de Novo Identification of Biosynthetic Gene Clusters with GECCO. bioRxiv. 2021 doi: 10.1101/2021.05.03.442509. [DOI] [Google Scholar]
  74. Lai Q., Yao S., Zha Y., Zhang H., Zhang H., Ye Y., Zhang Y., Bai H., Ning K.. Deciphering the Biosynthetic Potential of Microbial Genomes Using a BGC Language Processing Neural Network Model. Nucleic Acids Res. 2025;53(7):gkaf305. doi: 10.1093/nar/gkaf305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Khaldi N., Seifuddin F. T., Turner G., Haft D., Nierman W. C., Wolfe K. H., Fedorova N. D.. SMURF: Genomic Mapping of Fungal Secondary Metabolite Clusters. Fungal Genet. Biol. 2010;47(9):736–741. doi: 10.1016/j.fgb.2010.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Almeida H., Palys S., Tsang A., Diallo A. B.. TOUCAN: A Framework for Fungal Biosynthetic Gene Cluster Discovery. NAR Genom. Bioinform. 2020;2(4):lqaa098. doi: 10.1093/nargab/lqaa098. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Agrawal P., Khater S., Gupta M., Sain N., Mohanty D.. RiPPMiner: A Bioinformatics Resource for Deciphering Chemical Structures of RiPPs Based on Prediction of Cleavage and Cross-Links. Nucleic Acids Res. 2017;45(W1):W80–W88. doi: 10.1093/nar/gkx408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Agrawal P., Amir S., Deepak, Barua D., Mohanty D.. RiPPMiner-Genome: A Web Resource for Automated Prediction of Crosslinked Chemical Structures of RiPPs by Genome Mining. J. Mol. Biol. 2021;433(11):166887. doi: 10.1016/j.jmb.2021.166887. [DOI] [PubMed] [Google Scholar]
  79. Klau L. J., Podell S., Creamer K. E., Demko A. M., Singh H. W., Allen E. E., Moore B. S., Ziemert N., Letzel A. C., Jensen P. R.. The Natural Product Domain Seeker Version 2 (NaPDoS2) Webtool Relates Ketosynthase Phylogeny to Biosynthetic Function. J. Biol. Chem. 2022;298(10):102480. doi: 10.1016/j.jbc.2022.102480. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Ziemert N., Podell S., Penn K., Badger J. H., Allen E., Jensen P. R.. The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity. PLoS One. 2012;7(3):e34064. doi: 10.1371/journal.pone.0034064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. van Heel A. J., de Jong A., Song C., Viel J. H., Kok J., Kuipers O. P.. BAGEL4: A User-Friendly Web Server to Thoroughly Mine RiPPs and Bacteriocins. Nucleic Acids Res. 2018;46(W1):W278–W281. doi: 10.1093/nar/gky383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. de Jong A., van Hijum S. A. F. T., Bijlsma J. J. E., Kok J., Kuipers O. P.. BAGEL: A Web-Based Bacteriocin Genome Mining Tool. Nucleic Acids Res. 2006;34:W273–9. doi: 10.1093/nar/gkl237. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. van Heel A. J., de Jong A., Montalbán-López M., Kok J., Kuipers O. P.. BAGEL3: Automated Identification of Genes Encoding Bacteriocins and (Non-)Bactericidal Posttranslationally Modified Peptides. Nucleic Acids Res. 2013;41:W448–53. doi: 10.1093/nar/gkt391. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. de Jong A., van Heel A. J., Kok J., Kuipers O. P.. BAGEL2: Mining for Bacteriocins in Genomic Data. Nucleic Acids Res. 2010;38:W647–51. doi: 10.1093/nar/gkq365. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Tietz J. I., Schwalen C. J., Patel P. S., Maxson T., Blair P. M., Tai H.-C., Zakai U. I., Mitchell D. A.. A New Genome-Mining Tool Redefines the Lasso Peptide Biosynthetic Landscape. Nat. Chem. Biol. 2017;13(5):470–478. doi: 10.1038/nchembio.2319. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Hudson G. A., Burkhart B. J., DiCaprio A. J., Schwalen C. J., Kille B., Pogorelov T. V., Mitchell D. A.. Bioinformatic Mapping of Radical S-Adenosylmethionine-Dependent Ribosomally Synthesized and Post-Translationally Modified Peptides Identifies New Cα, Cβ, and Cγ-Linked Thioether-Containing Peptides. J. Am. Chem. Soc. 2019;141(20):8228–8238. doi: 10.1021/jacs.9b01519. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Navarro-Muñoz J. C., Selem-Mojica N., Mullowney M. W., Kautsar S. A., Tryon J. H., Parkinson E. I., De Los Santos E. L. C., Yeong M., Cruz-Morales P., Abubucker S., Roeters A., Lokhorst W., Fernandez-Guerra A., Cappelini L. T. D., Goering A. W., Thomson R. J., Metcalf W. W., Kelleher N. L., Barona-Gomez F., Medema M. H.. A Computational Framework to Explore Large-Scale Biosynthetic Diversity. Nat. Chem. Biol. 2020;16(1):60–68. doi: 10.1038/s41589-019-0400-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. van den Belt M., Gilchrist C., Booth T. J., Chooi Y.-H., Medema M. H., Alanjary M.. CAGECAT: The CompArative GEne Cluster Analysis Toolbox for Rapid Search and Visualisation of Homologous Gene Clusters. BMC Bioinformatics. 2023;24(1):181. doi: 10.1186/s12859-023-05311-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Nuhamunada M., Mohite O. S., Phaneuf P. V., Palsson B. O., Weber T.. BGCFlow: Systematic Pangenome Workflow for the Analysis of Biosynthetic Gene Clusters across Large Genomic Datasets. Nucleic Acids Res. 2024;52(10):5478–5495. doi: 10.1093/nar/gkae314. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Clark C. M., Kwan J. C.. Creating and Leveraging Bespoke Large-Scale Knowledge Graphs for Comparative Genomics and Multi-Omics Drug Discovery with SocialGene. bioRxiv. 2024 doi: 10.1101/2024.08.16.608329. [DOI] [Google Scholar]
  91. Zdouc M. M., Blin K., Louwen N. L. L., Navarro J., Loureiro C., Bader C. D., Bailey C. B., Barra L., Booth T. J., Bozhüyük K. A. J., Cediel-Becerra J. D. D., Charlop-Powers Z., Chevrette M. G., Chooi Y. H., D’Agostino P. M., de Rond T., Del Pup E., Duncan K. R., Gu W., Hanif N., Helfrich E. J. N., Jenner M., Katsuyama Y., Korenskaia A., Krug D., Libis V., Lund G. A., Mantri S., Morgan K. D., Owen C., Phan C.-S., Philmus B., Reitz Z. L., Robinson S. L., Singh K. S., Teufel R., Tong Y., Tugizimana F., Ulanova D., Winter J. M., Aguilar C., Akiyama D. Y., Al-Salihi S. A. A., Alanjary M., Alberti F., Aleti G., Alharthi S. A., Rojo M. Y. A., Arishi A. A., Augustijn H. E., Avalon N. E., Avelar-Rivas J. A., Axt K. K., Barbieri H. B., Barbosa J. C. J., Barboza Segato L. G., Barrett S. E., Baunach M., Beemelmanns C., Beqaj D., Berger T., Bernaldo-Agüero J., Bettenbühl S. M., Bielinski V. A., Biermann F., Borges R. M., Borriss R., Breitenbach M., Bretscher K. M., Brigham M. W., Buedenbender L., Bulcock B. W., Cano-Prieto C., Capela J., Carrion V. J., Carter R. S., Castelo-Branco R., Castro-Falcón G., Chagas F. O., Charria-Girón E., Chaudhri A. A., Chaudhry V., Choi H., Choi Y., Choupannejad R., Chromy J., Donahey M. S. C., Collemare J., Connolly J. A., Creamer K. E., Crüsemann M., Cruz A. A., Cumsille A., Dallery J.-F., Damas-Ramos L. C., Damiani T., de Kruijff M., Martín B. D., Sala G. D., Dillen J., Doering D. T., Dommaraju S. R., Durusu S., Egbert S., Ellerhorst M., Faussurier B., Fetter A., Feuermann M., Fewer D. P., Foldi J., Frediansyah A., Garza E. A., Gavriilidou A., Gentile A., Gerke J., Gerstmans H., Gomez-Escribano J. P., González-Salazar L. A., Grayson N. E., Greco C., Gomez J. E. G., Guerra S., Flores S. G., Gurevich A., Gutiérrez-García K., Hart L., Haslinger K., He B., Hebra T., Hemmann J. L., Hindra H., Höing L., Holland D. C., Holme J. E., Horch T., Hrab P., Hu J., Huynh T.-H., Hwang J.-Y., Iacovelli R., Iftime D., Iorio M., Jayachandran S., Jeong E., Jing J., Jung J. J., Kakumu Y., Kalkreuter E., Kang K. B., Kang S., Kim W., Kim G. J., Kim H., Kim H. U., Klapper M., Koetsier R. A., Kollten C., Kovács Á. T., Kriukova Y., Kubach N., Kunjapur A. M., Kushnareva A. K., Kust A., Lamber J., Larralde M., Larsen N. J., Launay A. P., Le N.-T.-H., Lebeer S., Lee B. T., Lee K., Lev K. L., Li S.-M., Li Y.-X., Licona-Cassani C., Lien A., Liu J., Lopez J. A. V., Machushynets N. V., Macias M. I., Mahmud T., Maleckis M., Martinez-Martinez A. M., Mast Y., Maximo M. F., McBride C. M., McLellan R. M., Bhatt K. M., Melkonian C., Merrild A., Metsä-Ketelä M., Mitchell D. A., Müller A. V., Nguyen G.-S., Nguyen H. T., Niedermeyer T. H. J., O’Hare J. H., Ossowicki A., Ostash B. O., Otani H., Padva L., Paliyal S., Pan X., Panghal M., Parade D. S., Park J., Parra J., Rubio M. P., Pham H. T., Pidot S. J., Piel J., Pourmohsenin B., Rakhmanov M., Ramesh S., Rasmussen M. H., Rego A., Reher R., Rice A. J., Rigolet A., Romero-Otero A., Rosas-Becerra L. R., Rosiles P. Y., Rutz A., Ryu B., Sahadeo L.-A., Saldanha M., Salvi L., Sánchez-Carvajal E., Santos-Medellin C., Sbaraini N., Schoellhorn S. M., Schumm C., Sehnal L., Selem N., Shah A. D., Shishido T. K., Sieber S., Silviani V., Singh G., Singh H., Sokolova N., Sonnenschein E. C., Sosio M., Sowa S. T., Steffen K., Stegmann E., Streiff A. B., Strüder A., Surup F., Svenningsen T., Sweeney D., Szenei J., Tagirdzhanov A., Tan B., Tarnowski M. J., Terlouw B. R., Rey T., Thome N. U., Torres Ortega L. R., Tørring T., Trindade M., Truman A. W., Tvilum M., Udwary D. W., Ulbricht C., Vader L., van Wezel G. P., Walmsley M., Warnasinghe R., Weddeling H. G., Weir A. N. M., Williams K., Williams S. E., Witte T. E., Rocca S. M. W., Yamada K., Yang D., Yang D., Yu J., Zhou Z., Ziemert N., Zimmer L., Zimmermann A., Zimmermann C., van der Hooft J. J. J., Linington R. G., Weber T., Medema M. H.. MIBiG 4.0: Advancing Biosynthetic Gene Cluster Curation through Global Collaboration. Nucleic Acids Res. 2025;53(D1):D678–D690. doi: 10.1093/nar/gkae1115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Terlouw B. R., Blin K., Navarro-Muñoz J. C., Avalon N. E., Chevrette M. G., Egbert S., Lee S., Meijer D., Recchia M. J. J., Reitz Z. L., van Santen J. A., Selem-Mojica N., Tørring T., Zaroubi L., Alanjary M., Aleti G., Aguilar C., Al-Salihi S. A. A., Augustijn H. E., Avelar-Rivas J. A., Avitia-Domínguez L. A., Barona-Gómez F., Bernaldo-Agüero J., Bielinski V. A., Biermann F., Booth T. J., Carrion Bravo V. J., Castelo-Branco R., Chagas F. O., Cruz-Morales P., Du C., Duncan K. R., Gavriilidou A., Gayrard D., Gutiérrez-García K., Haslinger K., Helfrich E. J. N., van der Hooft J. J. J., Jati A. P., Kalkreuter E., Kalyvas N., Kang K. B., Kautsar S., Kim W., Kunjapur A. M., Li Y.-X., Lin G.-M., Loureiro C., Louwen J. J. R., Louwen N. L. L., Lund G., Parra J., Philmus B., Pourmohsenin B., Pronk L. J. U., Rego A., Rex D. A. B., Robinson S., Rosas-Becerra L. R., Roxborough E. T., Schorn M. A., Scobie D. J., Singh K. S., Sokolova N., Tang X., Udwary D., Vigneshwari A., Vind K., Vromans S. P. J. M., Waschulin V., Williams S. E., Winter J. M., Witte T. E., Xie H., Yang D., Yu J., Zdouc M., Zhong Z., Collemare J., Linington R. G., Weber T., Medema M. H.. MIBiG 3.0: A Community-Driven Effort to Annotate Experimentally Validated Biosynthetic Gene Clusters. Nucleic Acids Res. 2023;51(D1):D603–D610. doi: 10.1093/nar/gkac1049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Kautsar S. A., Blin K., Shaw S., Navarro-Muñoz J. C., Terlouw B. R., van der Hooft J. J. J., van Santen J. A., Tracanna V., Suarez Duran H. G., Pascal Andreu V., Selem-Mojica N., Alanjary M., Robinson S. L., Lund G., Epstein S. C., Sisto A. C., Charkoudian L. K., Collemare J., Linington R. G., Weber T., Medema M. H.. MIBiG 2.0: A Repository for Biosynthetic Gene Clusters of Known Function. Nucleic Acids Res. 2019;48(D1):D454–D458. doi: 10.1093/nar/gkz882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Medema M. H., Kottmann R., Yilmaz P., Cummings M., Biggins J. B., Blin K., de Bruijn I., Chooi Y. H., Claesen J., Coates R. C., Cruz-Morales P., Duddela S., Düsterhus S., Edwards D. J., Fewer D. P., Garg N., Geiger C., Gomez-Escribano J. P., Greule A., Hadjithomas M., Haines A. S., Helfrich E. J. N., Hillwig M. L., Ishida K., Jones A. C., Jones C. S., Jungmann K., Kegler C., Kim H. U., Kötter P., Krug D., Masschelein J., Melnik A. V., Mantovani S. M., Monroe E. A., Moore M., Moss N., Nützmann H.-W., Pan G., Pati A., Petras D., Reen F. J., Rosconi F., Rui Z., Tian Z., Tobias N. J., Tsunematsu Y., Wiemann P., Wyckoff E., Yan X., Yim G., Yu F., Xie Y., Aigle B., Apel A. K., Balibar C. J., Balskus E. P., Barona-Gómez F., Bechthold A., Bode H. B., Borriss R., Brady S. F., Brakhage A. A., Caffrey P., Cheng Y.-Q., Clardy J., Cox R. J., De Mot R., Donadio S., Donia M. S., van der Donk W. A., Dorrestein P. C., Doyle S., Driessen A. J. M., Ehling-Schulz M., Entian K.-D., Fischbach M. A., Gerwick L., Gerwick W. H., Gross H., Gust B., Hertweck C., Höfte M., Jensen S. E., Ju J., Katz L., Kaysser L., Klassen J. L., Keller N. P., Kormanec J., Kuipers O. P., Kuzuyama T., Kyrpides N. C., Kwon H.-J., Lautru S., Lavigne R., Lee C. Y., Linquan B., Liu X., Liu W., Luzhetskyy A., Mahmud T., Mast Y., Méndez C., Metsä-Ketelä M., Micklefield J., Mitchell D. A., Moore B. S., Moreira L. M., Müller R., Neilan B. A., Nett M., Nielsen J., O’Gara F., Oikawa H., Osbourn A., Osburne M. S., Ostash B., Payne S. M., Pernodet J.-L., Petricek M., Piel J., Ploux O., Raaijmakers J. M., Salas J. A., Schmitt E. K., Scott B., Seipke R. F., Shen B., Sherman D. H., Sivonen K., Smanski M. J., Sosio M., Stegmann E., Süssmuth R. D., Tahlan K., Thomas C. M., Tang Y., Truman A. W., Viaud M., Walton J. D., Walsh C. T., Weber T., van Wezel G. P., Wilkinson B., Willey J. M., Wohlleben W., Wright G. D., Ziemert N., Zhang C., Zotchev S. B., Breitling R., Takano E., Glöckner F. O.. Minimum Information about a Biosynthetic Gene Cluster. Nat. Chem. Biol. 2015;11(9):625–631. doi: 10.1038/nchembio.1890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Udwary D. W., Doering D. T., Foster B., Smirnova T., Kautsar S. A., Mouncey N. J.. The Secondary Metabolism Collaboratory: A Database and Web Discussion Portal for Secondary Metabolite Biosynthetic Gene Clusters. Nucleic Acids Res. 2025;53(D1):D717–D723. doi: 10.1093/nar/gkae1060. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Kautsar S. A., Blin K., Shaw S., Weber T., Medema M. H.. BiG-FAM: The Biosynthetic Gene Cluster Families Database. Nucleic Acids Res. 2021;49(D1):D490–D497. doi: 10.1093/nar/gkaa812. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. van Santen J. A., Poynton E. F., Iskakova D., McMann E., Alsup T. A., Clark T. N., Fergusson C. H., Fewer D. P., Hughes A. H., McCadden C. A., Parra J., Soldatou S., Rudolf J. D., Janssen E. M.-L., Duncan K. R., Linington R. G.. The Natural Products Atlas 2.0: A Database of Microbially-Derived Natural Products. Nucleic Acids Res. 2022;50(D1):D1317–D1323. doi: 10.1093/nar/gkab941. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. van Santen J. A., Jacob G., Singh A. L., Aniebok V., Balunas M. J., Bunsko D., Neto F. C., Castaño-Espriu L., Chang C., Clark T. N., Cleary Little J. L., Delgadillo D. A., Dorrestein P. C., Duncan K. R., Egan J. M., Galey M. M., Haeckl F. P. J., Hua A., Hughes A. H., Iskakova D., Khadilkar A., Lee J.-H., Lee S., LeGrow N., Liu D. Y., Macho J. M., McCaughey C. S., Medema M. H., Neupane R. P., O’Donnell T. J., Paula J. S., Sanchez L. M., Shaikh A. F., Soldatou S., Terlouw B. R., Tran T. A., Valentine M., van der Hooft J. J. J., Vo D. A., Wang M., Wilson D., Zink K. E., Linington R. G.. The Natural Products Atlas: An Open Access Knowledge Base for Microbial Natural Products Discovery. ACS Cent. Sci. 2019;5(11):1824–1833. doi: 10.1021/acscentsci.9b00806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Poynton E. F., van Santen J. A., Pin M., Contreras M. M., McMann E., Parra J., Showalter B., Zaroubi L., Duncan K. R., Linington R. G.. The Natural Products Atlas 3.0: Extending the Database of Microbially Derived Natural Products. Nucleic Acids Res. 2025;53(D1):D691–D699. doi: 10.1093/nar/gkae1093. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Bağcı C., Nuhamunada M., Goyat H., Ladanyi C., Sehnal L., Blin K., Kautsar S. A., Tagirdzhanov A., Gurevich A., Mantri S., von Mering C., Udwary D., Medema M. H., Weber T., Ziemert N.. BGC Atlas: A Web Resource for Exploring the Global Chemical Diversity Encoded in Bacterial Genomes. Nucleic Acids Res. 2025;53(D1):D618–D624. doi: 10.1093/nar/gkae953. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Gilchrist C. L. M., Chooi Y.-H.. Clinker & Clustermap.Js: Automatic Generation of Gene Cluster Comparison Figures. Bioinformatics. 2021;37(16):2473–2475. doi: 10.1093/bioinformatics/btab007. [DOI] [PubMed] [Google Scholar]
  102. Kutuzova S., Nielsen M., Piera P., Nissen J. N., Rasmussen S.. Taxometer: Improving Taxonomic Classification of Metagenomics Contigs. Nat. Commun. 2024;15(1):8357. doi: 10.1038/s41467-024-52771-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Mungan M. D., Alanjary M., Blin K., Weber T., Medema M. H., Ziemert N.. ARTS 2.0: Feature Updates and Expansion of the Antibiotic Resistant Target Seeker for Comparative Genome Mining. Nucleic Acids Res. 2020;48(W1):W546–W552. doi: 10.1093/nar/gkaa374. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Alanjary M., Kronmiller B., Adamek M., Blin K., Weber T., Huson D., Philmus B., Ziemert N.. The Antibiotic Resistant Target Seeker (ARTS), an Exploration Engine for Antibiotic Cluster Prioritization and Novel Drug Target Discovery. Nucleic Acids Res. 2017;45(W1):W42–W48. doi: 10.1093/nar/gkx360. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Yılmaz T. M., Mungan M. D., Berasategui A., Ziemert N.. FunARTS, the Fungal BioActive Compound Resistant Target Seeker, an Exploration Engine for Target-Directed Genome Mining in Fungi. Nucleic Acids Res. 2023;51(W1):W191–W197. doi: 10.1093/nar/gkad386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Alanjary M., Steinke K., Ziemert N.. AutoMLST: An Automated Web Server for Generating Multi-Locus Species Trees Highlighting Natural Product Potential. Nucleic Acids Res. 2019;47(W1):W276–W282. doi: 10.1093/nar/gkz282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Asnicar F., Thomas A. M., Beghini F., Mengoni C., Manara S., Manghi P., Zhu Q., Bolzan M., Cumbo F., May U., Sanders J. G., Zolfo M., Kopylova E., Pasolli E., Knight R., Mirarab S., Huttenhower C., Segata N.. Precise Phylogenetic Analysis of Microbial Isolates and Genomes from Metagenomes Using PhyloPhlAn 3.0. Nat. Commun. 2020;11(1):2500. doi: 10.1038/s41467-020-16366-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Segata N., Börnigen D., Morgan X. C., Huttenhower C.. PhyloPhlAn Is a New Method for Improved Phylogenetic and Taxonomic Placement of Microbes. Nat. Commun. 2013;4(1):2304. doi: 10.1038/ncomms3304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Wood D. E., Salzberg S. L.. Kraken: Ultrafast Metagenomic Sequence Classification Using Exact Alignments. Genome Biol. 2014;15(3):R46. doi: 10.1186/gb-2014-15-3-r46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Wood D. E., Lu J., Langmead B.. Improved Metagenomic Analysis with Kraken 2. Genome Biol. 2019;20(1):257. doi: 10.1186/s13059-019-1891-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Hauptfeld E., Pappas N., van Iwaarden S., Snoek B. L., Aldas-Vargas A., Dutilh B. E., von Meijenfeldt F. A. B.. Integrating Taxonomic Signals from MAGs and Contigs Improves Read Annotation and Taxonomic Profiling of Metagenomes. Nat. Commun. 2024;15(1):3373. doi: 10.1038/s41467-024-47155-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Kim J., Steinegger M.. Metabuli: Sensitive and Specific Metagenomic Classification via Joint Analysis of Amino Acid and DNA. Nat. Methods. 2024;21(6):971–973. doi: 10.1038/s41592-024-02273-y. [DOI] [PubMed] [Google Scholar]
  113. von Meijenfeldt F. A. B., Arkhipova K., Cambuy D. D., Coutinho F. H., Dutilh B. E.. Robust Taxonomic Classification of Uncharted Microbial Sequences and Bins with CAT and BAT. Genome Biol. 2019;20(1):217. doi: 10.1186/s13059-019-1817-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Mirdita M., Steinegger M., Breitwieser F., Söding J., Levy Karin E.. Fast and Sensitive Taxonomic Assignment to Metagenomic Contigs. Bioinformatics. 2021;37(18):3029–3031. doi: 10.1093/bioinformatics/btab184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Karlicki M., Antonowicz S., Karnkowska A.. Tiara: Deep Learning-Based Classification System for Eukaryotic Sequences. Bioinformatics. 2022;38(2):344–350. doi: 10.1093/bioinformatics/btab672. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. West P. T., Probst A. J., Grigoriev I. V., Thomas B. C., Banfield J. F.. Genome-Reconstruction for Eukaryotes from Complex Natural Microbial Communities. Genome Res. 2018;28(4):569–580. doi: 10.1101/gr.228429.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Chaumeil P.-A., Mussig A. J., Hugenholtz P., Parks D. H.. GTDB-Tk: A Toolkit to Classify Genomes with the Genome Taxonomy Database. Bioinformatics. 2020;36(6):1925–1927. doi: 10.1093/bioinformatics/btz848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Chaumeil P.-A., Mussig A. J., Hugenholtz P., Parks D. H.. GTDB-Tk v2: Memory Friendly Classification with the Genome Taxonomy Database. bioRxiv. 2022 doi: 10.1101/2022.07.11.499641. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Bremges A., Fritz A., McHardy A. C.. CAMITAX: Taxon Labels for Microbial Genomes. Gigascience. 2020;9(1):giz154. doi: 10.1093/gigascience/giz154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  120. Defazio G., Tangaro M. A., Pesole G., Fosso B.. KMetaShot: A Fast and Reliable Taxonomy Classifier for Metagenome-Assembled Genomes. Brief. Bioinform. 2024;26(1):bbae680. doi: 10.1093/bib/bbae680. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Krinos A. I., Hu S. K., Cohen N. R., Alexander H.. EUKulele: Taxonomic Annotation of the Unsung Eukaryotic Microbes. arXiv. 2020 doi: 10.48550/arXiv.2011.00089. [DOI] [Google Scholar]
  122. Kang D. D., Li F., Kirton E., Thomas A., Egan R., An H., Wang Z.. MetaBAT 2: An Adaptive Binning Algorithm for Robust and Efficient Genome Reconstruction from Metagenome Assemblies. PeerJ. 2019;7:e7359. doi: 10.7717/peerj.7359. [DOI] [PMC free article] [PubMed] [Google Scholar]
  123. Kang D. D., Froula J., Egan R., Wang Z.. MetaBAT, an Efficient Tool for Accurately Reconstructing Single Genomes from Complex Microbial Communities. PeerJ. 2015;3:e1165. doi: 10.7717/peerj.1165. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Wu Y.-W., Tang Y.-H., Tringe S. G., Simmons B. A., Singer S. W.. MaxBin: An Automated Binning Method to Recover Individual Genomes from Metagenomes Using an Expectation-Maximization Algorithm. Microbiome. 2014;2:26. doi: 10.1186/2049-2618-2-26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Wu Y.-W., Simmons B. A., Singer S. W.. MaxBin 2.0: An Automated Binning Algorithm to Recover Genomes from Multiple Metagenomic Datasets. Bioinformatics. 2016;32(4):605–607. doi: 10.1093/bioinformatics/btv638. [DOI] [PubMed] [Google Scholar]
  126. Miller I. J., Rees E. R., Ross J., Miller I., Baxa J., Lopera J., Kerby R. L., Rey F. E., Kwan J. C.. Autometa: Automated Extraction of Microbial Genomes from Individual Shotgun Metagenomes. Nucleic Acids Res. 2019;47(10):e57. doi: 10.1093/nar/gkz148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Rees E. R., Uppal S., Clark C. M., Lail A. J., Waterworth S. C., Roesemann S. D., Wolf K. A., Kwan J. C.. Autometa 2: A Versatile Tool for Recovering Genomes from Highly-Complex Metagenomic Communities. bioRxiv. 2023:2023.09.01.555939. doi: 10.1101/2023.09.01.555939. [DOI] [Google Scholar]
  128. Wang Z., Huang P., You R., Sun F., Zhu S.. MetaBinner: A High-Performance and Stand-Alone Ensemble Binning Method to Recover Individual Genomes from Complex Microbial Communities. Genome Biol. 2023;24(1):1. doi: 10.1186/s13059-022-02832-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Alneberg J., Bjarnason B. S., de Bruijn I., Schirmer M., Quick J., Ijaz U. Z., Lahti L., Loman N. J., Andersson A. F., Quince C.. Binning Metagenomic Contigs by Coverage and Composition. Nat. Methods. 2014;11(11):1144–1146. doi: 10.1038/nmeth.3103. [DOI] [PubMed] [Google Scholar]
  130. Nissen J. N., Johansen J., Allesøe R. L., Sønderby C. K., Armenteros J. J. A., Grønbech C. H., Jensen L. J., Nielsen H. B., Petersen T. N., Winther O., Rasmussen S.. Improved Metagenome Binning and Assembly Using Deep Variational Autoencoders. Nat. Biotechnol. 2021;39(5):555–560. doi: 10.1038/s41587-020-00777-4. [DOI] [PubMed] [Google Scholar]
  131. Líndez P. P., Johansen J., Kutuzova S., Sigurdsson A. I., Nissen J. N., Rasmussen S.. Adversarial and Variational Autoencoders Improve Metagenomic Binning. Commun. Biol. 2023;6(1):1073. doi: 10.1038/s42003-023-05452-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  132. Pan S., Zhao X.-M., Coelho L. P.. SemiBin2: Self-Supervised Contrastive Learning Leads to Better MAGs for Short- and Long-Read Sequencing. Bioinformatics. 2023;39(Suppl 1):i21–i29. doi: 10.1093/bioinformatics/btad209. [DOI] [PMC free article] [PubMed] [Google Scholar]
  133. Pan S., Zhu C., Zhao X.-M., Coelho L. P.. A Deep Siamese Neural Network Improves Metagenome-Assembled Genomes in Microbiome Datasets across Different Environments. Nat. Commun. 2022;13(1):2326. doi: 10.1038/s41467-022-29843-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  134. Wang Z., You R., Han H., Liu W., Sun F., Zhu S.. Effective Binning of Metagenomic Contigs Using Contrastive Multi-View Representation Learning. Nat. Commun. 2024;15(1):585. doi: 10.1038/s41467-023-44290-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  135. Arangasamy Y., Morice E., Jochheim A., Lieser B., Soeding J.. Evaluation of Metagenome Binning: Advances and Challenges. bioRxiv. 2025 doi: 10.1101/2025.02.21.639465. [DOI] [Google Scholar]
  136. Lettich R., Egan R., Riley R., Wang Z., Tritt A., Oliker L., Yelick K., Buluc A.. GenomeFace: A Deep Learning-Based Metagenome Binner Trained on 43,000 Microbial Genomes. bioRxiv. 2024 doi: 10.1101/2024.02.07.579326. [DOI] [Google Scholar]
  137. Zhao D., Salas-Leiva D. E., Williams S. K., Dunn K. A., Shao J. D., Roger A. J.. Eukfinder: A Pipeline to Retrieve Microbial Eukaryote Genome Sequences from Metagenomic Data. MBio. 2025;16(5):e0069925. doi: 10.1128/mbio.00699-25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  138. Alexander H., Hu S. K., Krinos A. I., Pachiadaki M., Tully B. J., Neely C. J., Reiter T.. Eukaryotic Genomes from a Global Metagenomic Data Set Illuminate Trophic Modes and Biogeography of Ocean Plankton. MBio. 2023;14(6):e0167623. doi: 10.1128/mbio.01676-23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  139. Rühlemann M. C., Wacker E. M., Ellinghaus D., Franke A.. MAGScoT: A Fast, Lightweight and Accurate Bin-Refinement Tool. Bioinformatics. 2022;38(24):5430–5433. doi: 10.1093/bioinformatics/btac694. [DOI] [PMC free article] [PubMed] [Google Scholar]
  140. Tolstoganov I., Kamenev Y., Kruglikov R., Ochkalova S., Korobeynikov A.. BinSPreader: Refine Binning Results for Fuller MAG Reconstruction. iScience. 2022;25(8):104770. doi: 10.1016/j.isci.2022.104770. [DOI] [PMC free article] [PubMed] [Google Scholar]
  141. Ochkalova S., Tolstoganov I., Lapidus A., Korobeynikov A.. Protocol for Refining Metagenomic Binning with BinSPreader. STAR Protoc. 2023;4(3):102417. doi: 10.1016/j.xpro.2023.102417. [DOI] [PMC free article] [PubMed] [Google Scholar]
  142. Bornemann T. L. V., Esser S. P., Stach T. L., Burg T., Probst A. J.. UBin: A Manual Refining Tool for Genomes from Metagenomes. Environ. Microbiol. 2023;25(6):1077–1083. doi: 10.1111/1462-2920.16351. [DOI] [PubMed] [Google Scholar]
  143. Parks D. H., Rinke C., Chuvochina M., Chaumeil P.-A., Woodcroft B. J., Evans P. N., Hugenholtz P., Tyson G. W.. Recovery of Nearly 8,000 Metagenome-Assembled Genomes Substantially Expands the Tree of Life. Nat. Microbiol. 2017;2(11):1533–1542. doi: 10.1038/s41564-017-0012-7. [DOI] [PubMed] [Google Scholar]
  144. Parks D. H., Imelfort M., Skennerton C. T., Hugenholtz P., Tyson G. W.. CheckM: Assessing the Quality of Microbial Genomes Recovered from Isolates, Single Cells, and Metagenomes. Genome Res. 2015;25(7):1043–1055. doi: 10.1101/gr.186072.114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Chklovski A., Parks D. H., Woodcroft B. J., Tyson G. W.. CheckM2: A Rapid, Scalable and Accurate Tool for Assessing Microbial Genome Quality Using Machine Learning. Nat. Methods. 2023;20(8):1203–1212. doi: 10.1038/s41592-023-01940-w. [DOI] [PubMed] [Google Scholar]
  146. Saary P., Mitchell A. L., Finn R. D.. Estimating the Quality of Eukaryotic Genomes Recovered from Metagenomic Analysis with EukCC. Genome Biol. 2020;21(1):244. doi: 10.1186/s13059-020-02155-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  147. Manni M., Berkeley M. R., Seppey M., Zdobnov E. M.. BUSCO: Assessing Genomic Data Quality and Beyond. Curr. Protoc. 2021;1(12):e323. doi: 10.1002/cpz1.323. [DOI] [PubMed] [Google Scholar]
  148. Hyatt D., Chen G.-L., Locascio P. F., Land M. L., Larimer F. W., Hauser L. J.. Prodigal: Prokaryotic Gene Recognition and Translation Initiation Site Identification. BMC Bioinformatics. 2010;11(1):119. doi: 10.1186/1471-2105-11-119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  149. Seemann T.. Prokka: Rapid Prokaryotic Genome Annotation. Bioinformatics. 2014;30(14):2068–2069. doi: 10.1093/bioinformatics/btu153. [DOI] [PubMed] [Google Scholar]
  150. Sommer M. J., Salzberg S. L.. Balrog: A Universal Protein Model for Prokaryotic Gene Prediction. PLoS Comput. Biol. 2021;17(2):e1008727. doi: 10.1371/journal.pcbi.1008727. [DOI] [PMC free article] [PubMed] [Google Scholar]
  151. Schwengers O., Jelonek L., Dieckmann M. A., Beyvers S., Blom J., Goesmann A.. Bakta: Rapid and Standardized Annotation of Bacterial Genomes via Alignment-Free Sequence Identification. Microb. Genom. 2021;7(11):685. doi: 10.1099/mgen.0.000685. [DOI] [PMC free article] [PubMed] [Google Scholar]
  152. Levy Karin E., Mirdita M., Söding J.. MetaEuk-Sensitive, High-Throughput Gene Discovery, and Annotation for Large-Scale Eukaryotic Metagenomics. Microbiome. 2020;8(1):48. doi: 10.1186/s40168-020-00808-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  153. Stanke M., Steinkamp R., Waack S., Morgenstern B.. AUGUSTUS: A Web Server for Gene Finding in Eukaryotes. Nucleic Acids Res. 2004;32:W309–12. doi: 10.1093/nar/gkh379. [DOI] [PMC free article] [PubMed] [Google Scholar]
  154. Nachtweide S., Stanke M.. Multi-Genome Annotation with AUGUSTUS. Methods Mol. Biol. 2019;1962:139–160. doi: 10.1007/978-1-4939-9173-0_8. [DOI] [PubMed] [Google Scholar]
  155. DeLuca D. S., Levin J. Z., Sivachenko A., Fennell T., Nazaire M.-D., Williams C., Reich M., Winckler W., Getz G.. RNA-SeQC: RNA-Seq Metrics for Quality Control and Process Optimization. Bioinformatics. 2012;28(11):1530–1532. doi: 10.1093/bioinformatics/bts196. [DOI] [PMC free article] [PubMed] [Google Scholar]
  156. Zhou Q., Su X., Jing G., Chen S., Ning K.. RNA-QC-Chain: Comprehensive and Fast Quality Control for RNA-Seq Data. BMC Genomics. 2018;19(1):144. doi: 10.1186/s12864-018-4503-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  157. Pérez-Rubio P., Lottaz C., Engelmann J. C.. FastqPuri: High-Performance Preprocessing of RNA-Seq Data. BMC Bioinformatics. 2019;20(1):226. doi: 10.1186/s12859-019-2799-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  158. Langmead B., Trapnell C., Pop M., Salzberg S. L.. Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the Human Genome. Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  159. BBMap guide. Archive. https://archive.jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbmap-guide/ (accessed 2025-05-15).
  160. Dobin A., Davis C. A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T. R.. STAR: Ultrafast Universal RNA-Seq Aligner. Bioinformatics. 2013;29(1):15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  161. Banerjee S., Bhandary P., Woodhouse M., Sen T. Z., Wise R. P., Andorf C. M.. FINDER: An Automated Software Package to Annotate Eukaryotic Genes from RNA-Seq Data and Associated Protein Sequences. BMC Bioinformatics. 2021;22(1):205. doi: 10.1186/s12859-021-04120-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  162. Stanke M., Keller O., Gunduz I., Hayes A., Waack S., Morgenstern B.. AUGUSTUS: Ab Initio Prediction of Alternative Transcripts. Nucleic Acids Res. 2006;34:W435–9. doi: 10.1093/nar/gkl200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  163. Stanke M., Morgenstern B.. AUGUSTUS: A Web Server for Gene Prediction in Eukaryotes That Allows User-Defined Constraints. Nucleic Acids Res. 2005;33:W465–7. doi: 10.1093/nar/gki458. [DOI] [PMC free article] [PubMed] [Google Scholar]
  164. Dimonaco N. J., Aubrey W., Kenobi K., Clare A., Creevey C. J.. No One Tool to Rule Them All: Prokaryotic Gene Prediction Tool Annotations Are Highly Dependent on the Organism of Study. Bioinformatics. 2022;38(5):1198–1207. doi: 10.1093/bioinformatics/btab827. [DOI] [PMC free article] [PubMed] [Google Scholar]
  165. Gabriel L., Brůna T., Hoff K. J., Ebel M., Lomsadze A., Borodovsky M., Stanke M.. BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 2024;34(5):769–777. doi: 10.1101/gr.278090.123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  166. Liao Y., Smyth G. K., Shi W.. FeatureCounts: An Efficient General Purpose Program for Assigning Sequence Reads to Genomic Features. Bioinformatics. 2014;30(7):923–930. doi: 10.1093/bioinformatics/btt656. [DOI] [PubMed] [Google Scholar]
  167. Putri G. H., Anders S., Pyl P. T., Pimanda J. E., Zanini F.. Analysing High-Throughput Sequencing Data in Python with HTSeq 2.0. Bioinformatics. 2022;38(10):2943–2945. doi: 10.1093/bioinformatics/btac166. [DOI] [PMC free article] [PubMed] [Google Scholar]
  168. Love M. I., Huber W., Anders S.. Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2. Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  169. Robinson M. D., McCarthy D. J., Smyth G. K.. EdgeR: A Bioconductor Package for Differential Expression Analysis of Digital Gene Expression Data. Bioinformatics. 2010;26(1):139–140. doi: 10.1093/bioinformatics/btp616. [DOI] [PMC free article] [PubMed] [Google Scholar]
  170. Soneson C., Delorenzi M.. A Comparison of Methods for Differential Expression Analysis of RNA-Seq Data. BMC Bioinformatics. 2013;14(1):91. doi: 10.1186/1471-2105-14-91. [DOI] [PMC free article] [PubMed] [Google Scholar]
  171. Seyednasrollah F., Laiho A., Elo L. L.. Comparison of Software Packages for Detecting Differential Expression in RNA-Seq Studies. Brief. Bioinform. 2015;16(1):59–70. doi: 10.1093/bib/bbt086. [DOI] [PMC free article] [PubMed] [Google Scholar]
  172. Costa-Silva J., Domingues D., Lopes F. M.. RNA-Seq Differential Expression Analysis: An Extended Review and a Software Tool. PLoS One. 2017;12(12):e0190152. doi: 10.1371/journal.pone.0190152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Spies D., Renz P. F., Beyer T. A., Ciaudo C.. Comparative Analysis of Differential Gene Expression Tools for RNA Sequencing Time Course Data. Brief. Bioinform. 2019;20(1):288–298. doi: 10.1093/bib/bbx115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  174. Mungan M. D., Harbig T. A., Perez N. H., Edenhart S., Stegmann E., Nieselt K., Ziemert N.. Secondary Metabolite Transcriptomic Pipeline (SeMa-Trap), an Expression-Based Exploration Tool for Increased Secondary Metabolite Production in Bacteria. Nucleic Acids Res. 2022;50(W1):W682–W689. doi: 10.1093/nar/gkac371. [DOI] [PMC free article] [PubMed] [Google Scholar]
  175. Lloyd K. G., Steen A. D., Ladau J., Yin J., Crosby L.. Phylogenetically Novel Uncultured Microbial Cells Dominate Earth Microbiomes. mSystems. 2018;3(5):00055-18. doi: 10.1128/msystems.00055-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  176. Steen A. D., Crits-Christoph A., Carini P., DeAngelis K. M., Fierer N., Lloyd K. G., Cameron Thrash J.. High Proportions of Bacteria and Archaea across Most Biomes Remain Uncultured. ISME J. 2019;13(12):3126–3130. doi: 10.1038/s41396-019-0484-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  177. Bodor A., Bounedjoum N., Vincze G. E., Erdeiné Kis Á., Laczi K., Bende G., Szilágyi Á., Kovács T., Perei K., Rákhely G.. Challenges of Unculturable Bacteria: Environmental Perspectives. Rev. Environ. Sci. Biotechnol. 2020;19(1):1–22. doi: 10.1007/s11157-020-09522-4. [DOI] [Google Scholar]
  178. Locey K. J., Lennon J. T.. Scaling Laws Predict Global Microbial Diversity. Proc. Natl. Acad. Sci. U. S. A. 2016;113(21):5970–5975. doi: 10.1073/pnas.1521291113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  179. Rinke C., Schwientek P., Sczyrba A., Ivanova N. N., Anderson I. J., Cheng J.-F., Darling A., Malfatti S., Swan B. K., Gies E. A., Dodsworth J. A., Hedlund B. P., Tsiamis G., Sievert S. M., Liu W.-T., Eisen J. A., Hallam S. J., Kyrpides N. C., Stepanauskas R., Rubin E. M., Hugenholtz P., Woyke T.. Insights into the Phylogeny and Coding Potential of Microbial Dark Matter. Nature. 2013;499(7459):431–437. doi: 10.1038/nature12352. [DOI] [PubMed] [Google Scholar]
  180. Handelsman J., Rondon M. R., Brady S. F., Clardy J., Goodman R. M.. Molecular Biological Access to the Chemistry of Unknown Soil Microbes: A New Frontier for Natural Products. Chem. Biol. 1998;5(10):R245–9. doi: 10.1016/S1074-5521(98)90108-9. [DOI] [PubMed] [Google Scholar]
  181. Nichols D., Cahoon N., Trakhtenberg E. M., Pham L., Mehta A., Belanger A., Kanigan T., Lewis K., Epstein S. S.. Use of Ichip for High-Throughput in Situ Cultivation of “Uncultivable” Microbial Species. Appl. Environ. Microbiol. 2010;76(8):2445–2450. doi: 10.1128/AEM.01754-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  182. Liu H., Xue R., Wang Y., Stirling E., Ye S., Xu J., Ma B.. FACS-IChip: A High-Efficiency IChip System for Microbial “dark Matter” Mining. Mar. Life Sci. Technol. 2021;3(2):162–168. doi: 10.1007/s42995-020-00067-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  183. Du W., Li L., Nichols K. P., Ismagilov R. F.. SlipChip. Lab Chip. 2009;9(16):2286–2292. doi: 10.1039/b908978k. [DOI] [PMC free article] [PubMed] [Google Scholar]
  184. Wu X., Spencer S., Gushgari-Doyle S., Yee M. O., Voriskova J., Li Y., Alm E. J., Chakraborty R.. Culturing of “Unculturable” Subsurface Microbes: Natural Organic Carbon Source Fuels the Growth of Diverse and Distinct Bacteria from Groundwater. Front. Microbiol. 2020;11:610001. doi: 10.3389/fmicb.2020.610001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  185. Kaeberlein T., Lewis K., Epstein S. S.. Isolating “Uncultivable” Microorganisms in Pure Culture in a Simulated Natural Environment. Science. 2002;296(5570):1127–1129. doi: 10.1126/science.1070633. [DOI] [PubMed] [Google Scholar]
  186. Maglangit F., Fang Q., Kyeremeh K., Sternberg J. M., Ebel R., Deng H.. A Co-Culturing Approach Enables Discovery and Biosynthesis of a Bioactive Indole Alkaloid Metabolite. Molecules. 2020;25(2):256. doi: 10.3390/molecules25020256. [DOI] [PMC free article] [PubMed] [Google Scholar]
  187. Goers L., Freemont P., Polizzi K. M.. Co-Culture Systems and Technologies: Taking Synthetic Biology to the next Level. J. R. Soc. Interface. 2014;11(96):20140065. doi: 10.1098/rsif.2014.0065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  188. Nichols D., Lewis K., Orjala J., Mo S., Ortenberg R., O’Connor P., Zhao C., Vouros P., Kaeberlein T., Epstein S. S.. Short Peptide Induces an “Uncultivable” Microorganism to Grow in Vitro. Appl. Environ. Microbiol. 2008;74(15):4889–4897. doi: 10.1128/AEM.00393-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  189. Davis K. E. R., Joseph S. J., Janssen P. H.. Effects of Growth Medium, Inoculum Size, and Incubation Time on Culturability and Isolation of Soil Bacteria. Appl. Environ. Microbiol. 2005;71(2):826–834. doi: 10.1128/AEM.71.2.826-834.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  190. Ben-Dov E., Kramarsky-Winter E., Kushmaro A.. An in Situ Method for Cultivating Microorganisms Using a Double Encapsulation Technique. FEMS Microbiol. Ecol. 2009;68(3):363–371. doi: 10.1111/j.1574-6941.2009.00682.x. [DOI] [PubMed] [Google Scholar]
  191. Song J., Oh H.-M., Cho J.-C.. Improved Culturability of SAR11 Strains in Dilution-to-Extinction Culturing from the East Sea, West Pacific Ocean. FEMS Microbiol. Lett. 2009;295(2):141–147. doi: 10.1111/j.1574-6968.2009.01623.x. [DOI] [PubMed] [Google Scholar]
  192. Demin K., Prazdnova E., Kulikov M., Mazanko M., Gorovtsov A.. Alternative Agar Substitutes for Culturing Unculturable Microorganisms. Arch. Microbiol. 2024;206(10):405. doi: 10.1007/s00203-024-04139-5. [DOI] [PubMed] [Google Scholar]
  193. Imachi H., Aoi K., Tasumi E., Saito Y., Yamanaka Y., Saito Y., Yamaguchi T., Tomaru H., Takeuchi R., Morono Y., Inagaki F., Takai K.. Cultivation of Methanogenic Community from Subseafloor Sediments Using a Continuous-Flow Bioreactor. ISME J. 2011;5(12):1913–1925. doi: 10.1038/ismej.2011.64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  194. Jiang C.-Y., Dong L., Zhao J.-K., Hu X., Shen C., Qiao Y., Zhang X., Wang Y., Ismagilov R. F., Liu S.-J., Du W.. High-Throughput Single-Cell Cultivation on Microfluidic Streak Plates. Appl. Environ. Microbiol. 2016;82(7):2210–2218. doi: 10.1128/AEM.03588-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  195. Zheng M., Wen L., He C., Chen X., Si L., Li H., Liang Y., Zheng W., Guo F.. Sequencing-Guided Re-Estimation and Promotion of Cultivability for Environmental Bacteria. Nat. Commun. 2024;15(1):9051. doi: 10.1038/s41467-024-53446-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  196. Wan X., Yao G., Wang K., Bao S., Han P., Wang F., Song T., Jiang H.. Transcriptomic Analysis of Polyketide Synthesis in Dinoflagellate, Prorocentrum lima. Harmful Algae. 2023;123:102391. doi: 10.1016/j.hal.2023.102391. [DOI] [PubMed] [Google Scholar]
  197. Shukla R., Peoples A. J., Ludwig K. C., Maity S., Derks M. G. N., De Benedetti S., Krueger A. M., Vermeulen B. J. A., Harbig T., Lavore F., Kumar R., Honorato R. V., Grein F., Nieselt K., Liu Y., Bonvin A. M. J. J., Baldus M., Kubitscheck U., Breukink E., Achorn C., Nitti A., Schwalen C. J., Spoering A. L., Ling L. L., Hughes D., Lelli M., Roos W. H., Lewis K., Schneider T., Weingarth M.. An Antibiotic from an Uncultured Bacterium Binds to an Immutable Target. Cell. 2023;186(19):4059–4073. doi: 10.1016/j.cell.2023.07.038. [DOI] [PubMed] [Google Scholar]
  198. Ling L. L., Schneider T., Peoples A. J., Spoering A. L., Engels I., Conlon B. P., Mueller A., Schäberle T. F., Hughes D. E., Epstein S., Jones M., Lazarides L., Steadman V. A., Cohen D. R., Felix C. R., Fetterman K. A., Millett W. P., Nitti A. G., Zullo A. M., Chen C., Lewis K.. A New Antibiotic Kills Pathogens without Detectable Resistance. Nature. 2015;517(7535):455–459. doi: 10.1038/nature14098. [DOI] [PMC free article] [PubMed] [Google Scholar]
  199. Chin C.-S., Alexander D. H., Marks P., Klammer A. A., Drake J., Heiner C., Clum A., Copeland A., Huddleston J., Eichler E. E., Turner S. W., Korlach J.. Nonhybrid, Finished Microbial Genome Assemblies from Long-Read SMRT Sequencing Data. Nat. Methods. 2013;10(6):563–569. doi: 10.1038/nmeth.2474. [DOI] [PubMed] [Google Scholar]
  200. Shigdel U. K., Lee S.-J., Sowa M. E., Bowman B. R., Robison K., Zhou M., Pua K. H., Stiles D. T., Blodgett J. A. V., Udwary D. W., Rajczewski A. T., Mann A. S., Mostafavi S., Hardy T., Arya S., Weng Z., Stewart M., Kenyon K., Morgenstern J. P., Pan E., Gray D. C., Pollock R. M., Fry A. M., Klausner R. D., Townson S. A., Verdine G. L.. Genomic Discovery of an Evolutionarily Programmed Modality for Small-Molecule Targeting of an Intractable Protein Surface. Proc. Natl. Acad. Sci. U. S. A. 2020;117(29):17195–17203. doi: 10.1073/pnas.2006560117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  201. Edwards S. R., Wandless T. J.. The Rapamycin-Binding Domain of the Protein Kinase Mammalian Target of Rapamycin Is a Destabilizing Domain. J. Biol. Chem. 2007;282(18):13395–13401. doi: 10.1074/jbc.M700498200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  202. Yang H., Rudge D. G., Koos J. D., Vaidialingam B., Yang H. J., Pavletich N. P.. MTOR Kinase Structure, Mechanism and Regulation. Nature. 2013;497(7448):217–223. doi: 10.1038/nature12122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  203. Gregory M. A., Hong H., Lill R. E., Gaisser S., Petkovic H., Low L., Sheehan L. S., Carletti I., Ready S. J., Ward M. J., Kaja A. L., Weston A. J., Challis I. R., Leadlay P. F., Martin C. J., Wilkinson B., Sheridan R. M.. Rapamycin Biosynthesis: Elucidation of Gene Product Function. Org. Biomol. Chem. 2006;4(19):3565–3568. doi: 10.1039/b608813a. [DOI] [PubMed] [Google Scholar]
  204. Motamedi H., Cai S. J., Shafiee A., Elliston K. O.. Structural Organization of a Multifunctional Polyketide Synthase Involved in the Biosynthesis of the Macrolide Immunosuppressant FK506. Eur. J. Biochem. 1997;244(1):74–80. doi: 10.1111/j.1432-1033.1997.00074.x. [DOI] [PubMed] [Google Scholar]
  205. Park S. R., Yoo Y. J., Ban Y.-H., Yoon Y. J.. Biosynthesis of Rapamycin and Its Regulation: Past Achievements and Recent Progress. J. Antibiot. (Tokyo) 2010;63(8):434–441. doi: 10.1038/ja.2010.71. [DOI] [PubMed] [Google Scholar]
  206. Schwecke T., Aparicio J. F., Molnár I., König A., Khaw L. E., Haydock S. F., Oliynyk M., Caffrey P., Cortés J., Lester J. B.. The Biosynthetic Gene Cluster for the Polyketide Immunosuppressant Rapamycin. Proc. Natl. Acad. Sci. U. S. A. 1995;92(17):7839–7843. doi: 10.1073/pnas.92.17.7839. [DOI] [PMC free article] [PubMed] [Google Scholar]
  207. https://ir.revmed.com/news-releases/news-release-details/revolution-medicines-reports-new-application-tri-complex (accessed 2025-05-12).
  208. Kim W., Liu R., Woo S., Kang K. B., Park H., Yu Y. H., Ha H.-H., Oh S.-Y., Yang J. H., Kim H., Yun S.-H., Hur J.-S.. Linking a Gene Cluster to Atranorin, a Major Cortical Substance of Lichens, through Genetic Dereplication and Heterologous Expression. MBio. 2021;12(3):e0111121. doi: 10.1128/mBio.01111-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  209. Hohmann M., Iliasov D., Larralde M., Johannes W., Janßen K.-P., Zeller G., Mascher T., Gulder T. A. M.. Heterologous Expression of a Cryptic BGC from Bilophila sp. Provides Access to a Novel Family of Antibacterial Thiazoles. ACS Synth. Biol. 2025;14(3):967–978. doi: 10.1021/acssynbio.5c00042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  210. Studzinska-Sroka E., Galanty A., Bylka W.. Atranorin - an Interesting Lichen Secondary Metabolite. Mini Rev. Med. Chem. 2017;17(17):1633–1645. doi: 10.2174/1389557517666170425105727. [DOI] [PubMed] [Google Scholar]
  211. Zhou R., Yang Y., Park S.-Y., Nguyen T. T., Seo Y.-W., Lee K. H., Lee J. H., Kim K. K., Hur J.-S., Kim H.. The Lichen Secondary Metabolite Atranorin Suppresses Lung Cancer Cell Motility and Tumorigenesis. Sci. Rep. 2017;7(1):8136. doi: 10.1038/s41598-017-08225-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  212. Urbanska N., Simko P., Leskanicova A., Karasova M., Jendzelovska Z., Jendzelovsky R., Rucova D., Kolesarova M., Goga M., Backor M., Kiskova T.. Atranorin, a Secondary Metabolite of Lichens, Exhibited Anxiolytic/Antidepressant Activity in Wistar Rats. Life (Basel) 2022;12(11):1850. doi: 10.3390/life12111850. [DOI] [PMC free article] [PubMed] [Google Scholar]
  213. Semenov K. N., Prokopiev I. A., Petukhova N. V., Kremenetskaya U. A., Senichkina D. A., Epifanovskaya O. S., Rumiantsev A. M., Andoskin P. A., Rizaev J. A., Kholmurodova D. K., Ageev S. V., Anufrikov Y. A., Zakharov E. E., Moiseev I. S., Sharoyko V. V.. Atranorin Is a Novel Potential Candidate Drug for Treating Myelodysplastic Syndrome. J. Mol. Liq. 2024;413:125743. doi: 10.1016/j.molliq.2024.125743. [DOI] [Google Scholar]
  214. Murugesan B., Rathinavel T., Kumar R. S., Gomathi M., Maheswari S. U., Ameer K., Al-Ansari M. M., Seralathan K.-K., Selvankumar T.. Atranorin from Lichens Act as Potential Inhibitor for Cervical Tumor Proteins TGFbeta and Kinase Enzyme CDK4/CyclinD3: An Anti-Cancer Intervention Therapy. Int. J. Biol. Macromol. 2025;320(Pt 3):145896. doi: 10.1016/j.ijbiomac.2025.145896. [DOI] [PubMed] [Google Scholar]
  215. Jeon Y.-J., Kim S., Kim J. H., Youn U. J., Suh S.-S.. The Comprehensive Roles of ATRANORIN, A Secondary Metabolite from the Antarctic Lichen Stereocaulon Caespitosum, in HCC Tumorigenesis. Molecules. 2019;24(7):1414. doi: 10.3390/molecules24071414. [DOI] [PMC free article] [PubMed] [Google Scholar]
  216. Kealey J. T., Craig J. P., Barr P. J.. Identification of a Lichen Depside Polyketide Synthase Gene by Heterologous Expression in Saccharomyces Cerevisiae. Metab. Eng. Commun. 2021;13:e00172. doi: 10.1016/j.mec.2021.e00172. [DOI] [PMC free article] [PubMed] [Google Scholar]
  217. Schmidt E. W., Nelson J. T., Rasko D. A., Sudek S., Eisen J. A., Haygood M. G., Ravel J.. Patellamide A and C Biosynthesis by a Microcin-like Pathway in Prochloron Didemni, the Cyanobacterial Symbiont of Lissoclinum Patella. Proc. Natl. Acad. Sci. U. S. A. 2005;102(20):7315–7320. doi: 10.1073/pnas.0501424102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  218. Gertz E. M., Yu Y.-K., Agarwala R., Schäffer A. A., Altschul S. F.. Composition-Based Statistics and Translated Nucleotide Searches: Improving the TBLASTN Module of BLAST. BMC Biol. 2006;4(1):41. doi: 10.1186/1741-7007-4-41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  219. Corley D. G., Moore R. E., Paul V. J.. Patellazole B: A Novel Cytotoxic Thiazole-Containing Macrolide from the Marine Tunicate Lissoclinum Patella. J. Am. Chem. Soc. 1988;110(23):7920–7922. doi: 10.1021/ja00231a078. [DOI] [Google Scholar]
  220. Zabriskie T. M., Mayne C. L., Ireland C. M.. Patellazole C: A Novel Cytotoxic Macrolide from Lissoclinum Patella. J. Am. Chem. Soc. 1988;110(23):7919–7920. doi: 10.1021/ja00231a077. [DOI] [Google Scholar]
  221. Donia M. S., Fricke W. F., Partensky F., Cox J., Elshahawi S. I., White J. R., Phillippy A. M., Schatz M. C., Piel J., Haygood M. G., Ravel J., Schmidt E. W.. Complex Microbiome Underlying Secondary and Primary Metabolism in the Tunicate-Prochloron Symbiosis. Proc. Natl. Acad. Sci. U. S. A. 2011;108:E1423–E1432. doi: 10.1073/pnas.1111712108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  222. Kwan J. C., Donia M. S., Han A. W., Hirose E., Haygood M. G., Schmidt E. W.. Genome Streamlining and Chemical Defense in a Coral Reef Symbiosis. Proc. Natl. Acad. Sci. U. S. A. 2012;109(50):20655–20660. doi: 10.1073/pnas.1213820109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  223. Kwan J. C., Schmidt E. W.. Bacterial Endosymbiosis in a Chordate Host: Long-Term Co-Evolution and Conservation of Secondary Metabolism. PLoS One. 2013;8(12):e80822. doi: 10.1371/journal.pone.0080822. [DOI] [PMC free article] [PubMed] [Google Scholar]
  224. Tammam M. A., El-Demerdash A.. Pederins, Mycalamides, Onnamides and Theopederins: Distinctive Polyketide Families with Intriguing Therapeutic Potentialities. Curr. Res. Biotechnol. 2023;6(100145):100145. doi: 10.1016/j.crbiot.2023.100145. [DOI] [Google Scholar]
  225. Kocieński P., Jarowicki K., Marczak S., Willson T. M.. Three and One-half Approaches to the Synthesis of Pederin. In Strategies and Tactics in Organic Synthesis; Elsevier, 1991; pp 199–236. doi: 10.1016/b978-0-08-092430-4.50012-3. [DOI] [Google Scholar]
  226. Piel J.. A Polyketide Synthase-Peptide Synthetase Gene Cluster from an Uncultured Bacterial Symbiont of Paederus Beetles. Proc. Natl. Acad. Sci. U. S. A. 2002;99(22):14002–14007. doi: 10.1073/pnas.222481399. [DOI] [PMC free article] [PubMed] [Google Scholar]
  227. Piel J., Wen G., Platzer M., Hui D.. Unprecedented Diversity of Catalytic Domains in the First Four Modules of the Putative Pederin Polyketide Synthase. Chembiochem. 2004;5(1):93–98. doi: 10.1002/cbic.200300782. [DOI] [PubMed] [Google Scholar]
  228. Piel J., Hui D., Wen G., Butzke D., Platzer M., Fusetani N., Matsunaga S.. Antitumor Polyketide Biosynthesis by an Uncultivated Bacterial Symbiont of the Marine Sponge Theonella Swinhoei. Proc. Natl. Acad. Sci. U. S. A. 2004;101(46):16222–16227. doi: 10.1073/pnas.0405976101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  229. Wilson M. C., Mori T., Rückert C., Uria A. R., Helf M. J., Takada K., Gernert C., Steffens U. A. E., Heycke N., Schmitt S., Rinke C., Helfrich E. J. N., Brachmann A. O., Gurgui C., Wakimoto T., Kracht M., Crüsemann M., Hentschel U., Abe I., Matsunaga S., Kalinowski J., Takeyama H., Piel J.. An Environmental Bacterial Taxon with a Large and Distinct Metabolic Repertoire. Nature. 2014;506(7486):58–62. doi: 10.1038/nature12959. [DOI] [PubMed] [Google Scholar]
  230. Fisch K. M., Gurgui C., Heycke N., van der Sar S. A., Anderson S. A., Webb V. L., Taudien S., Platzer M., Rubio B. K., Robinson S. J., Crews P., Piel J.. Polyketide Assembly Lines of Uncultivated Sponge Symbionts from Structure-Based Gene Targeting. Nat. Chem. Biol. 2009;5(7):494–501. doi: 10.1038/nchembio.176. [DOI] [PubMed] [Google Scholar]
  231. Peters E. E., Cahn J. K. B., Lotti A., Gavriilidou A., Steffens U. A. E., Loureiro C., Schorn M. A., Cárdenas P., Vickneswaran N., Crews P., Sipkema D., Piel J.. Distribution and Diversity of “Tectomicrobia”, a Deep-Branching Uncultivated Bacterial Lineage Harboring Rich Producers of Bioactive Metabolites. ISME Commun. 2023;3(1):50. doi: 10.1038/s43705-023-00259-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  232. Kampa A., Gagunashvili A. N., Gulder T. A. M., Morinaka B. I., Daolio C., Godejohann M., Miao V. P. W., Piel J., Andrésson Ó. S.. Metagenomic Natural Product Discovery in Lichen Provides Evidence for a Family of Biosynthetic Pathways in Diverse Symbioses. Proc. Natl. Acad. Sci. U. S. A. 2013;110(33):E3129–E3137. doi: 10.1073/pnas.1305867110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  233. Chevreux B., Pfisterer T., Drescher B., Driesel A. J., Müller W. E. G., Wetter T., Suhai S.. Using the MiraEST Assembler for Reliable and Automated MRNA Transcript Assembly and SNP Detection in Sequenced ESTs. Genome Res. 2004;14(6):1147–1159. doi: 10.1101/gr.1917404. [DOI] [PMC free article] [PubMed] [Google Scholar]
  234. Nakabachi A., Ueoka R., Oshima K., Teta R., Mangoni A., Gurgui M., Oldham N. J., van Echten-Deckert G., Okamura K., Yamamoto K., Inoue H., Ohkuma M., Hongoh Y., Miyagishima S.-Y., Hattori M., Piel J., Fukatsu T.. Defensive Bacteriome Symbiont with a Drastically Reduced Genome. Curr. Biol. 2013;23(15):1478–1484. doi: 10.1016/j.cub.2013.06.027. [DOI] [PubMed] [Google Scholar]
  235. Machado M., Magalhães W. C., Sene A., Araújo B., Faria-Campos A. C., Chanock S. J., Scott L., Oliveira G., Tarazona-Santos E., Rodrigues M. R.. Phred-Phrap Package to Analyses Tools: A Pipeline to Facilitate Population Genetics Re-Sequencing Studies. Investig. Genet. 2011;2(1):3. doi: 10.1186/2041-2223-2-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  236. Kaštovský J., Hauer T., Mareš J., Krautová M., Bešta T., Komárek J., Desortová B., Heteša J., Hindáková A., Houk V., Janeček E., Kopp R., Marvan P., Pumann P., Skácelová O., Zapomělová E.. A Review of the Alien and Expansive Species of Freshwater Cyanobacteria and Algae in the Czech Republic. Biol. Invasions. 2010;12(10):3599–3625. doi: 10.1007/s10530-010-9754-3. [DOI] [Google Scholar]
  237. Kust A., Mareš J., Jokela J., Urajová P., Hájek J., Saurav K., Voráčová K., Fewer D. P., Haapaniemi E., Permi P., Řeháková K., Sivonen K., Hrouzek P.. Discovery of a Pederin Family Compound in a Nonsymbiotic Bloom-Forming Cyanobacterium. ACS Chem. Biol. 2018;13(5):1123–1129. doi: 10.1021/acschembio.7b01048. [DOI] [PubMed] [Google Scholar]
  238. Rust M., Helfrich E. J. N., Freeman M. F., Nanudorn P., Field C. M., Rückert C., Kündig T., Page M. J., Webb V. L., Kalinowski J., Sunagawa S., Piel J.. A Multiproducer Microbiome Generates Chemical Diversity in the Marine Sponge Mycale Hentscheli. Proc. Natl. Acad. Sci. U. S. A. 2020;117(17):9508–9518. doi: 10.1073/pnas.1919245117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  239. Perry N. B., Blunt J. W., Munro M. H. G., Pannell L. K.. Mycalamide A, an Antiviral Compound from a New Zealand Sponge of the Genus Mycale. J. Am. Chem. Soc. 1988;110(14):4850–4851. doi: 10.1021/ja00222a067. [DOI] [Google Scholar]
  240. Nurk S., Meleshko D., Korobeynikov A., Pevzner P. A.. MetaSPAdes: A New Versatile Metagenomic Assembler. Genome Res. 2017;27(5):824–834. doi: 10.1101/gr.213959.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  241. Kačar D., Schleissner C., Cañedo L. M., Rodríguez P., de la Calle F., Galán B., García J. L.. Genome of Labrenzia Sp. PHM005 Reveals a Complete and Active Trans-AT PKS Gene Cluster for the Biosynthesis of Labrenzin. Front. Microbiol. 2019;10:2561. doi: 10.3389/fmicb.2019.02561. [DOI] [PMC free article] [PubMed] [Google Scholar]
  242. Kačar D., Schleissner C., Cañedo L. M., Rodríguez P., de la Calle F., Cuevas C., Galán B., García J. L.. In Vivo Production of Pederin by Labrenzin Pathway Expansion. Metab. Eng. Commun. 2022:e00198. doi: 10.1016/j.mec.2022.e00198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  243. Cary J. W., Han Z., Yin Y., Lohmar J. M., Shantappa S., Harris-Coward P. Y., Mack B., Ehrlich K. C., Wei Q., Arroyo-Manzanares N., Uka V., Vanhaecke L., Bhatnagar D., Yu J., Nierman W. C., Johns M. A., Sorensen D., Shen H., De Saeger S., Diana Di Mavungu J., Calvo A. M.. Transcriptome Analysis of Aspergillus Flavus Reveals VeA-Dependent Regulation of Secondary Metabolite Gene Clusters, Including the Novel Aflavarin Cluster. Eukaryot. Cell. 2015;14(10):983–997. doi: 10.1128/EC.00092-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  244. Calvo A. M.. The VeA Regulatory System and Its Role in Morphological and Chemical Development in Fungi. Fungal Genet. Biol. 2008;45(7):1053–1061. doi: 10.1016/j.fgb.2008.03.014. [DOI] [PubMed] [Google Scholar]
  245. Strohdiek A., Köhler A. M., Harting R., Stupperich H., Gerke J., Bastakis E., Neumann P., Ahmed Y. L., Ficner R., Braus G. H.. The Aspergillus Nidulans Velvet Domain Containing Transcription Factor VeA Is Shuttled from Cytoplasm into Nucleus during Vegetative Growth and Stays There for Sexual Development, but Has to Return into Cytoplasm for Asexual Development. PLoS Genet. 2025;21(6):e1011687. doi: 10.1371/journal.pgen.1011687. [DOI] [PMC free article] [PubMed] [Google Scholar]
  246. Moon H., Lee M.-K., Bok I., Bok J. W., Keller N. P., Yu J.-H.. Unraveling the Gene Regulatory Networks of the Global Regulators VeA and LaeA in Aspergillus Nidulans. Microbiol. Spectr. 2023;11:e0016623. doi: 10.1128/spectrum.00166-23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  247. Amos G. C. A., Awakawa T., Tuttle R. N., Letzel A.-C., Kim M. C., Kudo Y., Fenical W., Moore B., Jensen P. R.. Comparative Transcriptomics as a Guide to Natural Product Discovery and Biosynthetic Gene Cluster Functionality. Proc. Natl. Acad. Sci. U. S. A. 2017;114(52):E11121–E11130. doi: 10.1073/pnas.1714381115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  248. Boldt J., Lukoševičiūtė L., Fu C., Steglich M., Bunk B., Junker V., Gollasch A., Trunkwalter B., Mohr K. I., Beckstette M., Wink J., Overmann J., Müller R., Nübel U.. Bursts in Biosynthetic Gene Cluster Transcription Are Accompanied by Surges of Natural Compound Production in the Myxobacterium Sorangium Sp. Microb. Biotechnol. 2023;16(5):1054–1068. doi: 10.1111/1751-7915.14246. [DOI] [PMC free article] [PubMed] [Google Scholar]
  249. Wang Y., Zhang J., Wang P., Li Y., Wang Y., Yan Y., Chi J., Chen J., Lian J., Piao X., Lei X., Xiao Y., Murray J., Deyholos M. K., Wang Y., Di P., Zhang J.. Integrated Transcriptomic and Metabolomic Analysis Reveals Tissue-Specific Flavonoid Biosynthesis and MYB-Mediated Regulation of UGT71A1 in Panax Quinquefolius. Int. J. Mol. Sci. 2025;26(6):2669. doi: 10.3390/ijms26062669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  250. Luo X., Ye Z., Shi X., Hu Z., Shen J., You L., Huang P., Wang G., Zheng L., Li C., Zhang Y.. Comparative Transcriptomic Analysis Provides Insights into the Regulation of Root-Specific Saponin Production in Panax Japonicus. Sci. Rep. 2024;14(1):27572. doi: 10.1038/s41598-024-78720-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  251. Eruçar F. M., Kuran F. K., Altıparmak Ülbegi G., Özbey S., Karavuş Ş. N., Arcan G. G., Yazıcı Tütüniş S., Tan N., Aksoy Sağırlı P., Miski M.. Sesquiterpene Coumarin Ethers with Selective Cytotoxic Activities from the Roots of Ferula Huber-Morathii Peşmen (Apiaceae) and Unequivocal Determination of the Absolute Stereochemistry of Samarcandin. Pharmaceuticals (Basel) 2023;16(6):792. doi: 10.3390/ph16060792. [DOI] [PMC free article] [PubMed] [Google Scholar]
  252. Kuran F. K., Senadeera S. P. D., Wang D., Hwang J.-Y., Goncharova E., Wilson J., Wamiru A., Wilson B. A. P., Pruett N., Du L., Hoang C. D., Beutler J. A., Miski M.. Sesquiterpene Coumarin Ethers and Phenylpropanoids from the Roots of Ferula Drudeana, the Putative Anatolian Ecotype of the Silphion Plant. Molecules. 2025;30(9):1916. doi: 10.3390/molecules30091916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  253. Amini H., Naghavi M. R., Shen T., Wang Y., Nasiri J., Khan I. A., Fiehn O., Zerbe P., Maloof J. N.. Tissue-Specific Transcriptome Analysis Reveals Candidate Genes for Terpenoid and Phenylpropanoid Metabolism in the Medicinal Plant Ferula Assafoetida. G3 (Bethesda) 2019;9(3):807–816. doi: 10.1534/g3.118.200852. [DOI] [PMC free article] [PubMed] [Google Scholar]
  254. Zhao J., Tian Z., Wang M., Yu X., Li Q., Chen B., Zhang L., Guo X.. Transcriptome and Metabolome Analysis of the Regulatory Network of Bioactive Compounds Synthesis in Different Growth Stages of Two Medicinal Edible Ferula. Sci. Hortic. (Amsterdam) 2024;338:113770. doi: 10.1016/j.scienta.2024.113770. [DOI] [Google Scholar]
  255. Rutz A., Dounoue-Kubo M., Ollivier S., Bisson J., Bagheri M., Saesong T., Ebrahimi S. N., Ingkaninan K., Wolfender J.-L., Allard P.-M.. Taxonomically Informed Scoring Enhances Confidence in Natural Products Annotation. Front. Plant Sci. 2019;10:1329. doi: 10.3389/fpls.2019.01329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  256. Hao D.-C., Xiao P.-G.. Pharmaceutical Resource Discovery from Traditional Medicinal Plants: Pharmacophylogeny and Pharmacophylogenomics. Chin. Herb. Med. 2020;12(2):104–117. doi: 10.1016/j.chmed.2020.03.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  257. Pye C. R., Bertin M. J., Lokey R. S., Gerwick W. H., Linington R. G.. Retrospective Analysis of Natural Products Provides Insights for Future Discovery Trends. Proc. Natl. Acad. Sci. U. S. A. 2017;114(22):5601–5606. doi: 10.1073/pnas.1614680114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  258. Forrister D. L., Endara M.-J., Soule A. J., Younkin G. C., Mills A. G., Lokvam J., Dexter K. G., Pennington R. T., Kidner C. A., Nicholls J. A., Loiseau O., Kursar T. A., Coley P. D.. Diversity and Divergence: Evolution of Secondary Metabolism in the Tropical Tree Genus Inga. New Phytol. 2023;237(2):631–642. doi: 10.1111/nph.18554. [DOI] [PubMed] [Google Scholar]
  259. Leonhardt S. D., Rasmussen C., Schmitt T.. Genes versus Environment: Geography and Phylogenetic Relationships Shape the Chemical Profiles of Stingless Bees on a Global Scale. Proc. Biol. Sci. 2013;280(1762):20130680. doi: 10.1098/rspb.2013.0680. [DOI] [PMC free article] [PubMed] [Google Scholar]
  260. Steffen K., Indraningrat A. A. G., Erngren I., Haglöf J., Becking L. E., Smidt H., Yashayaev I., Kenchington E., Pettersson C., Cárdenas P., Sipkema D.. Oceanographic Setting Influences the Prokaryotic Community and Metabolome in Deep-Sea Sponges. Sci. Rep. 2022;12(1):3356. doi: 10.1038/s41598-022-07292-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  261. Xu Q., Tan A. K. X., Guo L., Lim Y. H., Tay D. W. P., Ang S. J.. Composite Machine Learning Strategy for Natural Products Taxonomical Classification and Structural Insights. Digit. Discovery. 2024;3(11):2192–2200. doi: 10.1039/D4DD00155A. [DOI] [Google Scholar]
  262. Hoffmann T., Krug D., Bozkurt N., Duddela S., Jansen R., Garcia R., Gerth K., Steinmetz H., Müller R.. Correlating Chemical Diversity with Taxonomic Distance for Discovery of Natural Products in Myxobacteria. Nat. Commun. 2018;9(1):803. doi: 10.1038/s41467-018-03184-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  263. Wilson N. G., Maschek J. A., Baker B. J.. A Species Flock Driven by Predation? Secondary Metabolites Support Diversification of Slugs in Antarctica. PLoS One. 2013;8(11):e80277. doi: 10.1371/journal.pone.0080277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  264. Quach Q. N., Clay K., Lee S. T., Gardner D. R., Cook D.. Phylogenetic Patterns of Bioactive Secondary Metabolites Produced by Fungal Endosymbionts in Morning Glories (Ipomoeeae, Convolvulaceae). New Phytol. 2023;238(4):1351–1361. doi: 10.1111/nph.18785. [DOI] [PubMed] [Google Scholar]
  265. Gallagher K. A., Rauscher K., Pavan Ioca L., Jensen P. R.. Phylogenetic and Chemical Diversity of a Hybrid-Isoprenoid-Producing Streptomycete Lineage. Appl. Environ. Microbiol. 2013;79(22):6894–6902. doi: 10.1128/AEM.01814-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  266. Adamek M., Alanjary M., Sales-Ortells H., Goodfellow M., Bull A. T., Winkler A., Wibberg D., Kalinowski J., Ziemert N.. Comparative Genomics Reveals Phylogenetic Distribution Patterns of Secondary Metabolites in Amycolatopsis Species. BMC Genomics. 2018;19(1):426. doi: 10.1186/s12864-018-4809-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  267. Kalinski J.-C. J., Krause R. W. M., Parker-Nance S., Waterworth S. C., Dorrington R. A.. Unlocking the Diversity of Pyrroloiminoquinones Produced by Latrunculid Sponge Species. Mar. Drugs. 2021;19(2):68. doi: 10.3390/md19020068. [DOI] [PMC free article] [PubMed] [Google Scholar]
  268. Galitz A., Nakao Y., Schupp P. J., Wörheide G., Erpenbeck D.. A Soft Spot for Chemistry-Current Taxonomic and Evolutionary Implications of Sponge Secondary Metabolite Distribution. Mar. Drugs. 2021;19(8):448. doi: 10.3390/md19080448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  269. Lever J., Brkljača R., Rix C., Urban S.. Application of Networking Approaches to Assess the Chemical Diversity, Biogeography, and Pharmaceutical Potential of Verongiida Natural Products. Mar. Drugs. 2021;19(10):582. doi: 10.3390/md19100582. [DOI] [PMC free article] [PubMed] [Google Scholar]
  270. Dohrmann M., Reiswig H. M., Kelly M., Mills S., Schätzle S., Reverter M., Niesse N., Rohde S., Schupp P., Wörheide G.. Expanded Sampling of New Zealand Glass Sponges (Porifera: Hexactinellida) Provides New Insights into Biodiversity, Chemodiversity, and Phylogeny of the Class. PeerJ. 2023;11:e15017. doi: 10.7717/peerj.15017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  271. Jaramillo K. B., Guillén P. O., Abad R., Rodríguez León J. A., McCormack G.. Contribution of Metabolomics to the Taxonomy and Systematics of Octocorals from the Tropical Eastern Pacific. PeerJ. 2025;13:e19009. doi: 10.7717/peerj.19009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  272. Defossez E., Pitteloud C., Descombes P., Glauser G., Allard P.-M., Walker T. W. N., Fernandez-Conradi P., Wolfender J.-L., Pellissier L., Rasmann S.. Spatial and Evolutionary Predictability of Phytochemical Diversity. Proc. Natl. Acad. Sci. U. S. A. 2021;118(3):e2013344118. doi: 10.1073/pnas.2013344118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  273. Allevato D. M., Groppo M., Kiyota E., Mazzafera P., Nixon K. C.. Evolution of Phytochemical Diversity in Pilocarpus (Rutaceae). Phytochemistry. 2019;163:132–146. doi: 10.1016/j.phytochem.2019.03.027. [DOI] [PubMed] [Google Scholar]
  274. Maldonado C., Barnes C. J., Cornett C., Holmfred E., Hansen S. H., Persson C., Antonelli A., Rønsted N.. Phylogeny Predicts the Quantity of Antimalarial Alkaloids within the Iconic Yellow Cinchona Bark (Rubiaceae: Cinchona Calisaya). Front. Plant Sci. 2017;8:391. doi: 10.3389/fpls.2017.00391. [DOI] [PMC free article] [PubMed] [Google Scholar]
  275. Wink M., Botschen F., Gosmann C., Schäfer H., Waterman P. G.. Chemotaxonomy Seen from a Phylogenetic Perspective and Evolution of Secondary Metabolism. Annual Plant Reviews. 2018:364–433. doi: 10.1002/9781119312994.apr0429. [DOI] [Google Scholar]
  276. Adamek M., Alanjary M., Ziemert N.. Applied Evolution: Phylogeny-Based Approaches in Natural Products Research. Nat. Prod. Rep. 2019;36(9):1295–1312. doi: 10.1039/C9NP00027E. [DOI] [PubMed] [Google Scholar]
  277. Mullins A. J., Webster G., Kim H. J., Zhao J., Petrova Y. D., Ramming C. E., Jenner M., Murray J. A. H., Connor T. R., Hertweck C., Challis G. L., Mahenthiralingam E.. Discovery of the Pseudomonas Polyyne Protegencin by a Phylogeny-Guided Study of Polyyne Biosynthetic Gene Cluster Diversity. MBio. 2021;12(4):e0071521. doi: 10.1128/mBio.00715-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  278. Murata K., Suenaga M., Kai K.. Genome Mining Discovery of Protegenins A-D, Bacterial Polyynes Involved in the Antioomycete and Biocontrol Activities of Pseudomonas Protegens. ACS Chem. Biol. 2022;17(12):3313–3320. doi: 10.1021/acschembio.1c00276. [DOI] [PubMed] [Google Scholar]
  279. Hotter V., Zopf D., Kim H. J., Silge A., Schmitt M., Aiyar P., Fleck J., Matthäus C., Hniopek J., Yan Q., Loper J., Sasso S., Hertweck C., Popp J., Mittag M.. A Polyyne Toxin Produced by an Antagonistic Bacterium Blinds and Lyses a Chlamydomonad Alga. Proc. Natl. Acad. Sci. U. S. A. 2021;118(33):e2107695118. doi: 10.1073/pnas.2107695118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  280. Holzmeyer L., Hartig A.-K., Franke K., Brandt W., Muellner-Riehl A. N., Wessjohann L. A., Schnitzler J.. Evaluation of Plant Sources for Antiinfective Lead Compound Discovery by Correlating Phylogenetic, Spatial, and Bioactivity Data. Proc. Natl. Acad. Sci. U. S. A. 2020;117(22):12444–12451. doi: 10.1073/pnas.1915277117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  281. Saslis-Lagoudakis C. H., Savolainen V., Williamson E. M., Forest F., Wagstaff S. J., Baral S. R., Watson M. F., Pendry C. A., Hawkins J. A.. Phylogenies Reveal Predictive Power of Traditional Medicine in Bioprospecting. Proc. Natl. Acad. Sci. U. S. A. 2012;109(39):15835–15840. doi: 10.1073/pnas.1202242109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  282. Waglechner N., McArthur A. G., Wright G. D.. Phylogenetic Reconciliation Reveals the Natural History of Glycopeptide Antibiotic Biosynthesis and Resistance. Nat. Microbiol. 2019;4(11):1862–1871. doi: 10.1038/s41564-019-0531-5. [DOI] [PubMed] [Google Scholar]
  283. Deng Q., Li Y., He W., Chen T., Liu N., Ma L., Qiu Z., Shang Z., Wang Z.. A Polyene Macrolide Targeting Phospholipids in the Fungal Cell Membrane. Nature. 2025;640(8059):743–751. doi: 10.1038/s41586-025-08678-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  284. Kwon T., Hovde B. T.. Global Characterization of Biosynthetic Gene Clusters in Non-Model Eukaryotes Using Domain Architectures. Sci. Rep. 2024;14(1):1534. doi: 10.1038/s41598-023-50095-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  285. Shou Q., Feng L., Long Y., Han J., Nunnery J. K., Powell D. H., Butcher R. A.. A Hybrid Polyketide-Nonribosomal Peptide in Nematodes That Promotes Larval Survival. Nat. Chem. Biol. 2016;12(10):770–772. doi: 10.1038/nchembio.2144. [DOI] [PMC free article] [PubMed] [Google Scholar]
  286. Calestani C., Rast J. P., Davidson E. H.. Isolation of Pigment Cell Specific Genes in the Sea Urchin Embryo by Differential Macroarray Screening. Development. 2003;130(19):4587–4596. doi: 10.1242/dev.00647. [DOI] [PubMed] [Google Scholar]
  287. Grayson N. E., Scesa P. D., Moore M. L., Ledoux J.-B., Gomez-Garrido J., Alioto T., Michael T. P., Burkhardt I., Schmidt E. W., Moore B. S.. A Widespread Metabolic Gene Cluster Family in Metazoans. Nat. Chem. Biol. 2025 doi: 10.1038/s41589-025-01927-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  288. Cooke T. F., Fischer C. R., Wu P., Jiang T.-X., Xie K. T., Kuo J., Doctorov E., Zehnder A., Khosla C., Chuong C.-M., Bustamante C. D.. Genetic Mapping and Biochemical Basis of Yellow Feather Pigmentation in Budgerigars. Cell. 2017;171(2):427–439. doi: 10.1016/j.cell.2017.08.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  289. Saarenpää S., Shalev O., Ashkenazy H., Carlos V., Lundberg D. S., Weigel D., Giacomello S.. Spatial Metatranscriptomics Resolves Host-Bacteria-Fungi Interactomes. Nat. Biotechnol. 2024;42(9):1384–1393. doi: 10.1038/s41587-023-01979-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  290. Yuan Y., Huang C., Singh N., Xun G., Zhao H.. Self-Resistance-Gene-Guided, High-Throughput Automated Genome Mining of Bioactive Natural Products from Streptomyces. Cell Syst. 2025;16(3):101237. doi: 10.1016/j.cels.2025.101237. [DOI] [PMC free article] [PubMed] [Google Scholar]
  291. Mungan M. D., Blin K., Ziemert N.. ARTS-DB: A Database for Antibiotic Resistant Targets. Nucleic Acids Res. 2022;50(D1):D736–D740. doi: 10.1093/nar/gkab940. [DOI] [PMC free article] [PubMed] [Google Scholar]
  292. Chao R., Yuan Y., Zhao H.. Building Biological Foundries for Next-Generation Synthetic Biology. Sci. China Life Sci. 2015;58(7):658–665. doi: 10.1007/s11427-015-4866-8. [DOI] [PubMed] [Google Scholar]
  293. Enghiad B., Huang C., Guo F., Jiang G., Wang B., Tabatabaei S. K., Martin T. A., Zhao H.. Cas12a-Assisted Precise Targeted Cloning Using in Vivo Cre-Lox Recombination. Nat. Commun. 2021;12(1):1171. doi: 10.1038/s41467-021-21275-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  294. Beato M. S., Veneroso V.. The Nagoya Protocol on Access and Benefit Sharing: The Neglected Issue of Animal Health. Front. Microbiol. 2023;14:1124120. doi: 10.3389/fmicb.2023.1124120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  295. Soares J.. The Nagoya Protocol and Natural Product-Based Research. ACS Chem. Biol. 2011;6(4):289. doi: 10.1021/cb200089w. [DOI] [PubMed] [Google Scholar]
  296. Bond M. R., Scott D.. Digital Biopiracy and the (Dis)Assembling of the Nagoya Protocol. Geoforum. 2020;117:24–32. doi: 10.1016/j.geoforum.2020.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  297. Halewood M.. New Rules for Sharing Benefits from the Use of Digital Sequence Information. Nucleus (Calcutta) 2024;67(1):5–9. doi: 10.1007/s13237-024-00487-1. [DOI] [Google Scholar]
  298. Newman D. J., Cragg G. M.. Natural Products as Sources of New Drugs over the Nearly Four Decades from 01/1981 to 09/2019. J. Nat. Prod. 2020;83(3):770–803. doi: 10.1021/acs.jnatprod.9b01285. [DOI] [PubMed] [Google Scholar]
  299. Domingo-Fernández D., Gadiya Y., Preto A. J., Krettler C. A., Mubeen S., Allen A., Healey D., Colluru V.. Natural Products Have Increased Rates of Clinical Trial Success throughout the Drug Development Process. J. Nat. Prod. 2024;87(7):1844–1851. doi: 10.1021/acs.jnatprod.4c00581. [DOI] [PMC free article] [PubMed] [Google Scholar]
  300. Batista P. H. D.. Proposal for a Global Regulation of Digital Sequence Information. Social Science Research Network. 2024 doi: 10.2139/ssrn.4861106. [DOI] [Google Scholar]
  301. Aubry S., Walckiers P., Frison C.. Keep Talking While Everything Gets Sequenced: Is Global Governance of Genetic Resources Keeping Pace with Digitization? SSRN. 2025 doi: 10.2139/ssrn.5062711. [DOI] [Google Scholar]
  302. Muñoz-García M., DSI Scientific Network, Scholz A. H.. Navigating COP16’s Digital Sequence Information Outcomes: What Researchers Need to Do in Practice. Patterns (N. Y.) 2025;6(3):101208. doi: 10.1016/j.patter.2025.101208. [DOI] [PMC free article] [PubMed] [Google Scholar]
  303. Freitas H., Gouveia A. C.. Biodiversity Futures: Digital Approaches to Knowledge and Conservation of Biological Diversity. Web Ecol. 2025;25(1):29–37. doi: 10.5194/we-25-29-2025. [DOI] [Google Scholar]
  304. Khalil A. T., Shinwari Z. K., Islam A.. Fostering Openness in Open Science: An Ethical Discussion of Risks and Benefits. Front. Polit. Sci. 2022;4:930574. doi: 10.3389/fpos.2022.930574. [DOI] [Google Scholar]
  305. Adler Miserendino R. A., Meyer R. S., Zimkus B. M., Bates J., Silvestri L., Taylor C., Blumenfield T., Srigyan M., Pandey J. L.. The Case for Community Self-Governance on Access and Benefit Sharing of Digital Sequence Information. Bioscience. 2022;72(5):405–408. doi: 10.1093/biosci/biac019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  306. Nehring R.. Digitising Biopiracy? The Global Governance of Plant Genetic Resources in the Age of Digital Sequencing Information. Third World Q. 2022;43(8):1970–1987. doi: 10.1080/01436597.2022.2079489. [DOI] [PMC free article] [PubMed] [Google Scholar]
  307. Lawson C., Humphries F., Rourke M.. Challenging the Existing Order of Knowledge Sharing Governance with Digital Sequence Information on Genetic Resources. J. Intellect. Prop. Law Pract. 2024;19:337. doi: 10.1093/jiplp/jpad129. [DOI] [Google Scholar]
  308. Yinuo. Press release. United Nations Sustainable Development. https://www.un.org/sustainabledevelopment/blog/2025/02/press-release-governments-agree-on-the-way-forward-to-mobilise-the-resources-needed-to-protect-biodiversity-for-people-and-planet/ (accessed 2025-05-27).
  309. Rabone M., Horton T., Humphries F., Lyal C. H. C., Muraki Gottlieb H., Scholz A. H., Vanagt T., Jaspars M.. BBNJ Agreement: Considerations for Scientists and Commercial End Users of MGR at Research, Development and Commercialization Stages. In Sustainable Development Goals Series; Springer Nature Switzerland: Cham, 2025; pp 283–315. doi: 10.1007/978-3-031-72100-7_14. [DOI] [Google Scholar]

Articles from Journal of Natural Products are provided here courtesy of American Chemical Society
