Abstract
Specialized metabolites display diverse biological activities, and some pharmaceuticals derived from cardenolides and monoterpene indole alkaloids (MIAs) are currently used to treat human diseases such as cancers or hypertension. While these two families of biocompounds are produced by specific subfamilies of Apocynaceae, one member of this medicinal plant family, the succulent tree Pachypodium lamerei Drake (also known as Madagascar palm), does not produce such specialized metabolites. To explore the evolutionary paths that have led to the emergence and loss of cardenolide and MIA biosynthesis in Apocynaceae, we sequenced and assembled the P. lamerei genome by combining Oxford Nanopore Technologies long reads and Illumina short reads. Phylogenomics revealed that, among the Apocynaceae whose genomes have been sequenced, the Madagascar palm is so far the species closest to the common ancestor of MIA producers and non-MIA producers. Transposable elements, constituting 72.48% of the genome, emerge as potential key players in shaping genomic architecture and influencing specialized metabolic pathways. The absence of crucial MIA biosynthetic genes such as strictosidine synthase in P. lamerei and non-Rauvolfioideae species hints at a transposon-mediated mechanism behind gene loss. Phylogenetic analysis not only showcases the evolutionary divergence of specialized metabolite biosynthesis within Apocynaceae but also underscores the role of transposable elements in this intricate process. Moreover, we shed light on the low conservation of enzymes involved in the final stages of MIA biosynthesis in the distinct MIA-producing plant families, which suggests independent gains of these specialized enzymes during the evolution of these medicinal plant clades.
Overall, this study marks a leap forward in understanding the genomic dynamics underpinning the evolution of specialized metabolite biosynthesis in the Apocynaceae family, with transposons emerging as potential architects of genomic restructuring and gene loss.
Keywords: Apocynaceae, Evolution, Biosynthetic gene clusters, Specialized metabolites, Alkaloids
Abbreviations
- ADH: alcohol dehydrogenase
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- GO: geissoschizine oxidase
- Ks: synonymous substitutions per synonymous site
- LAMT: loganic acid O-methyltransferase
- LTR: long terminal repeat
- MEP: methylerythritol phosphate
- MIA: monoterpene indole alkaloid
- P450: cytochrome P450
- PAS: precondylocarpine acetate synthase
- SAT: stemmadenine-O-acetyltransferase
- SGD: strictosidine β-D-glucosidase
- SLS: secologanin synthase
- STR: strictosidine synthase
- TDC: tryptophan decarboxylase
- TE: transposable element
- WGD: whole genome duplication
1. Introduction
Comprising around 350 genera and 5000 species, the Apocynaceae family, commonly known as the dogbane family, exhibits a global distribution, thriving in varied ecosystems from tropical rainforests to arid deserts [1]. Plants from this family showcase a plethora of morphological structures, reproductive strategies, physiological adaptations and phytochemical diversity that have evolved over millennia [[2], [3], [4], [5]]. In addition to its ecological significance, Apocynaceae has been a source of invaluable contributions to human health and well-being through its rich specialized metabolites repertoire including monoterpene indole alkaloids (MIAs) and cardenolides [6,7].
Cardenolides are a group of biocompounds produced by plant species mainly belonging to the Apocynaceae, Plantaginaceae, and Brassicaceae families [7,8]. They exhibit a broad range of biological activities including cardiotonic and anti-tumor activities [9,10]. They are characterized by a steroid core structure and a lactone ring [11]. The biosynthesis of cardenolides starts with the condensation of two farnesyl-diphosphates by squalene synthase [12]. From squalene, several enzymatic routes are proposed to generate the three putative sterol precursors of cardenolides: cholesterol, β-sitosterol and campesterol [[13], [14], [15]]. These three precursors can be used as substrates by DpCYP87A106 and CpCYP87A103 to produce pregnenolone [16]. In Digitalis purpurea, pregnenolone is further modified by 3β-hydroxysteroid dehydrogenase and progesterone 5β-reductase to produce pregnanolone [[17], [18], [19], [20]]. Even though extensive research has been performed on the biosynthesis of sterol-derived cardenolides, the enzymatic steps from pregnanolone towards the different types of cardenolides remain unidentified in any plant species.
MIAs are a group of specialized metabolites produced by plant species mainly belonging to the Apocynaceae, Loganiaceae and Rubiaceae families (Gentianales order) and a few Nyssaceae (Cornales order) [6]. In the Apocynaceae family, MIAs are specifically produced by the Rauvolfioideae subfamily [2]. Non-MIA-producing Apocynaceae species are thus referred to as non-Rauvolfioideae species. MIAs exhibit a wide range of biological activities, including anti-tumor, anti-inflammatory, and anti-microbial properties [6,21]. MIA backbones are characterized by an indole ring and a monoterpene unit, which are biosynthetically derived from the shikimate and monoterpene iridoid pathways, respectively [22]. The shikimate pathway starts with the condensation of phosphoenolpyruvate and D-erythrose-4-phosphate by 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase, followed by nine steps resulting in the formation of tryptophan (Fig. 1) [23]. This essential amino acid is further decarboxylated by tryptophan decarboxylase (TDC), forming tryptamine, the indole precursor of MIAs [24]. The monoterpene iridoid moiety is synthesized from geranyl diphosphate, which is produced by geranyl diphosphate synthase from isopentenyl diphosphate and dimethylallyl diphosphate, both synthesized through the MEP pathway (Fig. 1) [25]. A series of seven enzymes leads to the synthesis of loganic acid from geranyl diphosphate [26]. Loganic acid is further O-methylated by loganic acid O-methyltransferase (LAMT) and then converted to secologanin, the monoterpene iridoid precursor of MIAs, by secologanin synthase (SLS) (Fig. 1) [[27], [28], [29]]. Over the last three decades, the biosynthesis of MIAs has been extensively studied in the iconic plant Catharanthus roseus, also known as the Madagascar periwinkle [30].
The pathway starts with the condensation of tryptamine and secologanin by strictosidine synthase (STR), the first identified Pictet-Spengler enzyme [31,32]. The resulting intermediate and first MIA of the whole pathway, strictosidine, is then deglycosylated by strictosidine β-D-glucosidase (SGD) [33,34] and further modified by a series of enzymatic reactions to produce dihydroprecondylocarpine acetate, the last molecule that can be considered as part of a central MIA pathway (Fig. 1) [35]. This central MIA pathway contains several branching points that give rise to the wide range of MIAs (Fig. 1). Strictosidine aglycones are the precursors of heteroyohimbane and yohimbane alkaloids such as the antihypertensive ajmalicine [[36], [37], [38], [39], [40]]. Geissoschizine is the precursor of sarpagane alkaloids such as the antiarrhythmic ajmaline [41,42]. Dihydroprecondylocarpine is the precursor of iboga alkaloids such as the antiaddictive ibogaine [43,44] and aspidosperma alkaloids such as the anticancer precursor vindoline [45,46]. The identification of the molecular actors of MIA biosynthesis in several plant species has progressively paved the way for the development of alternative biotechnological processes for MIA-derived pharmaceutical supply, notably by reconstructing the plant biosynthetic pathways in heterologous organisms [[47], [48], [49]]. It has also shed light on evolutionary trends of MIA biosynthesis within plant clades, which has probably co-opted genes from flavonoid metabolism, as illustrated with O-methyltransferases in C. roseus and Camptotheca acuminata [50,51]. In addition, an interesting characteristic of the MIA pathway is that some biosynthetic genes are physically linked in the genomes, forming biosynthetic gene clusters as described in C. roseus [[52], [53], [54], [55], [56]], Vinca minor [57] and Gelsemium sempervirens [53].
Interestingly, STR and TDC, two key enzymes in MIA biosynthesis, are described to form a biosynthetic gene cluster in many species including C. roseus [52,53,56,57], Rhazya stricta [58] and G. sempervirens [53]. However, no study has so far addressed the evolution of the capacity of synthesizing MIAs among plant families and subfamilies.
In this context, Pachypodium lamerei Drake, commonly known as the Madagascar palm, stands out as an interesting species. P. lamerei is a succulent native to Madagascar, well adapted to arid regions and featuring a massive spiky trunk with thick leaves at the top (Fig. 2A). This stunning tree belongs to the Apocynoideae subfamily of Apocynaceae. Plants from the Rauvolfioideae subfamily of Apocynaceae (e.g. C. roseus, V. minor, Voacanga thouarsii or Rauvolfia species) are well documented for producing a large palette of MIAs, and plants from the Asclepiadoideae subfamily (e.g. Calotropis procera) are well documented for producing cardenolides. In contrast, P. lamerei does not biosynthesize such specialized metabolites (Fig. 2D), which raises the question of the evolutionary processes underlying this inability in non-Rauvolfioideae and non-Asclepiadoideae species [2,5,53,[59], [60], [61]]. In this study, we report the first Pachypodium genome assembly. We propose several evolutionary scenarios, notably involving transposable elements (TE), for the emergence and loss of specialized metabolite biosynthetic pathways.
2. Results and discussion
2.1. Genome sequencing, assembly and annotation
Flye (v.2.8.3) was used to assemble the P. lamerei genome from Oxford Nanopore Technologies (ONT) long reads, resulting in a 968.8 Mb assembly spread across 3029 scaffolds. Two rounds of polishing with Illumina short reads using Pilon (v.1.23) then yielded a 968.6 Mb assembly distributed across 3029 scaffolds (Table 1). Interestingly, this genome is the second largest Apocynaceae genome available after V. thouarsii (1351.2 Mb, [62]). The quality of the assembled genome is supported by a base-level QV of 31.1, which corresponds to a per-base accuracy above 99.9%, and by a k-mer completeness of 92.5% (Table 1). Overall, the assembled genome is 96.6% complete (Fig. 2B) with a low duplication rate (2.0%), according to the identification of core Eudicotyledons Benchmarking Universal Single-Copy Orthologs (BUSCO). This BUSCO score is similar to those of the other Apocynaceae genomes (Supplementary Figure S1A), with the exception of Apocynum venetum and V. minor, which both present a higher duplication rate (14.4% and 36.6%, respectively), and Calotropis procera, which is moderately less complete (85.3%).
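The relationship between a Phred-scaled consensus quality value and per-base accuracy can be made explicit with a short calculation. The sketch below assumes the standard Phred definition, QV = -10 log10(error rate); the function names are illustrative and are not taken from Merqury.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


# QV reported for the P. lamerei assembly (Table 1)
qv = 31.1
error = qv_to_error_rate(qv)
accuracy_pct = (1 - error) * 100
print(f"QV {qv}: error rate {error:.2e}, accuracy {accuracy_pct:.3f}%")
```

Under this definition, a QV of 31.1 corresponds to slightly less than one error per kilobase, that is a per-base accuracy of roughly 99.92%.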
Table 1.
Length of genome assembly (bp) | 968,622,212 |
Number of scaffolds | 3029 |
N50 of scaffolds (Mb) | 1.67 |
L50 of scaffolds | 145 |
GC content (%) | 35.34 |
Longest scaffold (Mb) | 10.422 |
scaffolds >50 kb (%) | 96.75 |
QV | 31.1 |
Number of protein-coding genes | 16,176 |
Average gene length (bp) | 5434 |
Average transcripts number per gene | 1.6 |
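For readers less familiar with the contiguity metrics in Table 1, the following minimal sketch shows how N50 and L50 are conventionally derived from a list of scaffold lengths. The toy lengths are invented for illustration and are not the P. lamerei scaffolds.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly;
    L50 is the number of scaffolds in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one scaffold")


# Hypothetical toy assembly: total length 30, so half is 15
toy_lengths = [10, 8, 6, 4, 2]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

In the toy example the two longest scaffolds (10 + 8) already exceed half of the total length, so N50 is 8 and L50 is 2.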
On the basis of available resources, genomes of MIA-producing Apocynaceae (i.e. from the Rauvolfioideae subfamily) generally contain around 30,000 genes (C. roseus: 37,298 genes [55]; V. thouarsii: 33,300 genes [62]; V. minor: 29,624 genes [57]). Despite being the second largest Apocynaceae genome characterized to date, P. lamerei displays the lowest number of genes (16,176 protein-coding genes) (Fig. 3A). Interestingly, Gelsemium sempervirens, a MIA-producing Gentianales species close to Apocynaceae (Fig. 3A), has a gene content similar to that of non-Rauvolfioideae genomes (i.e. C. gigantea, C. procera, Asclepias syriaca, A. venetum and P. lamerei) (Fig. 3A). This suggests that the higher gene content in Rauvolfioideae species likely results from gene duplication occurring after the Rauvolfioideae/non-Rauvolfioideae divergence. We also noted that all these predicted gene sets achieved good completeness scores according to Eudicotyledons BUSCO, ranging from 81% (C. procera) to 92.6% (A. venetum) (Supplementary Figure S1B). In addition, the combination of BLASTp and BLASTx against the UniProt database and hmmscan against the Pfam database led to the functional annotation of 85.2% of the predicted genes (13,777 of the 16,176 genes, Supplementary Table S1).
Whole genome duplication (WGD) events were investigated by identifying paralogous gene pairs in each species and estimating the number of synonymous substitutions per synonymous site (Ks) for each pair. Paralogs were then arranged according to age, using Ks as a proxy. In such a distribution, recently duplicated genes produce a high-density peak at low Ks, and duplicates are progressively lost over time, which yields an L-shaped pattern. Large-scale duplications such as WGD events cause a marked increase in the number of paralogs at a particular Ks and usually appear as secondary peaks in the Ks plot. Arabidopsis thaliana, C. roseus, P. lamerei and Solanum lycopersicum all showed a secondary peak at around Ks = 2.5 (Fig. 2C), which corresponds to the well-described and conserved whole-genome triplication event common to Eudicotyledons [63]. No additional peak could be identified in the P. lamerei Ks plot, thus indicating the absence of any other recent WGD event.
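The logic of reading a Ks plot can be sketched in a few lines: bin the Ks values of paralogous pairs and look for local maxima away from the initial low-Ks peak. This is a deliberately simplified illustration with invented Ks values; the bin width, the `min_ks` cutoff and the function names are assumptions, and published analyses typically rely on kernel density estimates or mixture models rather than raw histogram maxima.

```python
def ks_histogram(ks_values, bin_width=0.5, max_ks=4.0):
    """Count paralog pairs per Ks bin over the interval [0, max_ks)."""
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            counts[min(int(ks / bin_width), n_bins - 1)] += 1
    return counts


def secondary_peaks(counts, bin_width=0.5, min_ks=0.5):
    """Return bin centers that are local maxima beyond the initial low-Ks peak."""
    peaks = []
    for i in range(1, len(counts) - 1):
        center = (i + 0.5) * bin_width
        if center < min_ks:
            continue
        if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]:
            peaks.append(center)
    return peaks


# Invented Ks values: an L-shaped decay plus a bump between Ks 2.5 and 3.0
toy_ks = ([0.1] * 8 + [0.6] * 5 + [1.2] * 3 + [1.7] * 2
          + [2.2] * 4 + [2.6] * 7 + [3.2] * 3 + [3.7] * 1)
counts = ks_histogram(toy_ks)
print(counts)
print(secondary_peaks(counts))
```

With these toy values the counts decay from the first bin and rise again in the bin covering Ks 2.5 to 3.0, which is the kind of secondary peak interpreted as a large-scale duplication.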
Since TE are known to play important roles in genetic instability and genome evolution [64], the TE composition of the P. lamerei genome was also inferred. This analysis indicates that 72.48% of the P. lamerei genome is composed of TE, most of them being long terminal repeat (LTR) elements (Supplementary Figure S2). The majority of P. lamerei LTR elements are Copia-type (37.41%), with Gypsy-type LTR elements accounting for 17.58% (Supplementary Figure S2). Such a Copia:Gypsy ratio (about 2.1) is mainly found in Rauvolfioideae species, while non-Rauvolfioideae tend to have a Copia:Gypsy ratio of 1 or less (Supplementary Figure S2). In accordance with its genome size, P. lamerei displays the second highest proportion of TE after V. thouarsii [62]. The similar genome size and TE proportion in these two species suggest that P. lamerei and V. thouarsii are likely among the closest species to a common ancestor that led to the differentiation of Rauvolfioideae and non-Rauvolfioideae Apocynaceae species.
2.2. Conservation of specialized metabolism biosynthetic genes
To ensure completeness of the predicted proteome, we first looked for orthologs of genes involved in the biosynthesis of flavonoids, a prominent group of specialized metabolites broadly conserved in flowering plants. We began with a targeted metabolomics analysis of key intermediates of the flavonoid pathway to confirm their occurrence in the studied species P. lamerei and in the MIA-biosynthesis model plant C. roseus. As previously reported in C. roseus [65,66], we mainly identified final products of the flavonoid pathway such as glycosylated flavonols (i.e. kaempferol-3-glucoside and quercetin-3-glucoside) and both anthocyanidins and anthocyanins (i.e. cyanidin and pelargonidin) in both species (Table 2). We also identified the flavonoid precursor phenylalanine. Interestingly, no flavan-3-ols (i.e. catechin and epicatechin) could be detected in the analyzed samples. As flavonoids could be detected in P. lamerei, we then performed a BLASTp analysis of functionally characterized proteins from A. thaliana, Vitis vinifera, C. roseus and Desmodium uncinatum against the P. lamerei and C. roseus predicted proteomes (Supplementary Tables S2-S3). We identified high-confidence orthologs with similar identity and coverage in the two species for the eight genes involved in the biosynthesis of kaempferol, quercetin, pelargonidin, and cyanidin from the condensation of 4-coumaroyl-CoA and three malonyl-CoA units (Supplementary Table S4). This is consistent with the detection of the glycosylated form of these end products in P. lamerei and C. roseus. Conversely, neither catechin nor epicatechin was detected in P. lamerei and C. roseus samples, in line with the absence of high-confidence orthologs for leucoanthocyanidin reductase and anthocyanidin reductase in both species. These results support the quality of the gene annotation and indicate that orthologs are unlikely to escape our identification procedure.
Table 2.
Metabolite | Molar mass | RTa | P. lamerei, young leaves | P. lamerei, old leaves | P. lamerei, flowers | C. roseus, young leaves | C. roseus, old leaves |
---|---|---|---|---|---|---|---|
phenylalanine | 165.19 | 2.7 | ++ | ++ | ++ | ++ | ++ |
naringenin | 272.25 | 12.5 | – | – | – | – | – |
kaempferol | 286.23 | 13.0 | – | – | – | – | – |
kaempferol-3-glucoside | 448.38 | 8.9 | + | – | ++ | + | – |
quercetin | 302.25 | 11.3 | – | – | – | – | – |
quercetin-3-glucoside | 464.38 | 7.8 | ++ | – | ++ | – | – |
pelargonidin | 271.24 | 7.9 | – | + | + | ++ | ++ |
cyanidin | 287.24 | 6.9 | + | + | + | – | – |
cyanidin-3-glucoside | 484.83 | 5.9 | + | + | + | + | + |
catechin | 290.26 | 4.8 | – | – | – | – | – |
epicatechin | 290.26 | 5.9 | – | – | – | – | – |
a Retention time.
On this basis, we next performed a similar analysis with MIA and cardenolide biosynthetic genes. We created a database containing functionally characterized proteins involved in the shikimate, MEP, iridoid, MIA and cardenolide pathways [67]. We then performed a BLASTp analysis of this database against the predicted proteomes of all studied species and retained hits with at least 90% coverage and 50% identity (Supplementary Tables S5-S20). We deliberately used low identity and coverage thresholds to maximize the probability of identifying orthologs. As expected, this resulted in a higher number of putative orthologs for multigenic families such as alcohol dehydrogenases (ADHs) and cytochromes P450 (P450s). For instance, these low thresholds led to the identification of 10 SLS-like genes in C. roseus while only four bona fide SLS exist in this species (Supplementary Table S21). Based on this approach, we were able to retrieve orthologs of genes from the shikimate, MEP, steroid and cardenolide pathways for all species, with only a few exceptions probably due to phylogenetic distance (Supplementary Tables S21-S22). We also identified orthologs for most of the genes of the iridoid pathway in all Apocynaceae, with the exception of 8-hydroxygeraniol oxidoreductase and loganic acid O-methyltransferase for all non-Rauvolfioideae species (50–90% identity, 94–100% coverage).
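The hit-filtering step described above can be sketched as follows. This is an illustration only: it assumes BLASTp was run with a custom tabular output (`-outfmt "6 qseqid sseqid pident qstart qend qlen"`) and that coverage is computed over the query for a single alignment, neither of which is specified in the text. The gene identifiers in the toy input are hypothetical.

```python
def filter_hits(rows, min_identity=50.0, min_coverage=90.0):
    """Keep tabular BLASTp hits passing identity and query-coverage thresholds.

    Each row is a tab-separated string with the assumed columns:
    qseqid, sseqid, pident, qstart, qend, qlen.
    """
    kept = []
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend, qlen = int(fields[3]), int(fields[4]), int(fields[5])
        # Query coverage of this alignment, in percent
        coverage = 100.0 * (qend - qstart + 1) / qlen
        if pident >= min_identity and coverage >= min_coverage:
            kept.append((query, subject, pident, round(coverage, 1)))
    return kept


# Hypothetical hits against a predicted proteome
toy_rows = [
    "CrSTR\tPl_g1\t45.0\t1\t340\t344",    # identity below threshold
    "CrTDC\tPl_g2\t82.5\t1\t500\t500",    # passes both thresholds
    "CrSLS\tPl_g3\t61.0\t50\t300\t524",   # coverage below threshold
]
hits = filter_hits(toy_rows)
print(hits)
```

Because the thresholds are permissive, members of large families such as P450s can pass the filter for several reference enzymes, which is why the retained hits are described as putative orthologs.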
Interestingly, concerning the key MIA biosynthetic enzyme STR, only MIA-producing Gentianales present at least one CrSTR orthologous gene, while no ortholog was identified in other species. As previously reported, C. acuminata (Cornales) uses an alternative seco-iridoid pathway producing strictosidinic acids via strictosidinic acid synthase, which may explain why no CrSTR-like enzyme could be found in this species [68]. It also suggests that MIA biosynthesis in Gentianales and Cornales may have emerged independently (i.e. by convergent evolution). Similarly, no CrSTR-like enzyme has been identified in Apocynoideae and Asclepiadoideae genomes. Thus, the absence of this enzyme, together with the lack of detection of secologanin or strictosidine in P. lamerei leaf extracts (Fig. 2D), suggests that an evolutionary loss of MIA biosynthesis may have occurred in non-Rauvolfioideae species among Apocynaceae, as in the Rubiaceae family [58,69]. This statement is reinforced by the lack of many other MIA biosynthetic genes such as SGD, geissoschizine oxidase (GO), stemmadenine-O-acetyltransferase (SAT) or precondylocarpine acetate synthase (PAS), which probably did not evolve in the absence of the precursor strictosidine (Supplementary Table S21). It is worth noting that no SAT-like genes could be found in Gelsemiaceae or Rubiaceae species and that no GO-like genes could be found in Rubiaceae species. Such an orthology pattern suggests that enzymes modifying the strictosidine aglycone derivatives may have appeared independently in these families, leading to family-specific MIA skeletons (e.g. quinoline-type in Rubiaceae or corynanthe-type in Apocynaceae).
2.3. Evolutionary divergence of P. lamerei
To gain a more comprehensive understanding of the evolution of specialized metabolite biosynthesis in Apocynaceae, a maximum-likelihood phylogenetic tree was established (Fig. 3A). To this end, protein-coding genes from 8 Apocynaceae species (3 Rauvolfioideae, 2 Apocynoideae and 3 Asclepiadoideae), 4 non-Apocynaceae MIA-producing species (2 Rubiaceae, 1 Gelsemiaceae and 1 Nyssaceae) and 4 non-Apocynaceae non-MIA-producing species (A. thaliana, C. canephora, S. lycopersicum and V. vinifera) were compared. A total of 24,413 orthogroups were defined from these 16 protein datasets, covering 94.1% of all proteins. Of these, 387 were single-copy orthogroups and were used to build the phylogenetic tree. The resulting phylogenetic tree is in agreement with the Apocynaceae classification based on other approaches including morphological characterization and plastome analysis (Fig. 3A) [3,70]. In addition to their common genomic features, V. thouarsii and P. lamerei are the first species to branch off within the Rauvolfioideae and non-Rauvolfioideae groups, respectively.
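The selection of single-copy orthogroups for tree building can be illustrated with a small sketch. It assumes a hypothetical in-memory table mapping each orthogroup to per-species gene counts; the orthogroup names, species labels and counts below are invented, and the orthology software actually used is not implied by this example.

```python
def single_copy_orthogroups(gene_counts, species):
    """Return orthogroups with exactly one gene in every listed species."""
    return sorted(
        orthogroup
        for orthogroup, counts in gene_counts.items()
        if all(counts.get(sp, 0) == 1 for sp in species)
    )


# Hypothetical species labels and orthogroup gene counts
species = ["P_lamerei", "C_roseus", "V_thouarsii"]
toy_counts = {
    "OG0001": {"P_lamerei": 1, "C_roseus": 1, "V_thouarsii": 1},  # single copy
    "OG0002": {"P_lamerei": 1, "C_roseus": 2, "V_thouarsii": 1},  # duplicated
    "OG0003": {"C_roseus": 1, "V_thouarsii": 1},                  # absent in one
}
selected = single_copy_orthogroups(toy_counts, species)
print(selected)
```

Requiring exactly one copy in every species is strict, which is consistent with only a small fraction of orthogroups (387 of 24,413 here) being usable for the species tree.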
The evolution of orthogroups was then analyzed along the phylogenetic tree (Fig. 3B, Supplementary Table S23). Orthogroups associated with the shikimate and MEP pathways are present in all studied genomes, suggesting that these pathways share a common evolutionary origin. Conversely, iridoid pathway-associated orthogroups are only identified in Nyssaceae, Rubiaceae, Gelsemiaceae and Rauvolfioideae species. Such a distribution suggests that iridoid biosynthesis emerged in the Magnoliopsida common ancestor and subsequently experienced independent losses, for instance in Solanaceae and non-Rauvolfioideae. As expected, orthogroups associated with MIA biosynthesis are mainly identified in Rauvolfioideae species. In addition, most of these MIA biosynthesis-associated orthogroups could also be found in G. sempervirens, O. pumila and M. speciosa. Interestingly, as the phylogenetic distance from C. roseus increases, fewer orthogroups associated with MIA biosynthesis are identified. This detection pattern therefore suggests an independent acquisition of post-strictosidine aglycone modifying enzymes, probably linked to the increase in gene content in Rauvolfioideae species. This hypothesis could be strengthened by the future identification of genes coding for enzymes involved in MIA biosynthesis in Rubiaceae and/or Gelsemiaceae, provided that the new orthogroups associated with these genes are not detected in Rauvolfioideae species.
2.4. Evolutionary scenario of biosynthetic gene clusters in Apocynaceae
To study the conservation of gene organization between chromosomes and species, a synteny analysis between P. lamerei, its close relative V. thouarsii [62] and the C. roseus chromosome-scale genome [55] was performed (Supplementary Figure S3). The dot plot comparing these species indicated an apparent collinearity between scaffolds. There was a greater collinearity between the P. lamerei and V. thouarsii genomes than between the P. lamerei and C. roseus genomes. While the P. lamerei genome is only partially homologous with the C. roseus genome, all P. lamerei scaffolds are homologous to V. thouarsii scaffolds despite some major insertion/deletion events. Hence, given their greater collinearity, their phylogenomic linkage, their similar genome size, their proportion of TE and their LTR family ratio, P. lamerei and V. thouarsii are so far the most closely related species and therefore the closest species to the last common ancestor of Rauvolfioideae and non-Rauvolfioideae.
Given the low conservation of iridoid pathway genes and the absence of secologanin, we first examined the collinearity between P. lamerei and the genomic regions surrounding iridoid pathway genes in C. roseus. Although we were able to identify 3 highly conserved regions (Supplementary Figure S4), no syntenic region could be observed for 8HGO, IO, 7DLS, 7DLH, LAMT and SLS. It is worth noting that several tandem duplicates could be observed in both species for IRS and especially 7DLGT. These duplications could give rise to new functions via subfunctionalization throughout evolution [[71], [72], [73]]. It is also noteworthy that, despite a high conservation of G8H between C. roseus and P. lamerei (77.89% identity, 100% coverage), the P. lamerei ortholog is expressed at much lower levels in all analyzed samples than in Rauvolfioideae species (Supplementary Figure S5). Thus, the lack of collinearity in genomic regions surrounding most iridoid pathway genes, together with the low expression of G8H, suggests a progressive degeneration of this pathway, the evolutionary processes of which remain to be discovered.
In fungi, bacteria and plants, genes involved in specific metabolic pathways frequently group together in the same genomic region [74]. These groups of genes, referred to as biosynthetic gene clusters, have been described in several MIA-producing plant species including the iconic C. roseus [52,53,55,56], V. minor [57], R. stricta [58] and G. sempervirens [53]. Microsyntenic study of these gene clusters enabled the identification of key enzymes in MIA biosynthesis as exemplified with the recent discovery of vincadifformine 16-hydroxylase in V. minor [57]. It is worth noting that genes encoding STR and TDC often cluster together as in C. roseus, G. sempervirens, R. stricta and O. pumila [52,53,55,56,58,69]. A recent single-cell approach in C. roseus unraveled a MATE transporter in this region that was found to be involved in secologanin transport from the cytosol to the vacuole [56]. By microsyntenic analysis between C. roseus and P. lamerei, we identified a similar [STR-TDC-MATE] region in P. lamerei contig 1749 (Fig. 4A). This chromosomal portion of P. lamerei is flanked by two highly conserved regions and, although it lacks any STR-like sequences, exhibits high TDC (reverse strand) conservation. Interestingly, only one transporter-like gene can be found in P. lamerei whereas two MATE genes have been annotated in this region in C. roseus, similarly to G. sempervirens region [53]. Overall, these preliminary observations thus strengthen the hypothesis that the absence of MIA production in P. lamerei and non-Rauvolfioideae species results from STR deletion through evolution in Apocynaceae.
In such a scenario, transposable elements might have played a major role in the proposed evolutionary mechanism. Indeed, while we observe a significantly higher proportion of genes in the C. roseus cluster, the syntenic region in P. lamerei exhibits a specific TE enrichment in the corresponding chromosome portion (Fig. 4B). Similar observations can be made throughout the genome, which can explain the large difference in genome size between these two species. It has to be noted that, for each species, no significant differences in gene proportion can be observed between the cluster, its associated contig or chromosome and the whole genome. Conversely, the P. lamerei genome is significantly richer in TE than the cluster and contig 1749, and the C. roseus cluster is significantly enriched in TE compared to chromosome 5 and its whole genome (Fig. 4B). This observation thus reinforces the putative implication of TE in [STR-TDC-MATE] biosynthetic cluster evolution. In addition, STR in C. roseus is flanked by two SO:0000544 helitrons (Supplementary Figure S6). Interestingly, the proportion of this autonomous rolling-circle DNA transposon is significantly higher in the P. lamerei region than the C. roseus region (P. lamerei: 0.38/kb, C. roseus: 0.08/kb, P-value: 3.005e-06) while being smaller. Helitrons exhibit ‘copy-and-paste’ and ‘cut-and-paste’ modes of transposition [75] as well as an excisive mode [76]. Hence, the SO:0000544 helitron might have been a key player in the evolutionary mechanism leading to STR deletion in non-Rauvolfioideae species.
3. Conclusion
The comprehensive genomic exploration of the P. lamerei genome in comparison with other Apocynaceae species offers intriguing insights into the evolutionary dynamics within this plant family. The assembly of the P. lamerei genome reveals a genome size of 968.6 Mb, making it the second largest Apocynaceae genome known to date. Interestingly, P. lamerei and the other non-Rauvolfioideae genomes depart from the gene content typical of MIA-producing Apocynaceae genomes, with a lower gene count such as the 16,176 protein-coding genes of P. lamerei. This deviation raises questions about the genomic landscape of P. lamerei and its implications for specialized metabolism. The phylogenetic analysis places P. lamerei as an important species for the study of Apocynaceae evolution. Comparative genomics further reveals a similarity between P. lamerei and V. thouarsii, indicating a close relatedness of these two Apocynaceae species. The notable presence of TEs, especially LTR elements, in the P. lamerei (72.48%) and V. thouarsii (75.16% [62]) genomes raises questions about their role in shaping genomic architecture. The common ancestor of these two species may have undergone horizontal transfer of TE, a process known to contribute to genome diversification [77]. This phenomenon could then have led to a sharp increase in genome size, which over time contracted through the removal of DNA in other Apocynaceae species such as C. roseus. As the insertion and removal of TE is often imprecise, it can indirectly affect the surrounding sequences, which can lead to high duplication and reshuffling. These genomic rearrangements could explain the absence of certain MIA biosynthetic genes, including STR, in non-Rauvolfioideae species. Furthermore, TEs have been described to represent an important mechanism for gene evolution in rice and Poaceae [78,79].
Thus, they could also be a determining factor in the emergence of post-strictosidine aglycone modifying enzymes, as exemplified in rice [79]. This reinforces the potential interest of this genomic data set for the study of genetic and epigenetic mechanisms associated with the evolution, diversification and emergence of specialized metabolite biosynthetic pathways. Moreover, beyond widening our knowledge of MIA biosynthesis evolution, this new genome could improve our understanding of the emergence of cardenolides in Asclepiadoideae once this biosynthetic pathway is better elucidated.
4. Materials and methods
4.1. Sample collection, DNA extraction and sequencing
Pachypodium lamerei plants were purchased from A l'ombre des figuiers (achat-vente-palmiers.com). Nuclei were extracted from young leaves using a nuclei isolation protocol [80]. The nuclear pellet was resuspended in Qiagen buffer G2 with RNase A and proteinase K, and DNA extraction was continued according to the instructions of Qiagen's Genomic Tip/100G protocol (Qiagen, Venlo, The Netherlands). DNA-seq library preparation was performed by Future Genomics Technologies (Leiden, The Netherlands) using the Nextera Flex kit (Illumina, San Diego, USA) for Illumina sequencing and the ONT 1D ligation sequencing kit (Oxford Nanopore Technologies Ltd, Oxford, United Kingdom) for Nanopore sequencing. Illumina libraries were sequenced in paired-end mode (2x150 bp) using Illumina NovaSeq 6000 technology. ONT libraries were sequenced on a Nanopore PromethION flowcell (Oxford Nanopore Technologies Ltd, Oxford, United Kingdom) with the Guppy version 3.0.3 high-accuracy basecaller.
4.2. De novo genome assembly
The P. lamerei genome assembly was performed by Future Genomics Technologies (Leiden, The Netherlands). After adapter removal using Porechop [81], ONT reads were assembled using the Flye assembler (v.2.8.3, [82]). Contigs were polished twice with Illumina reads using Pilon (v.1.23, [83]).
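As an orientation for readers wishing to set up a comparable workflow, the sketch below builds (without executing) the command lines for adapter trimming, long-read assembly and one round of short-read polishing. It is not the pipeline run by the service provider: the file names, thread count, working directory and the pre-computed sorted BAM of Illumina reads aligned to the draft assembly are all assumptions, and only commonly documented options of Porechop, Flye and Pilon are used.

```python
import shlex


def assembly_commands(ont_fastq, illumina_bam, workdir="assembly",
                      threads=16, pilon_jar="pilon.jar"):
    """Build command lines for trimming, assembly and one polishing round.

    illumina_bam is assumed to be a sorted, indexed alignment of the
    Illumina reads against the Flye draft (alignment step not shown).
    """
    trimmed = f"{workdir}/ont.trimmed.fastq"
    flye_dir = f"{workdir}/flye"
    return [
        # Adapter removal from ONT reads
        ["porechop", "-i", ont_fastq, "-o", trimmed],
        # Long-read assembly; Flye writes assembly.fasta into --out-dir
        ["flye", "--nano-raw", trimmed, "--out-dir", flye_dir,
         "--threads", str(threads)],
        # First polishing round; a second round would repeat this step
        # after realigning the Illumina reads to the polished output
        ["java", "-jar", pilon_jar, "--genome", f"{flye_dir}/assembly.fasta",
         "--frags", illumina_bam, "--output", "polished_round1",
         "--outdir", workdir],
    ]


# Hypothetical input file names
commands = assembly_commands("ont_reads.fastq", "illumina_vs_draft.bam")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to hand them to `subprocess.run` once the paths have been adapted to a real dataset.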
4.3. RNA extraction and sequencing
Pachypodium lamerei plants were obtained from ikhebeencactus (ikhebeencactus.nl). RNA was extracted from liquid nitrogen flash-frozen leaves, spikes and roots using the PureLink RNA mini kit (Thermo Fisher Scientific, Illkirch-Graffenstaden, France). RNA library construction and sequencing were performed using Illumina NovaSeq 6000 technology. Raw RNA-seq data have been deposited under the SRA accession numbers SRR25397418 (leaves), SRR25397417 (spikes), SRR25397419 (roots).
4.4. Gene model prediction and functional annotation
RNAseq reads were aligned to the P. lamerei reference genome using HISAT2 (v.2.2.1 [84]). Subsequently, a transcriptome was assembled for each alignment using StringTie (v2.1.7 [85]) and the assemblies were merged using the StringTie merge option. A combination of BLASTX on predicted transcripts and BLASTp on TransDecoder (v5.5.0 [86]) predicted ORFs against the Uniprot database, as well as hmmscan (v3.1b2 [87]) against the PFAM database (https://pfam.xfam.org/), was used to assess the putative function of each gene model.
A similar strategy was used for gene modeling and gene functional annotation of A. venetum, A. syriaca and C. procera. The genomic sequence for each species was retrieved from GenBank accession numbers GCA_019593545.1, GCA_027405835.1 and GCA_004801955.1, respectively. RNAseq reads were retrieved for each species from SRA accession numbers: SRR17163778 (leaves), SRR17163779 (stems) and SRR17163780 (roots) for A. venetum; SRR5117431 (buds) for A. syriaca; and SRR8281638, SRR8281639, SRR8281640, SRR8281641, SRR8281642 and SRR8281643 (all corresponding to leaves under different salinity conditions) for C. procera.
4.5. Assembly completeness assessment
Merqury (v.1.3 [88]) and the stat program from the BBMap tool (v.38.94 [89]) were used to assess P. lamerei genome assembly quality. To evaluate assembly and gene model completeness, Benchmarking Universal Single-Copy Orthologs (BUSCO v.5.2.2 [90]) was run with default parameters against a plant-specific database of 2326 single-copy orthologs (eudicots_odb10). Gene model statistics were retrieved using agat_sp_statistics from the AGAT package (v.0.8.0 [91]).
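Assembly quality is commonly summarized with contiguity metrics such as N50. As an illustration only, the following minimal Python sketch computes N50 from a list of contig lengths; the function name and the toy lengths are hypothetical and are not taken from the tools or data used in this study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contig lengths supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for non-empty input


# Hypothetical toy assembly (lengths in bp), not data from this study.
example_lengths = [100, 200, 300, 400, 500]
example_n50 = n50(example_lengths)
```

In practice the contig lengths would be read from the assembly FASTA file; dedicated tools report this metric together with many others.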
4.6. Transposable elements prediction and annotation
Transposable elements (TE) were identified and annotated using the sensitive mode of the Extensive de novo TE annotator (EDTA v.1.9.5 [92]). This pipeline combines long-terminal repeat (LTR) annotation using LTR_finder (v.1.07 [93]) and LTRharvest included in GenomeTools (v.1.5.10 [94]); terminal inverted repeat (TIR) annotation using Generic repeat finder (v.1.0 [95]) and TIR-learner (v.2.5 [96]); and Helitron annotation using HelitronScanner (v.1.1 [97]). In addition, TE size thresholds are applied to avoid erroneous findings. Tandem repeats and short sequences are thereby defined as TIR shorter than 80 bp, LTR shorter than 100 bp and Helitrons shorter than 100 bp. LTR candidates are further filtered using LTR_retriever (v.2.9.0 [98]) to prevent erroneous LTR findings. If TIR candidates do not exceed 600 bp, they are categorized as MITEs. EDTA advanced filters are used for additional filtering of TIR and Helitron candidates (for further information, see Ref. [92]). The resulting TE library is then used to mask the genome. RepeatModeler (v.2.0.1, default settings [99]) is then used on the unmasked portion of the genome to find non-LTR retrotransposons and unclassified TE that have eluded structure-based TE identification methods. Finally, gene-related sequences are removed by EDTA using the supplied CDS sequences.
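The size thresholds described above can be pictured with a small Python sketch. This is not EDTA code: the function name, the class labels and the handling of the exact boundary values (e.g. whether a 600 bp TIR counts as a MITE) are assumptions made for illustration from the wording of this section.

```python
# Minimum lengths (bp) below which a candidate is treated as a tandem
# repeat or short sequence and discarded, as stated in the text.
MIN_LENGTH_BP = {"TIR": 80, "LTR": 100, "Helitron": 100}

# TIR candidates that do not exceed this length are labelled MITEs.
MITE_MAX_LENGTH_BP = 600


def classify_te_candidate(te_class: str, length_bp: int) -> str:
    """Return 'discarded', 'MITE', or the original class label.

    Hypothetical helper; boundary handling is an assumption.
    """
    if te_class not in MIN_LENGTH_BP:
        raise ValueError(f"unknown TE class: {te_class}")
    if length_bp < MIN_LENGTH_BP[te_class]:
        return "discarded"
    if te_class == "TIR" and length_bp <= MITE_MAX_LENGTH_BP:
        return "MITE"
    return te_class
```

For example, under these assumptions a 450 bp TIR candidate would be labelled a MITE, whereas a 90 bp LTR candidate would be discarded.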
4.7. Whole-genome duplication analysis
Transcript sequences of P. lamerei, A. venetum, A. syriaca, C. procera, V. thouarsii [62], V. minor [57], C. roseus [55], Arabidopsis thaliana [100], Mitragyna speciosa [101], Solanum lycopersicum [102], C. acuminata [103], C. gigantea [104], G. sempervirens [53], Vitis vinifera [105] and Ophiorrhiza pumila [69] were used to infer whole-genome duplication (WGD) events using the DupPipe pipeline [106]. Each dataset was subjected to discontiguous MegaBLAST [107,108] to find duplicated gene pairs (40% sequence similarity over 300 bp). Using BLASTx (v.2.6.0–1 [109]), the open reading frame of each gene pair was determined against NCBI's plant RefSeq protein database (as of May 21, 2021), retaining only the best-hit sequence (sequence similarity threshold: 30% across 150 aa). Following that, DNA sequences were aligned using GeneWise [110] against the best-hit homologous protein and its translation. For each gene pair, MUSCLE (v.3.6 [111]) aligned the amino acid sequences, which then served as a guide for RevTrans (v.1.4 [112]) to align the nucleic acids. In order to assess the divergence times between gene pairs, substitutions per synonymous site (Ks) were calculated using the F3x4 model of codeml from the PAML software (v.4.9 [113]).
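The first filtering step (duplicated gene pairs with 40% similarity over 300 bp) can be sketched as follows. This is an illustrative Python sketch, not DupPipe code: the `PairHit` record, the use of percent identity as the similarity measure, and the rule of keeping the longest alignment per unordered pair are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class PairHit:
    """Hypothetical record for one within-species alignment hit."""
    gene_a: str
    gene_b: str
    percent_identity: float
    alignment_length_bp: int


def keep_duplicate_pairs(
    hits: Iterable[PairHit],
    min_identity: float = 40.0,
    min_length_bp: int = 300,
) -> List[PairHit]:
    """Keep one hit per unordered gene pair that passes both thresholds."""
    best: Dict[Tuple[str, str], PairHit] = {}
    for hit in hits:
        if hit.gene_a == hit.gene_b:
            continue  # a gene aligned to itself is not a duplicate pair
        if hit.percent_identity < min_identity:
            continue
        if hit.alignment_length_bp < min_length_bp:
            continue
        key = tuple(sorted((hit.gene_a, hit.gene_b)))
        current = best.get(key)
        if current is None or hit.alignment_length_bp > current.alignment_length_bp:
            best[key] = hit
    return list(best.values())
```

The retained pairs would then proceed to ORF determination, codon alignment and Ks estimation as described above.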
4.8. Orthology analysis and phylogenetic tree reconstruction
Protein sequences of at least 30 amino acids from P. lamerei, A. venetum, A. syriaca, C. procera, V. thouarsii [62], V. minor [57], C. roseus [55], A. thaliana [100], M. speciosa [101], S. lycopersicum [102], C. acuminata [103], C. gigantea [104], G. sempervirens [53], V. vinifera [105] and O. pumila [69] were compared in order to build gene families. CD-HIT (v.4.7 [114]) was used for each species. In each CD-HIT cluster, the longest representative protein was selected. The resulting protein datasets were used as input for OrthoFinder (v.2.5.4 [115]) using the following parameters: -S diamond -M msa -A muscle -T raxml-ng. From the 387 single-copy orthogroups, a maximum-likelihood phylogenetic tree was built. In order to identify orthogroup loss and growth across the phylogenetic tree, Cafe5 (v.4.2.1 [116]) was employed.
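The pre-processing described above (minimum protein length, then one longest representative per cluster) can be sketched in Python. The sketch assumes that cluster membership has already been parsed from the clustering output into lists of protein identifiers; the function name, the data layout and the tie-breaking rule are hypothetical.

```python
from typing import Dict, Iterable, List


def longest_representatives(
    proteins: Dict[str, str],
    clusters: Iterable[List[str]],
    min_length_aa: int = 30,
) -> Dict[str, str]:
    """Pick the longest protein of each cluster, ignoring short proteins.

    `proteins` maps protein ID to amino acid sequence; `clusters` lists
    the member IDs of each cluster. Ties are broken by ID order, which
    is an arbitrary choice made for reproducibility.
    """
    selected: Dict[str, str] = {}
    for cluster in clusters:
        members = sorted(
            pid for pid in cluster if len(proteins[pid]) >= min_length_aa
        )
        if not members:
            continue  # every member was shorter than the threshold
        best = max(members, key=lambda pid: len(proteins[pid]))
        selected[best] = proteins[best]
    return selected
```

The selected sequences for each species would then be written to one FASTA file per species before orthogroup inference.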
4.9. Synteny analysis
The P. lamerei genome was aligned with the V. thouarsii and C. roseus genomes using minimap2 (v.2.24 [117]) and the following options: -cx asm20 --cs. D-Genies (https://dgenies.toulouse.inra.fr/ [118]) was used to visualize the generated paf file, selecting hits with at least 80% identity and ranking contigs by size. Syntenic regions between the STR/TDC/MATE1 gene cluster of C. roseus and the TDC/MATE1 gene cluster of P. lamerei were compared using BLASTN (v.2.6.0–1 [109]). The resulting hits between the clusters were filtered to only include alignments meeting an e-value threshold of 1e-6 and an alignment length threshold of 1500 bp, and alignments were visualized using the R genoPlotR library (v.0.8.11 [119]). Analysis of gene and TE content in microsyntenic regions was performed by comparing their proportion in the region of interest to their proportion in the corresponding contig by an exact Poisson test with the poisson.test function implemented in the stats package (v.4.1.1) in R (v.4.1.1 [120]).
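To make the logic of the exact Poisson comparison concrete, the sketch below re-implements a two-sided exact Poisson test in plain Python. It is an illustration under stated assumptions, not the analysis code of this study: the way counts and lengths are mapped to an expected value (contig-wide density multiplied by region length) is an assumption, the function names are hypothetical, and the two-sided rule (summing the probabilities of all outcomes no more likely than the observed one) is intended to mirror the usual exact-test convention.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Poisson probability of observing exactly k events."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def poisson_two_sided_p(observed: int, expected: float) -> float:
    """Two-sided exact p-value: total probability of all outcomes that
    are no more likely than the observed count."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    # Small relative tolerance so that ties are counted despite rounding.
    threshold = poisson_pmf(observed, expected) * (1 + 1e-7)
    mode = math.floor(expected)
    p_value = 0.0
    for k in range(mode + 1):  # outcomes at or below the mode
        p_k = poisson_pmf(k, expected)
        if p_k <= threshold:
            p_value += p_k
    k = mode + 1
    while True:  # outcomes above the mode, until terms are negligible
        p_k = poisson_pmf(k, expected)
        if p_k <= threshold:
            if p_k == 0.0 or p_k < p_value * 1e-17:
                break
            p_value += p_k
        k += 1
    return min(1.0, p_value)


def region_vs_contig_p(region_count: int, region_length_bp: int,
                       contig_count: int, contig_length_bp: int) -> float:
    """Compare a region's feature count with the contig-wide density."""
    expected = contig_count / contig_length_bp * region_length_bp
    return poisson_two_sided_p(region_count, expected)
```

With hypothetical numbers, `region_vs_contig_p(3, 1000, 10, 10000)` tests three features observed in a 1 kb region against an expectation of one feature per kb on the contig.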
4.10. Transcript abundance estimation
Salmon (v.0.6.0 [121]) with -biasCorrect and -vbo options was used to count RNAseq reads after they were pseudo-aligned onto the predicted transcripts. Abundance estimates were established as transcripts per million (TPM) and are presented in Supplementary Table S24.
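For readers unfamiliar with the unit, TPM normalizes each transcript's read count by its length and then rescales so that all values sum to one million. The Python sketch below shows only this definition; it is not how Salmon computes its estimates internally, and the function name and inputs are hypothetical.

```python
from typing import List, Sequence


def tpm(read_counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Convert per-transcript read counts to transcripts per million.

    `lengths_bp` holds the (effective) transcript lengths, in the same
    order as `read_counts`.
    """
    if len(read_counts) != len(lengths_bp):
        raise ValueError("counts and lengths must have the same size")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("transcript lengths must be positive")
    # Reads per base for each transcript.
    rates = [count / length for count, length in zip(read_counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    return [rate / total * 1_000_000 for rate in rates]
```

Because the values always sum to one million within a sample, TPM expresses relative abundance within that sample rather than absolute expression.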
4.11. Targeted metabolomics analysis
To assess the presence of MIAs and cardenolides, a targeted metabolomic analysis was carried out. Metabolites were extracted by vortexing 50 mg of lyophilized leaf powder with 1 mL of methanol containing 0.1% formic acid. Extracts were injected into the ultra-performance liquid chromatography system ACQUITY (Waters, Milford, MA, USA) coupled to the single quadrupole mass spectrometer SQD2 (Waters, Milford, MA, USA) using conditions described in Ref. [122] over an 18 min gradient. The presence of tryptophan, secologanin and strictosidine, three key metabolites in MIA biosynthesis, as well as progesterone for cardenolide biosynthesis, was then checked by injecting the corresponding standards. A selected ion monitoring mode was used to collect data for tryptophan (m/z 205), secologanin (m/z 389), strictosidine (m/z 531.2) and progesterone (m/z 315).
To assess the presence of flavonoids, a similar analysis was carried out on C. roseus and P. lamerei. Young and old leaves of P. lamerei and C. roseus and C. roseus flowers were collected and immediately flash-frozen in liquid nitrogen. Samples were freeze-dried for 48 h and subsequently ground using an MM400 ball grinder (Retsch GmbH, Haan, Germany). For each sample, 20 mg of powder were added to 1 mL of 80% methanol and sonicated for 30 min. After 10 min centrifugation at 18,000×g and 4 °C, 600 μL of supernatant were retrieved and diluted five-fold with the mobile phase (milliQ water, 5% acetonitrile, 0.1% formic acid). Extracts were then injected into the ultra-performance liquid chromatography system ACQUITY (Waters, Milford, MA, USA) coupled to the single quadrupole mass spectrometer XEVO (Waters, Milford, MA, USA) using conditions described in Ref. [123]. A selected ion monitoring mode was used to collect data for phenylalanine (m/z 166), naringenin (m/z 273), kaempferol (m/z 285), kaempferol-3-glucoside (m/z 449), quercetin (m/z 301), quercetin-3-glucoside (m/z 465), pelargonidin (m/z 272), cyanidin (m/z 288), cyanidin-3-glucoside (m/z 485), catechin (m/z 291) and epicatechin (m/z 291). To confirm metabolite identification, the corresponding standards were injected for all targets.
Grant information
This work was supported by EU Horizon 2020 research and innovation program [MIAMi project-Grant agreement N°814645]; ARD CVL Biopharmaceutical program of the Région Centre-Val de Loire [ETOPOCentre project]; ARP-IR from the Région Centre-Val de Loire [ScaleBio project] and ANR [project MIACYC – ANR-20-CE43-0010].
Data availability statement
Raw DNA-seq data, raw RNA-seq data and the genome assembly have been deposited in the NCBI database under the BioProject accession number PRJNA997810 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA997810). The genome annotation, its functional annotation, transcript sequences, predicted CDS and protein sequences, as well as a database containing functionally characterized proteins involved in the shikimate, methylerythritol phosphate (MEP), iridoid, MIA, steroid and cardenolide pathways, are available on figshare (Cuello et al., 2023, https://doi.org/10.6084/m9.figshare.23126816).
CRediT authorship contribution statement
Clément Cuello: Writing – review & editing, Writing – original draft, Visualization, Methodology, Investigation, Formal analysis, Data curation. Hans J. Jansen: Methodology, Investigation, Formal analysis, Data curation. Cécile Abdallah: Writing – review & editing, Methodology, Investigation. Duchesse-Lacours Zamar Mbadinga: Writing – review & editing, Investigation. Caroline Birer-Williams: Writing – review & editing, Investigation. Mickael Durand: Writing – review & editing, Investigation. Audrey Oudin: Investigation. Nicolas Papon: Writing – review & editing, Investigation. Nathalie Giglioli-Guivarc'H: Writing – review & editing, Investigation. Ron P. Dirks: Validation, Formal analysis, Data curation. Michael Krogh Jensen: Project administration, Funding acquisition, Conceptualization. Sarah Ellen O'Connor: Project administration, Funding acquisition, Conceptualization. Sébastien Besseau: Writing – review & editing, Validation, Supervision, Investigation, Conceptualization. Vincent Courdavault: Writing – review & editing, Writing – original draft, Validation, Supervision, Project administration, Funding acquisition, Conceptualization.
Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Vincent Courdavault reports financial support was provided by Horizon Europe. Michael Krogh Jensen reports a relationship with BioMIA that includes: board membership and employment. Ron Dirks reports a relationship with Future Genomics Technologies that includes: board membership and employment. Hans Jansen reports a relationship with Future Genomics Technologies that includes: board membership and employment. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors benefited from the use of the cluster at the Centre de Calcul Scientifique en région Centre-Val de Loire.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e28078.
Contributor Information
Sébastien Besseau, Email: sebastien.besseau@univ-tours.fr.
Vincent Courdavault, Email: vincent.courdavault@univ-tours.fr.
References
- 1.Patil R.H., Patil M.P., Maheshwari V.L. Apocynaceae Plants: Ethnobotany, Phytochemistry, Bioactivity and Biotechnological Advances. Springer Nature Singapore; Singapore: 2023. Morphology, ecology, taxonomy, diversity, habitat and geographical distribution of the Apocynaceae family; pp. 1–11. [DOI] [Google Scholar]
- 2.Dey A., Mukherjee A., Chaudhury M. Alkaloids from Apocynaceae: origin, pharmacotherapeutic properties, and structure-activity studies. Stud. Nat. Prod. Chem. 2017;52:373–488. doi: 10.1016/B978-0-444-63931-8.00010-2. [DOI] [Google Scholar]
- 3.Fishbein M., Livshultz T., Straub S.C., Simões A.O., Boutte J., McDonnell A., Foote A. Evolution on the backbone: Apocynaceae phylogenomics and new perspectives on growth forms, flowers, and fruits. Am. J. Bot. 2018;105(3):495–513. doi: 10.1002/ajb2.1067. [DOI] [PubMed] [Google Scholar]
- 4.Ollerton J., Liede-Schumann S., Endress M.E., Meve U., Rech A.R., Shuttleworth A., Keller H.A., Fishbein M., Alvarado-Cardenas L.O., Amorim F.W., Bernhardt P., Celep F., Chirango Y., Chiriboga-Arroyo F., Civeyrel L., Cocucci A., Cranmer L., da Silva-Batista I.C., de Jager Linde, Depra M.S., Domingos-Melo A., Dvorsky C., Agostini K., Freitas L., Gaglianone M.C., Galetto L., Gilbert M., Gonzalez-Ramirez I., Gorostiague P., Goyder D., Hachuy-Filho L., Heiduk A., howrd A., Ionta G., Isls-Hernandez S.V., Johnson S.D., Joubert L., Kaiser-Bunbury C.N., Kephart S., Kidyoo A., Koptur S., Koschnitzke C., Lamborn E., Livshuktz T., Machado I.C., Marino S., Mema L., Mochizuki K., Cerdeira Morellato L.P., Mrisha C.K., Muiruri E.W., Nakahama N., Nascimento V.T., Nuttman C., Oliveira P.O., Peter C.I., Punekar S., Rafferty N., Rapini A., Ren Z.X., Rodrigez-Flores C.I., Rosero L., Sakai S., Sazima M., Steenhuisen S.L., Tan C.W., Torres C., Trojelsgaard K., Ushimaru A., Vieira M.F., Wiemer A.P., Yamashiro T., Nadia T., Queiroz J., Quirino Z. The diversity and evolution of pollination systems in large plant clades: Apocynaceae as a case study. Ann. Bot. 2019;123(2):311–325. doi: 10.1093/aob/mcy127. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Mohammed A.E., Abdul-Hameed Z.H., Alotaibi M.O., Bawakid N.O., Sobahi T.R., Abdel-Lateff A., Alarif W.M. Chemical diversity and bioactivities of monoterpene indole alkaloids (MIAs) from six Apocynaceae genera. Molecules. 2021;26(2):488. doi: 10.3390/molecules26020488. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.O'Connor S.E., Maresh J.J. Chemistry and biology of monoterpene indole alkaloid biosynthesis. Nat. Prod. Rep. 2006;23:532–547. doi: 10.1039/B512615K. [DOI] [PubMed] [Google Scholar]
- 7.Wen S., Chen Y., Lu Y., Wang Y., Ding L., Jiang M. Cardenolides from the Apocynaceae family and their anticancer activity. Fitoterapia. 2016;112:74–84. doi: 10.1016/j.fitote.2016.04.023. [DOI] [PubMed] [Google Scholar]
- 8.Züst T., Strickler S.R., Powell A.F., Mabry M.E., An H., Mirzaei M., York T., Holland C.K., Kumar P., Erb M., Petschenka G., Gomez J.M., Perfectti F., Müller C., Pires J.C., Mueller L.A., Jander G. Independent evolution of ancestral and novel defenses in a genus of toxic plants (Erysimum, Brassicaceae) Elife. 2020;9 doi: 10.7554/eLife.51712. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Calderón-Montaño J.M., Burgos-Morón E., Orta M.L., Maldonado-Navas D., García-Domínguez I., López-Lázaro M. Evaluating the cancer therapeutic potential of cardiac glycosides. BioMed Res. Int. 2014;2014 doi: 10.1155/2014/794930. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Hafner S., Schmiech M., Lang S.J. The cardenolide glycoside acovenoside a interferes with epidermal growth factor receptor trafficking in non-small cell lung cancer cells. Front. Pharmacol. 2021;12 doi: 10.3389/fphar.2021.611657. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Agrawal A.A., Petschenka G., Bingham R.A., Weber M.G., Rasmann S. Toxic cardenolides: chemical ecology and coevolution of specialized plant–herbivore interactions. New Phytol. 2012;194(1):28–45. doi: 10.1111/j.1469-8137.2011.04049.x. [DOI] [PubMed] [Google Scholar]
- 12.Pandey A., Swarnkar V., Pandey T., Srivastava P., Kanojiya S., Mishra D.K., Tripathi V. Transcriptome and metabolite analysis reveal candidate genes of the cardiac glycoside biosynthetic pathway from Calotropis procera. Sci. Rep. 2016;6(1) doi: 10.1038/srep34464. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Caspi E., Lewis D.O., Piatak D.M., Thimann K.V., Winter A. Biosynthesis of plant sterols. Conversion of cholesterol to pregnenolone in Digitalis purpurea. Experientia. 1966;22(8):506–507. doi: 10.1007/BF01898654. [DOI] [Google Scholar]
- 14.Caspi E., Lewis D.O. Progesterone: its possible role in the biosynthesis of cardenolides in Digitalis lanata. Science. 1967;156(3774):519–520. doi: 10.1126/science.156.3774.519. [DOI] [PubMed] [Google Scholar]
- 15.Milek F., Reinhard E., Kreis W.J.P.P. Influence of precursors and inhibitors of the sterol pathway on sterol and cardenolide metabolism in Digitalis lanata Ehrh. Plant Physiol. Biochem. 1997;35(2):111–121. [Google Scholar]
- 16.Kunert M., Langley C., Lucier R., Ploss K., Rodríguez López C.E., Serna Guerrero D.A., Rothe E., O'Connor S.E., Sonawane P.D. Promiscuous CYP87A enzyme activity initiates cardenolide biosynthesis in plants. Nat. Plants. 2023:1–11. doi: 10.1038/s41477-023-01515-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Munkert J., Ernst M., Müller-Uri F., Kreis W. Identification and stress-induced expression of three 3β-hydroxysteroid dehydrogenases from Erysimum crepidifolium Rchb. and their putative role in cardenolide biosynthesis. Phytochemistry. 2014;100:26–33. doi: 10.1016/j.phytochem.2014.01.006. [DOI] [PubMed] [Google Scholar]
- 18.Munkert J., Costa C., Budeanu O., Petersen J., Bertolucci S., Fischer G., Müller-Uri F., Kreis W. Progesterone 5β‐reductase genes of the Brassicaceae family as function‐associated molecular markers. Plant Biol. 2015;17(6):1113–1122. doi: 10.1111/plb.12361. [DOI] [PubMed] [Google Scholar]
- 19.Herl V., Frankenstein J., Meitinger N., Müller-Uri F., Kreis W. Δ5-3β-Hydroxysteroid dehydrogenase (3βHSD) from Digitalis lanata. Heterologous expression and characterisation of the recombinant enzyme. Planta Med. 2007;73(7):704–710. doi: 10.1055/s-2007-981537. [DOI] [PubMed] [Google Scholar]
- 20.Herl V., Albach D.C., Müller-Uri F., Bräuchler C., Heubl G., Kreis W. Using progesterone 5 β-reductase, a gene encoding a key enzyme in the cardenolide biosynthesis, to infer the phylogeny of the genus Digitalis. Plant Systemat. Evol. 2008;271:65–78. doi: 10.1007/s00606-007-0616-0. [DOI] [Google Scholar]
- 21.Macabeo A.P., Alejandro G.J., Hallare A.V., Vidar W.S., Villaflores O.B. Phytochemical survey and pharmacological activities of the indole alkaloids in the genus Voacanga Thouars (Apocynaceae) - an update. Phcog. Rev. 2009;3:132–142. [Google Scholar]
- 22.St-Pierre B., Besseau S., Clastre M., Courdavault V., Courtois M., Creche J., Ducos E., Dugé de Bernonville T., Dutilleul C., Glévarec G., Imbault N., Lanoue A., Oudin A., Papon N., Pichon O., Giglioli-Guivarc’h N. Deciphering the evolution, cell biology and regulation of monoterpene indole alkaloids. Adv. Bot. Res. 2013;68:73–109. doi: 10.1016/B978-0-12-408061-4.00003-1. Academic Press. [DOI] [Google Scholar]
- 23.Maeda H., Dudareva N. The shikimate pathway and aromatic amino acid biosynthesis in plants. Annu. Rev. Plant Biol. 2012;63:73–105. doi: 10.1146/annurev-arplant-042811-105439. [DOI] [PubMed] [Google Scholar]
- 24.De Luca V., Marineau C., Brisson N. Molecular cloning and analysis of cDNA encoding a plant tryptophan decarboxylase: comparison with animal dopa decarboxylases. Proc. Natl. Acad. Sci. USA. 1989;86(8):2582–2586. doi: 10.1073/pnas.86.8.2582. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Contin A., van der Heijden R., Lefeber A.W., Verpoorte R. The iridoid glucoside secologanin is derived from the novel triose phosphate/pyruvate pathway in a Catharanthus roseus cell culture. FEBS Lett. 1998;434(3):413–416. doi: 10.1016/s0014-5793(98)01022-9. [DOI] [PubMed] [Google Scholar]
- 26.Miettinen K., Dong L., Navrot N., Schneider T., Burlat V., Pollier J., Woittiez L., van der Krol S., Lugan R., Ilc T., Verpoorte R., Oksman-Caldentey K.M., Martinoia E., Bouwmeester H., Goossens A., Memelink J., Werck-Reichhart D. The seco-iridoid pathway from Catharanthus roseus. Nat. Commun. 2014;5(1):3606. doi: 10.1038/ncomms4606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Irmler S., Schröder G., St-Pierre B., Crouch N.P., Hotze M., Schmidt J., Strack D., Matern U., Schröder J. Indole alkaloid biosynthesis in Catharanthus roseus: new enzyme activities and identification of cytochrome P450 CYP72A1 as secologanin synthase. Plant J. 2000;24(6):797–804. doi: 10.1046/j.1365-313x.2000.00922.x. [DOI] [PubMed] [Google Scholar]
- 28.Murata J., Roepke J., Gordon H., De Luca V. The leaf epidermome of Catharanthus roseus reveals its biochemical specialization. Plant Cell. 2008;20(3):524–542. doi: 10.1105/tpc.107.056630. Epub 2008 Mar 7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Dugé de Bernonville T., Foureau E., Parage C., Lanoue A., Clastre M., Londono M.A., Oudin A., Houillé B., Papon N., Besseau S., Glévarec G., Atehortùa L., Giglioli-Guivarc'h N., St-Pierre B., De Luca V., O'Connor S.E., Courdavault V. Characterization of a second secologanin synthase isoform producing both secologanin and secoxyloganin allows enhanced de novo assembly of a Catharanthus roseus transcriptome. BMC Genom. 2015;16(1):619. doi: 10.1186/s12864-015-1678-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Kulagina N., Méteignier L.V., Papon N., O'Connor S.E., Courdavault V. More than a Catharanthus plant: a multicellular and pluri-organelle alkaloid-producing factory. Curr. Opin. Plant Biol. 2022;67 doi: 10.1016/j.pbi.2022.102200. [DOI] [PubMed] [Google Scholar]
- 31.Treimer J.F., Zenk M.H. Purification and properties of strictosidine synthase, the key enzyme in indole alkaloid formation. Eur. J. Biochem. 1979;101:225–233. doi: 10.1111/j.1432-1033.1979.tb04235.x. [DOI] [PubMed] [Google Scholar]
- 32.McKnight T.D., Roessner C.A., Devagupta R., Scott A.I., Nessler C.L. Nucleotide sequence of a cDNA encoding the vacuolar protein strictosidine synthase from Catharanthus roseus. Nucleic Acids Res. 1990;18(16):4939. doi: 10.1093/nar/18.16.4939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Geerlings A., Ibanez M.M.L., Memelink J., van der Heijden R., Verpoorte R. Molecular cloning and analysis of strictosidine β-D-glucosidase, an enzyme in terpenoid indole alkaloid biosynthesis in Catharanthus roseus. J. Biol. Chem. 2000;275(5):3051–3056. doi: 10.1074/jbc.275.5.3051. [DOI] [PubMed] [Google Scholar]
- 34.Carqueijeiro I., Koudounas K., Dugé de Bernonville T., Sepúlveda L.J., Mosquera A., Bomzan D.P., Oudin A., Lanoue A., Besseau S., Lemoz Cruz P., Kulagina N., Stander E.A., Eymieux S., Burlaud-Gaillard J., Blanchard E., Clastre M., Atehortùa L., St-Pierre B., Giglioli-Guivarc'h N., Papon N., Nagegowda D.A., O'Connor S.E., Courdavault V. Alternative splicing creates a pseudo-strictosidine β-D-glucosidase modulating alkaloid synthesis in Catharanthus roseus. Plant Physiol. 2021;185(3):836–856. doi: 10.1093/plphys/kiaa075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Caputi L., Franke J., Farrow S.C., Chung K., Payne R.M.E., Nguyen T.-D., Dang T.-T.T., Soares Teto Carqueijeiro I., Koudounas K., Dugé de Bernonville T., Ameyaw B., Jones D.M., Vierira I.J.C., Courdavault V., O'Connor S.E. Missing enzymes in the biosynthesis of the anticancer drug vinblastine in Madagascar periwinkle. Science. 2018;360(6394):1235–1239. doi: 10.1126/science.aat4100. [DOI] [PubMed] [Google Scholar]
- 36.Stavrinides A., Tatsis E.C., Caputi L., Foureau E., Stevenson C.E., Lawson D.M., Courdavault V., O'connor S.E. Structural investigation of heteroyohimbine alkaloid synthesis reveals active site elements that control stereoselectivity. Nat. Commun. 2016;7(1) doi: 10.1038/ncomms12116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Stavrinides A., Tatsis E.C., Foureau E., Caputi L., Kellner F., Courdavault V., O'Connor S.E. Unlocking the diversity of alkaloids in Catharanthus roseus: nuclear localization suggests metabolic channeling in secondary metabolism. Chem. Biol. 2015;22(3):336–341. doi: 10.1016/j.chembiol.2015.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Olawale F., Adetunji T.L., Adetunji A.E., Iwaloye O., Folorunso I.M. The therapeutic value of alstonine: an updated review. South Afr. J. Bot. 2023;152:288–295. doi: 10.1016/j.sajb.2022.11.047. [DOI] [Google Scholar]
- 39.Stander E.A., Lehka B., Carqueijeiro I., Cuello C., Hansson F.G., Jensen H.J., Dugé de Bernonville T., Birer Williams C., Verges V., Lezin E., Lorensen M.D.B.B., Dang T.T., Oudin A., Lanoue A., Durand M., Giglioli Guivarc’h N., Janfelt C., Papon N., Dirks R.P., O'Connor S.E., Jensen M.K., Besseau S., Courdavault V. The Rauvolfia tetraphylla genome suggests multiple distinct biosynthetic routes for yohimbane monoterpene indole alkaloids. Commun. Biol. 2023;6:1197. doi: 10.1038/s42003-023-05574-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Lezin E., Carqueijeiro I., Cuello C., Durand M., Jansen H.J., Verges V., Birer Williams C., Oudin A., Dugé de Bernonville T., Petrignet J., Celton N., St-Pierre B., Papon N., Sun C., Dirks R.P., O'Connor S.E., Jensen M.K., Besseau S., Courdavault V. A chromosome-scale genome assembly of Rauvolfia tetraphylla facilitates identification of the complete ajmaline biosynthetic pathway. Plant Comm. 2023 doi: 10.1016/j.xplc.2023.100784. [DOI] [PubMed] [Google Scholar]
- 41.Ruppert M., Ma X., Stockigt J. Alkaloid biosynthesis in Rauvolfia-cDNA cloning of major enzymes of the ajmaline pathway. Curr. Org. Chem. 2005;9(15):1431–1444. doi: 10.2174/138527205774370540. [DOI] [Google Scholar]
- 42.Stöckigt J., Panjikar S., Ruppert M., Barleben L., Ma X., Loris E., Hill M. The molecular architecture of major enzymes from ajmaline biosynthetic pathway. Phytochemistry Rev. 2007;6:15–34. doi: 10.1007/s11101-006-9016-2. [DOI] [Google Scholar]
- 43.Dos Santos R.G., Bouso J.C., Hallak J.E. The antiaddictive effects of ibogaine: a systematic literature review of human studies. Journal of Psychedelic Studies. 2017;1(1):20–28. doi: 10.1556/2054.01.2016.001. [DOI] [Google Scholar]
- 44.Farrow S.C., Kamileen M.O., Caputi L., Bussey K., Mundy J.E., McAtee R.C., Stephenson C.R.J., O'Connor S.E. Biosynthesis of an anti-addiction agent from the iboga plant. J. Am. Chem. Soc. 2019;141(33):12979–12983. doi: 10.1021/jacs.9b05999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Ahmad B., Banerjee A., Tiwari H., Jana S., Bose S., Chakrabarti S. Structural and functional characterization of the Vindoline biosynthesis pathway enzymes of Catharanthus roseus. J. Mol. Model. 2018;24:1–14. doi: 10.1007/s00894-018-3590-2. [DOI] [PubMed] [Google Scholar]
- 46.Goboza M., Aboua Y.G., Chegou N., Oguntibeju O.O. Vindoline effectively ameliorated diabetes-induced hepatotoxicity by docking oxidative stress, inflammation and hypertriglyceridemia in type 2 diabetes-induced male Wistar rats. Biomed. Pharmacother. 2019;112 doi: 10.1016/j.biopha.2019.108638. [DOI] [PubMed] [Google Scholar]
- 47.Courdavault V., O'Connor S.E., Oudin A., Besseau S., Papon N. Towards the microbial production of plant-derived anticancer drugs. Trends in Cancer. 2020;6(6):444–448. doi: 10.1016/j.trecan.2020.02.004. [DOI] [PubMed] [Google Scholar]
- 48.Kulagina N., Guirimand G., Melin C., Lemos‐Cruz P., Carqueijeiro I., De Craene J.O., Oudin A., Heredia V., Koudounas K., Unlubayir M., Lanoue A., Imbault N., St-Pierre B., Papon N., Clastre M., Giglioli-Guivarc'h N., Marc J., Besseau S., Courdavault V. Enhanced bioproduction of anticancer precursor vindoline by yeast cell factories. Microb. Biotechnol. 2021;14(6):2693–2699. doi: 10.1111/1751-7915.13898. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Zhang J., Hansen L.G., Gudich O., Viehrig K., Lassen L.M., Schrübbers L., Adhikari K.B., Rubaszka P., Carrasquer-Alvarez E., Chen L., D'Ambrosio V., Lehka B., Haidar A.K., Nallapareddy S., Giannakou K., Laloux M., Arsovska D., Jorgensen M.A.K., Chan L.J.G., Kristensen M., Christensen H.B., Sudarsan S., Stander E.A., Baidoo E., Petzold C.J., Wulff T., O'Connor S.E., Courdavault V., Jensen M.K., Keasling J.D. A microbial supply chain for production of the anti-cancer drug vinblastine. Nature. 2022;609(7926):341–347. doi: 10.1038/s41586-022-05157-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Salim V., Jones A.D., DellaPenna D. Camptotheca acuminata 10‐hydroxycamptothecin O‐methyltransferase: an alkaloid biosynthetic enzyme co‐opted from flavonoid metabolism. Plant J. 2018;95(1):112–125. doi: 10.1111/tpj.13936. [DOI] [PubMed] [Google Scholar]
- 51.Lemos Cruz P., Carqueijeiro I., Koudounas K., Bomzan D.P., Stander E.A., Abdallah C., Kulagina N., Oudin A., Lanoue A., Giglioli-Guivarc'h N., Nagegowda D.A., Papon N., Besseau S., Clastre M., Courdavault V. Identification of a second 16-hydroxytabersonine-O-methyltransferase suggests an evolutionary relationship between alkaloid and flavonoid metabolisms in Catharanthus roseus. Protoplasma. 2023;260(2):607–624. doi: 10.1007/s00709-022-01801-x. [DOI] [PubMed] [Google Scholar]
- 52.Kellner F., Geu-Flores F., Sherden N.H., Brown S., Foureau E., Courdavault V., O'Connor S.E. Discovery of a P450-catalyzed step in vindoline biosynthesis: a link between the aspidosperma and eburnamine alkaloids. Chem. Commun. 2015;51:7626–7628. doi: 10.1039/C5CC01309G. [DOI] [PubMed] [Google Scholar]
- 53.Franke J., Kim J., Hamilton J.P., Zhao D., Pham G.M., Wiegert-Rininger K., Crisovan E., Newton L., Vaillancourt B., Tatsis E., Buell C.R., O'Connor S.E. Gene discovery in Gelsemium highlights conserved gene clusters in monoterpene indole alkaloid biosynthesis. Chembiochem. 2019;20:83–87. doi: 10.1002/CBIC.201800592. [DOI] [PubMed] [Google Scholar]
- 54.Cuello C., Stander E.A., Jansen H.J., Dugé De Bernonville T.D., Oudin A., Birer-Williams C., Lanoue A., Giglioli-Guivarc'h N., Papon N., Krogh Jensen M., O'Connor S.E., Besseau S., Courdavault V. An updated version of the Madagascar periwinkle genome. F1000Research. 2022;11 doi: 10.12688/f1000research.129212.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Sun S., Shen X., Li Y., Li Y., Wang S., Li R., Zhang H., Shen G., Guo B., Wei J., Xu J., St-Pierre B., Chen S., Chao S. Single-cell RNA sequencing provides a high-resolution roadmap for understanding the multicellular compartmentation of specialized metabolism. Nat. Plants. 2023;9:179–190. doi: 10.1038/s41477-022-01291-y. [DOI] [PubMed] [Google Scholar]
- 56.Li C., Wood J.C., Vu A.H., Hamilton J.P., Rodriguez Lopez C.E., Payne R.M.E., Guerrero D.A.S., Gase K., Yomamoto K., Vaillancourt B., Caputi L., O'Connor S.E., Buell C.R. Single-cell multi-omics in the medicinal plant Catharanthus roseus. Nat. Chem. Biol. 2023 doi: 10.1038/s41589-023-01327-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Stander E.A., Cuello C., Birer-Williams C., Kulagina N., Jansen H.J., Carqueijeiro I., Méteignier L.V., Vergès V., Oudin A., Papon N., Dirks R., Jensen M.K., O'Connor S.E., Dugé de Bernonville T., Besseau S., Courdavault V. The Vinca minor genome highlights conserved evolutionary traits in monoterpene indole alkaloid synthesis. G3 Genes|Genomes|Genetics. 2022:jkac268. doi: 10.1093/g3journal/jkac268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Sabir J.S., Jansen R.K., Arasappan D., Calderon V., Noutahi E., Zheng C., Park S., Sabir M.J., Baeshen M.N., Hajrah N.H., Khiyami M.A., Baeshen N.A., Obaid A.Y., Al-Malki A.L., Sankoff D., El-Mabrouk N., Ruhlman T.A. The nuclear genome of Rhazya stricta and the evolution of alkaloid diversity in a medically relevant clade of Apocynaceae. Sci. Rep. 2016;6(1) doi: 10.1038/srep33782. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Adabavazeh F., Pourseyedi S., Nadernejad N., Razavizadeh R. Hairy root induction in Calotropis procera and optimization of its phytochemical characteristics by elicitors. Plant Cell Tissue Organ Cult. 2023 doi: 10.1007/s11240-023-02481-y. [DOI] [Google Scholar]
- 60.Naidoo C.M., Naidoo Y., Dewir Y.H., Murthy H.N., El-Hendawy S., Al-Suhaibani N. Major bioactive alkaloids and biological activities of Tabernaemontana species (Apocynaceae). Plants. 2021;10(2):313. doi: 10.3390/plants10020313. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Weng J.K., Lynch J.H., Matos J.O., Dudareva N. Adaptive mechanisms of plant specialized metabolism connecting chemistry to function. Nat. Chem. Biol. 2021;17:1037–1045. doi: 10.1038/s41589-021-00822-6. [DOI] [PubMed] [Google Scholar]
- 62.Cuello C., Stander E.A., Jansen H.J., Dugé de Bernonville T., Lanoue A., Giglioli-Guivarc'h N., Papon N., Dirks R.P., Jensen M.K., O'Connor S.E., Besseau S., Courdavault V. Genome assembly of the medicinal plant Voacanga thouarsii. Genome Biology and Evolution. 2022;14(11) doi: 10.1093/gbe/evac158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Jiao Y., Leebens-Mack J., Ayyampalayam S., Bowers J.E., McKain M.R., McNeal J., Rolf M., Ruzicka D.R., Wafula E., Wickett N.J., Wu X., Zhang Y., Wang J., Zhang Y., Carpenter E.J., Deyholos M.K., Kutchan T.M., Chanderbali A.S., Soltis P.S., Stevenson D.W., McCombie R., Pires J.C., Wong G.K.S., Soltis D.E., Depamphilis C.W. A genome triplication associated with early diversification of the core eudicots. Genome Biol. 2012;13(1):1–14. doi: 10.1186/gb-2012-13-1-r3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Sahebi M., Hanafi M.M., van Wijnen A.J., Rice D., Rafii M.Y., Azizi P., Osma M., Taheri S., Abu Bakar M.F., Mat Isa M.N., Noor Y.M. Contribution of transposable elements in the plant's genome. Gene. 2018;665:155–166. doi: 10.1016/j.gene.2018.04.050. [DOI] [PubMed] [Google Scholar]
- 65.Piovan A., Filippini R., Favretto D. Characterization of the anthocyanins of Catharanthus roseus (L.) G. Don in vivo and in vitro by electrospray ionization ion trap mass spectrometry. Rapid Commun. Mass Spectrom. 1998;12(7):361–367. doi: 10.1002/(SICI)1097-0231(19980415)12:7<361::AID-RCM162>3.0.CO;2-U. [DOI] [Google Scholar]
- 66.Mustafa N.R., Verpoorte R. Phenolic compounds in Catharanthus roseus. Phytochemistry Rev. 2007;6:243–258. doi: 10.1007/s11101-006-9039-8. [DOI] [Google Scholar]
- 67.Cuello C., Jansen H.J., Abdallah C., Zamar D.L., Birer-Williams C., Durand M., Oudin A., Papon N., Giglioli-Guivarc'h N., Dirks R.P., Jensen M.K., O'Connor S.E., Besseau S., Courdavault V. Datasets for Pachypodium lamerei, Apocynum venetum, Asclepias syriaca and Calotropis procera genomes. FigShare. 2023 doi: 10.6084/m9.figshare.23126816. [DOI] [Google Scholar]
- 68.Sadre R., Magallanes-Lundback M., Pradhan S., Salim V., Mesberg A., Jones A.D., DellaPenna D. Metabolite diversity in alkaloid biosynthesis: a multilane (Diastereomer) highway for camptothecin synthesis in Camptotheca acuminata. Plant Cell. 2016;28(8):1926–1944. doi: 10.1105/tpc.16.00193. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Rai A., Hirakawa H., Nakabayashi R., Kikuchi S., Hayashi K., Rai M., Tsugawa H., Nakaya T., Mori T., Nagasaki H., Fukushi R., Kusuya Y., Takahashi H., Uchiyama H., Toyoda A., Hikosaka S., Goto E., Saito K., Yamazaki M. Chromosome-level genome assembly of Ophiorrhiza pumila reveals the evolution of camptothecin biosynthesis. Nat. Commun. 2021;12(1):1–19. doi: 10.1038/s41467-020-20508-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Simões A.O., Livshultz T., Conti E., Endress M.E. Phylogeny and systematics of the Rauvolfioideae (Apocynaceae) based on molecular and morphological evidence. Ann. Mo. Bot. Gard. 2007;94(2):268–297. doi: 10.3417/0026-6493(2007)94[268:PASOTR]2.0.CO;2. [DOI] [Google Scholar]
- 71.Cannon S.B., Mitra A., Baumgarten A., Young N.D., May G. The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol. 2004;4(1):1–21. doi: 10.1186/1471-2229-4-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Huang Y.L., Zhang L.K., Zhang K., Chen S.M., Hu J.B., Cheng F. The impact of tandem duplication on gene evolution in Solanaceae species. J. Integr. Agric. 2022;21(4):1004–1014. doi: 10.1016/S2095-3119(21)63698-5. [DOI] [Google Scholar]
- 73.Rodgers-Melnick E., Mane S.P., Dharmawardhana P., Slavov G.T., Crasta O.R., Strauss S.H., Brunner A.M., DiFazio S.P. Contrasting patterns of evolution following whole genome versus tandem duplication events in Populus. Genome Res. 2012;22(1):95–105. doi: 10.1101/gr.125146.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Méteignier L.V., Nützmann H.W., Papon N., Osbourn A., Courdavault V. Emerging mechanistic insights into the regulation of specialized metabolism in plants. Nat. Plants. 2023;9(1):22–30. doi: 10.1038/s41477-022-01288-7. [DOI] [PubMed] [Google Scholar]
- 75.Hu K., Xu K., Wen J., Yi B., Shen J., Ma C., Fu T., Ouyang Y., Tu J. Helitron distribution in Brassicaceae and whole Genome Helitron density as a character for distinguishing plant species. BMC Bioinf. 2019;20(1):1–20. doi: 10.1186/s12859-019-2945-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Li Y., Dooner H.K. Excision of Helitron transposons in maize. Genetics. 2009;182(1):399–402. doi: 10.1534/genetics.109.101527. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Park M., Sarkhosh A., Tsolova V., El-Sharkawy I. Horizontal transfer of LTR retrotransposons contributes to the genome diversity of Vitis. Int. J. Mol. Sci. 2021;22(19) doi: 10.3390/ijms221910446. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Chen H., Zhang Y., Feng S. Whole-genome and dispersed duplication, including transposed duplication, jointly advance the evolution of TLP genes in seven representative Poaceae lineages. BMC Genom. 2023;24:290. doi: 10.1186/s12864-023-09389-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Jiang N., Bao Z., Zhang X., Eddy S.R., Wessler S.R. Pack-MULE transposable elements mediate gene evolution in plants. Nature. 2004;431:569–573. doi: 10.1038/nature02953. [DOI] [PubMed] [Google Scholar]
- 80.Carrier G., Santoni S., Rodier‐Goud M., Canaguier A., De Kochko A., Dubreuil‐Tranchant C., This P., Boursiquot J.M., Le Cunff L. An efficient and rapid protocol for plant nuclear DNA preparation suitable for next generation sequencing methods. Am. J. Bot. 2011;98(1):e13–e15. doi: 10.3732/ajb.1000371. [DOI] [PubMed] [Google Scholar]
- 81.Wick R.R., Judd L.M., Gorrie C.L., Holt K.E. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb. Genom. 2017;3(10) doi: 10.1099/mgen.0.000132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Kolmogorov M., Yuan J., Lin Y., Pevzner P.A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019;37(5):540–546. doi: 10.1038/s41587-019-0072-8. [DOI] [PubMed] [Google Scholar]
- 83.Walker B.J., Abeel T., Shea T., Priest M., Abouelliel A., Sakthikumar S., Cuomo C.A., Zeng Q., Wortman J., Young S.K., Earl A.M. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9:e112963. doi: 10.1371/JOURNAL.PONE.0112963. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Kim D., Paggi J.M., Park C., Bennett C., Salzberg S.L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 2019;37(8):907–915. doi: 10.1038/s41587-019-0201-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Pertea M., Pertea G.M., Antonescu C.M., Chang T.C., Mendell J.T., Salzberg S.L. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 2015;33(3):290–295. doi: 10.1038/nbt.3122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Haas B.J., Papanicolaou A., Yassour M., Grabherr M., Blood P.D., Bowden J., Couger M.B., Eccles D., Li B., Lieber M., MacManes M.D., Ott M., Orvis J., Pochet N., Strozzi F., Weeks N., Westerman R., William T., Dewey C.N., Henschel R., LeDuc R.D., Friedman N., Regev A. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity. Nat. Protoc. 2013;8 doi: 10.1038/nprot.2013.084. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Finn R.D., Clements J., Eddy S.R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 2011;39:W29–W37. doi: 10.1093/NAR/GKR367. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Rhie A., Walenz B.P., Koren S., Phillippy A.M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020;21:245. doi: 10.1186/s13059-020-02134-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Bushnell B. Lawrence Berkeley National Lab. (LBNL); Berkeley, CA (United States): 2014. BBMap: A Fast, Accurate, Splice-Aware Aligner (No. LBNL-7065E) [Google Scholar]
- 90.Simão F.A., Waterhouse R.M., Ioannidis P., Kriventseva E.V., Zdobnov E.M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. doi: 10.1093/bioinformatics/btv351. [DOI] [PubMed] [Google Scholar]
- 91.Dainat J., Hereñú D., pascal-git, LucileSol. Zenodo; 2022. NBISweden/AGAT: AGAT-v0.8.1. [DOI] [Google Scholar]
- 92.Ou S., Su W., Liao Y., Chougule K., Agda J.R.A., Hellinga A.J., Lugo C.S.B., Elliott T.A., Ware D., Peterson T., Jiang N., Hirsch C.N., Hufford M.B. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20:275. doi: 10.1186/s13059-019-1905-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Xu Z., Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35(Web Server issue):W265–W268. doi: 10.1093/nar/gkm286. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Ellinghaus D., Kurtz S., Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinf. 2008;9(1):18. doi: 10.1186/1471-2105-9-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Shi X., Cao S., Wang X., Huang S., Wang Y., Liu Z.…Zhou Y. The complete reference genome for grapevine (Vitis vinifera L.) genetics and breeding. Horticulture Research. 2023;10(5):uhad061. doi: 10.1093/hr/uhad061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Su W., Gu X., Peterson T. TIR-learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol. Plant. 2019;12(3):447–460. doi: 10.1016/j.molp.2019.02.008. [DOI] [PubMed] [Google Scholar]
- 97.Xiong W., He L., Lai J., Dooner H.K., Du C. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc. Natl. Acad. Sci. U.S.A. 2014;111(28):10263–10268. doi: 10.1073/pnas.1410068111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Ou S., Jiang N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2018;176(2):1410–1422. doi: 10.1104/pp.17.01310. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Flynn J.M., Hubley R., Goubert C., Rosen J., Clark A.G., Feschotte C., Smit A.F. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. USA. 2020;117(17):9451–9457. doi: 10.1073/pnas.1921046117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Lamesch P., Berardini T.Z., Li D., Swarbreck D., Wilks C., Sasidharan R., Muller R., Dreher K., Alexander D.L., Garcia-Hernandez M., Karthikeyan A.S., Lee C.H., Nelson W.D., Ploetz L., Singh S., Wensel A., Huala E. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40 doi: 10.1093/NAR/GKR1090. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Brose J., Lau K.H., Dang T.T.T., Hamilton J.P., Martins L.D.V., Hamberger Britta, Hamberger Bjoern, Jiang J., O'Connor S.E., Buell C.R. The Mitragyna speciosa (Kratom) Genome: a resource for data-mining potent pharmaceuticals that impact human health. G3 Genes|Genomes|Genetics. 2021;11 doi: 10.1093/G3JOURNAL/JKAB058. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Hosmani P.S., Flores-Gonzalez M., van de Geest H., Maumus F., Bakker L.V., Schijlen E., van Haarst J., Cordewener J., Sanchez-Perez G., Peters S., Fei Z., Giovannoni J.J., Mueller L.A., Saha S. 2019. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. [DOI] [Google Scholar]
- 103.Kang M., Fu R., Zhang P., Lou S., Yang X., Chen Y., Ma T., Zhang Y., Xi Z., Liu J. A chromosome-level Camptotheca acuminata genome assembly provides insights into the evolutionary origin of camptothecin biosynthesis. Nat. Commun. 2021;12(1):1–12. doi: 10.1038/s41467-021-23872-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Hoopes G.M., Hamilton J.P., Kim J., Zhao D., Wiegert-Rininger K., Crisovan E., Buell C.R. Genome assembly and annotation of the medicinal plant Calotropis gigantea, a producer of anticancer and antimalarial cardenolides. G3 Genes|Genomes|Genetics. 2018;8:385–391. doi: 10.1534/g3.117.300331. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Shi J., Liang C. Generic repeat finder: a high-sensitivity tool for genome-wide de novo repeat detection. Plant Physiol. 2019;180(4):1803–1815. doi: 10.1104/pp.19.00386. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Barker M.S., Dlugosch K.M., Dinh L., Challa R.S., Kane N.C., King M.G., Rieseberg L.H. EvoPipes.net: bioinformatic tools for ecological and evolutionary genomics. Evol. Bioinforma. Online. 2010;6:143–149. doi: 10.4137/EBO.S5861. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Ma B., Tromp J., Li M. PatternHunter: faster and more sensitive homology search. Bioinformatics. 2002;18:440–445. doi: 10.1093/BIOINFORMATICS/18.3.440. [DOI] [PubMed] [Google Scholar]
- 108.Zhang Z., Schwartz S., Wagner L., Miller W. 2004. A Greedy Algorithm for Aligning DNA Sequences; pp. 203–214. https://home.liebertpub.com/cmb7 [DOI] [PubMed] [Google Scholar]
- 109.Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: architecture and applications. BMC Bioinf. 2009;10:421. doi: 10.1186/1471-2105-10-421. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Birney E., Clamp M., Durbin R. GeneWise and genomewise. Genome Res. 2004;14:988–995. doi: 10.1101/GR.1865504. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Edgar R.C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinf. 2004;5:113. doi: 10.1186/1471-2105-5-113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Wernersson R., Pedersen A.G. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Res. 2003;31:3537–3539. doi: 10.1093/NAR/GKG609. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Bioinformatics. 1997;13:555–556. doi: 10.1093/bioinformatics/13.5.555. [DOI] [PubMed] [Google Scholar]
- 114.Fu L., Niu B., Zhu Z., Wu S., Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–3152. doi: 10.1093/BIOINFORMATICS/BTS565. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Emms D.M., Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20 doi: 10.1186/S13059-019-1832-Y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Mendes F.K., Vanderpool D., Fulton B., Hahn M.W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics. 2020;36:5516–5518. doi: 10.1093/BIOINFORMATICS/BTAA1022. [DOI] [PubMed] [Google Scholar]
- 117.Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094–3100. doi: 10.1093/bioinformatics/bty191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Cabanettes F., Klopp C. D-GENIES: dot plot large genomes in an interactive, efficient and simple way. PeerJ. 2018;6:e4958. doi: 10.7717/peerj.4958. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Guy L., Roat Kultima J., Andersson S.G.E. genoPlotR: comparative gene and genome visualization in R. Bioinformatics. 2010;26:2334–2335. doi: 10.1093/bioinformatics/btq413. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2020. [Google Scholar]
- 121.Patro R., Duggal G., Love M.I., Irizarry R.A., Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14(4):417–419. doi: 10.1038/nmeth.4197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Dugé de Bernonville T., Carqueijeiro I., Lanoue A., Lafontaine F., Sánchez Bel P., Liesecke F., Musset K., Oudin A., Glévarec G., Pichon O., Besseau S., Clastre M., St-Pierre B., Flors V., Maury S., Huguet E., O'Connor S.E., Courdavault V. Folivory elicits a strong defense reaction in Catharanthus roseus: metabolomic and transcriptomic analyses reveal distinct local and systemic responses. Sci. Rep. 2017;7 doi: 10.1038/srep40453. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Elejalde-Palmett C., Billet K., Lanoue A., De Craene J.O., Glévarec G., Pichon O., Clastre M., Courdavault V., St Pierre B., Giglioli-Guivarc’h N., Dugé de Bernonville T., Besseau S. Genome-wide identification and biochemical characterization of the UGT88F subfamily in Malus x domestica Borkh. Phytochemistry. 2019;157:135. doi: 10.1016/j.phytochem.2018.10.019. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
Raw DNA-seq data, raw RNA-seq data and the genome assembly have been deposited in the NCBI database under BioProject accession number PRJNA997810 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA997810). The genome annotation, its functional annotation, transcript sequences, predicted CDS and protein sequences, as well as a database of functionally characterized proteins involved in the shikimate, methylerythritol phosphate (MEP), iridoid, MIA, steroid and cardenolide pathways, are available on figshare (Cuello et al., 2023, https://doi.org/10.6084/m9.figshare.23126816).