Applications in Plant Sciences. 2020 Nov 28;8(11):e11400. doi: 10.1002/aps3.11400

Enabling evolutionary studies at multiple scales in Apocynaceae through Hyb‐Seq

Shannon C K Straub 1, Julien Boutte 1, Mark Fishbein 2, Tatyana Livshultz 3
PMCID: PMC7705337  PMID: 33304663

Abstract

Premise

Apocynaceae is the 10th largest flowering plant family and a focus for study of plant–insect interactions, especially as mediated by secondary metabolites. However, it has few genomic resources relative to its size. Target capture sequencing is a powerful approach for genome reduction that facilitates studies requiring data from the nuclear genome in non‐model taxa, such as Apocynaceae.

Methods

Transcriptomes were used to design probes for targeted sequencing of putatively single‐copy nuclear genes across Apocynaceae. The sequences obtained were used to assess the success of the probe design, the intrageneric and intraspecific variation in the targeted genes, and the utility of the genes for inferring phylogeny.

Results

From 853 candidate nuclear genes, 835 were consistently recovered in single copy and were variable enough for phylogenomics. The inferred gene trees were useful for coalescent‐based species tree analysis, which showed all subfamilies of Apocynaceae as monophyletic, while also resolving relationships among species within the genus Apocynum. Intraspecific comparison of Elytropus chilensis individuals revealed numerous single‐nucleotide polymorphisms with potential for use in population‐level studies.

Discussion

Community use of this Hyb‐Seq probe set will facilitate and promote progress in the study of Apocynaceae across scales from population genomics to phylogenomics.

Keywords: Apocynaceae, dogbane, genome reduction, Hyb‐Seq, low‐copy nuclear genes, milkweed, phylogenomics, targeted sequencing


Target capture sequencing of nuclear genes is one approach that facilitates large‐scale phylogenomic studies of non‐model organisms (Lemmon and Lemmon, 2013; Dodsworth et al., 2019). In plants, this approach has clarified evolutionary relationships in multiple lineages (e.g., Mandel et al., 2015; Fisher et al., 2016; Heyduk et al., 2016; Léveillé‐Bourret et al., 2018; Boutte et al., 2019; Couvreur et al., 2019; Herrando‐Moraira et al., 2019; Bagley et al., 2020). Obtaining data from many nuclear genes facilitates the use of species tree methods based on the multispecies coalescent model (Mirarab and Warnow, 2015; Edwards et al., 2016) and provides data sets of the size required to address problems that have been unsolvable using traditional molecular systematics approaches (e.g., Léveillé‐Bourret et al., 2018; Herrando‐Moraira et al., 2019), or even whole plastome data (e.g., Straub et al., 2014). This method allows sufficient flexibility to design probes in conserved regions that flank more variable intron and intergenic regions, providing sequence data that are useful for comparative analysis of distantly related species, among closely related species within a genus, or even within a single species (Weitemier et al., 2014, 2019; Crowl et al., 2017; Peng et al., 2017; Villaverde et al., 2018; de la Harpe et al., 2019; Jones et al., 2019).

As sequencing of DNA and RNA has become easier and cheaper, several large‐scale projects have sequenced, assembled, and made available to the research community numerous plant transcriptomes (e.g., Phytometasyn, https://bioinformatics.tugraz.at/phytometasyn/; Medicinal Plant Genomics Resource, http://medicinalplantgenomics.msu.edu; 1000 Plants, www.onekp.com). Now that transcriptomes from multiple species within the same plant family are publicly available, tools (e.g., MarkerMiner; Chamala et al., 2015) that can take advantage of this wealth of information about nuclear genes in the design of targeted sequencing probes can be easily utilized. The MarkerMiner pipeline conducts a reciprocal BLAST search between each transcriptome and a reference proteome to identify putative orthologs, which are then filtered based on known single‐copy and low‐copy genes in angiosperms (DeSmet et al., 2013). Putative orthologs from multiple input transcriptomes are then clustered by reference protein identifiers and aligned. The alignments are useful for targeted enrichment probe design and other downstream applications (Chamala et al., 2015). This approach has been successful in aiding development of targeted sequencing probes in multiple plant lineages (Villaverde et al., 2018; Morais et al., 2019; Jantzen et al., 2020).

The milkweed and dogbane family, Apocynaceae, has publicly available transcriptomes from several species. With ca. 5300 species, Apocynaceae is the 10th largest family of flowering plants and has a worldwide distribution (Endress et al., 2018). Its members are widely known for their production of secondary metabolites that function in defense against herbivores, some of which have evolved mechanisms to detoxify and sequester these compounds (Malcolm and Brower, 1989). Some metabolites, such as vinblastine and vincristine, are used as human medicines (Cragg and Newman, 2005). Pollination biology and floral evolution have been another focal area of study due to the presence of complex derived floral structures, such as pollinia, gynostegia, and coronas (Endress, 1994; Endress and Bruyns, 2000; Fishbein, 2001; Fishbein et al., 2018). Complete resolution of the backbone of the phylogeny of Apocynaceae has proven to be difficult using traditional molecular systematics methods, a supermatrix approach, and even plastome data sets (Livshultz, 2010; Fishbein et al., 2018). However, understanding the evolution of secondary metabolite biosynthetic pathways and other traits, such as floral structure and pollen aggregation, in these plants requires a well‐resolved phylogeny, which could be inferred based on information from hundreds of nuclear genes using a target capture sequencing approach.

Previously, Weitemier et al. (2014) used the Hyb‐Seq pipeline to identify loci for probe design for targeted sequencing in one genus of Apocynaceae, Asclepias L. They identified 768 single‐copy genes and demonstrated their utility in two subtribes of Asclepiadeae (Asclepiadoideae), and this set has subsequently been utilized to successfully resolve relationships among Asclepias species (Boutte et al., 2019). However, Weitemier et al. (2014) also showed that there would be greatly decreased enrichment success due to DNA sequence divergence from Asclepias in members of the rauvolfioid grade (formerly Rauvolfioideae; Simões et al., 2016) of Apocynaceae. In this study, we produced a set of hybridization probes for targeted sequencing of hundreds of nuclear genes that work across the whole of Apocynaceae, and can be used for future studies at multiple scales ranging from higher‐level phylogenomics to intraspecific studies.

METHODS

Probe design

Five transcriptomes from across Apocynaceae were utilized for probe design for targeted enrichment of single‐ or low‐copy genes. The sampled species represent two of the five subfamilies and four different tribes of Apocynaceae: Asclepias syriaca L. (Weitemier et al., 2019) and Calotropis procera (Aiton) W. T. Aiton (Kwon et al., 2015) of Asclepiadeae (Asclepiadoideae), and Rauvolfia serpentina (L.) Benth. ex Kurz (Góngora‐Castillo et al., 2012), Rhazya stricta Decne. (Yates et al., 2014), and Tabernaemontana elegans Stapf (Xiao et al., 2013) of the rauvolfioid‐grade tribes Vinceae, Amsonieae, and Tabernaemontaneae, respectively. These exemplars should capture the diversity of the family given that their most recent common ancestor is close to the crown node of the family (Fishbein et al., 2018).

MarkerMiner version 1.2 (Chamala et al., 2015) was employed to identify putatively single‐copy genes using the Vitis vinifera L. proteome reference and a minimum similarity of 85%. The default minimum transcript length of 900 bp was used because it approximates the setting used by Weitemier et al. (2014) for target capture sequencing probe design in Asclepias using the Hyb‐Seq pipeline. Intron/exon boundaries were determined by comparison with the V. vinifera genome. This work was completed utilizing the High Performance Computing Center facilities of Oklahoma State University at Stillwater.

MarkerMiner output alignments were submitted to MYcroarray (now Daicel Arbor Biosciences, Ann Arbor, Michigan, USA) for the design and synthesis of myBaits biotinylated RNA probes of 120 nucleotides with 2× tiling. Probes with multiple BLAST hits to the A. syriaca nuclear genome assembly (Weitemier et al., 2019) or those containing repetitive sequence were excluded from the final probe set of 48,974 probes. An additional 2707 probes for targeting two nuclear genes, paralogs dhs and hss, were present in the probe set used for enrichment that generated data for a separate project investigating pyrrolizidine alkaloid biosynthesis (Livshultz et al., 2018). These two genes are not considered further in the reported results of this paper.

The genes targeted across Apocynaceae were compared with the set of genes targeted in Asclepias by Weitemier et al. (2014) and the universal angiosperm probe set (Johnson et al., 2019) using reciprocal BLASTN (E‐value threshold of 10⁻⁶, minimum percent identity of 80%; Altschul et al., 1990) and custom Python scripts (Python version 2.7.12; Python Software Foundation, 2016).
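The reciprocal comparison described above can be sketched as follows. This is an illustrative reconstruction, not the authors' script: it assumes BLASTN was run in both directions with the standard 12-column tabular output, and it assumes a reciprocal-best-hit criterion, which the text does not specify. The function names and the example gene identifiers are hypothetical.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-6, min_identity=80.0):
    """Map each query to its highest-bitscore subject among hits passing both filters.

    Expects the standard 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue
        query, subject = row[0], row[1]
        identity, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if evalue > max_evalue or identity < min_identity:
            continue  # hit fails the E-value or identity threshold
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_pairs(a_vs_b, b_vs_a, **filters):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b, **filters)
    reverse = best_hits(b_vs_a, **filters)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits: new Apocynaceae targets (apo_*) versus an existing set (asc_*).
A_VS_B = "\n".join([
    "apo_g1\tasc_g7\t92.5\t900\t60\t2\t1\t900\t1\t900\t1e-120\t850",
    "apo_g2\tasc_g9\t75.0\t400\t100\t0\t1\t400\t1\t400\t1e-20\t200",  # below 80% identity
    "apo_g3\tasc_g4\t88.0\t600\t70\t1\t1\t600\t1\t600\t1e-80\t500",
])
B_VS_A = "\n".join([
    "asc_g7\tapo_g1\t92.5\t900\t60\t2\t1\t900\t1\t900\t1e-120\t850",
    "asc_g4\tapo_g5\t90.0\t600\t58\t1\t1\t600\t1\t600\t1e-90\t560",  # better hit elsewhere
    "asc_g4\tapo_g3\t88.0\t600\t70\t1\t1\t600\t1\t600\t1e-80\t500",
])

pairs = reciprocal_pairs(A_VS_B, B_VS_A)
print(pairs)
```

Counting the genes that appear in such pairs would give an overlap figure of the kind reported in the Results, although a less strict criterion (any reciprocal hit passing the thresholds) would be equally compatible with the description.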

Taxon sampling, library preparation, and targeted sequencing

Probes were tested for target enrichment success in 15 species of Apocynaceae spanning the diversity of the family and including representatives of the rauvolfioid grade, the apocynoid grade, Secamonoideae, and Asclepiadoideae (Table 1). Four species of Apocynum L. and two individuals of Elytropus chilensis (A. DC.) Müll. Arg. were sampled to assess the utility of genes at the intrageneric and intraspecific levels, respectively. One outgroup species, Gelsemium sempervirens (L.) J. St.‐Hil. (Gelsemiaceae), from a closely related family in Gentianales was also included.

Table 1.

Sampling, sequencing, and assembly success for 853 nuclear genes in Apocynaceae.ᵃ

| Subfamily | Tribe | Sample name | Voucher specimen (Herbarium)ᵇ | Trimmed reads | Reads on target | Genes with reads mapped | Exons assembled | Exon sequence assembled (bp) | Splash zone sequence assembled (bp) | Paralog warnings | Final gene occupancy | Plastome sequencing depth |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rauvolfioideae | Vinceae | Ochrosia coccinea (Teijsm. & Binn.) Miq. | Takeuchi 14422 (A) | 2,717,939 | 73% | 846 | 6559 | 1,386,984 | 2,675,319 | 3 | 834 | 93× |
| | Tabernaemontaneae | Tabernaemontana bufalina Lour. | Middleton et al. 1749 (A) | 1,350,506 | 71% | 843 | 7715 | 1,375,668 | 2,825,097 | 3 | 832 | 40× |
| | Melodineae | Craspidospermum verticillatum Bojer ex A.DC. | Schonenberger et al. D36 (UPS) | 383,099 | 79% | 825 | 4916 | 1,070,778 | 1,168,586 | 2 | 816 | 18× |
| | Hunterieae | Hunteria zeylanica Gardner ex Thwaites | Middleton et al. 3816 (E) | 1,295,655 | 77% | 845 | 6186 | 1,314,036 | 2,212,866 | 2 | 834 | 36× |
| | Amsonieae | Amsonia orientalis Decne. | Endress s.n. (Z) | 3,663,743 | 84% | 848 | 6839 | 1,394,925 | 2,419,790 | 1 | 835 | 30× |
| Apocynoideae | Echiteae | Prestonia portobellensis (Beurl.) Woodson | Ventura 21262 (NY) | 3,581,141 | 78% | 844 | 4855 | 1,191,183 | 2,053,075 | 1 | 830 | 156× |
| | Odontadenieae | Elytropus chilensis Müll. Arg. ‐ 1 | Sobel & Strudwick 2740 (NY) | 909,288 | 82% | 844 | 5511 | 1,166,106 | 1,447,041 | 1 | n/a | 49× |
| | | Elytropus chilensis ‐ 2 | M. Mihoc et al. CONC#156934 (CONC) | 2,150,861 | 76% | 847 | 6110 | 1,280,367 | 2,338,932 | 0 | 835 | 79× |
| | Apocyneae | Apocynum androsaemifolium L. | T. Livshultz 03‐32 (BH) | 1,931,260 | 78% | 847 | 5408 | 1,234,158 | 1,823,085 | 1 | 834 | 155× |
| | | Apocynum cannabinum L. | T. Livshultz 03‐28 (BH) | 3,875,483 | 79% | 847 | 5686 | 1,272,198 | 2,509,109 | 1 | 832 | 198× |
| | | Apocynum pictum Schrenk | F. Konta & N. Abjusalik 35598 (NY) | 353,289 | 79% | 822 | 4288 | 989,706 | 1,216,606 | 2 | 812 | 19× |
| | | Apocynum venetum L. | F. Konta & N. Abjusalik 35600 (NY) | 621,097 | 75% | 838 | 4883 | 1,115,646 | 1,795,705 | 1 | 828 | 21× |
| Secamonoideae | | Toxocarpus villosus (Blume) Decne. | D. J. Middleton et al. 1341 (A) | 1,751,248 | 69% | 842 | 5727 | 1,205,457 | 2,430,178 | 1 | 829 | 102× |
| Asclepiadoideae | Fockeae | Fockea edulis K. Schum. | T. Livshultz s.n. 31.III.1998 (BH) | 629,938 | 67% | 816 | 5160 | 1,050,246 | 1,753,430 | 1 | 800 | 54× |
| | Marsdenieae | Marsdenia glabra Costantin | D. J. Middleton et al. 1123 (A) | 3,138,002 | 78% | 844 | 5901 | 1,180,695 | 2,311,478 | 2 | 824 | 44× |
| | Asclepiadeae | Tassadia berteroana (Spreng.) W. D. Stevens | Fuentes 3904 (OKLA) | 6,652,102 | 86% | 844 | 5739 | 1,149,015 | 2,043,461 | 1 | 820 | 195× |
| Outgroup | | Gelsemium sempervirens (L.) J. St.‐Hil. | Fishbein 7665 (OKLA) | 1,276,554 | 61% | 836 | 2132 | 734,490 | 1,108,017 | 1 | 777 | 67× |
| Mean (SD) | | | | 2,134,189 (1,653,721) | 76% (6%) | 840 (10) | 5507 (1200) | 1,183,039 (165,106) | 2,007,757 (533,759) | 1.4 (0.8) | 823 (16) | 80× (61×) |
| Median | | | | 1,751,248 | 78% | 844 | 5686 | 1,191,183 | 2,053,075 | 1 | 829.5 | 54× |

ᵃ DNA extractions were performed from herbarium specimens for vouchers indicated in boldface.

ᵇ Herbarium acronyms are according to Index Herbariorum (http://sweetgum.nybg.org/science/ih/).

Total genomic DNA was extracted using previously described methods (Livshultz et al., 2007; Fishbein et al., 2018) from either silica‐dried leaf tissue or herbarium specimens (Table 1). DNA was quantified using a Qubit 2.0 fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA) and visualized with GelRed nucleic acid stain (Phenix Research Products, Candler, North Carolina, USA) after agarose gel electrophoresis. Genomic DNA was fragmented to approximately 400 bp using sonication in a Bioruptor Pico (Diagenode, Denville, New Jersey, USA). Illumina‐compatible libraries were constructed using NEBNext Ultra II kits (New England Biolabs, Ipswich, Massachusetts, USA) in one‐half volume reactions with the NEB adapters replaced by NEXTflex‐HT barcoded adapters (Bioo Scientific Corporation, Austin, Texas, USA). Libraries were pooled in equimolar ratios and sent to Arbor Biosciences for solution hybridization with the biotinylated RNA baits described above. Enriched pools were combined and 150‐bp paired‐end reads were sequenced on a NextSeq 500 (Illumina, San Diego, California, USA) at the Genomics Core Facility of the Drexel University College of Medicine.

Sequence clean‐up, assembly, and alignment

Illumina reads were quality trimmed using Trimmomatic version 0.36 (Bolger et al., 2014). Adapter sequences were deleted and reads with a quality score ≤ Q20 were removed. Then, a sliding window approach was employed to trim the 3′‐end of the read with a quality score ≤ Q20 for a window of 5 bp. Only reads of at least 36 bp were retained for downstream analyses. Cleaned reads were assembled using HybPiper version 1.2 (Johnson et al., 2016) with the Burrows–Wheeler Aligner (BWA) option. Supercontigs were produced and intron positions were detected using the HybPiper intronerate module. Genes for which paralogous sequences were detected using the HybPiper paralog_investigator module and genes with <75% terminal occupancy were removed from downstream analyses. Read depth and HybPiper statistics were obtained using the depth_calculator and hybpiper_stat modules, respectively. Work reported here was run on hardware supported by Drexel’s University Research Computing Facility.
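The gene-filtering step at the end of this pipeline can be illustrated with a short sketch. It is not the authors' code: it assumes that "terminal occupancy" means the fraction of sampled individuals (tree terminals) for which a gene was assembled, and that paralog warnings are available as a set of gene names. The function name, gene names, and sample names are hypothetical.

```python
def filter_genes(recovered, n_samples, paralog_flagged=frozenset(), min_occupancy=0.75):
    """Keep genes with enough sampled terminals and no paralog warning.

    recovered       -- dict mapping gene name to the set of samples with an assembly
    n_samples       -- total number of samples in the data set
    paralog_flagged -- genes with a paralog warning in one or more samples
    min_occupancy   -- genes below this fraction of samples are dropped
    """
    kept = []
    for gene, samples in sorted(recovered.items()):
        occupancy = len(samples) / n_samples
        if gene in paralog_flagged or occupancy < min_occupancy:
            continue
        kept.append(gene)
    return kept


# Hypothetical toy data set with four samples.
recovered = {
    "geneA": {"s1", "s2", "s3", "s4"},  # 100% occupancy, kept
    "geneB": {"s1", "s2", "s3"},        # exactly 75%, kept (threshold is "<75%")
    "geneC": {"s1", "s2"},              # 50%, removed
    "geneD": {"s1", "s2", "s3", "s4"},  # full occupancy but paralog warning, removed
}
kept = filter_genes(recovered, n_samples=4, paralog_flagged={"geneD"})
print(kept)
```

A gene that was never recovered simply has an empty sample set here and is removed by the same occupancy rule.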

Putatively orthologous sequences were aligned using MAFFT version 1.3.6 (Katoh et al., 2002) using an iterative refinement method (FFT‐NS‐i, two cycles). The BAM output files created by HybPiper were converted to PILEUP files using SAMtools (Li et al., 2009) to detect single‐nucleotide polymorphisms (SNPs). Intraspecific SNPs were identified in E. chilensis and intrageneric SNPs were identified in Apocynum using the SNP detection method of Boutte et al. (2016) with a minimum read depth of 25 and a SNP detection threshold of 30 as well as additional custom Python scripts.

Read pools for each sample were assessed for the potential to assemble plastomes using off‐target reads. Trimmed reads were mapped to a reference plastome from Oncinotis tenuiloba Stapf (NC_025657), a member of the apocynoid grade, in Geneious version 10.2.6 (Biomatters Ltd., Auckland, New Zealand) with the sensitivity set to medium‐low and allowing up to five iterations. Multimap reads were mapped randomly.

Phylogenomic analyses

For phylogenomic analyses, only the E. chilensis individual with the highest number of assembled genes was retained. Gene trees were estimated in a maximum likelihood framework using the GTR+GAMMA model of sequence evolution in IQ‐TREE version 1.5.5 (Nguyen et al., 2015). One thousand ultrafast bootstrap replicates (Hoang et al., 2018) were conducted for each sequence matrix. Species trees were inferred using ASTRAL II version 4.10.12 with default parameters (Mirarab and Warnow, 2015). For species tree analyses, three sampling strategies were used: (1) all gene trees based on exons only, (2) exon‐only gene trees with very poorly resolved gene trees (total of ≥75% of nodes with <75% bootstrap support) removed, and (3) all genes with exons and splash zones included (supercontigs). PhyParts with the bootstrap filter set to 50% (Smith et al., 2015) was used to assess gene tree conflict, which was visualized as pie charts with the PhyParts_PieCharts script (https://github.com/mossmatters/MJPythonNotebooks/blob/master/phypartspiecharts.py) for all trees that contained the outgroup, Gelsemium sempervirens.
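The second sampling strategy, which removes very poorly resolved gene trees, can be sketched as below. This is an assumption-laden illustration rather than the authors' script: it assumes each gene tree is a Newick string whose internal nodes carry bootstrap percentages as labels directly after the closing parenthesis, and the function names are hypothetical.

```python
import re

# Support label written directly after a closing parenthesis, e.g. ")95:0.012".
SUPPORT_LABEL = re.compile(r"\)(\d+(?:\.\d+)?)")


def low_support_fraction(newick, cutoff=75.0):
    """Fraction of labelled internal nodes with bootstrap support below the cutoff."""
    supports = [float(value) for value in SUPPORT_LABEL.findall(newick)]
    if not supports:
        raise ValueError("no support labels found in tree")
    return sum(value < cutoff for value in supports) / len(supports)


def is_poorly_resolved(newick, cutoff=75.0, max_fraction=0.75):
    """True when at least max_fraction of nodes fall below the support cutoff."""
    return low_support_fraction(newick, cutoff) >= max_fraction


# Hypothetical gene trees.
well_supported = "((A:0.1,B:0.1)100:0.05,(C:0.1,D:0.1)98:0.05,E:0.2);"
poorly_supported = (
    "(((A:0.1,B:0.1)40:0.01,C:0.1)60:0.01,(D:0.1,E:0.1)80:0.01,(F:0.1,G:0.1)30:0.02);"
)

gene_trees = {"gene1": well_supported, "gene2": poorly_supported}
retained = [name for name, tree in gene_trees.items() if not is_poorly_resolved(tree)]
print(retained)
```

A regular expression is enough for this toy case; a dedicated tree library would be a safer choice for real files, where taxon labels or comments could interfere with the pattern.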

RESULTS

Targeted sequencing

A total of 853 putatively single‐copy genes with a cumulative exon length of 1,545,593 bp were identified in Apocynaceae. Among these 853 genes, probes were designed for 359 genes considering at least one transcript from an Asclepiadoideae species and a Rauvolfioideae species. Probes were designed for 341 genes without a transcript from Asclepiadoideae present, and for 153 genes without a Rauvolfioideae transcript present. On average 76% (±6%) of reads were on target, a mean of 98% (±1%) of genes were at least partially assembled, very few potential paralogs (0–3) were detected, and samples sourced from herbarium specimens produced results similar to silica‐dried samples (Table 1, Fig. 1). For the Apocynaceae ingroup, assembly of targeted exons ranged from 4288 to 7715, with an average of 6.8 and a median of six exons per gene, and yielded between 989,706 bp and 1,394,925 bp of sequence. Assembly of untargeted splash zones (i.e., intron and intergenic spacers) produced from 1,168,586 bp to 2,825,097 bp of additional sequence per sample (Table 1). The outgroup, G. sempervirens, did not have the fewest partially assembled genes, but did produce the fewest assembled exons and the least amount of total assembled sequence data, and had the lowest percentage of reads on target (Table 1, Fig. 1). Mapping results indicated that from the off‐target reads, the plastome had been sequenced to an average depth of 80× (Table 1).

Figure 1.

Heatmap of target enrichment, sequencing, and assembly success for 853 nuclear genes in Apocynaceae.

Eighteen genes were removed from downstream analyses because four of the targeted genes were not recovered in any species, an additional nine genes had <75% terminal occupancy, and five genes had putative paralogs present in one or more sampled individuals. Of the 853 targeted genes, 81 (9.5%) overlapped with the 353 genes in the universal angiosperm set and 69 (8.1%) overlapped with the 768 genes in the Asclepias‐specific set.
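The gene counts reported in this section can be cross-checked with a few lines of arithmetic. The snippet only restates numbers given in the text; the variable names are illustrative.

```python
targeted = 853

# Probe design categories by which transcripts were available.
design = {
    "Asclepiadoideae and Rauvolfioideae transcripts": 359,
    "no Asclepiadoideae transcript": 341,
    "no Rauvolfioideae transcript": 153,
}

# Genes removed before downstream analyses.
removed = {
    "not recovered in any species": 4,
    "<75% terminal occupancy": 9,
    "putative paralogs": 5,
}
retained = targeted - sum(removed.values())

# Overlap with existing probe sets, as a percentage of the 853 targeted genes.
overlap = {"universal angiosperm set": 81, "Asclepias-specific set": 69}
overlap_pct = {name: round(100 * n / targeted, 1) for name, n in overlap.items()}

print(sum(design.values()), retained, overlap_pct)
```

The three design categories account for all 853 targets, and removing the 18 problematic genes leaves the 835 genes used for phylogenomic analysis.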

Alignment and phylogeny

Cumulatively, the exon‐only sequence alignments had 1,922,030 sites with an average of 2302 (±1397) sites per gene and a median length of 1900 sites. For the ingroup, there were 30% variable and 12% parsimony informative characters. The aligned exon‐plus‐splash‐zone matrices had 7,748,184 characters with an average of 9387 (±5949) characters per matrix and a median length of 7650 characters. There were 34% variable and 14% parsimony informative characters for the ingroup.
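For readers unfamiliar with the two summary statistics used here, the sketch below counts variable and parsimony-informative sites in a small alignment. It is a minimal illustration under stated assumptions: gaps and ambiguity codes are ignored, which the text does not specify, and the function name and sequences are hypothetical.

```python
from collections import Counter


def site_summary(sequences):
    """Return (alignment length, variable sites, parsimony-informative sites).

    A site is variable if it shows at least two different unambiguous bases.
    It is parsimony informative if at least two of those bases each occur
    in two or more sequences.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    variable = informative = 0
    for column in zip(*sequences):
        counts = Counter(base for base in (c.upper() for c in column) if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            if sum(n >= 2 for n in counts.values()) >= 2:
                informative += 1
    return length, variable, informative


# Hypothetical four-taxon alignment.
alignment = [
    "ACGTAC",
    "ACGTAA",
    "ACTTGA",
    "AC-TGC",
]
length, variable, informative = site_summary(alignment)
print(length, variable, informative)
```

In this example the third column is variable but not informative (one base is a singleton and the gap is ignored), whereas the fifth and sixth columns are both variable and informative.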

Most gene trees were inferred with strong bootstrap support for the relationships among taxa. Only 18 of the exon‐only trees had less than 75% of the nodes with greater than 75% bootstrap support. The species trees inferred using all exon‐only trees (n = 835), exon‐only trees with very poorly resolved trees removed (n = 817; Fig. 2), and all exon‐plus‐splash‐zone trees (n = 835) had identical topologies, with all subfamilies (including Rauvolfioideae) monophyletic. They differed slightly in the local posterior probabilities within Rauvolfioideae (data not shown). Notably, the local posterior probability for a monophyletic Rauvolfioideae was much lower than all other local posterior probabilities (0.69 vs. ≥0.98). The normalized quartet score for the species tree was 0.84, indicating a moderate level of discordance among gene trees. For most of the relationships within and among Apocynoideae, Secamonoideae, and Asclepiadoideae, the majority of gene trees were informative and concordant with the species tree. However, among Rauvolfioideae, there were many uninformative gene trees, and more gene trees were in conflict with the species tree than were concordant, despite there being relatively high local posterior probabilities for most splits.

Figure 2.

ASTRAL species tree of Apocynaceae inferred from 817 nuclear gene trees. Numbers next to the nodes are local posterior probabilities. The scale bar is in coalescent units. For each node, pie charts show the proportions of the gene tree bipartitions that are concordant with the species tree (blue), the most frequently observed alternative bipartition (green), all other bipartitions (red), and uninformative or missing (gray). The proportions were calculated using all gene trees containing the outgroup Gelsemium sempervirens (n = 761). Numbers of concordant/conflicting gene trees are shown next to or below each pie chart.

Interspecific and intraspecific variation

For the four species of Apocynum, 171,323 intrageneric SNPs were detected, and were much more frequent in splash zones than in exons (n = 834 genes; Table 2, Fig. 3A). The phylogenetic information present in these nucleotide substitutions led to complete resolution of the evolutionary relationships among these species (Fig. 2). In the comparison of the sequences from the two E. chilensis individuals, 38,555 intraspecific SNPs were observed, and they were more common in splash zones than in exons (n = 831 genes; Table 2, Fig. 3B).

Table 2.

Number of single‐nucleotide polymorphisms per 100 bp of exonic and splash zone sequence.

| Taxon | Sequence type | Mean | Median | 95% Confidence interval |
| --- | --- | --- | --- | --- |
| Apocynum | Exons | 0.450 | 0.357 | [0.426, 0.474] |
| | Splash zones | 3.242 | 2.942 | [3.127, 3.357] |
| Elytropus chilensis | Exons | 0.110 | 0.068 | [0.101, 0.119] |
| | Splash zones | 0.746 | 0.327 | [0.667, 0.825] |
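Summary statistics of the kind shown in Table 2 could be computed as in the following sketch. The text does not state how the confidence intervals were obtained; a normal approximation around the mean is assumed here because the reported intervals are symmetric. The function name and the per-gene counts are hypothetical.

```python
import math
import statistics


def snp_density_summary(snp_counts, lengths_bp):
    """Summarize per-gene SNP density, expressed as SNPs per 100 bp.

    The 95% interval uses a normal approximation: mean +/- 1.96 * SD / sqrt(n).
    """
    densities = [100.0 * snps / length for snps, length in zip(snp_counts, lengths_bp)]
    mean = statistics.mean(densities)
    half_width = 1.96 * statistics.stdev(densities) / math.sqrt(len(densities))
    return {
        "mean": mean,
        "median": statistics.median(densities),
        "ci95": (mean - half_width, mean + half_width),
    }


# Hypothetical SNP counts and exon lengths for four genes.
summary = snp_density_summary(snp_counts=[2, 5, 0, 9], lengths_bp=[1000, 1250, 800, 1500])
print(summary)
```

With hundreds of genes, as in this study, the normal approximation is reasonable for the mean even though the per-gene densities are skewed, which is also why the medians in Table 2 fall below the means.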

Figure 3.

Frequency and distribution of intrageneric and intraspecific single‐nucleotide polymorphisms (SNPs) in targeted nuclear genes. (A) Intrageneric SNPs among four species of Apocynum. Sites where one or more individuals were heterozygous were excluded from the graph. (B) Intraspecific SNPs in two individuals of Elytropus chilensis.

DISCUSSION

The probes designed from transcriptome sequences of Apocynaceae were effective for targeted sequencing of 835 putatively single‐copy nuclear genes from an initial set of 853 genes. The assembled sequences contained information that was useful for resolving the phylogeny of Apocynaceae from the higher‐level subfamily relationships down to species relationships within a single genus, Apocynum. The probe set should therefore be useful for phylogenetic studies across the family at multiple scales, and potentially in other families of Gentianales given the high success rate for assembling nuclear genes in the outgroup, a representative of Gelsemiaceae. Most samples had sufficient plastome sequencing depth from off‐target reads to suggest that reference‐guided or de novo assembly of the plastome would likely be successful (Straub et al., 2012) and add another potential source of information for inferring phylogeny (see Fishbein et al., 2018). Furthermore, given the numerous SNPs detected in both exonic and intronic gene regions for Elytropus Müll. Arg. individuals, the probes will also be useful for generating data appropriate for population genetics and evolutionary genomics studies.

The new bait set for targeted sequencing in Apocynaceae presented here provides an example of a successful lineage‐specific probe design. MarkerMiner yielded loci that were conserved enough to be targeted across the whole family, and very few of the loci turned out not to be single copy, as has been observed in other studies that compared MarkerMiner‐based designs to those generated by other pipelines (Kadlec et al., 2017; Vatanparast et al., 2018). Additionally, several studies that have compared the success of taxon‐specific and universal probe sets have found that taxon‐specific designs yield the largest number of most useful loci (Kadlec et al., 2017; Chau et al., 2018; Jantzen et al., 2020), although this has not been universally observed (Larridon et al., 2020). When feasible, a combination of lineage‐specific and universal probe sets, such as the Angiosperms353 probe set (Johnson et al., 2019), can yield the largest pool of nuclear loci appropriate for phylogenomic studies (Jantzen et al., 2020).

In agreement with previous molecular phylogenetic studies that utilized chloroplast regions, whole plastomes, and/or few nuclear genes for inferring the phylogeny of Apocynaceae (Verhoeven et al., 2003; Livshultz et al., 2007; Livshultz, 2010; Fishbein et al., 2018), Asclepiadoideae was monophyletic and sister to Secamonoideae with local posterior probabilities of 1. Based on previously published analyses, Rauvolfioideae and Apocynoideae have been understood to be paraphyletic grades within the family (Livshultz et al., 2007; Simões et al., 2007, 2016; Livshultz, 2010; Fishbein et al., 2018), but were unexpectedly recovered as monophyletic in this analysis. For Apocynoideae, one might have expected Apocynum to be sister to Secamonoideae plus Asclepiadoideae with Elytropus (Odontadenieae) plus Prestonia (Echiteae) sister to that group, and for sampled Rauvolfioideae to be paraphyletic with the Amsonia‐Hunteria clade sister to the Apocynoideae‐Secamonoideae‐Asclepiadoideae clade (Fishbein et al., 2018). Here, the monophyly of Apocynoideae in the species tree had high local posterior probability and was supported by a majority of gene trees (Fig. 2). In contrast, the monophyly of Rauvolfioideae was only weakly supported in the species tree, and the proportion of gene trees with bipartitions that conflicted with the species tree was much higher than the proportion that supported the species tree topology.

The phylogenomic results observed here are based on data from many nuclear genes and may give better insight into the evolution of Apocynaceae than previous studies due to greatly increased character sampling; however, even in phylogenomic studies taxon sampling has been shown to affect species tree topology (Walker et al., 2017). Sparse taxon sampling has been demonstrated to have the potential to negatively affect phylogenetic reconstruction (Hillis et al., 2003; Nabhan and Sarkar, 2012), even though coalescent approaches may be less affected by this issue than are other analytical methods for species tree inference (Bravo et al., 2019). Given that previous studies of both plastid and nuclear loci have provided strong evidence that Rauvolfioideae and Apocynoideae are paraphyletic grades, their surprising monophyly observed in this study is likely to be shown to be anomalous with more thorough taxon sampling. Future phylogenomic studies using these probes will employ more extensive sampling, and engage in deeper exploration of the data to assess the impacts of various strategies on species tree inference. Potential strategies include: trimming of multiple sequence alignments and detection of errors in homology assessment, partitioning sequences, evaluating models of sequence evolution prior to phylogenetic analysis to infer gene trees, data filtering such as detecting outlier genes, and employing multiple approaches to species tree inference (Bravo et al., 2019).

Going forward, the large number of putatively single‐copy genes identified across the Apocynaceae and the probe set designed in this study will both serve as resources for the community of researchers interested in answering questions about Apocynaceae phylogenetics, phylogeography, and population genetics. This new gene set does not greatly overlap with the universal 353 gene set for angiosperms (Johnson et al., 2019) or the Asclepias‐specific 768 gene set (Weitemier et al., 2014) that works well across one Apocynaceae subfamily (Asclepiadoideae), and therefore increases the range of options for researchers to target subsets of single‐copy genes appropriate to the level of divergence and questions asked in a particular study.

Acknowledgments

The authors thank A. Simões (Universidade Estadual de Campinas) and M. Endress (University of Zurich) for tissue of A. orientalis; J. Teisher and D. Chin (Drexel University), and J. Schafer (Oklahoma State University) for computational support; C. Smith (Drexel University), and A. Foote, M. Cullinan, M. Steinfeldt, K. Kostović, and C. Chung (Hobart and William Smith Colleges) for laboratory help; and A. Devault and J. Enk (Daicel Arbor Biosciences) for probe design assistance. Funding was provided by the National Science Foundation (grant DEB 1655223/1655553 to S.C.K.S. and T.L. and 1457473/1457510 to S.C.K.S. and M.F.), and a Drexel University Clinical and Translational Research Institute (CTRI) seed grant to T.L.

Straub, S. C. K., Boutte J., Fishbein M., and Livshultz T. 2020. Enabling evolutionary studies at multiple scales in Apocynaceae through Hyb‐Seq. Applications in Plant Sciences 8(11): e11400.

Data Availability

Raw Illumina sequence data are available from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (BioProject PRJNA660787). Nuclear gene probe sequences (https://doi.org/10.6084/m9.figshare.12830579.v1 and https://doi.org/10.6084/m9.figshare.12830576.v1) and alignments (https://doi.org/10.6084/m9.figshare.12830648.v1) are available from Figshare. Custom Python scripts are available upon request.

LITERATURE CITED

  1. Altschul, S. F., Gish W., Miller W., Myers E. W., and Lipman D. J. 1990. Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
  2. Bagley, J. C., Uribe‐Convers S., Carlsen M. M., and Muchhala N. 2020. Utility of targeted sequence capture for phylogenomics in rapid, recent angiosperm radiations: Neotropical Burmeistera bellflowers as a case study. Molecular Phylogenetics and Evolution 152: 106769.
  3. Bolger, A. M., Lohse M., and Usadel B. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120.
  4. Boutte, J., Ferreira de Carvalho J., Rousseau‐Gueutin M., Poulain J., Da Silva C., Wincker P., Ainouche M., and Salmon A. 2016. Reference transcriptomes and detection of duplicated copies in hexaploid and allododecaploid Spartina species (Poaceae). Genome Biology and Evolution 8: 3030–3044.
  5. Boutte, J., Fishbein M., Liston A., and Straub S. C. K. 2019. NGS‐Indel Coder: A pipeline to code indel characters in phylogenomic data with an example of its application in milkweeds (Asclepias). Molecular Phylogenetics and Evolution 139: 106534.
  6. Bravo, G. A., Antonelli A., Bacon C. D., Bartoszek K., Blom M. P. K., Huynh S., Jones G., et al. 2019. Embracing heterogeneity: Coalescing the Tree of Life and the future of phylogenomics. PeerJ 7: e6399.
  7. Chamala, S., García N., Godden G. T., Krishnakumar V., Jordon‐Thaden I. E., DeSmet R., Barbazuk W. B., et al. 2015. MarkerMiner 1.0: A new application for phylogenetic marker development using angiosperm transcriptomes. Applications in Plant Sciences 3: 1400115.
  8. Chau, J. H., Rahfeldt W. A., and Olmstead R. G. 2018. Comparison of taxon‐specific versus general locus sets for targeted sequence capture in plant phylogenomics. Applications in Plant Sciences 6: e1032.
  9. Couvreur, T. L. P., Helmstetter A. J., Koenen E. J. M., Bethune K., Brandão R. D., Little S. A., Sauquet H., and Erkens R. H. J. 2019. Phylogenomics of the major tropical plant family Annonaceae using targeted enrichment of nuclear genes. Frontiers in Plant Science 9: 1941.
  10. Cragg, G. M., and Newman D. J. 2005. Plants as a source of anti‐cancer agents. Journal of Ethnopharmacology 100: 72–79.
  11. Crowl, A. A., Myers C., and Cellinese N. 2017. Embracing discordance: Phylogenomic analyses provide evidence for allopolyploidy leading to cryptic diversity in a Mediterranean Campanula (Campanulaceae) clade. Evolution 71: 913–922.
  12. de la Harpe, M., Hess J., Loiseau O., Salamin N., Lexer C., and Paris M. 2019. A dedicated target capture approach reveals variable genetic markers across micro‐ and macro‐evolutionary time scales in palms. Molecular Ecology Resources 19: 221–234.
  13. DeSmet, R., Adams K. L., Vandepoele K., Montagu M. C. E. V., Maere S., and de Peer Y. V. 2013. Convergent gene loss following gene and genome duplications creates single‐copy families in flowering plants. Proceedings of the National Academy of Sciences, USA 110: 2898–2903.
  14. Dodsworth, S., Pokorny L., Johnson M. G., Kim J. T., Maurin O., Wickett N. J., Forest F., and Baker W. J. 2019. Hyb‐Seq for flowering plant systematics. Trends in Plant Science 24: 887–891.
  15. Edwards, S. V., Xi Z., Janke A., Faircloth B. C., McCormack J. E., Glenn T. C., Zhong B., et al. 2016. Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics. Molecular Phylogenetics and Evolution 94: 447–462.
  16. Endress, M. E., and Bruyns P. V. 2000. A revised classification of the Apocynaceae s.l. Botanical Review 66: 1–56.
  17. Endress, M. E., Meve U., Middleton D. J., and Liede‐Schumann S. 2018. Apocynaceae. In Kubitzki K. [ed.], The families and genera of vascular plants: Flowering plants, Eudicots, Apiales, Gentianales (except Rubiaceae), 207–411. Springer International Publishing, Cham, Switzerland.
  18. Endress, P. K. 1994. Diversity and evolutionary biology of tropical flowers. Cambridge University Press, Cambridge, United Kingdom.
  19. Fishbein, M. 2001. Evolutionary innovation and diversification in the flowers of Asclepiadaceae. Annals of the Missouri Botanical Garden 88: 603–623.
  20. Fishbein, M., Livshultz T., Straub S. C. K., Simões A. O., Boutte J., McDonnell A., and Foote A. 2018. Evolution on the backbone: Apocynaceae phylogenomics and new perspectives on growth forms, flowers, and fruits. American Journal of Botany 105: 495–513.
  21. Fisher, A. E., Hasenstab K. M., Bell H. L., Blaine E., Ingram A. L., and Columbus J. T. 2016. Evolutionary history of chloridoid grasses estimated from 122 nuclear loci. Molecular Phylogenetics and Evolution 105: 1–14.
  22. Góngora‐Castillo, E., Childs K. L., Fedewa G., Hamilton J. P., Liscombe D. K., Magallanes‐Lundback M., Mandadi K. K., et al. 2012. Development of transcriptomic resources for interrogating the biosynthesis of monoterpene indole alkaloids in medicinal plant species. PLoS ONE 7: e52506.
  23. Herrando‐Moraira, S., Calleja J. A., Galbany‐Casals M., Garcia‐Jacas N., Liu J.‐Q., López‐Alvarado J., López‐Pujol J., et al. 2019. Nuclear and plastid DNA phylogeny of tribe Cardueae (Compositae) with Hyb‐Seq data: A new subtribal classification and a temporal diversification framework. Molecular Phylogenetics and Evolution 137: 313–332.
  24. Heyduk, K., Trapnell D. W., Barrett C. F., and Leebens‐Mack J. 2016. Phylogenomic analyses of species relationships in the genus Sabal (Arecaceae) using targeted sequence capture. Biological Journal of the Linnean Society 117: 106–120.
  25. Hillis, D. M., Pollock D. D., McGuire J. A., and Zwickl D. J. 2003. Is sparse taxon sampling a problem for phylogenetic inference? Systematic Biology 52: 124–126.
  26. Hoang, D. T., Chernomor O., von Haeseler A., Minh B. Q., and Vinh L. S. 2018. UFBoot2: Improving the ultrafast bootstrap approximation. Molecular Biology and Evolution 35: 518–522.
  27. Jantzen, J. R., Amarasinghe P., Folk R. A., Reginato M., Michelangeli F. A., Soltis D. E., Cellinese N., and Soltis P. S. 2020. A two‐tier bioinformatic pipeline to develop probes for target capture of nuclear loci with applications in Melastomataceae. Applications in Plant Sciences 8: e11345.
  28. Johnson, M. G., Gardner E. M., Liu Y., Medina R., Goffinet B., Shaw A. J., Zerega N. J. C., and Wickett N. J. 2016. HybPiper: Extracting coding sequence and introns for phylogenetics from high‐throughput sequencing reads using target enrichment. Applications in Plant Sciences 4: 1600016.
  29. Johnson, M. G., Pokorny L., Dodsworth S., Botigué L. R., Cowan R. S., Devault A., Eiserhardt W. L., et al. 2019. A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k‐medoids clustering. Systematic Biology 68: 594–606.
  30. Jones, K. E., Fér T., Schmickl R. E., Dikow R. B., Funk V. A., Herrando‐Moraira S., Johnston P. R., et al. 2019. An empirical assessment of a single family‐wide hybrid capture locus set at multiple evolutionary timescales in Asteraceae. Applications in Plant Sciences 7: e11295.
  31. Kadlec, M., Bellstedt D. U., Le Maitre N. C., and Pirie M. D. 2017. Targeted NGS for species level phylogenomics: ‘Made to measure’ or ‘one size fits all’? PeerJ 5: 3569–3570.
  32. Katoh, K., Misawa K., Kuma K., and Miyata T. 2002. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059–3066.
  33. Kwon, C. W., Park K.‐M., Kang B.‐C., Kweon D.‐H., Kim M.‐D., Shin S. W., Je Y. H., and Chang P.‐S. 2015. Cysteine protease profiles of the medicinal plant Calotropis procera R. Br. revealed by de novo transcriptome analysis. PLoS ONE 10: e0119328.
  34. Larridon, I., Villaverde T., Zuntini A. R., Pokorny L., Brewer G. E., Epitawalage N., Fairlie I., et al. 2020. Tackling rapid radiations with targeted sequencing. Frontiers in Plant Science 10: 1655.
  35. Lemmon, E. M., and Lemmon A. R. 2013. High‐throughput genomic data in systematics and phylogenetics. Annual Review of Ecology, Evolution, and Systematics 44: 99–121.
  36. Léveillé‐Bourret, É., Starr J. R., Ford B. A., Moriarty Lemmon E., and Lemmon A. R. 2018. Resolving rapid radiations within angiosperm families using anchored phylogenomics. Systematic Biology 67: 94–112.
  37. Li, H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., et al. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079.
  38. Livshultz, T. 2010. The phylogenetic position of milkweeds (Apocynaceae subfamilies Secamonoideae and Asclepiadoideae): Evidence from the nucleus and chloroplast. Taxon 59: 1016–1030.
  39. Livshultz, T., Middleton D. J., Endress M. E., and Williams J. K. 2007. Phylogeny of Apocynoideae and the APSA clade (Apocynaceae s.l.). Annals of the Missouri Botanical Garden 94: 324–359.
  40. Livshultz, T., Kaltenegger E., Straub S. C. K., Weitemier K., Hirsch E., Koval K., Mema L., and Liston A. 2018. Evolution of pyrrolizidine alkaloid biosynthesis in Apocynaceae: Revisiting the defence de‐escalation hypothesis. New Phytologist 218: 762–773.
  41. Malcolm, S. B., and Brower L. P. 1989. Evolutionary and ecological implications of cardenolide sequestration in the monarch butterfly. Experientia 45: 284–295.
  42. Mandel, J. R., Dikow R. B., and Funk V. A. 2015. Using phylogenomics to resolve mega‐families: An example from Compositae. Journal of Systematics and Evolution 53: 391–402.
  43. Mirarab, S., and Warnow T. 2015. ASTRAL‐II: Coalescent‐based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 31: i44–i52.
  44. Morais, E. B., Schönenberger J., Conti E., Antonelli A., and Szövényi P. 2019. Orthologous nuclear markers and new transcriptomes that broadly cover the phylogenetic diversity of Acanthaceae. Applications in Plant Sciences 7: e11290.
  45. Nabhan, A. R., and Sarkar I. N. 2012. The impact of taxon sampling on phylogenetic inference: A review of two decades of controversy. Briefings in Bioinformatics 13: 122–134.
  46. Nguyen, L.‐T., Schmidt H. A., von Haeseler A., and Minh B. Q. 2015. IQ‐TREE: A fast and effective stochastic algorithm for estimating maximum‐likelihood phylogenies. Molecular Biology and Evolution 32: 268–274.
  47. Peng, Z., Fan W., Wang L., Paudel D., Leventini D., Tillman B. L., and Wang J. 2017. Target enrichment sequencing in cultivated peanut (Arachis hypogaea L.) using probes designed from transcript sequences. Molecular Genetics and Genomics 292: 955–965.
  48. Python Software Foundation. 2016. Python language reference, version 2.7.12. Website http://www.python.org [accessed 20 October 2020].
  49. Simões, A. O., Livshultz T., Conti E., and Endress M. E. 2007. Phylogeny and systematics of the Rauvolfioideae (Apocynaceae) based on molecular and morphological evidence. Annals of the Missouri Botanical Garden 94: 268–297.
  50. Simões, A. O., Kinoshita L. S., Koch I., Silva M. J., and Endress M. E. 2016. Systematics and character evolution of Vinceae (Apocynaceae). Taxon 65: 99–122.
  51. Smith, S. A., Moore M. J., Brown J. W., and Yang Y. 2015. Analysis of phylogenomic datasets reveals conflict, concordance, and gene duplications with examples from animals and plants. BMC Evolutionary Biology 15: 150.
  52. Straub, S. C. K., Parks M., Weitemier K., Fishbein M., Cronn R. C., and Liston A. 2012. Navigating the tip of the genomic iceberg: Next‐generation sequencing for plant systematics. American Journal of Botany 99: 349–364.
  53. Straub, S. C. K., Moore M. J., Soltis P. S., Soltis D. E., Liston A., and Livshultz T. 2014. Phylogenetic signal detection from an ancient rapid radiation: Effects of noise reduction, long‐branch attraction, and model selection in crown clade Apocynaceae. Molecular Phylogenetics and Evolution 80: 169–185.
  54. Vatanparast, M., Powell A., Doyle J. J., and Egan A. N. 2018. Targeting legume loci: A comparison of three methods for target enrichment bait design in Leguminosae phylogenomics. Applications in Plant Sciences 6: e1036.
  55. Verhoeven, R. L., Liede S., and Endress M. E. 2003. The tribal position of Fockea and Cibirhiza (Apocynaceae: Asclepiadoideae): Evidence from pollinium structure and cpDNA sequence data. Grana 42: 70–81.
  56. Villaverde, T., Pokorny L., Olsson S., Rincón‐Barrado M., Johnson M. G., Gardner E. M., Wickett N. J., et al. 2018. Bridging the micro‐ and macroevolutionary levels in phylogenomics: Hyb‐Seq solves relationships from populations to species and above. New Phytologist 220: 636–650.
  57. Walker, J. F., Yang Y., Moore M. J., Mikenas J., Timoneda A., Brockington S. F., and Smith S. A. 2017. Widespread paleopolyploidy, gene tree conflict, and recalcitrant relationships among the carnivorous Caryophyllales. American Journal of Botany 104: 858–867.
  58. Weitemier, K., Straub S. C. K., Cronn R. C., Fishbein M., Schmickl R., McDonnell A., and Liston A. 2014. Hyb‐Seq: Combining target enrichment and genome skimming for plant phylogenomics. Applications in Plant Sciences 2: 1400042.
  59. Weitemier, K., Straub S. C. K., Fishbein M., Bailey C. D., Cronn R. C., and Liston A. 2019. A draft genome and transcriptome of common milkweed (Asclepias syriaca) as resources for evolutionary, ecological, and molecular studies in milkweeds and Apocynaceae. PeerJ 7: e7649.
  60. Xiao, M., Zhang Y., Chen X., Lee E.‐J., Barber C. J. S., Chakrabarty R., Desgagné‐Penix I., et al. 2013. Transcriptome analysis based on next‐generation sequencing of non‐model plants producing specialized metabolites of biotechnological interest. Journal of Biotechnology 166: 122–134.
  61. Yates, S. A., Chernukhin I., Alvarez‐Fernandez R., Bechtold U., Baeshen M., Baeshen N., Mutwakil M. Z., et al. 2014. The temporal foliar transcriptome of the perennial C3 desert plant Rhazya stricta in its natural environment. BMC Plant Biology 14: 2.

Articles from Applications in Plant Sciences are provided here courtesy of Wiley
