Abstract
Premise
Cornales is an order of flowering plants containing ecologically and horticulturally important families, including Cornaceae (dogwoods) and Hydrangeaceae (hydrangeas), among others. While many relationships in Cornales are strongly supported by previous studies, some uncertainty remains with regard to the placement of Hydrostachyaceae and to relationships among families in Cornales and within Cornaceae. Here we analyzed hundreds of nuclear loci to test published phylogenetic hypotheses and estimate a robust species tree for Cornales.
Methods
Using the Angiosperms353 probe set and existing data sets, we generated phylogenomic data for 158 samples, representing all families in the Cornales, with intensive sampling in the Cornaceae.
Results
We curated an average of 312 genes per sample, constructed maximum likelihood gene trees, and inferred a species tree using the summary approach implemented in ASTRAL‐III, a method that is statistically consistent under the multispecies coalescent model.
Conclusions
The species tree we constructed generally shows high support values and a high degree of concordance among individual nuclear gene trees. Relationships among families are largely congruent with previous molecular studies, except for the placement of the nyssoids and the Grubbiaceae‐Curtisiaceae clades. Furthermore, we were able to place Hydrostachyaceae within Cornales, and within Cornaceae, the monophyly of known morphogroups was well supported. However, patterns of gene tree discordance suggest potential ancient reticulation, gene flow, and/or ILS in the Hydrostachyaceae lineage and the early diversification of Cornus. Our findings reveal new insights into the diversification process across Cornales and demonstrate the utility of the Angiosperms353 probe set.
Keywords: ancient reticulation, Angiosperms353, asterids, coalescence, Cornales, gene flow, incomplete lineage sorting, phylogenomics, species tree estimation, target capture
Cornales (10 families; 42 genera; ~605 species) is an order of flowering plants with an extensive history of inconsistent taxonomic treatments. Due to these inconsistencies, the late Dr. Richard H. Eyde (1988) described the group as a “dustbin”, the collection area for a plethora of different reassignments from authors with differing opinions. Despite these inconsistencies, a few morphological characters were used to unite Cornalean lineages, including relatively inconspicuous flowers with inferior ovaries that develop into a drupe (Eyde, 1988). With the emergence of DNA sequencing, molecular phylogenetic studies have drastically improved understanding of this group’s evolutionary history. These studies have helped clean up the “dustbin”, defining a Cornalean clade (Fig. 1) consisting of the following major lineages: Cornaceae‐Alangiaceae (Co‐A), Nyssaceae‐Davidiaceae‐Mastixiaceae (nyssoids), Grubbiaceae‐Curtisiaceae (G‐Cu), Hydrangeaceae‐Loasaceae (Hydra‐L), and Hydrostachyaceae (Xiang et al., 1993, 1998, 2002, 2011; Fan and Xiang, 2003). Several previous Cornus allies were found to belong to Apiales (e.g., Torricellia, Griselinia), to be allied with Aquifoliales (e.g., Helwingia), or to belong to their own order (e.g., Garrya, Aucuba). The new Cornales concept supported by molecular evidence includes several taxa previously thought to be distantly related to Cornus, such as Hydrangeaceae, Loasaceae, Hydrostachyaceae, and Grubbia. Addition of these taxa to the order has dramatically increased its morphological complexity and heterogeneity, leading to challenges for resolving relationships within Cornales, especially with many contentious placements of the family Hydrostachyaceae both within and outside of the order (Xiang, 1999; Albach et al., 2001; Fan and Xiang, 2003; Burleigh et al., 2009).
Until recently, molecular studies had assessed relationships within Cornales using only a few organellar loci and 26S rDNA (matK, ndhF, rbcL, atpB, trnL‐F, and trnH‐K), which showed conflicting or poorly supported results (Xiang et al., 1998, 2002, 2011; Fan and Xiang, 2003). A recent phylogenomic study using whole plastomes resolved relationships among families with strong support (Fu et al., 2019). However, that study placed the nyssoids sister to the Hydra‐L and Hydrostachyaceae, which was unexpected with respect to morphology. The nyssoids have long been considered a core member of the Cornales due to similarities with Cornus species in the anatomical features of the ovaries and pollen morphology (Eyde, 1988), a view also supported by wood anatomy (Noshiro and Baas, 1998) and a recent phylogenetic analysis of morphological data of Cornales (Atkinson, 2018).
FIGURE 1.
Floral diversity across Cornales. (A) Cornaceae, Cornus florida; (B) Alangiaceae, Alangium salviifolium; (C) Nyssaceae, Nyssa sylvatica observed in Washington, D.C., USA by Carrie Seltzer (licensed under http://creativecommons.org/licenses/by‐nc/4.0/); (D) Davidiaceae, Davidia involucrata Digital Image © Board of Trustees, RBG Kew (licensed under http://creativecommons.org/licenses/by/3.0/); (E) Mastixiaceae, Mastixia arborea Digital Image © Board of Trustees, RBG Kew (licensed under http://creativecommons.org/licenses/by/3.0/); (F) Grubbiaceae, Grubbia rosmarinifolia subsp. rosmarinifolia observed in South Africa by Nicola van Berkel (licensed under http://creativecommons.org/licenses/by‐sa/4.0/); (G) Curtisiaceae, Curtisia dentata (Burm.f.) C.A.Sm. observed in South Africa by Craig Peter (licensed under http://creativecommons.org/licenses/by‐nc/4.0/); (H) Hydrangeaceae, Hydrangea arborescens observed in Pennsylvania, USA by Daniel Gillies (licensed under https://creativecommons.org/licenses/by/4.0/); (I) Loasaceae, Mentzelia involucrata; (J) Hydrostachyaceae, Hydrostachys verruculosa A.Juss. observed in Madagascar by Romer Rabarijaona (licensed under http://creativecommons.org/licenses/by‐nc/4.0/).
Within the Cornales, the family Cornaceae, as defined here, is a monogeneric family consisting of only Cornus L. s.l. This family is composed of four morphological groups (Xiang et al., 2006): the blue‐ or white‐fruited (BW) dogwoods, the cornelian cherries (CC), the big‐bracted (BB) dogwoods, and the dwarf (DW) dogwoods. The monophyly of Cornaceae is well supported by both morphological and molecular studies (Murrell, 1993; Xiang et al., 1993, 1996, 1998, 2006), but there has been a long history of controversy in determining relationships within the family. Xiang et al. (2006) examined these relationships using matK, ITS, rbcL and 26S rDNA sequences combined with 37 morphological characters and produced a well‐supported tree. However, when sequences and morphology were analyzed separately, disparate topologies emerged. Furthermore, the support for the relationships among the four groups, particularly involving the placement of the CC group, was often not strong in previous studies using one or a few gene regions (Fan et al., 2004; Xiang et al., 2008). These issues in resolving relationships in Cornales and Cornaceae present a need to reassess these relationships through species‐tree analyses of multiple nuclear loci. At present, just one nuclear locus, 26S rDNA, has been used to infer relationships among all Cornales lineages (Fan and Xiang, 2003; Xiang et al., 2011), and the estimated 26S phylogeny recovers conflicting relationships relative to the plastid gene/genome phylogeny (e.g., placement of Alangium), although the conflict is poorly supported. Furthermore, evidence for a rapid radiation in the order, following its origin in the Cretaceous (Xiang et al., 2011; Atkinson, 2016, 2018), also presents challenges to estimation of a robust multi‐nuclear‐gene species phylogeny.
Advances in the ability to generate a large number of nuclear loci are enabling more robust phylogenetic inferences that can account for incomplete lineage sorting (ILS) and reticulate evolution (Degnan and Rosenberg, 2006; McCormack et al., 2009; Smith et al., 2015). Compared to other genome sampling techniques, targeted enrichment provides a cost‐effective method to obtain large phylogenomic data sets that can provide hundreds to thousands of low‐copy number (LCN) nuclear loci for many samples (Heyduk et al., 2016; McKain et al., 2018; Hale et al., 2020). Furthermore, targeted bait capture allows researchers to obtain data from degraded DNA derived from herbarium samples (Hart et al., 2016; Brewer et al., 2019). Probe set design for target enrichment requires existing genomic resources such as genomes and transcriptomes and extensive computation and sequence curation (Heyduk et al., 2016; Hale et al., 2020). Thus, these factors impose a substantial barrier to entry for phylogenomics in nonmodel systems with limited sequence data. The recent development of the Angiosperms353 probe set (Johnson et al., 2018) provides an attractive option for plant systematists working across angiosperms, enabling the generation of hundreds of nuclear loci for any angiosperm lineage.
We used the Angiosperms353 probe set to generate sequences for hundreds of nuclear loci across 148 samples representing Cornales lineages and outgroups, with an additional 10 samples pulled from existing data sets (Appendix S1) to reconstruct a robust species phylogeny for Cornales and test published hypotheses for relationships across the order. Our primary goals were (1) to resolve relationships amongst the major lineages of Cornales, (2) to further assess the placements of Hydrostachyaceae and nyssoids within Cornales and (3) to look closer at relationships within the family Cornaceae. Furthermore, this study allows us to assess and demonstrate the efficacy of the Angiosperms353 probe set at different taxonomic levels.
METHODS
Library preparation and sequencing
We sampled 148 accessions across Cornales and outgroups, with more intensive sampling in the Cornaceae, and outgroup core eudicot samples representing core asterids, Caryophyllales, and rosids. Representation of Cornalean families included Alangiaceae (1 genus, 20 species: 16 samples representing 10 species); Cornaceae (1 genus, 55–65 species: 91 samples representing 45 species); Curtisiaceae (1 genus, 1 species: 1 sample representing 1 species); Davidiaceae (1 genus, 1 species: 1 sample representing 1 species); Grubbiaceae (1 genus, 3 species: 1 sample representing 1 species); Hydrangeaceae (17 genera, ~190 species: 17 samples representing 8 genera and 16 species); Hydrostachyaceae (1 genus, 22 species: 2 samples representing 2 species); Loasaceae (20 genera, ~325 species: 12 samples representing 7 genera and 12 species); Mastixiaceae (2 genera, 16 species: 1 sample representing 1 genus and 1 species); Nyssaceae (2 genera, 9 species: 5 samples representing 2 genera and 5 species). Tissue samples were collected from the field, botanical gardens, arboreta, and herbarium specimens (Appendix S1). Fresh tissue was either dried on silica desiccant or stored at −20°C if collected from local arboreta or botanical gardens. Tissue was then ground in a mortar and pestle with liquid nitrogen, and total DNA was isolated using the sorbitol CTAB method (Doyle and Doyle, 1987; Štorchová et al., 2000) and qualitatively detected by agarose electrophoresis. DNA was checked for impurities with a Nanodrop spectrophotometer and quantified using a Qubit fluorometer (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). Genomic DNAs were sheared to an average fragment size of 550 bp using an S220 Focused‐Ultrasonicator (Covaris, Woburn, MA, USA) at the Genomic Sciences Laboratory of North Carolina State University.
Genomic libraries were generated using the KAPA LTP Library Preparation kit for Illumina (Roche, Basel, Switzerland) at the Plant Molecular Systematics Laboratory at North Carolina State University, using dual index primers from the EHS DNA Laboratory at the University of Georgia (baddna.uga.edu). Following the KAPA LTP Library Preparation workflow, these genomic libraries were amplified using 12 PCR cycles and then quantified using the Qubit fluorometer HS assay. Average fragment size of the genomic DNA libraries was assessed using the 2100 Agilent Bioanalyzer (Santa Clara, California, USA). Prior to hybridization, 6–12 dual‐indexed genomic DNA libraries were pooled in equimolar concentrations to include (220)500–1000 ng of each library for a final (0.5)4–6 nM total pool. The pooled libraries were then hybridized with the Angiosperms353 probe set (Daicel Arbor Biosciences, myBaits Target Capture Kit, Ann Arbor, MI, USA). Hybridization reactions were incubated at 65°C for 24–32 h, either in a Hybex Microsample Incubator with red Chill‐out Liquid Wax (Bio‐Rad, Hercules, CA, USA) at the Sackler Phylogenomic Laboratory (Jodrell Laboratory at Royal Botanic Gardens, Kew, UK), or in an Eppendorf Mastercycler at the University of Georgia. Targets attached to biotinylated probes were rescued with streptavidin‐coated magnetic beads (Dynabeads, ThermoFisher). Library pools with rescued captures were enriched with 10 PCR cycles, and products were cleaned using Sera‐mag SpeedBeads (GE Healthcare Life Sciences, Pittsburgh, PA, USA). Final enriched libraries were quality checked on the 2100 Agilent Bioanalyzer, and concentrations were measured with qPCR and the Qubit fluorometer. Multiple enriched library pools were sequenced in a NextSeq (Illumina, San Diego, CA, USA), 300‐cycle PE150 High Output flow cell, at the Georgia Genomics and Bioinformatics Core at the University of Georgia.
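As an illustration of the arithmetic behind equimolar pooling, the sketch below converts a mass concentration and mean fragment length into molarity using the common approximation of 660 g/mol per base pair of double‐stranded DNA, then computes the volume of each library needed to contribute the same number of molecules. The function names and the library values used in any call are hypothetical and are not taken from this study.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA concentration in ng/uL to nM for a given mean fragment size."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def equimolar_volumes(libraries, target_fmol):
    """Return the volume (uL) of each library that contributes `target_fmol` femtomoles.

    `libraries` maps a (hypothetical) library name to a tuple of
    (concentration in ng/uL, mean fragment size in bp).
    Because 1 nM equals 1 fmol/uL, volume = fmol / nM.
    """
    return {
        name: target_fmol / ng_per_ul_to_nM(conc, bp)
        for name, (conc, bp) in libraries.items()
    }
```

For example, two libraries with the same fragment size but a twofold difference in mass concentration would need a twofold difference in pooled volume to be represented equally.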
Gene assembly and curation
Raw reads were cleaned with Trimmomatic v0.36 (Bolger et al., 2014), by removing leading and trailing N or low quality (below 3) bases, scanning reads with a 4‐base wide sliding window and cutting where average per base quality was below 15, and removing reads less than 36 bp long (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36). For downstream analyses, we only kept cleaned reads in which both read pairs survived. To assemble target genes, we used the HybPiper pipeline (Johnson et al., 2016). In this pipeline, reads were mapped to the Angiosperms353 targets (Johnson et al., 2018) using the Burrows–Wheeler alignment (BWA) tool (Li and Durbin, 2009). Mapped reads for each target were assembled de novo into contigs using the best k‐mer detected using SPAdes version 3.13.0 (Nurk et al., 2013). The exon contigs were then aligned to the reference to extract coding sequence and then scaffolded using exonerate (Slater and Birney, 2005). To preserve the flanking intron regions and generate “supercontigs”, including both exons and flanking intron regions, we used the intronerate.py script available with HybPiper. Finally, we used the paralog_investigator.py script available with HybPiper, to investigate the distribution of paralogs within our samples (IDs=1‐192, DAV, GRU; Appendix S1).
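The read‐cleaning settings above (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36) can be illustrated with a simplified sketch. This is not Trimmomatic's implementation, only an approximation of the four steps as described in the text; the function names are hypothetical, and treating N explicitly (rather than through its quality score) is an assumption.

```python
def phred33(quality_string):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_read(seq, quals, leading=3, trailing=3, window=4, min_avg=15, min_len=36):
    """Apply simplified LEADING, TRAILING, SLIDINGWINDOW and MINLEN steps.

    Returns the trimmed (seq, quals) pair, or None if the read ends up too short.
    """
    # LEADING: drop 5' bases that are N or have quality below the threshold.
    start = 0
    while start < len(seq) and (quals[start] < leading or seq[start] == "N"):
        start += 1
    seq, quals = seq[start:], quals[start:]

    # TRAILING: drop 3' bases that are N or have quality below the threshold.
    end = len(seq)
    while end > 0 and (quals[end - 1] < trailing or seq[end - 1] == "N"):
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # SLIDINGWINDOW: scan 5' to 3' and cut at the first window whose
    # mean quality falls below min_avg.
    cut = len(seq)
    for i in range(max(len(seq) - window + 1, 0)):
        if sum(quals[i:i + window]) / window < min_avg:
            cut = i
            break
    seq, quals = seq[:cut], quals[:cut]

    # MINLEN: discard reads shorter than the minimum length.
    if len(seq) < min_len:
        return None
    return seq, quals
```

In a paired‐end setting, a pair would be kept only when `trim_read` returns a read for both mates, matching the rule that only pairs in which both reads survived were retained.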
To bolster our outgroup sampling, we used existing data from the Angiosperms353 probe design paper (Johnson et al., 2018). We included the supercontig sequences from the following species: Aralia cordata (Araliaceae, Apiales, asterids), Cyphostemma mappia (Vitaceae, Vitales, rosids), Cyrilla racemiflora (Cyrillaceae, Ericales, asterids), Crossosoma californicum (Crossosomataceae, Crossosomatales, rosids), Dodonaea viscosa (Sapindaceae, Sapindales, rosids), and Gerrardina foliosa (Gerrardinaceae, Huerteales, rosids).
Finally, to increase sampling for Ericales, the putative sister clade to Cornales (OTPT Initiative, 2019; Zhang et al., 2020), we extracted genes from existing diploid reference genomes: Actinidia eriantha (Tang et al., 2019), Rhododendron williamsianum (Soza et al., 2019), and Camellia sinensis (Wei et al., 2018). We used BLASTn (Altschul et al., 1990) to query the gene CDS of these species against the Angiosperms353 target gene set to identify orthologous genes. Using an e‐value threshold of 1E‐20, we removed instances where multiple query genes had hits to the same target gene or gene family, as these may be potential paralogs. Hits where only one query gene mapped to a target were then kept. Using this list, we pulled the full‐length genes to align against the supercontig sequences.
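The hit‐filtering rule described above can be sketched as follows. The sketch assumes BLASTn results have already been parsed into (query gene, target gene, e‐value) tuples; the function name and input layout are hypothetical, and grouping of targets into gene families is not handled here.

```python
from collections import defaultdict


def single_copy_hits(hits, max_evalue=1e-20):
    """Keep targets that are hit by exactly one distinct query gene.

    `hits` is an iterable of (query_gene, target_gene, evalue) tuples.
    Hits with an e-value above `max_evalue` are ignored. Targets hit by
    more than one query gene are dropped as potential paralogs.
    Returns a dict mapping each retained target to its single query gene.
    """
    queries_per_target = defaultdict(set)
    for query, target, evalue in hits:
        if evalue <= max_evalue:
            queries_per_target[target].add(query)
    return {
        target: next(iter(queries))
        for target, queries in queries_per_target.items()
        if len(queries) == 1
    }
```

The resulting target‐to‐gene mapping corresponds to the list from which full‐length genes would be pulled for alignment against the supercontig sequences.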
Phylogeny reconstruction
Since concatenation methods are inconsistent under ILS (Kubatko and Degnan, 2007; Davidson et al., 2015), we opted for a multi‐locus species tree estimation approach. For our analyses, we used supercontig sequences as they provide variable sites in both the coding and noncoding regions, which is of use when reconstructing phylogenetic relationships at both deep and shallow scales (Johnson et al., 2018; Villaverde et al., 2018). We used the program PASTA (Mirarab et al., 2014; Balaban et al., 2019) to align supercontig sequences for each gene target. This program runs three iterations; in each, it builds a guide tree, uses that tree to divide the alignment data matrix into subsets, aligns the resulting subsets with MAFFT‐L‐INS‐i (Katoh et al., 2002, 2005), and merges them back into the full‐length alignment using OPAL (Wheeler and Kececioglu, 2007) and transitivity (Mirarab et al., 2015). We trimmed alignments using Phyutility 2.2.6 (Smith and Dunn, 2008) to remove sites with >90% missing data (‐clean 0.1) and calculated summary statistics for both the trimmed and raw alignments using AMAS (Borowiec, 2016). From each alignment, we constructed gene trees using IQ‐TREE 1.6.5 (Nguyen et al., 2015). We used the ModelFinder option (Kalyaanamoorthy et al., 2017) to select the best substitution model for each alignment and then inferred gene trees under maximum likelihood (ML) (Nguyen et al., 2015). Additionally, we used the ultrafast bootstrap feature (Hoang et al., 2018) to generate 1000 bootstrap replicates for each gene tree. To account for ILS, we estimated an unrooted species tree with ASTRAL‐III (Zhang et al., 2018) using 250 multi‐locus bootstrap (MLB) replicates for clade support. Finally, we scored branches on the species tree using local posterior probabilities (‐t 4 option in ASTRAL) and quartet frequencies (normalized quartet scores, ‐t 8 option in ASTRAL) for the three possible bipartitions around each node in the species tree.
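The site‐trimming criterion (removing alignment columns with more than 90% missing data) can be sketched as below. This is an illustration of the rule rather than Phyutility's code; the function name is hypothetical, and the set of characters counted as missing (gap, "?", and N) is an assumption.

```python
def trim_columns(alignment, min_occupancy=0.1, missing="-?Nn"):
    """Drop alignment columns whose fraction of non-missing characters is too low.

    `alignment` maps sequence names to aligned sequences of equal length.
    A column is kept when at least `min_occupancy` of the sequences carry
    a character that is not in `missing` (0.1 keeps columns with at most
    90% missing data).
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")

    keep = []
    for col in range(length):
        present = sum(1 for name in names if alignment[name][col] not in missing)
        if present / len(names) >= min_occupancy:
            keep.append(col)

    return {name: "".join(alignment[name][c] for c in keep) for name in names}
```

Raising `min_occupancy` gives a stricter filter, which is one way to explore how sensitive downstream gene trees are to sparsely populated columns.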
Trees were plotted in FigTree v1.4.4 (Fig. 2) (http://tree.bio.ed.ac.uk/software/figtree/) or Geneious 11.1.5 (Fig. 3, Appendix S2) (https://www.geneious.com).
FIGURE 2.
Cladogram illustrating Cornales familial relationships. ASTRAL‐III species tree generated using 353 nuclear loci from the Angiosperms353 probe set. Quartet frequencies are depicted for all branches on the tree. Numerical values denote multi‐locus bootstrap support (MLBS) and local posterior probabilities (LPP). MLBS is listed first and LPP second (MLBS/LPP). Branches without support values indicate full support (MLBS = 100, LPP = 1) for both MLBS and LPP. Quartet frequencies at branches are represented by pie charts. Green, yellow, and blue indicate the frequencies for the main, 1st alternative, and 2nd alternative topologies, respectively.
FIGURE 3.
Phylogenetic relationships amongst the family Cornaceae. ASTRAL‐III species tree generated using 353 nuclear loci from the Angiosperms353 probe set. Internal branch lengths measured in coalescent units and terminal branches set to one. Colored clades represent the four Cornaceae morphological groups: green = dwarf dogwoods (DW); pink = blue‐ and white‐fruited dogwoods (BW); blue = cornelian cherries (CC); orange = big‐bracted dogwoods (BB). Quartet frequencies are depicted as pie charts for branches leading to each morphological group and for branches connecting them. Green, yellow, and blue indicate the frequencies for the main, 1st alternative, and 2nd alternative topologies, respectively. Numerical values denote multi‐locus bootstrap support (MLBS). Branches without support values indicate full support (MLBS = 100).
RESULTS
In the 158 samples used to reconstruct the phylogeny, target gene recovery ranged from 68 to 349 genes (Appendices S1, S3). All of the Angiosperms353 genes were included in our analyses. Within Cornales, the average percentage of genes recovered per family was above 87%, except for the family Hydrostachyaceae with an average of 42% (Appendix S1). For samples outside of Cornales, gene target recovery ranged from 64% to 97% (Fig. 1, Appendix S1). The distribution of paralogs in our data ranged from 0 to 83 paralogs per sample, with the vast majority of samples showing low paralog representation (<10 paralogs, mean = 6.975, median = 7) (Appendix S4A). Additionally, no patterns emerged when looking at the distribution of paralogs across clades (Appendix S4B). Summary statistics showed that our raw alignments contained loci with missing data ranging from ~63–97% (Appendices S5A, S6), whereas the trimmed alignments ranged from ~35–66% (Appendices S5B, S7). Species trees recovered from both the raw and trimmed data sets had the same topology with very minor differences in support values (Appendix S2). Therefore, we decided to use the species tree from the raw data set.
As has been seen in previous phylogenomic investigations (OTPT Initiative, 2019; Zhang et al., 2020), Ericales was recovered as sister to Cornales (Fig. 2) and both (“Ericornid” clade of Zhang et al., 2020) were inferred to be sister to the euasterids (“Core Asterids” clade of Zhang et al., 2020). However, here we resolve the euasterids, represented by single species both for Lamiales (euasterid I) and Apiales (euasterid II), as a grade rather than a clade. We assume this topology is an artifact of poor sampling of euasterids. Yet quartet frequencies on the branch leading to the (Apiales (Cornales, Ericales)) clade (Q1 = 0.45; Q2 = 0.16; Q3 = 0.39) show relatively high support for an alternative topology (Q3), which would recover the euasterids in a clade sister to Cornales‐Ericales, in agreement with the 1KP capstone paper (OTPT Initiative, 2019) and Zhang et al. (2020). Our Caryophyllales sample falls sister to the asterid clade, with rosids as the outgroup clade (Fig. 2). Multi‐locus bootstrap support (MLBS) and local posterior probabilities (LPP) throughout the backbone of the tree were often 100% and 1.0, respectively, with a few exceptions (Fig. 2).
The species tree we generated preserves the monophyly of Cornales and the families within it. The five major clades previously described within the order (Xiang et al., 2011) remain intact: Cornaceae‐Alangiaceae (Co‐A), Nyssaceae‐Davidiaceae‐Mastixiaceae (nyssoids), Grubbiaceae‐Curtisiaceae (G‐Cu), Hydrangeaceae‐Loasaceae (Hydra‐L), and Hydrostachyaceae. However, the relationships among these clades are slightly different. Previous molecular studies placed the Co‐A and G‐Cu clades as sister clades, and the nyssoids as sister to Hydra‐L + Hydrostachyaceae (Xiang et al., 2011). In our analyses, G‐Cu is sister to the nyssoids, and this clade is placed sister to the Co‐A clade. Quartet frequencies for the branch leading to the Hydra‐L + Hydrostachyaceae clade (Q1 = 0.53; Q2 = 0.1; Q3 = 0.37) show good support for the main topology (Q1) as seen in Fig. 2. The second alternative topology (Q3) has a relatively high frequency and places Hydrostachyaceae sister to the rest of the Cornales, a topology also seen in some previous studies, such as Fan and Xiang (2003).
Within the family Cornaceae, the four morphological groups (Xiang et al., 2006), namely the blue‐ or white‐fruited (BW) dogwoods, cornelian cherries (CC), big‐bracted (BB) dogwoods, and dwarf (DW) dogwoods, were each found to be monophyletic with low levels of gene tree discordance. Generally, we only saw drops in MLBS at lower taxonomic scales for relationships that were previously also poorly supported (Fig. 3). In our species tree (Fig. 3), CC and BB form a clade, which is sister to BW, and DW is sister to the CC‐BB‐BW clade (DW (BW (CC, BB))). Clades across the Cornus tree generally had low gene tree discordance, except for the branches leading to the (BB, CC) clade and the (BW (BB, CC)) clade (Fig. 3). On the branch leading to (BW (CC, BB)), quartet frequencies support a main topology of (DW (BW (CC, BB))) (Q1 = 0.49), but the first alternative topology, (BW (DW (CC, BB))), also received substantial quartet support (Q2 = 0.33), while quartet support for the second alternative topology, ((CC, BB), (DW, BW)), was lower (Q3 = 0.18). At the branch leading to (BB, CC), the main topology has a higher quartet frequency (Q1 = 0.42), and the two alternatives have equal frequencies (Q2 = 0.29; Q3 = 0.29).
DISCUSSION
In this study, we generated and analyzed a large phylogenomic data set with 158 samples to reconstruct the evolutionary history of the order Cornales, with extensive sampling of Cornaceae. Target capture efficiency was generally good for Cornales samples with the exception of the two Hydrostachys species (Fig. 1; Appendix S1). The distribution of paralogs across clades does not coincide with this result of lower data recovery in Hydrostachyaceae. Without an available genome sequence for Hydrostachyaceae, we cannot determine the reasons for the low coverage. For the Ericales, the Cyrilla racemiflora sample from a previous study (Johnson et al., 2018) exhibited excellent capture efficiency (331 genes, Appendix S1). However, gene recovery from genome annotations for the three other Ericales samples was below average (just 36.5% for Actinidia eriantha, 56.7% for Camellia sinensis, and 73.9% for Rhododendron williamsianum), perhaps due to incomplete annotation of these genomes. Average taxon occupancy among the 353 multiple sequence alignments and gene trees was 144 (Fig. 1). Summary methods like ASTRAL‐III (Zhang et al., 2018) are generally robust to moderate levels of missing data (Xi et al., 2016; Molloy and Warnow, 2018), and bootstrap analyses suggest that this is the case for our analysis of Cornales using the Angiosperms353 probe set.
Relationships among major lineages in Cornales
The estimated relationships within the species tree for Cornales were generally well supported (Fig. 2, Fig. 3, Appendix S3) and largely congruent with the published plastid gene‐based phylogeny (Xiang et al., 2011; Fu et al., 2019) with a few notable differences. The well‐supported relationship between the nyssoids + Grubbiaceae‐Curtisiaceae clade and the Cornaceae‐Alangiaceae clade (Fig. 2) exhibits low among‐gene discordance and high support values. Close affinities among Alangiaceae, Cornaceae, Curtisiaceae and the nyssoids have been found in morphological studies using wood characters (Noshiro and Baas, 1998) and pollen shape (Ferguson, 1977; Eyde, 1988). Moreover, the relationships we find are somewhat congruent with more recent phylogenetic analyses of Cornales based on fruit stone morphology including fossils and extant taxa, which recovered Cornaceae‐Alangiaceae and nyssoids as a clade, with the exception that Grubbiaceae and Curtisiaceae were placed sister to the Cornaceae‐Alangiaceae and nyssoid clade in a grade (Atkinson, 2018; Hayes et al., 2018). Interestingly, Hayes et al. (2018) performed a constrained analysis using a prior molecular topology (Xiang et al., 2011) and recovered Grubbiaceae with Curtisiaceae as a clade sister to fossil taxa in the stem nyssoid lineage (designated as stem NMD in Atkinson, 2018). Although that result was poorly supported, it provides a morphological context for our placement of Grubbiaceae‐Curtisiaceae sister to the nyssoids. Moreover, since previous plastid‐gene analyses (Xiang et al., 2011; Fu et al., 2019) show a well‐supported placement of nyssoids as sister to the Hydra‐L‐Hydrostachyaceae clade, in contrast with the nyssoids + Grubbiaceae‐Curtisiaceae clade seen in our analyses, there is potential for ancient chloroplast capture in the nyssoids.
In agreement with the plastome‐based trees, the aquatic Hydrostachyaceae is placed within Cornales sister to the Hydrangeaceae‐Loasaceae clade (Xiang et al., 2011; Fu et al., 2019). This placement is consistent with fruit morphology, where families in Cornales can be separated into two general groups: indehiscent/drupaceous (Alangiaceae, Cornaceae, Curtisiaceae, Davidiaceae, Grubbiaceae and Nyssaceae) and dehiscent/capsular (Hydrangeaceae, Hydrostachyaceae and Loasaceae) (Atkinson, 2018). However, our data show that the dehiscent/capsular group forms a clade rather than a grade as seen in the phylogenetic analysis from Atkinson (2018). This placement of Hydrostachyaceae is only supported by slightly over half of the gene trees (Q1), and quartet frequency for the placement of Hydrostachyaceae as sister to the remainder of Cornales families (Q3) is greater than the other alternative, which would place Hydrostachyaceae sister to the clade including Cornaceae, nyssoids, Grubbiaceae, and Curtisiaceae (Q2) (Fig. 2). If ILS were the only source of gene tree discordance, one would expect equal quartet frequencies for the two alternative placements of Hydrostachyaceae. Ancestral gene flow or biased gene retention (or capture), by contrast, could skew alternative quartet frequencies.
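The expectation of equal alternative quartet frequencies under ILS follows from the multispecies coalescent: for an internal branch of length t in coalescent units, the main topology is expected at frequency 1 − (2/3)e^(−t) and each alternative at (1/3)e^(−t). The sketch below encodes this textbook relationship; it is not ASTRAL's own estimator, and the function names and the simple asymmetry index are hypothetical. The asymmetry index is descriptive only and is not a significance test.

```python
import math


def expected_quartet_frequencies(t):
    """Expected (main, alt1, alt2) quartet frequencies under ILS alone.

    `t` is the internal branch length in coalescent units.
    """
    alt = math.exp(-t) / 3.0
    return (1.0 - 2.0 * alt, alt, alt)


def branch_length_from_q1(q1):
    """Invert the relationship above: branch length implied by a main frequency."""
    if not (1.0 / 3.0 < q1 < 1.0):
        raise ValueError("q1 must lie strictly between 1/3 and 1")
    return -math.log(1.5 * (1.0 - q1))


def alternative_asymmetry(q2, q3):
    """Descriptive imbalance between the two alternatives (0 means equal)."""
    return abs(q2 - q3) / (q2 + q3)
```

Applied to the Hydrostachyaceae branch (Q1 = 0.53), ILS alone would predict both alternatives at (1 − 0.53)/2 = 0.235, whereas the observed values are 0.10 and 0.37; for the (CC, BB) branch in Cornus, the two alternatives are both 0.29, as expected under ILS.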
Relationships within Cornaceae
Although there has been a long history of controversy in determining relationships in Cornus, a few systematic hypotheses have seen substantial support in more recent studies. One hypothesis discussed by Eyde (1988) synthesizes non‐molecular characters and supports BB and DW in a clade with CC as the sister and BW sister to the rest of the family (BW (CC (DW, BB))). This hypothesis emphasizes the grouping of BB and DW on the basis of showy bracts being a synapomorphy for the group. Another hypothesis by Murrell (1993) suggested that showy bracts have evolved at least twice in Cornus and was based on a cladistic analysis of 28 morphological characteristics in which the most parsimonious tree placed BB and CC in a clade with DW as the sister and BW sister to the rest of the family (BW (DW (CC, BB))). Eyde’s hypothesis was supported by most of the previous molecular phylogenetic studies, which recognized well‐supported sister relationships between BB and DW, although the placement of CC was sometimes not strongly supported (Xiang et al., 1993, 1996, 1998a, 2011; Fan and Xiang, 2003; Fan et al., 2004; Fu et al., 2019). However, data from cDNA sequences of the genes PISTILLATA (PI) and LEAFY (LFY) appeared to agree with Murrell’s hypothesis, but did not provide strong support (Zhang et al., 2008; Liu et al., 2013). The nuclear phylogeny generated from our study revealed novel relationships within Cornus (Fig. 3). These relationships differ from all previous plastid‐based phylogenetic studies of Cornus and Cornales, in which the DW group has usually been strongly supported as the sister of the BB group (Xiang et al., 1993, 1996, 1998, 2006, 2011; Fan and Xiang, 2003; Fu et al., 2019). Instead, our analyses show support for BB and CC in a clade sister to BW, and DW sister to the remainder of the genus (DW (BW (CC, BB))) (Fig. 3), different from all previous hypotheses.
However, at the branch leading to (BW (CC, BB)), the first alternative topology [(BW (DW (CC, BB))) Q2 = 0.33] has a substantial amount of support and is consistent with the hypothesis by Murrell (1993). This hypothesis is also consistent with a more recent morphological analysis by Xiang et al. (2006), which used the same morphological characters as Murrell (1993) plus an additional nine morphological characters and expanded sampling. Uneven quartet frequencies in alternative topologies for the branch leading to (BW (CC, BB)) suggest an ancient reticulation or perhaps the possibility of a hybrid origin for the DW group and/or ILS during the early diversification of Cornus. At the branch leading to (CC, BB) both alternative topologies have equal frequencies, a pattern consistent with ILS. From the recent discoveries of geographically widespread Cornus fossils (Manchester et al., 2010; Atkinson, 2016), Atkinson (2016) suggests that Cornus was much more widespread and diverse in the past and that extant taxa represent relict species from these ancient lineages. The possibility of more diverse and widespread taxa in Cornus in the past supports our hypotheses as these large, widespread populations may have led to the patterns of gene tree discordance we see, suggesting possible gene flow and/or ILS in Cornus.
The patterns of gene tree discordance for some of the deep relationships within Cornales (e.g., those involving the nyssoids and Hydrostachyaceae) and within Cornus suggest rapid radiation of these lineages early in their evolutionary history, as also indicated by analyses of plastid genome data (Fu et al., 2019) and a phylogenetic analysis including fossils (Atkinson, 2018). The variation in quartet frequencies across the tree is largely consistent with ILS, but at a few nodes the quartet frequencies for the alternative resolutions are uneven, raising the possibility of ancient gene flow or hybrid speciation in those lineages (Figs. 2, 3). These results substantiate the need to generate and analyze more nuclear sequence, morphological, and paleontological data to disentangle these complex evolutionary relationships. Future work should test whether ancient gene flow or hybrid speciation contributed to gene tree‐species tree conflict.
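The reasoning behind the contrast between even and uneven quartet frequencies can be sketched in a few lines of Python. Under the multispecies coalescent with ILS alone, the two minor (discordant) topologies around a branch are expected to occur at equal frequencies, so a strong imbalance between them is one signal that additional processes such as gene flow may be involved. The function below is a simple exact binomial test of that symmetry; it is offered as an assumption‐laden illustration, not as the method used in this study. The function name and the example counts are hypothetical, and the test assumes that gene trees are independent and that each informative gene tree supports exactly one of the two alternative topologies.

```python
from math import comb


def minor_topology_asymmetry_pvalue(n_alt1: int, n_alt2: int) -> float:
    """Two-sided exact binomial test that two minor topologies are equally frequent.

    n_alt1 and n_alt2 are counts of gene trees supporting the first and second
    alternative topology around a branch. Under ILS alone, each discordant gene
    tree is expected to support either alternative with probability 0.5.
    """
    n = n_alt1 + n_alt2
    if n == 0:
        return 1.0  # no discordant gene trees, so no evidence of asymmetry
    # With p = 0.5, every outcome k has probability comb(n, k) / 2**n, so
    # outcomes can be ranked by comb(n, k) alone (exact integer arithmetic).
    observed_weight = comb(n, n_alt1)
    extreme = sum(
        comb(n, k) for k in range(n + 1) if comb(n, k) <= observed_weight
    )
    return extreme / 2 ** n


if __name__ == "__main__":
    # Hypothetical counts for illustration only; they are not data from this study.
    print(minor_topology_asymmetry_pvalue(52, 48))  # near-symmetric minor topologies
    print(minor_topology_asymmetry_pvalue(70, 30))  # strongly asymmetric minor topologies
```

A small p‐value for a given branch would indicate that the imbalance between the two alternative topologies is unlikely under ILS alone, but it would not by itself distinguish among gene flow, hybrid speciation, or gene tree estimation error.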
CONCLUSIONS
This study demonstrates the utility of the Angiosperms353 probe set for phylogenomic studies of Cornales. Our findings help elucidate the diversification processes and evolutionary histories of Cornales and Cornaceae. Despite evidence of a rapid radiation early in the diversification of Cornales, this data set resolved a robust species tree for among‐family relationships (including the placement of Hydrostachyaceae) within Cornales. Uneven quartet frequencies for the alternative resolutions around some nodes, together with the observed patterns of gene tree discordance at some deep nodes within Cornales and Cornus (Figs. 2, 3), may imply ancient reticulate evolution.
AUTHOR CONTRIBUTIONS
Shawn Kodiattu Thomas: Formal analysis (equal); Investigation (equal); Writing – original draft (equal); Writing – review & editing (equal). Xiang Liu: Data curation (equal); Writing – review & editing (equal). Zhi‐Yuan Du: Data curation (equal); Formal analysis (equal); Writing – review & editing (equal). Yibo Dong: Data curation (equal); Writing – review & editing (equal). Amanda Cummings: Data curation (equal); Writing – review & editing (equal). Lisa Pokorny: Data curation (equal); Writing – review & editing (equal). Qui‐Yun (Jenny) Xiang: Conceptualization (equal); Data curation (equal); Funding acquisition (equal); Investigation (equal); Writing – review & editing (equal). James H. Leebens‐Mack: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Writing – original draft (equal); Writing – review & editing (equal).
J.L.M. and Q.Y.(J.)X. conceived the project and designed the sampling strategy. Y.D., X.L., and L.P. prepared DNA samples. X.L. and A.C. prepared genomic DNA libraries and performed library amplifications. A.C. and L.P. performed hybridizations, qPCR, and sample pooling. S.K.T. and Z.Y.D. analyzed the data. S.K.T. and J.L.M. wrote the first draft of the manuscript. All authors edited and commented on the final version of the manuscript.
Supporting information
APPENDIX S1. Sample, read pairs, and target gene recovery information.
APPENDIX S2. Species trees produced from raw and trimmed alignments.
APPENDIX S3. Heat map depicting presence/absence of target genes for the 158 samples used.
APPENDIX S4. Distributions of paralogous loci in our samples.
APPENDIX S5. Comparison of missing data percentage per loci in raw and trimmed alignments.
APPENDIX S6. AMAS summary statistics for raw alignments.
APPENDIX S7. AMAS summary statistics for trimmed alignments.
Acknowledgments
The authors are grateful for the comments and suggestions from Associate Editor Dr. Norm Wickett and the two anonymous reviewers that have helped improve this manuscript. We thank Dr. Adam Bewick and Dr. Karolina Heyduk for their valuable help and feedback during data analysis. We thank Sackler Phylogenomic Laboratory, within the Jodrell Laboratory at Royal Botanic Gardens, Kew (Richmond, Surrey, UK), for facilitating molecular lab resources. We thank the Georgia Genomics and Bioinformatics Core facility for generating sequence data for this project and the Georgia Advanced Computing Resource Center for assistance in installing software and in maintaining a high‐performance computing cluster for data analysis. We thank W.B. Zhou for providing DNA samples of some taxa, H. Sun, Q.F. Wang, J. Chen, Y. Luo, X.F. Gao, Y.D. Gao, X.L. Zhao, W.B. Liao, R.S. Lu, L. Li, Y. Liu, and W.B. Zhou for help in obtaining some plant materials and J.C. Raulston Arboretum, Sarah Duke Garden, Missouri Botanical Garden, Arnold Arboretum, NBG (SANBI, South Africa), for providing materials from their living, herbarium, and silica‐dried tissue collections for molecular sampling of DNA. Funding was provided by the Dogwood Genome Project (NSF IOS‐1444567) and benefited from NSF DEB‐1442161.
Thomas, S. K., Liu X., Du Z.‐Y., Dong Y., Cummings A., Pokorny L., Xiang Q.‐Y. (J.), and Leebens‐Mack J. H. 2021. Comprehending Cornales: phylogenetic reconstruction of the order using the Angiosperms353 probe set. American Journal of Botany 108(7): 1112–1121.
Contributor Information
Shawn K. Thomas, Email: shawnkt4@gmail.com.
Qui‐Yun (Jenny) Xiang, Email: jenny_xiang@ncsu.edu.
James H. Leebens‐Mack, Email: jleebensmack@uga.edu.
Data Availability Statement
Raw sequence reads are available in the NCBI SRA BioProject PRJNA729098 (http://www.ncbi.nlm.nih.gov/bioproject/729098). Scripts, intermediate data, assembled sequences and trees are available at https://github.com/shawnkt/Cornales353.
LITERATURE CITED
- Albach, D. C., Soltis D. E., Chase M. W., and Soltis P. S. 2001. Phylogenetic placement of the enigmatic angiosperm Hydrostachys. Taxon 50: 781–805.
- Altschul, S. F., Gish W., Miller W., Myers E. W., and Lipman D. J. 1990. Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
- Atkinson, B. A. 2016. Cretaceous origin of dogwoods: an anatomically preserved Cornus (Cornaceae) fruit from the Campanian of Vancouver Island. PeerJ 4: e2808.
- Atkinson, B. A. 2018. The critical role of fossils in inferring deep‐node phylogenetic relationships and macroevolutionary patterns in Cornales. American Journal of Botany 105: 1401–1411.
- Balaban, M., Moshiri N., Mai U., Jia X., and Mirarab S. 2019. TreeCluster: Clustering biological sequences using phylogenetic trees. PLoS One 14: e0221068.
- Bolger, A. M., Lohse M., and Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120.
- Borowiec, M. L. 2016. AMAS: a fast tool for alignment manipulation and computing of summary statistics. PeerJ 4: e1660.
- Brewer, G. E., Clarkson J. J., Maurin O., Zuntini A. R., Barber V., Bellot S., Biggs N., et al. 2019. Factors affecting targeted sequencing of 353 nuclear genes from herbarium specimens spanning the diversity of angiosperms. Frontiers in Plant Science 10: 1102.
- Burleigh, J. G., Hilu K. W., and Soltis D. E. 2009. Inferring phylogenies with incomplete data sets: a 5‐gene, 567‐taxon analysis of angiosperms. BMC Evolutionary Biology 9: 61.
- Davidson, R., Vachaspati P., Mirarab S., and Warnow T. 2015. Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC Genomics 16(Supplement 10): S1.
- Degnan, J. H., and Rosenberg N. A. 2006. Discordance of species trees with their most likely gene trees. PLoS Genetics 2: e68.
- Doyle, J. J., and Doyle J. L. 1987. CTAB DNA extraction in plants. Phytochemical Bulletin 19: 11–15.
- Eyde, R. H. 1988. Comprehending Cornus: puzzles and progress in the systematics of the dogwoods. Botanical Review 54: 233–351.
- Fan, C., Purugganan M. D., Thomas D. T., Wiegmann B. M., and Xiang (J.) Q. Y. 2004. Heterogeneous evolution of the Myc‐like Anthocyanin regulatory gene and its phylogenetic utility in Cornus L. (Cornaceae). Molecular Phylogenetics and Evolution 33: 580–594.
- Fan, C., and Xiang Q.‐Y. 2003. Phylogenetic analyses of Cornales based on 26S rRNA and combined 26S rDNA‐MATK‐RBCL sequence data. American Journal of Botany 90: 1357–1372.
- Ferguson, I. K. 1977. Cornaceae Dum. World pollen and spore flora, vol. 6, 1–34. Almqvist & Wiksell, Stockholm, Sweden.
- Fu, C.‐N., Mo Z.‐Q., Yang J.‐B., Ge X.‐J., Li D.‐Z., Xiang Q.‐Y. J., and Gao L.‐M. 2019. Plastid phylogenomics and biogeographic analysis support a trans‐Tethyan origin and rapid early radiation of Cornales in the Mid‐Cretaceous. Molecular Phylogenetics and Evolution 140: 106601.
- Hale, H., Gardner E. M., Viruel J., Pokorny L., and Johnson M. G. 2020. Strategies for reducing per‐sample costs in target capture sequencing for phylogenomics and population genomics in plants. Applications in Plant Sciences 8: e11337.
- Hart, M. L., Forrest L. L., Nicholls J. A., and Kidner C. A. 2016. Retrieval of hundreds of nuclear loci from herbarium specimens. Taxon 65: 1081–1092.
- Hayes, R. F., Smith S. Y., Montaellano‐Ballesteros M., Alvarez‐Reyes G., Hernandez‐Rivera R., and Fastovsky D. E. 2018. Cornalean affinities, phylogenetic significance, and biogeographic implications of Operculifructus infructescences from the Late Cretaceous (Campanian) of Mexico. American Journal of Botany 105: 1911–1928.
- Heyduk, K., Stephens J. D., Faircloth B. C., and Glenn T. C. 2016. Targeted DNA region re‐sequencing. In A. M. Aransay and J. L. Lavín Trueba [eds.], Field guidelines for genetic experimental designs in high‐throughput sequencing, 43–68. Springer International Publishing, Cham, Switzerland.
- Hoang, D. T., Chernomor O., von Haeseler A., Minh B. Q., and Vinh L. S. 2018. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Molecular Biology and Evolution 35: 518–522.
- Johnson, M. G., Gardner E. M., Liu Y., Medina R., Goffinet B., Shaw A. J., Zerega N. J. C., and Wickett N. J. 2016. HybPiper: Extracting coding sequence and introns for phylogenetics from high‐throughput sequencing reads using target enrichment. Applications in Plant Sciences 4: 1600016.
- Johnson, M. G., Pokorny L., Dodsworth S., Botigué L. R., Cowan R. S., Devault A., Eiserhardt W. L., et al. 2018. A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k‐medoids clustering. Systematic Biology 68: 594–606.
- Kalyaanamoorthy, S., Minh B. Q., Wong T. K. F., von Haeseler A., and Jermiin L. S. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods 14: 587–589.
- Katoh, K., Kuma K.‐I., Toh H., and Miyata T. 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33: 511–518.
- Katoh, K., Misawa K., Kuma K.‐I., and Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059–3066.
- Kubatko, L. S., and Degnan J. H. 2007. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Systematic Biology 56: 17–24.
- Li, H., and Durbin R. 2009. Fast and accurate short read alignment with Burrows‐Wheeler transform. Bioinformatics 25: 1754–1760.
- Liu, J., Franks R. G., Feng C.‐M., Liu X., Fu C.‐X., and Jenny Xiang Q.‐Y. 2013. Characterization of the sequence and expression pattern of LFY homologues from dogwood species (Cornus) with divergent inflorescence architectures. Annals of Botany 112: 1629–1641.
- Manchester, S. R., Xiang X., and (J.) Xiang Q. 2010. Fruits of Cornelian cherries (Cornaceae: Cornus subg. Cornus) in the Paleocene and Eocene of the Northern Hemisphere. International Journal of Plant Sciences 171: 882–891.
- McCormack, J. E., Huang H., and Knowles L. L. 2009. Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design. Systematic Biology 58: 501–508.
- McKain, M. R., Johnson M. G., Uribe‐Convers S., Eaton D., and Yang Y. 2018. Practical considerations for plant phylogenomics. Applications in Plant Sciences 6: e1038.
- Mirarab, S., Nguyen N., Guo S., Wang L.‐S., Kim J., and Warnow T. 2015. PASTA: ultra‐large multiple sequence alignment for nucleotide and amino‐acid sequences. Journal of Computational Biology 22: 377–386.
- Mirarab, S., Nguyen N., and Warnow T. 2014. PASTA: Ultra‐large multiple sequence alignment. In R. Sharan [ed.], Research in computational molecular biology, 177–191. RECOMB 2014. Lecture notes in computer science, vol. 8394. Springer International Publishing, Cham, Switzerland.
- Molloy, E. K., and Warnow T. 2018. To include or not to include: the impact of gene filtering on species tree estimation methods. Systematic Biology 67: 285–303.
- Murrell, Z. E. 1993. Phylogenetic relationships in Cornus (Cornaceae). Systematic Botany 18: 469–495.
- Nguyen, L.‐T., Schmidt H. A., von Haeseler A., and Minh B. Q. 2015. IQ‐TREE: a fast and effective stochastic algorithm for estimating maximum‐likelihood phylogenies. Molecular Biology and Evolution 32: 268–274.
- Noshiro, S., and Baas P. 1998. Systematic wood anatomy of Cornaceae and allies. IAWA Journal 19: 43–97.
- Nurk, S., Bankevich A., Antipov D., Gurevich A., Korobeynikov A., Lapidus A., Prjibelsky A., et al. 2013. Assembling genomes and mini‐metagenomes from highly chimeric reads. In M. Deng, R. Jiang, F. Sun, and X. Zhang [eds.], Research in computational molecular biology, 158–170. Springer, Berlin, Germany.
- One Thousand Plant Transcriptomes Initiative, Leebens‐Mack J. H., Barker M. S., et al. 2019. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574: 679–685.
- Slater, G. S. C., and Birney E. 2005. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6: 31.
- Smith, S. A., and Dunn C. W. 2008. Phyutility: a phyloinformatics tool for trees, alignments, and molecular data. Bioinformatics 24: 715–716.
- Smith, S. A., Moore M. J., Brown J. W., and Yang Y. 2015. Analysis of phylogenomic datasets reveals conflict, concordance, and gene duplications with examples from animals and plants. BMC Evolutionary Biology 15: 150.
- Soza, V. L., Lindsley D., Waalkes A., Ramage E., Patwardhan R. P., Burton J. N., Adey A., et al. 2019. The Rhododendron genome and chromosomal organization provide insight into shared whole‐genome duplications across the heath family (Ericaceae). Genome Biology and Evolution 11: 3353–3371.
- Štorchová, H., Hrdličková R., Chrtek J. Jr., Tetera M., Fitze D., and Fehrer J. 2000. An improved method of DNA isolation from plants collected in the field and conserved in saturated NaCl/CTAB solution. Taxon 49: 79–84.
- Tang, W., Sun X., Yue J., Tang X., Jiao C., Yang Y., Niu X., et al. 2019. Chromosome‐scale genome assembly of kiwifruit Actinidia eriantha with single‐molecule sequencing and chromatin interaction mapping. GigaScience 8: giz027.
- Villaverde, T., Pokorny L., Olsson S., Rincón‐Barrado M., Johnson M. G., Gardner E. M., Wickett N. J., et al. 2018. Bridging the micro‐and macroevolutionary levels in phylogenomics: Hyb‐Seq solves relationships from populations to species and above. New Phytologist 220: 636–650.
- Wei, C., Yang H., Wang S., Zhao J., Liu C., Gao L., Xia E., et al. 2018. Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality. Proceedings of the National Academy of Sciences, USA 115: E4151–E4158.
- Wheeler, T. J., and Kececioglu J. D. 2007. Multiple alignment by aligning alignments. Bioinformatics 23: i559–i568.
- Xiang, Q.‐Y. 1999. Systematic affinities of Grubbiaceae and Hydrostachyaceae within the Cornales—Insights from rbcL sequences. Harvard Papers in Botany 4: 527–541.
- Xiang, Q.‐Y., Moody M. L., Soltis D. E., Fan C. Z., and Soltis P. S. 2002. Relationships within Cornales and circumscription of Cornaceae—matK and rbcL sequence data and effects of outgroups and long branches. Molecular Phylogenetics and Evolution 24: 35–57.
- Xiang, J. Q., Brunsfeld S. J., Soltis D. E., and Soltis P. S. 1996. Chloroplast DNA phylogeny of Cornus L. (Cornaceae) and its implications for biogeography and character evolution. Systematic Botany 21: 515–534.
- Xiang, Q.‐Y., Soltis D. E., Morgan D. R., and Soltis P. S. 1993. Phylogenetic relationships of Cornus L. sensu lato and putative relatives inferred from rbcL sequence data. Annals of the Missouri Botanical Garden 80: 723–734.
- Xiang, Q., Soltis D., and Soltis P. 1998. Phylogenetic relationships of Cornaceae and close relatives inferred from matK and rbcL sequences. American Journal of Botany 85: 285.
- Xiang, Q.‐Y., Thomas D. T., and Xiang Q. P. 2011. Resolving and dating the phylogeny of Cornales—Effects of taxon sampling, data partitions, and fossil calibrations. Molecular Phylogenetics and Evolution 59: 123–138.
- Xiang, Q.‐Y., Thomas D. T., Zhang W., Manchester S. R., and Murrell Z. 2006. Species level phylogeny of the genus Cornus (Cornaceae) based on molecular and morphological evidence—implications for taxonomy and Tertiary intercontinental migration. Taxon 55: 9–30.
- Xiang, Q. Y. (J.), Thorne J. L., Seo T. K., Zhang W., Thomas D. T., and Ricklefs R. E. 2008. Rates of nucleotide substitution in Cornaceae (Cornales)—Pattern of variation and underlying causal factors. Molecular Phylogenetics and Evolution 49: 327–342.
- Xi, Z., Liu L., and Davis C. C. 2016. The impact of missing data on species tree estimation. Molecular Biology and Evolution 33: 838–860.
- Zhang, C., Rabiee M., Sayyari E., and Mirarab S. 2018. ASTRAL‐III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics 19: 153.
- Zhang, C., Zhang T., Luebert F., Xiang Y., Huang C.‐H., Hu Y., Rees M., et al. 2020. Asterid phylogenomics/phylotranscriptomics uncover morphological evolutionary histories and support phylogenetic placement for numerous whole genome duplications. Molecular Biology and Evolution 37: 3188–3210.
- Zhang, W., Xiang Q.‐Y., Thomas D. T., Wiegmann B. M., Frohlich M. W., and Soltis D. E. 2008. Molecular evolution of PISTILLATA‐like genes in the dogwood genus Cornus (Cornaceae). Molecular Phylogenetics and Evolution 47: 175–195.