Abstract
The clusioid clade of Malpighiales comprises five families: Bonnetiaceae, Calophyllaceae, Clusiaceae, Hypericaceae and Podostemaceae. Recent studies have found that the plastome structure of Garcinia mangostana L. (Clusiaceae) is conserved, whereas the plastomes of five riverweed species from Podostemaceae show significant structural variation. The diversification pattern of plastome structure in the clusioid clade is therefore worth a thorough investigation. Here we determined five complete plastomes representing four families of the clusioid clade. We found that the plastomes of the three early-diverging families (Clusiaceae, Bonnetiaceae and Calophyllaceae) are relatively conserved, while the plastomes of the other two families show significant variation. The Inverted Repeat (IR) regions of Tristicha trifaria and Marathrum foeniculaceum (Podostemaceae) are greatly reduced following the loss of the ycf1 and ycf2 genes. An inversion of over 50 kb spanning from trnK-UUU to rbcL in the Large Single Copy (LSC) region is shared by Cratoxylum cochinchinense (Hypericaceae), T. trifaria and Ma. foeniculaceum (Podostemaceae). The large inverted colinear block in Hypericaceae and Podostemaceae contains all the genes in the 50-kb inverted colinear block in a clade of Papilionoideae, with two extra genes (trnK-UUU and matK) at one end. The other endpoint of the inversion, in both the two clusioid families and Papilionoideae, is located between rbcL and accD. This study helps to clarify plastome evolution in the clusioid clade.
Subject terms: Molecular evolution, Plant evolution
Introduction
Plastomes of heterotrophic plants are generally highly rearranged1, while plastomes of autotrophic angiosperms seem to be relatively conserved2. Most autotrophic angiosperm plastomes are characterized by a pair of Inverted Repeat (IR) regions, one Large Single Copy (LSC) region and one Small Single Copy (SSC) region. They average about 153 kb in size and generally include 101–118 unique genes that primarily participate in photosynthesis, transcription, and translation3,4. The advent of high-throughput sequencing has facilitated rapid progress in the field of comparative plastid genomics5,6.
Several distinct autotrophic angiosperm clades show substantial variation in plastome size and gene order. Large variation in plastome size is often associated with IR expansion or contraction7, but can also be influenced to some extent by gene and intron losses. The loss of ycf1 and ycf2, two hypothetical open reading frames that are the two largest plastid genes, can significantly reduce plastome size6. Multiple independent losses of some plastid genes and introns have been reported8,9, and some of these genes have been transferred to the nucleus9. Successful gene transfers from the plastid to the nuclear genome during angiosperm evolution have been documented for rpl22, rpl32 and infA9,10.
Inversions play an important role in plastid genome structural variation and have been characterized in detail in a number of plastomes. Large inversions have been found in plastomes of many plant lineages, such as Onagraceae11, Asteraceae12, and Fabaceae13. In Fabaceae, multiple large inversions have been reported, including a 50-kb inversion shared by most papilionoids except a few early-diverging clades, a 78-kb inversion in Phaseolinae of Phaseoleae, inversions of 23 kb, 24 kb, or 36 kb in the Genistoid clade, a 39-kb inversion in Robinia of Papilionoideae, and a 38-kb inversion in Tylosema of Cercidoideae13–15. Recent studies have found that flip-flop recombination mediated by short Inverted Repeats (sIRs) can induce large inversions13,15,16.
The clusioid clade (Malpighiales) contains five families (Bonnetiaceae, Calophyllaceae, Clusiaceae, Hypericaceae, and Podostemaceae) represented by 94 genera and ~1900 species17. Their distribution is nearly cosmopolitan, with the greatest species diversity in the tropics18. Species in this clade include large tropical rainforest trees, temperate and high-altitude tropical herbs and shrubs, and even aquatic plants (Podostemaceae) growing in swift-flowing rivers and streams18. Many species are economically important, such as tropical fruits including the mangosteen (Garcinia mangostana L.) and the mammey apple (Mammea americana L.), timber (Calophyllum brasiliense Cambess., Mesua ferrea L.), and medicine (Hypericum perforatum L.).
Previous studies found that the plastome of Garcinia mangostana L. from Clusiaceae was relatively conserved19, while plastomes of five riverweed species from Podostemaceae had highly variable structure20. Why do closely related families have such divergent plastome structures? What is the pattern of plastome structural divergence in this economically and ecologically important clade? These questions make the diversification of plastome structure in the clusioid clade worth further investigation. Here we determined five complete plastomes in the clusioid clade: Bonnetia paniculata Spruce ex Benth. (Bonnetiaceae), Me. ferrea (Calophyllaceae), Cratoxylum cochinchinense (Lour.) Blume (Hypericaceae), Tristicha trifaria (Bory ex Willd.) Spreng. and Marathrum foeniculaceum Bonpl. (Podostemaceae). Comparison of the plastomes in this clade reveals significantly reduced IR regions in the plastomes of T. trifaria and Ma. foeniculaceum following the loss of ycf1 and ycf2. A large inversion of over 50 kb spanning from trnK-UUU to rbcL in the LSC region is shared by C. cochinchinense, T. trifaria and Ma. foeniculaceum.
Results and Discussion
Plastome sequencing and general characteristics
Raw reads were all obtained through whole-genome sequencing. Owing to differences in plant materials and experimental procedures, the average coverage depth of the plastomes varies from 89× to 1823× (Table 1, Table S1). All five new plastomes of the clusioid clade exhibit a typical quadripartite structure. Plastome size among the sampled clusioid species ranges from 130,967 bp in T. trifaria to 161,473 bp in Me. ferrea. The IR length ranges from 19,599 bp in T. trifaria to 27,614 bp in Me. ferrea (Table 1). The GC content of Ma. foeniculaceum (35.1%) is slightly lower than that of the other four species (36.2–36.4%). The plastome size and IR length of the two Podostemaceae species are markedly smaller than those of the species from the other three clusioid families.
Table 1.
Characteristic | Bonnetia paniculata | Mesua ferrea | Cratoxylum cochinchinense | Tristicha trifaria | Marathrum foeniculaceum |
---|---|---|---|---|---|
Family | Bonnetiaceae | Calophyllaceae | Hypericaceae | Podostemaceae | Podostemaceae |
Size (bp) | 156,782 | 161,473 | 156,953 | 130,967 | 131,600 |
Status | complete | complete | complete | complete | complete |
Average base-coverage | 89× | 254× | 382× | 1823× | 476× |
Reads-used | 30,000,000 | 30,000,000 | 30,000,000 | 20,570,154 | 19,714,250 |
IR size (bp) | 27,309 | 27,614 | 26,086 | 19,599 | 19,916 |
Average read length (bp) | 149 | 99 | 99 | 149 | 149 |
GC content | 36.2% | 36.4% | 36.3% | 36.3% | 35.1% |
Accession number | MK995182 | MK995181 | MK995180 | MK995179 | MK995178 |
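Because each plastome is quadripartite, its total length equals LSC + SSC + 2 × IR, so the combined single-copy length can be recovered from the two sizes reported in Table 1. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative and are not part of the original analysis.

```python
# Plastome size and IR size (bp) copied from Table 1.
TABLE1 = {
    "Bonnetia paniculata": (156782, 27309),
    "Mesua ferrea": (161473, 27614),
    "Cratoxylum cochinchinense": (156953, 26086),
    "Tristicha trifaria": (130967, 19599),
    "Marathrum foeniculaceum": (131600, 19916),
}


def single_copy_length(total_bp, ir_bp):
    """Combined LSC + SSC length, assuming exactly two identical IR copies."""
    return total_bp - 2 * ir_bp


def ir_fraction(total_bp, ir_bp):
    """Fraction of the plastome occupied by the two IR copies."""
    return 2 * ir_bp / total_bp


for species, (total, ir) in TABLE1.items():
    sc = single_copy_length(total, ir)
    print(f"{species}: LSC+SSC = {sc} bp, IR share = {ir_fraction(total, ir):.1%}")
```

Separating the IR contribution from the single-copy contribution in this way shows how much of the size difference between species is attributable to each.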
Phylogenetic relationships
A maximum likelihood tree was constructed using an 82-gene matrix. The clusioid clade was strongly supported with a bootstrap value (BS) of 100%. Previous studies, such as Ruhfel et al. (2011) using three plastid loci and one mitochondrial locus and Xi et al. (2012) using broadly sampled plastome data, also strongly supported the clusioid clade17,21. Our results are congruent with previous studies, which resolved a well-supported (Bonnetiaceae, Clusiaceae) clade as the early-diverging lineage, and strongly supported Calophyllaceae as sister to the strongly supported (Hypericaceae, Podostemaceae) clade17,21,22 (Fig. 1). Many morphological characteristics of species in this clade also support these phylogenetic relationships. Although the position of the wholly aquatic Podostemaceae has been very difficult to determine owing to their highly atypical morphology, the terrestrial members of this clade (i.e., Bonnetiaceae, Calophyllaceae, Clusiaceae, and Hypericaceae) have long been considered closely related17,18,21. Bonnetiaceae and Clusiaceae share staminal fascicles opposite the petals. Hypericaceae and Podostemaceae share tenuinucellate ovules17. Additionally, some members of Hypericaceae and Podostemaceae have papillate stigmas. Furthermore, Hypericaceae, Calophyllaceae, and some Podostemaceae share resin-containing glands or canals that are especially visible in the leaves17. The phylogeny of the clusioid clade provides a framework for understanding the evolutionary history of this morphologically and ecologically diverse clade.
Plastome evolution
Plastome structures of the three early-diverging families (Clusiaceae, Bonnetiaceae and Calophyllaceae) are relatively conserved, with only a few gene losses or pseudogenes. The infA gene and the second intron of ycf3 are lost in the plastid genomes of G. mangostana, B. paniculata and Me. ferrea. The ndhK gene is a pseudogene in B. paniculata owing to the presence of an internal stop codon. Other gene losses include the rps16 gene in B. paniculata and the rpl32 gene in G. mangostana. However, the other two families (Hypericaceae and Podostemaceae) show more considerable plastome structural variation. The plastomes of C. cochinchinense, T. trifaria and Ma. foeniculaceum have lost the infA and rps16 genes, the second intron of the clpP gene, the second intron of the ycf3 gene, and the intron of the rps12 gene. The rpl32 gene in T. trifaria and Ma. foeniculaceum and the rps7 gene in Ma. foeniculaceum are pseudogenes owing to the presence of premature stop codons. Additional gene losses in T. trifaria and Ma. foeniculaceum include two plastid hypothetical ORFs (ycf genes), ycf1 and ycf2. As a result of the loss of these two ycf genes, the plastomes of the two Podostemaceae species are significantly smaller than the other four sequenced plastomes of the clusioid clade. The IRs have expanded by approximately 800 bp at the IR/SSC boundary in Ma. foeniculaceum, resulting in the relocation of the rps15 gene from the SSC to the IR.
The two identical copies of the IR provide a template for error correction when a mutation occurs in one of the copies, and hence likely suppress the substitution rate in the IR3. Previous studies have reported an increased substitution rate for genes relocated from the IR into the SC23, and a decreased substitution rate for genes relocated from the SC into the IR24. However, relocation of ycf2 from the IR into the SC was not followed by an accelerated substitution rate, which has been explained by the event having occurred recently in ginkgo evolution25. Studies of Pelargonium plastomes also found that expansion of the IR does not result in decreased substitution rates of the relocated genes, suggesting that lineage- and locus-specific rate heterogeneity may have a larger effect than the IR on substitution rate variation in plastid genes3,24. In our study, the relocated rps15 gene did not show a decreased substitution rate (LRT p-value: 0.21, df = 1, details in Table S2). Since rps15 has not accumulated a significant number of mutations following its relocation, we hypothesize that this relocation occurred recently or that the rps15 gene is simply too short for a change in substitution rate to be detected. Our study supplies another case in which a gene relocated into the IR does not show a decreased substitution rate. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region, but the evolutionary consequences may be more complex than previously suggested3,24.
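The likelihood-ratio test (LRT) reported above compares two nested models that differ by one free parameter (df = 1). The sketch below shows how such a p-value can be obtained from two log-likelihoods; the function name and the log-likelihood values are hypothetical and are not taken from this study or from Table S2. For one degree of freedom the chi-square survival function reduces to erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value
    under a chi-square distribution with 1 degree of freedom.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    stat = max(stat, 0.0)  # guard against small negative values from optimisation noise
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for illustration only (NOT values from this study).
stat, p = lrt_pvalue_df1(lnl_null=-1000.00, lnl_alt=-999.20)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.3f}")
```

A p-value above the chosen threshold (commonly 0.05) means the two-ratio model does not fit significantly better than the one-ratio model, which is the situation described for rps15.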
The loss of the ycf1 and ycf2 genes has been documented in the plastomes of Poaceae26, Geraniaceae27, and Ericaceae28. The functions of both ycf genes are still controversial. Studies in tobacco (Nicotiana tabacum) and green algae (Chlamydomonas reinhardtii) suggested that the ycf1 and ycf2 genes are not related to photosynthesis but encode products that are essential for cell survival29. These two genes have been inferred to be involved in cell division, DNA/mRNA binding, protein assembly and transport, etc.29. One essential function of the ycf1 and ycf2 genes might be linked to the expression, assembly, or function of the accD gene product28,30. Some Poaceae species have lost both ycf genes in addition to the accD gene31. Plants that have lost the accD gene have divergent ycf1 and ycf2 sequences30. The plastomes of T. trifaria and Ma. foeniculaceum, which have lost ycf1 and ycf2, have highly divergent accD sequences, with only 51.8% and 51.1% identical sites, respectively, compared with those of the three early-diverging families (Clusiaceae, Bonnetiaceae and Calophyllaceae). Interestingly, the plastome of C. cochinchinense, which retains the two ycf genes, also has a highly divergent accD sequence, with only 51.8% identical sites compared with the three early-diverging families. Further investigation is required to clarify the coevolution of accD and the two ycf genes. Why the plastomes of these taxa have lost the two ycf genes remains unclear and is also worth further exploration.
Inversions have been characterized in detail in a number of plastomes and represent an essential mechanism for plastome rearrangements2. A large inversion spanning from trnK-UUU to rbcL in the LSC region is shared by the three plastomes of Hypericaceae and Podostemaceae (about 56 kb and 52 kb, respectively; Fig. 2). The inversions are about 4 kb shorter in T. trifaria and Ma. foeniculaceum than in C. cochinchinense, mainly owing to the loss of some intergenic sequences in the Podostemaceae plastomes. Parallel inversions utilizing the same endpoints in distantly related taxa are extremely rare6. Within Fabaceae, a 50-kb inversion occurs in most Papilionoideae except a few early-diverging lineages14,32. Interestingly, the large inversion of Hypericaceae and Podostemaceae contains all the genes in the 50-kb inversion of Papilionoideae, with two extra genes (trnK-UUU and matK) at one breakpoint. The other breakpoint of this inversion is located between rbcL and accD, identical to that of the 50-kb inversion of Papilionoideae. Earlier studies have demonstrated a strong correlation between repetitive sequences and the incidence of inversions. In several cases, dispersed repeats have been inferred to promote inversions through intramolecular recombination2,15. The distribution of repeats was found to be strongly associated with breakpoints in the rearranged plastomes of Geraniaceae27. A plastomic inversion of a 34-kb fragment in Calocedrus macrolepis was inferred to be mediated by an 11-bp IR. A 36-kb inversion in Lupinus and a 39-kb inversion in Robinia are probably mediated by a pair of 29-bp sIRs situated at the 3′-ends of two trnS genes13. No repeats were found in the boundary regions of the inversions in T. trifaria and Ma. foeniculaceum, whereas a pair of 76-bp sIRs was found at the breakpoints of the 56-kb inversion in C. cochinchinense (Fig. S1), and this pair probably mediated the inversion.
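A short inverted repeat (sIR) is a pair of segments in which one segment is the reverse complement of the other. As an illustration of that definition, the sketch below scans a sequence for exact sIR pairs of a fixed length k. It is a simplified stand-in written for this explanation, not the procedure used in this study (which relied on the Find Repeats tool in Geneious Prime and allowed mismatches); the function names and the toy sequence are assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T or N."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_short_inverted_repeats(seq, k):
    """Return sorted (i, j) start pairs with i < j, where seq[j:j+k] is the
    exact reverse complement of seq[i:i+k] and the two copies do not overlap."""
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)

    hits = []
    for kmer, starts in positions.items():
        partner = reverse_complement(kmer)
        for i in starts:
            for j in positions.get(partner, []):
                if j >= i + k:  # keep only non-overlapping pairs, left copy first
                    hits.append((i, j))
    return sorted(hits)


# Toy sequence: a 7-bp segment, a spacer, then the segment's reverse complement.
toy = "GATTACA" + "CCCCC" + "TGTAATC"
print(find_short_inverted_repeats(toy, 7))
```

In a real plastome, candidate pairs flanking the two inversion breakpoints would be the ones of interest; exact matching is the simplest case and misses imperfect repeats.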
Methods
Taxon sampling and DNA sequencing
The previously published plastome of G. mangostana19 (NC_036341) was included in the comparative analyses. Illumina sequencing data for two species were obtained from the NCBI Sequence Read Archive (accession nos. SRR7121482 and SRR7121944), representing two clusioid families: Me. ferrea (Calophyllaceae) and C. cochinchinense (Hypericaceae). Three species were sampled to represent two other families of this clade: B. paniculata (Bonnetiaceae), and T. trifaria and Ma. foeniculaceum (Podostemaceae). Two representative species, Licania heteromorpha (NC_024062) from Malpighiales and Averrhoa carambola (NC_033350) from Oxalidales, were included as outgroups.
Total genomic DNA of B. paniculata, T. trifaria, and Ma. foeniculaceum was isolated from specimens using the DNeasy Plant Mini Kit, then fragmented to construct a short-insert (350 bp) library following the manufacturer’s manual (Illumina). Paired-end sequencing was performed on an Illumina HiSeq X TEN at the Plant Germplasm and Genomics Center (Kunming Institute of Botany, Chinese Academy of Sciences). Details of sample collection are listed in Table S1.
Genome assembly, annotation and analyses
The paired-end reads were filtered and assembled into complete plastomes using GetOrganelle v1.6.1a33–36 under default settings, except that the k-mer values were set according to the read length: -k 21,35,45,55,65,75,85,95,105,115,121 was used for 150-bp reads, while -k 21,45,65,85,89,91,95,99 was used for 100-bp reads. Final assembly graphs were checked in Bandage37. Two configurations of each plastome, caused by flip-flop recombination mediated by the IR, were obtained, and one of them was arbitrarily selected for downstream analysis since the plastome exists in two equimolar states38. All plastomes were initially annotated using PGA39 and GeSeq40, with the annotated plastome of Amborella trichopoda (NC_005086) selected as the reference. For confirmation, all annotations were compared with the previously published plastome of G. mangostana, and exon boundaries were manually adjusted in Geneious Prime41. All newly sequenced plastomes were deposited in GenBank under accession nos. MK995178-MK995182.
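The read-length-dependent choice of k-mer values described above can be written down as a small lookup. This is only a sketch of that rule: the k lists are copied from the text, while the function name, the dictionary name and the sanity checks are assumptions added for illustration and are not part of GetOrganelle.

```python
# k-mer lists copied from the Methods text, keyed by nominal read length (bp).
KMERS_BY_READ_LENGTH = {
    150: [21, 35, 45, 55, 65, 75, 85, 95, 105, 115, 121],
    100: [21, 45, 65, 85, 89, 91, 95, 99],
}


def kmer_argument(read_length):
    """Return the comma-separated k-mer string for a nominal read length.

    Raises ValueError for read lengths not described in the text.
    """
    if read_length not in KMERS_BY_READ_LENGTH:
        raise ValueError(f"no k-mer list defined for {read_length}-bp reads")
    kmers = KMERS_BY_READ_LENGTH[read_length]
    # Sanity checks (assumed, not from the source): k values are odd and
    # shorter than the nominal read length.
    if any(k % 2 == 0 or k >= read_length for k in kmers):
        raise ValueError("k-mer values must be odd and shorter than the reads")
    return ",".join(str(k) for k in kmers)


print(kmer_argument(150))
print(kmer_argument(100))
```

The returned string is in the comma-separated form that follows `-k` in the settings quoted above.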
The 82 shared protein-coding and rRNA genes were extracted from the plastomes of eight species using “get_annotated_regions_from_gb.py” (https://github.com/Kinggerm/PersonalUtilities/)42, then aligned with prank v.14060343. Phylogenetic analysis was performed using the maximum likelihood method with 1000 bootstrap replicates in RAxML version 8.2.1144. We used codeml implemented in PAML45 to estimate nucleotide substitution rates of the rps15 gene in Ma. foeniculaceum under the null model (one dN/dS ratio for all branches) and the alternative model (two or more dN/dS ratios among branches). The codon frequencies were determined using the F3×4 model. The two-ratio model was tested against the one-ratio model by a likelihood-ratio test (LRT) using chi2 in PAML. One copy of the IR was removed from each plastome and the remaining genome sequences were aligned using the progressiveMauve algorithm in Mauve v2.3.146. To identify and discard small or insignificant genome rearrangements, the minimum LCB weight was set to 1588. Repeats were identified using the Find Repeats tool implemented in Geneious Prime. The criteria were set as follows: minimum repeat length, 30 bp; maximum mismatches, 3%; exclude repeats up to 10 bp longer than a contained repeat; and exclude contained repeats when the longer repeat has a frequency of at least 3.
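Removing one IR copy before whole-genome alignment avoids aligning the duplicated region twice. The sketch below illustrates that step on a toy sequence; the function name, the 0-based coordinate convention and the toy genome are assumptions for illustration, and in practice the IR coordinates would come from the plastome annotation.

```python
def remove_one_ir(seq, ir_start, ir_length):
    """Return seq with one IR copy deleted.

    ir_start is the 0-based start of the copy to remove; ir_length is its length.
    """
    if ir_start < 0 or ir_length <= 0 or ir_start + ir_length > len(seq):
        raise ValueError("IR coordinates fall outside the sequence")
    return seq[:ir_start] + seq[ir_start + ir_length:]


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Toy quadripartite genome: LSC + IRb + SSC + IRa (IRa is the reverse complement of IRb).
lsc, irb, ssc = "AAAAAAAAAA", "GATTAC", "TTTT"
ira = reverse_complement(irb)
genome = lsc + irb + ssc + ira

# Remove the second IR copy (IRa), which starts right after the SSC.
trimmed = remove_one_ir(genome, ir_start=len(lsc) + len(irb) + len(ssc), ir_length=len(ira))
print(trimmed)
```

The trimmed sequence keeps the LSC, one IR copy and the SSC in their original order, which is the form passed to the aligner in the workflow described above.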
Acknowledgements
We thank the Missouri Botanical Garden for providing specimen materials, BGI researchers for confirmation of the SRA data and specimens, Cheng Liu from the laboratory of the Germplasm Bank of Wild Species, Kunming Institute of Botany for identification of the specimen, and the Molecular Biology Experiment Center, Germplasm Bank of Wild Species in Southwest China for skillful laboratory assistance. This work was supported by grants from the Large-scale Scientific Facilities of the Chinese Academy of Sciences (No. 2017-LSF-GBOWS-02); the Science and Technology Basic Resources Investigation Program of China (2019FY100900); the Strategic Priority Research Program of Chinese Academy of Sciences (XDB31010000); the National Natural Science Foundation of China [key international (regional) cooperative research project No. 31720103903]; and the open research project for “Cross-Cooperative Team” of the Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences to Jian-Jun Jin.
Author contributions
D.M.J., J.J.J. and T.S.Y. conceived and designed the study; D.M.J. and J.J.J. analyzed the data; D.M.J. wrote the manuscript, with contributions from all of the authors. All authors critically reviewed the paper.
Data availability
The complete plastome sequences of Marathrum foeniculaceum, Tristicha trifaria, Cratoxylum cochinchinense, Mesua ferrea and Bonnetia paniculata sequenced in this study have been submitted to the GenBank database under accession numbers MK995178-MK995182.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information is available for this paper at 10.1038/s41598-020-66024-7.
References
- 1. Wicke S, et al. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell. 2013;25:3711–3725. doi: 10.1105/tpc.113.113373.
- 2. Mower JP, Vickrey TL. Structural diversity among plastid genomes of land plants. In: Jansen RK, Chaw SM, editors. Advances in Botanical Research, Vol. 85. Academic Press; 2018. p. 263–292.
- 3. Weng ML, Ruhlman TA, Jansen RK. Expansion of inverted repeat does not decrease substitution rates in Pelargonium plastid genomes. New Phytol. 2016;214:842–851. doi: 10.1111/nph.14375.
- 4. Daniell H, Lin CS, Yu M, Chang WJ. Chloroplast genomes: diversity, evolution, and applications in genetic engineering. Genome Biol. 2016;17:134. doi: 10.1186/s13059-016-1004-2.
- 5. Sabir J, et al. Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes. Plant Biotechnol J. 2014;12:743–754. doi: 10.1111/pbi.12179.
- 6. Rabah SO, et al. Passiflora plastome sequencing reveals widespread genomic rearrangements. J Syst Evol. 2019;57:1–14.
- 7. Chumley TW, Palmer JD, Mower JP, Fourcade HM, Calie PJ, Boore JL, Jansen RK. The complete chloroplast genome sequence of Pelargonium × hortorum: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol Biol Evol. 2006;23:2175–2190. doi: 10.1093/molbev/msl089.
- 8. Jansen RK, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. P Natl Acad Sci USA. 2007;104:19369–19374. doi: 10.1073/pnas.0709121104.
- 9. Magee AM, et al. Localized hypermutation and associated gene losses in legume chloroplast genomes. Genome Res. 2010;20:1700–1710. doi: 10.1101/gr.111955.110.
- 10. Park S, Jansen RK, Park S. Complete plastome sequence of Thalictrum coreanum (Ranunculaceae) and transfer of the rpl32 gene to the nucleus in the ancestor of the subfamily Thalictroideae. BMC Plant Biol. 2015;15:40. doi: 10.1186/s12870-015-0432-6.
- 11. Hupfer H, et al. Complete nucleotide sequence of the Oenothera elata plastid chromosome, representing plastome I of the five distinguishable Euoenothera plastomes. Mol Gen Genet. 2000;263:581–585. doi: 10.1007/pl00008686.
- 12. Walker JF, Zanis MJ, Emery NC. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae). Am J Bot. 2014;101:722–729. doi: 10.3732/ajb.1400049.
- 13. Wang YH, et al. Plastid genome evolution in the early-diverging legume subfamily Cercidoideae (Fabaceae). Front Plant Sci. 2018;9:138. doi: 10.3389/fpls.2018.00138.
- 14. Doyle JJ, Doyle JL, Ballenger JA, Palmer JD. The distribution and phylogenetic significance of a 50-kb chloroplast DNA inversion in the flowering plant family Leguminosae. Mol Phylogenet Evol. 1996;5:429–438. doi: 10.1006/mpev.1996.0038.
- 15. Keller J, et al. The evolutionary fate of the chloroplast and nuclear rps16 genes as revealed through the sequencing and comparative analyses of four novel legume chloroplast genomes from Lupinus. DNA Res. 2017;24:343–358. doi: 10.1093/dnares/dsx006.
- 16. Qu XJ, Wu CS, Chaw SM, Yi TS. Insights into the existence of isomeric plastomes in Cupressoideae (Cupressaceae). Genome Biol Evol. 2017;9:1110–1119. doi: 10.1093/gbe/evx071.
- 17. Ruhfel BR, et al. Phylogeny of the clusioid clade (Malpighiales): evidence from the plastid and mitochondrial genomes. Am J Bot. 2011;98:306–325. doi: 10.3732/ajb.1000354.
- 18. Ruhfel BR, Bove CP, Philbrick CT, Davis CC. Dispersal largely explains the Gondwanan distribution of the ancient tropical clusioid plant clade. Am J Bot. 2016;103:1117–1128. doi: 10.3732/ajb.1500537.
- 19. Jo S, et al. The complete plastome of tropical fruit Garcinia mangostana (Clusiaceae). Mitochondrial DNA B. 2017;2:722–724. doi: 10.1080/23802359.2017.1390406.
- 20. Bedoya AM, et al. Plastid genomes of five species of riverweeds (Podostemaceae): structural organization and comparative analysis in Malpighiales. Front Plant Sci. 2019;10.
- 21. Xi ZX, et al. Phylogenomics and a posteriori data partitioning resolve the Cretaceous angiosperm radiation Malpighiales. P Natl Acad Sci USA. 2012;109:17519–17524. doi: 10.1073/pnas.1205818109.
- 22. Li HT, et al. Origin of angiosperms and the puzzle of the Jurassic gap. Nat Plants. 2019;5:461–470. doi: 10.1038/s41477-019-0421-0.
- 23. Perry AS, Wolfe KH. Nucleotide substitution rates in legume chloroplast DNA depend on the presence of the inverted repeat. J Mol Evol. 2002;55:501–508. doi: 10.1007/s00239-002-2333-y.
- 24. Li FW, Kuo LY, Pryer KM, Rothfels CJ. Genes translocated into the plastid inverted repeat show decelerated substitution rates and elevated GC content. Genome Biol Evol. 2016;8:2452–2458. doi: 10.1093/gbe/evw167.
- 25. Lin CP, Wu CS, Huang YY, Chaw SM. The complete chloroplast genome of Ginkgo biloba reveals the mechanism of inverted repeat contraction. Genome Biol Evol. 2012;4:374–381. doi: 10.1093/gbe/evs021.
- 26. Guisinger MM, Chumley TW, Kuehl JV, Boore JL, Jansen RK. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J Mol Evol. 2010;70:149–166. doi: 10.1007/s00239-009-9317-3.
- 27. Weng ML, Blazier JC, Govindu M, Jansen RK. Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol Biol Evol. 2014;31:645–659. doi: 10.1093/molbev/mst257.
- 28. Braukmann TWA, Broe MB, Stefanovic S, Freudenstein JV. On the brink: the highly reduced plastomes of nonphotosynthetic Ericaceae. New Phytol. 2017;216:254–266. doi: 10.1111/nph.14681.
- 29. Drescher A, Ruf S, Calsa T Jr, Carrer H, Bock R. The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes. Plant J. 2000;22:97–104. doi: 10.1046/j.1365-313x.2000.00722.x.
- 30. Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 2011;28:2077–2086. doi: 10.1093/molbev/msr028.
- 31. Doyle JJ, Davis JI, Soreng RJ, Garvin D, Anderson MJ. Chloroplast DNA inversions and the origin of the grass family (Poaceae). Proc Natl Acad Sci USA. 1992;89:7722–7726. doi: 10.1073/pnas.89.16.7722.
- 32. Cardoso D, et al. Revisiting the phylogeny of papilionoid legumes: new insights from comprehensively sampled early-branching lineages. Am J Bot. 2012;99:1991–2013. doi: 10.3732/ajb.1200380.
- 33. Camacho C, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421.
- 34. Bankevich A, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
- 35. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
- 36. Jin JJ, et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. bioRxiv. 2019;256479. doi: 10.1101/256479.
- 37. Wick RR, Schultz MB, Zobel J, Holt KE. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31:3350–3352. doi: 10.1093/bioinformatics/btv383.
- 38. Walker JF, Jansen RK, Zanis MJ, Emery NC. Sources of inversion variation in the small single copy (SSC) region of chloroplast genomes. Am J Bot. 2015;102:1751–1752. doi: 10.3732/ajb.1500299.
- 39. Qu XJ, Moore MJ, Li DZ, Yi TS. PGA: a software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods. 2019;15:50. doi: 10.1186/s13007-019-0435-7.
- 40. Tillich M, et al. GeSeq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017;45:W6–W11. doi: 10.1093/nar/gkx391.
- 41. Kearse M, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–1649. doi: 10.1093/bioinformatics/bts199.
- 42. Zhang R, et al. Exploration of plastid phylogenomic conflict yields new insights into the deep relationships of Leguminosae. Syst Biol. 2020. doi: 10.1093/sysbio/syaa013.
- 43. Loytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320:1632–1635. doi: 10.1126/science.1158395.
- 44. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
- 45. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
- 46. Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5:e11147. doi: 10.1371/journal.pone.0011147.