Scientific Reports. 2020 Jun 10;10:9394. doi: 10.1038/s41598-020-66219-y

Patterns and Rates of Plastid rps12 Gene Evolution Inferred in a Phylogenetic Context using Plastomic Data of Ferns

Shanshan Liu 1, Zhen Wang 1, Hui Wang 2, Yingjuan Su 1,3, Ting Wang 4
PMCID: PMC7287138  PMID: 32523061

Abstract

The trans-splicing rps12 gene of fern plastomes (plastid genomes) exhibits a unique structure owing to its variations in intragenic exon location and intron content, and thus, it provides an excellent model system for examining the effect of plastid gene structure on rates and patterns of molecular evolution. In this study, 16 complete fern plastome sequences were newly generated via the Illumina HiSeq sequencing platform. We reconstructed the phylogeny of ferns and inferred the patterns and rates of plastid rps12 gene evolution in a phylogenetic context by combining these plastome data with those of previously published fern species. We uncovered the diversity of fern plastome evolution by characterizing the structures of these genomes and obtained a highly supported phylogenetic framework for ferns. Furthermore, our results revealed molecular evolutionary patterns that were completely different from the patterns revealed in previous studies. The patterns and rates of nucleotide substitution differed significantly between intron-containing and intron-less rps12 alleles. Rate heterogeneity between single-copy (SC) and inverted repeat (IR) exons was evident. Unexpectedly, however, IR exons exhibited significantly higher synonymous substitution rates (dS) than SC exons, a pattern that contrasts with the regional effect responsible for decreased rates of nucleotide substitutions in IRs. Our results reveal that structural changes in plastid genes have important effects on evolutionary rates, and we propose possible mechanisms to explain the variations in the nucleotide substitution rates of this unusual gene.

Subject terms: Molecular evolution, Plant evolution, Plant molecular biology, Evolutionary biology, Sequencing, Genome evolution

Introduction

Plastid ribosomes are ubiquitous organelles in plant cells and play a vital role in the biosynthesis of proteins. In higher plants, plastid ribosomes contain approximately 60 ribosomal proteins that are encoded in both the plastid and the nuclear genetic compartments1. Among the plastid-encoded ribosomal protein gene structures, rps12 is the most notable. The plastid ribosomal protein S12 encoded by the rps12 gene is a highly conserved protein located in the functional center of the 30S subunit of the ribosome2. In fern plastomes (plastid genomes), where rps12 is a trans-splicing gene, this gene is split into three exons by two introns in most, but not all, ferns, and one intron (intron I) is discontinuous. The first exon of the rps12 gene is generally located in the large single-copy (LSC) region, whereas the second and third exons reside in the inverted repeats (IRs); the two IR copies have identical sequences but opposite transcriptional directions. More importantly, the second intron is lacking from the rps12 gene of all species belonging to three basal fern lineages: Psilotales, Ophioglossales, and Equisetales3,4. Thus, two distinct rps12 gene types were identified in ferns based on the presence or absence of intron II, and type I and II genes corresponded to intron-containing and intron-less genes, respectively. Plastid genomic structural alterations, such as inversions, duplications, and gene or intron loss, are often accompanied by an increase in the rate of plastome sequence evolution5,6. Therefore, different copy numbers of exons and the presence or absence of introns have garnered substantial interest as an avenue to explore the evolutionary patterns of this unique gene.

Introns are highly stable components of land plant plastomes, and it is widely believed that a basic set of introns was established prior to the divergence between vascular and nonvascular plants, because this intron set is shared by different taxonomic groups (such as charophytes, bryophytes, and spermatophytes)7. The intron contents of fern plastomes are also highly conserved, with no gains and few losses during fern evolution4,8. The interspecies variation in the intron content of fern plastomes mainly reflects the complete loss of several intron-containing genes in specific lineages (e.g., intron-containing rps16 genes are absent in Psilotales, Ophioglossales, and Equisetales4), or the presence of intron-less alleles in some lineages (e.g., intron-less rps12 genes are present in Psilotales, Ophioglossales, and Equisetales; Lygodium japonicum lacks rpoC1 introns that are commonly found in other ferns8). Intron losses have been associated with elevated substitution rates in plastid genes, as previously described5,9,10, but there is an extreme bias in taxon sampling of these studies. The vast majority of these studies are based on angiosperms. Consequently, we cannot determine whether this pattern is a universal phenomenon in the plastome or an independent evolutionary event in certain lineages or genes. The plastid rps12 gene in ferns provides an excellent model system to study the effect of intron losses on evolutionary rate and test the generality of this evolutionary pattern.

Another distinctive feature of the rps12 gene in ferns is the variation in copy number among its exons. As mentioned previously, in most ferns, the first exon of rps12 is located in the LSC, whereas the second and third exons reside in the IRs. Two identical IR regions are a prominent structural feature of the plastomes of nearly all land plants. The regional effect on the evolutionary pattern of plastid genes, in which genes located in the IRs evolve more slowly than genes in the single-copy (SC) regions, has been documented by extensive research11–15. The sequence identity of IRs can be maintained by copy-dependent DNA repair: when a mutation is introduced into one IR copy, the other copy provides a template for error correction13,16, thereby suppressing the substitution rate in the IRs. Biased gene conversion, as an efficient mutation-correcting mechanism, can result in a genome in which regions with different copy numbers have different mutation rates16,17. There are up to 900 genomic copies in a single plastid18, and the duplicative property of the IRs provides an even greater number of copies; therefore, the frequency of gene conversion in the IRs should be higher than that in the SC regions, and this phenomenon might be responsible for a significantly lower evolutionary rate in the IRs than in the SC regions14,16.

However, with a growing quantity of plastome data available for comparative genomics, this hypothesis of the regional effect of the IRs has been refuted by several studies19–21. In addition to the impact of IRs on the evolutionary rate, non-IR localization and locus- and lineage-specific effects also contribute significantly to evolutionary rate heterogeneity in plastomes19,21–23. Therefore, we suggest that none of these factors alone can sufficiently explain the rates and patterns of molecular evolution in plastid genes. For this reason, we turned our research focus to the split rps12 gene because of its location in both the SC and IR regions. The variation in the exon location and intron content of the rps12 gene in fern plastomes provides a unique opportunity to explore the effect of gene structure on sequence evolution. To thoroughly understand the pattern of rps12 gene evolution in ferns, in the current study, we employed greatly expanded taxon sampling, including 91 fern species, and undertook a broad survey to investigate the impact of structural variation on plastid gene evolution.

Results and Discussion

Organization and dynamic structural evolution of plastid genomes in ferns

Whole-genome sequencing using an Illumina HiSeq platform generated 6,794,240–23,309,670 raw reads for 16 samples. We obtained 6,026,844–21,215,174 clean reads by removing adaptors and low-quality read pairs (Table 1). Following de novo and reference-guided assembly strategies, 16 newly sequenced fern plastomes were each assembled into a single circular molecule. Both genome size and GC content were relatively conserved among all species (Table 2). The size of the 16 plastome sequences ranged from 148,928 bp in Dryopteris sieboldii to 164,857 bp in Selliguea yakushimensis, and all plastomes displayed a typical quadripartite structure consisting of a large single-copy region (LSC, 79,002–92,033 bp), a small single-copy region (SSC, 19,848–27,733 bp), and a pair of inverted repeats (IRs, 22,528–32,017 bp) (Fig. 1 and Table 2). Across all sequenced ferns, there were 84–86 protein-coding genes, 27–29 tRNA genes and 4 rRNA genes. The polypods and tree ferns had similar coding gene contents, with a few notable distinctions. Compared with the 84 protein-coding genes inferred to be present in the ancestral plastomes of polypods, tree ferns had an additional psaM gene that was shared with most other non-polypods. The majority of the sequenced samples displayed typical fern intron contents. The rps12 gene in all the sequenced species was classified as type I: the first exon was located in the LSC, far away from the second and third exons, which were present in two copies in the IRs.

Table 1.

Summary of the sequencing data for 16 fern species.

Species | Raw data (Gb) | Clean data (Gb) | Raw reads | Clean reads | Coverage (×) | Accession number
L. microphyllum | 2.43 | 2.08 | 8,083,733 | 6,927,768 | 198.45 | MN623356
P. bifurcatum | 6.99 | 6.36 | 23,309,670 | 21,215,174 | 552.65 | MN623367
L. hederaceum | 2.28 | 2.21 | 7,585,298 | 7,066,889 | 260.67 | MN623364
S. yakushimensis | 2.94 | 2.71 | 9,815,752 | 9,033,527 | 284.70 | MN623352
T. decurrens | 2.17 | 2.05 | 7,228,697 | 6,826,579 | 24.29 | MN623363
N. cordifolia | 2.04 | 1.81 | 6,794,240 | 6,026,844 | 85.16 | MN623365
D. sieboldii | 3.26 | 3.00 | 10,851,235 | 10,014,316 | 221.45 | MN623354
B. subcordata | 2.48 | 2.28 | 8,276,022 | 7,604,656 | 127.23 | MN623358
P. triphyllum | 5.79 | 5.37 | 19,289,987 | 17,902,643 | 356.00 | MN623361
P. decursive-pinnata | 3.22 | 2.86 | 10,736,184 | 9,531,427 | 79.56 | MN623353
G. erubescens | 2.89 | 2.63 | 9,643,034 | 8,770,301 | 1196.58 | MN623355
O. gibba | 3.25 | 3.06 | 11,775,566 | 11,370,848 | 125.57 | MN623360
B. insignis | 3.86 | 3.58 | 12,880,011 | 11,926,722 | 178.90 | MN623366
D. maximum | 3.13 | 2.87 | 10,430,597 | 9,565,536 | 128.62 | MN623359
S. lepifera | 3.10 | 2.87 | 10,346,974 | 9,561,719 | 253.00 | MN623357
P. subadnata | 2.80 | 2.56 | 9,342,117 | 8,540,938 | 393.80 | MN623362

Table 2.

Plastome features of the sequenced species. Plus and minus signs denote genes that are present and absent, respectively, in the corresponding species. ψ represents pseudogenes.

Species | Genome size (bp) | LSC (bp) | IR (bp) | SSC (bp) | GC (%) | Genes | trnR-UCG | trnV-UAC | trnT-UGU | trnN-GUU
L. microphyllum | 158,029 | 81,244 | 27,494 | 21,797 | 41.83 | 132 | + | + | + | +
P. bifurcatum | 156,985 | 79,002 | 28,249 | 21,485 | 39.91 | 130 | - | - | + | +
L. hederaceum | 152,337 | 81,395 | 24,593 | 21,756 | 42.65 | 132 | + | + | + | +
S. yakushimensis | 164,857 | 80,975 | 32,017 | 19,848 | 40.80 | 135 | + | + | + | +
T. decurrens | 151,258 | 82,963 | 23,256 | 21,783 | 41.98 | 129 | - | + | + | -
N. cordifolia | 149,152 | 82,020 | 22,850 | 21,432 | 39.36 | 131 | - | + | + | +
D. sieboldii | 148,928 | 82,251 | 22,528 | 21,621 | 43.08 | 131 | + | - | + | +
B. subcordata | 153,428 | 83,087 | 24,476 | 21,389 | 42.39 | 132 | + | + | + | +
P. triphyllum | 151,908 | 82,777 | 23,617 | 21,897 | 42.82 | 132 | + | + | + | +
P. decursive-pinnata | 150,995 | 82,344 | 23,530 | 21,591 | 42.37 | 132 | + | + | + | +
G. erubescens | 156,961 | 82,715 | 26,229 | 21,788 | 43.15 | 132 | + | + | + | +
O. gibba | 159,641 | 92,033 | 23,018 | 21,572 | 43.53 | 132 | + | + | + | +
B. insignis | 149,734 | 81,453 | 23,387 | 21,508 | 41.40 | 131 | + | - | + | +
D. maximum | 150,984 | 82,293 | 23,462 | 21,767 | 43.90 | 132 | + | + | + | +
S. lepifera | 162,216 | 86,349 | 24,067 | 27,733 | 40.80 | 132 | + | + | ψ | +
P. subadnata | 159,998 | 89,960 | 24,307 | 21,424 | 42.94 | 132 | + | + | ψ | +

Figure 1.

Sizes of each part of 16 fern complete plastome sequences.
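
The quadripartite structure implies a simple size identity: total length = LSC + 2 × IR + SSC, because the IR is present in two copies. A minimal sketch of that consistency check is given below, using three rows copied from Table 2; the function and variable names are illustrative and are not part of the original analysis.

```python
# Values copied from Table 2: (genome size, LSC, IR, SSC), all in bp.
PLASTOMES = {
    "L. microphyllum": (158029, 81244, 27494, 21797),
    "S. yakushimensis": (164857, 80975, 32017, 19848),
    "D. sieboldii": (148928, 82251, 22528, 21621),
}


def quadripartite_total(lsc, ir, ssc):
    """Length of a circular plastome with one LSC, one SSC and two IR copies."""
    return lsc + 2 * ir + ssc


def check_all(table):
    """Map each species to True if its regions sum to the reported genome size."""
    return {
        species: quadripartite_total(lsc, ir, ssc) == size
        for species, (size, lsc, ir, ssc) in table.items()
    }


if __name__ == "__main__":
    for species, ok in check_all(PLASTOMES).items():
        print(species, "consistent" if ok else "inconsistent")
```

The same check can be applied to every row of Table 2 as a quick guard against transcription errors in region sizes.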

Although the plastome structure remained relatively stable, some species showed exceptional variability in tRNA gene content (Table 2). Several tRNA gene losses from multiple independent lineages were detected, including the loss of trnR-UCG in Platycerium bifurcatum, Tectaria decurrens, and Nephrolepis cordifolia; the loss of trnV-UAC in P. bifurcatum, D. sieboldii, and Brainea insignis; the loss of trnN-GUU in T. decurrens; and the pseudogenization of trnT-UGU in Sphaeropteris lepifera and Plagiogyria subadnata (Table 2). With the exception of trnN-GUU, all of these genes have also been lost in parallel in other polypod plastomes. The trnN-GUU gene belongs to the core set of genes contained in the IRs and is generally adjacent to either ndhF or chlL at the IR/SSC borders. In ferns, IRs are generally considered to be the most stable part of the plastome because genomic rearrangement rarely occurs in these regions. In contrast, a recent study indicated that the IR sequences and gene contents were highly variable in polypods24. Our results are in accordance with the latter findings and provide additional evidence for the dynamic evolution of IR regions among closely related polypod plastomes.

Polypods are the most derived fern lineage; they diversified in the Cretaceous period, displaying an ecologically opportunistic response to the diversification of angiosperms25. The plastomes of polypods have undergone multiple complex genomic reconfigurations during fern evolution, and thus, their plastomes differ substantially from the plastomes of basal ferns (Psilotales, Ophioglossales, Marattiales, and Equisetales). Plastome evolution among polypods is considered relatively static compared with that in lineages other than polypods26. Surprisingly, a distinct genome organization was identified in S. yakushimensis, involving an inversion and IR boundary variation. A major difference between the S. yakushimensis plastome and those of other core leptosporangiates is the altered location of the ndhF-ccsA region (Fig. 2). In addition, IR expansion into the SSC resulted in a duplication of the ycf1, chlL, and chlN genes in the S. yakushimensis plastome. The extent of IR expansion in the S. yakushimensis plastome is unprecedented among ferns. The IRs in S. yakushimensis are up to 32 kb in length (Fig. 1 and Table 2), whereas the longest IR found previously was 29 kb in Cibotium barometz27. These rare genome structure variations are reported in polypods for the first time.

Figure 2.

Unique structural changes in the plastome of Selliguea yakushimensis compared to those of other core leptosporangiates. Genes are represented by colored boxes above or below the black chromosome bar according to the direction of transcription. The novel inversion in S. yakushimensis is shown in a purple box. The red dashed arrow indicates the range of the IR expansion in the S. yakushimensis plastome.

Phylogenetic analyses

Both the 93- and 84-taxon datasets based on 50 protein-coding genes yielded a consistent phylogenetic framework, differing only in the support values for some nodes (Fig. 3 and Supplementary Fig. S1). Along the backbone, Equisetales was recovered as sister to the remaining ferns with strong support, followed by a highly supported Ophioglossales + Psilotales clade, itself sister to a clade containing Marattiales (Fig. 3). The phylogenetic relationships among these four basal fern orders are among the most debated topics in fern phylogeny. Most previous studies using nuclear genes28–30, a combination of mitochondrial and plastid sequences31, several plastid genes32,33, or whole plastome sequences34,35 also obtained this topology. In contrast, other phyloplastomic studies tend to support grouping Equisetales and Ophioglossales + Psilotales together as a monophyletic group that is sister to the remaining ferns36,37. In addition, recent phylotranscriptomic analyses have revealed yet another topology among the basal fern orders, in which Equisetales is the sister group to all other ferns, whereas Marattiales and Psilotales + Ophioglossales form a monophyletic group38. Given that plastid genes generally evolve more slowly than nuclear genes, these topological differences may be due to the different numbers of phylogenetically informative sites contained in the different molecular datasets39.

Figure 3.

Phylogram showing intron losses and the distribution of the rps12 gene in fern plastomes. The topology was based on an ML tree generated from 50 concatenated protein-coding genes from 91 fern and two outgroup plastomes. Bootstrap support values are shown only for nodes with less than 100% support. The absence of the rps12 intron is indicated with a red line; the dashed line denotes that all the exons of rps12 were present in only one copy.

The diversification of leptosporangiates occurred after that of eusporangiate ferns. Within the leptosporangiates, Osmundales was the earliest-diverging lineage, and Gleicheniales and Hymenophyllales then formed a monophyletic group sister to the remaining non-Osmundales ferns (Fig. 3). The phylogenetic position of Hymenophyllales remains debated. Most phylogenetic analyses have depicted one of two alternative relationships: Hymenophyllales as a sister clade to Gleicheniales + the rest of the non-Osmundales leptosporangiates25,32,33,40–43, or Hymenophyllales and Gleicheniales together forming a clade that is sister to the remaining non-Osmundales leptosporangiates29,38,44. Interestingly, both topologies have been found in more recent plastid phylogeny reconstructions using different data types and partition schemes37. Kuo et al.37 showed that plastome organization features also fail to provide additional support for either of these two topologies. This phylogenetic uncertainty is probably due to differences in taxon sampling density, data types, model selection, and tree inference methods. Although the position of Hymenophyllales remains inconclusive, our results merit consideration given the feasibility and effectiveness of plastomes for inferring phylogenies.

As in previous studies, Schizaeales was clearly identified as the sister clade to the core leptosporangiates29,33,36,38,44. In the core leptosporangiates, Salviniales and Cyatheales are successive sisters to Polypodiales, and each of these orders is clearly monophyletic, with moderate to high support (Fig. 3 and Supplementary Fig. S1). Our results showed that the earliest-diverging clade of Polypodiales is Lindsaeaceae, followed by Pteridaceae. These clades are well supported as monophyletic. Although Dennstaedtiaceae was weakly supported as the sister lineage to the eupolypods in our study, the relationship found here is congruent with the findings of most recent studies28,36,38,45 (Fig. 3 and Supplementary Fig. S1). Eupolypods account for well over half of the extant fern diversity, and determining their sister group has been difficult. Earlier studies reported that Pteridaceae, instead of Dennstaedtiaceae, was sister to the eupolypods; however, the relationships among these three lineages were not well-resolved because of low support41,42. These relationships need to be further studied to ascertain which family is the sister lineage to the eupolypods.

Impact of intron loss on the rps12 evolutionary rate

The occurrence of rps12 intron loss in three basal fern lineages is considered to be an important evolutionary event in ferns. If the phylogenetic relationships among ferns are indeed consistent with our analysis, which recovered Equisetales as a sister clade to the remaining ferns, then the intron of the rps12 gene would have been independently lost at least twice during fern evolution4. Research suggests that the rate of plastome sequence evolution is generally affected by structural changes5,6; thus, we sought to investigate the impact of intron loss on the evolutionary rate of the rps12 genes.

Branch length in a phylogenetic tree represents an estimate of the amount of sequence divergence in the corresponding lineage, which is equal to the product of the absolute substitution rate and time46. For this reason, we cannot ignore the timescale over which molecular rates change when comparing substitution rates among taxa. Therefore, we tested for an intron effect on substitution rates by comparing the absolute substitution rates of the type I and type II genes. Our results showed that the trans-splicing rps12 genes in all examined species were highly conserved, with a size of 372 bp encoding a total of 123 amino acids. The differences in the rates between the two types of genes were mainly reflected in the rates of synonymous substitution (RS). Wilcoxon rank sum tests showed that the values of RS for type II genes were significantly higher than those for type I genes (P < 0.01), whereas the rates of nonsynonymous substitution (RN) were not significantly different (P = 0.8019) (Fig. 4; Supplementary Fig. S2 and Table S1). Moreover, to detect whether the selection pressures acting on the two types of genes were significantly different, we compared dN/dS ratios in a phylogenetic context via a model-based approach. A likelihood ratio test (LRT) was used to compare the fits of two models: the null model, in which type I and type II genes shared a single dN/dS ratio, and the alternative model, in which type I genes had a different dN/dS ratio from type II genes. Overall, the alternative model fit significantly better than the null model (P < 0.05).
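
The rank-based comparison of rates between the two gene types can be illustrated with a small pure-Python Wilcoxon rank sum (Mann-Whitney) test. This is only a sketch under stated assumptions: it uses the normal approximation without a tie or continuity correction, the function name is illustrative, and the rate values in the usage example are hypothetical placeholders rather than data from this study.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction).

    Returns (U statistic for sample x, z score, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values and assign the average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = w - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p


if __name__ == "__main__":
    # Hypothetical synonymous rates for type I and type II genes.
    type_i = [0.8, 1.1, 0.9, 1.0, 0.7]
    type_ii = [1.9, 2.4, 2.1, 1.8]
    print(rank_sum_test(type_i, type_ii))
```

For small samples or many ties, an exact test from a statistics package would be preferable to this approximation.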

Figure 4.

Comparison of RN, RS, TI, and TV rates between intron-containing (type I) and intron-less (type II) rps12 genes in ferns. Asterisks indicate (**)P < 0.01 and (***)P < 0.001.

Currently, the most widely accepted mechanism of intron loss is a reverse transcriptase (RT)-mediated model (namely, retroprocessing), in which intron-less cDNA generated by reverse transcription of the corresponding mRNA is integrated into the genome by homologous recombination47–49. Two other possible mechanisms of intron loss are genomic deletion50 and exonization of intronic sequences51. In these two cases, the intron is often removed imprecisely, changing the intron/exon borders. The rps12 intron loss in fern plastomes is therefore more consistent with the first mechanism, because the exon boundaries of the intron-less genes match those of the intron-containing genes perfectly. Previous studies have reported that the accelerated evolution of clpP1 in many seed plant lineages is associated with intron loss9,10,52. A possible explanation for this acceleration in clpP1 evolution is that reverse transcriptases and/or RNA polymerases have higher error rates during retroprocessing53. Although the rapid evolution of the clpP1 gene could also reflect the combined effects of other changes (e.g., duplications, indels, and pseudogenization), we found a significantly higher RS for type II genes than for type I genes in the fairly conserved gene rps12, indicating that accelerated evolutionary rates are correlated with the loss of introns. Therefore, this phenomenon may be a prevalent genome-wide pattern.

Furthermore, we analyzed the two classes of nucleotide substitutions, transitions (TIs) and transversions (TVs), to test whether the mutation pattern differed between type I and type II genes. Our results showed that TIs occurred more frequently than TVs in both type I and type II genes (Fig. 4; Supplementary Fig. S2 and Table S2). From the perspective of natural selection, this “transition bias” phenomenon is considered to indicate that selection disfavors transversions, as transversions are more likely to alter the amino acid sequence of proteins than transitions54. Comparison of the TI and TV values showed that both were significantly higher in type II than in type I genes (P < 0.001) (Fig. 4; Supplementary Fig. S2 and Table S2). An LRT was also implemented to identify rate shifts between the type I and type II branches, and the results indicated that TI/TV differed significantly between the type I and type II branches (P < 0.001). That is, transition and transversion events that occurred in type II genes resulted in more synonymous changes than those in type I genes. This finding may explain why RS changed significantly between the type I and type II genes, whereas RN did not.
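
The distinction between the two substitution classes can be made concrete with a short sketch that classifies the differences between two aligned sequences. It returns raw pairwise counts only, not the model-based TI and TV rates estimated in this study, and the function name and example sequences are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    Positions with gaps or ambiguous bases are skipped.
    Returns a tuple (transitions, transversions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        # A<->G and C<->T stay within one chemical class: transitions.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            # Purine <-> pyrimidine changes: transversions.
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    print(count_ti_tv("ACGTAC", "GCGAAT"))
```

Each base can change by one transition but by two different transversions, so an observed excess of transitions, as reported above, reflects a genuine bias rather than the number of possible changes.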

Complicated rate variation in rps12 exons

Due to a series of genomic inversions across the fern phylogeny, 84 of 93 sampled species showed the same pattern of exon distribution, which was divided into IR and SC exons. The consensus sequences of the SC and IR exons of the fern rps12 genes were 114 bp (encoding the 1st to 38th amino acids) and 232 bp in length (encoding the 39th to 123rd amino acids), respectively. Since nonsynonymous mutations are more strongly affected by selective constraints, while synonymous mutations are largely invisible to natural selection, a comparison of the synonymous substitution rates provides a better understanding of DNA sequence evolution55. To allow a direct comparison with previous research, we used the same pairwise comparison method as the previous studies when investigating the difference between the rates of IR and SC exons12,21. Our results revealed a more complicated rate variation in the rps12 exons than expected. The dS values for the IR exon were significantly higher than those for the SC exon (P < 0.001), a pattern that contrasts with the regional effect responsible for decreased rates of nucleotide substitutions in the IRs (Fig. 5; Supplementary Table S3). The results of an LRT between the one-rate and two-rate models showed significant rate changes relative to the position of an exon in either the IR or SC region (P < 0.01).
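
The likelihood ratio test between nested models can be sketched as follows. The sketch assumes that the two-rate model has exactly one more free parameter than the one-rate model, so that the test statistic is compared against a chi-square distribution with one degree of freedom; the log-likelihood values in the usage example are hypothetical, and the function name is illustrative.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    lnl_null and lnl_alt are the maximized log-likelihoods of the simpler
    and the richer model. Returns (test statistic, p-value).
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Chi-square survival function with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a one-rate and a two-rate model.
    print(lrt_pvalue_df1(-1000.0, -997.0))
```

With more than one additional parameter, the closed form above no longer applies and a general chi-square survival function is needed.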

Figure 5.

Comparison of dS, TI, and TV values between IR and SC exons of rps12 genes in ferns. Asterisks indicate (***)P < 0.001.

A previous study that used a model-based approach across the fern phylogeny indicated that total substitution rates decreased after gene translocation into the IRs11. Moreover, Zhu et al.12 compared the dS values of IR and SC genes across land plants, including some representative fern species, using the pairwise estimation method. Based on very limited taxon sampling, they observed that former IR genes that moved into the SC region due to genome rearrangements underwent corresponding dS accelerations. However, we discovered that the IR exons of a given gene in fern species had, in fact, significantly higher dS rates on average than the corresponding SC exons. The most substantial distinction between our study and previous studies is that we compared the IR exons of rps12 with the SC exons of the same gene rather than with other SC genes. Furthermore, we employed vastly expanded taxon sampling that included not only the major fern lineages but also additional species of derived lineages. Comparing the rate changes in exons located in different regions of a given gene can effectively avoid rate heterogeneity caused by lineage effects8. The intragenic location changes in the rps12 exons present a unique opportunity to test the effect of IRs on evolutionary rate variation. Thus, the pattern of substitution rate inhibition in the IRs may not be applicable to the fern rps12 gene. Indeed, several studies have shown that the pattern of decreased IR substitution rates does not hold universally among vascular plants, as increased IR substitution rates were previously observed in some genes of Pelargonium19,21, Plantago, and Silene12. Several possible mechanisms have previously been proposed to explain substitution rate increases in plastid genes. These abnormally evolving genes could result from local hypermutation potentially induced by a high level of error-prone double-strand break repair12. A subsequent study by Weng et al.21 expanded taxon sampling to characterize rates and patterns of evolution in Pelargonium, and the results showed that the anomalous rate acceleration observed in Pelargonium plastomes could be explained by a mixture of locus-specific, lineage-specific, and IR-dependent effects.

Thus, the mechanisms responsible for generating substitution rate variation may be different in each case. Considering that homologous recombination could enhance copy-correction activity in the IRs via gene conversion, the most direct explanation for our result is a reduction in recombination activity in the IRs. Furthermore, a model of recombination repair16,56 could serve as another possible explanation for the observed increase in dS. Recombination repair, also known as post-replication repair56, is a process of repairing damaged molecules by using undamaged molecules as donors; if this repair is error-prone, single base pair substitutions will be generated because the repair process involves gap-filling DNA synthesis16. Additional copies of IR exons in a cell or organelle can act as donors to ensure the efficiency of recombination repair. Consequently, we speculate that recombination activity and recombination repair may have larger effects than IR location itself on the substitution rate variation in the rps12 gene.

Furthermore, values of TI and TV for SC and IR exons were also estimated. The results showed that both the TI and TV values in the SC exons were significantly higher than those in the IR exons (Fig. 5; Supplementary Table S3), and an LRT rejected the one-rate model in favor of the two-rate model (P < 0.001). Likewise, comparisons of the DNA sequences of the rps12 gene showed that the base composition in the two exons was not uniform and an excess of transitional over transversional substitutions was present. One reason for the transition bias observed in these exons could be a mutational bias due to the intrinsic properties of DNA, as purines and pyrimidines have different conformational sizes.

Materials and Methods

Taxon sampling and DNA extraction

Ferns are a species-rich lineage of vascular plants, occupying a high diversity of ecological niches, and are a major component of the earth’s land flora57. However, only 132 complete fern plastomes were available in the NCBI RefSeq collection as of 10 Feb 2019. All 132 sequences were downloaded from GenBank, and the annotated coding sequences and the number of exons in rps12 were extracted from them. Because the rps12 gene is highly conserved in fern plastomes and no substantial variation in the rps12 sequence was observed among congeners, only one sample was chosen from each genus (except for Equisetum, from which two samples were selected) among the previously published fern plastomes to reduce redundancy in the dataset. Plastomes of 16 additional fern species were then sequenced to extend the coverage of fern clades and to better understand the evolutionary patterns of the rps12 gene in ferns. The newly sequenced species mainly included tree ferns (Cyatheales) and polypod ferns (Polypodiales), which contain most of the extant fern diversity. This sampling strategy resulted in 93 samples: 91 ferns representing all 11 extant fern orders and 32 families, and two lycophyte plastomes as the outgroup (Supplementary Table S4).

In this study, fresh leaves of the 16 newly sequenced fern species were sampled from Wuhan Botanical Garden, Chinese Academy of Sciences (CAS), and Fairy Lake Botanical Garden (CAS) (Table 3). Samples were taken from young leaves of each plant and were either flash-frozen in liquid nitrogen or placed in paper envelopes and dried with silica gel. Genomic DNA was extracted from the silica-dried or lyophilized leaf tissue of each sample using the Plant Genomic DNA Kit (Tiangen Biotech., Beijing, China). DNA concentration and purity were assessed using agarose gel electrophoresis and a NanoDrop spectrophotometer (Thermo Scientific, Carlsbad, CA, USA). Isolations with concentrations ≥ 150 ng/μl were chosen for Illumina sequencing.

Table 3.

Taxa sampled in this study. FLBG, Fairy Lake Botanical Garden, Chinese Academy of Sciences (CAS); WBG, Wuhan Botanical Garden, CAS.

| Family | Genus | Species | Sampling site |
| Polypodiaceae | Lemmaphyllum | L. microphyllum | FLBG |
| Polypodiaceae | Platycerium | P. bifurcatum | FLBG |
| Polypodiaceae | Lepidomicrosorum | L. hederaceum | WBG |
| Polypodiaceae | Selliguea | S. yakushimensis | WBG |
| Thelypteridaceae | Pronephrium | P. triphyllum | FLBG |
| Thelypteridaceae | Phegopteris | P. decursive-pinnata | FLBG |
| Thelypteridaceae | Glaphyropteridopsis | G. erubescens | WBG |
| Dryopteridaceae | Bolbitis | B. subcordata | FLBG |
| Dryopteridaceae | Dryopteris | D. sieboldii | FLBG |
| Blechnaceae | Oceaniopteris | O. gibba | FLBG |
| Blechnaceae | Brainea | B. insignis | FLBG |
| Tectariaceae | Tectaria | T. decurrens | FLBG |
| Nephrolepidaceae | Nephrolepis | N. cordifolia | FLBG |
| Athyriaceae | Diplazium | D. maximum | FLBG |
| Cyatheaceae | Sphaeropteris | S. lepifera | FLBG |
| Plagiogyriaceae | Plagiogyria | P. subadnata | WBG |

Genome sequencing, assembly, and annotation

Genomic DNA was sheared using a Covaris M220 focused-ultrasonication device (Covaris Inc., MA, USA) to a mean fragment size of 300 bp. Paired-end libraries were prepared using the NEBNext Ultra DNA Library Prep Kit (New England Biolabs, Ipswich, MA) for sequencing on an Illumina HiSeq 4000 platform (Illumina Inc., San Diego, CA, USA). Subsequently, 150 bp paired-end reads were produced with an insert size of ~300 bp. Following enrichment, we obtained between 2.04 and 6.99 Gb of raw data for each of the 16 species (Table 1). Raw data were then filtered, and adaptor sequences were removed with Trimmomatic v0.33 [58]. Trimming was performed from both ends of each read, removing all bases with a quality lower than Q20 and keeping only reads of 50 nt or longer. The quality-filtered reads were subjected to de novo assembly using Velvet v1.2.08 [59]. Each assembled contig was searched with BLAST against previously reported fern plastomes to identify plastid contigs. Gaps that remained in the assembled draft sequence were filled by polymerase chain reaction (PCR) using specific primers designed from the regions flanking the gaps.
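The end-trimming rule described above (drop bases below Q20 from both ends, then keep reads of at least 50 nt) can be illustrated with a minimal Python sketch. This is not Trimmomatic's implementation; the function name `trim_read` and the representation of qualities as a list of integer Phred scores are assumptions made for illustration only.

```python
def trim_read(seq, quals, min_q=20, min_len=50):
    """Trim low-quality bases from both ends of one read.

    seq   : read sequence as a string
    quals : per-base Phred scores as integers (hypothetical input format)
    Returns (trimmed_seq, trimmed_quals), or None if the trimmed read
    is shorter than min_len.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")

    start, end = 0, len(seq)
    # Advance past leading bases whose quality is below the threshold.
    while start < end and quals[start] < min_q:
        start += 1
    # Step back over trailing bases whose quality is below the threshold.
    while end > start and quals[end - 1] < min_q:
        end -= 1

    if end - start < min_len:
        return None  # read discarded by the length filter
    return seq[start:end], quals[start:end]


# Example: a 60 nt read with 3 poor bases at the 5' end and 2 at the 3' end.
read = "A" * 60
scores = [10] * 3 + [30] * 55 + [5] * 2
trimmed = trim_read(read, scores)
print(len(trimmed[0]))
```

Only the read ends are examined in this sketch; low-quality bases in the interior of a read are left in place, which is one reading of "trimming from both ends".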

We annotated plastomes in DOGMA [60]. Gene annotations in each draft genome were corrected by homology searches with BLASTX or BLASTN against previously published fern plastomes. Genes were annotated as pseudogenes if they showed disruptions in reading frames or frameshifts caused by nontriplet insertions or deletions. Transfer RNA (tRNA) genes were verified using two online programs, ARAGORN [61] and tRNAscan-SE [62]. Plastome maps for all species were drawn using OGDRAW [63]. All fully annotated plastome sequences were deposited in NCBI GenBank under accession numbers MN623352–MN623367 (Table 1).

Phylogenetic analyses

Phylogenetic analyses were performed on two separate protein-coding gene matrices. The first included 50 shared genes from all the sampled taxa. The second included the same 50 genes from 84 taxa, with those taxa carrying a single copy of rps12 removed (Supplementary Table S4). Sequences of each gene were aligned using MAFFT v7.310 [64], and the alignments were trimmed to exclude poorly aligned positions using Gblocks v0.91b [65] with default settings. The trimmed alignments were concatenated in SequenceMatrix v1.0 [66]. Phylogenetic trees were estimated from each concatenated dataset using the maximum likelihood (ML) method in RAxML v8.2.4 [67] under the standard GTRGAMMA model with 1,000 rapid bootstrap replicates.

Estimation of evolutionary rates

Nucleotide sequences for each exon of the rps12 gene from 91 fern species and two lycophyte outgroups, Huperzia lucidula and Isoetes flaccida, were extracted from plastomes and spliced into a complete coding region. For this step, the exon/intron boundaries of each sample were carefully examined, and incorrectly annotated genes were manually adjusted where necessary, because misalignment would result in the erroneous inference of substitution rates. The coding sequences of the rps12 gene were aligned at the protein level using the Align Codon option of ClustalW as implemented in MEGA7 [68].

Absolute substitution rates for rps12 were calculated following previously described methods [69, 70]. In brief, divergence times for all nodes within ferns were estimated using the penalized likelihood approach in the program r8s [71], with a time constraint of 354 million years for the crown group age of ferns [25]. The ML tree constructed from the dataset of all sampled taxa was used as the starting tree. We performed 100 bootstrap replicates to calculate standard errors for the divergence time of each node. The branch lengths of dS (number of substitutions per synonymous site) and dN (number of substitutions per nonsynonymous site) trees were computed using the codeml module in PAML v4.8 [72]. The standard errors of dN and dS branches were calculated from the standard errors of their corresponding total branch lengths as reported by PAML. Absolute rates of synonymous (RS) and nonsynonymous (RN) substitution for each branch were calculated by dividing the synonymous or nonsynonymous branch length by the length of time corresponding to that branch.
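The final division step can be written out as a short Python sketch. The function name `absolute_rate`, the choice of million years (Myr) as the time unit, and the example numbers are assumptions for illustration and are not values from this study.

```python
def absolute_rate(branch_length, branch_duration_myr):
    """Absolute substitution rate for one branch.

    branch_length       : dS or dN branch length (substitutions per site)
    branch_duration_myr : time spanned by the branch, in million years
    Returns substitutions per site per million years.
    """
    if branch_duration_myr <= 0:
        raise ValueError("branch duration must be positive")
    return branch_length / branch_duration_myr


# Hypothetical branch: dS = 0.05, dN = 0.01, spanning 100 Myr.
rs = absolute_rate(0.05, 100.0)  # synonymous rate, RS
rn = absolute_rate(0.01, 100.0)  # nonsynonymous rate, RN
print(rs, rn)
```

Because the rate is a simple ratio, uncertainty in either the branch length or the dated branch duration carries through to RS and RN, which is why standard errors are tracked for both quantities in the text above.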

To compare the nucleotide mutation patterns of the two gene types, the transition (TI) and transversion (TV) rates for rps12 were also estimated in the phylogenetic context of ferns using the HKY85 model (the local parameters option) as implemented in HyPhy [73]. The significance of the differences in RS, RN, TI, and TV between the type I and type II branches was evaluated using Wilcoxon rank sum tests. We used a model-based method to test whether dN/dS differed between the type I and type II branches in a phylogenetic context. The phylogenetic tree of all 93 species generated using ML was used as a constraint tree. The null model, in which one dN/dS ratio was fixed across ferns, was compared with an alternative model, in which the type I and type II clades were allowed to have different dN/dS ratios. Likelihood ratio tests (LRTs) were performed in PAML v4.8 [72] to compare the goodness of fit of the two models. We also tested for possible rate heterogeneity in the TI/TV ratios across different branches in a phylogenetic context. The LRT for detecting a significant difference in TI/TV between type I and type II genes was conducted using the HKY85 model in HyPhy [73], with the same null and alternative models as those used for dN/dS.
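The comparison between the one-ratio and two-ratio models rests on the standard likelihood ratio statistic, 2(lnL_alt − lnL_null), compared against a chi-square distribution. The Python sketch below assumes the two models differ by exactly one free parameter (df = 1), which holds when a single shared ratio is split into two; the function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """LRT for nested models that differ by one free parameter (df = 1).

    lnl_null : maximized log-likelihood of the one-ratio (null) model
    lnl_alt  : maximized log-likelihood of the two-ratio (alternative) model
    Returns (test statistic, p-value).
    """
    # The statistic cannot be negative for truly nested models; clamp
    # small negative values that arise from optimizer noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of the chi-square distribution with 1 degree of
    # freedom, expressed through the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods from the two fitted models.
stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-998.0)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

The closed form with `math.erfc` is valid only for one degree of freedom; comparisons involving more than one extra parameter would need a general chi-square survival function.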

Furthermore, to investigate whether the evolutionary rate variations in the rps12 exons corresponded to their location, that is, the IR or SC regions, we performed a parallel analysis of 84 species to determine the effects of IR location on the substitution rate of a given gene. Nine samples were excluded from this analysis because their rps12 exon sequences were all single copies (Supplementary Table S4). Based on their locations, the rps12 exon sequences were extracted from the plastomes and aligned at the protein level, and the alignments were concatenated separately. The pairwise dS rates for the IR and SC exons between the two lycophytes and each fern species were calculated using the codeml module in PAML v4.8 [72] (seqtype = 1, runmode = -2, and CodonFreq = 2 in the codeml.ctl files). The ML tree inferred from 50 plastid genes of 82 taxa was used as a constraint tree. Likewise, we calculated the pairwise TI and TV of the two exons using the HKY85 model implemented in HyPhy to compare the nucleotide mutation patterns. Wilcoxon rank sum tests were used to test the rate differences between the SC and IR exons. To investigate whether dN/dS differences among exons corresponded to their location, we used a model-based rate analysis on the phylogeny. LRTs were performed using the MG94xHKY85 codon model in HyPhy [73], in which the one-rate model specified a shared dN/dS across the whole gene, whereas the two-rate model allowed the IR and SC exons to have different dN/dS values. The same setting was applied to the LRTs for TI/TV, except that an HKY85 substitution model was applied.
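For readers unfamiliar with codeml control files, the three settings quoted above select codon sequences (seqtype = 1), pairwise comparison (runmode = -2), and the F3x4 codon frequency model (CodonFreq = 2). The Python sketch below only renders those settings as control-file text. The helper name and the file names are hypothetical, and a real control file would normally set further options that the text does not report.

```python
def pairwise_codeml_ctl(seqfile, outfile):
    """Render a minimal codeml control file for pairwise dN/dS estimation.

    Only the three settings reported in the text are fixed here; the file
    names are placeholders supplied by the caller.
    """
    settings = {
        "seqfile": seqfile,   # codon alignment (hypothetical path)
        "outfile": outfile,   # main result file (hypothetical path)
        "runmode": -2,        # pairwise comparison of sequences
        "seqtype": 1,         # codon sequences
        "CodonFreq": 2,       # F3x4 codon frequency model
    }
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


# One control file per exon partition, for example the IR exons.
print(pairwise_codeml_ctl("rps12_IR_exons.phy", "rps12_IR_pairwise.out"))
```

Generating the file from a function keeps the IR and SC runs identical apart from their input and output names, so any rate difference between them cannot be traced to a settings mismatch.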

Supplementary information

Supplementary Figures. (6.2MB, docx)
Supplementary Tables. (50.4KB, xlsx)

Acknowledgements

This work was supported by the National Natural Science Foundation of China [31370364, 31570652, 31670200, 31770587, and 31872670], the Natural Science Foundation of Guangdong Province, China [2016A030313320 and 2017A030313122], Science and Technology Planning Project of Guangdong Province, China [2017A030303007], Project of Department of Science and Technology of Shenzhen City, Guangdong, China [JCYJ20160425165447211, JCYJ20170413155402977, JCYJ20170818155249053, and JCYJ20190813172001780], and Science and Technology Planning Project of Guangzhou City, China [201804010389].

Author contributions

Ting Wang conceived and designed the experiments; Hui Wang, Shanshan Liu, and Yingjuan Su collected and contributed sample material; Zhen Wang performed the experiments; Shanshan Liu analyzed the data and wrote the manuscript; Ting Wang and Yingjuan Su reviewed the manuscript critically. All authors have read and approved the final manuscript.

Data availability

The complete plastome sequences generated in the current study are available in GenBank, https://www.ncbi.nlm.nih.gov/genbank/ (accession numbers are described in the text).

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Yingjuan Su, Email: suyj@mail.sysu.edu.cn.

Ting Wang, Email: tingwang@scau.edu.cn.

Supplementary information is available for this paper at 10.1038/s41598-020-66219-y.

References

  • 1. Eneas-Filho J, Hartley MR, Mache R. Pea chloroplast ribosomal proteins: characterization and site of synthesis. Mol. Gen. Genet. 1981;184:484–488. doi: 10.1007/BF00352527.
  • 2. Yamaguchi K, Subramanian AR. Proteomic identification of all plastid‐specific ribosomal proteins in higher plant chloroplast 30S ribosomal subunit. Eur. J. Biochem. 2003;270:190–205. doi: 10.1046/j.1432-1033.2003.03359.x.
  • 3. Karol KG, et al. Complete plastome sequences of Equisetum arvense and Isoetes flaccida: implications for phylogeny and plastid genome evolution of early land plant lineages. BMC Evol. Biol. 2010;10:321. doi: 10.1186/1471-2148-10-321.
  • 4. Grewe F, Guo W, Gubbels EA, Hansen AK, Mower JP. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes. BMC Evol. Biol. 2013;13:8. doi: 10.1186/1471-2148-13-8.
  • 5. Jansen RK, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA. 2007;104:19369–19374. doi: 10.1073/pnas.0709121104.
  • 6. Weng ML, Blazier JC, Govindu M, Jansen RK. Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol. Biol. Evol. 2013;31:645–659. doi: 10.1093/molbev/mst257.
  • 7. Barkan A. Intron splicing in plant organelles. In: Daniell H, Chase C, editors. Molecular Biology and Biotechnology of Plant Organelles. Dordrecht: Springer; 2004. pp. 295–322. doi: 10.1007/978-1-4020-3166-3_11.
  • 8. Gao L, et al. Plastome sequences of Lygodium japonicum and Marsilea crenata reveal the genome organization transformation from basal ferns to core leptosporangiates. Genome Biol. Evol. 2013;5:1403–1407. doi: 10.1093/gbe/evt099.
  • 9. Erixon P, Oxelman B. Whole-gene positive selection, elevated synonymous substitution rates, duplication, and indel evolution of the chloroplast clpP1 gene. PLoS One. 2008;3:e1386. doi: 10.1371/journal.pone.0001386.
  • 10. Park S, et al. Contrasting patterns of nucleotide substitution rates provide insight into dynamic evolution of plastid and mitochondrial genomes of Geranium. Genome Biol. Evol. 2017;9:1766–1780. doi: 10.1093/gbe/evx124.
  • 11. Li FW, Kuo LY, Pryer KM, Rothfels CJ. Genes translocated into the plastid inverted repeat show decelerated substitution rates and elevated GC content. Genome Biol. Evol. 2016;8:2452–2458. doi: 10.1093/gbe/evw167.
  • 12. Zhu A, Guo W, Gupta S, Fan W, Mower JP. Evolutionary dynamics of the plastid inverted repeat: the effects of expansion, contraction, and loss on substitution rates. New Phytol. 2016;209:1747–1756. doi: 10.1111/nph.13743.
  • 13. Wolfe KH, Li WH, Sharp PM. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. USA. 1987;84:9054–9058. doi: 10.1073/pnas.84.24.9054.
  • 14. Perry AS, Wolfe KH. Nucleotide substitution rates in legume chloroplast DNA depend on the presence of the inverted repeat. J. Mol. Evol. 2002;55:501–508. doi: 10.1007/PL00020998.
  • 15. Yamane K, Yano K, Kawahara T. Pattern and rate of indel evolution inferred from whole chloroplast intergenic regions in sugarcane, maize and rice. DNA Res. 2006;13:197–204. doi: 10.1093/dnares/dsl012.
  • 16. Birky C, Walsh J. Biased gene conversion, copy number, and apparent mutation rate differences within chloroplast and bacterial genomes. Genetics. 1992;130:677–683. doi: 10.1093/genetics/130.3.677.
  • 17. Khakhlova O, Bock R. Elimination of deleterious mutations in plastid genomes by gene conversion. Plant J. 2006;46:85–94. doi: 10.1111/j.1365-313X.2006.02673.x.
  • 18. Bendich AJ. Why do chloroplasts and mitochondria contain so many copies of their genome? BioEssays. 1987;6:279–282. doi: 10.1002/bies.950060608.
  • 19. Guisinger MM, Kuehl JV, Boore JL, Jansen RK. Genome-wide analyses of Geraniaceae plastid DNA reveal unprecedented patterns of increased nucleotide substitutions. Proc. Natl. Acad. Sci. USA. 2008;105:18424–18429. doi: 10.1073/pnas.0806759105.
  • 20. Blazier JC, et al. Variable presence of the inverted repeat and plastome stability in Erodium. Ann. Bot. 2016;117:1209–1220. doi: 10.1093/aob/mcw065.
  • 21. Weng ML, Ruhlman TA, Jansen RK. Expansion of inverted repeat does not decrease substitution rates in Pelargonium plastid genomes. New Phytol. 2017;214:842–851. doi: 10.1111/nph.14375.
  • 22. Sloan DB, Alverson AJ, Wu M, Palmer JD, Taylor DR. Recent acceleration of plastid sequence and structural evolution coincides with extreme mitochondrial divergence in the angiosperm genus Silene. Genome Biol. Evol. 2012;4:294–306. doi: 10.1093/gbe/evs006.
  • 23. Sloan DB, Triant DA, Wu M, Taylor DR. Cytonuclear interactions and relaxed selection accelerate sequence evolution in organelle ribosomes. Mol. Biol. Evol. 2013;31:673–682. doi: 10.1093/molbev/mst259.
  • 24. Logacheva MD, et al. Comparative analysis of inverted repeats of polypod fern (Polypodiales) plastomes reveals two hypervariable regions. BMC Plant Biol. 2017;17:255. doi: 10.1186/s12870-017-1195-z.
  • 25. Schneider H, et al. Ferns diversified in the shadow of angiosperms. Nature. 2004;428:553–557. doi: 10.1038/nature02361.
  • 26. Wolf PG, Roper JM, Duffy AM. The evolution of chloroplast genome structure in ferns. Genome. 2010;53:731–738. doi: 10.1139/G10-061.
  • 27. Liu SS, Wang Z, Wang T, Su YJ. The complete chloroplast genome of Cibotium barometz (Cibotiaceae), an endangered CITES medicinal fern. Mitochondrial DNA Part B. 2018;3:464–465. doi: 10.1080/23802359.2018.1462128.
  • 28. Rothfels CJ, et al. The evolutionary history of ferns inferred from 25 low‐copy nuclear genes. Am. J. Bot. 2015;102:1089–1107. doi: 10.3732/ajb.1500089.
  • 29. Qi X, et al. A well-resolved fern nuclear phylogeny reveals the evolution history of numerous transcription factor families. Mol. Phylogenet. Evol. 2018;127:961–977. doi: 10.1016/j.ympev.2018.06.043.
  • 30. Wolf PG, et al. Target sequence capture of nuclear‐encoded genes for phylogenetic analysis in ferns. Appl. Plant Sci. 2018;6:e01148. doi: 10.1002/aps3.1148.
  • 31. Knie N, Fischer S, Grewe F, Polsakiewicz M, Knoop V. Horsetails are the sister group to all other monilophytes and Marattiales are sister to leptosporangiate ferns. Mol. Phylogenet. Evol. 2015;90:140–149. doi: 10.1016/j.ympev.2015.05.008.
  • 32. Rai HS, Graham SW. Utility of a large, multigene plastid data set in inferring higher‐order relationships in ferns and relatives (monilophytes). Am. J. Bot. 2010;97:1444–1456. doi: 10.3732/ajb.0900305.
  • 33. Testo W, Sundue M. A 4000-species dataset provides new insight into the evolution of ferns. Mol. Phylogenet. Evol. 2016;105:200–211. doi: 10.1016/j.ympev.2016.09.003.
  • 34. Kim HT, Chung MG, Kim KJ. Chloroplast genome evolution in early diverged leptosporangiate ferns. Mol. Cells. 2014;37:372–382. doi: 10.14348/molcells.2014.2296.
  • 35. Labiak PH, Karol KG. Plastome sequences of an ancient fern lineage reveal remarkable changes in gene content and architecture. Am. J. Bot. 2017;104:1008–1018. doi: 10.3732/ajb.1700135.
  • 36. Lu JM, Zhang N, Du XY, Wen J, Li DZ. Chloroplast phylogenomics resolves key relationships in ferns. J. Syst. Evol. 2015;53:448–457. doi: 10.1111/jse.12180.
  • 37. Kuo LY, Qi X, Ma H, Li FW. Order‐level fern plastome phylogenomics: new insights from Hymenophyllales. Am. J. Bot. 2018;105:1545–1555. doi: 10.1002/ajb2.1152.
  • 38. Shen H, et al. Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns. GigaScience. 2018;7:1–11. doi: 10.1093/gigascience/gix116.
  • 39. Zhang N, Zeng L, Shan H, Ma H. Highly conserved low‐copy nuclear genes as effective markers for phylogenetic analyses in angiosperms. New Phytol. 2012;195:923–937. doi: 10.1093/gigascience/gix116.
  • 40. Pryer KM, et al. Horsetails and ferns are a monophyletic group and the closest living relatives to seed plants. Nature. 2001;409:618–622. doi: 10.1038/35054555.
  • 41. Schuettpelz E, Korall P, Pryer KM. Plastid atpA data provide improved support for deep relationships among ferns. Taxon. 2006;55:897–906. doi: 10.2307/25065684.
  • 42. Schuettpelz E, Pryer KM. Fern phylogeny inferred from 400 leptosporangiate species and three plastid genes. Taxon. 2007;56:1037–1037. doi: 10.2307/25065903.
  • 43. Qiu YL, et al. The deepest divergences in land plants inferred from phylogenomic evidence. Proc. Natl. Acad. Sci. USA. 2006;103:15511–15516. doi: 10.1073/pnas.0603335103.
  • 44. Pryer KM, et al. Phylogeny and evolution of ferns (monilophytes) with a focus on the early leptosporangiate divergences. Am. J. Bot. 2004;91:1582–1598. doi: 10.3732/ajb.91.10.1582.
  • 45. PPG I. A community‐derived classification for extant lycophytes and ferns. J. Syst. Evol. 2016;54:563–603. doi: 10.1111/jse.12229.
  • 46. Mower JP, Touzet P, Gummow JS, Delph LF, Palmer JD. Extensive variation in synonymous substitution rates in mitochondrial genes of seed plants. BMC Evol. Biol. 2007;7:135. doi: 10.1186/1471-2148-7-135.
  • 47. Derr LK, Strathern JN. A role for reverse transcripts in gene conversion. Nature. 1993;361:170–173. doi: 10.1038/361170a0.
  • 48. Mourier T, Jeffares DC. Eukaryotic intron loss. Science. 2003;300:1393–1393. doi: 10.1126/science.1080559.
  • 49. Cohen NE, Shen R, Carmel L. The role of reverse transcriptase in intron gain and loss mechanisms. Mol. Biol. Evol. 2011;29:179–186. doi: 10.1093/molbev/msr192.
  • 50. Llopart A, Comeron JM, Brunet FG, Lachaise D, Long M. Intron presence–absence polymorphism in Drosophila driven by positive Darwinian selection. Proc. Natl. Acad. Sci. USA. 2002;99:8121–8126. doi: 10.1073/pnas.122570299.
  • 51. Parma J, Christophe D, Pohl V, Vassart G. Structural organization of the 5′ region of the thyroglobulin gene: Evidence for intron loss and “exonization” during evolution. J. Mol. Biol. 1987;196:769–779. doi: 10.1016/0022-2836(87)90403-7.
  • 52. Williams AM, Friso G, van Wijk KJ, Sloan DB. Extreme variation in rates of evolution in the plastid Clp protease complex. Plant J. 2019;98:243–259. doi: 10.1111/tpj.14208.
  • 53. Preston BD. Error-prone retrotransposition: rime of the ancient mutators. Proc. Natl. Acad. Sci. USA. 1996;93:7427–7431. doi: 10.1073/pnas.93.15.7427.
  • 54. Vogel F, Kopun M. Higher frequencies of transitions among point mutations. J. Mol. Evol. 1977;9:159–180. doi: 10.1007/BF01732746.
  • 55. Akashi H. Inferring weak selection from patterns of polymorphism and divergence at “silent” sites in Drosophila DNA. Genetics. 1995;139:1067–1076. https://www.genetics.org/content/139/2/1067.short.
  • 56. Sharp PM, Shields DC, Wolfe KH, Li WH. Chromosomal location and evolutionary rate variation in enterobacterial genes. Science. 1989;246:808–810. doi: 10.1126/science.2683084.
  • 57. Smith AR, et al. A classification for extant ferns. Taxon. 2006;55:705–731. doi: 10.2307/25065646.
  • 58. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 59. Zerbino D, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–829. doi: 10.1101/gr.074492.107.
  • 60. Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20:3252–3255. doi: 10.1093/bioinformatics/bth352.
  • 61. Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32:11–16. doi: 10.1093/nar/gkh152.
  • 62. Lowe TM, Chan PP. tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 2016;44:W54–W57. doi: 10.1093/nar/gkw413.
  • 63. Lohse M, Drechsel O, Kahlau S, Bock R. OrganellarGenomeDRAW: a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 2013;41:W575–W581. doi: 10.1093/nar/gkt289.
  • 64. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010.
  • 65. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000;17:540–552. doi: 10.1093/oxfordjournals.molbev.a026334.
  • 66. Vaidya G, Lohman DJ, Meier R. SequenceMatrix: concatenation software for the fast assembly of multi‐gene datasets with character set and codon information. Cladistics. 2011;27:171–180. doi: 10.1111/j.1096-0031.2010.00329.x.
  • 67. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  • 68. Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 2016;33:1870–1874. doi: 10.1093/molbev/msw054.
  • 69. Cho Y, Mower JP, Qiu YL, Palmer JD. Mitochondrial substitution rates are extraordinarily elevated and variable in a genus of flowering plants. Proc. Natl. Acad. Sci. USA. 2004;101:17741–17746. doi: 10.1073/pnas.0408302101.
  • 70. Parkinson CL, et al. Multiple major increases and decreases in mitochondrial substitution rates in the plant family Geraniaceae. BMC Evol. Biol. 2005;5:73. doi: 10.1186/1471-2148-5-73.
  • 71. Sanderson MJ. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics. 2003;19:301–302. doi: 10.1093/bioinformatics/19.2.301.
  • 72. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  • 73. Pond SLK, Muse SV. HyPhy: hypothesis testing using phylogenies. Bioinformatics. 2005;21:676–679. doi: 10.1093/bioinformatics/bti079.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
