Proceedings of the National Academy of Sciences of the United States of America. 2024 Mar 1;121(10):e2317240121. doi: 10.1073/pnas.2317240121

Genome copy number predicts extreme evolutionary rate variation in plant mitochondrial DNA

Kendra D Zwonitzer a,1, Lydia G Tressel a, Zhiqiang Wu b, Shenglong Kan b,c, Amanda K Broz d, Jeffrey P Mower e, Tracey A Ruhlman a, Robert K Jansen a, Daniel B Sloan d, Justin C Havird a
PMCID: PMC10927533  PMID: 38427600

Significance

Rates of molecular evolution span many orders of magnitude across different taxa in nuclear and organellar genomes. Angiosperm mitogenomes display some of the most extreme rate variation. We demonstrate that plant mitogenome substitution rates are negatively associated with mitogenome copy number (the number of mitogenomes per cell). We propose that homologous recombination, the primary repair mechanism in plant organelles, is less effective in low copy number environments, leading to elevated substitution rates. Fast mitogenome evolution is also associated with larger mitogenomes. These trends were only observed in mitogenomes and not in plastid genomes. Overall, these findings suggest an explanation for extreme rate variation across angiosperm mitogenomes and potentially for other species which use homologous recombination in their organellar genomes.

Keywords: plant mitogenome, mtDNA, substitution rate, copy number, plastome

Abstract

Nuclear and organellar genomes can evolve at vastly different rates despite occupying the same cell. In most bilaterian animals, mitochondrial DNA (mtDNA) evolves faster than nuclear DNA, whereas this trend is generally reversed in plants. However, in some exceptional angiosperm clades, mtDNA substitution rates have increased up to 5,000-fold compared with closely related lineages. The mechanisms responsible for this acceleration are generally unknown. Because plants rely on homologous recombination to repair mtDNA damage, we hypothesized that mtDNA copy numbers may predict evolutionary rates, as lower copy numbers may provide fewer templates for such repair mechanisms. In support of this hypothesis, we found that copy number explains 47% of the variation in synonymous substitution rates of mtDNA across 60 diverse seed plant species representing ~300 million years of evolution. Copy number was also negatively correlated with mitogenome size, which may be a cause or consequence of mutation rate variation. Both relationships were unique to mtDNA and not observed in plastid DNA. These results suggest that homologous recombinational repair plays a role in driving mtDNA substitution rates in plants and may explain variation in mtDNA evolution more broadly across eukaryotes. Our findings also contribute to broader questions about the relationships between mutation rates, genome size, selection efficiency, and the drift-barrier hypothesis.


Mitochondria are relics from the dawn of eukaryotes, serving as cellular energy producers and bearing their own DNA, the remains of their original bacterial genome. The genes encoded in mtDNA are highly influential on organismal function, and essential processes such as oxidative phosphorylation rely on many chimeric protein complexes composed of mitochondrial- and nuclear-encoded subunits. Evolutionary patterns can vary substantially between nuclear DNA and mitochondrial DNA (mtDNA) within the same cell (1). This variation may be explained by distinct chemical environments between the nucleus and mitochondria, matrilineal inheritance of mtDNA, differences in effective population sizes (Ne) of the genomes, and differences in replication and repair machinery. The ratio of substitution rates between nuclear DNA and mtDNA varies considerably across eukaryotes (2–4). Among bilaterian animals, mtDNA generally evolves faster than nuclear DNA (1), but other groups such as yeast, some mollusks, some corals, and most plants show the opposite trend (4–8). In addition to variation in relative evolutionary rates between the nuclear genome and mtDNA, there is also appreciable variation in mtDNA rates among lineages. This has been observed in animals, algae, and plants (3, 9–12). For example, in the mitochondrial-encoded gene cytochrome b, there is a 30-fold difference in the synonymous substitution rate across birds and a 90-fold difference across mammals (9, 11).

Plant lineages provide a model to investigate the causes of rate variation in mtDNA. Although plants generally have mtDNA substitution rates that are much slower than in nuclear or plastid DNA (6), rates of evolution can vary widely across lineages (13–16). In angiosperms, multiple, rapid accelerations in mtDNA substitution rates have been noted within the genera Ajuga, Acorus, Viscum, Eleocharis, Sarcophrynium, Silene, Plantago, and Pelargonium (14, 15, 17–22). These changes tend to occur rapidly, on the order of 10 million years (17). For example, in the genus Silene, “fast” lineages have 100-fold higher rates than “slow” species, despite diverging from a common ancestor only ~6 mya (23). Across all seed plants, absolute synonymous substitution rates for mtDNA were estimated to vary 5,000-fold, suggesting that plant mtDNA may have the most variable rates of evolution among nonviral genomes (15). Studies in Silene suggest evolutionary rates may co-vary with mitogenome size, with the fast-evolving Silene species having among the largest mitogenomes recorded while slow-evolving Silene species have smaller mitogenomes, more typical of other angiosperms (17). However, mitogenome size and other genomic features of mtDNA are underexplored in clades where accelerations have occurred.

It is generally unknown why mtDNA substitution rates vary across eukaryotes or why dramatic accelerations occur in some lineages. Recent observations in Silene and Plantago suggest that mtDNA copy number may play a role in explaining these differences (24, 25). Silene conica, a species with fast-evolving mtDNA, was found to have an unusually low mtDNA copy number, about one mt genome per cell, while the “slow” species Silene latifolia was relatively typical at about 50 mt genomes per cell (24). This observation was limited to the mitochondrial genome, as chloroplast (plastid) genome copy numbers were more similar among fast and slow Silene.

Copy number is an enticing predictor for variation in mtDNA substitution rates due to the mechanism of organellar genome repair in plants (26, 27). The inability to repair mutations can lead to a proportionate increase in mtDNA neutral substitution rate (28). Despite the potential vulnerability of mtDNA to mutations stemming from reactive oxygen species generated during metabolism (29), most metazoans have limited mtDNA repair mechanisms. Base excision repair and replication-linked repair via polymerase gamma (PolG) are responsible for the majority of error correction in animal mtDNA, and damage to mtDNA frequently results in the degradation of the genome entirely (30, 31). Plants, by contrast, have more complex and variable repair processes, including extensive use of homologous recombination (26, 32–35). Homologous recombination in plant mtDNA is similar to the recA-mediated pathway in bacteria such as Escherichia coli, where recA binds to single-stranded DNA at the position of a double-stranded break and locates a homologous region to use as a template for repair. Because homologous recombination relies on unmutated copies of DNA to perform corrections (32, 33), low organelle genome copy number may result in less efficient double-stranded break repair due to a lack of templates, resulting in an overall increase in mutation rate (24, 25).

We hypothesized that mtDNA copy number may predict mt substitution rates across plants, as lower copy number may limit the efficiency of homologous recombinational repair mechanisms. Using 60 plant leaf samples, including three lineages that have experienced independent accelerations in mtDNA substitution rates, we tested for relationships between mtDNA copy number, substitution rate, and genome size. We also assessed whether these trends were apparent in plastid genes. Overall, we reasoned that a negative relationship between substitution rate and copy number in plant mtDNA would provide support for the hypothesis that inefficient homologous repair mechanisms are a driver of variation in substitution rates across plant mtDNA.

Results

mtDNA Substitution Rates and Copy Numbers Vary by Three Orders of Magnitude across Plants.

mtDNA substitution rates and copy numbers were estimated using shotgun genome sequencing data from leaf tissue of 60 species across angiosperms and gymnosperms. In total, we generated 229 sequences across 16 mtDNA genes (SI Appendix, Fig. S1 and Dataset S1) and completed or corrected mtDNA sequences for several other genes (SI Appendix, Fig. S1 and Dataset S1). We also identified a possible nuclear transfer (or complete loss) of the mtDNA gene ccmB in one clade of Plantago (SI Appendix, Fig. S1), which is typically conserved across angiosperms and gymnosperms, although further work is needed to validate this result (36).

Absolute mtDNA substitution rates were calculated as synonymous substitution rates (dS) per million years of evolution for each terminal branch of our phylogeny, and copy number was estimated by dividing the sequencing coverage of mt genes by the nuclear coverage. Evolutionary rates varied extensively across our dataset (Fig. 1), with absolute synonymous substitution rates varying by 5,000-fold, the highest being 0.13 dS/mya in Pelargonium exstipulatum and the lowest being 2.53e−05 dS/mya in Ceratozamia hildae (Fig. 2). Similarly, there was substantial variation in estimated mtDNA copy number, which varied by about 500-fold across our dataset, from 432 copies of mtDNA per haploid nuclear genome in Ginkgo biloba to less than one in Silene conica. Using a similar method to estimate plastid DNA (ptDNA) substitution rates and copy numbers, we found less variability, with 27-fold variation for both rates and copy numbers across the 60 species examined (Figs. 1 and 3).
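The two quantities defined in this paragraph reduce to simple ratios. The following is a minimal Python sketch, not the authors' pipeline: the function names and the coverage values in the example are hypothetical, while the two rate values are those reported in the text.

```python
def copy_number(organelle_coverage, nuclear_coverage):
    """Organelle genome copies per haploid nuclear genome.

    Both arguments are mean sequencing depths (hypothetical inputs here).
    """
    if nuclear_coverage <= 0:
        raise ValueError("nuclear coverage must be positive")
    return organelle_coverage / nuclear_coverage


def absolute_rate(ds_on_branch, branch_duration_my):
    """Synonymous substitutions per site per million years on one branch."""
    if branch_duration_my <= 0:
        raise ValueError("branch duration must be positive")
    return ds_on_branch / branch_duration_my


def fold_variation(values):
    """Ratio of the largest to the smallest value."""
    return max(values) / min(values)


# Hypothetical depths: 150x over mt genes, 3x over nuclear genes.
example_copy_number = copy_number(150.0, 3.0)  # 50.0 copies

# Highest and lowest absolute rates reported in the text (dS/mya).
rate_fold = fold_variation([0.13, 2.53e-05])

print(example_copy_number)
print(round(rate_fold))  # about 5,100, i.e. the ~5,000-fold range in the text
```

The ratio of the two reported rates is how the roughly 5,000-fold range follows from the extreme values.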

Fig. 1.

Mitochondrial DNA synonymous substitution rates vary by orders of magnitude across plants. Synonymous substitution rates (dS per branch) across the 60 species used in this study. Broad groups of interest with variation in mtDNA substitution rates are highlighted: Sileneae, Plantagineae, and Geraniales (gymnosperms are also highlighted). Mitochondrial rates (Left) are based on 16 intron-free mitochondrial genes. Pelargonium, a subset of Plantago, and a subset of Silene species have experienced drastic acceleration in rates compared to most other plants. Plastid rates are based on 12 photosystem genes. Pt genes experience much less variation than mt genes. Topology is constrained to known species relationships.

Fig. 2.

Mitogenome evolutionary rate and mitogenome size are negatively associated with genome copy number. (A) Raw and (B) log-transformed correlation of mtDNA copy number and absolute synonymous substitution rate in mitochondrial genes. (C) Raw and (D) log-transformed correlation of mtDNA copy number and mitogenome size. (E) Raw and (F) log-transformed correlation of mitogenome size and absolute synonymous substitution rate in mitochondrial genes. Dashed lines of best fit represent the linear model, and solid lines represent the PGLS model (dashed vertical lines in A and C represent a break in the x axis).

Fig. 3.

Plastome evolutionary rate and plastome size are not associated with genome copy number. (A) Raw and (B) log-transformed correlation of ptDNA copy number and absolute synonymous substitution rate in plastid genes. (C) Raw and (D) log-transformed correlation of ptDNA copy number and plastome size. (E) Raw and (F) log-transformed correlation of plastome size and synonymous substitution rate in plastid genes. Dashed lines of best fit represent the linear model, and solid lines represent the PGLS model.

mtDNA Copy Number Explains mtDNA Substitution Rate and Mitogenome Size across Plants.

We found a strong, negative relationship between mtDNA synonymous substitution rate and copy number, roughly following a power function (Fig. 2A). Essentially, we found that at low copy numbers, there is a diversity of substitution rates, including species showing rapid rate accelerations, but at higher copy numbers, rates of mtDNA substitution are low and characteristic of most plants (≤~1e-3 dS/mya). After log-transforming the data, mtDNA copy number explained 47% of the variation in substitution rate in a standard linear model (dotted line; Fig. 2B; t = −6.979, P < 0.001). When phylogeny was included in the analysis using phylogenetic generalized least squares (PGLS), this relationship was still strongly statistically significant and had a slope of −0.66 (solid black line; Fig. 2B; t = −2.966, P = 0.005). Additionally, there are negative trends between mtDNA copy number and substitution rates within each of the three angiosperm clades with rapid accelerations in mtDNA substitution rates: Sileneae (N = 5, t = −2.871, P = 0.06), Plantagineae (N = 13, t = −2.975, P = 0.01), and Geraniaceae (N = 25, t = −2.481, P = 0.02), but gymnosperms did not show a significant negative trend (N = 11, t = −0.544, P = 0.60, Fig. 2B).
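To illustrate what a power function with a log-log slope means in practice, the sketch below fits an ordinary least-squares line to log10-transformed values and converts a slope into a fold change. It is an illustration under stated assumptions: the data are synthetic points placed exactly on a power law, the function names are hypothetical, and the fit ignores phylogeny, so it corresponds to a standard linear model rather than to PGLS. Only the slope of −0.66 is taken from the text.

```python
import math


def loglog_fit(copy_numbers, rates):
    """Least-squares fit of log10(rate) = intercept + slope * log10(copy number).

    Returns (slope, intercept, r_squared). No phylogenetic correction.
    """
    xs = [math.log10(c) for c in copy_numbers]
    ys = [math.log10(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1.0 - ss_res / ss_tot


def fold_change_in_rate(slope, fold_change_in_copy_number):
    """Under rate = a * copy_number ** slope, rate scales by this factor."""
    return fold_change_in_copy_number ** slope


# Synthetic copy numbers and rates lying exactly on rate = 0.01 * cn ** -0.66.
synthetic_cn = [1, 5, 20, 100, 400]
synthetic_rates = [0.01 * c ** -0.66 for c in synthetic_cn]

slope, intercept, r_squared = loglog_fit(synthetic_cn, synthetic_rates)
print(round(slope, 2), round(intercept, 2))

# Predicted rate change for a 10-fold drop in copy number at slope -0.66.
print(round(fold_change_in_rate(-0.66, 0.1), 1))
```

Read this way, a slope of −0.66 implies that a 10-fold reduction in copy number is associated with a roughly 4.6-fold higher substitution rate, all else being equal.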

The same trend was also found between mitochondrial genome size and mtDNA copy number, with larger genome sizes having lower copy numbers (Fig. 2C). Both the standard linear model and the PGLS between genome size and copy number in mtDNA showed significant negative relationships (dotted line; Fig. 2D; t = −4.877, P < 0.001; solid black line; Fig. 2D; t = −4.423, P < 0.001). The PGLS model had a slope of −0.56. Within-group trends using standard linear models were also significant in Sileneae (N = 5, t = −3.023, P = 0.05), Geraniaceae (N = 25, t = −3.617, P = 0.001), and gymnosperms (N = 11, t = −3.671, P = 0.005), but not in Plantagineae (N = 13, t = −1.365, P = 0.2) (Figs. 2D and 4).

Fig. 4.

Accelerations of substitution rates are associated with large mitogenomes and low copy numbers across plant mtDNA. Phylogeny of the 60 plant species investigated where branch length represents mtDNA dS, with genome size and copy numbers represented as points at the end of terminal branches. Larger circles correspond to larger genomes, and darker brown circles correspond to higher copy number.

As expected, based on the above relationships (Fig. 2 B and D), log10-transformed substitution rate and mitogenome size were significantly positively correlated (t = 6.920, P < 0.001) and significance was retained in a PGLS model (t = 3.542, P < 0.001) (Fig. 2F). When visualizing mtDNA substitution rates, copy numbers, and genome size on a phylogeny, the correlation between these three variables was apparent (Fig. 4). Analyses using nonsynonymous substitution rates (dn) were also performed on mtDNA, and similar trends were noted (SI Appendix, Fig. S2 and Dataset S2).

ptDNA Synonymous Substitution Rates Are not Explained by Copy Number or Genome Size.

While there were ostensibly negative relationships between substitution rate and both copy number and genome size when considering ptDNA, the trends were likely due to phylogeny (Fig. 3 A and C). Using log10-transformed data, plastid copy numbers were significantly negatively associated with substitution rate (dotted line; Fig. 3B; t = −3.148, P = 0.003) and had a negative, but not statistically significant trend with genome size (dotted line; Fig. 3D; t = −1.829, P = 0.07) in standard linear models that do not account for phylogeny. However, once phylogeny was considered in the model using PGLS, neither of these trends was retained as statistically significant (Fig. 3B; t = −0.7253, P = 0.47, Fig. 3D; t = −0.0604, P = 0.95). The loss of significance when accounting for phylogeny likely reflects the main driver in ptDNA substitution rate variation in our data being the difference between angiosperms and gymnosperms, where angiosperms had rates ~2.2 times higher than gymnosperms (Wilcoxon test, W = 376, P < 0.001). No significant within-group trends were observed in ptDNA (Fig. 3 C and D and SI Appendix, Fig. S3). The relationship between log10-transformed substitution rate and plastome size was also not significant in a linear model (Fig. 3F; t = 1.522, P = 0.13) or PGLS (Fig. 3F; t = 0.827, P = 0.41). Analyses using nonsynonymous substitution rates (dn) were also performed on ptDNA and similar, nonsignificant trends were noted (SI Appendix, Fig. S4 and Dataset S2).

Copy Number Estimates Are Robust Across Multiple Calculation Methods.

To validate our mtDNA and ptDNA copy number estimates based on whole-genome shotgun sequencing coverage, we used two approaches: published C-value (i.e., nuclear genome size) estimates based on flow cytometry (37) and droplet digital PCR (ddPCR). In angiosperms, copy number was calculated using the coverage of organellar genes divided by coverage of nuclear genes. This approach was not possible in gymnosperms, as for many species there were not sufficient sequencing reads for 1× nuclear genome coverage (due to their particularly large nuclear genomes). Therefore, we used C-value estimates (37) for gymnosperm copy number estimation by dividing the total amount of sequenced base pairs by the C-value (Materials and Methods). In both mtDNA and ptDNA, there was a significant positive correlation between log10-transformed copy numbers calculated via C-value and based on mapping coverage (mtDNA: t = 6.44, P < 0.001, R2 = 0.52; ptDNA: t = 4.91, P < 0.001, R2 = 0.39; log-transformed data forced through the origin, SI Appendix, Fig. S5). We also repeated our main analyses using C-value-generated copy numbers for all species instead of copy numbers based on sequencing coverage. The findings were similar: Standard linear and PGLS models showed a negative correlation between copy number and mtDNA substitution rates (lm: t = −5.88, P < 0.001; PGLS: t = −2.93, P = 0.005), while only the standard linear model was significant in ptDNA (lm: t = −2.30, P = 0.026; PGLS: t = −1.12, P = 0.27). Correlations were also similar between copy number calculated using the C-value approach and organelle genome size in both mtDNA (lm: t = −4.11, P < 0.001; PGLS: t = −2.14, P = 0.037) and ptDNA (lm: t = −1.75, P = 0.086; PGLS: t = −0.48, P = 0.63). These results suggested that using C-values to estimate genome size in gymnosperms was reasonable.
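The C-value approach described in this paragraph can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the C-value is given in picograms and converts it with the widely used factor of 1 pg ≈ 978 Mbp, which is not stated in the text, and it treats total sequenced base pairs as an approximation of nuclear sequence. The function names and example numbers are hypothetical.

```python
BP_PER_PG = 0.978e9  # assumed conversion: 1 pg of DNA is about 978 Mbp


def nuclear_coverage_from_c_value(total_sequenced_bp, c_value_pg):
    """Approximate nuclear depth as sequenced bp / haploid genome size in bp."""
    if c_value_pg <= 0:
        raise ValueError("C-value must be positive")
    genome_size_bp = c_value_pg * BP_PER_PG
    return total_sequenced_bp / genome_size_bp


def copy_number_from_c_value(organelle_coverage, total_sequenced_bp, c_value_pg):
    """Organelle copies per haploid nuclear genome using a C-value-based depth."""
    nuclear_depth = nuclear_coverage_from_c_value(total_sequenced_bp, c_value_pg)
    return organelle_coverage / nuclear_depth


# Hypothetical example: 9.78 Gbp sequenced for a species with a 10 pg C-value
# gives about 1x nuclear depth, so 120x organelle depth implies ~120 copies.
example_depth = nuclear_coverage_from_c_value(9.78e9, 10.0)
example_copies = copy_number_from_c_value(120.0, 9.78e9, 10.0)
print(example_depth, example_copies)
```

Because the denominator comes from a published genome size rather than from mapped nuclear reads, this estimate remains usable when sequencing depth over the nuclear genome is below 1×.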

Additionally, copy numbers estimated with sequencing coverage were confirmed via ddPCR on a subset of 11 species with known variation in mtDNA substitution rates. Both mtDNA and ptDNA copy numbers were highly correlated between sequencing coverage and ddPCR methods (SI Appendix, Fig. S5 mito P < 0.001; R2 = 0.67; SI Appendix, Fig. S5 plastid P = 0.006; R2 = 0.51; log-transformed data).

DNA samples for the preceding analyses were collected from only vegetative leaf samples. Therefore, our copy number estimates are from these tissues, whereas the pertinent mutational processes shaping evolutionary rates must occur in “germline” tissues (38). To assess whether low mtDNA copy number in leaf tissue was indicative of low copy number in other tissue types, we performed more extensive ddPCR analysis on the species with the lowest estimated mtDNA copy number from leaf tissue (Silene conica). We found that this species retained a low copy number across different tissues, including pistil tissue which may more accurately reflect the copy numbers found in a putative germline. Leaf tissue, seedlings, and pistil of S. conica all had a mtDNA copy number of less than 10, while ptDNA copy numbers ranged from around 200 to over 1,000 (SI Appendix, Fig. S6).

Discussion

Copy Number Explains Substitution Rate and Genome Size Variation in Angiosperm mtDNA.

Within eukaryotes, there is substantial variation in mtDNA neutral evolutionary rates (as estimated by synonymous substitutions). This variation is sometimes observed between closely related taxa, with multiple, independent lineages in angiosperms having undergone drastic, rapid increases in substitution rates (15, 17–20, 22, 39). Although the specific mechanisms underlying these rate accelerations have not been resolved, evolutionary theory predicts that changes in neutral substitution rates can be caused by changes in mutation rates (28). Due to the reliance of plant mtDNA on homologous recombination for repair, it is possible for mutation rates to increase when mtDNA copy numbers become low, leading to a lack of available templates for homologous repair (24, 25). We therefore examined correlations between mtDNA copy number and synonymous substitution rates to evaluate this hypothesis across 60 plants.

Our results largely support the hypothesis that mtDNA repair is limited in plants with low copy number, although correlations were not linear. At higher copy numbers, homologous recombination may act efficiently, resulting in the relatively slow, “normal” substitution rates characteristic of most plant species’ mtDNA (Fig. 2A). At lower copy numbers, a diversity of evolutionary rates exists, and all species in our dataset that experienced rate accelerations had relatively low copy numbers (Fig. 2A). Therefore, all rate accelerations in the plant lineages examined here could be caused by low copy numbers and inefficient homologous recombination. While originally unexpected, it is possible that differences in homologous recombination may lead to a nonlinear trend as observed here.

However, low copy numbers do not necessarily lead to rate accelerations, as some species with low copy numbers maintain low substitution rates (Fig. 2A). It is possible that other factors involved in homologous recombination may mitigate the effects of low copy number, such as organelle size and mito-fusion/fission dynamics. For example, larger organelles may result in more mtDNA copies per mitochondrion available for recombination (vs. when copies are sequestered in many separate organelles), and mitochondria that fuse frequently may allow for an increase in circulation of mitochondrial genomes around the cell (16, 40–42). While visualization of organelles in our dataset is unexplored for most species showing variable mtDNA substitution rates, we note that in Silene “fast” and “slow” species show similar mitochondrial morphology and dynamics (43). This suggests that mtDNA copy number, not organelle dynamics, is responsible for template availability in at least some lineages.

Besides template availability, homologous recombination repair may vary across taxa due to enzyme efficiencies or other factors (26). Additional mutational repair mechanisms such as base excision repair and improved replication-mediated repair may also play a dominant role in some lineages, allowing overall rates to remain low despite inefficient homologous repair (33). Future work should expand on the species sampled here, especially in other lineages (e.g., Ajuga and Sarcophrynium) showing known rate accelerations (14, 20). The generation of more complete mitogenomes and WGS datasets will allow more robust testing of these correlations.

Although the trend between mtDNA copy number and synonymous substitution rates was strong across angiosperms, it was not significant within gymnosperms (Fig. 2B) as gymnosperm mtDNA substitution rates were uniformly slow in our dataset (Fig. 1). One reason gymnosperms may have shown overall slower rates in our analysis could be due to an overrepresentation of species with longer generation times. Our gymnosperm dataset contained numerous tree species which have much longer generation times than the chosen angiosperm species, making the absolute substitution rates relatively slow for gymnosperms. However, generation time does not explain variable substitution rates in angiosperms, as angiosperms in our dataset had fairly consistent generation times (i.e., biennials or short-lived perennials). Similarly, ploidy levels are fairly consistent within the angiosperm lineages examined here, suggesting ploidy is not responsible for variation in copy number, genome size, or substitution rates.

We also found that mtDNA copy number correlates negatively with mitogenome size (Fig. 2). One explanation for this finding is that large insertions could be less likely to be corrected via homologous repair if copy numbers are low (and if insertions are larger and/or more common than deletions, which is a hypothesis that needs to be tested in plant mtDNA). Therefore, both expanded mitogenome sizes and fast mtDNA mutation/substitution rates could be caused by low mtDNA copy number. However, large genome sizes could also be a cause, rather than a consequence, of low mtDNA copy number. In this scenario, lineages with large genome expansions might be limited to low mtDNA copy numbers given the physical and biochemical limitations of the amount of DNA that can be replicated and housed in organelles, which appear to have similar morphology in slow and fast Silene species (43). Low copy numbers would then lead to high mutation and substitution rates via inefficient homologous repair. Our results do not allow us to determine whether large mitogenomes are a cause or a consequence of low copy numbers. Although an increase in substitution rates is associated with an increase in genome size in our dataset, the most compact plant mitogenome, Viscum scurruloideum, is fast, not slow evolving (21). However, the parasitic lifestyle of this species was speculated to lead to relaxed selection in its mtDNA, which has lost many conserved genes and shows a pattern of relaxed mtDNA selection overall (21) that is not characteristic of other fast-evolving species in our dataset such as Silene (SI Appendix, Fig. S2) (17, 44). We also acknowledge that plant mtDNA size is largely driven by changes in noncoding DNA but that rates and copy numbers were estimated from coding regions in our dataset. It has been proposed that mutation and repair processes differ between genic and intergenic regions in plant mtDNA (45, 46), but there is limited evidence for this model (47).

ptDNA Copy Number in Leaves Is not Associated with Substitution Rate and Genome Size Variation in Angiosperms.

Because homologous repair is also a key mutation repair mechanism for ptDNA, we expected similar trends between ptDNA substitution rate and copy number. Previous work shows that plastid genes in the inverted repeat region of the plastome evolve more slowly than those in the single-copy region, likely because of more efficient homologous repair (6, 48–50). However, ptDNA substitution rates were not significantly associated with copy number when controlling for phylogeny (Fig. 3). This may be because mtDNA copy numbers from vegetative cells closely reflect copy numbers in meristematic cells (i.e., the germline), but ptDNA vegetative copy numbers do not. Copy numbers in germline tissue would provide the most meaningful correlations with evolutionary rates, but only vegetative WGS data were available for the species examined here. ptDNA copy number expands rapidly during development of leaf tissue from meristematic tissue (51), which may have obliterated any meaningful variation from our dataset, resulting in only modest variation for ptDNA compared to mtDNA substitution rates (Fig. 1). Our ddPCR data support this conclusion, as ptDNA copy numbers in S. conica flowers were low compared to leaves, while mtDNA copy numbers were low across all tissues (SI Appendix, Fig. S6). Plastomes also generate the majority (up to 82%) of total mRNA transcripts in plant leaf cells (52), suggesting selection may be keeping ptDNA copy numbers universally high in photosynthetic tissues. Future work investigating ptDNA copy number across plant tissues would allow more robust tests of our hypothesis.

Selection Efficiency and Drift as Predictors of Substitution Rates in Organellar Genomes.

To examine substitutions with more likely fitness consequences, we analyzed correlations between absolute nonsynonymous substitution rates (dN per million years of evolution) and copy number or genome size for both mtDNA (SI Appendix, Fig. S2) and ptDNA (SI Appendix, Fig. S4). The same trends were found with dN as when synonymous substitution rates were examined: There were strong, phylogenetically robust relationships between rates and copy number or size in mtDNA but not for ptDNA (SI Appendix, Figs. S2 and S4). Because both mutation rate and selection should influence dN, we also evaluated the relationship between dN/dS ratios and copy number (SI Appendix, Figs. S7 and S8). Relaxed selection from drift should result in increased dN/dS ratios, and we predicted a similar negative relationship between dN/dS ratios and copy number if relaxed selection was driving high dN at low copy number. However, there was not a significant correlation between dN/dS and mtDNA or ptDNA copy number in the PGLS models (SI Appendix, Figs. S7 and S8 and Dataset S2), suggesting that variable mutation rates, not selection efficiencies, are driving the patterns we report here.

Across broader eukaryotes, increases in nuclear genome size and substitution rates are typically associated with decreases in Ne (53). Although the mechanisms for how decreases in Ne contribute to increases in synonymous substitution rates and genome sizes have not been fully resolved, selection likely plays an important role in this process, with lower selection efficiency leading to higher substitution rates via genetic drift at low Ne. It is possible that decreases in organellar copy number contribute to similar trends in organellar genomes, as mitogenomes and plastomes operate as a population on a subcellular level. It is possible that copy number behaves in a similar way to Ne for organellar genomes and that many slightly deleterious mutations are more likely to spread in a population with low copy number due to higher genetic drift and lower efficiency of selection, although organellar populations are known to undergo bottlenecks between generations (54). While neutral substitution rates should reflect differences in mutation rates, drift and selection might modify the relationship between copy number and substitution rate shown here (Fig. 2). Assuming selection may play a role in driving mutations to fixation (i.e., becoming a substitution), the power of genetic drift is related to Ne by a power law, which we observed when correlating substitution rates to copy number (Fig. 2A).

It is tempting to interpret our results through the drift-barrier hypothesis, which was based on a negative relationship between synonymous substitution rates and Ne in both nuclear and mitochondrial genomes of metazoa (53). This observed trend followed a power law with a slope of about −0.6, very similar to the correlation between dS substitution rate and copy number that we report here (−0.66, Fig. 2B). Lynch (2010) additionally found a positive association between genome size and substitution rate, similar to our results (Fig. 4). The drift-barrier hypothesis explains these trends by proposing that populations with low Ne are unable to optimize their mutational repair machinery to the same degree as larger populations due to less efficient selection. This leads to higher mutation rates (along with larger, bloated genomes) in species with low Ne (53). It is tempting to consider copy number as having a similar effect on mtDNA as Ne would have on nuclear DNA. However, applying the drift-barrier hypothesis to our data is complicated because organelle repair machinery is encoded in the nuclear genome and would be limited by the nuclear Ne and not influenced by the mitochondrial Ne (or mtDNA copy number). While low mtDNA copy numbers could lead to inflated genomes and increases in mutation rates via a decrease in Ne, popular explanations such as the drift-barrier hypothesis are difficult to apply to the mtDNA patterns we describe here.

Implications Outside of Plants and Future Directions.

Overall, we find that copy number strongly predicts evolutionary rates and genome sizes in plant mtDNA, likely because homologous repair is a primary mutation repair mechanism in plant mtDNA, and possibly because of more general trends in how selection acts on organellar genomes. Though this study was limited to plants and biased to lineages with dramatic rate accelerations, it has implications for evolutionary rates of mtDNA, ptDNA, and other organellar/endosymbiotic genomes across eukaryotes. Homologous recombination is likely used in the mtDNA of many other eukaryotic taxa besides plants and has been demonstrated in yeast (55, 56). Because homologous repair is a major mutation repair mechanism in bacteria (57), it is likely that the original mitochondrial symbiont relied on homologous repair, which may be considered an ancestral repair mechanism for mtDNA. However, homologous repair appears to have been lost in mtDNA of many bilaterian animals (58, 59), which may be one reason mtDNA mutation rates appear to be elevated in bilaterians compared to plants (1, 6). Some evidence does point to mitochondrial recombination occurring in particular animal lineages (55, 60, 61), and recent work identified a nuclear recombination protein that also seems to repair mtDNA double-strand breaks in Drosophila and humans (62). Future work could perform analyses similar to those used here for plants on other eukaryotic lineages to test whether the correlations found here extend to other taxa, both with and without known mtDNA homologous recombination repair mechanisms. If selection and drift play a role in these trends, they may be a general characteristic across organelle genomes in eukaryotes regardless of whether homologous recombination is used to repair mutations.

Nonbilaterian animals may also be important models for establishing links between homologous repair and rates of mtDNA evolution. For example, octocorals have low mtDNA substitution rates compared to bilaterian animals, similar to plants (7). Plant and bilaterian animal mtDNA repair machinery is encoded in the nuclear genome, but octocorals have horizontally acquired a mutS homolog encoded in their mtDNA (63, 64) that likely participates in mismatch repair in the mitogenome (65–67). Interestingly, mtDNA rate accelerations have occurred in octocoral species where mutS has been secondarily lost (67). Rates of mtDNA evolution also vary drastically among other nonbilaterian lineages including cnidarians, sponges, ctenophores, and myxozoans, where rate accelerations are common (12). However, the roles of homologous recombination vs. mismatch repair in the mtDNAs of these lineages are underexplored. In general, it is tempting to speculate that the efficiency of homologous repair in mtDNA mediated by homologous repair proteins may explain variation in mtDNA evolutionary rates across other taxa as well as plants.

Future work could also test our hypothesis experimentally. One experiment could directly manipulate mtDNA copy number in a controlled fashion to test for increases in genome size and mutation rates. Given an extreme reduction in copy number, mtDNA mutation rates should be elevated and detectable in vegetative tissue via high-fidelity sequencing approaches (68). After propagation for several generations, substitution rates and mtDNA genome sizes might also increase. However, we note that some species in our dataset with low copy number retained low mtDNA substitution rates (Fig. 2B). Previous work has examined the effect of an Arabidopsis heterozygous knockout of polIB, a gene involved in replication of organellar genomes in plants (69), although there was no effect on mutation rates. Overall, our findings show a correlation between substitution rates and copy number, but empirical data are needed to confirm causality.

Materials and Methods

Data Curation.

To test our hypothesis, we used sequences from nuclear genes, mitochondrial genes, and plastid genes to calculate mtDNA and ptDNA substitution rates, and used raw sequencing reads from whole genome shotgun sequencing projects to calculate relative copy number (i.e., number of mtDNA copies per nuclear genome copy). We used publicly available shotgun genome sequencing data from NCBI (Dataset S3). Sequencing methods for Sileneae, Plantagineae, and Geraniaceae were described in previous publications by coauthors (13, 24, 70–75). Sequencing methods for gymnosperms and other angiosperms are found on their corresponding SRA pages. Most WGS samples used in this study are Illumina paired-end sequences. Raw reads for each of the 60 species were first assembled using SPAdes-3.15.2 with settings -k 21,33,55,77,99 -t 64 to generate putative contigs from nuclear and mitochondrial genomes (76). These contigs are available on NCBI (Dataset S4). A set of 954 known single-copy nuclear genes from across flowering plants (77) was used as BLAST queries against these contigs using discontiguous MEGABLAST in Geneious Prime 2022, returning between 113 and 903 hits depending on the species (77, 78). Nuclear hits that were shorter than 1 kb were extended to flanking regions to reach 1 kb. These sequences comprised the nuclear gene set for each species. New mitochondrial gene sequences were obtained by aligning genes from closely related species to the SPAdes contigs for that species using Geneious Prime 2022. Sixteen total mtDNA genes were used in the analyses: atp1, atp4, atp6, atp8, atp9, ccmB, ccmC, ccmFn, cox1, cox3, cob, mttB, nd3, nd4L, nd6, and nd9. Plastid sequences were extracted from previously generated, annotated genomes (Dataset S5) and included 12 plastid genes: psaA, psaB, psaC, psbA, psbB, psbC, psbD, psbE, psbH, psbK, psbN, and psbZ.

Estimating Copy Number via Sequencing.

Bowtie2 v.2.4.5 was used to align raw Illumina reads for each species to fasta files containing nuclear, mitochondrial, or plastid gene sequences for the same species (79). Samtools-depth v.1.14 was used to calculate the mapping depth for each base pair of each gene (80). This file was imported into R and copy number was calculated using a custom R-script. Briefly, 150 base pairs were trimmed from the ends of each gene before coverage over each gene was averaged. For angiosperms, nuclear coverage was calculated by taking the median coverage of these averaged depths. Since gymnosperms had larger nuclear genomes, in many cases WGS did not provide sufficient coverage to accurately cover nuclear contigs (i.e., coverage was < 1×). For this reason, we used C-values from the Kew database, release 7.1, to estimate the genome size in base pairs (37); then, the total number of base pairs sequenced divided by the C-value size estimate was used as the nuclear coverage. This method was validated by using a subset of angiosperms with published C-values, and we also used C-value estimated copy numbers across all species (angiosperms and gymnosperms) to repeat our main analysis investigating the relationships between substitution rates and copy numbers (SI Appendix, Fig. S5). Mitochondrial coverage was calculated similarly, but using the 16 mitochondrial genes as curated above. Plastid coverage was calculated by taking the median coverage when mapping reads to the plastome, which was available for most species. If not, plastid coverage was calculated in the same way as the mitochondrial and nuclear coverage using the 12 plastid genes as curated above. mtDNA and ptDNA copy numbers were then calculated by dividing the mitochondrial coverage and plastid coverage by the nuclear coverage, respectively.
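The coverage-based calculation described above can be sketched in Python. This is an illustration only and not the authors' custom R script. It assumes per-base depth in the three-column, tab-separated layout written by samtools depth (reference name, position, depth), with every position reported (e.g., via its -a option). The function names and the choice to skip genes shorter than twice the trim length are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean, median

TRIM = 150  # bases trimmed from each end of every gene, as stated in the text


def per_gene_mean_depth(depth_lines, trim=TRIM):
    """Mean depth per gene after trimming `trim` positions from both ends.

    `depth_lines` are tab-separated records: reference name, position, depth.
    Positions are assumed to be complete and in order for each gene.
    """
    depths = defaultdict(list)
    for line in depth_lines:
        name, _pos, depth = line.rstrip("\n").split("\t")
        depths[name].append(int(depth))
    means = {}
    for name, values in depths.items():
        core = values[trim:len(values) - trim]
        if core:  # assumption: genes with nothing left after trimming are skipped
            means[name] = mean(core)
    return means


def genome_coverage(gene_means):
    """Coverage of one genome: median of the per-gene mean depths."""
    return median(gene_means.values())


def nuclear_coverage_from_total_bases(total_bases_sequenced, genome_size_bp):
    """Alternative used for gymnosperms: sequenced bases / genome size in bp."""
    return total_bases_sequenced / genome_size_bp


def relative_copy_number(organelle_coverage, nuclear_coverage):
    """Organellar genome copies per nuclear genome copy."""
    return organelle_coverage / nuclear_coverage
```

With mitochondrial, plastid, and nuclear depth records parsed separately, the mtDNA copy number in this sketch is relative_copy_number(genome_coverage(mito_means), genome_coverage(nuclear_means)), where mito_means and nuclear_means are the outputs of per_gene_mean_depth.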

Calculating Substitution Rates.

The rate of synonymous substitutions (dS) per million years of evolution was calculated on terminal branches of our phylogeny (Fig. 1) using both the mtDNA and ptDNA genes curated above. The topology was constrained based on the most likely relationships between the focal species (15, 17, 73). First, individual mtDNA or ptDNA gene sequences were aligned and concatenated using custom scripts before using codeml in PAML (v4.9) with “model=1” to estimate dS for each branch in the phylogeny (81, 82). Then, absolute time since divergence events was estimated from a chronogram made using a phylogeny based on matK plastid sequences constructed in RAxML-NG (using default settings) and then using r8s to time calibrate the tree (83–85). We used two time points to calibrate the chronogram: the origin of rosids at 118 mya and the Pinales/Gnetum split at 307 mya (86). Terminal branches were collapsed if their divergence time was less than 1 mya. One species (Pelargonium nanum) was also removed due to its zero-length dS terminal branch. We then calculated absolute synonymous substitution rates using dS estimates from each terminal branch divided by time since divergence estimated by the matK chronogram. We used the same method to also calculate absolute nonsynonymous substitution rates in a separate analysis using terminal branch dN estimates.
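The final step, converting branch-wise estimates into absolute rates, amounts to a division with two filters. The Python sketch below illustrates it under stated assumptions: the input is a hypothetical mapping from species to a (branch dS, divergence time in million years) pair, and branches younger than 1 million years are simply skipped rather than collapsed as in the text.

```python
def absolute_rates(branches, min_age_myr=1.0):
    """Substitutions per site per million years for each terminal branch.

    `branches` maps species name -> (branch length in dS or dN, age in Myr).
    Simplification: branches younger than `min_age_myr` are skipped here,
    whereas the text collapses them; zero-length branches are dropped.
    """
    rates = {}
    for species, (length, age_myr) in branches.items():
        if age_myr < min_age_myr:
            continue
        if length == 0:
            continue
        rates[species] = length / age_myr
    return rates
```

The same function applies to dN by passing nonsynonymous branch lengths instead.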

Validating Copy Number via ddPCR.

A subset of species was used to validate mtDNA and ptDNA copy numbers using ddPCR (87). Primers (Dataset S6) were designed to amplify conserved regions of three mitochondrial genes (cob, atp1, and nad9), three plastid genes (matK, psaA, and psbE), and three single-copy nuclear genes (beta-1,2-N-acetylglucosaminyltransferase II, AT2G05320; tyrosyl-tRNA synthetase, class Ib, AT3G02660; and 3,8-divinyl protochlorophyllide a 8-vinyl reductase, AT5G18660; following the Arabidopsis nomenclature). DNA was extracted from leaf tissues of 11 species with known variation in mtDNA substitution rates: A. githago, S. latifolia, S. conica, P. sericea, A. thaliana, P. coronopus, P. aristata, S. noctiflora, C. macrophylla, P. myrrhifolium, and G. incanum. Qiagen DNeasy kits were used to extract whole genomic DNA from all samples except for Geraniaceae samples, which were processed using a modified CTAB protocol (88). Samples used in ddPCR were from the same DNA extraction used for sequencing when possible. ddPCR was then performed for each gene/species combination using a Bio-Rad ddPCR system (QX200 Droplet Generator and Droplet Reader and QX200 ddPCR EvaGreen Supermix) along with 2.5 ng DNA for nuclear genes, 1.25 ng DNA for mitochondrial genes, and 0.0025 ng DNA for plastid genes. PCR was run with an initial 95 °C denaturing step for 5 min followed by 40 cycles of 30 s at 95 °C and 60 s at 55 °C. Thresholds for positive and negative droplets were adjusted manually by eye for each sample after using Bio-Rad QuantaSoft autothresholding. We then used ddPCR output to calculate the number of each gene copy per ng DNA in each species. Copy number was estimated as median mtDNA or ptDNA gene copy per ng DNA divided by median nuclear gene copy per ng DNA.
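The ratio in the last two sentences can be written out as a short Python sketch. It assumes that the number of target copies in each reaction has already been obtained from the ddPCR software; how that count is derived from droplet data is not covered here, and the function names are hypothetical.

```python
from statistics import median


def copies_per_ng(copies_in_reaction, ng_dna_input):
    """Normalize a per-reaction copy count by the DNA mass loaded (in ng)."""
    return copies_in_reaction / ng_dna_input


def ddpcr_copy_number(organelle_copies_per_ng, nuclear_copies_per_ng):
    """Median organellar gene copies per ng over median nuclear gene copies per ng.

    Each argument holds one value per gene (three genes per genome in the text).
    """
    return median(organelle_copies_per_ng) / median(nuclear_copies_per_ng)
```

Normalizing by input mass matters here because the three genomes were assayed with different amounts of DNA (2.5, 1.25, and 0.0025 ng).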

Tissue-specific measurements of mtDNA and ptDNA copy number in Silene conica were performed using ddPCR, essentially as in Broz et al. (24), to confirm that low mtDNA copy numbers were present across tissues. In brief, seeds were placed on germination paper in petri dishes and grown on light racks under short day conditions (10 h light/14 h dark). A portion of the seedlings were harvested when cotyledons had fully expanded (N = 3 samples of five pooled seedlings each), and others were transferred to potting media. For transferred plants, fully expanded rosette leaves were harvested after ~1.5 mo of growth (N = 3 individuals), and upon maturity, pistils were dissected from recently opened flowers (N = 3 individuals). DNA was extracted from all samples using the Qiagen Plant DNeasy Kit, and ddPCR and copy number determination were performed as described previously (24).

Estimating Mitochondrial Genome Sizes.

Unlike mitogenomes in bilaterian animals, mitochondrial genomes in plants are difficult to assemble, and complete mitogenomes were not available for most species in our dataset. Therefore, we estimated mitogenome sizes by identifying likely mitochondrial contigs in our SPAdes assemblies (see above) based on coverage. For each species, we plotted average coverage vs. length for each contig. These plots show a peak for each genome: nuclear, mitochondrial, and plastid (SI Appendix, Figs. S9–S12). We also identified which contigs contained our curated 16 mtDNA genes (see above). Peaks were visually inspected and thresholds for distinguishing mtDNA from other contigs were set by eye. The lengths of all the mtDNA contigs were then summed to generate an estimated mitogenome size for each species. For 16 species with known mitogenome sizes, we found a strong correlation (R² = 0.93) between mitogenome sizes estimated using this approach and the known mitogenome size (SI Appendix, Fig. S13). We note that mitogenome sizes for Pelargonium tetragonum and Plantago coronopus were estimated to be above 15 Mb, which would make them the largest mitogenomes reported for a eukaryote (89); these estimates warrant further investigation. A similar method was also used to estimate plastome size in two species in the dataset without available plastomes, Geranium brycei and Geranium sanguineum.
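The size estimate reduces to summing the lengths of contigs whose coverage falls inside a chosen window. The Python sketch below illustrates this under stated assumptions. It reads length and coverage from SPAdes-style contig names (NODE_<n>_length_<bp>_cov_<value>), whose coverage value is a k-mer coverage and may differ from the average read coverage plotted by the authors. The coverage bounds stand in for the thresholds that were set by eye, and the function names are hypothetical.

```python
import re

# SPAdes-style contig name, e.g. ">NODE_12_length_48213_cov_95.4"
HEADER = re.compile(r"NODE_\d+_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header):
    """Return (length in bp, coverage) parsed from a SPAdes-style contig name."""
    match = HEADER.search(header)
    if match is None:
        raise ValueError("not a SPAdes-style contig name: %r" % header)
    return int(match.group(1)), float(match.group(2))


def estimate_genome_size(contigs, cov_min, cov_max):
    """Sum the lengths of contigs with coverage inside [cov_min, cov_max].

    `contigs` is an iterable of (length_bp, coverage) pairs; the bounds are
    user-supplied stand-ins for thresholds chosen by visual inspection.
    """
    return sum(length for length, cov in contigs if cov_min <= cov <= cov_max)
```

A coverage window alone cannot separate genomes whose coverage peaks overlap, which is one reason the presence of the curated mtDNA genes on a contig is a useful additional check.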

Statistics and Figures.

Statistical analyses were performed in R version 4.2.2. Standard linear models were run in base R using log10 transformed data and the lm function. PGLS models were run with log10 transformed data and equal-length branches using the nlme package (90, 91). Figures were made using ggplot2 (92). Data from the main figures can be found in the supplement (Dataset S7).
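A linear model on log10 transformed data is equivalent to fitting a power law, y = 10^a · x^b, where b is the slope and a the intercept on the log scale. The Python sketch below shows an ordinary least-squares version of that fit using only the standard library. It is an illustration and not the authors' R code, it does not include the phylogenetic correction used in the PGLS models, and the function name is hypothetical.

```python
import math


def fit_power_law(x, y):
    """Ordinary least squares of log10(y) on log10(x).

    Returns (slope, intercept, r_squared) on the log10 scale.
    All values in x and y must be positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared
```

Taken at face value, a log-log slope of −0.66 like the one reported for dS and copy number would mean that a tenfold decrease in copy number corresponds to roughly a 4.6-fold (10^0.66) increase in the synonymous substitution rate.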

Supplementary Material

Appendix 01 (PDF)

Dataset S01 (CSV)

Dataset S02 (CSV)

Dataset S03 (CSV)

Dataset S04 (CSV)

Dataset S05 (CSV)

Dataset S06 (CSV)

Dataset S07 (CSV)

Acknowledgments

We thank the members of the Havird Lab for thoughtful comments on the manuscript and Erik Iverson for assisting with the PGLS analysis. This work was funded by NIH NIGMS MIRA grant 5R35GM142836-03.

Author contributions

D.B.S. and J.C.H. designed research; K.D.Z., L.G.T., Z.W., S.K., A.K.B., J.P.M., T.A.R., R.K.J., D.B.S., and J.C.H. performed research; K.D.Z. analyzed data; and K.D.Z. and J.C.H. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Data, Materials, and Software Availability

All raw reads, gene sequences, and contigs used in this project are available on NCBI (Datasets S1 and S3–S5) (93). Code and other datasets are available in Figshare (https://figshare.com/projects/Genome_copy_number_predicts_extreme_evolutionary_rate_variation_in_plant_mitochondrial_DNA/180793) (94) and GitHub (https://github.com/thekzwon/plant_mitogenome_project) (95) repositories.

References

  • 1.Brown W. M., George M., Wilson A. C., Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. U.S.A. 76, 1967–1971 (1979). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Havird J. C., Sloan D. B., The roles of mutation, selection, and expression in determining relative rates of evolution in mitochondrial versus nuclear genomes. Mol. Biol. Evol. 33, 3042–3053 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Smith D. R., Arrigo K. R., Alderkamp A. C., Allen A. E., Massive difference in synonymous substitution rates among mitochondrial, plastid, and nuclear genes of Phaeocystis algae. Mol. Phylogenet. Evol. 71, 36–40 (2014). [DOI] [PubMed] [Google Scholar]
  • 4.Allio R., Donega S., Galtier N., Nabholz B., Large variation in the ratio of mitochondrial to nuclear mutation rate across animals: Implications for genetic diversity and the use of mitochondrial DNA as a molecular marker. Mol. Biol. Evol. 34, 2762–2772 (2017). [DOI] [PubMed] [Google Scholar]
  • 5.Lynch M., Koskella B., Schaack S., Mutation pressure and the evolution of organelle genomic architecture. Science 311, 1727–1730 (2006). [DOI] [PubMed] [Google Scholar]
  • 6.Wolfe K. H., Li W. H., Sharp P. M., Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. U.S.A. 84, 9054–9058 (1987). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Hellberg M. E., No variation and low synonymous substitution rates in coral mtDNA despite high nuclear variation. BMC Evol. Biol. 6, 24 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.De Chiara M., et al. , Discordant evolution of mitochondrial and nuclear yeast genomes at population level. BMC Biol. 18, 1–15 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Nabholz B., Glémin S., Galtier N., The erratic mitochondrial clock: Variations of mutation rate, not population size, affect mtDNA diversity across birds and mammals. BMC Evol. Biol. 9, 1–13 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Drouin G., Daoud H., Xia J., Relative rates of synonymous substitutions in the mitochondrial, chloroplast and nuclear genomes of seed plants. Mol. Phylogenet. Evol. 49, 827–831 (2008). [DOI] [PubMed] [Google Scholar]
  • 11.Nabholz B., Glémin S., Galtier N., Strong variations of mitochondrial mutation rate across mammals–the longevity hypothesis. Mol. Biol. Evol. 25, 120–130 (2008). [DOI] [PubMed] [Google Scholar]
  • 12.Lavrov D. V., Pett W., Animal mitochondrial DNA as we do not know it: mt-genome organization and evolution in nonbilaterian lineages. Genome Biol. Evol. 8, 2896–2913 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Havird J. C., Trapp P., Miller C. M., Bazos I., Sloan D. B., Causes and consequences of rapidly evolving mtDNA in a plant lineage. Genome Biol. Evol. 9, 323–336 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Sloan D. B., Using plants to elucidate the mechanisms of cytonuclear co-evolution. New Phytol. 205, 1040–1046 (2015). [DOI] [PubMed] [Google Scholar]
  • 15.Mower J. P., Touzet P., Gummow J. S., Delph L. F., Palmer J. D., Extensive variation in synonymous substitution rates in mitochondrial genes of seed plants. BMC Evol. Biol. 7, 135 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Mower J. P., Sloan D. B., Alverson A. J., "Plant mitochondrial genome diversity: The genomics revolution" in Plant Genome Divers. Vol. 1 Plant Genomes (2012), pp. 123–144. [Google Scholar]
  • 17.Sloan D. B., et al. , Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biol. 10, e1001241 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Parkinson C. L., et al. , Multiple major increases and decreases in mitochondrial substitution rates in the plant family Geraniaceae. BMC Evol. Biol. 5, 1–12 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Bakker F. T., Breman F., Merckx V., DNA sequence evolution in fast evolving mitochondrial DNA nad1 exons in Geraniaceae and Plantaginaceae. Taxon 55, 887–896 (2006). [Google Scholar]
  • 20.Zhu A., Guo W., Jain K., Mower J. P., Unprecedented heterogeneity in the synonymous substitution rate within a plant genome. Mol. Biol. Evol. 31, 1228–1236 (2014). [DOI] [PubMed] [Google Scholar]
  • 21.Skippington E., Barkmanb T. J., Ricea D. W., Palmera J. D., Miniaturized mitogenome of the parasitic plant viscum scurruloideum is extremely divergent and dynamic and has lost all nad genes. Proc. Natl. Acad. Sci. U.S.A. 112, E3515–E3524 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Lee C., Ruhlman T. A., Jansen R. K., Rate accelerations in plastid and mitochondrial genomes of Cyperaceae occur in the same clades. Mol. Phylogenet. Evol. 182, 107760 (2023). [DOI] [PubMed] [Google Scholar]
  • 23.Rautenberg A., Sloan D. B., Aldén V., Oxelman B., Phylogenetic Relationships of Silene multinervia and Silene Section Conoimorpha (Caryophyllaceae). Syst. Botany 37, 226–237 (2012). 10.1600/036364412X616792. [DOI] [Google Scholar]
  • 24.Broz A. K., Waneka G., Wu Z., Gyorfy M. F., Sloan D. B., Detecting de novo mitochondrial mutations in angiosperms with highly divergent evolutionary rates. Genetics 218, iyab039 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Guo W., Evolution of Organellar Genome Architecture in Seed Plants: The Role of Intracellular Gene Transfer, Recombination and Mutation. ETD collection for University of Nebraska-Lincoln. AAI3642759 (2014).
  • 26.Chevigny N., Schatz-Daas D., Lotfi F., Gualberto J. M., DNA repair and the stability of the plant mitochondrial genome. Int. J. Mol. Sci. 21, 328 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Miller-Messmer M., et al. , RecA-dependent DNA repair results in increased heteroplasmy of the Arabidopsis mitochondrial genome. Plant Physiol. 159, 211 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Kimura M., Evolutionary rate at the molecular level. Nature 217, 624–626 (1968). [DOI] [PubMed] [Google Scholar]
  • 29.Harman D., Free radical theory of aging: Consequences of mitochondrial aging. Age (Omaha). 6, 86–94 (1983). [Google Scholar]
  • 30.Shokolenko I. N., Wilson G. L., Alexeyev M. F., Persistent damage induces mitochondrial DNA degradation. DNA Repair (Amst). 12, 488–499 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Shokolenko I. N., Venediktova N., Bochkareva A., Wilson G. I., Alexeyev M. F., Oxidative stress induces degradation of mitochondrial DNA. Nucleic Acids Res. 37, 2539 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Davila J. I., et al. , Double-strand break repair processes drive evolution of the mitochondrial genome in Arabidopsis. BMC Biol. 9, 1–14 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Boesch P., et al. , DNA repair in organelles: Pathways, organization, regulation, relevance in disease and aging. Biochim. Biophys. Acta Mol. Cell Res. 1813, 186–200 (2011). [DOI] [PubMed] [Google Scholar]
  • 34.Palmer J. D., Herbon L. A., Plant mitochondrial DNA evolves rapidly in structure, but slowly in sequence. J. Mol. Evol. 28, 87–97 (1988). [DOI] [PubMed] [Google Scholar]
  • 35.Wu Z., Waneka G., Broz A. K., King C. R., Sloan D. B., MSH1 is required for maintenance of the low mutation rates in plant mitochondrial and plastid genomes. Proc. Natl. Acad. Sci. U.S.A. 117, 16448–16455 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Mower J. P., Variation in protein gene and intron content among land plant mitogenomes. Mitochondrion 53, 203–213 (2020). [DOI] [PubMed] [Google Scholar]
  • 37.Pellicer J., Leitch I. J., The Plant DNA C-values database (release 7.1): An updated online repository of plant genome size data for comparative studies. New Phytol. 226, 301–305 (2020). [DOI] [PubMed] [Google Scholar]
  • 38.Lanfear R., Do plants have a segregated germline? PLoS Biol. 16, e2005439 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Cho Y., Mower J. P., Qiu Y. L., Palmer J. D., Mitochondrial substitution rates are extraordinarily elevated and variable in a genus of flowering plants. Proc. Natl. Acad. Sci. U.S.A. 101, 17741–17746 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Arimura S. I., Fission and fusion of plant mitochondria, and genome maintenance. Plant Physiol. 176, 152–161 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Seguí-Simarro J. M., Coronado M. J., Staehelin L. A., The mitochondrial cycle of arabidopsis shoot apical meristem and leaf primordium meristematic cells is defined by a perinuclear tentaculate/cage-like mitochondrion. Plant Physiol. 148, 1380–1393 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Chustecki J. M., Etherington R. D., Gibbs D. J., Johnston I. G., Altered collective mitochondrial dynamics in the Arabidopsis msh1 mutant compromising organelle DNA maintenance. J. Exp. Bot. 73, 5428–5439 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Havird J. C., et al. , Do angiosperms with highly divergent mitochondrial genomes have altered mitochondrial function? Mitochondrion 49, 1 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Havird J. C., Whitehill N. S., Snow C. D., Sloan D. B., Conservative and compensatory evolution in oxidative phosphorylation complexes of angiosperms with highly divergent rates of mitochondrial genome evolution. Evolution 69, 3069–3081 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Christensen A. C., Plant mitochondrial genome evolution can be explained by DNA repair mechanisms. Genome Biol. Evol. 5, 1079 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Christensen A. C., Genes and junk in plant mitochondria—Repair mechanisms and selection. Genome Biol. Evol. 6, 1448–1453 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Wu Z., Waneka G., Sloan D. B., The tempo and mode of angiosperm mitochondrial genome divergence inferred from intraspecific variation in Arabidopsis thaliana. G3 (Bethesda) 10, 1077–1086 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Perry A. S., Wolfe K. H., Nucleotide substitution rates in legume chloroplast DNA depend on the presence of the inverted repeat. J. Mol. Evol. 55, 501–508 (2002). [DOI] [PubMed] [Google Scholar]
  • 49.Zhu A., Guo W., Gupta S., Fan W., Mower J. P., Evolutionary dynamics of the plastid inverted repeat: The effects of expansion, contraction, and loss on substitution rates. New Phytol. 209, 1747–1756 (2016). [DOI] [PubMed] [Google Scholar]
  • 50.Weng M. L., Ruhlman T. A., Jansen R. K., Expansion of inverted repeat does not decrease substitution rates in Pelargonium plastid genomes. New Phytol. 214, 842–851 (2017). [DOI] [PubMed] [Google Scholar]
  • 51.Greiner S., et al. , Chloroplast nucleoids are highly dynamic in ploidy, number, and structure during angiosperm leaf development. Plant J. 102, 730–746 (2020). [DOI] [PubMed] [Google Scholar]
  • 52.Forsythe E. S., et al. , Organellar transcripts dominate the cellular mRNA pool across plants of varying ploidy levels. Proc. Natl. Acad. Sci. U.S.A. 119, e2204187119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Lynch M., Evolution of the mutation rate. Trends Genet. 26, 345–352 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Zhang H., Burr S. P., Chinnery P. F., The mitochondrial DNA genetic bottleneck: Inheritance and beyond. Essays Biochem. 62, 225–234 (2018). [DOI] [PubMed] [Google Scholar]
  • 55.Ladoukakis E. D., Zouros E., Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Mol. Biol. Evol. 18, 1168–1175 (2001). [DOI] [PubMed] [Google Scholar]
  • 56.Sena E. P., Revet B., Moustacchi E., In vivo homologous recombination intermediates of yeast mitochondrial DNA analyzed by electron microscopy. MGG Mol. Gen. Genet. 202, 421–428 (1986). [DOI] [PubMed] [Google Scholar]
  • 57.Smith G. R., Homologous recombination in procaryotes. Microbiol. Rev. 52, 1 (1988). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Hagström E., Freyer C., Battersby B. J., Stewart J. B., Larsson N. G., No recombination of mtDNA after heteroplasmy for 50 generations in the mouse maternal germline. Nucleic Acids Res. 42, 1111–1116 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Hayashi J. I., Tagashira Y., Yoshida M. C., Absence of extensive recombination between inter- and intraspecies mitochondrial DNA in mammalian cells. Exp. Cell Res. 160, 387–395 (1985). [DOI] [PubMed] [Google Scholar]
  • 60.Ma H., O’Farrell P. H., Selections that isolate recombinant mitochondrial genomes in animals. Elife 4, e07247 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Guo X., Liu S., Liu Y., Evidence for recombination of mitochondrial DNA in triploid crucian carp. Genetics 172, 1745 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Klucnika A., et al. , REC drives recombination to repair double-strand breaks in animal mtDNA. J. Cell Biol. 222, e202201137 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Pont-Kingdon G. A., et al. , A coral mitochondrial mutS gene. Nature 375, 109–111 (1995). [DOI] [PubMed] [Google Scholar]
  • 64.Pont-Kingdon G., et al. , Mitochondrial DNA of the coral Sarcophyton glaucum contains a gene for a homologue of bacterial MutS: A possible case of gene transfer from the nucleus to the mitochondrion. J. Mol. Evol. 46, 419–431 (1998). [DOI] [PubMed] [Google Scholar]
  • 65.Bilewitch J. P., Degnan S. M., A unique horizontal gene transfer event has provided the octocoral mitochondrial genome with an active mismatch repair gene that has potential for an unusual self-contained function. BMC Evol. Biol. 11, 228 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Ogata H., et al. , Two new subfamilies of DNA mismatch repair proteins (MutS) specifically abundant in the marine environment. ISME J. 57, 1143–1151 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Muthye V., Mackereth C. D., Stewart J. B., Lavrov D. V., Large dataset of octocoral mitochondrial genomes provides new insights into mt-mutS evolution and function. DNA Repair (Amst). 110, 103273 (2022). [DOI] [PubMed] [Google Scholar]
  • 68.Sloan D. B., Broz A. K., Sharbrough J., Wu Z., Detecting rare mutations and DNA damage with sequencing-based methods. Trends Biotechnol. 36, 729–740 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Cupp J. D., Nielsen B. L., Arabidopsis thaliana organellar DNA polymerase IB mutants exhibit reduced mtDNA levels with a decrease in mitochondrial area density. Physiol. Plant. 149, 91–103 (2013). [DOI] [PubMed] [Google Scholar]
  • 70.Wu Z., Sloan D. B., Recombination and intraspecific polymorphism for the presence and absence of entire chromosomes in mitochondrial genomes. Heredity 1225, 647–659 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Mower J. P., et al. , Plastomes from tribe Plantagineae (Plantaginaceae) reveal infrageneric structural synapormorphies and localized hypermutation for Plantago and functional loss of ndh genes from Littorella. Mol. Phylogenet. Evol. 162, 107217 (2021). [DOI] [PubMed] [Google Scholar]
  • 72.Warren J. M., et al. , Rewiring of aminoacyl-tRNA synthetase localization and interactions in plants with extensive mitochondrial tRNA gene loss. Mol. Biol. Evol. 40, msad163 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Choi K. S., Weng M. L., Ruhlman T. A., Jansen R. K., Extensive variation in nucleotide substitution rate and gene/intron loss in mitochondrial genomes of Pelargonium. Mol. Phylogenet. Evol. 155, 106986 (2021). [DOI] [PubMed] [Google Scholar]
  • 74.Clark-Matott J., et al. , Metabolomic analysis of exercise effects in the POLG mitochondrial DNA mutator mouse brain. Neurobiol. Aging 36, 2972–2983 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Park S., et al. , Contrasting patterns of nucleotide substitution rates provide insight into dynamic evolution of plastid and mitochondrial genomes of geranium. Genome Biol. Evol. 9, 1766–1780 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Bankevich A., et al. , SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Duarte J. M., et al. , Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryzaand their phylogenetic utility across various taxonomic levels. BMC Evol. Biol. 10, 1–18 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Camacho C., et al., BLAST+: Architecture and applications. BMC Bioinformatics 10, 1–9 (2009).
  • 79. Langmead B., Salzberg S. L., Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357 (2012).
  • 80. Li H., et al., The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  • 81. Yang Z., PAML: A program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556 (1997).
  • 82. Yang Z., PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
  • 83. Stamatakis A., RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312 (2014).
  • 84. Kozlov A. M., Darriba D., Flouri T., Morel B., Stamatakis A., RAxML-NG: A fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).
  • 85. Sanderson M. J., r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302 (2003).
  • 86. Li H. T., et al., Origin of angiosperms and the puzzle of the Jurassic gap. Nat. Plants 5, 461–470 (2019).
  • 87. Hindson B. J., et al., High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83, 8604–8610 (2011).
  • 88. Weng M. L., Blazier J. C., Govindu M., Jansen R. K., Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol. Biol. Evol. 31, 645–659 (2014).
  • 89. Putintseva Y. A., et al., Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome. BMC Genomics 21, 654 (2020).
  • 90. Pinheiro J. C., Bates D. M., Mixed-Effects Models in S and S-PLUS (Springer, New York, 2000).
  • 91. Pinheiro J., Bates D., R Core Team, nlme: Linear and Nonlinear Mixed Effects Models (R package version 3.1-164, The Comprehensive R Archive Network, 2022).
  • 92. Wickham H., ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, New York, 2016).
  • 93. Sayers E. W., et al., Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 50, D20–D26 (2022).
  • 94. Zwonitzer K., Genome copy number predicts extreme evolutionary rate variation in plant mitochondrial DNA. Figshare. https://figshare.com/projects/Genome_copy_number_predicts_extreme_evolutionary_rate_variation_in_plant_mitochondrial_DNA/180793. Deposited 10 October 2023.
  • 95. Zwonitzer K., thekzwon/plant_mitogenome_project. GitHub. https://github.com/thekzwon/plant_mitogenome_project. Deposited 10 October 2023.

Associated Data


Supplementary Materials

Appendix 01 (PDF)

Dataset S01 (CSV)

pnas.2317240121.sd01.csv (11.7KB, csv)

Dataset S02 (CSV)

Dataset S03 (CSV)

Dataset S04 (CSV)

Dataset S05 (CSV)

Dataset S06 (CSV)

Dataset S07 (CSV)

pnas.2317240121.sd07.csv (17.1KB, csv)

Data Availability Statement

All raw reads, gene sequences, and contigs used in this project are available on NCBI (Datasets S1 and S3–S5) (93). Code and other datasets are available in the Figshare (https://figshare.com/projects/Genome_copy_number_predicts_extreme_evolutionary_rate_variation_in_plant_mitochondrial_DNA/180793) (94) and GitHub (https://github.com/thekzwon/plant_mitogenome_project) (95) repositories.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
