Human Molecular Genetics. 2022 Jul 15;31(23):4075–4086. doi: 10.1093/hmg/ddac149

Dynamics of the most common pathogenic mtDNA variant m.3243A > G demonstrate frequency-dependency in blood and positive selection in the germline

Melissa Franco 1, Sarah J Pickett 2, Zoe Fleischmann 3, Mark Khrapko 4, Auden Cote-L’Heureux 5, Dylan Aidlen 6, David Stein 7, Natasha Markuzon 8,2, Konstantin Popadin 9,10,11, Maxim Braverman 12, Dori C Woods 13, Jonathan L Tilly 14, Doug M Turnbull 15, Konstantin Khrapko 16
PMCID: PMC9703810  PMID: 35849052

Abstract

The A-to-G point mutation at position 3243 in the human mitochondrial genome (m.3243A > G) is the most common pathogenic mtDNA variant responsible for disease in humans. It is widely accepted that m.3243A > G levels decrease in blood with age, and an age correction representing ~ 2% annual decline is often applied to account for this change in mutation level. Here we report that recent data indicate that the dynamics of m.3243A > G are more complex and depend on the mutation level in blood in a bi-phasic way. Consequently, the traditional 2% correction, which is adequate ‘on average’, creates opposite predictive biases at high and low mutation levels. Unbiased age correction is needed to circumvent these drawbacks of the standard model. We propose to eliminate both biases by using an approach where age correction depends on mutation level in a biphasic way to account for the dynamics of m.3243A > G in blood. The utility of this approach was further tested in estimating germline selection of m.3243A > G. The biphasic approach permitted us to uncover patterns consistent with the possibility of positive selection for m.3243A > G. Germline selection of m.3243A > G shows an ‘arching’ profile by which selection is positive at intermediate mutant fractions and declines at high and low mutant fractions. We conclude that use of this biphasic approach will greatly improve the accuracy of modelling changes in mtDNA mutation frequencies in the germline and in somatic cells during aging.

Introduction

Pathogenic variants in the mitochondrial genome are responsible for a wide range of diseases that affect mitochondrial function (1). The multi-copy nature of mitochondrial DNA (mtDNA) means that it is possible for more than one species of mtDNA to co-exist within the same cell, termed heteroplasmy. By far the most common heteroplasmic mtDNA pathogenic variant is an A to G transition at position 3243 (m.3243A > G) within MT-TL1, which encodes mitochondrial tRNALeu(UUR) (2). Estimates of m.3243A > G carrier frequency range from 140 to 250 people per 100 000 (3,4), although the point prevalence for adult disease is much lower than this, at 3.5 per 100 000 (5), suggesting that many carriers are either asymptomatic or have mild, undiagnosed symptoms.

Originally identified within a cohort of patients presenting with a severe syndrome characterized by mitochondrial encephalopathy, lactic acidosis and stroke-like episodes (MELAS), m.3243A > G is associated with extremely varied clinical presentations. Patients can experience a variety of phenotypes including ataxia, diabetes, deafness, ptosis, chronic progressive ophthalmoplegia, cardiomyopathy, cognitive dysfunction and severe psychiatric manifestations (6–11). Disease burden can be partly explained by an individual’s m.3243A > G mutation level, but other factors are likely to also play a role (11–13). These complexities make offering prognostic advice to patients very difficult and the mutation’s high population frequency means that m.3243A > G-related disease is one of the biggest challenges faced in mitochondrial disease clinics.

Although levels of m.3243A > G are relatively stable in post-mitotic tissues such as muscle, there is strong evidence of negative selection against the G allele in mitotic tissues, even in relatively asymptomatic patients; this loss of m.3243A > G in mitotic tissues has been studied in detail in blood (7,12,14–18). Initial simulation studies suggested that this decline could be exponential (16), although more recent studies using larger amounts of data point to a more complex process (12,19). To study the dynamics of this mutation, we took advantage of the large quantity of longitudinal data that are available for m.3243A > G levels in blood and have developed a new, empirical model that better describes the dichotomous pattern that we observe.

The m.3243A > G variant is maternally inherited and, similar to other heteroplasmic mtDNA mutations, undergoes a genetic bottleneck during development that often leads to offspring with very different levels of m.3243A > G than their mothers (20,21). Different mtDNA mutations segregate at different rates, demonstrating that the dynamics of this bottleneck are dependent on the mtDNA variant being transmitted (22). We postulated that studying this bottleneck may help us understand why the m.3243A > G variant is so common in the population; using our new model of the dynamics of blood heteroplasmy/mutation level, we explored whether there is a transmission bias for m.3243A > G between mother and child i.e. whether m.3243A > G is under germline selection.

Results

Longitudinal changes of m.3243A > G levels in blood with age reveal a dichotomous pattern (Fig. 1)

Figure 1.

Analysis of longitudinal blood m.3243A > G levels. (A) Each segment represents the change of m.3243A > G levels in blood for a single individual over a longitudinal period. Red and blue lines are cases of age-related decreased or increased/stable m.3243A > G levels, respectively. (B) Logistic regression of the data presented in A. Increasing segments are represented as ‘1s’ and decreasing as ‘0s’. The grey shaded curve represents the 95% confidence interval for the binary regression at each point of the curve. The logistic regression curve has been constructed and significance of the negativity of the curve estimated (P < 0.0001). See Materials and Methods for details.

A large amount of data on the changes of the level of the m.3243A > G mutation in blood with the age of the individual have been compiled in a recent report (12). We have reproduced these data in Figure 1A. Traditionally, the level of m.3243A > G in blood is considered to follow an exponential decay with age of ~2% per year. The data in Figure 1A, however, appear visually dichotomous, i.e. at higher m.3243A > G levels we see more cases where the mutation level decreases (red), whereas at lower levels there is little evidence of decline and there are more cases where the level increases or remains constant (blue). The dichotomy can be further demonstrated by binary logistic regression analysis (Fig. 1B). Figure 1B shows that individuals with higher m.3243A > G levels tend, in accordance with the conventional model, to decrease their levels with age. Conversely, and contrary to convention, individuals with low mutation levels tend to increase or stabilize their mutation level. This trend is significantly non-random (the slope of the regression curve is significantly negative, P < 0.0001).
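The binary regression described above can be sketched as follows. This is a minimal illustration and not the authors' code: the data points are invented for demonstration, the names `fit_logistic` and `predict` are hypothetical, and the fit uses plain gradient ascent on the log-likelihood rather than any particular statistics package. Increasing segments are coded 1 and decreasing segments 0, as in Figure 1B.

```python
import math

def fit_logistic(x, y, lr=0.1, n_iter=5000):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent."""
    mean_x = sum(x) / len(x)
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / len(x))
    z = [(v - mean_x) / sd_x for v in x]  # standardise for stable steps
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * zi)))
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * zi     # gradient w.r.t. slope
        b0 += lr * g0 / len(x)
        b1 += lr * g1 / len(x)
    # Convert coefficients back to the original x scale.
    slope = b1 / sd_x
    intercept = b0 - b1 * mean_x / sd_x
    return intercept, slope

def predict(intercept, slope, x):
    """Probability that the level increases for an initial level x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Hypothetical data: initial mutation level (%) and direction of change
# (1 = level increased, 0 = level decreased). Not taken from the paper.
levels = [2, 4, 6, 8, 12, 15, 20, 25, 35, 45, 55, 65]
increased = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0]

intercept, slope = fit_logistic(levels, increased)
```

A negative `slope` corresponds to the pattern reported in the text: the probability of an increase falls as the initial mutation level rises. Significance testing and confidence bands are omitted from this sketch.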

Mother–child m.3243A > G levels also show a dichotomous pattern

The dichotomous pattern of m.3243A > G dynamics in blood is further supported by the analysis of a different but similarly constructed dataset i.e. the mother–child dataset (20), that represents inheritance of m.3243A > G between mothers and their children. We note that this dataset shows a similar dichotomous pattern (Fig. 2A) as the longitudinal dataset, which reflects the dynamics in blood with age discussed above (Fig. 1A). Indeed, in the high child mutation level range (approximately above 5%), the majority of mother/child relationships are strongly descending in mutation frequency. In contrast, in the children’s low mutation level region (below 5%), the ascending child>mother pattern prevails. In accordance with this visual appearance, logistic regression analysis confirms a statistically significant increase of child–mother pairs with increasing mutation levels (blue segments) at lower child mutant fractions (MFs), reflecting the conclusions drawn in Figure 1B. We conclude that in both the longitudinal and the mother/child datasets, the increase of m.3243A > G mutation level prevails at the low mutation levels and decrease prevails at the high mutation levels. Thus, the rate and the direction of change of m.3243A > G level in blood appear to depend on the mutation level.

Figure 2.

Analysis of mother–child m.3243A > G levels in blood. (A) Every segment connects the m.3243A > G level of a child (left end of the segment) to the m.3243A > G level of their mother (right end). Descending segments are coded red; ascending or stable segments are coded blue. (B) Solid line: Logistic regression curve of the mother–child data presented in A, constructed as in Figure 1B. Dashed line: Logistic regression of longitudinal data from Figure 1B, overlaid for comparison. The grey shaded curve represents the 95% confidence interval for the binary regression at each point of the curve.

The standard 2% annual decline model is biased at high and low mutational levels

The observed dichotomy of m.3243A > G (Figs 1 and 2) implies that the conventional 2% annual decline model, which inherently applies to all mutation frequency levels in the same way and thus cannot account for two opposite patterns, must be making biased predictions among individuals with low and/or high levels of the m.3243A > G mutation. To test this supposition, we split the longitudinal dataset into low- and high-level subsets and evaluated the 2% annual decline model for bias in each of the two subsets. We noted, however, that although the analysis presented in Figures 1 and 2 does demonstrate the existence of dichotomy in mutation dynamics in blood, it does not permit precise determination of the threshold separating the two subsets with different mutation dynamics. To address this, we used four different thresholds (10, 15, 20 and 30%), which essentially cover the entire span of the data, to generate eight subsets of the longitudinal dataset, four each below and above the four thresholds (‘low mutant level subsets’ and ‘high mutant level subsets’, respectively). To test our hypothesis that the standard 2% annual decline model (16) was unable to correctly handle trends in the dichotomous data, we used the 2% model to predict mutant levels at the last measurement given mutant levels at the first measurement and the age difference between measurements. More specifically, we used the equation:

$$\mathrm{MF}_{2\mathrm{pred}} = \mathrm{MF}_{1} \times (1 + R)^{\Delta A} \qquad (1)$$

where R is the rate of change of MF per year, MF1 is the actual mutation level at the first measurement, MF2pred is the predicted level at the last measurement and ΔA is the difference in age between the two measurements. We then calculated the error ratio MF2pred/MF2 (where MF2 is the actual level at the last measurement) for each individual prediction and the geometric average of error ratios for each given subset (see also Fig. 3A caption). Error ratios vary around 1, with 1 indicating ‘no error’. To make the measure of the error more intuitive, we then subtracted 1 from the average error ratio to obtain the ‘average relative error’ of the prediction, which is a fair measure of the bias of the model (positive or negative) in the given subset. The result of this analysis is shown in Figure 3A. In these forward predictions, the standard (2% annual decline) model underestimates mutation frequency (negative average relative prediction error) in the low mutant domain (blue) and overestimates it (positive average relative prediction error) in the high fraction domain (red). This implies that the m.3243A > G decline rate of 2% per year is too slow for the high-frequency individuals, in whom mutation levels apparently decline on average faster than 2% per year, and too fast for the low fraction domain, where m.3243A > G levels are on average more stable or increase with age within individuals.
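The error calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes that Eq. (1) takes the annual-compounding form MF2pred = MF1 × (1 + R)^ΔA, the function names are hypothetical, and the records are invented values rather than data from the study.

```python
import math

def predict_mf2(mf1, delta_age, rate=-0.02):
    """Eq. (1), assumed compounding form: MF2pred = MF1 * (1 + R) ** dA."""
    return mf1 * (1.0 + rate) ** delta_age

def average_relative_error(records, rate=-0.02):
    """Geometric mean of the error ratios MF2pred / MF2, minus 1."""
    log_ratios = [
        math.log(predict_mf2(mf1, d_age, rate) / mf2)
        for mf1, mf2, d_age in records
    ]
    return math.exp(sum(log_ratios) / len(log_ratios)) - 1.0

# Hypothetical (MF1 %, MF2 %, years between measurements) triples.
low_subset = [(5.0, 5.5, 4.0), (8.0, 8.0, 6.0), (3.0, 3.6, 5.0)]
high_subset = [(40.0, 33.0, 5.0), (60.0, 50.0, 4.0), (35.0, 27.0, 6.0)]

bias_low = average_relative_error(low_subset)    # negative: underestimate
bias_high = average_relative_error(high_subset)  # positive: overestimate
```

With these invented records the 2% model undershoots in the low subset and overshoots in the high subset, which is the sign pattern the text reports for Figure 3A. Averaging the logarithms of the ratios is one way to compute a geometric mean without overflow.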

Figure 3.

(A) The standard 2% annual decline model is negatively biased at low mutation levels and positively biased at high mutation levels. Eight subsets of the longitudinal data were created by splitting the data into two subsets at four arbitrary mutation level thresholds (10, 15, 20 and 30% at first measurement; the subset sizes are as follows: < 10%, 18; < 15%, 31; < 20%, 39; < 30%, 61; > 10%, 78; > 15%, 65; > 20%, 57; > 30%, 35; all, 96). The 2% annual decline model was used to predict the level of m.3243A > G in each individual at the last measurement on the basis of the first measurement and age difference. The average relative errors within each subset were calculated and plotted as blue bars (low MF subsets) or red bars (high MF subsets). The colors were chosen to make an analogy with Figures 1 and 2 and highlight that low and high mutational level subsets preferentially consist of ascending and descending segments of Figures 1A and 2A, respectively. (B) Subset-specific ‘unbiased’ mutation rates that neutralize biases depend on the mutational level of the subset. ‘Unbiased’ rates were determined for each of the eight subsets described in (A), presented as the rates of annual decline R (Eq. (1)) such that the error of predictions of the resulting ‘unbiased’ model averaged within the subset was zero. Blue bars: low mutation level subsets; red bars: high mutation level subsets. See Materials and Methods and Figure 4 and Supplementary Material, Figure S1. (C) ‘Unbiased’ model reveals positive germline selection. To estimate the unbiased enrichment of m.3243A > G per generation due to germline selection, the mother–child dataset was split into eight subsets in the same way as the longitudinal dataset in panel A. The unbiased rates derived from the eight subsets of the longitudinal dataset shown in panel B were used to predict the child’s mutational level at the mother’s age, and the ratio of the adjusted child’s level to the mother’s mutation level was considered an estimate of germline selection. Bars represent enrichments (i.e. geometric medians of child/mother ratios within each of the eight subsets minus 1). Blue bars: low mutation level subsets; red bars: high mutation level subsets. Black bar (‘All’) represents enrichment (i.e. lack thereof) estimated by the standard 2% rate model within the entire mother–child dataset, which is consistent with the previous studies. Grey bars represent germline selection predicted if the 2% decline rate was used in each of the eight subsets instead of the specific unbiased decline rates. The labels above the bars represent significance as determined by a two-tailed sign test (NS = not significant, * = P-value lower than 0.1, ** = P-value lower than 0.05). See Materials and Methods for details of calculations.

Biphasic models alleviate biases of the 2% annual decline model

To alleviate the biases of the standard 2% annual decline model, we propose to use a biphasic model which uses two annual decline rates, reflecting the dichotomous pattern of mutation decline with age. To build a biphasic model, the longitudinal dataset, which is used as the ‘training’ dataset, is divided into two subsets, below and above a chosen ‘separation threshold’ of the MF. For each of the two resulting subsets, an unbiased mutation decline rate is determined. This is the rate which, when used in Eq. (1), results in zero average error calculated over the subset (see Materials and Methods for details). The rationale is that if this model makes unbiased predictions on the ‘training’ set (the longitudinal dataset), then it will most likely be unbiased in predicting child mutation levels at their mother’s age on the mother/child dataset.

This approach requires the specification of one free parameter: a threshold level of mutation fraction to separate the dataset into high and low mutation level subsets. The two ‘unbiased’ rates of annual decline of mutation level for each of the two subsets are then determined unambiguously by the condition that the model is unbiased. As far as the separating threshold is concerned, we used the same thresholds and generated the same eight subsets as those used for testing the conventional 2% annual decline model described in the previous section (Fig. 3A and B). For each of the eight subsets, we determined the ‘unbiased’ rate of decline. Reassuringly, the ‘unbiased’ model for each subset was also close to the best fit model, meaning that the sum of squared errors was close to a minimum (see Supplementary Material, Fig. S1 for a full set of graphs).
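The construction of a biphasic model from a single threshold can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes that Eq. (1) has the compounding form MF2pred = MF1 × (1 + R)^ΔA and that ‘zero average error’ means a geometric-mean error ratio of exactly 1. Under those two assumptions the unbiased rate has a closed form, log(1 + R) = Σ log(MF2/MF1) / Σ ΔA. The function names, the handling of values exactly at the threshold, and the records are all hypothetical.

```python
import math

def unbiased_rate(records):
    """Annual rate R that makes the geometric-mean error ratio equal 1.

    Setting sum(log(MF1 * (1 + R) ** dA / MF2)) = 0 and solving gives
    log(1 + R) = sum(log(MF2 / MF1)) / sum(dA).
    """
    total_log_change = sum(math.log(mf2 / mf1) for mf1, mf2, _ in records)
    total_years = sum(d_age for _, _, d_age in records)
    return math.exp(total_log_change / total_years) - 1.0

def biphasic_rates(records, threshold):
    """Split on first-measurement MF and fit one unbiased rate per side."""
    low = [r for r in records if r[0] < threshold]
    high = [r for r in records if r[0] >= threshold]  # ties go to 'high' (assumption)
    return unbiased_rate(low), unbiased_rate(high)

# Hypothetical (MF1 %, MF2 %, years between measurements) triples.
records = [
    (5.0, 5.5, 4.0), (8.0, 8.0, 6.0), (3.0, 3.6, 5.0),
    (40.0, 33.0, 5.0), (60.0, 50.0, 4.0), (35.0, 27.0, 6.0),
]

rate_low, rate_high = biphasic_rates(records, threshold=20.0)
```

For these invented records `rate_low` is positive and `rate_high` is negative. Repeating the call with thresholds of 10, 15, 20 and 30 would mirror the eight-subset analysis described in the text.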

In addition, we showed that the biphasic model is significantly less biased than the standard 2% model. To test this, we performed a simulation in which the longitudinal dataset was randomly partitioned into equal-sized training and test sets 1000 times. Biphasic models were built for each training set and used on the corresponding test set to predict the second mutation level measurement. The performance of the biphasic models on the test sets was then compared with the performance of the standard 2% model. As expected, biphasic models were unbiased on average while the 2% model was biased, and the two distributions of bias were significantly different (P < 0.00001). The biphasic model was less biased than the 2% model in 70% of iterations. This statistic must be considered in light of the fact that this simulation was highly conservative, i.e. it was overwhelmed by random variation resulting from the small sample size. In fact, both the training and test sets were much smaller than in the real experimental data, where the training set is the entire dataset from Grady et al. and the test set is the entire mother/child dataset, which is about 3-fold larger than the Grady et al. dataset.
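The repeated random-split comparison can be sketched as below. Several details are assumptions rather than facts from the text: the data are synthetic (generated by the hypothetical `make_synthetic`, with a strong built-in dichotomy), Eq. (1) is taken in its compounding form, the separation threshold is fixed at 20%, and bias on a test set is summarised as the mean absolute relative error of the low and high parts. The fraction obtained is therefore not comparable with the 70% reported for the real data.

```python
import math
import random

def unbiased_rate(records):
    """Rate R giving a geometric-mean error ratio of 1 on the records."""
    log_change = sum(math.log(mf2 / mf1) for mf1, mf2, _ in records)
    years = sum(d_age for _, _, d_age in records)
    return math.exp(log_change / years) - 1.0

def bias(records, rate):
    """Geometric-mean error ratio minus 1 for a given annual rate."""
    logs = [math.log(mf1 * (1.0 + rate) ** d / mf2) for mf1, mf2, d in records]
    return math.exp(sum(logs) / len(logs)) - 1.0

def split_by_level(records, threshold):
    low = [r for r in records if r[0] < threshold]
    high = [r for r in records if r[0] >= threshold]
    return low, high

def compare_models(records, threshold=20.0, n_iter=1000, seed=0):
    """Fraction of random 50/50 splits where the biphasic model is less biased."""
    rng = random.Random(seed)
    wins = used = 0
    for _ in range(n_iter):
        shuffled = records[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train_low, train_high = split_by_level(shuffled[:half], threshold)
        test_low, test_high = split_by_level(shuffled[half:], threshold)
        if not (train_low and train_high and test_low and test_high):
            continue  # skip degenerate splits with an empty subset
        r_low, r_high = unbiased_rate(train_low), unbiased_rate(train_high)
        biphasic = (abs(bias(test_low, r_low)) + abs(bias(test_high, r_high))) / 2
        standard = (abs(bias(test_low, -0.02)) + abs(bias(test_high, -0.02))) / 2
        used += 1
        wins += biphasic < standard
    return wins / used

def make_synthetic(n=96, seed=1):
    """Synthetic records: low levels drift up, high levels decline (assumed)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        mf1 = rng.uniform(1.0, 60.0)
        d_age = rng.uniform(2.0, 8.0)
        true_rate = 0.02 if mf1 < 20.0 else -0.04
        noise = math.exp(rng.gauss(0.0, 0.1))
        data.append((mf1, mf1 * (1.0 + true_rate) ** d_age * noise, d_age))
    return data

fraction_biphasic_better = compare_models(make_synthetic())
```

Fixing the seeds makes the sketch reproducible. Raising the noise level or shrinking `n` in `make_synthetic` is a simple way to see how random variation in small samples erodes the advantage of the biphasic model.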

As expected, the bar graph of the unbiased rates per subset (Fig. 3B) looks like a mirror image of the average relative error bars per subset (Fig. 3A). Note that the line of reflection in Figure 3B is not the x-axis, but the red dotted line at −0.02, which is because the 2% annual decline model is the reference in this case. Interestingly, unbiased rates tend to be positive in individuals with a low MF (i.e. MF increases in the blood with age in these individuals). Conversely, unbiased rates tend to be negative in individuals with a high MF. This distinction remains relatively stable as the threshold separating the high and the low MF subsets is varied from 10 to 30%. The fair consistency of the neutral decline rates in subsets that lie predominantly within the high (‘>15%’, ‘>20%’, ‘>30%’) or low mutation level domains (‘<10%’, ‘<15%’, ‘<20%’) implies that the unbiased decline rates are innate characteristics of each domain. We therefore conclude that the high and the low mutation level domains should be analysed differently and separately. This is particularly true as the unbiased decline rates in the two domains have opposite signs, which are likely to compensate for each other and obscure the details of the dynamics of m.3243A > G if they are treated jointly.

Estimating germline selection using the bi-phasic model

Germline selection of mtDNA pertains to the bias in the transmission of an mtDNA variant to the next generation, so theoretically, germline selection could be simply estimated by the ratio of mutation levels in the child versus mother. In reality, the child/mother ratio is affected by the process of random segregation of the genotypes because of the intergenerational mtDNA bottleneck. As a result, selection can be reliably estimated only by averaging the child/mother ratios among a substantial number of mother/child pairs. We use here geometric averaging, which is a natural way of averaging ratios. Furthermore, the child/mother ratio is inevitably based on m.3243A > G levels measured in a somatic tissue, most often whole blood. Because m.3243A > G levels in blood systematically decrease with age, in the case of blood samples, the child/mother ratio needs to be corrected for the age difference to yield an unbiased estimate of germline selection. We therefore expected that the bias of the 2% model that we described above might have affected previous estimates of germline selection.

To test this expectation and to obtain corrected estimates, we split the mother–child dataset into eight subsets, mirroring our methodology for the longitudinal dataset. We then used, for each of the subsets, the specific unbiased decline rates derived from the longitudinal data (Fig. 3B) to factor out the mother–child age differences within the corresponding subsets of the mother–child dataset. The children’s mutation levels were projected to their mothers’ ages using the unbiased rate of m.3243A > G change and were then divided by the mothers’ mutation levels. The details of the calculations for these estimates are presented in Materials and Methods. The estimated germline enrichment is positive at intermediate-to-low mutant frequencies (<15%, <20%, <30%) but not in the lowest subset (<10%) (Fig. 3C). Notably, at higher MFs positive selection decreases and then becomes negative. This suggests that m.3243A > G selection in the germline follows an ‘arched’ curve.
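The age-adjusted enrichment estimate and its significance test can be sketched as follows. This is an illustration under assumptions, not the authors' code: the projection uses the compounding form of Eq. (1), the average is the geometric mean named in the text (the Figure 3 caption refers to medians instead), the sign test is an exact two-tailed binomial test on whether ratios fall above or below 1, and the four mother–child pairs are invented.

```python
import math

def germline_enrichment(pairs, rate):
    """Return (enrichment, ratios) for a subset of mother-child pairs.

    pairs: (child_mf, child_age, mother_mf, mother_age) tuples, with ages
    taken at the time of sampling. The child's level is projected to the
    mother's age at the given annual rate, then divided by the mother's level.
    """
    ratios = []
    for child_mf, child_age, mother_mf, mother_age in pairs:
        projected = child_mf * (1.0 + rate) ** (mother_age - child_age)
        ratios.append(projected / mother_mf)
    geo_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return geo_mean - 1.0, ratios

def sign_test_two_tailed(ratios):
    """Exact two-tailed sign test of H0: ratios above and below 1 are equally likely."""
    above = sum(r > 1.0 for r in ratios)
    below = sum(r < 1.0 for r in ratios)
    n = above + below  # ratios exactly equal to 1 are dropped
    k = min(above, below)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical low-level subset: (child MF %, child age, mother MF %, mother age).
pairs = [
    (12.0, 10, 8.0, 38),
    (6.0, 5, 7.0, 33),
    (15.0, 14, 9.0, 40),
    (9.0, 8, 6.5, 35),
]

# The +2%/year rate stands in for a subset-specific unbiased rate (assumption).
enrichment, ratios = germline_enrichment(pairs, rate=0.02)
p_value = sign_test_two_tailed(ratios)
```

Calling `germline_enrichment` with `rate=-0.02` on the same pairs shows how strongly the choice of age correction shifts the estimate for a low-level subset.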

We then compared our results to the standard model: we calculated the expected germline selection for each of the subsets, and for the entire mother–child dataset, under the assumption of a uniform decline rate of 2% per year. For the entire dataset, this produces a germline selection estimate which is close to zero (Fig. 3C; black bar labelled ‘All’). The subset-based analysis (Fig. 3C; grey bars), unlike our bi-phasic approach which tends to predict weak positive selection estimates in most subsets, clearly predicts strong negative selection in the low mutation level domain and strong positive selection in the high mutation level domain. The explanation for this paradoxical behaviour of the standard approach is as follows: when a flat −2% rate is used for age adjustment of the low mutation level subsets, the mother–child pairs are adjusted with an incorrect rate of change (a 2%/year decrease instead of the unbiased 2%/year increase used in our calculations). This excessively negative rate results in an excessively negative adjustment of the child’s mutation level projection to the mother’s age, which results in a negative germline selection estimate. In contrast, in the high MF subsets, the use of the 2%/year decline rate is not sufficient to compensate for the even more rapid, and real, decline in these subsets. By our estimates, decline rates in high mutant subsets are about 3%/year (average of the red bars in Fig. 3B). Thus, insufficient negative adjustment leaves the child’s levels too high, which results in an overestimate of positive germline selection.
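A small worked calculation makes the size of this distortion concrete. It is a sketch under assumptions: the 30-year mother–child age gap is hypothetical, the projection uses the compounding form (1 + R)^ΔA, and the ‘true’ rates are the rounded values quoted in the text (+2%/year in the low domain, about −3%/year in the high domain).

```python
def projection_factor(rate, years):
    """Factor applied to a child's level when projected forward in age."""
    return (1.0 + rate) ** years

years = 30  # hypothetical mother-child age gap

# Low-level domain: assumed true rate +2%/year, standard model applies -2%/year.
low_true = projection_factor(0.02, years)
low_standard = projection_factor(-0.02, years)

# High-level domain: assumed true rate -3%/year, standard model applies -2%/year.
high_true = projection_factor(-0.03, years)
high_standard = projection_factor(-0.02, years)

# Below 1: the standard model under-projects the child, biasing selection negative.
low_distortion = low_standard / low_true
# Above 1: the standard model over-projects the child, biasing selection positive.
high_distortion = high_standard / high_true
```

Under these assumptions the standard correction scales the child/mother ratio by roughly 0.30 in the low domain and roughly 1.36 in the high domain relative to the subset-specific rates, which matches the direction of the grey bars described for Figure 3C.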

Although our estimates of selection within the high m.3243A > G level subsets of the mother–child data are negative (the median-based estimate), the distributions are in fact skewed. Although a majority of the data appear to show low levels of negative germline selection, there is also a long right-hand tail of what appears as positive selection (Supplementary Material, Fig. S2). The presence of these asymmetrically positioned outliers suggests that we cannot rule out the possibility of some positive selection taking place in these high-level subsets, potentially taking place in a special subset of individuals, which can be dependent, for example, on their nuclear genetic background.

Discussion

Dynamics of m.3243A > G in blood are dichotomous

Negative selection of the m.3243A > G pathogenic variant in human blood is well established; previous approaches to model this decline with age have pointed towards an exponential or sigmoidal process, but the dynamics of m.3243A > G are more complex and these models do not fully explain the data (12,16,19). Using a large quantity of recently compiled longitudinal data (12), we report that the dynamics of m.3243A > G in blood are dichotomous; levels predominantly decline in individuals with high mutation levels and are predominantly stable or even slightly increasing at low mutation levels. Therefore, the standard 2% annual decline correction, which is adequate on average, creates bias at both high and low mutation levels, and its predictions depend on the proportion of individuals with high and low mutation levels in the dataset. Interestingly, we detected a similar dichotomy in a large dataset of blood m.3243A > G levels in mother–child pairs, providing further support for a model in which the dynamics of m.3243A > G decline are dependent upon mutation level. Unbiased age correction is needed to circumvent the drawbacks of the 2% annual decline model; we used our observations to develop a new, empirical model of the decline of m.3243A > G in blood which better accounts for this dichotomy. We then used this unbiased age correction to explore the transmission of m.3243A > G from mother to child, detecting patterns consistent with positive germline selection of this variant, which may be a contributing factor in the high frequency of m.3243A > G compared with other pathogenic mtDNA point variations (5).

Dichotomous dynamics cannot be described by a single decline rate. Therefore, any model that does not account for this dichotomy will be biased. To reveal this, we tested a 2% annual decline model as a predictor of subsequent measurements of mutation level on the basis of preceding measurements and the age difference in different subsets of the longitudinal dataset. Average error is positive in the subsets of individuals with high mutation levels and negative in low mutation level individuals, confirming that the 2% annual decline model is biased. We note however, consistent with previous reports (22,23), that it appears unbiased overall since the average error of its predictions across the entire longitudinal dataset is very small and not significantly different from zero. This can be explained by the neutralizing effect of the two partial biases, which occurs when averaging across both the high and low mutation subsets of the data. Interestingly, this means that overall bias of the 2% decline model is not universal—it depends on the relative composition of the dataset i.e. how many individuals are in the high and in the low mutation groups.

Biphasic model—An unbiased approach to dichotomous dynamics of m.3243A > G

Adjustment of the blood levels of m.3243A > G in mother and child for the age difference using the 2% annual decline model has been shown to have a dramatic effect on the estimates of germline selection of the m.3243A > G mutation as compared with unadjusted estimates (16). We reasoned that the bias of the 2% model would significantly affect current estimates of selection as well. We therefore sought to modify the 2% model to eliminate this bias while keeping the elegant framework of the 2% model as unchanged as possible. We chose to use a bi-phasic model to reflect the dichotomy of the m.3243A > G dynamics. In the interest of keeping the number of parameters in the model to a minimum, we made the simplest assumption that there are two subsets, i.e. the high and the low mutation level subsets, each with their specific exponential dynamics. It should, however, be noted that individual variability in decline rate still exists within these domains. We postulate that this could arise from differences in the threshold for biochemical expression of the variant allele between individuals, thus affecting negative selection against cells with high levels of the variant allele (see further discussion in the positive selection section). This theory is compatible with the high level of variability in disease burden and phenotypic spectrum that is seen in individuals carrying m.3243A > G (7–9,11,12,24,25). In addition, our preliminary studies imply that the variability of the decline rates between individuals may in part result from complex dynamics of the cell composition of whole blood, where different cell types may carry different mutational loads, in accordance with a recent report (26).

In the initial stages of this study, we attempted to use complex models with extra parameters to capture the variability of the data and concluded that, given the relatively small size and high variance of the dataset, such attempts resulted in overfitting. We have therefore chosen to limit analysis to the most basic model possible for dichotomous data: the biphasic model.

The development of this model revealed important features of the dynamics of the mutation in blood. Contrary to the conventional view that blood m.3243A > G levels universally decrease with age, in the blood of individuals with low mutation levels, m.3243A > G levels tend to stabilize and even increase in some cases. Conversely, unbiased rates in the high mutation level individuals are more aggressively negative than the conventionally accepted decline rate of 2%. Thus, the perceived overall 2% decline rate of the standard model probably stems from de facto ‘averaging’ of the rates at high and low mutational levels, which has been possible to detect due to the recent availability of larger, longitudinal datasets.

Analogy between the longitudinal and the mother/child datasets

The similarity that we see between the longitudinal and mother–child datasets is expected and thus reassuring. Indeed, m.3243A > G levels in child and mother can be viewed as two sequential samplings of mtDNA mutation levels from the same germline. By ‘sampling’, we mean that the continuous germline gave rise to somatic tissue (blood cells in this case) in the mother and then in the child, each of which was used to infer the mutation level of the germline. These two samplings, however, systematically differ in that in the child the mutation spends fewer years in the blood cells than in the mother. Similarly, in the longitudinal dataset, every pair of sequential measurements represents two samplings of the germline.

Unbiased bi-phasic approach reveals positive selection in the germline

Germline selection of m.3243A > G can be estimated by comparing mutation levels in children and mothers. However, because of the systematic changes in the blood levels of m.3243A > G with age (16), the relative levels in blood in mothers and children must be adjusted for the age difference. Previously, this has been achieved by applying the annual decline model of ~2% per year (22). We revisit the transmission dynamics of m.3243A > G using the bi-phasic model developed in this study. Accordingly, we divided mother–child pairs into a series of high and low mutation level subsets and used the subset-specific unbiased rates as the best empirical means to correct for the mother–child age difference, as needed for estimation of intergenerational germline selection. Results of this analysis are shown in Figure 3C. We do note variability in the estimates; obtaining more precise estimates of the magnitude of this apparent germline selection will require larger and more balanced datasets.

The distribution of positive selection estimates shown in Figure 3C implies that positive selection is strongest and most consistent in subsets which include cases of moderately low MF levels (<20%, <30%). Currently, it is difficult to precisely determine the extent of selection at the lowest MFs (i.e. when MF approaches zero). This may be in part related to the higher relative error of mutation measurements at lower mutation levels. We note, however, that our analysis provides no evidence that positive selection persists at the lowest MFs. Rather, it most likely decreases, converges to zero or even becomes negative. This conclusion is corroborated by the apparent shift of the main peak of the distribution to the zero position (compare distributions ‘<20’, ‘<15’ and ‘<10’ in Supplementary Material, Fig. S2). More data are needed to substantiate this preliminary conclusion.

We also saw asymmetric outliers within the higher level subsets. Therefore, we cannot discount the possibility that some positive selection is also occurring at higher levels. As all of these outliers are dramatic, it is tempting to speculate that excessive individual variability of decline rates may have a genetic cause. Indeed, familial clustering was demonstrated in a previous study which showed a high heritability estimate for m.3243A > G level (h2 = 0.72, standard error = 0.26, P = 0.010) (20). Moreover, another recent study has identified several nuclear genetic loci associated with non-pathogenic heteroplasmy (27). Further study of the effects of nuclear genetics on the transmission of pathogenic mtDNA variants is warranted.

What could be the mechanism behind positive selection? It is tempting to speculate, for example, that the presence of m.3243A > G may offer a positive advantage, potentially because less efficient translation of mtDNA-encoded proteins may cause local compensatory mtDNA replication or stimulate cell proliferation (28), both of which would result in positive germline selection. At higher mutation levels, however, translation deficiency is expected to result in a progressive respiration defect that is likely to eventually trigger oocyte attrition or embryo demise, as has been demonstrated for other detrimental mtDNA mutations, e.g. (29,30). This effect would explain the trend towards negative selection at higher MFs seen in Figure 3C.

Bi-phasic versus 2% annual decline approach: The sources of the discrepancy

To put these findings in the context of previous studies, we performed additional analysis to explore the sources of differences between our estimates and those reported previously, which found no evidence of positive selection (22,23). First, we directly compared our approach to the 2% annual decline model by applying this correction to exactly the same mother–child dataset that we used for the bi-phasic estimates, to exclude the possibility that differences are caused by differences in the dataset. Reassuringly, in agreement with previous studies, the predicted overall germline selection when calculated over the entire mother–child dataset using the standard 2% decline rate is very low and not significantly different from zero. Most notably, unlike our biphasic approach, which tends to predict weak positive selection estimates in most subsets, the 2% annual decline approach clearly predicts strong negative selection in the low mutation level domain, and strong positive selection in the high mutation level domain, an effect that we believe is due to insufficient adjustment in the high m.3243A > G level domain and excessive adjustment in the low m.3243A > G level domain.

The above observations reveal an important drawback of the standard model. Estimates on the basis of 2% annual decline model are highly sensitive to the composition of the mother–child dataset. Indeed, from Figure 3C (grey bars) one can deduce that the higher the proportion of low mutation level mother–child pairs included in the dataset, the lower the expected overall estimate of the germline selection using 2% adjustment. In fact, the near zero estimate of selection intensity in the current mother–child dataset is merely the result of the particular proportion of low and high mutation level mother/child pairs in this dataset. The biphasic model is poised to deliver much higher stability of the estimate with respect to the changing proportion of high and low mutation level patients.

Comparison to other studies: Positive and negative germline selection

Convincing positive germline selection of the m.3243A > G has not been reported previously but has been observed for other detrimental mtDNA variants. An apparently milder variant, m.8993 T > G, may be under positive germline selection throughout the MF range (23). Interestingly, data from Freyer et al. (30) show that a specific highly detrimental mutation, the m.3875delC in the mouse mt-Tm gene, encoding tRNAMet, at an intermediate MF (45–60%) systematically increases its presence in offspring compared with the mother, implying positive selection of a detrimental mutation in the mouse germline which is reversed at higher (>60%) levels (see their Figs 2A and 3). Surprisingly, Freyer et al. did not discuss their own data as supporting positive germline selection. Of note, no data are available for mutational levels below 45% for this mutation, so it is possible that selection decreases at lower MF as we observe for m.3243A > G (Fig. 3C). In conclusion, it appears that detrimental mutations, at least in some cases, are under positive selection in the germline. It would be interesting to determine how general such a phenomenon is, using a broader set of detrimental mtDNA mutations.

A few months after the original version of this paper was first published on bioRxiv (31), another group reported positive germline selection of m.3243A > G (32). Although they reported positive selection, their conclusion was based on an incorrect analysis. Specifically, they showed that positive selection (presented as average heteroplasmy shift, HS) was highest at low mother mutation frequency values, gradually decreased with mutation frequency and eventually became negative at high frequencies. Our preliminary analysis indicates that the discrepancy between our results (arching selection profile) and those of Zhang et al. (monotonic decrease of selection with mutation level) can be accounted for by the ‘regression to the mean’ bias (33), which adds a strong spurious negative correlation between the shift of MF from mother to child and the mother’s MF.
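The ‘regression to the mean’ effect mentioned above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the analysis performed in this study: the function names, the uniform range of germline levels and the Gaussian noise are all hypothetical choices. Mother and child are modelled as two noisy samplings of the same germline level with no selection at all, yet the mother-to-child shift still correlates negatively with the mother's measured level.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_shift_correlation(n_pairs=5000, noise_sd=0.1, seed=0):
    """Simulate mother-child pairs with NO selection (hypothetical toy model).

    Both measurements are the same germline level plus independent noise.
    Values are not clipped to [0, 1]; this is a simplification.
    """
    rng = random.Random(seed)
    mothers, shifts = [], []
    for _ in range(n_pairs):
        germline = rng.uniform(0.2, 0.8)               # assumed true level
        mother = germline + rng.gauss(0.0, noise_sd)   # noisy sampling 1
        child = germline + rng.gauss(0.0, noise_sd)    # noisy sampling 2
        mothers.append(mother)
        shifts.append(child - mother)
    return pearson(mothers, shifts)

r = simulate_shift_correlation()
print(f"correlation between mother MF and shift: {r:.2f}")
```

In this toy model the correlation is negative even though the expected shift is zero at every germline level, because the noise in the mother's measurement enters the shift with the opposite sign.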

Positive selection of detrimental mutations appears to be a counterproductive process, so it is interesting to ask why such a phenomenon would exist at all. Although positive selection of mtDNA mutations is generally accepted, there is no consensus on its mechanism [reviewed in Khrapko and Turnbull (34)]. Most likely, positive selection is related to an attempt of the cell to compensate for the adenosine triphosphate (ATP) deficiency imposed by the mutation. Our observation that selection reaches a maximum at an intermediate MF implies that a minimum fraction, called the ‘phenotypic threshold’, is needed to promote a phenotypic response.

Interestingly, our results (Fig. 3C) are compatible with the possibility that at low MFs m.3243A > G may be under negative selection, although our results do not achieve statistical significance. If such selection indeed existed, it would be likely to act at the level of the individual mitochondrion, because at low MF any process depending on cooperation between mitochondria (like the phenotypic threshold mentioned above) is expected to decrease. Indeed, mitochondria-autonomous local mechanisms of negative selection have recently been reported in Drosophila germ cells (35). Of note, a combination of negative selection at low and positive selection at intermediate mutation levels could efficiently prevent the accumulation of detrimental mutations in the population. Indeed, at low initial levels, nascent mutations would be preferentially removed by negative selection. Those mutations that managed to make it through this initial checkpoint and to reach intermediate levels (e.g. by means of random drift in the bottleneck) would be pushed by positive selection to even higher levels, presumably making the offspring unfit and therefore removed by developmental arrest or by Darwinian selection.

Potential applications of the biphasic approach

From a practical point of view, this study will hopefully lead to models that better account for the dynamics of pathogenic mtDNA variation by including the effect of mutation level. The simple biphasic model proposed here may serve as a practical alternative to the 2% annual decline model in cases where, as in studies of germline selection, unbiased prediction and versatility of the model (applicability to datasets that are unbalanced with respect to cases with low and/or high mutational levels) are essential. This can be realized by applying one of two rates: ~ 3% annual decline for cases with mutation levels above 20% and ~ 1% annual increase for cases with levels below 20%. Of note, in this study we used the mutation level at first measurement, or the mutation level of the child (which is analogous to the first measurement). We have chosen this convention because the dynamics of the mutation converge with time (phase lines become denser with time), so in general, the prediction of the succeeding mutation level is more stable than the prediction of the preceding mutation level. As of now, this bi-phasic model is not a finished working instrument that can be used to estimate at-birth m.3243A > G levels within a clinical prognostic setting. Rather, this research reveals that the dynamics of m.3243A > G in blood are more complex than previously considered and highlights the need for more detailed, perhaps mechanistic, models, especially in the lower range of mutation levels. With more data available for models to build upon and test, our approach may be further optimized or replaced by a more detailed and precise model. Our current model, however, is appropriate for drawing conclusions about the aggregate behaviour of mutations, such as positive selection, with the data currently at hand.
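The two-rate rule described above can be sketched in a few lines. This is a minimal illustration rather than a validated tool: the function names are hypothetical, the compounding form (1 + rate)^years is assumed from the worked example in Materials and Methods, the treatment of a level of exactly 20% is an arbitrary choice, and the rate is selected once from the earlier measurement and then held constant.

```python
def biphasic_rate(mutation_level, threshold=0.20, rate_high=-0.03, rate_low=0.01):
    """Pick the annual fractional rate from the earlier mutation level.

    Levels are proportions (0 to 1). The approximate rates (-3% above the
    threshold, +1% below) come from the text; assigning a level of exactly
    20% to the low group is an assumption of this sketch.
    """
    return rate_high if mutation_level > threshold else rate_low

def project_level(mutation_level, years, rate=None):
    """Project a mutation level forward by `years` with annual compounding."""
    if rate is None:
        rate = biphasic_rate(mutation_level)
    return mutation_level * (1.0 + rate) ** years

# Compare the biphasic rule with a flat 2% annual decline over 10 years.
for level in (0.10, 0.40):
    biphasic = project_level(level, 10)
    flat = project_level(level, 10, rate=-0.02)
    print(f"start {level:.2f}: biphasic {biphasic:.3f}, flat 2% {flat:.3f}")
```

Because the rate is chosen from the starting level only, a projection that crosses the 20% boundary does not switch rates; a more detailed model would need to handle that transition.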

Materials and Methods

Data sources

Longitudinal data

Longitudinal measurements of m.3243A > G levels in blood were obtained from Grady et al. (12). This dataset is comprised of 96 individuals who were recruited into the Mitochondrial Disease Patient Cohort UK and had two or more measurements of m.3243A > G blood level taken between 2000 and 2017. The median time between first and last measurement was 2.5 years (interquartile range (IQR) = 4.35, range = 0.01–15.20), the median age at first measurement was 36.25 years (IQR = 20.58, range = 15.60–72.50) and at last measurement was 40.25 years (IQR = 21.00, range = 19.80–78.80).

Mother–child data

Measurements of m.3243A > G levels in mother–child pairs were obtained from Pickett et al. (20). This dataset contains 183 mother–child pairs (from 113 different mothers), comprised of 67 pairs from the Mitochondrial Disease Patient Cohort UK (8) and 116 pairs obtained from a literature search and previously published by Wilson and colleagues (22). To minimize ascertainment bias, pairs in which the child was the proband had previously been removed, as had one pair from the literature where the m.3243A > G variant was thought to have arisen de novo in the child (36). We identified an additional 30 pairs where the mother was the proband which were excluded from analysis to reduce bias, leaving a total of 153 pairs (from 97 mothers) that were taken forward into the analysis. The mean age at m.3243A > G level assessment was 53.9 years (SD = 12.4, range = 23.0, 85.0) for the mothers and 27.8 years (SD = 12.2, range = 0.4, 58.0) for the children. The data is accessible online at the following DOI: 10.25405/data.ncl.20286219.

Construction of binary logistic regressions

The binary logistic regressions were performed on both the longitudinal and mother–child data sets. For each data set, the final mutation level was subtracted from the initial mutation level. Then, each pair of data points was assigned a binary value for the direction of change in mutation level: a ‘1’ was assigned for increasing mutation levels and a ‘0’ for decreasing mutation levels. Data point pairs that showed no change were excluded from the analysis. A simple binary logistic regression was performed on each data set using the initial mutation level of each data point pair as the independent variable and the corresponding binary indicator as the dependent variable. Each regression was performed with a likelihood ratio test, a goodness-of-fit test and 95% confidence intervals. Significance was recorded as P-values. For each data set, the predictive curve of the binary regression was plotted with the respective data points. The longitudinal data set was graphed individually (Fig. 1B), and the longitudinal and mother–child data sets were graphed together for comparative analyses (Fig. 2B). Statistical analyses and graphing were performed in GraphPad Prism version 9.
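The coding and fitting steps described above can be sketched as follows. The analysis in this study was run in GraphPad Prism; the code below is only an illustrative re-implementation with a hand-written Newton-Raphson fit, and it omits the likelihood ratio test, goodness-of-fit test and confidence intervals. The function names are hypothetical and the measurement pairs are made-up toy values (in per cent), not data from either cohort.

```python
import math

def code_direction(pairs):
    """Turn (initial, final) pairs into predictors and 0/1 outcomes.

    1 = level increased, 0 = level decreased; unchanged pairs are dropped.
    """
    x, y = [], []
    for initial, final in pairs:
        if final == initial:
            continue
        x.append(initial)
        y.append(1 if final > initial else 0)
    return x, y

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy (initial %, final %) pairs, invented purely for illustration.
toy_pairs = [(5, 7), (8, 6), (10, 12), (12, 15), (15, 14), (18, 20), (22, 25),
             (25, 21), (30, 26), (35, 37), (40, 33), (50, 42), (60, 51),
             (70, 70), (80, 66)]
x, y = code_direction(toy_pairs)
b0, b1 = fit_logistic(x, y)
crossover = -b0 / b1  # initial level at which P(increase) = 0.5
print(f"slope {b1:.3f}, crossover level {crossover:.1f}%")
```

A negative slope means that the probability of an increase falls as the initial mutation level rises; the crossover is the level at which an increase and a decrease are equally likely under the fitted curve.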

Determination of the unbiased mutation decline rates

The approach we used to calculate the unbiased mutation decline rates in the various subsets of the longitudinal dataset (12) is illustrated in Figure 4. Of note, we limited analysis to the first and the last measurement for each person, so that there were only two measurements for each person in the dataset, MF1 and MF2. For every individual (of 96 individuals in the dataset) and thus every pair of data points (MF1, MF2), MF2 was predicted on the basis of MF1 using an exponential model, MF2pred(R) = MF1 × (1 + R)^ΔA, with variable parameter R, i.e. the fractional change of mutational load per year (R is positive for an increase and negative for a decline), where ΔA is the age difference between the times of the mutation level measurements. This is function (12) as described in the results section. R was varied between 0.05 decrease and 0.05 increase per year (i.e. from −0.05 to 0.05), as shown in Figure 4, in steps of 0.0000001 (1 000 000 steps overall). Thus, we obtained 1 000 000 values of MF2pred(R), one for each R, for each individual (i.e. 1 000 000 × 96 in total). For each MF2pred(R), an error ratio (MF2pred(R)/MF2) and an absolute error (sqrt[(MF2pred/MF2)]^2) were calculated, and then both the error and the absolute error were averaged among individuals within each of eight subsets. We obtained 1 000 000 × 2 × 8 averaged data points and plotted them for each value of R in the eight graphs shown in Supplementary Material, Figure S1 (one graph per subset). The minimum of the absolute error (corresponding to the best fit rate) and the zero of the average non-absolute error (‘unbiased rate’) were determined graphically by identifying the minimum of the curve and the intercept with the x-axis, respectively. Unbiased rates were then used for plotting the graphs in Figure 3B. The best fit rates were used to make sure that the unbiased model was close to the best fit model.
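The rate scan described above can be sketched as a simple grid search. This is an illustration under several assumptions, not the code used in this study: the prediction formula MF1 × (1 + R)^ΔA is inferred from the worked example given for Eq. (2); the error is taken as the logarithm of the ratio MF2pred/MF2, so that an unbiased prediction averages to zero; the grid is far coarser than the 1 000 000 steps used in the paper; the rates are picked numerically rather than graphically; and the records are synthetic values constructed with a known rate of −3% per year.

```python
import math

def predicted_mf2(mf1, rate, years):
    """Project the first measurement forward at a constant annual rate."""
    return mf1 * (1.0 + rate) ** years

def mean_errors(records, rate):
    """Return (mean signed log error, mean absolute log error) for one rate.

    Using log(MF2pred / MF2) as the error is an assumption of this sketch.
    """
    logs = [math.log(predicted_mf2(mf1, rate, years) / mf2)
            for mf1, mf2, years in records]
    signed = sum(logs) / len(logs)
    absolute = sum(abs(v) for v in logs) / len(logs)
    return signed, absolute

def scan_rates(records, lo=-0.05, hi=0.05, steps=1000):
    """Scan rates on a grid; return (unbiased rate, best fit rate)."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    scored = [(rate,) + mean_errors(records, rate) for rate in grid]
    unbiased = min(scored, key=lambda row: abs(row[1]))[0]  # signed error ~ 0
    best_fit = min(scored, key=lambda row: row[2])[0]       # smallest abs error
    return unbiased, best_fit

# Synthetic (MF1, MF2, years between measurements) records built with a
# known rate of -0.03 per year, so the scan should recover about -0.03.
toy_records = [
    (0.40, 0.40 * 0.97 ** 2.5, 2.5),
    (0.55, 0.55 * 0.97 ** 4.0, 4.0),
    (0.30, 0.30 * 0.97 ** 10.0, 10.0),
]
unbiased_rate, best_fit_rate = scan_rates(toy_records)
print(unbiased_rate, best_fit_rate)
```

With real data the two rates generally differ, which is why the comparison between the unbiased rate and the best fit rate is informative.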

Figure 4. (A-B) Examples of the calculation of an unbiased rate of decline for two subsets: < 15% (top) and ≥ 15% (bottom). See Materials and Methods for the procedure and Supplementary Material, Figure S1 for a complete set of curves. Red vertical lines indicate the unbiased rate.

Calculation of the enrichments/selection of m.3243A > G per generation in the germline

The enrichment/selection per generation for a specific subset of the data S is determined as the geometric average of the ratios of age-corrected child mutation levels to those of their mothers. In practice, we calculated the average of the logarithms of the ratios and then converted the average back from the logarithm to the real ratio (geometric mean). That is, we first determine the germline selection for each child–mother pair within each subset using Eq. (2):

\[ \text{germline selection} = \frac{MF_{child} \times (1 + R_s)^{\Delta A}}{MF_{mother}} \qquad (2) \]

where Rs is the estimated unbiased rate for the subset S to which the mother–child pair has been assigned (as determined by the value of MFchild) and both MFchild and MFmother are expressed as proportions between 0 and 1. ΔA is the age difference between mother and child.

For example, for the 20% threshold, a child with an m.3243A > G mutation level of 40% would be assigned to the ‘>20%’ subset with an Rs of −0.029 (see Fig. 3A). If their mother was 30 years older and had an m.3243A > G level of 10%, the germline selection estimate in that individual mother/child pair would be: (0.4 × 0.971^30)/0.1 = 1.65.

We then calculated median log ratios across all mother–child pairs in the given data subset and used a two-tailed sign test to determine the P-value associated with this median of logarithms being different from zero (rejection of the null hypothesis of no selection). Finally, the median log ratios for the various data subsets were converted into ‘median mutation level ratios’ by calculating the exponent of the median of log ratios, and 1 was subtracted from it to produce the ‘median enrichment’ (i.e. fractional increase, positive for enrichment, negative for depletion, analogous to the rate of decline used to describe the dynamics of m.3243A > G in blood).
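The per-pair estimate, the sign test and the conversion to median enrichment can be sketched as follows. This is a minimal illustration with hypothetical function names; the per-pair formula is the one implied by the worked example above (child level projected over the age gap, divided by the mother's level), and the sign test is written out as an exact binomial test that ignores zero log ratios, which is an assumption about a detail the text does not specify.

```python
import math
from statistics import median

def pair_log_ratio(mf_child, mf_mother, age_gap_years, rate):
    """Log of the age-corrected child level divided by the mother's level."""
    corrected_child = mf_child * (1.0 + rate) ** age_gap_years
    return math.log(corrected_child / mf_mother)

def sign_test_two_tailed(values):
    """Exact two-tailed sign test of the null hypothesis 'median = 0'."""
    positives = sum(v > 0 for v in values)
    negatives = sum(v < 0 for v in values)
    n = positives + negatives            # zeros are ignored (assumption)
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def median_enrichment(log_ratios):
    """exp(median log ratio) - 1: positive = enrichment, negative = depletion."""
    return math.exp(median(log_ratios)) - 1.0

# The worked example from the text: child 40%, mother 10%, 30-year age gap,
# subset rate -0.029.
example_ratio = math.exp(pair_log_ratio(0.40, 0.10, 30, -0.029))
print(round(example_ratio, 2))  # 1.65
```

Working with the median of log ratios and a sign test makes the estimate insensitive to the size of individual outliers, since only the sign of each pair's log ratio enters the P-value.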

Kernel density estimation was performed using the online portal at http://www.wessa.net/rwasp_density.wasp with default parameters.

Supplementary Material

Supplement_figure_lengends_ddac149
Supp_Fig1A_ddac149
Supp_Fig1B_ddac149
Supp_Fig1C_ddac149
Supp_Fig1D_ddac149
Supp_Fig1E_ddac149
Supp_Fig1F_ddac149
Supp_Fig1G_ddac149
Supp_Fig1H_ddac149
Supp_Fig2A_ddac149
Supp_Fig2B_ddac149
Supp_Fig2C_ddac149
Supp_Fig2D_ddac149

Acknowledgements

For the purpose of Open Access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. We would like to acknowledge those involved in the collection of data for the Grady et al. (12) and Pickett et al. (20) manuscripts.

Conflict of Interest statement. The authors declare no conflict of interest.

Contributor Information

Melissa Franco, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Sarah J Pickett, Wellcome Centre for Mitochondrial Research and Institute for Translational and Clinical Research, Newcastle University and Newcastle Medical School, Newcastle-upon-Tyne NE2 4HH, UK.

Zoe Fleischmann, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Mark Khrapko, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Auden Cote-L’Heureux, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Dylan Aidlen, Department of Biology, Northeastern University, Boston, MA 02115, USA.

David Stein, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Natasha Markuzon, Draper Laboratories, Cambridge, MA 02139, USA.

Konstantin Popadin, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Center for Mitochondrial Functional Genomics, Institute of Living Systems, Immanuel Kant Baltic Federal University, Kaliningrad 236040, Russia.

Maxim Braverman, Department of Mathematics, Northeastern University, Boston, MA 02115, USA.

Dori C Woods, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Jonathan L Tilly, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Doug M Turnbull, Wellcome Centre for Mitochondrial Research and Institute for Translational and Clinical Research, Newcastle University and Newcastle Medical School, Newcastle-upon-Tyne NE2 4HH, UK.

Konstantin Khrapko, Department of Biology, Northeastern University, Boston, MA 02115, USA.

Funding

U.S. National Institutes of Health (R01-HD091439 to J.L.T., D.C.W. and K.K.). K.P. was supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-02-2022-872). This research was funded in part by the Wellcome Trust [204709/Z/16/Z to S.J.P. and 203105/Z/16/Z to the Wellcome Centre for Mitochondrial Research].

References

  • 1. Gorman, G.S., Chinnery, P.F., DiMauro, S., Hirano, M., Koga, Y., McFarland, R., Suomalainen, A., Thorburn, D.R., Zeviani, M. and Turnbull, D.M. (2016) Mitochondrial diseases. Nat. Rev. Dis. Primers., 2, 1–22.
  • 2. Goto, Y., Nonaka, I. and Horai, S. (1990) A mutation in the tRNA(Leu)(UUR) gene associated with the MELAS subgroup of mitochondrial encephalomyopathies. Nature, 348, 651–653.
  • 3. Manwaring, N., Jones, M.M., Wang, J.J., Rochtchina, E., Howard, C., Mitchell, P. and Sue, C.M. (2007) Population prevalence of the MELAS A3243G mutation. Mitochondrion, 7, 230–233.
  • 4. Elliott, H.R., Samuels, D.C., Eden, J.A., Relton, C.L. and Chinnery, P.F. (2008) Pathogenic mitochondrial DNA mutations are common in the general population. Am. J. Hum. Genet., 83, 254–260.
  • 5. Gorman, G.S., Schaefer, A.M., Ng, Y., Gomez, N., Blakely, E.L., Alston, C.L., Feeney, C., Horvath, R., Yu-Wai-Man, P., Chinnery, P.F. et al. (2015) Prevalence of nuclear and mitochondrial DNA mutations related to adult mitochondrial disease. Ann. Neurol., 77, 753–759.
  • 6. Koga, Y., Akita, Y., Takane, N., Sato, Y. and Kato, H. (2000) Heterogeneous presentation in A3243G mutation in the mitochondrial tRNA(Leu(UUR)) gene. Arch. Dis. Child., 82, 407–411.
  • 7. de Laat, P., Koene, S., van den Heuvel, L.P.W.J., Rodenburg, R.J.T., Janssen, M.C.H. and Smeitink, J.A.M. (2012) Clinical features and heteroplasmy in blood, urine and saliva in 34 Dutch families carrying the m.3243A > G mutation. J. Inherit. Metab. Dis., 35, 1059–1069.
  • 8. Nesbitt, V., Pitceathly, R.D.S., Turnbull, D.M., Taylor, R.W., Sweeney, M.G., Mudanohwo, E.E., Rahman, S., Hanna, M.G. and McFarland, R. (2013) The UK MRC mitochondrial disease patient cohort study: clinical phenotypes associated with the m.3243A>G mutation--implications for diagnosis and management. J. Neurol. Neurosurg. Psychiatry, 84, 936–938.
  • 9. Mancuso, M., Orsucci, D., Angelini, C., Bertini, E., Carelli, V., Comi, G.P., Donati, A., Minetti, C., Moggio, M., Mongini, T. et al. (2014) The m.3243A>G mitochondrial DNA mutation and related phenotypes. A matter of gender? J. Neurol., 261, 504–510.
  • 10. Fayssoil, A., Laforêt, P., Bougouin, W., Jardel, C., Lombès, A., Bécane, H.M., Berber, N., Stojkovic, T., Béhin, A., Eymard, B., Duboc, D. and Wahbi, K. (2017) Prediction of long-term prognosis by heteroplasmy levels of the m.3243A>G mutation in patients with the mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes syndrome. Eur. J. Neurol., 24, 255–261.
  • 11. Pickett, S.J., Grady, J.P., Ng, Y.S., Gorman, G.S., Schaefer, A.M., Wilson, I.J., Cordell, H.J., Turnbull, D.M., Taylor, R.W. and McFarland, R. (2018) Phenotypic heterogeneity in m.3243A>G mitochondrial disease: the role of nuclear factors. Ann. Clin. Transl. Neurol., 5, 333–345.
  • 12. Grady, J.P., Pickett, S.J., Ng, Y.S., Alston, C.L., Blakely, E.L., Hardy, S.A., Feeney, C.L., Bright, A.A., Schaefer, A.M., Gorman, G.S. et al. (2018) mtDNA heteroplasmy level and copy number indicate disease burden in m.3243A>G mitochondrial disease. EMBO Mol. Med., 10, e8262.
  • 13. Boggan, R.M., Lim, A., Taylor, R.W., McFarland, R. and Pickett, S.J. (2019) Resolving complexity in mitochondrial disease: towards precision medicine. Mol. Genet. Metab., 128, 19–29.
  • 14. Sue, C.M., Quigley, A., Katsabanis, S., Kapsa, R., Crimmins, D.S., Byrne, E. and Morris, J.G. (1998) Detection of MELAS A3243G point mutation in muscle, blood and hair follicles. J. Neurol. Sci., 161, 36–39.
  • 15. Pyle, A., Taylor, R.W., Durham, S.E., Deschauer, M., Schaefer, A.M., Samuels, D.C. and Chinnery, P.F. (2007) Depletion of mitochondrial DNA in leucocytes harbouring the 3243A->G mtDNA mutation. J. Med. Genet., 44, 69–74.
  • 16. Rajasimha, H.K., Chinnery, P.F. and Samuels, D.C. (2008) Selection against pathogenic mtDNA mutations in a stem cell population leads to the loss of the 3243A→G mutation in blood. Am. J. Hum. Genet., 82, 333–343.
  • 17. Mehrazin, M., Shanske, S., Kaufmann, P., Wei, Y., Coku, J., Engelstad, K., Naini, A., De Vivo, D.C. and DiMauro, S. (2009) Longitudinal changes of mtDNA A3243G mutation load and level of functioning in MELAS. Am. J. Med. Genet. A, 149A, 584–587.
  • 18. Langdahl, J.H., Larsen, M., Frost, M., Andersen, P.H., Yderstraede, K.B., Vissing, J., Dunø, M., Thomassen, M. and Frederiksen, A.L. (2018) Lecocytes mutation load declines with age in carriers of the m.3243A>G mutation: a 10-year prospective cohort. Clin. Genet., 93, 925–928.
  • 19. Veitia, R.A. (2018) How the most common mitochondrial DNA mutation (m.3243A>G) vanishes from leukocytes: a mathematical model. Hum. Mol. Genet., 27, 1565–1571.
  • 20. Pickett, S.J., Blain, A., Ng, Y.S., Wilson, I.J., Taylor, R.W., McFarland, R., Turnbull, D.M. and Gorman, G.S. (2019) Mitochondrial donation-which women could benefit? N. Engl. J. Med., 380, 1971–1972.
  • 21. Chinnery, P.F., Thorburn, D.R., Samuels, D.C., White, S.L., Dahl, H.M., Turnbull, D.M., Lightowlers, R.N. and Howell, N. (2000) The inheritance of mitochondrial DNA heteroplasmy: random drift, selection or both? Trends Genet., 16, 500–505.
  • 22. Wilson, I.J., Carling, P.J., Alston, C.L., Floros, V.I., Pyle, A., Hudson, G., Sallevelt, S.C.E.H., Lamperti, C., Carelli, V., Bindoff, L.A. et al. (2016) Mitochondrial DNA sequence characteristics modulate the size of the genetic bottleneck. Hum. Mol. Genet., 25, 1031–1041.
  • 23. Otten, A.B.C., Sallevelt, S.C.E.H., Carling, P.J., Dreesen, J.C.F.M., Drüsedau, M., Spierts, S., Paulussen, A.D.C., de Die-Smulders, C.E.M., Herbert, M., Chinnery, P.F. et al. (2018) Mutation-specific effects in germline transmission of pathogenic mtDNA variants. Hum. Reprod., 33, 1331–1341.
  • 24. Kaufmann, P., Engelstad, K., Wei, Y., Kulikova, R., Oskoui, M., Sproule, D.M., Battista, V., Koenigsberger, D.Y., Pascual, J.M., Shanske, S. et al. (2011) Natural history of MELAS associated with mitochondrial DNA m.3243A>G genotype. Neurology, 77, 1965–1971.
  • 25. Chin, J., Marotta, R., Chiotis, M., Allan, E.H. and Collins, S.J. (2014) Detection rates and phenotypic spectrum of m.3243A>G in the MT-TL1 gene: a molecular diagnostic laboratory perspective. Mitochondrion, 17, 34–41.
  • 26. Walker, M.A., Lareau, C.A., Ludwig, L.S., Karaa, A., Sankaran, V.G., Regev, A. and Mootha, V.K. (2020) Purifying selection against pathogenic mitochondrial DNA in human T cells. N. Engl. J. Med., 383, 1556–1563.
  • 27. Nandakumar, P., Tian, C., O’Connell, J., 23andMe Research Team, Hinds, D., Paterson, A.D. and Sondheimer, N. (2021) Nuclear genome-wide associations with mitochondrial heteroplasmy. Sci. Adv., 7, eabe7520.
  • 28. Smith, A.L.M., Whitehall, J.C., Bradshaw, C., Gay, D., Robertson, F., Blain, A.P., Hudson, G., Pyle, A., Houghton, D., Hunt, M. et al. (2020) Age-associated mitochondrial DNA mutations cause metabolic remodeling that contributes to accelerated intestinal tumorigenesis. Nat. Cancer., 1, 976–989.
  • 29. Fan, W., Waymire, K.G., Narula, N., Li, P., Rocher, C., Coskun, P.E., Vannan, M.A., Narula, J., Macgregor, G.R. and Wallace, D.C. (2008) A mouse model of mitochondrial disease reveals germline selection against severe mtDNA mutations. Science, 319, 958–962.
  • 30. Freyer, C., Cree, L.M., Mourier, A., Stewart, J.B., Koolmeister, C., Milenkovic, D., Wai, T., Floros, V.I., Hagström, E., Chatzidaki, E.E. et al. (2012) Variation in germline mtDNA heteroplasmy is determined prenatally but modified during subsequent transmission. Nat. Genet., 44, 1282–1285.
  • 31. Fleischmann, Z., Pickett, S.J., Franco, M., Aidlen, D., Khrapko, M., Stein, D., Markuzon, N., Popadin, K., Braverman, M., Woods, D.C. et al. (2021) Bi-phasic dynamics of the mitochondrial DNA mutation m.3243A>G in blood: an unbiased, mutation level-dependent model implies positive selection in the germline. bioRxiv 2021.02.26.433045.
  • 32. Zhang, H., Esposito, M., Pezet, M.G., Aryaman, J., Wei, W., Klimm, F., Calabrese, C., Burr, S.P., Macabelli, C.H., Viscomi, C. et al. (2021) Mitochondrial DNA heteroplasmy is modulated during oocyte development propagating mutation transmission. Sci. Adv., 7, eabi5657.
  • 33. Galton, F. (1889) Natural inheritance. Macmillan, London, pp. 1–282.
  • 34. Khrapko, K. and Turnbull, D. (2014) Mitochondrial DNA mutations in aging. Prog. Mol. Biol. Transl. Sci., 127, 29–62.
  • 35. Zhang, Y., Wang, Z.H., Liu, Y., Chen, Y., Sun, N., Gucek, M., Zhang, F. and Xu, H. (2019) PINK1 inhibits local protein synthesis to limit transmission of deleterious mitochondrial DNA mutations. Mol. Cell, 73, 1127–1137.e5. doi: 10.1016/j.molcel.2019.01.013.
  • 36. Ko, C.H., Lam, C.W., Tse, P.W., Kong, C.K., Chan, A.K. and Wong, L.J. (2001) De novo mutation in the mitochondrial tRNALeu(UUR) gene (A3243G) with rapid segregation resulting in MELAS in the offspring. J. Paediatr. Child Health, 37, 87–90.


Articles from Human Molecular Genetics are provided here courtesy of Oxford University Press
