Abstract
Mitochondrial genome heteroplasmy—the presence of more than one genomic variant in individuals—is considered only occasional in animals, and most often involves molecules differing only by a few recent mutations. Thanks to new sequencing technologies, a large number of DNA fragments from a single individual can now be sequenced and visualized separately, allowing new insights into intra-individual mitochondrial genome variation. Here, we report evidence from both (i) massive parallel sequencing (MPS) of genomic extracts and (ii) Sanger sequencing of PCR products, for the widespread co-occurrence of two distantly related (greater than 1% nucleotide divergence, excluding the control region) mitochondrial genomes in individuals of a natural population of the leaf beetle Gonioctena intermedia. Sanger sequencing of PCR products using universal primers previously failed to identify heteroplasmy in this population. Its occurrence was detected with MPS data and may have important implications for evolutionary studies. It suggests the need to re-evaluate, using MPS techniques, the proportion of animal species displaying heteroplasmy.
Keywords: heteroplasmy, massive parallel sequencing, mtDNA, paternal leakage
1. Introduction
Animal cells typically carry hundreds to thousands of mitochondria [1], each with its own genome, which in theory could result in the co-occurrence of multiple genomic variants per cell/individual. However, it is usually believed that most individual animals are homoplasmic, i.e. harbouring mostly a single mitochondrial (mt) genome variant, to the extent that in population genetic studies, individuals are generally considered haploid for their mt genome [2,3]. Individual homoplasmy is essentially achieved through a strictly maternal inheritance of the mt genome, and possibly through strong genetic bottlenecks occurring inside germline cells [4]. Nonetheless, the list of reported exceptions to the rule of strictly maternal inheritance is growing in animals, with paternal leakage (i.e. a fraction of the mtDNA molecules are inherited from the father) detected in many animal groups, including insects [4–7]. This suggests that the actual occurrence of heteroplasmy may be higher than previously suspected, which can have major implications for evolutionary studies. For example, the detection of widespread heteroplasmy in a species would increase the assumed mtDNA effective size of its populations (relative to that of nuclear DNA), a parameter that is often used to estimate divergence times between two populations and species [8].
The emergence of next-generation sequencing techniques enables the massive sequencing of a large number of DNA fragments from a genomic extract, and thus offers a tool to investigate intra-individual genomic variation. Here, following massive parallel sequencing (MPS), we report mitochondrial heteroplasmy in the leaf beetle Gonioctena intermedia, which was not detected previously by traditional Sanger sequencing using universal primers. In this instance, two divergent mitochondrial genomes were recovered from each individual, suggesting that mutation was not the mechanism responsible for heteroplasmy. We first identified a large number of polymorphic sites in the mt genome of two individuals by analysing MPS reads, and inferred the presence of at least two mt genome variants per individual. We then confirmed the heteroplasmic nature of these individuals through PCR and Sanger sequencing of one mt gene fragment (cytochrome c oxidase subunit I; COI). We extended our investigation to the sequencing of this mt gene for 22 additional insects collected in five sampling sites, and found that heteroplasmy is widespread in the region. This could be a consequence of a relatively ancient origin and subsequent persistence in the population, and/or of the frequent occurrence of paternal leakage (i.e. multiple recent origins leading to convergent patterns of heteroplasmy among individuals).
2. Material and methods
Twenty-four individuals of the European leaf beetle G. intermedia were collected from five sampling sites in the Belgian Ardennes (pairwise distances from 4 to 20 km; electronic supplementary material, S1). For two individuals, DNA libraries were prepared and sequenced with MPS technology. A mitochondrial genome reference sequence was assembled and annotated for both individuals (electronic supplementary material, S2). MPS reads were mapped to the assemblies after excluding the control region (whose assembly was more challenging due to its repetitive nature; electronic supplementary material, S2), allowing identification of single-nucleotide polymorphisms (SNPs) in the mt genome.
To further verify that the sequence of the identified mt haplotypes is correct, we amplified via PCR and sequenced (Sanger technology) an 814 bp fragment of the COI gene. We used primers designed to amplify specifically each of the two haplotypes, and determined by trial-and-error at which annealing temperature each primer pair amplified only one of the two haplotypes (electronic supplementary material, table S5 and S4). Sequences of 740 bp were obtained, which were compared to those of the MPS assemblies. Furthermore, we PCR-amplified and sequenced the COI fragment from all 24 individuals sampled, using the two specific primer pairs plus one universal primer pair (electronic supplementary material, S4) capable of amplifying both haplotypes (three PCR amplifications per insect: one using a primer pair specific to the first haplotype, one using a primer pair specific to the second haplotype and one using a ‘universal’ primer pair). Using these sequences, we could determine, for each individual, their haplotype composition, and thus their homoplasmic or heteroplasmic status.
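The three-amplification design described above can be read as a simple decision rule. The sketch below is only an illustration of that logic: the function name, the boolean inputs and the "inconsistent"/"undetermined" labels are assumptions introduced here, not part of the published protocol.

```python
from typing import Optional


def classify_individual(hf_specific: bool,
                        lf_specific: bool,
                        universal: Optional[str] = None) -> str:
    """Classify one insect from its three PCR/sequencing results.

    hf_specific / lf_specific: True when the haplotype-specific primer
        pair yielded a sequence of that haplotype (hypothetical encoding).
    universal: haplotype recovered with the universal primer pair
        ("HF" or "LF"), or None if that reaction was not scored.
    """
    if hf_specific and lf_specific:
        # Both haplotypes recovered from the same individual.
        return "heteroplasmic"
    if hf_specific or lf_specific:
        only = "HF" if hf_specific else "LF"
        # For a homoplasmic call, the universal-primer sequence
        # should agree with the single haplotype detected.
        if universal is not None and universal != only:
            return "inconsistent"
        return f"homoplasmic {only}"
    # Neither specific reaction produced a usable sequence.
    return "undetermined"


if __name__ == "__main__":
    print(classify_individual(True, True, "HF"))
    print(classify_individual(False, True, "LF"))
```

Under this rule an individual is called homoplasmic only when one specific reaction fails and the universal reaction does not contradict the other, which mirrors the double check reported in the Results.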
3. Results
Mapping MPS reads to mt genome assemblies of two individuals, after excluding the control region, highlighted numerous and homogeneously distributed polymorphic sites (figure 1; electronic supplementary material, table S6). We thus postulated the existence of at least two different mt haplotypes in each individual and that it is possible to phase both haplotypes based on the relative frequency of each variant: the dominant haplotype, hereafter designated HF (high frequency), was supported by a mean proportion of reads mapped to the mt genome equal to 79% (individual 1) and 81% (individual 2). Its sequence corresponded exactly to that produced by our assembly. The alternative haplotype, built by replacing the nucleotide at each SNP by the low frequency variant, is called LF (low frequency) hereafter. PCR amplification and Sanger sequencing of a fragment of the COI gene from the HF and LF variants (using primer pairs specific to each) largely confirmed their sequence (electronic supplementary material, S4). The possibility that one of the two variants was actually of nuclear origin could reasonably be rejected because (i) each variant was absent from some individuals and (ii) the number of MPS reads of each mt variant was much higher than that of nuclear genes (electronic supplementary material, S5).
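The frequency-based phasing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the input format (a reference string plus per-site read counts), the function name and the toy data are all hypothetical, and every SNP is assumed to carry at least two observed bases.

```python
def phase_by_frequency(reference, snp_counts):
    """Build HF and LF haplotypes from per-site read counts.

    reference: assembled sequence as a string.
    snp_counts: dict mapping a 0-based position to a dict of
        base -> number of reads supporting that base (assumed format).
    Returns (hf_sequence, lf_sequence, mean_major_fraction).
    """
    hf = list(reference)
    lf = list(reference)
    major_fractions = []
    for pos, counts in snp_counts.items():
        # Rank the observed bases by read support, highest first.
        ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
        (major, major_n), (minor, _) = ranked[0], ranked[1]
        hf[pos] = major   # dominant variant goes to the HF haplotype
        lf[pos] = minor   # second variant goes to the LF haplotype
        major_fractions.append(major_n / sum(counts.values()))
    mean_major = (sum(major_fractions) / len(major_fractions)
                  if major_fractions else float("nan"))
    return "".join(hf), "".join(lf), mean_major


if __name__ == "__main__":
    ref = "ACGTACGT"                      # toy reference, not real data
    snps = {1: {"C": 80, "T": 20}, 5: {"C": 78, "G": 22}}
    print(phase_by_frequency(ref, snps))
```

This approach only works when the two haplotypes differ clearly in abundance at every site, as with the roughly 80 : 20 split reported here; near 50 : 50 it could not assign variants to haplotypes.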
Eleven individuals out of 24 were clearly heteroplasmic for two distant mtDNA haplotypes (one identical to the initially sequenced LF haplotype, the other belonging to one variant of the HF haplotype), while the remaining individuals were homoplasmic (figure 2). The absence of an HF or LF haplotype in all homoplasmic individuals was verified by two independent PCR/sequencing assays, one using a haplotype-specific primer pair and another using a universal primer pair (electronic supplementary material, S4). Importantly, heteroplasmic individuals were found in four out of five sampled localities.
Interestingly, amplifying and sequencing the COI fragment in heteroplasmic individuals (HF/LF) using ‘universal’ primers produced HF haplotype sequences only. Therefore, a classic Sanger sequencing of the mt COI using universal primers would not have detected the presence of a second haplotype in those individuals.
4. Discussion
We have established the widespread occurrence of heteroplasmy in the Ardennes region for the leaf beetle G. intermedia, by highlighting the coexistence of two distantly related mt genomes in almost half of the sampled individuals, as well as in four out of five surveyed localities. This finding was only possible thanks to MPS data from two individuals that allowed us to identify nucleotide polymorphism in the mt genome, and to subsequently design PCR primer pairs specific to each of the two inferred variants. Because the presence of a second and distant haplotype in heteroplasmic individuals was not identified in the Ardennes previously using traditional PCR and Sanger sequencing, our finding raises the question of whether heteroplasmy is more frequent in animals than previously thought.
The widespread occurrence of heteroplasmy within a species has important implications for evolutionary studies using mitochondrial markers. Indeed, the coexistence of two mt genomes within a single cell renders recombination possible between them [9]. White et al. [4] have already discussed how heteroplasmy and recombination could obscure phylogenetic signal, and this is relevant both at the inter- and intra-species levels. Heteroplasmy can remain undetected in a sample when using classic Sanger sequencing, e.g. because the proportion of the less frequent haplotype(s) is too low. This is exemplified by our previous study on genetic variation across the entire range of the species [10], in which we did not detect the LF (or a related) haplotype in the Ardennes (because the HF haplotype was preferentially amplified by PCR when both LF and HF were present), although this haplotype is closely related to others found in the Ural Mountains (eastern Europe) and Finland (northern Europe). As a consequence, mt haplotype diversity within the Ardennes region was underestimated, as well as the sharing of haplotypes between this region and other northern locations, therefore biasing the observed phylogeographic pattern. Also, population effective size, a major parameter for population genetic inferences, should be re-evaluated for the mt genome of G. intermedia: it is generally assumed to be four times smaller than that of the nuclear genome (assuming a 1 : 1 sex ratio) because it is considered haploid and transmitted exclusively along maternal lineages. However, with almost 50% of heteroplasmic individuals, it is no longer correct to model individuals as haploids for their mt genome. They might be better modelled as diploid individuals instead, and the real effective population size of the mt genome may be closer to two (rather than four) times smaller than that of the nuclear genome. This effective size could even be larger if paternal leakage occurred frequently, and thus if the mt genome were also transmitted somewhat along paternal lineages.
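The factors of four and two quoted above follow from a simple count of transmissible genome copies. As a rough sketch, assume N breeding individuals, a 1 : 1 sex ratio (so N/2 females), and an effective size proportional to the number of gene copies passed on each generation; N and the copy counts are illustrative symbols, not quantities estimated in this study.

```latex
\begin{align*}
\text{nuclear genome (diploid, both sexes):} \quad & 2N \text{ copies} \\
\text{mt genome, haploid, maternal only:} \quad & 1 \times \frac{N}{2} = \frac{N}{2}, \qquad \frac{N/2}{2N} = \frac{1}{4} \\
\text{mt genome, two variants per female, maternal only:} \quad & 2 \times \frac{N}{2} = N, \qquad \frac{N}{2N} = \frac{1}{2}
\end{align*}
```

Because only about half of the sampled individuals were heteroplasmic, the realised ratio under strictly maternal transmission would be expected to lie between these two values; any transmission through males would raise it further.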
The mt genome contributes to the respiratory chain, one of the cell's most vital functions, and is believed to be subject to strong selection pressures [11,12]. The widespread occurrence of the HF/LF pattern of heteroplasmy observed in the Ardennes suggests that its presence in the population is quite stable. This may appear surprising if we assume that (i) selection indeed constrains the evolution of the mt genome and (ii) strong mtDNA bottlenecks occur within germline cells at each generation. On the other hand, it has been suggested that mtDNA bottlenecks in germline cells are weaker in insects compared to mammals, and that once heteroplasmy appears, it can persist for several hundred generations in populations, due to their often large effective size [4]. Nonetheless, one intriguing possibility is that the mt genome could be subject to balancing selection [13], which is normally not considered for the mt genome. If two mt genomes can coexist, each bringing some selective advantage that the other does not, then selection could favour heteroplasmic individuals and maintain a large proportion of them in the population. In fact, analysing COI sequence variation in our sample suggested balancing selection could be at work (Tajima's D test; electronic supplementary material, S4). Another hypothesis to explain the widespread pattern of heteroplasmy found here would involve frequent paternal leakage [14,15]. Indeed, the relatively high divergence observed between the two coexisting haplotypes could have favoured paternal leakage, by allowing mitochondria of sperm origin to avoid detection by egg cells that would normally destroy them [16]. In that case, heteroplasmy would not need to be stable to explain our observations; instead, one of the two competing mt genomes would quickly disappear within a maternal line by genetic drift, and possibly selection, but paternal leakage would constantly create new heteroplasmic individuals in the population. Further studies will be needed to investigate these alternative hypotheses.
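For readers unfamiliar with the Tajima's D test mentioned above, the statistic compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites. Positive values, as expected when two divergent haplotypes are both common, are consistent with balancing selection, although demographic history can produce them too. The sketch below implements the standard formula for a set of aligned sequences of equal length. It is an illustration only: it does not handle gaps or ambiguous bases, the function name and toy alignments are hypothetical, and it is not the analysis reported in the supplementary material.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (no gap handling)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("at least four sequences are required")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants of the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy alignment: two divergent haplotypes at equal frequency.
    two_clades = ["AAAA"] * 5 + ["TTTT"] * 5
    print(tajimas_d(two_clades))
```

In the toy alignment, every polymorphic site separates the two haplotype groups, which inflates pairwise diversity relative to the number of segregating sites and yields a positive D.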
Supplementary Material
Acknowledgement
S Lukicheva provided a preliminary assembly of the nuclear genome of G. intermedia. Comments from three anonymous reviewers significantly improved the manuscript.
Ethics
Gonioctena intermedia is a common species found in large numbers across its range; its collection during the course of this work has not endangered the sampled populations (samples represented a minute fraction of populations) and did not violate any local or international law.
Data accessibility
Two mt genome and COI haplotype sequences are available from GenBank: accession nos. MF563958-MF563963. Additional information has been uploaded as electronic supplementary material and table S6.
Authors' contributions
C.K. and P.M. designed the study, carried out the sampling and laboratory work, analysed the data and wrote the paper. Both authors agreed to be held accountable for the content therein and approved the final version of the manuscript.
Competing interests
We have no competing interests.
Funding
This work was supported by a FER grant from the Université Libre de Bruxelles to P.M. C.K. benefitted from a PhD fellowship from the FRS-FNRS (Fonds de la Recherche Scientifique).
References
- 1. Cole LW. 2016. The evolution of per-cell organelle number. Front. Cell Dev. Biol. 4, 85. (doi:10.3389/fcell.2016.00085)
- 2. Avise JC. 2004. Molecular markers, natural history and evolution. Sunderland, MA: Sinauer Associates.
- 3. Ballard JWO, Whitlock MC. 2004. The incomplete natural history of mitochondria. Mol. Ecol. 13, 729–744. (doi:10.1046/j.1365-294X.2003.02063.x)
- 4. White DJ, Wolff JN, Pierson M, Gemmell NJ. 2008. Revealing the hidden complexities of mtDNA inheritance. Mol. Ecol. 17, 4925–4942. (doi:10.1111/j.1365-294X.2008.03982.x)
- 5. Zouros E, Freeman KR, Oberhauser Ball A, Pogson GH. 1992. Direct evidence for extensive paternal mitochondrial DNA inheritance in the marine mussel Mytilus. Nature 359, 412–414. (doi:10.1038/359412a0)
- 6. Kvist L, Martens J, Nazarenko AA, Orell M. 2003. Paternal leakage of mitochondrial DNA in the great tit (Parus major). Mol. Biol. Evol. 20, 243–247. (doi:10.1093/molbev/msg025)
- 7. Robison GA, Balvin O, Schal C, Vargo EL, Booth W. 2015. Extensive mitochondrial heteroplasmy in natural populations of a resurging human pest, the bed bug (Hemiptera: Cimicidae). J. Med. Entomol. 52, 734–738. (doi:10.1093/jme/tjv055)
- 8. Hey J, Nielsen R. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics 167, 747–760. (doi:10.1534/genetics.103.024182)
- 9. Ma H, O'Farrell PH. 2015. Selections that isolate recombinant mitochondrial genomes in animals. eLife 4, e07247. (doi:10.7554/eLife.07247)
- 10. Quinzin MC, Mardulyn P. 2014. Multi-locus DNA sequence variation in a complex of four leaf beetle species with parapatric distributions: mitochondrial and nuclear introgressions reveal recent hybridization. Mol. Phylogenet. Evol. 78, 14–24. (doi:10.1016/j.ympev.2014.05.003)
- 11. Vafai SB, Mootha VK. 2012. Mitochondrial disorders as windows into an ancient organelle. Nature 491, 374–383. (doi:10.1038/nature11707)
- 12. Bazin E, Glémin S, Galtier N. 2006. Population size does not influence mitochondrial genetic diversity in animals. Science 312, 570–572. (doi:10.1126/science.1122033)
- 13. Ma H, Xu H, O'Farrell PH. 2014. Transmission of mitochondrial mutation and action of purifying selection in Drosophila. Nat. Genet. 46, 393–397. (doi:10.1038/ng.2919)
- 14. Nunes MDS, Dolezal M, Schlötterer C. 2013. Extensive paternal mtDNA leakage in natural populations of Drosophila melanogaster. Mol. Ecol. 22, 2106–2117. (doi:10.1111/mec.12256)
- 15. Wolff JN, Nafisinia M, Sutovsky P, Ballard JWO. 2013. Paternal transmission of mitochondrial DNA as an integral part of mitochondrial inheritance in metapopulations of Drosophila simulans. Heredity 110, 57–62. (doi:10.1038/hdy.2012.60)
- 16. Sherengul W, Kondo R, Matsuura ET. 2006. Analysis of paternal transmission of mitochondrial DNA in Drosophila. Genes Genet. Syst. 81, 399–404. (doi:10.1266/ggs.81.399)