Evaluation of mitochondrial DNA copy number estimation techniques

Ryan J Longchamps; Christina A Castellani; Stephanie Y Yang; Charles E Newcomb; Jason A Sumpter; John Lane; Megan L Grove; Eliseo Guallar; Nathan Pankratz; Kent D Taylor; Jerome I Rotter; Eric Boerwinkle; Dan E Arking

doi:10.1371/journal.pone.0228166

PLoS One. 2020 Jan 31;15(1):e0228166. doi: 10.1371/journal.pone.0228166

Ryan J Longchamps ¹, Christina A Castellani ¹, Stephanie Y Yang ¹, Charles E Newcomb ¹, Jason A Sumpter ¹, John Lane ², Megan L Grove ³, Eliseo Guallar ⁴, Nathan Pankratz ², Kent D Taylor ⁵, Jerome I Rotter ⁵, Eric Boerwinkle ³,⁶, Dan E Arking ¹,*

Editor: David C Samuels⁷

¹Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, MD, United States of America

²Department of Laboratory Medicine and Pathology, University of Minnesota Medical School, Minneapolis, MN, United States of America

³Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States of America

⁴Department of Epidemiology and the Welch Center for Prevention, Epidemiology and Clinical Research, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States of America

⁵LABioMed and Department of Pediatrics, at Harbor-UCLA Medical Center, Institute for Translational Genomics and Population Sciences, Torrance, CA, United States of America

⁶Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, United States of America

⁷Vanderbilt University Medical Center, UNITED STATES

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: arking@jhmi.edu

Roles

Ryan J Longchamps: Conceptualization, Data curation, Formal analysis, Writing – original draft, Writing – review & editing

Christina A Castellani: Data curation, Formal analysis, Writing – review & editing

Stephanie Y Yang: Data curation, Formal analysis, Writing – review & editing

Charles E Newcomb: Data curation, Formal analysis, Writing – review & editing

Jason A Sumpter: Data curation, Formal analysis, Writing – review & editing

John Lane: Formal analysis, Software, Writing – review & editing

Megan L Grove: Data curation, Project administration, Resources, Writing – review & editing

Eliseo Guallar: Conceptualization, Funding acquisition, Supervision, Writing – review & editing

Nathan Pankratz: Conceptualization, Supervision, Writing – review & editing

Kent D Taylor: Data curation, Funding acquisition, Writing – review & editing

Jerome I Rotter: Data curation, Funding acquisition, Supervision, Writing – review & editing

Eric Boerwinkle: Data curation, Funding acquisition, Supervision, Writing – review & editing

Dan E Arking: Conceptualization, Data curation, Formal analysis, Funding acquisition, Supervision, Writing – original draft, Writing – review & editing

David C Samuels: Editor

PMCID: PMC6994099 PMID: 32004343

Abstract

Mitochondrial DNA copy number (mtDNA-CN), a measure of the number of mitochondrial genomes per cell, is a minimally invasive proxy measure for mitochondrial function and has been associated with several aging-related diseases. Although quantitative real-time PCR (qPCR) is the current gold standard method for measuring mtDNA-CN, mtDNA-CN can also be measured from genotyping microarray probe intensities and DNA sequencing read counts. To conduct a comprehensive examination of the performance of these methods, we use known mtDNA-CN correlates (age, sex, white blood cell count, Duffy locus genotype, incident cardiovascular disease) to evaluate mtDNA-CN calculated from qPCR, two microarray platforms, and whole genome (WGS) and whole exome sequence (WES) data across 1,085 participants from the Atherosclerosis Risk in Communities (ARIC) study and 3,489 participants from the Multi-Ethnic Study of Atherosclerosis (MESA). We observe that mtDNA-CN derived from WGS data is significantly more associated with known correlates than mtDNA-CN from all other methods (p < 0.001). Additionally, mtDNA-CN measured from WGS is on average more significantly associated with traits by 5.6 orders of magnitude and has effect size estimates 5.8 times more extreme than the current gold standard of qPCR. We further investigated the role of DNA extraction method on mtDNA-CN estimate reproducibility and found that mtDNA-CN estimated from cell lysate is significantly less variable than mtDNA-CN estimated after traditional phenol-chloroform-isoamyl alcohol extraction (p = 5.44 × 10⁻⁴) or silica-based column selection (p = 2.82 × 10⁻⁷). In conclusion, we recommend that the field move toward more accurate methods for measuring mtDNA-CN and re-analyze trait associations as more WGS data become available from larger initiatives such as TOPMed.

Introduction

Mitochondrial dysfunction has long been known to play an important role in the underlying etiology of several aging-related diseases, including cardiovascular disease (CVD), neurodegenerative disorders and cancer[1]. As an easily measurable and accessible proxy for mitochondrial function, mitochondrial DNA copy number (mtDNA-CN) is increasingly used to assess the role of mitochondria in disease. Several population-based studies have shown higher levels of mtDNA-CN to be associated with decreased incidence for CVD and its component parts: coronary artery disease (CAD) and stroke[2,3]; neurodegenerative disorders such as Parkinson’s and Alzheimer’s[4,5]; as well as several types of cancer including breast, kidney, liver and colorectal[6–8]. Furthermore, mtDNA-CN measured from peripheral blood has consistently been shown to be higher in women, decline with age, and correlate negatively with white blood cell (WBC) count[9–11].

Although the mtDNA-CN field is relatively young, the number of publications has been steadily increasing at an average rate of 12% per year since 2015[12]. However, there has yet to be a rigorous examination of the various methods for measuring this novel phenotype and the factors which may influence its accurate estimation. Without such an examination, studies may be severely underestimating or misrepresenting the relationship of mtDNA-CN with their traits of interest.

Quantitative real-time PCR (qPCR) has been the most widely used method for measuring mtDNA-CN, partly due to its low cost and quick turnaround time. However, recent work has demonstrated the feasibility of accurately measuring mtDNA-CN from preexisting microarray, whole exome sequencing (WES) and whole genome sequencing (WGS) data[2,10,13]. With these advances, it is important for the field to evaluate these methods in the context of the current gold standard.

In addition to the method for determining mtDNA-CN, it is important to consider the impact of DNA extraction method on mtDNA-CN, particularly due to the small size and circular nature of the mitochondrial genome. Previous research has shown organic solvent extraction is more accurate than silica-based methods at measuring mtDNA-CN, which is unsurprising as column kit parameters are typically optimized for DNA fragments ≥50 Kb[14]. However, as all DNA extraction methods have bias in the DNA which they target, measuring mtDNA-CN from direct cell lysate may prove to be a more accurate method.

In the present study, we assess the relative performance of various methods for measuring mtDNA-CN and the effects of DNA extraction on mtDNA-CN estimation accuracy. We leverage mtDNA-CN calculated across 4,574 individuals from two prospective cohorts, the Atherosclerosis Risk in Communities study (ARIC) and the Multi-Ethnic Study of Atherosclerosis (MESA). Using mtDNA-CN estimates calculated from qPCR, WES, WGS, and two microarray platforms–the Affymetrix Genome-Wide Human SNP Array 6.0 and the Illumina HumanExome BeadChip genotyping array–we compare associations for known correlates of mtDNA-CN including age, sex, white blood cell count, the Duffy locus and incident CVD to determine the optimal method for calculating copy number. We additionally determined the reproducibility of mtDNA-CN measurements in vitro from three separate DNA extraction methods: silica-based column selection, organic solvent extraction (phenol-chloroform-isoamyl alcohol), and measuring mtDNA-CN from direct cell lysis without performing a traditional DNA extraction. We hypothesized that mtDNA-CN calculated from WGS data would outperform other estimation methods and mtDNA-CN measured from direct cell lysate would be more accurate than traditional DNA extraction methods.

Methods

Study populations

The ARIC study recruited 15,792 individuals between 1987 and 1989 aged 45 to 65 years from 4 US communities. DNA for mtDNA-CN estimation was collected from different visits and was derived from buffy coat using the Gentra Puregene Blood Kit (Qiagen). Relevant covariates were derived from the same visit in which DNA was collected. Our analyses were limited to 1,085 individuals with mtDNA-CN data available across all four platforms performed within ARIC: Affymetrix Genome-Wide Human SNP Array 6.0, Illumina HumanExome BeadChip genotyping array, WES and WGS. Eighty-eight percent of our final ARIC participants were Black.

The MESA study recruited 6,814 individuals free of prevalent clinical CVD from 6 US communities across 4 ethnicities. Age range at baseline was 45 to 84 and the baseline exam occurred between 2000 and 2002. DNA for mtDNA-CN analyses was isolated from exam 1 peripheral leukocytes using the Gentra Puregene Blood Kit. Our analyses were restricted to 3,489 White and Black (36%) individuals with mtDNA-CN data from all three platforms available in MESA at the time of analysis: qPCR, Affymetrix Genome-Wide Human SNP Array 6.0 and Illumina HumanExome BeadChip genotyping array. Exam 1 DNA for the exploratory dPCR pilot study was derived from packed red blood cells.

Measurement of mtDNA-CN

qPCR

mtDNA-CN was determined using a multiplexed real time qPCR assay as previously described[11]. Briefly, the cycle threshold (Ct) value of a mitochondrial-specific (ND1) and nuclear-specific (RPPH1) target were determined in triplicate for each sample. The difference in Ct values (ΔCt) for each replicate represents a raw relative measure of mtDNA-CN. Replicates were removed if they had Ct values for ND1 >28, Ct values for RPPH1 >5 standard deviations from the mean, or ΔCt values >3 standard deviations from the mean of the plate. Outlier replicates were identified and excluded for samples with a ΔCt standard deviation >0.5. The sample was excluded if the ΔCt standard deviation remained >0.5 after replicate removal. We corrected for an observed linear increase in ΔCt value due to the pipetting order of each replicate via linear regression. The mean ΔCt across all replicates was further adjusted for plate effects as a random effect to represent a raw relative measure of mtDNA-CN.
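
The replicate-level filtering described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the sign convention (ΔCt = Ct of RPPH1 minus Ct of ND1, so that larger values mean more mtDNA), the rule for picking the outlier replicate (the one farthest from the median), and the function name `sample_delta_ct` are all assumptions. The plate-level filters, the pipetting-order correction and the plate random effect are omitted.

```python
from statistics import mean, median, stdev

ND1_CT_MAX = 28.0  # replicates with ND1 Ct above this are removed (from the text)
SD_MAX = 0.5       # maximum allowed ΔCt standard deviation per sample (from the text)


def sample_delta_ct(replicates):
    """Return the mean ΔCt for one sample, or None if the sample is excluded.

    replicates: list of (ct_nd1, ct_rpph1) tuples, one per technical replicate.
    ΔCt is taken here as Ct(RPPH1) - Ct(ND1); this sign convention is an assumption.
    """
    deltas = [rpph1 - nd1 for nd1, rpph1 in replicates if nd1 <= ND1_CT_MAX]
    if len(deltas) < 2:
        return None  # not enough replicates left to judge reproducibility

    if len(deltas) > 2 and stdev(deltas) > SD_MAX:
        # Assumed outlier rule: drop the single replicate farthest from the median.
        centre = median(deltas)
        deltas.remove(max(deltas, key=lambda d: abs(d - centre)))

    if stdev(deltas) > SD_MAX:
        return None  # still too variable after replicate removal
    return mean(deltas)
```

For example, a sample with ΔCt values of roughly 7.0, 7.1 and 9.0 would lose the third replicate and be summarized by the mean of the remaining two.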

Microarray

mtDNA-CN was determined using the Genvisis[15] software package for both the Affymetrix Genome-Wide Human SNP Array 6.0 and the Illumina HumanExome BeadChip genotyping array. A list of high-quality mitochondrial SNPs was hand-curated by employing BLAST to remove SNPs without a perfect match to the annotated mitochondrial location and SNPs with off-target matches longer than 20 bp. The probe intensities of the remaining mitochondrial SNPs (25 Affymetrix, 58 Illumina Exome Chip) were determined using quantile sketch normalization (apt-probeset-summarize) as implemented in the Affymetrix Power Tools software. The median of the normalized intensity, log R ratio (LRR), for all homozygous calls was GC corrected and used as the initial estimate of mtDNA-CN for each sample.

Technical covariates such as DNA quality, DNA quantity, and hybridization efficiency were captured via surrogate variable analysis or principal component analysis as previously described[2]. Surrogate variables or principal components were applied to the BLAST filtered, GC corrected LRR of the remaining autosomal SNPs (43,316 Affymetrix, 47,512 Exome Chip).

These autosomal SNPs were selected based on the following quality filters: call rate >98%, HWE p value >0.00001, PLINK mishap for non-random missingness p value >0.0001, association with sex p value >0.00001, linkage disequilibrium pruning (r² <0.30), with maximal spacing between autosomal SNPs of 41.7 kb.
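
The per-SNP thresholds listed above can be expressed as a simple predicate. This is a hedged sketch only: the `SnpQc` record and its field names are hypothetical, and the pairwise steps (linkage disequilibrium pruning at r² <0.30 and the 41.7 kb spacing constraint) are not shown because they depend on neighbouring SNPs.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Hypothetical per-SNP quality summary; field names are illustrative."""
    call_rate: float    # fraction of samples with a genotype call
    hwe_p: float        # Hardy-Weinberg equilibrium p value
    mishap_p: float     # PLINK mishap (non-random missingness) p value
    sex_assoc_p: float  # p value for association with sex


def passes_qc(snp: SnpQc) -> bool:
    """Apply the per-SNP thresholds quoted in the text."""
    return (
        snp.call_rate > 0.98
        and snp.hwe_p > 0.00001
        and snp.mishap_p > 0.0001
        and snp.sex_assoc_p > 0.00001
    )
```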

WES

Whole exome capture was performed using Nimblegen’s VChrome2.1 (Roche) and sequencing was performed on the Illumina HiSeq 2000. Sequence reads were aligned using Burrows-Wheeler Aligner (BWA)[16] to the hg19 reference genome. Variant calling and quality control were performed as previously described[17]. mtDNA-CN was calculated using the mitoAnalyzer software package, which determines the observed ratios of sequence coverages between autosomal and mtDNA[18,19].

Due to large batch effects observed in our raw mtDNA-CN calls, alignment summary, insert size, quality score, base distribution, sequencing artifact and quality yield metrics were collected using Picard tools (version 1.87) to take into account differences in capture efficiency as well as sequencing and alignment quality[20]. Picard sequencing summary metrics to incorporate into our final model were selected through a stepwise backwards elimination model (S1 Table).

WGS

Whole genome sequencing data was generated at the Baylor College of Medicine Human Genome Sequencing Center using Nano or PCR-free DNA libraries on the Illumina HiSeq 2000. Sequence reads were mapped to the hg19 reference genome using BWA[16]. Variant calling and quality control were performed as previously described[21]. A count for the total number of reads in a sample was scraped from the NCBI sequence read archive using the R package RCurl[22] while reads aligned to the mitochondrial genome were downloaded directly through Samtools (version 1.3.1). A raw measure of mtDNA-CN was calculated as the ratio of mitochondrial reads to the number of total aligned reads. Unlike WES, we did not observe large batch effects in our WGS raw mtDNA-CN calls, obviating the need for adjustment for Picard sequencing summary metrics.
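
The raw WGS measure reduces to a single ratio, sketched below. The function name and the input checks are assumptions added for illustration; the two read counts would come from the sources described above.

```python
def mt_read_fraction(mito_reads: int, total_aligned_reads: int) -> float:
    """Raw WGS mtDNA-CN measure: mitochondrial reads / total aligned reads."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    if not 0 <= mito_reads <= total_aligned_reads:
        raise ValueError("mito_reads must lie between 0 and total_aligned_reads")
    return mito_reads / total_aligned_reads
```

Because the value is a relative measure, it is only comparable between samples processed the same way, which is why it is subsequently adjusted for covariates and standardized.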

Digital PCR

mtDNA-CN was calculated using a multiplexed digital plate-based PCR (dPCR) method utilizing the ND1 and RPPH1 qPCR probes previously described. Samples were divided into 36,000 partitions on a 24-well plate and the fluorescence for each probe was measured with the Constellation Digital PCR System (Formulatrix, Boston, MA). Fluorescence intensity was evaluated with the Formulatrix software and thresholds were based on visual inspection of the aggregate data for each plate. Thresholds were then used to determine the number of positive and negative partitions. Positive counts were fitted to a Poisson distribution to determine copy number[23]. mtDNA-CN was represented as the ratio between the number of ND1 copies/μL and the number of RPPH1 copies/μL. Samples were included if they had fewer than 30,000 positives for ND1 and between 5 and 2,000 positives for RPPH1. Samples were filtered if the observed ratio was not between 15 and 300 ND1:RPPH1. The initial mtDNA-CN ratio was adjusted for plate as a random effect to represent a raw absolute measure of mtDNA-CN.
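
The Poisson step can be illustrated with the standard digital PCR correction, in which the mean number of copies per partition is λ = −ln(1 − k/n) for k positive partitions out of n. The sketch below is an assumption-laden illustration rather than the Formulatrix implementation: it assumes both targets are read from the same partitions, so the ratio of copies/μL equals the ratio of the two λ values, and it treats the stated inclusion bounds as inclusive.

```python
import math


def copies_per_partition(positives: int, partitions: int) -> float:
    """Poisson-corrected mean copies per partition: -ln(1 - k/n)."""
    if not 0 <= positives < partitions:
        raise ValueError("positives must satisfy 0 <= positives < partitions")
    return -math.log(1.0 - positives / partitions)


def dpcr_ratio(nd1_positives: int, rpph1_positives: int, partitions: int = 36000):
    """Return the ND1:RPPH1 copy ratio, or None if the sample fails the filters."""
    # Inclusion criteria quoted in the text (bounds assumed inclusive).
    if nd1_positives >= 30000 or not 5 <= rpph1_positives <= 2000:
        return None
    ratio = (copies_per_partition(nd1_positives, partitions)
             / copies_per_partition(rpph1_positives, partitions))
    return ratio if 15 <= ratio <= 300 else None
```

The correction matters most for ND1, where a large fraction of partitions is positive and many partitions therefore hold more than one copy.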

Cardiovascular disease definition and adjudication

Event adjudication through 2017 in ARIC and 2015 in MESA consisted of expert committee review of death certificates, hospital records and telephone interviews. Incident cardiovascular disease (CVD) was defined as either incident coronary artery disease (CAD) or incident stroke. Incident CAD was defined as first incident MI or death owing to CAD while incident stroke was defined as first nonfatal stroke or death due to stroke. Individuals in ARIC with prevalent CVD at baseline were excluded from incident analyses.

Genotyping and imputation

Genotype calling for the WBC count locus was derived from the Affymetrix Genome-wide Human SNP Array 6.0 in ARIC and MESA. Haplotype phasing for both cohorts was performed using ShapeIt[24] and imputation was performed using IMPUTE2[25]. Genotypes were imputed to the 1000G reference panel (Phase I, version 3). Imputation quality for the Duffy locus lead SNP (rs2814778) was 0.95 and 0.92 in ARIC and MESA, respectively.

DNA extraction method

All DNA used in the DNA extraction comparison was derived from HEK293T cells grown in a single 150T flask to minimize variation due to clonality and cell culture procedures. Extractions were performed with 15 replicates, each containing one million cells. mtDNA-CN was determined using qPCR as described previously. To account for the inherent variability in mtDNA-CN estimation, qPCR was run in triplicate.

Silica-based column extraction

We performed a silica-based column extraction using the AllPrep DNA/RNA Mini Kit (Qiagen) according to the manufacturer’s instructions for fewer than 5 x 10⁶ cells. Briefly, HEK293T cells were lysed and the subsequent lysate was pipetted directly onto the DNA Allprep spin column for homogenization and DNA binding. The bound DNA was then washed and eluted.

Organic solvent extraction

An aliquot of cells was lysed with 350 μL of RLT Plus Buffer (Qiagen), and one volume of phenol:chloroform:isoamyl alcohol (25:24:1) (PCIAA) was added to the sample and mixed until it turned milky white. The solution was centrifuged and the upper aqueous phase containing DNA was transferred to a separate tube. We proceeded with an ethanol precipitation protocol using 3M sodium acetate to complete the DNA extraction.

Direct cell lysis

Cells were pelleted at 500 g for 5 minutes and the supernatant was removed. The cell pellet was agitated in 100 μL of QuickExtract DNA Solution (Lucigen) to disrupt the pellet and placed in a thermocycler for 15 minutes at 68°C followed by 10 minutes at 95°C. The cell lysate was then centrifuged at 17,000 g for 15 minutes to pellet any insoluble inhibitors and the supernatant was transferred to a clean tube. The supernatant containing DNA was finally diluted 1:30 with water to limit the impact of any soluble inhibitors on qPCR.

Statistical analyses

The final mtDNA-CN phenotype for all measurement techniques is represented as the standardized residuals from a linear model adjusting the raw measure of mtDNA-CN for age, sex, DNA collection center, and technical covariates. Additionally, mtDNA-CN in ARIC was adjusted for WBC count, and WBC values for the 14.9% of individuals with missing WBC data were imputed to the mean. WBC was not available in MESA for the same visit in which the DNA was obtained. As mtDNA-CN was standardized, the effect size estimates are in units of standard deviations, with positive betas corresponding to an increase in mtDNA-CN.

For analyses involving outcomes which also served as covariates in our final phenotype model (age, sex, WBC count), mtDNA-CN was calculated using the full model minus the outcome variable. For example, when exploring the relationship between mtDNA-CN and age, our mtDNA-CN phenotype would represent the standardized residuals from a model controlling for sex, sample collection center, WBC count and any technical covariates. We would then use this phenotype to explore the association between age and mtDNA-CN such that effect sizes for all comparisons remain in standard deviation units.
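
The residual-based phenotype can be sketched with an ordinary least squares fit followed by scaling to unit variance. This is a minimal standard-library illustration, not the R code used in the study: the function names are hypothetical, "standardized" is interpreted here as z-scoring the residuals, and categorical covariates such as collection center would need to be dummy-coded before being passed in.

```python
from statistics import mean, stdev


def ols_residuals(y, covariates):
    """Residuals of y regressed on an intercept plus the given covariate columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in covariates] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [y[r] - sum(b * x for b, x in zip(beta, X[r])) for r in range(n)]


def mtdna_cn_phenotype(raw, covariates):
    """Z-scored residuals of the raw measure after covariate adjustment."""
    resid = ols_residuals(raw, covariates)
    m, s = mean(resid), stdev(resid)
    return [(v - m) / s for v in resid]


# Placeholder values for illustration only (not study data).
raw = [1.2, 1.5, 1.0, 1.3, 1.1, 0.7]
age = [50, 55, 60, 65, 70, 75]
sex = [0, 1, 0, 1, 1, 0]
pheno_for_age_analysis = mtdna_cn_phenotype(raw, [sex])  # age left out of the model
pheno_full = mtdna_cn_phenotype(raw, [age, sex])
```

Leaving the outcome of interest out of the covariate list, as in `pheno_for_age_analysis`, mirrors the "full model minus the outcome variable" approach described above.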

The Duffy locus is highly associated with WBC count in Blacks[26] due to its role in conferring a selective advantage against malaria; however, this association is limited or absent in other ethnicities[27]. As such, single SNP regression for mtDNA-CN on the Duffy locus was limited to Blacks. Due to the association of mtDNA-CN with WBC count, the Duffy locus acts as another independent external validator for mtDNA-CN unadjusted for WBC count. In ARIC, mtDNA-CN not adjusted for WBC count was used as the independent variable. Single SNP regression models were additionally adjusted for age, sex, sample collection site, and genotyping PCs. Regression analyses were performed with FAST[28].

Cox proportional hazards regression was used to estimate hazard ratios (HRs) for incident CVD outcomes. Follow-up time was defined from DNA collection through death, loss to follow-up, or study end point (through 2017 in ARIC and 2015 in MESA).

Pairwise F-tests were used to test the null hypothesis that the ratio of variances between the DNA extraction methods is equal to one.
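
The test statistic behind this comparison is the ratio of two sample variances. The sketch below computes only the statistic and its degrees of freedom, using placeholder numbers rather than study data; the p value would then be read from an F distribution with those degrees of freedom (for example via `scipy.stats.f`, which is not used here to keep the example dependency-free).

```python
from statistics import variance


def variance_ratio(a, b):
    """Return (F, (df1, df2)) where F = var(a) / var(b) using sample variances."""
    return variance(a) / variance(b), (len(a) - 1, len(b) - 1)


# Placeholder measurements for illustration only.
method_a = [1, 2, 3, 4, 5]
method_b = [2, 4, 6, 8, 10]
f_stat, dof = variance_ratio(method_a, method_b)
```

An F value below one indicates that the first method is the less variable of the pair under this ordering.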

All statistical analyses were performed using R (version 3.3.3).

Ethics statement

The Johns Hopkins IRB approved this study (NA_00091014 / CR00027367). All participants provided written informed consent and all centers obtained approval from their respective institutional review boards.

Results

The study included 1,085 participants from ARIC with mtDNA-CN data from the Affymetrix 6.0 microarray, the Illumina Exome Chip microarray, WES, and WGS while MESA included 3,489 participants with mtDNA-CN data available from qPCR, the Affymetrix 6.0 microarray, and the Illumina Exome Chip microarray (combined N = 4,574). The mean age of study participants was 61.4 years (ARIC, 57.1 years; MESA 62.7 years), 55.3% of participants were female (n = 2,528), and 46.4% of participants were Black (n = 2,124) (Table 1). While the Affymetrix and Illumina Exome Chip arrays were run in both cohorts, at the time of analysis WES and WGS were unique to ARIC and qPCR was unique to MESA.

Table 1. Participant characteristics.

Participant Characteristics	ARIC	MESA
N	1,085	3,489
Sex (female)	672 (61.9)	1,856 (53.2)
Ethnicity (Black)	958 (88.3)	1,226 (35.1)
Age	57.1 ± 5.9	62.7 ± 10.2
WBC count (10³/μl)	5.8 ± 1.7	NA
Incident CVD	174 (16.0)	270 (7.7)

Values are number (%) or mean ± SD;

Abbreviations: SD, standard deviation; WBC, white blood cell;

CVD, cardiovascular disease

mtDNA-CN estimation method comparison

To determine the optimal method for measuring mtDNA-CN, we ranked the performance of each technique based on strength of the association, as measured by p values, with the relevant mtDNA-CN correlate (S2 Table). Kendall’s W tests[29] show significant agreement in rankings across correlates in ARIC (p = 0.0019, Kendall’s W = 0.79) and MESA (p = 0.036, Kendall’s W = 0.82) with WGS and the Affymetrix array performing best for each measure in ARIC and MESA, respectively (Table 2).

Table 2. Performance rankings for mtDNA-CN estimation methods.

Cohort	Assay	Age	Sex	WBC	Duffy locus*	Incident CVD	Mean Rank	Kendall's W p value
ARIC	Exome	2	4	3	4	4	3.4	0.001
	Affy	3	2	2	2	2	2.2
	WES	4	3	4	3	3	3.4
	WGS	1	1	1	1	1	1
MESA	Exome	2	3	NA	2.5	3	2.625	0.03
	Affy	1	1	NA	1	1	1
	qPCR	3	2	NA	2.5	2	2.375

*Duffy locus associations were performed in Blacks only
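
The agreement statistic can be recomputed from the ranks in Table 2. The sketch below assumes the textbook tie-corrected form of Kendall's W, with each correlate acting as a rater and each method as a ranked item; the source does not state which implementation was used, and the function name is illustrative.

```python
from collections import Counter


def kendalls_w(ratings):
    """Tie-corrected Kendall's W.

    ratings: one list of ranks per rater (here, per correlate),
    each ranking the same n items (here, methods) in the same order.
    """
    m, n = len(ratings), len(ratings[0])
    rank_sums = [sum(r[i] for r in ratings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    # Tie correction: sum of (t^3 - t) over groups of tied ranks within each rater.
    ties = sum(t ** 3 - t for r in ratings for t in Counter(r).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


# Ranks from Table 2. Rows: Age, Sex, WBC, Duffy locus, Incident CVD.
# ARIC columns: Exome, Affy, WES, WGS.
aric = [[2, 3, 4, 1], [4, 2, 3, 1], [3, 2, 4, 1], [4, 2, 3, 1], [4, 2, 3, 1]]
# MESA columns: Exome, Affy, qPCR (no WBC row).
mesa = [[2, 1, 3], [3, 1, 2], [2.5, 1, 2.5], [3, 1, 2]]

print(round(kendalls_w(aric), 2))  # 0.79
print(round(kendalls_w(mesa), 2))  # 0.82
```

Both values agree with the Kendall's W reported in the text; the MESA value depends on the tie correction for the shared rank of 2.5 at the Duffy locus.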

To additionally quantify performance, we created a scoring system for each method using negative log transformed p values standardized to the least significant method for each correlate. These values were then summed across the correlates for each method to achieve an overall rating of performance (S3 Table). These ratings were compared to 1,000 permutations of a random sampling of the standardized and transformed p values for each correlate across the different estimation techniques. In ARIC, WGS had a significantly higher performance score compared to all other methods (p < 0.002) while the Illumina Exome Chip had a significantly lower score (p = 0.03) (S1A Fig). In MESA, Affymetrix had a significantly higher score than qPCR and the Illumina Exome Chip (p = 0.002) (S1B Fig). When removing the contribution of WGS in ARIC, the Affymetrix array had a significantly higher score than the Illumina Exome Chip and WES (p = 0.01) (S1C Fig).
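
One possible reading of this scoring and permutation scheme is sketched below. Several details are assumptions: standardization is taken as subtracting the −log10 p value of the least significant method for each correlate (so scores are in orders of magnitude), permutation shuffles the p values across methods within each correlate, and the empirical p value is the fraction of permutations scoring at least as high as observed. The p values are placeholders, not results from the study.

```python
import math
import random


def relative_scores(pvals):
    """pvals: {correlate: {method: p}} -> {method: summed relative -log10 p}."""
    scores = {}
    for by_method in pvals.values():
        floor = min(-math.log10(p) for p in by_method.values())
        for method, p in by_method.items():
            scores[method] = scores.get(method, 0.0) + (-math.log10(p) - floor)
    return scores


def permutation_p(pvals, method, n_perm=1000, seed=0):
    """Fraction of within-correlate shuffles scoring >= the observed score."""
    rng = random.Random(seed)
    observed = relative_scores(pvals)[method]
    hits = 0
    for _ in range(n_perm):
        shuffled = {}
        for correlate, by_method in pvals.items():
            values = list(by_method.values())
            rng.shuffle(values)
            shuffled[correlate] = dict(zip(by_method, values))
        if relative_scores(shuffled)[method] >= observed:
            hits += 1
    return hits / n_perm


# Placeholder p values for two hypothetical methods "A" and "B".
demo = {"age": {"A": 1e-8, "B": 1e-3}, "sex": {"A": 1e-5, "B": 1e-2}}
demo_scores = relative_scores(demo)
demo_p = permutation_p(demo, "A")
```

With more methods and correlates the permutation distribution becomes finer, which is what allows a method's summed score to be judged against chance.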

As WGS and Affymetrix performed similarly, we sought to further parse out their performance by evaluating the 2,746 ARIC samples which contained mtDNA-CN from both platforms. On average, associations with WGS-derived mtDNA-CN were 2.2 orders of magnitude more significant than those with the Affymetrix array (S4 Table).

Due to the recent emergence of digital PCR (dPCR) as a viable method for calculating mtDNA-CN, we performed an additional exploratory analysis in 983 individuals of the MESA cohort comparing the performance of dPCR to qPCR and the Affymetrix genotyping array (S5 Table). While mtDNA-CN calculated from dPCR was more significantly associated with age than either qPCR or the Affymetrix array, dPCR was the least significantly associated metric with sex, and the observed association with incident CVD was in the opposite direction to that expected (S6 Table).

DNA extraction comparison

Raw mitochondrial estimates from qPCR were mean-zeroed to the plate average and the mean value across the triplicate plates was used to determine the variance across the 15 replicates for each method (Fig 1). The variance for our novel Lyse method was significantly lower at 0.02 compared to 0.17 and 0.59 for the PCIAA and Qiagen Kit extractions, respectively (F = 0.13, p = 5.44 × 10⁻⁴; F = 0.04, p = 2.82 × 10⁻⁷). Additionally, our findings support previous work[14] demonstrating PCIAA had significantly lower variability compared to the Qiagen Kit (F = 0.29, p = 0.03).

Fig 1 — mtDNA-CN measured by qPCR was mean-zeroed and averaged across three runs for Lyse, PCIAA and Qiagen Kit DNA extractions. Variance for Lyse, PCIAA and Qiagen Kit are 0.02, 0.17 and 0.59 respectively. PCIAA, phenol:chloroform:isoamyl alcohol.

Discussion

We explored several methods for measuring mtDNA-CN in 4,574 self-identified White and Black participants from the ARIC and MESA studies. We found mtDNA-CN estimated from WGS read counts and Affymetrix Genome-Wide Human SNP Array 6.0 probe intensities was more significantly associated with known mtDNA-CN correlates compared to mtDNA-CN estimated from WES, qPCR and the Illumina HumanExome BeadChip. When observing the relative performance of these methods, associations with mtDNA-CN calculated from either WGS or the Affymetrix array are, respectively, 5.6 and 5.4 orders of magnitude more significant than those with the current gold standard of qPCR (Fig 2). These results are not limited to significance, as we see similar trends when exploring effect size estimates (Fig 3). For example, when looking at incident CVD, mtDNA-CN measured from WGS yields a substantial HR of 0.63 (0.54–0.74), whereas mtDNA-CN measured from qPCR yields an HR of only 0.93 (0.82–1.05), a marked difference. As a result, when exploring the relationship between mtDNA-CN and a trait of interest, on average one could expect a result 5.6 orders of magnitude less significant and 6 times less extreme when using mtDNA-CN estimated from qPCR data as opposed to WGS.

Fig 2 — Overall performance for each method scored as mean or median of the negative log-transformed p value across all correlates normalized to the least significant method of each correlate. For ExomeChip and Affymetrix, the mean value across both cohorts was used as the final measure of performance.

Fig 3 — Data points and their corresponding 95% confidence intervals represent the effect size or hazard ratio estimates for mtDNA-CN with Age, Sex, white blood cell (WBC) count, Duffy locus, and incident cardiovascular disease (CVD). Effect size estimates are in standard deviation units. The significance of each estimate is represented as ‘*’ for P < 0.05, ‘**’ for P < 0.01, and ‘***’ for P < 0.001. WBC, white blood cell.

Several recent reports have touted dPCR as the new gold standard for mtDNA-CN estimation due to its ability to quantify absolute copy number[30–32]. In a small subset of MESA samples, we found mtDNA-CN estimates from dPCR were on average 1.15 and 0.55 orders of magnitude less significant than Affymetrix and qPCR, respectively (S7 Table). These results suggest dPCR may not measure mtDNA-CN as accurately as either the current gold standard or other recently developed methods. However, it is important to note these findings were derived from a subset of samples roughly a fifth the size of that used for the main findings of the overall study, and thus should be interpreted with caution. Additionally, whereas the dPCR data were derived from DNA from packed red blood cells, the qPCR and Affymetrix data were obtained from peripheral leukocytes, potentially explaining the poor performance of dPCR relative to other methods.

Interestingly, mtDNA-CN measured from two seemingly similar microarray platforms differed drastically (S2 Fig). However, this finding is unsurprising when exploring the underlying biochemistry of sample preparation for each microarray platform. While the Affymetrix protocol starts with two restriction enzyme digests prior to whole genome amplification (WGA), the Illumina Exome Chip requires WGA with a processive polymerase prior to sonication. As a result, the mitochondrial genome undergoes rolling circle amplification which occurs at a significantly faster rate than linear WGA[33].

Lower mtDNA-CN has been found to be associated with an increased incidence for several diseases, including end stage renal disease, type 2 diabetes, and non-alcoholic fatty liver disease[34–36]. However, such studies have relied on mtDNA-CN estimated from qPCR data. Our findings suggest much of the current literature may be severely underestimating disease associations with mtDNA-CN as well as its potential as a predictor of disease outcomes. Despite this, at <$2 per sample qPCR may remain the principal method for measuring mtDNA-CN due to the prohibitive costs of WGS. Furthermore, absolute quantification of mtDNA-CN through the use of standard curves may improve upon the performance of qPCR furthering its continuing use[37].

We additionally showed DNA extraction method affects mtDNA-CN estimate reproducibility, with copy number measured directly from cell lysate significantly outperforming silica-based column extraction and organic solvent extraction. Although several other studies have explored the impact of DNA isolation protocol on mtDNA-CN estimation[14,38,39], to our knowledge, this is the first study to interrogate the possibility of measuring mtDNA-CN directly from cell lysate. In addition to the superior performance of direct cell lysis, this method is cheaper and has less hands-on time than PCIAA or Qiagen Kit extractions. However, the authors recognize DNA from cell lysate has less downstream utility than DNA from traditional extraction procedures, potentially limiting its adoption within the mtDNA-CN field when sample availability is limited. Additionally, as our application of the lyse method was limited to cultured cells, it is important to further validate this method in the context of different sample types which may have higher concentrations of inhibitors. Furthermore, it is important to note the various DNA extraction methods resulted in significantly different mtDNA-CN estimates (p = 3.56 × 10⁻¹¹, 0.02, and 2.85 × 10⁻⁷ for Lyse:PCIAA, Lyse:Qiagen Kit, and PCIAA:Qiagen Kit, respectively). As such, when choosing an extraction method, it is important to remain consistent across the study.

In conclusion, our study demonstrates mtDNA-CN calculated from WGS reads or Affymetrix microarray probe intensities significantly improves upon the current gold standard method of qPCR. Furthermore, we show direct cell lysis introduces less variability to mtDNA-CN estimates than popular DNA extraction methods. Despite the relative infancy of using mtDNA-CN as a novel risk marker, these findings highlight the need for the field to adapt to current technologies to ensure disease and trait associations are fully realized with a move toward more accurate microarray and WGS methods. Furthermore, due to the prevalence of qPCR in the literature, the authors recommend re-analyzing trait associations as more WGS data becomes available from large initiatives such as TOPMed.

Supporting information

S1 Fig. Permutation test for mtDNA-CN estimation method performance.

(TIF)

S2 Fig. Phenotype correlation plots.

(TIF)

S1 Table. Picard sequencing summary metrics definitions.

(XLSX)

S2 Table. Associations of known correlates with mtDNA-CN estimation platforms.

*Duffy locus associations were performed in Blacks only.

(XLSX)

S3 Table. Relative performance of methods as rated by standardized -log p values.

*Duffy locus associations were performed in Blacks only.

(XLSX)

S4 Table. Relative performance of WGS and Affymetrix as rated by standardized -log p values.

*Duffy locus associations were performed in Blacks only.

(XLSX)

S5 Table. Participant characteristics for dPCR subset.

Values are number (%) or mean ± SD; Abbreviations: SD, standard deviation; CVD, cardiovascular disease.

(XLSX)

S6 Table. Associations of known correlates with mtDNA-CN estimation platforms for dPCR subset.

*Duffy locus associations were performed in Blacks only.

(XLSX)

S7 Table. Relative performance of methods as rated by standardized -log p values for dPCR subset.

*Duffy locus associations were performed in Blacks only. +Affymetrix and dPCR effect size estimates were in the opposite direction to known effects, and thus the -log p value of qPCR was standardized to a p value of 1 for Affymetrix and dPCR.

(XLSX)

Acknowledgments

We thank the staff and participants of the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, and the Multi-Ethnic Study of Atherosclerosis for their important contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org. Digital PCR was conducted at the Genetic Resources Core Facility, Johns Hopkins Institute of Genetic Medicine, Baltimore, MD.

Data Availability

Data from this study are available upon request as these data contain potentially identifying and sensitive patient information. Individuals who wish to access these data are welcome to contact the ARIC coordinating center at UNC (aricpub@unc.edu) or the MESA coordinating center (chsccweb@u.washington.edu).

Funding Statement

This research was supported by grant R01HL131573 from the US National Institutes of Health (Longchamps, Castellani, Guallar, and Arking) and by grant P30AG021334 from the Johns Hopkins University Claude D. Pepper Older Americans Independence Center National Institute on Aging (Dr. Arking). The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I). The Multi-Ethnic Study of Atherosclerosis is supported by contracts HHSN268201500003I, N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168 and N01-HC-95169 from the National Heart, Lung, and Blood Institute, and by grants UL1-TR-000040, UL1-TR-001079, and UL1-TR-001420 from the National Center for Advancing Translational Sciences (NCATS). Dr. Rotter and Dr. Taylor’s efforts were supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR001881, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center. Funding for SHARe genotyping was provided by National Heart, Lung, and Blood Institute contract N02-HL-64278. The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References

1. Gómez-Serrano M., Camafeita E., Loureiro M. & Peral B. Mitoproteomics: Tackling Mitochondrial Dysfunction in Human Disease. Oxid. Med. Cell. Longev. 2018, (2018).
2. Ashar F. N. et al. Association of Mitochondrial DNA Copy Number With Cardiovascular Disease. JAMA Cardiol. 2, 1247–1255 (2017). doi:10.1001/jamacardio.2017.3683
3. Chen S. et al. Association between leukocyte mitochondrial DNA content and risk of coronary heart disease: A case-control study. Atherosclerosis 237, 220–226 (2014). doi:10.1016/j.atherosclerosis.2014.08.051
4. Pyle A. et al. Reduced mitochondrial DNA copy number is a biomarker of Parkinson’s disease. Neurobiol. Aging 38, 216.e7–216.e10 (2016).
5. Wei W. et al. Mitochondrial DNA point mutations and relative copy number in 1363 disease and control human brains. Acta Neuropathol. Commun. 5, (2017).
6. Reznik E. et al. Mitochondrial DNA copy number variation across human cancers. eLife 5.
7. Hertweck K. L. & Dasgupta S. The Landscape of mtDNA Modifications in Cancer: A Tale of Two Cities. Front. Oncol. 7, 262 (2017). doi:10.3389/fonc.2017.00262
8. Thyagarajan B., Wang R., Barcelo H., Koh W.-P. & Yuan J.-M. Mitochondrial copy number is associated with colorectal cancer risk. Cancer Epidemiol. Biomark. Prev. Publ. Am. Assoc. Cancer Res. Cosponsored Am. Soc. Prev. Oncol. 21, 1574–1581 (2012).
9. Knez J. et al. Correlates of Peripheral Blood Mitochondrial DNA Content in a General Population. Am. J. Epidemiol. 183, 138–146 (2016). doi:10.1093/aje/kwv175
10. Tin A. et al. Association between Mitochondrial DNA Copy Number in Peripheral Blood and Incident CKD in the Atherosclerosis Risk in Communities Study. J. Am. Soc. Nephrol. JASN 27, 2467–2473 (2016). doi:10.1681/ASN.2015060661
11. Ashar F. N. et al. Association of mitochondrial DNA levels with frailty and all-cause mortality. J. Mol. Med. Berl. Ger. 93, 177–186 (2015).
12. Google Scholar. Search Terms: ‘mitochondrial DNA copy number’, ‘mitochondrial DNA content’. https://scholar.google.com.
13. Cai N. et al. Genetic Control over mtDNA and Its Relationship to Major Depressive Disorder. Curr. Biol. 25, 3170–3177 (2015). doi:10.1016/j.cub.2015.10.065
14. Guo W., Jiang L., Bhasin S., Khan S. M. & Swerdlow R. H. DNA Extraction Procedures Meaningfully Influence qPCR-Based mtDNA Copy Number Determination. Mitochondrion 9, 261–265 (2009). doi:10.1016/j.mito.2009.03.003
15. MitoPipeline: Generating Mitochondrial copy number estimates from SNP array data in Genvisis. http://genvisis.org/MitoPipeline/.
16. Li H. & Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009). doi:10.1093/bioinformatics/btp324
17. Yu B. et al. Association of Rare Loss-Of-Function Alleles in HAL, Serum Histidine Levels and Incident Coronary Heart Disease. Circ. Cardiovasc. Genet. 8, 351–355 (2015). doi:10.1161/CIRCGENETICS.114.000697
18. Ding J. et al. Assessing Mitochondrial DNA Variation and Copy Number in Lymphocytes of ~2,000 Sardinians Using Tailored Sequencing Analysis Tools. PLOS Genet. 11, e1005306 (2015). doi:10.1371/journal.pgen.1005306
19. Qian Y. et al. fastMitoCalc: an ultra-fast program to estimate mitochondrial DNA copy number from whole-genome sequences. Bioinformatics 33, 1399–1401 (2017). doi:10.1093/bioinformatics/btw835
20. Picard Tools—By Broad Institute. https://broadinstitute.github.io/picard/.
21. Morrison A. C. et al. Whole Genome Sequence-Based Analysis of a Model Complex Trait, High Density Lipoprotein Cholesterol. Nat. Genet. 45, 899–901 (2013). doi:10.1038/ng.2671
22. Duncan Temple Lang and the CRAN team (2018). RCurl: General Network (HTTP/FTP/ …) Client Interface for R. R package version 1.95–4.11. https://CRAN.R-project.org/package=RCurl.
23. Quan P.-L., Sauzade M. & Brouzes E. dPCR: A Technology Review. Sensors 18, (2018).
24. Delaneau O., Zagury J.-F. & Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013). doi:10.1038/nmeth.2307
25. Howie B., Fuchsberger C., Stephens M., Marchini J. & Abecasis G. R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012). doi:10.1038/ng.2354
26. Reiner A. P. et al. Genome-wide association study of white blood cell count in 16,388 African Americans: the continental origins and genetic epidemiology network (COGENT). PLoS Genet. 7, e1002108 (2011). doi:10.1371/journal.pgen.1002108
27. Keller M. F. et al. Trans-ethnic meta-analysis of white blood cell phenotypes. Hum. Mol. Genet. 23, 6944–6960 (2014). doi:10.1093/hmg/ddu401
28. Chanda P., Huang H., Arking D. E. & Bader J. S. Fast Association Tests for Genes with FAST. PLOS ONE 8, e68585 (2013). doi:10.1371/journal.pone.0068585
29. Gouhier T. C. & Guichard F. Synchrony: quantifying variability in space and time. Methods Ecol. Evol. 5, 524–533 (2014).
30. Memon A. A. et al. Quantification of mitochondrial DNA copy number in suspected cancer patients by a well optimized ddPCR method. Biomol. Detect. Quantif. 13, 32–39 (2017). doi:10.1016/j.bdq.2017.08.001
31. Li B. et al. Droplet digital PCR shows the D-Loop to be an error prone locus for mitochondrial DNA copy number determination. Sci. Rep. 8, 11392 (2018). doi:10.1038/s41598-018-29621-1
32. Ye W. et al. Accurate quantitation of circulating cell-free mitochondrial DNA in plasma by droplet digital PCR. Anal. Bioanal. Chem. 409, 2727–2735 (2017). doi:10.1007/s00216-017-0217-x
33. Dean F. B., Nelson J. R., Giesler T. L. & Lasken R. S. Rapid Amplification of Plasmid and Phage DNA Using Phi29 DNA Polymerase and Multiply-Primed Rolling Circle Amplification. Genome Res. 11, 1095–1099 (2001). doi:10.1101/gr.180501
34. Zhang Y. et al. Associations of mitochondrial haplogroups and mitochondrial DNA copy numbers with end-stage renal disease in a Han population. Mitochondrial DNA Part A 28, 725–731 (2017).
35. Lee H. K. et al. Decreased mitochondrial DNA content in peripheral blood precedes the development of non-insulin-dependent diabetes mellitus. Diabetes Res. Clin. Pract. 42, 161–167 (1998). doi:10.1016/s0168-8227(98)00110-7
36. Sookoian S. et al. Epigenetic regulation of insulin resistance in nonalcoholic fatty liver disease: Impact of liver methylation of the peroxisome proliferator–activated receptor γ coactivator 1α promoter. Hepatology 52, 1992–2000 (2010). doi:10.1002/hep.23927
37. Phillips N. R., Sprouse M. L. & Roby R. K. Simultaneous quantification of mitochondrial DNA copy number and deletion ratio: A multiplex real-time PCR assay. Sci. Rep. 4, (2014).
38. Hurtado-Roca Y. et al. Adjusting MtDNA Quantification in Whole Blood for Peripheral Blood Platelet and Leukocyte Counts. PLOS ONE 11, e0163770 (2016). doi:10.1371/journal.pone.0163770
39. Nacheva E. et al. DNA isolation protocol effects on nuclear DNA analysis by microarrays, droplet digital PCR, and whole genome sequencing, and on mitochondrial DNA copy number estimation. PLoS ONE 12, (2017).

PLoS One. doi: 10.1371/journal.pone.0228166.r001

Decision Letter 0

David C Samuels

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

26 Nov 2019

PONE-D-19-28415

Evaluation of mitochondrial DNA copy number estimation

PLOS ONE

Dear Dr. Arking,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Jan 10 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

David C. Samuels

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following financial disclosure:

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

a) Please provide an amended Funding Statement that declares *all* the funding or sources of support received during this specific study (whether external or internal to your organization) as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.

b) Please state what role the funders took in the study. If any authors received a salary from any of your funders, please state which authors and which funder. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

4. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ

5. Your ethics statement must appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please also ensure that your ethics statement is included in your manuscript, as the ethics section of your online submission will not be published alongside your manuscript.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Longchamps et al. present thorough work evaluating several methods for assessing mtDNA-CN. Overall, the paper is well-written and provides robust data analysis methods. I recommend minor revisions to address the following points.

• Can the authors please either (1) provide an explanation for why the Duffy locus was only analyzed in African Americans, or (2) provide citation(s) as justification?

• Figure 3- the legend is seemingly mislabeled; I believe that the green series represents the ExomeChip data and the Purple series represents Affy.

• Given that the authors only tested the “Lyse” method on cultured cells, which likely have fewer inhibitors compared to other sample types (e.g., heme carry over in blood derived samples), the authors should discuss briefly that the method may have limited applicability based on sample type.

• The authors discuss and evaluate relative qPCR for quantification of mtDNA-CN, but do not evaluate or discuss any of the published absolute qPCR assays; this may warrant a brief discussion point since absolute qPCR has the advantage of monitoring PCR efficiency (which can greatly alter copy number estimates) and batch effects.

• Throughout the manuscript, the authors interchangeably use "black" and "African Americans"; I would suggest picking one or the other to maintain consistency.

• If the authors choose to use "white" and "black" as their racial descriptors, please capitalize the first letter of each term.

• Pg.10: "DNA for mtDNA-CN estimation was collected from different visits and was derived from buffy coat using the Gentra Puregene Blood Kit (Qiagen)."---Can the authors verify that relevant variables (e.g., WBC count) were also collected from the same visit as the DNA sample?

• Pg.16: "Follow-up time was defined from DNA collection through death, loss to follow-up, or study end point (through 2017 in ARIC and 2015 in MESA)."—suggested edit to “lost to follow-up”.

• Many of the tables have poor resolution or illegible text; particularly, in supplemental Figure 2, the authors should consider changing the x and y axis labels to a larger font size.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Jan 31;15(1):e0228166. doi: 10.1371/journal.pone.0228166.r002

Author response to Decision Letter 0

30 Dec 2019

We wish to thank the reviewer for their thorough and helpful review of our manuscript entitled “Evaluation of mitochondrial DNA copy number estimation techniques”. We have addressed the reviewer comments in the revised manuscript and listed our responses to the comments below. Reviewer comments are listed in red text, our response is shown in black and text which was added to our manuscript is italicized in quote blocks.

Comments:

1. Can the authors please either (1) provide an explanation for why the Duffy locus was only analyzed in African Americans, or (2) provide citation(s) as justification?

We agree with the reviewer that we were not clear as to 1) why the Duffy locus was chosen as a correlate and 2) why it was only analyzed in Blacks. We have provided the following additional information between lines 259 and 263 on page 13 within the statistical analyses section of the Methods.

“The Duffy locus is highly associated with WBC count in Blacks[26] due to its role in conferring a selective advantage to malaria; however, this association is limited or absent in other ethnicities[27]. As such, single SNP regression for mtDNA-CN on the Duffy locus was limited to Blacks. Due to the association of mtDNA-CN with WBC count, the Duffy locus acts as another independent external validator for mtDNA-CN unadjusted for WBC count.”

2. Figure 3- the legend is seemingly mislabeled; I believe that the green series represents the ExomeChip data and the Purple series represents Affy.

Thank you, the figure has been modified.

3. Given that the authors only tested the “Lyse” method on cultured cells, which likely have fewer inhibitors compared to other sample types (e.g., heme carry over in blood derived samples), the authors should discuss briefly that the method may have limited applicability based on sample type.

We have added the following comment on page 19 (lines 416-419) within our discussion to highlight that our Lyse method needs further validation in non-cultured cells.

“Additionally, as our application of the lyse method was limited to cultured cells, it is important to further validate this method in the context of different sample types which may have higher concentrations of inhibitors.”

4. The authors discuss and evaluate relative qPCR for quantification of mtDNA-CN, but do not evaluate or discuss any of the published absolute qPCR assays; this may warrant a brief discussion point since absolute qPCR has the advantage of monitoring PCR efficiency (which can greatly alter copy number estimates) and batch effects.

We agree and have added a comment on page 19 of the discussion (lines 404-407) pointing out that absolute qPCR may improve upon the performance of mtDNA-CN estimation.

“Furthermore, absolute quantification of mtDNA-CN through the use of standard curves may improve upon the performance of qPCR furthering its continuing use[37].”
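
To illustrate the standard-curve approach referred to in this comment, the following is a minimal sketch, assuming a log-linear relationship between Ct and input copy number. The function names and the dilution series are hypothetical (the Ct values are generated from an ideal curve rather than measured), and the efficiency formula E = 10^(-1/slope) - 1 is the conventional one rather than anything specific to our assays.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Returns (slope, intercept, efficiency), where efficiency is
    10 ** (-1 / slope) - 1 (1.0 means perfect doubling per cycle).
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency


def copies_from_ct(ct, slope, intercept):
    """Interpolate absolute copy number for an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series generated from an ideal curve
# (100% efficiency); real standards would be measured Ct values.
IDEAL_SLOPE = -1.0 / math.log10(2.0)  # about -3.32 cycles per 10-fold step
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [40.0 + IDEAL_SLOPE * math.log10(c) for c in standard_copies]

slope, intercept, efficiency = fit_standard_curve(standard_copies, standard_ct)
unknown_ct = 40.0 + IDEAL_SLOPE * math.log10(5e4)
print(round(efficiency, 3), round(copies_from_ct(unknown_ct, slope, intercept)))
```

Fitting a curve on every plate is what allows amplification efficiency and batch effects to be monitored, which is the advantage the reviewer highlights.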

5. Throughout the manuscript, the authors interchangeably use "black" and "African Americans"; I would suggest picking one or the other to maintain consistency.

Thank you, this change has been made as suggested.

6. If the authors choose to use "white" and "black" as their racial descriptors, please capitalize the first letter of each term.

Thank you, we have capitalized all racial descriptors.

7. Pg.10: "DNA for mtDNA-CN estimation was collected from different visits and was derived from buffy coat using the Gentra Puregene Blood Kit (Qiagen)."---Can the authors verify that relevant variables (e.g., WBC count) were also collected from the same visit as the DNA sample?

We have clarified that relevant covariates were derived from the same visit in which the DNA was collected.

8. Pg.16: "Follow-up time was defined from DNA collection through death, loss to follow-up, or study end point (through 2017 in ARIC and 2015 in MESA)."—suggested edit to “lost to follow-up”.

The suggested edits have been made.

9. Many of the tables have poor resolution or illegible text; particularly, in supplemental Figure 2, the authors should consider changing the x and y axis labels to a larger font size.

Thank you for pointing this out, the figures and tables have been adjusted.

During the review process we were additionally made aware that the DNA used in our dPCR exploratory study was derived from packed red blood cells and not peripheral leukocytes as we previously believed. We have made this clear on page 6 of the methods and have commented briefly on this difference on page 18 of the discussion.

“Additionally, whereas the dPCR data was derived from DNA from packed red blood cells, the qPCR and Affymetrix data was obtained from peripheral leukocytes, potentially explaining the poor performance of dPCR relative to other methods.”

We thank you for your time and consideration of our revised manuscript. We look forward to your feedback.

Yours sincerely,

Dan E. Arking, PhD

Attachment

Submitted filename: Response To Reviewers.docx

PLoS One. doi: 10.1371/journal.pone.0228166.r003

Decision Letter 1

David C Samuels

9 Jan 2020

Evaluation of mitochondrial DNA copy number estimation techniques

PONE-D-19-28415R1

Dear Dr. Arking,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

David C. Samuels

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

PLoS One. doi: 10.1371/journal.pone.0228166.r004

Acceptance letter

David C Samuels

14 Jan 2020

PONE-D-19-28415R1

Evaluation of mitochondrial DNA copy number estimation techniques

Dear Dr. Arking:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. David C. Samuels

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Fig. Permutation test for mtDNA-CN estimation method performance.

(TIF)


S2 Fig. Phenotype correlation plots.

(TIF)


S1 Table. Picard sequencing summary metrics definitions.

(XLSX)


S2 Table. Associations of known correlates with mtDNA-CN estimation platforms.

*Duffy locus associations were performed in Blacks only.

(XLSX)


S3 Table. Relative performance of methods as rated by standardized -log p values.

*Duffy locus associations were performed in Blacks only.

(XLSX)


S4 Table. Relative performance of WGS and Affymetrix as rated by standardized -log p values.

*Duffy locus associations were performed in Blacks only.

(XLSX)


S5 Table. Participant characteristics for dPCR subset.

Values are number (%) or mean ± SD; Abbreviations: SD, standard deviation; CVD, cardiovascular disease.

(XLSX)


S6 Table. Associations of known correlates with mtDNA-CN estimation platforms for dPCR subset.

*Duffy locus associations were performed in Blacks only.

(XLSX)


S7 Table. Relative performance of methods as rated by standardized -log p values for dPCR subset.

(XLSX)


Attachment

Submitted filename: Response To Reviewers.docx


